So why it's called hadoop streaming, if it doesn't behave like a streaming application (The reduces don't receive data as long as it is produced by the map tasks)?
On 16 January 2013 05:41, Jeff Bean <jwfb...@cloudera.com> wrote: > me property. The reduce method is not called until the mappers are done, and > the reducers are not scheduled before the threshold set by > mapred.reduce.slowstart.completed.maps is reached. -- Best regards,