Re: Poly-reduce?

Doug Cutting Fri, 24 Aug 2007 09:54:27 -0700

Ted Dunning wrote:

It isn't hard to implement these programs as multiple fully fledged
map-reduces, but it appears to me that many of them would be better
expressed as something more like a map-reduce-reduce program.


[ ... ]

Expressed conventionally, this would have to write all of the user sessions to
HDFS and a second map phase would generate the pairs for counting.  The
opportunity for efficiency would come from the ability to avoid writing
intermediate results to the distributed data store.

Has anybody looked at whether this would help and whether it would be hard
to do?

It would make the job tracker more complicated, and might not help job
execution time that much.

Consider implementing this as multiple map reduce steps, but using a
replication level of one for intermediate data. That would mostly have
the performance characteristics you want. But if a node died, the
framework could not automatically re-create just the missing data.
Instead the application would have to re-run the entire job, or subsets
of it, in order to re-create the un-replicated data.

Under poly-reduce, if a node failed, all tasks that were incomplete on
that node would need to be restarted. But first, their input data would
need to be located. If you saved all intermediate data in the course of
a job (which would be expensive) then the inputs that need re-creation
would mostly just be those that were created on the failed node. But
this failure would generally cascade all the way back to the initial map
stage. So a single machine failure in the last phase could double the
run time of the job, with most of the cluster idle.

If, instead, you used normal mapreduce, with intermediate data
replicated in the filesystem, a single machine failure in the last phase
would only require re-running tasks from the last job.
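The comparison above can be sketched as a toy model. This is only an
illustration under stated assumptions, not a measurement: the function
names and stage times are hypothetical, the failure is assumed to hit one
node at the very end of the last phase, and the lost partition is assumed
to be recomputed on a single replacement node at the original per-stage
speed.

```python
def total_time_poly_reduce(stage_times):
    """Wall-clock time when intermediate data is not replicated.

    Assumption: the lost partition must be regenerated through every
    stage, back to the initial map, on one replacement node. Each stage
    then takes as long as it did originally, so the whole job is
    effectively paid for twice.
    """
    base = sum(stage_times)
    recovery = sum(stage_times)
    return base + recovery


def total_time_chained(stage_times):
    """Wall-clock time when each stage's output is replicated in the filesystem.

    Assumption: only the failed node's tasks from the last job are
    re-run, bounded here by the duration of that last stage.
    """
    base = sum(stage_times)
    recovery = stage_times[-1]
    return base + recovery


# Hypothetical durations in minutes for map, first reduce, second reduce.
stages = [10.0, 6.0, 4.0]
print("poly-reduce:", total_time_poly_reduce(stages))
print("chained mapreduce:", total_time_chained(stages))
```

In this model the poly-reduce run time doubles after the failure, while
the chained job only pays again for its final stage. Real recovery cost
would depend on how re-run tasks are scheduled across the cluster.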

Perhaps, when chaining mapreduces, one should use a lower replication
level for intermediate data, like two. Additionally, one might wish to
relax the one-replica-off-rack criterion for such files, so that
replication is faster, and since whole-rack failures are rare. This
might give good chained performance, but keep machine failures from
knocking tasks back to the start of the chain. Currently it's not
possible to disable the one-replica-off-rack preference, but that might
be a reasonable feature request.
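A rough way to reason about this trade-off is sketched below. It is a
simplified model, not a description of the filesystem's actual placement
code: the function names are hypothetical, the first replica is assumed to
be written locally at no network cost, at most one copy is assumed to
cross racks when the off-rack rule applies, and node failures are assumed
to be independent.

```python
def replica_traffic(size_gb, replication, require_off_rack):
    """Estimate network traffic for writing one intermediate data set.

    Returns the gigabytes copied across racks and within a rack.
    """
    remote_copies = max(replication - 1, 0)
    # With the off-rack rule, one of the remote copies crosses racks.
    cross_rack = 1 if (require_off_rack and remote_copies >= 1) else 0
    in_rack = remote_copies - cross_rack
    return {
        "cross_rack_gb": size_gb * cross_rack,
        "in_rack_gb": size_gb * in_rack,
    }


def loss_probability(node_failure_prob, replication):
    """Chance that every replica of a block is lost.

    Assumption: independent node failures, ignoring whole-rack failures.
    """
    return node_failure_prob ** replication


# Hypothetical 100 GB of intermediate data and a 1% per-node failure chance.
for replication, off_rack in [(3, True), (2, True), (2, False), (1, False)]:
    traffic = replica_traffic(100, replication, off_rack)
    risk = loss_probability(0.01, replication)
    print(replication, off_rack, traffic, risk)
```

Under these assumptions, dropping from three replicas to two with the
off-rack rule relaxed removes all cross-rack copying, while the chance of
losing a block stays far below that of a single unreplicated copy.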


Doug
