Ah, ok. I think I did read about that partition function in the Google paper, but I hadn't quite put it together yet. So a pre-pass can set up the partition function so that the final output is fully sorted, with no further MapReduce necessary; that was a bad example on my part.
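For concreteness, here's a rough sketch of the kind of range-based partitioner I mean, written against the old org.apache.hadoop.mapred API. The class name and split-point values are made up, and a real version would load the split points computed by the sampling pre-pass:

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Partitioner;

public class RangePartitioner implements Partitioner {

  // Hypothetical split points; a real implementation would read the
  // split points produced by the sampling pre-pass, e.g. in configure().
  private static final String[] SPLITS = { "h", "p" };

  public void configure(JobConf job) {
    // Load the pre-pass's split points here in a real implementation.
  }

  // Keys below SPLITS[0] go to partition 0, keys in [SPLITS[0], SPLITS[1])
  // to partition 1, and so on. Because each reduce's output is sorted and
  // the partitions cover disjoint, increasing key ranges, concatenating
  // part-00000, part-00001, ... gives one fully sorted result.
  public int getPartition(WritableComparable key, Writable value,
                          int numPartitions) {
    String k = ((Text) key).toString();
    int p = 0;
    while (p < SPLITS.length && k.compareTo(SPLITS[p]) >= 0) {
      p++;
    }
    return Math.min(p, numPartitions - 1);
  }
}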

But back to my original question... Doug suggests that depending on a driver process is acceptable. But has anyone actually needed true MapReduce chaining, or tried it successfully? Or is it generally accepted that a multi-MapReduce algorithm should always be driven by a single process?
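To be clear, by "driver process" I mean the pattern below: one ordinary main() that submits each job in turn and blocks until it finishes (sketched with the old JobClient API; the job names and configuration details are placeholders):

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class TwoPassDriver {
  public static void main(String[] args) throws Exception {
    // First pass: sample the keys to compute split points.
    JobConf samplePass = new JobConf(TwoPassDriver.class);
    samplePass.setJobName("sample-keys");
    // ... set mapper, reducer, and input/output paths here ...
    JobClient.runJob(samplePass);   // blocks until the job completes

    // Second pass: the real sort, partitioned on the sampled split points.
    JobConf sortPass = new JobConf(TwoPassDriver.class);
    sortPass.setJobName("final-sort");
    // ... set mapper, reducer, partitioner, and paths here ...
    JobClient.runJob(sortPass);
  }
}

If this JVM dies between the two runJob() calls, the chain stops, which is exactly the dependency I'm asking about.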

Doug Cutting wrote:
James Kennedy wrote:
So far I've had trouble finding examples of MapReduce jobs that are kicked off by some one-time process and that in turn kick off other MapReduce jobs long after the initial driver process is dead. This would be more distributed and fault-tolerant, since it removes the dependency on a driver process.

Yes, but it wouldn't be that much more fault tolerant. The biggest cause of failures isn't any particular node failing, but rather that, across a large cluster, some node fails. A driver program only fails if the particular node running it fails. If the MTBF of a particular node is ~1 year, that's probably okay for a driver program, since driver programs only need to run for hours, or days at most. However, it's a problem if you have 1000 nodes, and hence see 3+ failures per day on average (1000 nodes / ~365-day MTBF ≈ 3 failures/day), and you require that all nodes stay up for the duration of a job.

Also, I notice that both Google's and Hadoop's examples of the distributed sort fail to deal with the fact that the result is multiple sorted files... this isn't a complete sort, since the output files still need to be merge-sorted, don't they? To complete the algorithm, could the Reducer kick off a subsequent merge-sort MapReduce on the result files? Or maybe there's something I'm not understanding...

Yes, MapReduce doesn't actually do a full sort. It produces a set of sorted partitions. Sometimes the partition function can arrange things so that this is in fact a full sort, but frequently it is just a hash function. Google mentions this in the original MapReduce paper:

   We guarantee that within a given partition, the intermediate
   key/value pairs are processed in increasing key order.
   This ordering guarantee makes it easy to generate
   a sorted output file per partition, which is useful when
   the output file format needs to support efficient random
   access lookups by key, or users of the output find it convenient
   to have the data sorted. (from page 6)

and

   Our partitioning function for this benchmark has built-in
   knowledge of the distribution of keys. In a general
   sorting program, we would add a pre-pass MapReduce
   operation that would collect a sample of the keys and
   use the distribution of the sampled keys to compute split-points
   for the final sorting pass. (from page 9)
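To make that concrete, here is a rough sketch (not from the paper or from Hadoop; all names here are illustrative) of how such a pre-pass might turn the sampled keys into split points: sort the sample and take evenly spaced keys from it.

import java.util.Collections;
import java.util.List;

public class SplitPoints {
  // Given the keys sampled by the pre-pass and the number of partitions
  // for the final sort, return numPartitions - 1 keys that divide the
  // sample, and hence approximately the full key space, into roughly
  // equal ranges.
  public static String[] compute(List<String> sampledKeys, int numPartitions) {
    Collections.sort(sampledKeys);
    String[] splits = new String[numPartitions - 1];
    for (int i = 1; i < numPartitions; i++) {
      splits[i - 1] = sampledKeys.get(i * sampledKeys.size() / numPartitions);
    }
    return splits;
  }
}

The final sorting pass then partitions on these split points, so each reduce gets a contiguous key range and the per-partition outputs concatenate into a full sort.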

Doug
