Stu,

I've watched most of the video. To a non-academic novice it's
incredible technology, but "passing C++ pointers directly" in a cluster to
turn sophisticated SQL joins into multi-stage MR is complicated stuff!

Most developers will wait for a front-end, so Hadoop is unaffected.
Does it even have a built-in reduce, or is it a recursive merge sort?

Microsoft will let .NET developers call this through C# in Visual Studio
(DryadLINQ), eventually giving it more uptake. I do see "Simplified SQL" as
a clone of Sawzall (Perl-like), passing data through PCs at first and later
with Samba to Ubuntu clusters running Mono (open-source .NET). ;)

The in-memory, "full dataset" sort functionality of Dryad and its labeled,
multiple-OutputCollector "stage portions" sound cool. I wonder if Hadoop could
just adopt those features, do local-rack preferences and call it a day.

Like the Heritrix module moving text (byte arrays) into HDFS directly,
you would map, hash a key (for the reduce subset), then reduce into HBase
and be able to repeat the process. Keys for separate jobs would mimic the
"stages" concept, and the data would still only have one jobtracker.
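
A rough sketch of that idea, in plain Python rather than the Hadoop API:
each "stage" maps records, hashes every key to pick a reduce subset, then
reduces, and the output of one stage becomes the input of the next. All
names here (partition, run_stage, the mappers and reducers) are made up
for illustration, and an in-memory list stands in for HDFS/HBase.

```python
from collections import defaultdict
from zlib import crc32


def partition(key, num_reducers):
    # Stable hash, so the same key always lands in the same reduce subset.
    return crc32(key.encode("utf-8")) % num_reducers


def run_stage(records, mapper, reducer, num_reducers=2):
    # Map phase: emit (key, value) pairs and bucket them by hashed key.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for record in records:
        for key, value in mapper(record):
            buckets[partition(key, num_reducers)][key].append(value)
    # Reduce phase: each bucket (reduce subset) is processed independently.
    output = []
    for bucket in buckets:
        for key in sorted(bucket):
            output.append(reducer(key, bucket[key]))
    return output


# Stage 1: count words across pages.
def count_words_mapper(page):
    for word in page.split():
        yield word.lower(), 1


def sum_reducer(key, values):
    return key, sum(values)


# Stage 2: re-key the stage 1 output by count, grouping words per count.
def by_count_mapper(pair):
    word, count = pair
    yield str(count), word


def collect_reducer(key, values):
    return int(key), sorted(values)


pages = ["hadoop dryad hadoop", "dryad pig hadoop"]
stage1 = run_stage(pages, count_words_mapper, sum_reducer)
stage2 = run_stage(stage1, by_count_mapper, collect_reducer)
```

Here stage 2 simply re-keys whatever stage 1 produced, which is the
"repeat the process" part; in a real cluster each stage would be a
separate job reading the previous job's output.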

Thanks for sharing. Sweet link, dude.

Peter W.


Stu Hood wrote:

Google has a very interesting tech-talk up about Dryad: Microsoft's distributed execution framework. There has been a paper out about it for a while, but the video has some more information about the ways that the system has been used since it was published.

http://www.youtube.com/watch?v=WPhE5JCP2Ak

The slide comparing the time taken to spill to disk between vertices vs operating purely in memory (around minute 26) is definitely something to think about. Higher-level frameworks such as HBase and Pig are already being developed on top of the MapReduce primitive, so allowing the (perennially discussed) 'multi-reduce' concept to sneak into Hadoop ought to be very attractive (see: http://www.nabble.com/Poly-reduce--tf4313116.html#a13437687 ).

I really hope this will help restart the discussion of direct map->reduce links.

Thanks,

Stu Hood
