Steve,
>you can improve Hadoop to make it more agile; my defunct Hadoop
>lifecycle branch did a lot of that, but you have to have everyone else
>using Hadoop to be willing to let the changes go in -and those changes
>mustn't impose a cost or risk to the physical cluster model.

Until Hadoop 0.20, when Hadoop On Demand (HoD) was in widespread use, quickly bringing up a MapReduce cluster, and making it go away quickly, was an explicit goal. After that, the focus shifted to multi-tenancy for MR in Hadoop.

When HoD went away, I commented on one of the internal mailing lists that it would make a comeback when VMs become first-class citizens of the Hadoop world. I have heard of several efforts from well-known vendors *wink* to make this happen.

I have been looking closely at the defunct HoD code to see if it can still be used, but with the new MRv2 architecture, it looks like that will require major surgery. We can have the RM allocate containers, and should be able to run a custom MR runtime there (essentially replacing Torque in HoD with the RM).

Is this something you had in mind too?

- milind

---
Milind Bhandarkar
Greenplum Labs, EMC
(Disclaimer: Opinions expressed in this email are those of the author, and do not necessarily represent the views of any organization, past or present, the author might be affiliated with.)
