For those who are not aware of 6-year-old history :-), in October 2005 Sameer, Owen and I made a trip to Madison, Wisconsin to meet with Miron Livny, who built Condor, to explore how matchmaking could be used with MapReduce (pre-Hadoop). (We believed it could be used for locality-aware scheduling.) One thing that came out of that meeting was that the Condor folks were not ready to incorporate multi-threading, which we felt was needed for scheduler responsiveness.

- Milind

On 9/13/11 9:46 PM, "Brian Bockelman" <bbock...@cse.unl.edu> wrote:

> On Sep 13, 2011, at 7:20 AM, Steve Loughran wrote:
>
>> I missed a talk at the local university by a Platform sales rep last
>> month, though I did get to offend one of the authors on the Condor team
>> instead [1] by pointing out that all grid schedulers contain a major
>> assumption: that storage access times are constant across your cluster.
>> It is if you can pay for something like GPFS, but you don't get 50TB of
>> GPFS storage for $2500, which is what adding 25*2TB SATA drives would
>> cost if you stuck them on your compute nodes; $7500 for a fully
>> replicated 50TB. That's why I'm not a fan of grid systems: the costs of
>> storage and networking aren't taken into account. Then there are the
>> availability issues with the larger filesystems, which are a topic for
>> another day.
>
> For what it's worth - I do know folks who have done (are doing) data
> locality with Condor. Condor is wonderfully flexible, easily flexible
> enough to shoot yourself in the foot. There was also a grad student who
> did work in allowing Condor to fire up Hadoop datanodes and job trackers
> directly.
>
> For the most part you are right though - all these systems have long
> treated nodes as individual, independent units (either because the
> systems were job-oriented, not data-oriented, or because they ran at
> supercomputing centers where money was no concern).
>
> This is starting to change, but change is always frustratingly slow. On
> the upside, we now have single Condor pools that span 80 sites around the
> globe and it is easy to have two Condor pools interoperate and exchange
> jobs. So, each system has its own strengths and weaknesses.
>
> Brian