Here are some links to it:

Long version: http://csse.usc.edu/csse/TECHRPTS/2008/usc-csse-2008-820/usc-csse-2008-820.pdf
Shorter version (published in WICSA): http://wwwp.dnsalias.org/w/images/3/3f/AnatomyPhysiologyGridRevisited66.pdf
Cheers,
Chris

On Jan 11, 2012, at 4:02 PM, Mattmann, Chris A (388J) wrote:

> Also check out my paper on The Anatomy and Physiology of the Grid Revisited
> (just Google for it), where we also tried to look at this very issue.
>
> Cheers,
> Chris
>
> Sent from my iPhone
>
> On Jan 11, 2012, at 3:55 PM, "Brian Bockelman" <[email protected]> wrote:
>
>> On Jan 11, 2012, at 10:15 AM, George Kousiouris wrote:
>>
>>> Hi,
>>>
>>> See comments in text.
>>>
>>> On 1/11/2012 4:42 PM, Merto Mertek wrote:
>>>> Hi,
>>>>
>>>> I was wondering if anyone knows any paper discussing and comparing the
>>>> mentioned topic. I am a little bit confused about the classification of
>>>> Hadoop. Is it a cluster, a compute grid, or a mix of them?
>>>
>>> I think a strict definition would be: an implementation of the
>>> map-reduce computing paradigm, for cluster usage.
>>>
>>>> What is Hadoop in relation to a cloud - probably just a technology that
>>>> enables cloud services?
>>>
>>> It can be used to enable cloud services through a service-oriented
>>> framework, as we are doing in
>>> http://users.ntua.gr/gkousiou/publications/PID2095917.pdf
>>> in which we are trying to create a cloud service that offers MapReduce
>>> clusters as a service and distributed storage (through HDFS).
>>> But this is not the primary usage; this is the back-end heavy processing in
>>> a cluster-like manner, specifically for parallel jobs that follow the MR
>>> logic.
>>>
>>>> Can it be compared to cluster middleware like Beowulf, OSCAR, Condor,
>>>> Sector/Sphere, HPCC, Dryad, etc.? Why not?
>>>
>>> I could see some similarities with Condor, mainly in the job submission
>>> process; however, I am not really sure how Condor deals with parallel jobs.
>>
>> Since you asked…
>>
>> <condor-geek>
>>
>> Condor has a built-in concept of a set of jobs (called a "job cluster"). On
>> top of its scheduler, there is a product called "DAGMan" (DAG = directed
>> acyclic graph) that can manage a large number of jobs with interrelated
>> dependencies (providing a partial ordering between jobs). Condor with DAGMan
>> is somewhat comparable to the concept of Hadoop tasks plus Oozie workflows
>> (although the data aspects are very different - don't try to stretch it too
>> far).
>>
>> Condor / PBS / LSF / {OGE,SGE,GE} / SLURM provide the capability to start
>> many identical jobs in parallel for MPI-type computations, but I consider
>> MPI wildly different from the sort of workflows you see with MapReduce.
>> Specifically, "classic MPI" programming (the kind you see in wide use; MPI-2
>> and later are improved) mostly requires all processes to start
>> simultaneously, and the job crashes if one process dies. I think this is why
>> the Top 10 computers tend to measure mean time between failures in tens of
>> hours.
>>
>> Unlike Hadoop, Condor jobs can flow between pools (they call this
>> "flocking"), and pools can naturally cover multiple data centers. The
>> largest demonstration I'm aware of is 100,000 cores across the US; the
>> largest production pool I'm aware of is about 20-30k cores across 100
>> universities/labs on multiple continents. This is not a criticism of Hadoop
>> - Condor doesn't really have the same level of data integration as Hadoop
>> does, so it tackles a much simpler problem (i.e.,
>> bring-your-own-data-management!).
>>
>> </condor-geek>
>>
>> Brian
>>
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory
Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
