All this feedback is informative and valuable -- Thanks!

 - Bob Futrelle
   Northeastern U.

On Dec 5, 2007 12:55 PM, Ted Dunning <[EMAIL PROTECTED]> wrote:
>
> I just read the xgrid page and it is clear that Apple has pushed on the
> following parameters (they may be doing lots of other cool stuff that I
> don't know about):
>
> A) auto-configuration
> B) wider distribution of computation
> C) local checkpointing of processes for restarts
>
> What they have apparently not done includes
>
> X) map/reduce
> Y) magic process restarts in the face of failure (see map/reduce)
> Z) distributed file system
>
> When newbies try to run Hadoop they ALWAYS seem to run head-long into the
> lack of (A) (how many times has somebody essentially said "I have a totally
> screwed up DNS and Hadoop won't run"?).
>
> Item (B) is probably a bad thing for Hadoop given the bandwidth required for
> the shuffle phase.
>
> Item (C) is inherent in map-reduce and is pretty neutral either way.
>
>
> On 12/5/07 9:23 AM, "Ted Dunning" <[EMAIL PROTECTED]> wrote:
>
> >
> > Sorry about not addressing this. (and I appreciate your gentle prod)
> >
> > The Xgrid would likely work well on these problems. They are, after all,
> > nearly trivial to parallelize because of clean communication patterns.
> >
> > Consider an alternative problem of solving n-body gravitational dynamics for
> > n > 10^6 bodies. Here there is nearly universal communication.
> >
> > As another example, last week I heard from some Sun engineers that one of
> > their HPC systems had to satisfy a requirement for checkpointing large
> > numerical computations in which a large number of computational nodes were
> > required to dump tens of TB of checkpoint data to disk in less than 10
> > seconds.
> >
> > Finally, many of these HPC systems are designed to fit the entire working
> > set into memory so that high numerical computational throughput can be
> > maintained. In this regime, communications have to work on memory
> > time-scales rather than disk time-scales.
> >
> > None of these three example problems are very suitable for Hadoop.
> >
> > The sample problems you gave are a different matter.
> >
> >
> > On 12/5/07 2:04 AM, "Bob Futrelle" <[EMAIL PROTECTED]> wrote:
> >
> >> why an Xgrid cluster with its attendant management system
> >> would or would not be equally good for these problems
> >
