Why don't we start from https://github.com/ashenfad/hadooptree ?
On 20 February 2013 13:25, Marty Kube <[email protected]> wrote:
> Hi Lorenz,
>
> Very interesting, that's what I was asking for when I mentioned non-MR
> implementations :-)
>
> I have not looked at spark before, interesting that it uses Mesos for
> clustering. I'll check it out.
>
>
> On 02/19/2013 09:32 PM, Lorenz Knies wrote:
>>
>> Hi Marty,
>>
>> i am currently working on a PLANET-like implementation on top of spark:
>> http://spark-project.org
>>
>> I think this framework is a nice fit for the problem.
>> If the input data fits into the "total cluster memory" you benefit from
>> the caching of the RDD's.
>>
>> regards,
>>
>> lorenz
>>
>>
>> On Feb 20, 2013, at 2:42 AM, Marty Kube <[email protected]>
>> wrote:
>>
>>> You had mentioned other "resource management" platforms like Giraph or
>>> Mesos. I haven't looked at those yet. I guess I was think of other
>>> parallelization frameworks.
>>>
>>> It's interesting that the planet folks thought it was really worthwhile
>>> working on top of map reduce for all of the resource management that is
>>> built in.
>>>
>>>
>>> On 02/19/2013 08:04 PM, Ted Dunning wrote:
>>>>
>>>> If non-MR means map-only job with communicating mappers and a state
>>>> store,
>>>> I am down with that.
>>>>
>>>> What did you mean?
>>>>
>>>> On Tue, Feb 19, 2013 at 5:53 PM, Marty Kube <
>>>> [email protected]> wrote:
>>>>
>>>>> Right now I'd lean towards the planet model, or maybe a non-MR
>>>>> implementation. Anyone have a good idea for a non-MR solution?
>>>>>
>

--
Dr Andy Twigg
Junior Research Fellow, St Johns College, Oxford
Room 351, Department of Computer Science
http://www.cs.ox.ac.uk/people/andy.twigg/
[email protected] | +447799647538
