Re: Out-of-core random forest implementation

Lorenz Knies Tue, 19 Feb 2013 18:33:05 -0800

Hi Marty,

I am currently working on a PLANET-like implementation on top of Spark: 
http://spark-project.org


I think this framework is a nice fit for the problem.
If the input data fits into the "total cluster memory", you benefit from the 
caching of the RDDs.

regards,

lorenz


On Feb 20, 2013, at 2:42 AM, Marty Kube <[email protected]> wrote:

> You had mentioned other "resource management" platforms like Giraph or Mesos. 
>  I haven't looked at those yet.  I guess I was thinking of other parallelization 
> frameworks.
> 
> It's interesting that the PLANET folks thought it was really worthwhile 
> working on top of MapReduce for all of the resource management that is built 
> in.
> 
> 
> On 02/19/2013 08:04 PM, Ted Dunning wrote:
>> If non-MR means map-only job with communicating mappers and a state store,
>> I am down with that.
>> 
>> What did you mean?
>> 
>> On Tue, Feb 19, 2013 at 5:53 PM, Marty Kube <
>> [email protected]> wrote:
>> 
>>> Right now I'd lean towards the PLANET model, or maybe a non-MR
>>> implementation.  Anyone have a good idea for a non-MR solution?
>>> 
>
