On Tue, Jun 8, 2010 at 9:19 PM, deneche abdelhakim <[email protected]> wrote:
> mapred.max.split.size controls how many partitions will be generated from
> the data.
> the current implementation of random forest is pretty memory intensive, and
> because all the work is done in the mappers' close method, when the data is
> Big, Hadoop just thinks that the mappers have failed (I will solve this
> problem some day).

Periodically calling Reporter.progress() from the long-lived mapper typically fixes this.

-jake
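A minimal sketch of the heartbeat pattern being suggested: save the Reporter passed to map(), then call progress() between expensive units of work in close() so the TaskTracker doesn't kill the task as unresponsive. To keep this self-contained, the Reporter interface below is a stand-in for Hadoop's org.apache.hadoop.mapred.Reporter, and the loop body is illustrative; in the real forest code the "unit of work" would be something like growing one tree.

```java
public class HeartbeatSketch {

  // Stand-in for org.apache.hadoop.mapred.Reporter (assumption: only
  // progress() is needed for the heartbeat).
  interface Reporter {
    void progress();
  }

  // Test double that counts heartbeats so the pattern is observable.
  static class CountingReporter implements Reporter {
    int calls = 0;
    public void progress() { calls++; }
  }

  private Reporter reporter;

  // In a real old-API Mapper, map() receives the Reporter; we just
  // remember it so close() can report progress later.
  void map(String record, Reporter reporter) {
    this.reporter = reporter;
    // ... buffer the record for the heavy work done in close() ...
  }

  // All the expensive work happens here; without the progress() calls,
  // Hadoop would see a silent task and eventually fail it.
  void close() {
    for (int i = 0; i < 5; i++) {
      // ... one expensive unit of work (e.g. growing one tree) ...
      reporter.progress();  // heartbeat: tell the framework we're alive
    }
  }

  public static void main(String[] args) {
    HeartbeatSketch m = new HeartbeatSketch();
    CountingReporter r = new CountingReporter();
    m.map("row", r);
    m.close();
    System.out.println("progress calls: " + r.calls);
  }
}
```

The key detail is that close() has no Reporter parameter in the old mapred API, so stashing the one handed to map() is the usual way to keep reporting during the tail-end computation.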
