On Tue, Jun 8, 2010 at 9:19 PM, deneche abdelhakim <[email protected]>wrote:

> mapred.max.split.size controls how many partitions will be generated from
> the data.
> The current implementation of random forest is pretty memory intensive, and
> because all the work is done in the mappers' close method, when the data is
> big, Hadoop just thinks that the mappers have failed (I will solve this
> problem some day).
>

Periodically calling Reporter.progress() from the long-lived mapper typically
fixes this: it tells the framework the task is still alive, so it won't be
killed as hung.
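The pattern can be sketched without Hadoop on the classpath. Here Progressable
is a hypothetical stand-in for Hadoop's Reporter, and buildForest is an
illustrative placeholder for the work done in the mapper's close method; none
of these names come from Mahout's actual code:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in for Hadoop's Reporter: the only capability the pattern
// needs is a way to signal "still making progress".
interface Progressable {
    void progress();
}

public class LongRunningWork {
    // Report every REPORT_EVERY iterations so the framework never
    // goes a long stretch without hearing from the task.
    static final int REPORT_EVERY = 1000;

    // Placeholder for the expensive loop (e.g. growing trees) that
    // would otherwise run silently until the task timeout fires.
    static long buildForest(int trees, Progressable reporter) {
        long work = 0;
        for (int i = 0; i < trees; i++) {
            work += i;               // stand-in for growing one tree
            if (i % REPORT_EVERY == 0) {
                reporter.progress(); // keep the task marked alive
            }
        }
        return work;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        buildForest(10_000, calls::incrementAndGet);
        System.out.println("progress calls: " + calls.get());
    }
}
```

In a real Mahout mapper the reporter passed to map() (or the Context in the
new API) would be stashed in a field and its progress() called from inside
close(), since that is where the heavy work happens.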

  -jake
