Ted gave a very good summary of the situation. I do have plans to get rid of
the memory limitation and have already started working on a solution, but
unfortunately I lack the time and motivation to get it done :(

On Sun, Aug 14, 2011 at 11:12 AM, Xiaobo Gu <[email protected]> wrote:

> Do you have any plans to get rid of the memory limitation in Random Forest?
>
> Regards,
>
> Xiaobo Gu
>
> On Thu, Jul 7, 2011 at 11:48 PM, Ted Dunning <[email protected]>
> wrote:
> > The short answer is that this was a summer project, and parallelizing
> > the random forest algorithm at all was a big enough project.
> >
> > Writing a single-pass on-line algorithm was considered a bit much for a
> > project of that size.  Figuring out how to make multiple passes through
> > an input split was similarly out of scope.
> >
> > If you have a good alternative, this would be of substantial interest
> > because it could improve the currently limited scalability of the
> > decision forest code.
> >
> > On Thu, Jul 7, 2011 at 8:20 AM, Xiaobo Gu <[email protected]>
> > wrote:
> >
> >> Why can't a tree be built against a dataset that resides on disk, as
> >> long as we can read it?
> >>
> >
>
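
For anyone curious what the "multiple passes through an input split"
approach mentioned above might look like, here is a very rough sketch
(plain Java, not Mahout's actual API; the CSV input format, class names,
and the fixed threshold are assumptions for illustration). The idea is
to keep only counts in memory and re-scan the disk-resident data once
per candidate split instead of loading the whole partition:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class MultiPassSplit {

  // Accumulates how many records fall on each side of a candidate
  // numeric threshold; an impurity score can be computed from these.
  static class SplitStats {
    long leftCount, rightCount;
    void accept(double value, double threshold) {
      if (value <= threshold) leftCount++; else rightCount++;
    }
  }

  // One sequential pass over a disk-resident CSV file: never holds
  // more than a single line in memory.
  static SplitStats scanOnce(String path, int attr, double threshold)
      throws IOException {
    SplitStats stats = new SplitStats();
    try (BufferedReader in = new BufferedReader(new FileReader(path))) {
      String line;
      while ((line = in.readLine()) != null) {
        double value = Double.parseDouble(line.split(",")[attr]);
        stats.accept(value, threshold);
      }
    }
    return stats;
  }

  public static void main(String[] args) throws IOException {
    SplitStats s = scanOnce(args[0], 0, 5.0);
    System.out.println("left=" + s.leftCount + " right=" + s.rightCount);
  }
}

Growing a full tree this way trades memory for I/O: one sequential scan
per evaluated split rather than an in-memory copy of the entire
partition, which is exactly the trade-off that was out of scope for the
original summer project.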
