The in-memory implementation achieves roughly the same accuracy as the original Random Forests described by Breiman. The following Jira comment shows some results obtained on the same datasets used in Breiman's paper:
https://issues.apache.org/jira/browse/MAHOUT-122?focusedCommentId=12718777&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12718777

On Tue, Jul 12, 2011 at 4:13 PM, Ted Dunning <[email protected]> wrote:

> I don't believe that Mahout's random forests have been used in production.
> I have heard that some people got pretty good results in testing.
>
> On Tue, Jul 12, 2011 at 6:03 AM, Xiaobo Gu <[email protected]> wrote:
>
> > Hi,
> >
> > When the training data set can be loaded into memory, or each split
> > can be, what's accuracy of the decision forest algorithm, compared
> > with LogisticRegression. Do you have production usages with random
> > forest?
> >
> > Regards,
> >
> > Xiaobo Gu
