Re: Mahout 0.8 Random Forest Accuracy

2013-10-18 Thread Ted Dunning
On Fri, Oct 18, 2013 at 7:48 AM, Tim Peut t...@timpeut.com wrote:
> Has anyone found that Mahout's random forest doesn't perform as well as other implementations? If not, is there any reason why it wouldn't perform as well?

This is disappointing, but not entirely surprising. There has been

Re: Mahout 0.8 Random Forest Accuracy

2013-10-18 Thread Ted Dunning
On Fri, Oct 18, 2013 at 3:50 PM, j.barrett Strausser j.barrett.straus...@gmail.com wrote:
> How difficult would it be to wrap the RF classifier into an ensemble learner?

It is callable. Should be relatively easy.
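As a rough illustration of the wrapping idea, here is a minimal Python sketch of a majority-vote ensemble over classifiers that can each be invoked as a function from an instance to a label. It does not use Mahout's API; the class name `MajorityVoteEnsemble` and the three toy rules are hypothetical stand-ins for trained models, and "callable" is read loosely here as "can be invoked from other code".

```python
from collections import Counter


class MajorityVoteEnsemble:
    """Combine several classifiers, each a callable mapping an instance to a label."""

    def __init__(self, classifiers):
        if not classifiers:
            raise ValueError("need at least one classifier")
        self.classifiers = list(classifiers)

    def __call__(self, instance):
        # Collect one vote per member and return the most frequent label.
        votes = Counter(clf(instance) for clf in self.classifiers)
        return votes.most_common(1)[0][0]


# Hypothetical stand-ins for trained models (assumptions, not Mahout classes).
def rule_a(x):
    return "pos" if x[0] > 0.5 else "neg"


def rule_b(x):
    return "pos" if x[1] > 0.5 else "neg"


def rule_c(x):
    return "pos" if sum(x) > 1.0 else "neg"


ensemble = MajorityVoteEnsemble([rule_a, rule_b, rule_c])
print(ensemble((0.9, 0.2)))
```

Because the ensemble is itself callable, it could in principle be nested inside another ensemble in the same way.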

RE: Mahout 0.8 Random Forest Accuracy

2013-10-18 Thread DeBarr, Dave
Another difference... R's randomForest package (which RRF is based on) evaluates subsets of values when partitioning nominal values. [This is why it complains if there are more than 32 distinct values for a nominal variable.] For example, if our nominal variable has values { A, B, C, D }, the
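To make the subset-based splitting concrete, here is a minimal Python sketch that enumerates the distinct two-way partitions of a nominal variable's values. It is not code from R's randomForest or from Mahout; the function name `binary_partitions` is hypothetical. For k distinct values there are 2^(k-1) - 1 such partitions, so the number of candidates grows exponentially with k, which is consistent with a cap on the number of distinct values.

```python
from itertools import combinations


def binary_partitions(values):
    """Yield each way to split `values` into two non-empty groups, once."""
    values = sorted(set(values))
    if len(values) < 2:
        return  # nothing to split
    first, rest = values[0], values[1:]
    # Pin the first value to the left group so mirror-image splits are not repeated.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            left = (first,) + extra
            right = tuple(v for v in rest if v not in extra)
            yield left, right


splits = list(binary_partitions(["A", "B", "C", "D"]))
for left, right in splits:
    print(left, "|", right)
print(len(splits))  # 2**(4 - 1) - 1 = 7
```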

Re: Mahout 0.8 Random Forest Accuracy

2013-10-18 Thread Sean Owen
Yes I looked at the impl here, and I think it is aging, since I'm not sure Deneche had time to put in many bells or whistles at the start, and not sure it's been touched much since. My limited experience is that it generally does less clever stuff than R, which in turn is less clever than sklearn

Re: Mahout 0.8 Random Forest Accuracy

2013-10-18 Thread Tim Peut
Thanks for the info and suggestions, everyone.

On 19 October 2013 01:00, Ted Dunning ted.dunn...@gmail.com wrote:
> On Fri, Oct 18, 2013 at 3:50 PM, j.barrett Strausser j.barrett.straus...@gmail.com wrote:
>> How difficult would it be to wrap the RF classifier into an ensemble