Hi Suneel,

I spent a significant amount of effort trying to get this working against Mahout 0.8, but unfortunately it seemed like a bad fit. Instead I wrote a version against Spark, which is now available as a service: http://featurestream.io
I'm open to open-sourcing it, but I wanted to see what use cases would come out of it first. If anyone has any good ideas, let me know.

Cheers,
Andy

--
[email protected]

On 5 November 2013 05:27, Suneel Marthi (JIRA) <[email protected]> wrote:
>
> [ https://issues.apache.org/jira/browse/MAHOUT-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813912#comment-13813912 ]
>
> Suneel Marthi commented on MAHOUT-1153:
> ---------------------------------------
>
> Hey Andy,
>
> The github link doesn't work anymore, do u think this can be part of 0.9?
>
> > Implement streaming random forests
> > ----------------------------------
> >
> >         Key: MAHOUT-1153
> >         URL: https://issues.apache.org/jira/browse/MAHOUT-1153
> >     Project: Mahout
> >  Issue Type: New Feature
> >  Components: Classification
> >    Reporter: Andy Twigg
> >      Labels: features
> >     Fix For: Backlog
> >
> >
> > The current random forest implementations are in-core and not scalable.
> > This issue is to add an out-of-core, scalable, streaming implementation.
> > Initially it could be based on [1], and using mappers in a master-worker
> > style.
> >
> > [1] http://jmlr.csail.mit.edu/papers/volume11/ben-haim10a/ben-haim10a.pdf
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.1#6144)
