Yes, we could also consider committing it into the current Mahout code base. There are probably some advantages over the current implementation. What direction are you thinking of?

On 2 March 2014 13:57, Suneel Marthi (JIRA) <[email protected]> wrote:
>
> [
> https://issues.apache.org/jira/browse/MAHOUT-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917581#comment-13917581
> ]
>
> Suneel Marthi commented on MAHOUT-1153:
> ---------------------------------------
>
> [~andytwigg] I understand this has been implemented on Spark and an
> implementation is available at (http://featurestream.io), do u think we
> should start the conversation of rolling this into Mahout?
>
>> Implement streaming random forests
>> ----------------------------------
>>
>> Key: MAHOUT-1153
>> URL: https://issues.apache.org/jira/browse/MAHOUT-1153
>> Project: Mahout
>> Issue Type: New Feature
>> Components: Classification
>> Reporter: Andy Twigg
>> Labels: features
>> Fix For: Backlog
>>
>>
>> The current random forest implementations are in-core and not scalable. This
>> issue is to add an out-of-core, scalable, streaming implementation.
>> Initially it could be based on [1], and using mappers in a master-worker
>> style.
>> [1] http://jmlr.csail.mit.edu/papers/volume11/ben-haim10a/ben-haim10a.pdf
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.2#6252)
