Also, we don't have any mappings for Spark Streaming, so if your implementation relies heavily on Spark Streaming, I think Spark itself is the right place for it.

On Tue, Jun 17, 2014 at 5:59 PM, Andy Twigg <andy.tw...@gmail.com> wrote:

> Hi Sebastian - sorry about the lack of activity here. I've looked at
> the scala dsl, but I think it makes more sense to push this work into
> MLLib as it really relies on spark streaming and RDDs. I'm not sure how
> you would build the streaming abstraction within the current DSL setup.
> Let me know if I'm missing something.
>
> On 17 May 2014 23:23, Sebastian Schelter (JIRA) <j...@apache.org> wrote:
> >
> > [ https://issues.apache.org/jira/browse/MAHOUT-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> >
> > Sebastian Schelter resolved MAHOUT-1153.
> > ----------------------------------------
> >
> > Resolution: Won't Fix
> >
> > no activity for more than a month
> >
> >> Implement streaming random forests
> >> ----------------------------------
> >>
> >> Key: MAHOUT-1153
> >> URL: https://issues.apache.org/jira/browse/MAHOUT-1153
> >> Project: Mahout
> >> Issue Type: New Feature
> >> Components: Classification
> >> Reporter: Andy Twigg
> >> Labels: features
> >> Fix For: 1.0
> >>
> >>
> >> The current random forest implementations are in-core and not scalable.
> >> This issue is to add an out-of-core, scalable, streaming implementation.
> >> Initially it could be based on [1], and using mappers in a master-worker
> >> style.
> >> [1] http://jmlr.csail.mit.edu/papers/volume11/ben-haim10a/ben-haim10a.pdf
> >
> > --
> > This message was sent by Atlassian JIRA
> > (v6.2#6252)