Hi all,
Back in September, a set of machine learning benchmark results was published here:
https://github.com/szilard/benchm-ml/

Spark's Random Forest appeared to fail with memory issues at about 10 million entries:
https://github.com/szilard/benchm-ml/blob/master/2-rf/5c-spark-crash.txt

It was discussed for a bit here:
https://github.com/szilard/benchm-ml/issues/19

But I haven't seen an update. Is there an open ticket on the Spark JIRA?

I didn't see any in the searches I made:
https://issues.apache.org/jira/issues/?jql=text%20~%20%22bench-ml%22
https://issues.apache.org/jira/issues/?jql=text%20~%20%22randomforest%20gc%22

I have a user who is trying to use Spark's RF implementation but is running into memory issues that look exactly like the ones seen in the benchmark.

Thanks,
Ewan
