Hi all,
Back in September, a set of machine learning benchmark results was
published here:
https://github.com/szilard/benchm-ml/
Spark's Random Forest appeared to fail with memory issues at about
10M entries:
https://github.com/szilard/benchm-ml/blob/master/2-rf/5c-spark-crash.txt
It was discussed for a bit here:
https://github.com/szilard/benchm-ml/issues/19
But I haven't seen an update. Is there an open ticket on the Spark JIRA?
I didn't see any in the searches I made:
https://issues.apache.org/jira/issues/?jql=text%20~%20%22bench-ml%22
https://issues.apache.org/jira/issues/?jql=text%20~%20%22randomforest%20gc%22
I have a user who is trying to use Spark's RF implementation but is
running into memory issues which look exactly like the ones seen in the
benchmarking example.
Thanks,
Ewan