Hi! I'm playing with the random forest implementation in Apache Spark. My first impression is that it is not fast :-(
Does anybody know how random forest is parallelized in Spark? I mean both fitting and predicting.
Also, what do these parameters mean? I didn't find any documentation for them:
maxMemoryInMB=256, cacheNodeIds=False, checkpointInterval=10

Sergey