Hi, I'm running a RandomForest model from the Spark ML API on a medium-sized dataset (2.25 million rows, 60 features). Most of my runtime is spent in RandomForest's internal collectAsMap stage, but I have no way to avoid it since it is part of the API.
Is there any way to shorten my end-to-end runtime? Thanks, Aakash.
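For context, here is a minimal sketch of the kind of training job described above. The input path, column names, and parameter values are assumptions for illustration, not taken from my actual code; the highlighted parameters (numTrees, maxDepth, maxBins, subsamplingRate) are the ones that usually dominate training time:

```scala
import org.apache.spark.ml.classification.RandomForestClassifier
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SparkSession

object RFSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("rf-sketch").getOrCreate()

    // Assumed input: ~2.25M rows, 60 numeric feature columns plus a "label" column.
    val df = spark.read.parquet("/path/to/data") // hypothetical path

    // Assemble the 60 feature columns into a single vector column.
    val featureCols = df.columns.filter(_ != "label")
    val assembler = new VectorAssembler()
      .setInputCols(featureCols)
      .setOutputCol("features")

    // Cache the assembled data so it is not recomputed for each tree-building pass.
    val assembled = assembler.transform(df).cache()

    // Illustrative parameter values; these are the main knobs that trade
    // accuracy against training time.
    val rf = new RandomForestClassifier()
      .setLabelCol("label")
      .setFeaturesCol("features")
      .setNumTrees(100)
      .setMaxDepth(10)
      .setMaxBins(32)
      .setSubsamplingRate(0.8)

    val model = rf.fit(assembled)
    spark.stop()
  }
}
```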