Hi Tomasz,

Parallel evaluation for CrossValidator and TrainValidationSplit was added in Spark 2.3; see https://issues.apache.org/jira/browse/SPARK-19357
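
For illustration, here is a minimal sketch of how that could look with the code from your question, assuming Spark 2.3 or later. The LogisticRegression estimator, the grid values, the evaluator, and the parallelism level of 4 are placeholders I picked for the example, not anything from your setup; setParallelism is the parameter that controls how many models are fitted and evaluated concurrently.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

// Hypothetical estimator and pipeline, standing in for the ones in the question.
val lr = new LogisticRegression()
val pipeline = new Pipeline().setStages(Array(lr))

// Small placeholder grid (3 x 3 = 9 combinations); a real grid may be much larger.
val paramGrid = new ParamGridBuilder()
  .addGrid(lr.regParam, Array(0.01, 0.1, 1.0))
  .addGrid(lr.elasticNetParam, Array(0.0, 0.5, 1.0))
  .build()

val cv = new CrossValidator()
  .setNumFolds(3)
  .setEstimator(pipeline)
  .setEstimatorParamMaps(paramGrid)
  .setEvaluator(new BinaryClassificationEvaluator())
  .setParallelism(4) // Spark 2.3+: fit and evaluate up to 4 models at the same time

// trainingData is a hypothetical DataFrame with "features" and "label" columns:
// val cvModel = cv.fit(trainingData)
```

The default parallelism is 1, which keeps the old one-model-at-a-time behaviour. A sensible value depends on the cluster size and on how much of the cluster a single model fit already uses, so it is worth trying a few settings.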

On Wed, 29 Nov 2017 at 16:31 Tomasz Dudek <megatrontomaszdu...@gmail.com> wrote:

> Hey,
>
> is there a way to make the following code:
>
> val paramGrid = new ParamGridBuilder().//omitted for brevity - lets say we
> have hundreds of param combinations here
>
> val cv = new
> CrossValidator().setNumFolds(3).setEstimator(pipeline).setEstimatorParamMaps(paramGrid)
>
> automatically distribute itself over all the executors? What I mean is
> to simultaneously compute few(or hundreds of it) ML models, instead of
> using all the computation power on just one model at time.
>
> If not, is such behavior in the Spark's road map?
>
> ...if not, do you think a person without prior Spark development
> experience(me) could do it? I'm using SparkML daily, since few months, at
> work. How much time would it take, approximately?
>
> Yours,
> Tomasz