Hello, I am using PySpark to train a logistic regression model with cross-validation via Spark ML. My dataset is, for testing purposes, very small: no more than 50 training records. On the other hand, my "feature" column is very wide, i.e. 1500+ features per vector.
I am running on YARN with 3 executors, each with 4 GB of memory and 4 cores. I use cache() to store the DataFrames. Unfortunately, the process never finishes: it hangs during cross-validation. Any clues? Thanks, Simone