I train my LogisticRegressionModel as shown below. I want the model to retain
only some of the features (e.g. 500 of them), not all 5555. What should I
do?
I use .setElasticNetParam(1.0), but all the features are still present in
lrModel.coefficients.

import org.apache.spark.ml.classification.LogisticRegression
val data = spark.read.format("libsvm").option("numFeatures", "5555").load("/tmp/data/training_data3")
val Array(trainingData, testData) = data.randomSplit(Array(0.5, 0.5), seed = 1234L)
val lr = new LogisticRegression()
val lrModel = lr.fit(trainingData)
println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")
val predictions = lrModel.transform(testData)
predictions.show()

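For reference, a minimal sketch of where the L1 setting would go and how the retained features could be counted. It assumes a spark-shell session (so `spark` already exists), a binary label, and a non-zero regParam. The value 0.1 is an arbitrary placeholder, not a tuned setting. regParam defaults to 0.0, in which case elasticNetParam has no effect:

```scala
import org.apache.spark.ml.classification.LogisticRegression

// Assumes spark-shell, where `spark` is predefined, and the same data path as above.
val data = spark.read.format("libsvm").option("numFeatures", "5555").load("/tmp/data/training_data3")
val Array(trainingData, testData) = data.randomSplit(Array(0.5, 0.5), seed = 1234L)

val lr = new LogisticRegression()
  .setElasticNetParam(1.0) // 1.0 = pure L1 penalty
  .setRegParam(0.1)        // placeholder value (assumption); with the default 0.0 no penalty is applied

val lrModel = lr.fit(trainingData)

// The coefficient vector always has one entry per input feature (5555 here);
// a feature dropped by the L1 penalty appears as an exact 0.0, not as a missing entry.
val nonZero = lrModel.coefficients.toArray.count(_ != 0.0)
println(s"Non-zero coefficients: $nonZero of ${lrModel.coefficients.size}")
```

The number of non-zero coefficients is not set directly. It would presumably have to be steered by trying different regParam values until roughly the desired count (e.g. 500) remains.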
Thanks,
lujinhong