Hi all,

I believe I have created a multi-label classifier using LogisticRegression, but there is one snag: no matter what features I use to get the prediction, it always returns the label. I feel like I need to set a threshold but can't figure out how to do that. I've attached the code below; it's super simple. Hopefully someone can point me in the correct direction:

val labels = labeledPoints.map(l => l.label).take(1000).distinct // stupid hack
val groupedRDDs = labels.map { l => labeledPoints.filter(m => m.label == l) }.map(l => l.cache()) // should use groupBy
val models = groupedRDDs.map(rdd => new LogisticRegressionWithLBFGS().setNumClasses(101).run(rdd))
val results = models.map(m => m.predict(Vectors.dense(query.features)))

Thanks
Peter