It seems MLlib currently doesn't support weighted training: all training samples have equal importance. Weighted training can be very useful for reducing data size and speeding up training.
Do you have plans to support it in the future? The data format could be something like:

label:weight index1:value1 index2:value2 ...
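
To make the proposed format concrete, here is a rough sketch in plain Python. It is not an MLlib API: `WeightedPoint`, `parse_weighted_line`, and `collapse_duplicates` are hypothetical names, and treating a missing weight as 1.0 is my own assumption. It also shows how identical samples could be merged into a single weighted sample, which is where the reduction in data size would come from.

```python
from collections import Counter
from typing import Dict, List, NamedTuple


class WeightedPoint(NamedTuple):
    """Hypothetical container: a labeled point plus a sample weight."""
    label: float
    weight: float
    features: Dict[int, float]  # sparse features: index -> value


def parse_weighted_line(line: str) -> WeightedPoint:
    """Parse 'label:weight index1:value1 index2:value2 ...'."""
    head, *pairs = line.split()
    label_str, _, weight_str = head.partition(":")
    # Assumption: a line without ':weight' is an ordinary sample of weight 1.0.
    weight = float(weight_str) if weight_str else 1.0
    features = {}
    for pair in pairs:
        index, value = pair.split(":")
        features[int(index)] = float(value)
    return WeightedPoint(float(label_str), weight, features)


def collapse_duplicates(lines: List[str]) -> List[str]:
    """Merge identical unweighted lines into one weighted line each."""
    # Normalize whitespace so that equal samples compare equal.
    counts = Counter(" ".join(line.split()) for line in lines)
    merged = []
    for line, count in counts.items():
        label, _, rest = line.partition(" ")
        merged.append(f"{label}:{count} {rest}".rstrip())
    return merged
```

For example, `collapse_duplicates(["1 1:1.0", "1 1:1.0", "0 2:3.0"])` gives two weighted lines instead of three unweighted ones, and each of them can be read back with `parse_weighted_line`.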