pass unique ID to mllib algorithms pyspark

jamborta Tue, 04 Nov 2014 02:32:02 -0800

Hi all, 

There are a few algorithms in pyspark where the prediction part is
implemented in Scala (e.g. ALS, decision trees), which makes it hard to
customize the prediction methods from Python.


I think it is a very common scenario that the user would like to generate
predictions for a dataset so that each predicted value is identifiable
(e.g. has a unique ID attached to it). This is not possible in the current
implementation, as the predict functions take a feature vector and return
the predicted values, where, I believe, the order is not guaranteed, so
there is no way to join them back with the original data the predictions
were generated from.
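
To make the scenario concrete, here is a minimal sketch in plain Python (no
Spark involved). The names batch_predict and records are made up for
illustration and are not part of any pyspark API; the toy "model" just sums
each feature vector. It only shows the shape of the problem: the predict call
sees bare feature vectors, so the IDs have to be re-attached afterwards by
position, which is safe only if the output order matches the input order.

```python
# Plain-Python stand-in for the scenario; nothing here touches Spark.
# `batch_predict` is a hypothetical placeholder for a model whose predict
# method only ever sees feature vectors, never the record IDs.

def batch_predict(feature_vectors):
    """Toy model: the prediction is the sum of each feature vector."""
    return [sum(vec) for vec in feature_vectors]

# Input records as (unique_id, feature_vector) pairs.
records = [("a", [1.0, 2.0]), ("b", [0.5, 0.5]), ("c", [3.0, 1.0])]

# The IDs must be split off before calling predict.
ids = [rid for rid, _ in records]
features = [vec for _, vec in records]

# What the predict call returns: bare values with no ID attached.
predictions = batch_predict(features)

# What I would like to end up with: (id, prediction) pairs.
# Pairing by position is correct ONLY if predictions come back in the
# same order as the input, which is the guarantee I am unsure about.
keyed = list(zip(ids, predictions))

for rid, value in keyed:
    print(rid, value)
```

In this toy version the order is trivially preserved because everything is a
local list; the open question is whether the same positional pairing can be
relied on for the distributed predict methods.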

Is there a way around this at the moment? 

thanks, 



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/pass-unique-ID-to-mllib-algorithms-pyspark-tp18051.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
