Re: Proper saving/loading of MatrixFactorizationModel
I know that this hasn't been accepted yet, but is there any news on it? How can we cache the product and user factors?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Proper-saving-loading-of-MatrixFactorizationModel-tp23952p27959.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe e-mail: user-unsubscr...@spark.apache.org
Re: Proper saving/loading of MatrixFactorizationModel
The partitioner is not saved with the RDD. So when you load the model back, we lose the partitioner information. You can call repartition on the user/product factors and then create a new MatrixFactorizationModel object using the repartitioned RDDs. It would be useful to create a utility method for this, e.g., `MatrixFactorizationModel.repartition(num: Int): MatrixFactorizationModel`.

-Xiangrui

On Wed, Jul 22, 2015 at 4:34 AM, PShestov wrote:
> Hi all!
> I have MatrixFactorizationModel object. If I'm trying to recommend products
> to single user right after constructing model through ALS.train(...) then it
> takes 300ms (for my data and hardware). But if I save model to disk and load
> it back then recommendation takes almost 2000ms. Also Spark warns:
> 15/07/17 11:05:47 WARN MatrixFactorizationModel: User factor does not have a
> partitioner. Prediction on individual records could be slow.
> 15/07/17 11:05:47 WARN MatrixFactorizationModel: User factor is not cached.
> Prediction could be slow.
> 15/07/17 11:05:47 WARN MatrixFactorizationModel: Product factor does not
> have a partitioner. Prediction on individual records could be slow.
> 15/07/17 11:05:47 WARN MatrixFactorizationModel: Product factor is not
> cached. Prediction could be slow.
> How can I create/set partitioner and cache user and product factors after
> loading model? Following approach didn't help:
> model.userFeatures().cache();
> model.productFeatures().cache();
> Also I was trying to repartition those rdds and create new model from
> repartitioned versions but that also didn't help.
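For anyone landing on this thread, here is a minimal Scala sketch of the "rebuild the model from re-partitioned factors" idea described in the reply. It is an illustration under assumptions, not a method that ships with Spark: `ReloadedModel`, `withPartitionedFactors`, `loadTuned`, and the `numPartitions` parameter are hypothetical names. It uses `partitionBy` with a `HashPartitioner` rather than `repartition`, because as far as I can tell the RDD returned by plain `repartition` carries no partitioner, so the "does not have a partitioner" warning would still appear. The storage level is also just a guess at a reasonable default.

```scala
import org.apache.spark.{HashPartitioner, SparkContext}
import org.apache.spark.mllib.recommendation.MatrixFactorizationModel
import org.apache.spark.storage.StorageLevel

// Hypothetical helper object; none of these method names exist in Spark itself.
object ReloadedModel {

  // Build a new model whose factor RDDs have an explicit partitioner and are persisted.
  def withPartitionedFactors(
      model: MatrixFactorizationModel,
      numPartitions: Int): MatrixFactorizationModel = {
    val partitioner = new HashPartitioner(numPartitions)

    // partitionBy keeps the partitioner on the resulting pair RDD.
    val users = model.userFeatures
      .partitionBy(partitioner)
      .persist(StorageLevel.MEMORY_AND_DISK)
    val products = model.productFeatures
      .partitionBy(partitioner)
      .persist(StorageLevel.MEMORY_AND_DISK)

    // Persisting is lazy; count() forces both RDDs to be computed and stored now,
    // so the first recommendation does not pay for the shuffle.
    users.count()
    products.count()

    // The three-argument constructor is public: (rank, userFeatures, productFeatures).
    new MatrixFactorizationModel(model.rank, users, products)
  }

  // Load a saved model from `path` and immediately rebuild it with partitioned, cached factors.
  def loadTuned(sc: SparkContext, path: String, numPartitions: Int): MatrixFactorizationModel =
    withPartitionedFactors(MatrixFactorizationModel.load(sc, path), numPartitions)
}
```

Usage would look like `val model = ReloadedModel.loadTuned(sc, path, sc.defaultParallelism)` followed by the usual `model.recommendProducts(userId, n)`. Note that the warnings are still logged once while `MatrixFactorizationModel.load` builds the original model; they should not be repeated for the rebuilt one. Whether this brings latency back to the freshly trained level depends on the data and cluster.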
Proper saving/loading of MatrixFactorizationModel
Hi all!

I have a MatrixFactorizationModel object. If I recommend products to a single user right after constructing the model through ALS.train(...), it takes 300ms (for my data and hardware). But if I save the model to disk and load it back, the same recommendation takes almost 2000ms. Spark also warns:

    15/07/17 11:05:47 WARN MatrixFactorizationModel: User factor does not have a partitioner. Prediction on individual records could be slow.
    15/07/17 11:05:47 WARN MatrixFactorizationModel: User factor is not cached. Prediction could be slow.
    15/07/17 11:05:47 WARN MatrixFactorizationModel: Product factor does not have a partitioner. Prediction on individual records could be slow.
    15/07/17 11:05:47 WARN MatrixFactorizationModel: Product factor is not cached. Prediction could be slow.

How can I create/set a partitioner and cache the user and product factors after loading the model? The following approach didn't help:

    model.userFeatures().cache();
    model.productFeatures().cache();

I also tried to repartition those RDDs and create a new model from the repartitioned versions, but that didn't help either.