GitHub user mdagost opened a pull request:
https://github.com/apache/spark/pull/2636
SPARK-3770: Make userFeatures accessible from python
https://issues.apache.org/jira/browse/SPARK-3770
We need access to the underlying latent user features from Python. However,
the userFeatures RDD of the MatrixFactorizationModel isn't accessible from
the Python bindings. I've added a method to the underlying Scala class that
converts the RDD[(Int, Array[Double])] to an RDD[String]. This RDD is then
accessed from the Python side in recommendation.py.
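
As a rough illustration of the round trip described above, here is a minimal
sketch in plain Python. The comma-separated layout ("id,f1,f2,...") and the
helper names stringify_user_features and parse_user_features are assumptions
made for this example; they are not necessarily the format or names used in
the patch.

```python
def stringify_user_features(user_id, features):
    """Encode one (user_id, features) pair as a single string.

    Hypothetical layout: "id,f1,f2,...". This mirrors the idea of turning
    each (Int, Array[Double]) element into a String; the real patch may
    use a different encoding.
    """
    return ",".join([str(user_id)] + [repr(x) for x in features])


def parse_user_features(line):
    """Decode a string produced by stringify_user_features.

    Returns a (user_id, features) tuple with an int id and a list of floats.
    """
    fields = line.split(",")
    return int(fields[0]), [float(x) for x in fields[1:]]


if __name__ == "__main__":
    encoded = stringify_user_features(7, [0.5, -1.25])
    print(encoded)
    print(parse_user_features(encoded))
```

On an RDD of such strings, the parsing helper would typically be applied with
a map over the elements, so each string becomes an (id, features) pair again.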
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mdagost/spark mf_user_features
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/2636.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2636
----
commit e1fbe5e82a6b9436ce745175670cd005f6481173
Author: Michelangelo D'Agostino <[email protected]>
Date: 2014-10-02T13:33:45Z
Added scala function to stringify userFeatures for access in python.
commit cdd98e3a43cc465844a3b38432f4edc679ffa0dd
Author: Michelangelo D'Agostino <[email protected]>
Date: 2014-10-02T16:05:48Z
It's working now.
commit 34cb2a2889649e3f29f1686745320884f1fbc945
Author: Michelangelo D'Agostino <[email protected]>
Date: 2014-10-02T21:41:51Z
A couple of lint cleanups and a comment.
----