[GitHub] spark issue #12896: [SPARK-14489][ML][PYSPARK] ALS unknown user/item predict...

MLnick Mon, 01 Aug 2016 08:03:52 -0700

Github user MLnick commented on the issue:

    https://github.com/apache/spark/pull/12896
  
    Your suggestion is, to me, the ideal solution. It's probably the more 
common method of splitting "ratings" datasets for CV purposes.
    
    I'm interested in working on it, but I think it would need a whole new, 
dedicated cross-validator class. I'm not sure what the most efficient approach 
would be (see #14321 for a stratified sampling approach; it is aimed at labels 
and is not efficient for this case, but the general concept might apply). In 
short, it's clearly a lot more effort and will take time. Perhaps it could 
also start life outside of Spark as a package. I'm not sure about this yet, 
but I'm happy to collaborate on ideas!
    
    Originally, this PR was intended for `2.0`, to at least make ALS usable 
with the CV classes.



