[GitHub] spark issue #17090: [Spark-19535][ML] RecommendForAllUsers RecommendForAllIt...

jkbradley Tue, 28 Feb 2017 11:11:23 -0800

Github user jkbradley commented on the issue:

    https://github.com/apache/spark/pull/17090
  
    @MLnick Thanks for showing those comparison numbers.  If your 
implementation is faster, then I'm happy going with it.  I do wonder whether we 
might hit scalability issues with RDDs that we would not hit with DataFrames, 
so it'd be worth revisiting a DF-based implementation later on.
    
    In terms of the API, my main worry about 
https://github.com/apache/spark/pull/12574 is that I haven't seen a full design 
of how ALS would be plugged into CrossValidator.  I still don't see how CV 
could handle ALS unless we specialized it for recommendation.  It was this 
uncertainty that led me to comment on 
https://issues.apache.org/jira/browse/SPARK-13857, recommending that we go 
ahead and merge basic recommendAll methods while continuing to figure out a 
good design for tuning.
    
    Feel free to push back, but I would really like to see a sketch of how ALS 
could plug into tuning.  I haven't spent the time to do a literature review on 
how tuning is generally done for recommendation, especially on the best ways to 
split the data into folds.
    
    > further methods to support recommending for all users (or items) in an 
input DF? like recommendForAllUsers(dataset: DataFrame, num: Int)
    
    I do think this sounds useful, but I'm focused on feature parity w.r.t. the 
RDD-based API right now.  It'd be nice to add later, though that could be via 
your proposed transform-based API.



