Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/18022#discussion_r117192420
--- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala ---
@@ -1624,15 +1628,15 @@ object ALS extends DefaultParamsReadable[ALS] with Logging {
           val srcFactor = sortedSrcFactors(blockId)(localIndex)
           val rating = ratings(i)
           if (implicitPrefs) {
-            // Extension to the original paper to handle b < 0. confidence is a function of |b|
-            // instead so that it is never negative. c1 is confidence - 1.0.
+            // Extension to the original paper to handle rating < 0. confidence is a function
+            // of |rating| instead so that it is never negative. c1 is confidence - 1.
             val c1 = alpha * math.abs(rating)
-            // For rating <= 0, the corresponding preference is 0. So the term below is only added
-            // for rating > 0. Because YtY is already added, we need to adjust the scaling here.
-            if (rating > 0) {
+            // For rating <= 0, the corresponding preference is 0. So the second argument of add
+            // is only there for rating > 0.
+            if (rating > 0.0) {
               numExplicits += 1
-              ls.add(srcFactor, (c1 + 1.0) / c1, c1)
             }
+            ls.add(srcFactor, if (rating > 0.0) 1.0 + c1 else 0.0, c1)
--- End diff --
Is this the substance of the change? I might need some help understanding
why this is needed. Yes, even negative values should be recorded for implicit
prefs, I agree. It adds 1 + c1 now instead of (1 + c1) / c1, so that's why the
factor of c is taken out above?
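For readers following the question above, here is a minimal, self-contained Scala sketch of why the two call styles can produce the same right-hand side. Note the assumptions: ToyNormalEquation is a hypothetical stand-in (not Spark's mllib NormalEquation), the item factor y is made up, and add(a, b, c) is assumed to fold c * a * a^T into AtA and b * a into Atb.

object ImplicitAlsSketch {

  // Toy accumulator, assuming add(a, b, c) folds c * a * a^T into ata and b * a into atb.
  final class ToyNormalEquation(k: Int) {
    val ata: Array[Array[Double]] = Array.ofDim[Double](k, k)
    val atb: Array[Double] = new Array[Double](k)

    def add(a: Array[Double], b: Double, c: Double = 1.0): this.type = {
      require(a.length == k && c >= 0.0)
      var i = 0
      while (i < k) {
        var j = 0
        while (j < k) {
          ata(i)(j) += c * a(i) * a(j)
          j += 1
        }
        if (b != 0.0) atb(i) += b * a(i)
        i += 1
      }
      this
    }
  }

  def main(args: Array[String]): Unit = {
    val alpha = 1.0
    val rating = 2.5                        // one implicit-feedback observation
    val y = Array(0.3, -0.7, 1.1)           // a made-up item factor
    val c1 = alpha * math.abs(rating)       // c1 = confidence - 1, as in the diff

    // Call style from the diff: preference (0 or 1) times confidence is passed directly as b.
    val neNew = new ToyNormalEquation(3)
    neNew.add(y, if (rating > 0.0) 1.0 + c1 else 0.0, c1)

    // Equivalent of the old call style, assuming the old accumulator scaled b by c
    // internally (atb += c * b * a): c1 * ((c1 + 1.0) / c1) is again 1 + c1.
    val neOld = new ToyNormalEquation(3)
    neOld.add(y, c1 * ((c1 + 1.0) / c1), c1)

    // Both accumulate (1 + c1) * y into the right-hand side, so removing the internal
    // factor of c and dropping the 1/c1 in the caller at the same time leaves the math unchanged.
    println(neNew.atb.toSeq)
    println(neOld.atb.toSeq)
  }
}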