Hi Sean,

I agree there is no need for that if the implementation effectively assigns c=1 to all missing ratings, but from the current implementation of ALS, I don't think it does. In the paper, missing ratings are assigned c=1, and they do contribute to the optimization of equation (3).
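To spell out what I mean, here is the objective as I read it from the paper, with p and c as defined there. Splitting the sum into observed and missing pairs is my own rearrangement, and it assumes r_ui >= 0:

```latex
\min_{x_*,\,y_*} \; \sum_{u,i} c_{ui}\,\bigl(p_{ui} - x_u^{T} y_i\bigr)^2
  + \lambda \Bigl( \sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2 \Bigr),
\qquad
p_{ui} = \begin{cases} 1 & r_{ui} > 0 \\ 0 & r_{ui} = 0 \end{cases},
\qquad
c_{ui} = 1 + \alpha\, r_{ui}

\sum_{u,i} c_{ui}\,\bigl(p_{ui} - x_u^{T} y_i\bigr)^2
  = \sum_{(u,i):\, r_{ui} > 0} (1 + \alpha\, r_{ui})\,\bigl(1 - x_u^{T} y_i\bigr)^2
  \; + \sum_{(u,i):\, r_{ui} = 0} \bigl(x_u^{T} y_i\bigr)^2
```

The second sum is the contribution of the missing ratings: each such pair has p=0 and c=1, so it adds (x^Ty)^2 to the loss.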
The lines of code that I'm referring to are:

{code}
if (implicitPrefs) {
  // Extension to the original paper to handle b < 0. confidence is a function of |b|
  // instead so that it is never negative. c1 is confidence - 1.0.
  val c1 = alpha * math.abs(rating)
  // For rating <= 0, the corresponding preference is 0. So the term below is only added
  // for rating > 0. Because YtY is already added, we need to adjust the scaling here.
  if (rating > 0) {
    numExplicits += 1
    ls.add(srcFactor, (c1 + 1.0) / c1, c1)
  }
} else {
  ls.add(srcFactor, rating)
  numExplicits += 1
}
{code}

Regards,

Jerry

On Mon, Dec 5, 2016 at 3:27 PM, Sean Owen <so...@cloudera.com> wrote:

> That doesn't mean this 0 value is literally included in the input. There's
> no need for that.
>
> On Tue, Dec 6, 2016 at 4:24 AM Jerry Lam <chiling...@gmail.com> wrote:
>
>> Hi Sean,
>>
>> I'm referring to the paper (http://yifanhu.net/PUB/cf.pdf) Section 2:
>> " However, with implicit feedback it would be natural to assign values to
>> all rui variables. If no action was observed rui is set to zero, thus
>> meaning in our examples zero watching time, or zero purchases on record."
>>
>> In the implicit setting, apparently there should be values for all
>> pairs (u, i) instead of just the observed ones according to the paper. This
>> is also true for other implicit feedback papers I read.
>>
>> In section 4, when r=0, p=0 BUT c=1. Therefore, when we optimize the
>> value for this pair, the term is (x^Ty)^2 + regularization.
>>
>> Do I misunderstand the paper?
>>
>> Best Regards,
>>
>> Jerry
>>
>>
>> On Mon, Dec 5, 2016 at 2:43 PM, Sean Owen <so...@cloudera.com> wrote:
>>
>> What are you referring to in what paper? implicit input would never
>> materialize 0s for missing values.
>>
>> On Tue, Dec 6, 2016 at 3:42 AM Jerry Lam <chiling...@gmail.com> wrote:
>>
>> Hello spark users and developers,
>>
>> I read the paper from Yahoo about CF with implicit feedback and other
>> papers using implicit feedback. Their implementations require setting the
>> missing ratings to 0. That is, for unobserved ratings, the confidence is
>> set to 1 (c=1). Therefore, the matrix to be factorized is a dense
>> matrix.
>>
>> I read the source code of the ALS implementation in spark (version 1.6.x)
>> for implicit feedback. Apparently, it ignores ratings that are 0 (Line 1159
>> in ALS.scala). It could be a mistake or it could be an optimization. Just
>> want to see if anyone has run into this yet.
>>
>> Best Regards,
>>
>> Jerry
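
One possible reading of the comment "Because YtY is already added" in the snippet above is that the c=1 contribution of every item, missing or not, enters through a precomputed YtY, and only the observed ratings need an extra (c - 1) correction. The sketch below checks that identity numerically for a single user. It is plain Scala written for illustration, not Spark's code: the object and method names are hypothetical, the item factors and ratings are made-up toy values, and it assumes positive observed ratings with c = 1 + alpha * r as in the paper.

```scala
object ImplicitGramCheck {
  type Matrix = Array[Array[Double]]

  // Accumulate w * y * y^T into acc in place.
  def addOuter(acc: Matrix, y: Array[Double], w: Double): Unit =
    for (a <- y.indices; b <- y.indices) acc(a)(b) += w * y(a) * y(b)

  // Dense form: sum over ALL items of c_ui * y_i * y_i^T,
  // where a missing rating is treated as r = 0, hence c = 1.
  def denseGram(items: Matrix, ratings: Map[Int, Double], alpha: Double): Matrix = {
    val k = items.head.length
    val gram = Array.fill(k, k)(0.0)
    for (i <- items.indices) {
      val r = ratings.getOrElse(i, 0.0)
      addOuter(gram, items(i), 1.0 + alpha * r)
    }
    gram
  }

  // Sparse form: YtY over all items once, then a (c_ui - 1) = alpha * r
  // correction for the observed ratings only (assumed positive here).
  def sparseGram(items: Matrix, ratings: Map[Int, Double], alpha: Double): Matrix = {
    val k = items.head.length
    val gram = Array.fill(k, k)(0.0)
    for (y <- items) addOuter(gram, y, 1.0)
    for ((i, r) <- ratings) addOuter(gram, items(i), alpha * r)
    gram
  }

  // Largest element-wise absolute difference between two matrices.
  def maxAbsDiff(a: Matrix, b: Matrix): Double =
    (for (r <- a.indices; c <- a(r).indices) yield math.abs(a(r)(c) - b(r)(c))).max

  def main(args: Array[String]): Unit = {
    // Toy data: four items with rank-2 factors, two of them rated by the user.
    val items: Matrix = Array(
      Array(0.5, -1.0), Array(2.0, 0.25), Array(-0.75, 1.5), Array(1.0, 1.0))
    val ratings = Map(1 -> 3.0, 3 -> 1.0)
    val diff = maxAbsDiff(denseGram(items, ratings, 40.0), sparseGram(items, ratings, 40.0))
    println(s"max abs difference between dense and sparse Gram matrices: $diff")
  }
}
```

If `ls.add(a, b, c)` accumulates c * a * a^T into the left-hand side (my assumption about its behaviour, not something confirmed in this thread), then the `c1` weight in the snippet would correspond to the per-rating correction in `sparseGram`, and the zeros would never need to appear in the input.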