As I recall, it is in there in the math, but doesn't appear as an explicit term in the computation. You don't actually materialize the 0 input or the "c=1" corresponding to them.
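
For reference, a sketch of the algebra as I understand it, using the paper's notation for a single user u (treat this as my reading of the paper, not a quote from it). Unobserved items have p_ui = 0 and c_ui = 1, so they appear only through the Y^T Y term, which is computed once and shared by all users:

```latex
\begin{aligned}
\min_{x_u}\ & \sum_{i} c_{ui}\,\bigl(p_{ui} - x_u^{T} y_i\bigr)^2 + \lambda \lVert x_u \rVert^2 \\
\Rightarrow\ & \bigl(Y^{T} C^{u} Y + \lambda I\bigr)\, x_u = Y^{T} C^{u}\, p(u) \\
& Y^{T} C^{u} Y = Y^{T} Y + \sum_{i:\ r_{ui} > 0} (c_{ui} - 1)\, y_i y_i^{T} \\
& Y^{T} C^{u}\, p(u) = \sum_{i:\ r_{ui} > 0} c_{ui}\, y_i
\end{aligned}
```

Both sums on the right run over observed items only, which is why the zeros never need to be stored.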
Or: do you have a computation that agrees with the paper but not this code? Put another way, none of this would scale/work if you had to materialize and compute all of these elements.

On Tue, Dec 6, 2016 at 5:55 AM Jerry Lam <chiling...@gmail.com> wrote:

> Hi Sean,
>
> I agree there is no need for that if the implementation actually assigns
> c=1 for all missing ratings but from the current implementation of ALS, I
> don't think it is doing that.
> The idea is that missing ratings are assigned c=1 (in the
> paper) and they do contribute to the optimization of equation (3).
>
> The lines of code that I'm referring to are:
>
> {code}
> if (implicitPrefs) {
>   // Extension to the original paper to handle b < 0. confidence is a function of |b|
>   // instead so that it is never negative. c1 is confidence - 1.0.
>   val c1 = alpha * math.abs(rating)
>   // For rating <= 0, the corresponding preference is 0. So the term below is only added
>   // for rating > 0. Because YtY is already added, we need to adjust the scaling here.
>   if (rating > 0) {
>     numExplicits += 1
>     ls.add(srcFactor, (c1 + 1.0) / c1, c1)
>   }
> } else {
>   ls.add(srcFactor, rating)
>   numExplicits += 1
> }
> {code}
>
> Regards,
>
> Jerry
>
>
> On Mon, Dec 5, 2016 at 3:27 PM, Sean Owen <so...@cloudera.com> wrote:
>
> That doesn't mean this 0 value is literally included in the input. There's
> no need for that.
>
> On Tue, Dec 6, 2016 at 4:24 AM Jerry Lam <chiling...@gmail.com> wrote:
>
> Hi Sean,
>
> I'm referring to the paper (http://yifanhu.net/PUB/cf.pdf) Section 2:
> "However, with implicit feedback it would be natural to assign values to
> all r_ui variables. If no action was observed r_ui is set to zero, thus
> meaning in our examples zero watching time, or zero purchases on record."
>
> In the implicit setting, apparently there should be values for all pairs
> (u, i) instead of just the observed ones according to the paper. This is
> also true for other implicit feedback papers I read.
>
> In section 4, when r=0, p=0 BUT c=1. Therefore, when we optimize, the term
> for this pair is (x^T y)^2 + regularization.
>
> Do I misunderstand the paper?
>
> Best Regards,
>
> Jerry
>
>
> On Mon, Dec 5, 2016 at 2:43 PM, Sean Owen <so...@cloudera.com> wrote:
>
> What are you referring to in what paper? implicit input would never
> materialize 0s for missing values.
>
> On Tue, Dec 6, 2016 at 3:42 AM Jerry Lam <chiling...@gmail.com> wrote:
>
> Hello spark users and developers,
>
> I read the paper from Yahoo about CF with implicit feedback and other
> papers using implicit feedback. Their implementations require setting the
> missing ratings to 0. That is, for unobserved ratings, the confidence
> is set to 1 (c=1). Therefore, the matrix to be factorized is a dense
> matrix.
>
> I read the source code of the ALS implementation in spark (version 1.6.x)
> for implicit feedback. Apparently, it ignores ratings that are 0 (Line 1159
> in ALS.scala). It could be a mistake or it could be an optimization. I just
> want to see whether anyone has run into this yet.
>
> Best Regards,
>
> Jerry
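
For anyone who wants to check the question in this thread numerically, here is a minimal sketch in plain Scala. It is not Spark's code: the object name, the helper functions, the four item factors and the two ratings are all made up for illustration. It builds one user's normal equations twice, once densely with every missing rating materialized as r=0, p=0, c=1, and once starting from YtY and adding only the observed items with the same scaling as the quoted `ls.add(srcFactor, (c1 + 1.0) / c1, c1)` call. I am assuming, from my reading of the Spark source, that `add(a, b, c)` accumulates `c * a * a^T` into the matrix and `c * b * a` into the right-hand side.

```scala
object ImplicitAlsCheck {
  type Vec = Array[Double]
  type Mat = Array[Array[Double]]

  // Returns m + scale * (y outer y) as a new matrix.
  def addOuter(m: Mat, y: Vec, scale: Double): Mat =
    m.indices.map(r => m(r).indices.map(c => m(r)(c) + scale * y(r) * y(c)).toArray).toArray

  // Returns v + scale * y as a new vector.
  def addScaled(v: Vec, y: Vec, scale: Double): Vec =
    v.indices.map(k => v(k) + scale * y(k)).toArray

  val rank = 2
  val alpha = 40.0
  // Hypothetical item factors y_i for four items.
  val itemFactors: Array[Vec] = Array(
    Array(0.5, 1.0), Array(-1.0, 0.25), Array(2.0, -0.5), Array(0.75, 0.75))
  // One user's observed ratings keyed by item index; items 1 and 3 are unobserved.
  val observed: Map[Int, Double] = Map(0 -> 3.0, 2 -> 1.0)

  def zerosMat: Mat = Array.fill(rank, rank)(0.0)
  def zerosVec: Vec = Array.fill(rank)(0.0)

  // Dense version: every item contributes, with r = 0, p = 0, c = 1 when unobserved.
  def dense: (Mat, Vec) =
    itemFactors.indices.foldLeft((zerosMat, zerosVec)) { case ((a, b), i) =>
      val r = observed.getOrElse(i, 0.0)
      val c = 1.0 + alpha * r
      val p = if (r > 0) 1.0 else 0.0
      (addOuter(a, itemFactors(i), c), addScaled(b, itemFactors(i), c * p))
    }

  // Sparse version: start from YtY, then add corrections for observed items only.
  def sparse: (Mat, Vec) = {
    val yty = itemFactors.foldLeft(zerosMat)((a, y) => addOuter(a, y, 1.0))
    observed.foldLeft((yty, zerosVec)) { case ((a, b), (i, r)) =>
      val c1 = alpha * math.abs(r) // confidence - 1.0, as in the quoted code
      // Mirrors ls.add(srcFactor, (c1 + 1.0) / c1, c1) under the assumption above.
      (addOuter(a, itemFactors(i), c1), addScaled(b, itemFactors(i), c1 * ((c1 + 1.0) / c1)))
    }
  }

  // Largest element-wise absolute difference between two equally long sequences.
  def maxAbsDiff(x: Seq[Double], y: Seq[Double]): Double =
    x.zip(y).map { case (u, v) => math.abs(u - v) }.max

  def main(args: Array[String]): Unit = {
    val (aDense, bDense) = dense
    val (aSparse, bSparse) = sparse
    println(maxAbsDiff(aDense.flatten.toSeq, aSparse.flatten.toSeq))
    println(maxAbsDiff(bDense.toSeq, bSparse.toSeq))
  }
}
```

Both printed differences should be at floating-point rounding level, which is the sense in which the c=1 terms are "in the math" without being materialized. The sketch leaves out regularization and only uses positive ratings, so it says nothing about how rating <= 0 is handled.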