In your experience using implicit factorization for document clustering, how
did you tune alpha? Using perplexity measures, or just something simple like
1 + rating, since the ratings are always positive in this case?
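
To make the question concrete, here is a minimal sketch of one way to sweep
alpha with MLlib's RDD-based trainImplicit; the candidate grid and the score
function below are placeholders, not anything from this thread:

import org.apache.spark.mllib.recommendation.{ALS, MatrixFactorizationModel, Rating}
import org.apache.spark.rdd.RDD

// Sketch only: try a few alphas and keep the one that scores best on held-out
// data. `score` stands in for whatever metric you pick (perplexity, MAP, ...).
def sweepAlpha(training: RDD[Rating], heldOut: RDD[Rating],
               score: (MatrixFactorizationModel, RDD[Rating]) => Double): (Double, Double) = {
  val candidates = Seq(0.01, 0.1, 1.0, 10.0, 40.0)               // arbitrary grid
  val results = candidates.map { alpha =>
    val model = ALS.trainImplicit(training, 20, 10, 0.01, alpha)  // rank, iterations, lambda, alpha
    (alpha, score(model, heldOut))
  }
  results.maxBy(_._2)                                             // assumes higher score is better
}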

On Sun, Jul 26, 2015 at 1:23 AM, Sean Owen <so...@cloudera.com> wrote:

> It sounds like you're describing the explicit case, or any matrix
> decomposition. Are you sure that's best for count-like data? "It
> depends," but my experience is that the implicit formulation is
> better. In a way, the difference between a count of 10,000 and 1,000 is
> less significant than the difference between 1 and 10. However, if your
> loss function penalizes the square of the error, the former case doesn't
> just matter more for the same relative error; with those magnitudes it
> contributes roughly a million times more squared error than the latter.
> It's very heavily skewed to pay attention to the high-count instances.
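
To put numbers on that skew (my own toy arithmetic, not anything measured): at
the same 10% relative error, a 10,000-count cell contributes a million times
the squared error of a 10-count cell.

// Same 10% relative error on a high-count vs. a low-count entry.
val highResidual = 0.1 * 10000.0                                    // 1000.0
val lowResidual  = 0.1 * 10.0                                       // 1.0
val ratio = math.pow(highResidual, 2) / math.pow(lowResidual, 2)    // 1.0e6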
>
>
> On Sun, Jul 26, 2015 at 9:19 AM, Debasish Das <debasish.da...@gmail.com>
> wrote:
> > Yeah, I think the idea of confidence is a bit different from what I am
> > looking for in using implicit factorization to do document clustering.
> >
> > I basically need (r_ij - w_i'h_j)^2 for all observed ratings and
> > (0 - w_i'h_j)^2 for all the unobserved ratings. Think of the document x
> > word matrix where r_ij is the observed count and 0 is the count for the
> > words that do not appear in a particular document.
> >
> > The broadcast Gram matrix w_i'w_i or h_j'h_j will also cover the r_ij
> > that are observed, so I might be fine using the broadcast Gram matrix
> > and taking the linear term as \sum (-r_ij w_i) or \sum (-r_ij h_j)...
> >
> > I will think further, but in the current implicit formulation with
> > confidence, it looks like I am really factorizing a 0/1 matrix with
> > weights 1 + alpha*rating for the observed entries. That's a bit
> > different from the LSA model.
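
Spelling out that split (my own reading of the paragraph above, with \Omega_i
the set of words observed in document i):

\sum_{j} \bigl(r_{ij} - w_i^{\top} h_j\bigr)^2
  = w_i^{\top} \Bigl(\sum_{j} h_j h_j^{\top}\Bigr) w_i
  \;-\; 2 \sum_{j \in \Omega_i} r_{ij}\, w_i^{\top} h_j
  \;+\; \sum_{j \in \Omega_i} r_{ij}^2

The Gram matrix \sum_j h_j h_j^{\top} = H^{\top} H is the same for every
document and can be broadcast once; only the linear term touches the observed
counts, which is the \sum (-r_ij h_j) term above.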
> >
> > On Sun, Jul 26, 2015 at 12:34 AM, Sean Owen <so...@cloudera.com> wrote:
> >>
> >> confidence = 1 + alpha * |rating| here (so c1 means confidence - 1),
> >> so alpha = 1 doesn't by itself mean high confidence. The loss function
> >> is computed over the whole input matrix, including all missing "0"
> >> entries. Those have a minimal confidence of 1 according to this
> >> formula. alpha controls how much more confident you are in the entries
> >> that do exist in the input. So alpha = 1 is low-ish and means you
> >> don't think the existence of a rating says a lot more than its absence.
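
Written out, the objective this describes (the standard implicit-feedback form
of Hu, Koren and Volinsky, with p_ui the 0/1 preference):

\min_{X, Y} \sum_{u,i} c_{ui} \bigl(p_{ui} - x_u^{\top} y_i\bigr)^2
  + \lambda \Bigl(\sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2\Bigr),
\quad c_{ui} = 1 + \alpha |r_{ui}|, \quad
p_{ui} = \begin{cases} 1 & r_{ui} > 0 \\ 0 & \text{otherwise} \end{cases}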
> >>
> >> I think the explicit case looks similar here, but it's not identical.
> >> The more substantial difference between the two is that the cost
> >> function for the explicit case is not the same. There, ratings aren't
> >> inputs to a confidence value that becomes a weight in the loss function
> >> during a factorization of a 0/1 matrix; instead, the rating matrix
> >> itself is the thing being factorized directly.
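
For contrast, the explicit objective factorizes the rating matrix itself and
sums only over the observed entries (the standard form, sketched here rather
than quoted from the code):

\min_{X, Y} \sum_{(u,i)\,\mathrm{observed}} \bigl(r_{ui} - x_u^{\top} y_i\bigr)^2
  + \lambda \Bigl(\sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2\Bigr)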
> >>
> >> On Sun, Jul 26, 2015 at 6:45 AM, Debasish Das <debasish.da...@gmail.com>
> >> wrote:
> >> > Hi,
> >> >
> >> > Implicit factorization is important for us since it drives
> >> > recommendations when modeling user click/no-click data, and also
> >> > topic modeling, where 0 counts in document x word matrices are
> >> > handled through NMF and Sparse Coding.
> >> >
> >> > I am a bit confused by this code:
> >> >
> >> > val c1 = alpha * math.abs(rating)
> >> > if (rating > 0) ls.add(srcFactor, (c1 + 1.0)/c1, c1)
> >> >
> >> > When alpha = 1.0 (high confidence) and the rating is > 0 (true for
> >> > word counts), why doesn't this formula become the same as the
> >> > explicit formula:
> >> >
> >> > ls.add(srcFactor, rating, 1.0)
> >> >
> >> > For modeling documents, I believe the implicit Y'Y needs to stay,
> >> > but we need the explicit ls.add(srcFactor, rating, 1.0).
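
A standalone sketch of that formulation (Breeze rather than the ALS internals;
all names here are my own): treat every unobserved count as an explicit 0, so
the full Gram matrix H'H covers all words, while only the observed counts
enter the linear term.

import breeze.linalg._

// Solve for one document's factors w_i given the word factors H, assuming
// every word not observed in the document has count 0.
def solveRow(H: DenseMatrix[Double],              // numWords x rank word factors
             observed: Seq[(Int, Double)],        // (wordIndex, count) for this document
             lambda: Double): DenseVector[Double] = {
  val rank = H.cols
  val gram = H.t * H                              // sums h_j h_j' over ALL words, zeros included
  val rhs  = DenseVector.zeros[Double](rank)
  for ((j, r) <- observed) rhs += H(j, ::).t * r  // linear term from observed counts only
  (gram + DenseMatrix.eye[Double](rank) * lambda) \ rhs
}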
> >> >
> >> > I am studying the confidence code further. Please let me know if
> >> > the idea of using the implicit formulation to handle 0 counts in the
> >> > document x word matrix makes sense.
> >> >
> >> > Thanks.
> >> > Deb
> >> >
> >
> >
>
