On Tue, May 24, 2011 at 6:11 PM, Dmitriy Lyubimov <[email protected]> wrote:

> > Are you asking whether the random projection code finds the best
> (largest)
> > singular
> > values and corresponding vectors?  If so, the answer is yes, it does with
> > high probability
> > of low error.
> >
>
> Well, you have an alternating scheme there, right? You learn the left
> singular vectors, then you switch and find the right singular vectors,
> but as far as I can tell you are not doing it the same way as
> incremental SVD does
>

Well, I should be alternating between the side information and the latent
factors.

This is not really an SVD implementation at all.  It is a log-linear latent
variable implementation.

Whether it will reach the same results isn't clear to me.  The SGD
implementation that it uses should definitely converge, however.

There are analogies with SVD computation of course.
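For the record, here is a minimal sketch of the kind of log-linear latent
variable model fit by SGD that I mean (the function names, loss, and
hyperparameters are illustrative assumptions, not the actual Mahout code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def loglinear_factors(Y, k=2, lr=0.1, reg=0.0, sweeps=300, rng=None):
    """Hypothetical sketch: model P(y_ij = 1) = sigmoid(u_i . v_j) and
    fit U, V by SGD on the log-likelihood of binary observations Y."""
    rng = np.random.default_rng(rng)
    m, n = Y.shape
    U = rng.normal(scale=0.1, size=(m, k))
    V = rng.normal(scale=0.1, size=(n, k))
    for _ in range(sweeps):
        for i in range(m):
            for j in range(n):
                # gradient of the log-likelihood w.r.t. the score u_i . v_j
                g = Y[i, j] - sigmoid(U[i] @ V[j])
                u_old = U[i].copy()
                U[i] += lr * (g * V[j] - reg * U[i])
                V[j] += lr * (g * u_old - reg * V[j])
    return U, V
```

All factors are updated together on every observation, which is one of the
analogies with the SGD-style SVD computations discussed below.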


>
> Incremental SVD goes through the entire dataset the same way, but only
> for 1 factor first. Then it freezes it once the test RMSE curve is flat
> and starts doing the same for the second one. Intuitively it's clear
> that the first pass this way finds the largest factor and the next one
> finds the next largest, etc. Hence there's a 'step' curve on the RMSE
> chart for this process as it switches from factor to factor.
>

Actually it isn't even clear that this converges to the correct
eigenvectors if two eigenvalues are nearly the same size.
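For reference, the one-factor-at-a-time scheme described above might look
roughly like this sketch (names and hyperparameters are assumptions; a real
implementation would freeze each factor when the test RMSE flattens rather
than after a fixed number of sweeps):

```python
import numpy as np

def incremental_svd(R, k=2, lr=0.01, reg=0.0, sweeps=300, rng=None):
    """Hypothetical sketch of incremental SVD: learn one factor at a
    time over the full dataset, freezing it before starting the next."""
    rng = np.random.default_rng(rng)
    m, n = R.shape
    U = rng.normal(scale=0.1, size=(m, k))
    V = rng.normal(scale=0.1, size=(n, k))
    for f in range(k):              # one factor at a time
        for _ in range(sweeps):     # stand-in for "until test RMSE is flat"
            for i in range(m):
                for j in range(n):
                    # prediction uses frozen factors 0..f-1 plus factor f
                    err = R[i, j] - U[i, :f + 1] @ V[j, :f + 1]
                    u, v = U[i, f], V[j, f]
                    # update ONLY factor f; earlier factors stay frozen
                    U[i, f] += lr * (err * v - reg * u)
                    V[j, f] += lr * (err * u - reg * v)
    return U, V
```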

Regardless, the SGD algorithm is basically just a matter of inverting the
loops.  Instead of looping first on eigenvectors, then on examples, I
iterated on examples, then on eigenvectors.

They should be similar.
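Inverting the loops as described gives something like the following sketch
(again, names and constants are illustrative assumptions, not the actual
implementation):

```python
import numpy as np

def sgd_all_factors(R, k=2, lr=0.01, reg=0.0, sweeps=300, rng=None):
    """Hypothetical inverted-loop sketch: iterate over examples and
    update every factor at once, rather than sweeping factor by factor."""
    rng = np.random.default_rng(rng)
    m, n = R.shape
    U = rng.normal(scale=0.1, size=(m, k))
    V = rng.normal(scale=0.1, size=(n, k))
    for _ in range(sweeps):
        for i in range(m):          # outer iteration is on examples ...
            for j in range(n):
                err = R[i, j] - U[i] @ V[j]
                u_old = U[i].copy()
                # ... and the inner update touches all k factors at once
                U[i] += lr * (err * V[j] - reg * U[i])
                V[j] += lr * (err * u_old - reg * V[j])
    return U, V
```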



>
> But in your case, it looks like you are learning all the factors at
> once. Is it going to produce the same result as the incremental SVD
> algorithm? If yes, why did they even do it incrementally, since it's
> clear the incremental approach would require more iterations?
>

Nothing is all that clear with these algorithms.  Speed varies more with
memory bandwidth than with the number of operations.  My guess is that the
SGD algorithm is a bit better in this regard.
