Re: Why is Lanczos deprecated?

Dmitriy Lyubimov Sat, 03 Aug 2013 10:27:19 -0700
On Aug 3, 2013 3:06 AM, "Fernando Fernández" <
[email protected]> wrote:
>
> My definition of "so" is Mahout Lanczos and R yielding eigenvalues like
> (I'm inventing the numbers here because I don't remember the exact
> figures) 1834.58, 756.34, 325.67, 125.67 and providing very good
> recommendations in the recommender system, while SSVD gives eigenvalues
> (invented numbers again) 723.56, 354.67, 111.67, 101.46 and provides
> nonsense recommendations. That's why I suspect there might be a bug in the
> input code. Small changes in decimal places and even in units, like 723.56
> to 730.78, would be reasonable. 1834 to 723 is not. I'm putting these
> numbers in quarantine until I determine everything's OK with the input code.
Yes, there's definitely something fishy. q=0 usually results in no more than
5% error on a real-life dataset, and even that is never for the first value.
You need a well-controlled experiment where you provide exactly the same
input. I thought both methods accept exactly the same DRM format, so you
could just feed the same thing to both?
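
A minimal sketch of such a controlled experiment, in Python with NumPy rather
than Mahout: the same matrix goes to an exact SVD and to a generic Halko-style
randomized SVD, and the leading singular values are compared. The function
names, the oversampling default and the seed handling are assumptions made for
illustration; this is not Mahout's SSVD code and not the ssvd.R script.

```python
import numpy as np


def randomized_singular_values(a, k, p=15, q=0, seed=None):
    """Approximate the top-k singular values of `a` by random projection.

    k: number of values wanted, p: oversampling, q: number of power iterations.
    """
    rng = np.random.default_rng(seed)
    n = a.shape[1]
    omega = rng.standard_normal((n, k + p))   # random test matrix
    basis, _ = np.linalg.qr(a @ omega)        # orthonormal basis of the sampled range
    for _ in range(q):                        # power iterations, re-orthonormalized
        z, _ = np.linalg.qr(a.T @ basis)
        basis, _ = np.linalg.qr(a @ z)
    b = basis.T @ a                           # small (k + p) x n projected matrix
    return np.linalg.svd(b, compute_uv=False)[:k]


def max_relative_error(approx, exact):
    """Largest |approx - exact| / exact over the compared singular values."""
    return float(np.max(np.abs(approx - exact) / exact))


if __name__ == "__main__":
    # Hypothetical input: any dense matrix works, as long as BOTH methods see it.
    rng = np.random.default_rng(0)
    a = rng.standard_normal((300, 120)) @ rng.standard_normal((120, 120))
    exact = np.linalg.svd(a, compute_uv=False)[:10]
    for q in (0, 1):
        approx = randomized_singular_values(a, k=10, p=15, q=q, seed=1)
        print(f"q={q}: max relative error {max_relative_error(approx, exact):.4f}")
```

Since the projected matrix is Q^T A with orthonormal Q, each approximate value
can only underestimate the corresponding exact one, so an approximation that
comes out larger than the exact value points at the inputs not being the same.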
>
> Thanks for the link to Halko's dissertation. I know it's a nice piece of
> work and a reference, and I had already given it a look, but I always have
> to do my own experiments: I have found so often that things don't work as
> expected in certain real cases that I always try to at least validate that
> what is in papers and dissertations also applies to my data.
>
> I'm aware SSVD is non-deterministic; I always check this kind of algorithm
> with several runs. Here are some results on MovieLens 100k data using the R
> implementation of SSVD provided here:
>
> https://cwiki.apache.org/confluence/download/attachments/27832158/ssvd.R?version=1&modificationDate=1323358453000
>
> (I hope there are no significant differences between the results with this
> implementation and Mahout's.)
>
> The first line is the first 10 eigenvalues computed with R's svd. The next
> three are computed with ssvd.svd with q=0, and the last three with q=1:
>
> > svd.r$d[1:10]
> [1] 640.63362 244.83635 217.84622 159.15360 158.21191 145.87261 126.57977 121.90770 106.82918  99.74794
>
> [1] "three runs with q=0"
> [1] 640.63362 244.83613 217.84493 159.14512 158.20471 145.82572 126.42295 121.79764 105.99973  98.99649
> [1] 640.63362 244.83592 217.84568 159.13914 158.19299 145.84226 126.46651 121.73629 106.22892  99.11622
> [1] 640.63362 244.83590 217.84482 159.12955 158.19675 145.81728 126.47135 121.79920 106.45790  99.01242
>
> [1] "three runs with q=1"
> [1] 640.63259 244.75889 217.66362 158.40002 157.61954 145.26448 125.25675 119.74266 104.16382  95.43547
> [1] 640.6327 244.7559 217.6805 158.6019 157.4059 144.9223 124.2859 119.1194 103.9104  96.6282
> [1] 640.63313 244.62599 217.67781 158.72475 157.13394 145.08462 125.33024 120.20984 102.45867  95.37994
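
The figures quoted above can be compared position by position. The sketch
below (plain Python; the variable and function names are made up for
illustration) takes the numbers exactly as posted and averages the relative
deviation from R's svd over the three runs for each setting. On these posted
figures the q=1 runs deviate more than the q=0 runs at every position, by
about 4% versus under 1% at the tenth value.

```python
# Values copied from the post: exact svd, then three SSVD runs per setting.
exact = [640.63362, 244.83635, 217.84622, 159.15360, 158.21191,
         145.87261, 126.57977, 121.90770, 106.82918, 99.74794]
runs_q0 = [
    [640.63362, 244.83613, 217.84493, 159.14512, 158.20471,
     145.82572, 126.42295, 121.79764, 105.99973, 98.99649],
    [640.63362, 244.83592, 217.84568, 159.13914, 158.19299,
     145.84226, 126.46651, 121.73629, 106.22892, 99.11622],
    [640.63362, 244.83590, 217.84482, 159.12955, 158.19675,
     145.81728, 126.47135, 121.79920, 106.45790, 99.01242],
]
runs_q1 = [
    [640.63259, 244.75889, 217.66362, 158.40002, 157.61954,
     145.26448, 125.25675, 119.74266, 104.16382, 95.43547],
    [640.6327, 244.7559, 217.6805, 158.6019, 157.4059,
     144.9223, 124.2859, 119.1194, 103.9104, 96.6282],
    [640.63313, 244.62599, 217.67781, 158.72475, 157.13394,
     145.08462, 125.33024, 120.20984, 102.45867, 95.37994],
]


def mean_relative_error(runs, reference):
    """Per-position |approx - reference| / reference, averaged over the runs."""
    return [
        sum(abs(run[i] - reference[i]) for run in runs) / (len(runs) * reference[i])
        for i in range(len(reference))
    ]


err_q0 = mean_relative_error(runs_q0, exact)
err_q1 = mean_relative_error(runs_q1, exact)
for i, (e0, e1) in enumerate(zip(err_q0, err_q1), start=1):
    print(f"{i:2d}  q=0: {100 * e0:7.4f}%   q=1: {100 * e1:7.4f}%")
```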
>
>
> I have repeated the runs several times with the same results. Maybe I'm
> still missing something, but given these results I can't apply the rule
> that q=1 improves accuracy. At least I have to experiment; my guess is that
> it does depend on the dataset. I would also like to repeat this comparison
> with Mahout's SSVD and my dataset and see what happens.
>
> Dmitriy, thank you very much for your attention and sharing your thoughts
> with me. I really appreciate it.
>
> Best,
> Fernando.
>
>
> 2013/8/3 Dmitriy Lyubimov <[email protected]>
>
> > On Fri, Aug 2, 2013 at 3:08 PM, Dmitriy Lyubimov <[email protected]>
> > wrote:
> >
> > >
> > >
> > >
> > > On Fri, Aug 2, 2013 at 2:52 PM, Fernando Fernández <
> > > [email protected]> wrote:
> > >
> > >> I don't agree with k>10 being unlikely meaningful. I've used SVD in
> > >> text mining problems where k~150 yielded the best results (not only a
> > >> good choice based on plotting eigenvalues and seeing that the elbow in
> > >> the decay was near 150, but also checking results with different k's
> > >> and seeing that around 150 made much more sense). Currently I'm working
> > >> on a recommender system and already have Lanczos running with k~50
> > >> producing the best results, again based on visual exploration of the
> > >> eigenvalues and exploring results one by one and seeing they were more
> > >> meaningful. Current tests with SSVD are based on the latter, and when I
> > >> say I'm not getting good results I mean Lanczos is working properly on
> > >> the same problem (I've explored eigenvalues up to 150 and have a good
> > >> decay) and SSVD is not. But as I said, this might be caused by some bug
> > >> in the input process; it seems too strange to me that results are so
> > >> different so
> > >>
> > >
> > > Depends on how you define "so". But again, in that respect all I can
> > > point to, out of published work, is the accuracy study by N. Halko.
> > >
> > I guess I can save you digging through the Mahout wiki; here is the
> > reference:
> > http://amath.colorado.edu/faculty/martinss/Pubs/2012_halko_dissertation.pdf
> > Specifically, look at the eigenvalue chart comparison on page 179. This
> > is run on Mahout's Lanczos and SSVD head-to-head. The order of accuracy
> > for the first 40 values is claimed as "Order of accuracy is q = 3; q = 2;
> > q = 1, lanczos, q = 0." (see the source for details of the accuracy
> > assessment).
> >
> > One thing I did not understand there is why Lanczos showed such an
> > uncharacteristic fall-off for values between 40 and 60. I have
> > always assumed -q=1 was showing something much closer to reality after
> > the first 40 values as well.
> >
> >
> > >
> > >> I'll get back to this discussion when I figure it out :) . If you are
> > >> curious about the numbers: 1MM rows by 150k columns for the text mining
> > >> case and 18MM rows by 80k columns for the recommender.
> > >>
> > >> About p and q, I have been playing around with the MovieLens 100k
> > >> dataset and found that q>0 actually worsens results in terms of
> > >> precision (nothing severe, but it happens) and that it's better to
> > >> increase p a little in that particular case, so my guess is it depends
> > >> a lot on the dataset, though I don't know how.
> > >>
> > >
> > > This again sounds very strange. The algorithm is non-deterministic,
> > > which means the errors you get in one run will be different from the
> > > errors in another run, but honestly, you would be the first to report
> > > that power iterations worsen the expected error. All theoretical work
> > > and practical estimates did not confirm that observation; in fact,
> > > quite the contrary.
> > >
> > >
> > >>
> > >> 2013/8/2 Dmitriy Lyubimov <[email protected]>
> > >>
> > >> > The only time you would not get good results is if the spectrum does
> > >> > not have a good decay, which is equivalent to mostly the same variance
> > >> > in most of the original basis directions. This problem is similar to
> > >> > the one that arises with PCA when you try to do dimensionality
> > >> > reduction while retaining a certain percentage of variance. In the
> > >> > case of flat spectrum decay, you'd need a much bigger k to retain the
> > >> > same amount of variance in the dimensionally reduced projection. In
> > >> > that sense the SSVD solution for a given k is as good as PCA gets for
> > >> > the same k. Also, I believe (but am not 100% sure) that "problems too
> > >> > small" exhibit higher errors due to the law of large numbers.
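
The point about spectrum decay and retained variance can be illustrated with a
small sketch in plain Python. The two spectra are hypothetical, chosen only to
contrast a fast decay with a flat one, and the function name is made up; the
variance along each direction is taken to be the squared singular value.

```python
def components_for_variance(singular_values, fraction):
    """Smallest k whose leading singular values retain `fraction` of the variance."""
    energy = [s * s for s in singular_values]   # variance per direction
    total = sum(energy)
    running = 0.0
    for k, e in enumerate(energy, start=1):
        running += e
        if running >= fraction * total:
            return k
    return len(energy)


decaying = [0.7 ** i for i in range(100)]   # hypothetical fast-decaying spectrum
flat = [1.0] * 100                          # hypothetical flat spectrum

print("decaying:", components_for_variance(decaying, 0.9))
print("flat:    ", components_for_variance(flat, 0.9))
```

With these made-up spectra the decaying case keeps 90% of the variance with 4
components, while the flat case needs about 90 of the 100.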
> > >> >
> > >> >
> > >> > On Fri, Aug 2, 2013 at 10:41 AM, Dmitriy Lyubimov <[email protected]>
> > >> > wrote:
> > >> >
> > >> > > If you use k > 40 you are already beating Lanczos for larger
> > >> > > datasets. k>10 is unlikely meaningful. p need not be more than 15%
> > >> > > of k (default is 15). Use q=1; q>1 does not yield tangible
> > >> > > improvements in the real world. Again, see Nathan Halko's
> > >> > > dissertation on accuracy comparison.
> > >> > >
> > >> > >
> > >> > >
> > >> > > On Fri, Aug 2, 2013 at 4:17 AM, Fernando Fernández <
> > >> > > [email protected]> wrote:
> > >> > >
> > >> > >> Keeping Lanczos would be nice. Like I said, it's currently being
> > >> > >> used in some projects with good results and I think it's easier
> > >> > >> to tune, so it would be my first choice for future developments.
> > >> > >> I still need to test SSVD further, especially because in the
> > >> > >> current example I'm working on it yields very different results
> > >> > >> from Lanczos. We are investigating whether it is due to a bug
> > >> > >> when loading the data (though the dimensions of the output seem
> > >> > >> OK) or whether it's a question of increasing the p or q
> > >> > >> parameters. If it's a question of increasing p and q, I think
> > >> > >> running times would make SSVD not viable. I hope to be able to
> > >> > >> provide some comparison figures in terms of precision and running
> > >> > >> time in a month or so.
> > >> > >>
> > >> > >> I hope that other users read this and say whether they are using
> > >> > >> Lanczos.
> > >> > >>
> > >> > >> Best,
> > >> > >> Fernando.
> > >> > >>
> > >> > >> 2013/8/2 Sebastian Schelter <[email protected]>
> > >> > >>
> > >> > >> > I would also be fine with keeping it if there is demand. I just
> > >> > >> > proposed to deprecate it and nobody voted against that at that
> > >> > >> > point in time.
> > >> > >> >
> > >> > >> > --sebastian
> > >> > >> >
> > >> > >> >
> > >> > >> > On 02.08.2013 03:12, Dmitriy Lyubimov wrote:
> > >> > >> > > There's a part of Nathan Halko's dissertation, referenced on
> > >> > >> > > the algorithm page, running a comparison. In particular, he
> > >> > >> > > was not able to compute more than 40 eigenvectors with
> > >> > >> > > Lanczos on the Wikipedia dataset. You may refer to that
> > >> > >> > > study.
> > >> > >> > >
> > >> > >> > > On the accuracy part, it was not observed to be a problem,
> > >> > >> > > assuming a high level of random noise is not the case, at
> > >> > >> > > least not in the LSA-like application used there.
> > >> > >> > >
> > >> > >> > > That said, I am all for diversity of tools. I would actually
> > >> > >> > > be +0 on deprecating Lanczos; it is not like we are lacking
> > >> > >> > > support for it. SSVD could use improvements too.
> > >> > >> > >
> > >> > >> > >
> > >> > >> > > On Thu, Aug 1, 2013 at 3:15 AM, Fernando Fernández <
> > >> > >> > > [email protected]> wrote:
> > >> > >> > >
> > >> > >> > >> Hi everyone,
> > >> > >> > >>
> > >> > >> > >> Sorry if I duplicate the question, but I've been looking for
> > >> > >> > >> an answer and I haven't found an explanation other than that
> > >> > >> > >> it's not being used (together with some other algorithms).
> > >> > >> > >> If it's been discussed in depth before, maybe you can point
> > >> > >> > >> me to some link with the discussion.
> > >> > >> > >>
> > >> > >> > >> I have successfully used Lanczos in several projects, and
> > >> > >> > >> it's been a surprise to me to find that the main reason
> > >> > >> > >> (according to what I've read, which might not be the full
> > >> > >> > >> story) is that it's not being used. At the beginning I
> > >> > >> > >> supposed it was because SSVD is supposed to be much faster
> > >> > >> > >> with similar results, but after making some tests I have
> > >> > >> > >> found that running times are similar to or even worse than
> > >> > >> > >> Lanczos for some configurations (I have tried several
> > >> > >> > >> combinations of parameters, given child processes enough
> > >> > >> > >> memory, etc., and had no success in getting SSVD to run in
> > >> > >> > >> even 3/4 of the time Lanczos takes, though there might be
> > >> > >> > >> some combinations of parameters I have still not tried). It
> > >> > >> > >> seems to be quite tricky to find a good combination of
> > >> > >> > >> parameters for SSVD, and I have also seen a precision loss
> > >> > >> > >> in some examples that makes me not confident in migrating
> > >> > >> > >> from Lanczos to SSVD from now on (how far can I trust
> > >> > >> > >> results from a combination of parameters that runs in
> > >> > >> > >> significantly less time, or at least a good time?).
> > >> > >> > >>
> > >> > >> > >> Can someone convince me that SSVD is actually a better
> > >> > >> > >> option than Lanczos? (I'm totally willing to be
> > >> > >> > >> convinced... :) )
> > >> > >> > >>
> > >> > >> > >> Thank you very much in advance.
> > >> > >> > >>
> > >> > >> > >> Fernando.
> > >> > >> > >>
> > >> > >> > >
> > >> > >> >
> > >> > >> >
> > >> > >>
> > >> > >
> > >> > >
> > >> >
> > >>
> > >
> > >
> >