If you use k > 40 you are already beating Lanczos for larger datasets. k>10 is unlikely to be meaningful. p need not be more than 15% of k (default is 15). Use q=1; q>1 does not yield tangible improvements in the real world. Again, see Nathan Halko's dissertation for the accuracy comparison.
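For anyone unsure what k, p and q actually control, here is a minimal single-machine sketch of the randomized projection idea in Python/NumPy. It is not Mahout's SSVD code (that is a distributed job); the function name `randomized_svd`, the `seed` argument and the toy matrix are assumptions made up for illustration. k is the number of singular triplets kept, p is the number of extra random samples (oversampling), and q is the number of power iterations.

```python
import numpy as np


def randomized_svd(a, k, p=15, q=1, seed=0):
    """Approximate rank-k SVD of `a` by random projection (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    m, n = a.shape
    # Sample k + p directions, but never more than the matrix can support.
    l = min(k + p, m, n)
    # Random test matrix: Y = A * Omega samples the column space of A.
    omega = rng.standard_normal((n, l))
    basis, _ = np.linalg.qr(a @ omega)
    # q power iterations sharpen the basis when singular values decay slowly.
    # Re-orthogonalizing after each multiply keeps the iteration stable.
    for _ in range(q):
        z, _ = np.linalg.qr(a.T @ basis)
        basis, _ = np.linalg.qr(a @ z)
    # Project A onto the small basis and take an exact SVD of the result.
    b = basis.T @ a
    ub, s, vt = np.linalg.svd(b, full_matrices=False)
    u = basis @ ub
    # Drop the p oversampling directions and keep the leading k.
    return u[:, :k], s[:k], vt[:k, :]


# Toy check on a matrix that is exactly rank 10.
rng = np.random.default_rng(42)
a = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 80))
u, s, vt = randomized_svd(a, k=10, p=15, q=1)
exact = np.linalg.svd(a, compute_uv=False)[:10]
```

Each power iteration costs two more passes over the input, which is why raising q is the expensive knob, while raising p only widens the small projected problem.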

On Fri, Aug 2, 2013 at 4:17 AM, Fernando Fernández <
[email protected]> wrote:

> Keeping Lanczos would be nice. Like I said, it's currently being used in
> some projects with good results and I think it's easier to tune, so it
> would be my first choice for future developments. I still need to further
> test SSVD, especially because in the current example I'm working on it
> yields very different results from Lanczos. We are investigating whether
> it is due to a bug when loading the data (though the dimensions of the
> output seem OK) or whether it's a question of increasing the p or q
> parameters. If it's a question of increasing p and q, I think running
> times would make SSVD not viable. I hope to be able to provide some
> comparison figures in terms of precision and running time in a month or
> so.
>
> I hope that other users read this and say whether they are using Lanczos.
>
> Best,
> Fernando.
>
> 2013/8/2 Sebastian Schelter <[email protected]>
>
> > I would also be fine with keeping it if there is demand. I just proposed
> > to deprecate it and nobody voted against that at that point in time.
> >
> > --sebastian
> >
> >
> > On 02.08.2013 03:12, Dmitriy Lyubimov wrote:
> > > There's a part of Nathan Halko's dissertation, referenced on the
> > > algorithm page, running a comparison. In particular, he was not able
> > > to compute more than 40 eigenvectors with Lanczos on the Wikipedia
> > > dataset. You may refer to that study.
> > >
> > > On the accuracy part, it was not observed to be a problem, assuming
> > > a high level of random noise is not the case, at least not in the
> > > LSA-like application used there.
> > >
> > > That said, I am all for diversity of tools. I would actually be +0 on
> > > deprecating Lanczos; it is not like we are lacking support for it.
> > > SSVD could use improvements too.
> > >
> > >
> > > On Thu, Aug 1, 2013 at 3:15 AM, Fernando Fernández <
> > > [email protected]> wrote:
> > >
> > >> Hi everyone,
> > >>
> > >> Sorry if I duplicate the question, but I've been looking for an
> > >> answer and I haven't found an explanation other than it's not being
> > >> used (together with some other algorithms). If it's been discussed
> > >> in depth before, maybe you can point me to some link with the
> > >> discussion.
> > >>
> > >> I have successfully used Lanczos in several projects and it's been a
> > >> surprise to me finding that the main reason (according to what I've
> > >> read, which might not be the full story) is that it's not being
> > >> used. At the beginning I supposed it was because SSVD is supposed to
> > >> be much faster with similar results, but after making some tests I
> > >> have found that running times are similar or even worse than Lanczos
> > >> for some configurations (I have tried several combinations of
> > >> parameters, given child processes enough memory, etc., and had no
> > >> success in running SSVD in at least 3/4 of the time Lanczos runs,
> > >> though there might be some combinations of parameters I have still
> > >> not tried). It seems to be quite tricky to find a good combination
> > >> of parameters for SSVD, and I have also seen a precision loss in
> > >> some examples that makes me not confident in migrating from Lanczos
> > >> to SSVD from now on (how far can I trust results from a combination
> > >> of parameters that runs in significantly less time, or at least a
> > >> good time?).
> > >>
> > >> Can someone convince me that SSVD is actually a better option than
> > >> Lanczos? (I'm totally willing to be convinced... :) )
> > >>
> > >> Thank you very much in advance.
> > >>
> > >> Fernando.
> > >>
