If you use k > 40, you are already beating Lanczos on larger datasets. k>10
is unlikely to be meaningful. p need not be more than 15% of k (default is 15).
Use q=1; q>1 does not yield tangible improvements in real-world use. Again,
see Nathan Halko's dissertation for the accuracy comparison.
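
To make the roles of k, p and q concrete, here is a minimal single-machine
sketch of the randomized SVD idea in Python/NumPy. It is not the Mahout SSVD
code: the function name `randomized_svd`, its defaults and the `seed`
argument are assumptions made only for this illustration. k is the requested
rank, p the oversampling (extra random columns), and q the number of power
iterations.

```python
import numpy as np


def randomized_svd(a, k, p=15, q=1, seed=0):
    """Approximate rank-k SVD of `a` (illustrative sketch, not Mahout's code).

    k: number of singular triplets to return
    p: oversampling, i.e. extra random columns beyond k
    q: number of power iterations
    """
    rng = np.random.default_rng(seed)
    m, n = a.shape
    ell = min(k + p, m, n)  # width of the random projection

    # Project onto ell random directions and orthonormalize the result.
    omega = rng.standard_normal((n, ell))
    basis, _ = np.linalg.qr(a @ omega)

    # Each power iteration multiplies by a^T and then by a, which sharpens
    # the basis when the spectrum decays slowly. QR in between keeps the
    # computation numerically stable.
    for _ in range(q):
        z, _ = np.linalg.qr(a.T @ basis)
        basis, _ = np.linalg.qr(a @ z)

    # Solve the small (ell x n) problem exactly and map back.
    small = basis.T @ a
    u_small, s, vt = np.linalg.svd(small, full_matrices=False)
    u = basis @ u_small
    return u[:, :k], s[:k], vt[:k, :]


if __name__ == "__main__":
    # Synthetic matrix: a low-rank signal plus a small amount of noise.
    rng = np.random.default_rng(42)
    a = rng.standard_normal((500, 20)) @ rng.standard_normal((20, 300))
    a += 0.01 * rng.standard_normal(a.shape)

    _, s_approx, _ = randomized_svd(a, k=20, p=15, q=1)
    s_exact = np.linalg.svd(a, compute_uv=False)[:20]
    rel_err = np.max(np.abs(s_approx - s_exact) / s_exact)
    print("max relative error in top-20 singular values:", rel_err)
```

Comparing the returned singular values against an exact SVD on a small
sample, as in the `__main__` part, is one way to check whether raising p or q
changes anything for a given dataset before paying for it on the full job.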



On Fri, Aug 2, 2013 at 4:17 AM, Fernando Fernández <
[email protected]> wrote:

> Keeping Lanczos would be nice. Like I said, it's currently being used in
> some projects with good results and I think it's easier to tune, so it would
> be my first choice for future developments. I still need to test SSVD
> further, especially because in the example I'm currently working on it yields
> very different results from Lanczos. We are investigating whether this is due
> to a bug when loading the data (though the dimensions of the output seem OK)
> or whether it's a question of increasing the p or q parameters. If it's a
> question of increasing p and q, I think running times would make SSVD not
> viable. I hope to be able to provide some comparison figures in terms of
> precision and running time in a month or so.
>
> I hope that other users read this and say whether they are using Lanczos.
>
> Best,
> Fernando.
>
> 2013/8/2 Sebastian Schelter <[email protected]>
>
> > I would also be fine with keeping if there is demand. I just proposed to
> > deprecate it and nobody voted against that at that point in time.
> >
> > --sebastian
> >
> >
> > On 02.08.2013 03:12, Dmitriy Lyubimov wrote:
> > > There's a part of Nathan Halko's dissertation, referenced on the
> > > algorithm page, that runs a comparison. In particular, he was not able
> > > to compute more than 40 eigenvectors with Lanczos on the Wikipedia
> > > dataset. You may refer to that study.
> > >
> > > On the accuracy part, it was not observed to be a problem, assuming a
> > > high level of random noise is not the case, at least not in the LSA-like
> > > application used there.
> > >
> > > That said, I am all for diversity of tools. I would actually be +0 on
> > > deprecating Lanczos; it is not like we are lacking support for it. SSVD
> > > could use improvements too.
> > >
> > >
> > > On Thu, Aug 1, 2013 at 3:15 AM, Fernando Fernández <
> > > [email protected]> wrote:
> > >
> > >> Hi everyone,
> > >>
> > >> Sorry if I duplicate the question but I've been looking for an answer
> > >> and I haven't found an explanation other than it's not being used
> > >> (together with some other algorithms). If it's been discussed in depth
> > >> before maybe you can point me to some link with the discussion.
> > >>
> > >> I have successfully used Lanczos in several projects and it's been a
> > >> surprise to me to find that the main reason (according to what I've
> > >> read, which might not be the full story) is that it's not being used.
> > >> At the beginning I supposed it was because SSVD is supposed to be much
> > >> faster with similar results, but after running some tests I have found
> > >> that running times are similar to or even worse than Lanczos for some
> > >> configurations (I have tried several combinations of parameters, given
> > >> child processes enough memory, etc. and had no success getting SSVD to
> > >> run in even 3/4 of the time Lanczos takes, though there might be some
> > >> combinations of parameters I have still not tried). It seems to be
> > >> quite tricky to find a good combination of parameters for SSVD, and I
> > >> have also seen a precision loss in some examples that makes me not
> > >> confident about migrating from Lanczos to SSVD from now on (How far can
> > >> I trust results from a combination of parameters that runs in
> > >> significantly less time, or at least a good time?).
> > >>
> > >> Can someone convince me that SSVD is actually a better option than
> > >> Lanczos? (I'm totally willing to be convinced... :) )
> > >>
> > >> Thank you very much in advance.
> > >>
> > >> Fernando.
> > >>
> > >
> >
> >
>
