On Thu, Oct 04, 2012 at 04:23:04PM -0600, Alejandro Weinstein wrote:
> I tried with a home-made "naive" implementation of lars/lasso, and I
> don't observe the unstable behavior. Code and results available here:
> https://gist.github.com/3828759 .

The reason that your Lars implementation does not suffer from this
problem, and probably the reason that the R implementation does not
either, is most likely that our Lars implementation iteratively updates
the residuals and the Cholesky factorization of the Gram matrix, rather
than recomputing them from scratch at each step. This is outlined as an
advantage of the algorithm in the original Lars paper, as it enables the
algorithm to be very fast; however, the authors of that paper do not
actually implement it fully :). The complete naive implementation that
you have will be very slow compared to an iteratively updated
implementation.
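
To make the difference concrete, here is a minimal sketch of what
"iteratively updating" the Cholesky factor can look like when one column
enters the active set. It is not the scikit-learn code: the function name
`cholesky_append` and the clamping guard are assumptions made for
illustration, and a real implementation would use a triangular solver
instead of `np.linalg.solve`.

```python
import numpy as np

def cholesky_append(L, X_active, x_new):
    """Extend the lower Cholesky factor L of X_active.T @ X_active
    by one row/column after x_new joins the active set (hypothetical helper)."""
    k = L.shape[0]
    # Correlations between the new column and the current active columns.
    c = X_active.T @ x_new
    # Solve L w = c; a dedicated triangular solver would be cheaper here.
    w = np.linalg.solve(L, c) if k else np.empty(0)
    # New diagonal entry. This subtraction is where precision can be lost
    # when x_new is nearly collinear with the active columns.
    d2 = x_new @ x_new - w @ w
    # Illustrative guard (an assumption): keep the square root defined.
    d = np.sqrt(max(d2, np.finfo(float).eps))
    L_new = np.zeros((k + 1, k + 1))
    L_new[:k, :k] = L
    L_new[k, :k] = w
    L_new[k, k] = d
    return L_new
```

Each such update costs O(k^2) instead of the O(k^3) of a full
factorization, but rounding errors from earlier steps are carried along
in L rather than being wiped out by a fresh computation.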

This is a fundamental problem of the lasso path, which can have a huge
number of kinks:
http://arxiv.org/abs/1205.0079
Computing the full path can thus be very expensive, and numerical errors
can accumulate along the way.

My usual point of view on this problem is that the reason we use Lars is
that it is fast in some situations. If we are not in these situations,
we should not use it. However, your example, and most importantly your
simple Lars implementation, made me think that there is probably a
middle ground. Maybe we can recompute the residuals and the Cholesky
factorization for the active set every once in a while. If we do it
right, it should stabilize the Lars algorithm and still keep it
reasonably fast.
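
A rough sketch of that middle ground, under stated assumptions: the
names `refresh_state`, `maybe_refresh` and `refresh_every` are
hypothetical, and the fixed-interval schedule is only one possible choice
(an error estimate could trigger the refresh instead). This is not how
scikit-learn currently behaves.

```python
import numpy as np

def refresh_state(X, y, active, coef):
    """Recompute the residual and the Cholesky factor from scratch
    for the current active set (hypothetical helper)."""
    X_active = X[:, active]
    residual = y - X_active @ coef[active]
    L = np.linalg.cholesky(X_active.T @ X_active)
    return residual, L

def maybe_refresh(iteration, refresh_every, X, y, active, coef, residual, L):
    """Return freshly computed quantities every `refresh_every` iterations,
    otherwise pass the incrementally updated ones through unchanged."""
    if iteration % refresh_every == 0:
        return refresh_state(X, y, active, coef)
    return residual, L
```

Inside a Lars loop this would be called once per iteration, right after
the incremental updates, so that accumulated drift is reset at a cost
that is amortized over `refresh_every` steps.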

Now this is a 'big picture email'. I haven't taken the time to check
where exactly the numerical errors build up; I am just guessing. Your
example is an interesting one, and it can probably be used as a starting
point to investigate how to control errors in Lars and improve it.

Cheers,

Gaël

