2012/9/9 Mathieu Blondel <[email protected]>:
> I've just tried scipy.sparse.linalg.lsqr [*] on the full news20 dataset. On
> my box it takes 8 seconds to run with tol=1e-3 and 5 seconds with tol=1e-2
> without any accuracy loss. It also solves the memory problem mentioned by
> Lars, as it works directly with X and y.
>
> Unlike scipy.linalg.lstsq, scipy.sparse.linalg.lsqr supports a regularization
> term so it can actually be used to implement Ridge. Also, despite the name,
> it supports dense arrays too so it may be worth comparing it with
> solver="dense_cholesky" in the dense case. It cannot be used if
> sample_weight != 1.0 though.

Thanks for investigating, this is very interesting. If
scipy.sparse.linalg.lsqr is competitive, we could use it as the
default and fall back to the current implementation only when
sample_weight != 1.0.
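
To make the idea concrete, here is a rough sketch of how ridge could be
expressed with lsqr. It is only an illustration under a few assumptions:
the helper name ridge_lsqr is hypothetical, no intercept is fitted, and
passing tol as both atol and btol is my guess at a reasonable mapping.
lsqr minimizes ||Xw - y||^2 + damp^2 ||w||^2, so damp = sqrt(alpha)
should give the ridge penalty.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import lsqr


def ridge_lsqr(X, y, alpha=1.0, tol=1e-3):
    """Hypothetical helper: ridge without intercept, solved by LSQR."""
    # lsqr solves min ||X w - y||^2 + damp^2 * ||w||^2,
    # so the ridge penalty alpha corresponds to damp = sqrt(alpha).
    # Assumption: the same tol is used for both stopping tolerances.
    result = lsqr(X, y, damp=np.sqrt(alpha), atol=tol, btol=tol)
    return result[0]  # first element of the returned tuple is the solution


# Small synthetic problem to compare against the closed-form solution.
rng = np.random.RandomState(0)
X_dense = rng.randn(50, 10)
y = rng.randn(50)
alpha = 2.0

# Reference: w = (X^T X + alpha I)^{-1} X^T y
gram = np.dot(X_dense.T, X_dense) + alpha * np.eye(X_dense.shape[1])
w_ref = np.linalg.solve(gram, np.dot(X_dense.T, y))

# The same helper accepts a dense array or a sparse matrix.
w_dense = ridge_lsqr(X_dense, y, alpha=alpha, tol=1e-10)
w_sparse = ridge_lsqr(sparse.csr_matrix(X_dense), y, alpha=alpha, tol=1e-10)
```

The tight tol here is only to make the comparison with the closed form
meaningful on a toy problem; the timings quoted above were obtained with
much looser tolerances.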

Also, what is the impact on RidgeCV? Is this independent? I am not
familiar with that code, so sorry if my question is naive.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
