2012/9/9 Mathieu Blondel <[email protected]>:
> I've just tried scipy.sparse.linalg.lsqr [*] on the full news20 dataset. On
> my box it takes 8 seconds to run with tol=1e-3 and 5 seconds with tol=1e-2
> without any accuracy loss. It also solves the memory problem mentioned by
> Lars, as it works directly with X and y.
>
> Unlike scipy.linalg.lsqr, scipy.sparse.linalg.lsqr supports a regularization
> term so it can actually be used to implement Ridge. Also, despite the name,
> it supports dense arrays too so it may be worth comparing it with
> solver="dense_cholesky" in the dense case. It cannot be used if
> sample_weight != 1.0 though.
Sounds great! Another noob question then: why won't it handle
sample_weight? Would it be possible to transform y using a
LabelBinarizer and multiply sample_weight in?

-- 
Lars Buitinck
Scientific programmer, ILPS
University of Amsterdam

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
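For reference, a minimal sketch of one way the sample_weight question could be approached. It is not what scikit-learn does; ridge_lsqr is a hypothetical helper name. It relies on two facts: scipy.sparse.linalg.lsqr minimizes ||Xw - y||^2 + damp^2 ||w||^2, so damp = sqrt(alpha) gives Ridge, and a weighted least-squares loss equals an ordinary one after scaling each row of X and y by sqrt(sample_weight). Intercept fitting is left out, and the scaled copy of X costs extra memory.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import lsqr


def ridge_lsqr(X, y, alpha=1.0, sample_weight=None, tol=1e-3):
    """Hypothetical helper: Ridge coefficients via LSQR (no intercept).

    Minimizes sum_i w_i * (x_i . coef - y_i)**2 + alpha * ||coef||**2,
    for X dense or scipy.sparse and y a 1-D target vector.
    """
    y = np.asarray(y, dtype=float)
    if sample_weight is not None:
        # Weighted least squares is ordinary least squares on rows
        # scaled by sqrt(w_i); diags(...) @ X works for dense and sparse X.
        sw = np.sqrt(np.asarray(sample_weight, dtype=float))
        X = diags(sw) @ X
        y = sw * y
    # lsqr's damp is the square root of the Ridge penalty alpha.
    return lsqr(X, y, damp=np.sqrt(alpha), atol=tol, btol=tol)[0]
```

With a binarized label matrix Y (one column per class), the same function would be called once per column, since lsqr solves for a single right-hand side at a time.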
