If I'm not mistaken (I just read the source code on GitHub), the copy that Peter is experiencing is due to the ravel() call in this method: https://github.com/scipy/scipy/blob/master/scipy/sparse/compressed.py#L264
That method in turn invokes csr_matvecs, which is implemented here: https://github.com/scipy/scipy/blob/master/scipy/sparse/sparsetools/csr.h#L1010

csr_matvecs takes a sparse matrix and a flat array (C-style ordered) as inputs. The advantage of using ravel() here is that no separate implementation is needed to handle Fortran-style arrays. However, when the array is Fortran-style, ravel() has to make a copy.

In predict, SGDClassifier does a safe_sparse_dot(X, self.coef_.T). Therefore, if coef_ is Fortran-style, coef_.T becomes C-style, which is the layout ravel() needs in order to avoid a copy.

Olivier's solution sounds good. Another would be to implement a routine that can handle the dot product with a Fortran-style array directly in utils/sparsefuncs.pyx.

Mathieu

On Mon, Jan 9, 2012 at 5:21 AM, Olivier Grisel <[email protected]> wrote:
> If the only change would be to do a:
>
>     self.coef_ = np.asfortranarray(coef_)
>
> at the end of the fit method of the SGDClassifier and SGDRegressor
> then I am all for it.
>
> We should just check that this indeed solves the memory copy issue you
> suspect.
>
> --
> Olivier
>
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
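For what it's worth, here is a minimal sketch of how the ravel() behaviour could be checked with plain NumPy. It is not taken from the scipy or scikit-learn sources: the small array is a hypothetical stand-in for coef_, and ravel_copies is a helper name made up for illustration. It only shows what ravel() does at the NumPy level, not what happens inside csr_matvecs.

```python
import numpy as np

# Hypothetical stand-in for coef_, shape (n_classes, n_features).
coef_c = np.arange(12, dtype=np.float64).reshape(3, 4)  # C-ordered
coef_f = np.asfortranarray(coef_c)                      # Fortran-ordered


def ravel_copies(a):
    """Return True if a.ravel() had to allocate a new buffer."""
    return not np.shares_memory(a, a.ravel())


# Transposing swaps the memory layout, so only the transpose of the
# Fortran-ordered array is C-contiguous and can be raveled as a view.
print(coef_c.T.flags['C_CONTIGUOUS'], ravel_copies(coef_c.T))  # False True
print(coef_f.T.flags['C_CONTIGUOUS'], ravel_copies(coef_f.T))  # True False
```

If the same pattern holds for the real coef_, storing it Fortran-ordered at the end of fit should let the transpose go through ravel() without a copy, but that would still need to be confirmed on Peter's actual data.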
