Hello,

I have the impression that linear_model.base.center_data always returns
a C_CONTIGUOUS array, even if X has been made F-contiguous beforehand
with X = np.asfortranarray(X) (see the IPython session below). It looks
to me as though this causes some expensive memory operations when
fitting some models (see the line profile below).
Am I missing something here?

best,
Immanuel

-----
In [10]: from sklearn.linear_model.base import center_data

In [12]: from sklearn.datasets.samples_generator import make_regression

In [13]: %paste
X, y, coef = make_regression(n_samples=10000, n_features=5000,
                n_informative=1000, random_state=0, coef=True)
X = np.asfortranarray(X)
## -- End pasted text --

In [14]: X.flags
Out[14]:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

In [17]: %paste
        X, y, X_mean, y_mean, X_std = center_data(X, y,
                fit_intercept=False, normalize=False, copy=True)
## -- End pasted text --

In [18]: X.flags
Out[18]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
----
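
A possible explanation, which I have not checked against the center_data
source, so treat it as an assumption: a plain X.copy() resets the memory
layout, because ndarray.copy defaults to order='C', whereas np.copy and
X.copy(order='K') keep the layout of the source array. A minimal sketch
with plain NumPy (the small array is only for illustration):

```python
import numpy as np

# Small Fortran-ordered array standing in for the real design matrix.
X = np.asfortranarray(np.arange(12, dtype=np.float64).reshape(3, 4))

c_copy = X.copy()            # ndarray.copy defaults to order='C'
k_copy = X.copy(order='K')   # 'K' keeps the layout of the source array
np_copy = np.copy(X)         # np.copy defaults to order='K'

for name, arr in [("X.copy()", c_copy),
                  ("X.copy(order='K')", k_copy),
                  ("np.copy(X)", np_copy)]:
    print(name, "F_CONTIGUOUS =", arr.flags['F_CONTIGUOUS'])
```

If center_data copies X in the first way, that would account for the
flags changing between In [14] and In [18].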

---
Timer unit: 1e-06 s

File: /home/mane/git/enet_strong_rules/sklearn/linear_model/coordinate_descent.py
Function: _dense_fit at line 167
Total time: 7.60257 s

Line #      Hits         Time    Per Hit   % Time  Line Contents
================================================================
   167                                             def _dense_fit(self, X, y, Xy=None, coef_init=None,
   168                                                            active_set_init=None, alpha_init=None, R_init=None):

   177         1            2        2.0      0.0      X, y, X_mean, y_mean, X_std = self._center_data(X, y,
   178         1      4457984  4457984.0     58.6              self.fit_intercept, self.normalize, copy=self.copy_X)
   200         1      1273583  1273583.0     16.8      X = np.asfortranarray(X)  # make data contiguous in memory
   213         1            5        5.0      0.0      self._fit_enet_with_strong_rule(X, y, Xy,
   214         1            4        4.0      0.0              active_set_init=active_set_init, coef_init=coef_init,
   215         1      1867135  1867135.0     24.6              alpha_init=alpha_init, R_init=R_init)
----
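
For comparison, np.asfortranarray only has to copy when its input is not
already Fortran-ordered. The sketch below uses plain NumPy with arbitrary
small shapes, nothing from scikit-learn. It suggests that the 16.8% spent
on line 200 comes from X arriving there in C order:

```python
import numpy as np

X_c = np.ones((200, 100))            # C-ordered, as NumPy creates by default
X_f = np.asfortranarray(X_c)         # C -> F: allocates and copies
X_f_again = np.asfortranarray(X_f)   # already F-ordered: no copy is made

print(np.shares_memory(X_c, X_f))        # False, the layout change copied
print(np.shares_memory(X_f, X_f_again))  # True, same underlying buffer
```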
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
