Hello,
I get the impression that linear_model.base.center_data always returns
a C_CONTIGUOUS array, even if X was made F-contiguous beforehand with
X = np.asfortranarray(X) (see the IPython session below). It looks to me
as if this causes an expensive memory operation when fitting some models:
in the line profile below, the center_data call accounts for 58.6% of the
time in _dense_fit, and the np.asfortranarray call that follows it for
another 16.8%.
Am I missing something here?
best,
Immanuel
-----
In [10]: from sklearn.linear_model.base import center_data
In [12]: from sklearn.datasets.samples_generator import make_regression
In [13]: %paste
X, y, coef = make_regression(n_samples=10000, n_features=5000,
                             n_informative=1000,
                             random_state=0, coef=True)
X = np.asfortranarray(X)
## -- End pasted text --
In [14]: X.flags
Out[14]:
C_CONTIGUOUS : False
F_CONTIGUOUS : True
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
In [17]: %paste
X, y, X_mean, y_mean, X_std = center_data(X, y,
                                          fit_intercept=False, normalize=False, copy=True)
## -- End pasted text --
In [18]: X.flags
Out[18]:
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
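
The layout flip in the session above is what a default-order copy produces
in NumPy: ndarray.copy() defaults to order='C', while order='K' keeps the
source layout. Whether center_data does this internally is an assumption,
not something confirmed here. The sketch below only shows the NumPy
behaviour on a small stand-in array.

```python
import numpy as np

# Small stand-in for the large design matrix in the session above.
X = np.asfortranarray(np.arange(12, dtype=np.float64).reshape(3, 4))

# ndarray.copy() defaults to order='C', so the memory layout flips.
c_copy = X.copy()

# order='K' matches the layout of the source array as closely as possible.
k_copy = X.copy(order="K")

# On an array that is already F-ordered, asfortranarray needs no new buffer.
again = np.asfortranarray(k_copy)

print("default copy :", c_copy.flags["C_CONTIGUOUS"], c_copy.flags["F_CONTIGUOUS"])
print("order='K'    :", k_copy.flags["C_CONTIGUOUS"], k_copy.flags["F_CONTIGUOUS"])
print("no new buffer:", np.shares_memory(again, k_copy))
```

If the copy inside center_data preserved the layout, the later
np.asfortranarray call would not have to move any data.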
----
Timer unit: 1e-06 s
File: /home/mane/git/enet_strong_rules/sklearn/linear_model/coordinate_descent.py
Function: _dense_fit at line 167
Total time: 7.60257 s

Line #  Hits     Time    Per Hit  % Time  Line Contents
==============================================================
   167                                    def _dense_fit(self, X, y, Xy=None, coef_init=None,
   168                                                   active_set_init=None, alpha_init=None, R_init=None):
   177     1        2        2.0     0.0      X, y, X_mean, y_mean, X_std = self._center_data(X, y,
   178     1  4457984  4457984.0    58.6          self.fit_intercept, self.normalize, copy=self.copy_X)
   200     1  1273583  1273583.0    16.8      X = np.asfortranarray(X)  # make data contiguous in memory
   213     1        5        5.0     0.0      self._fit_enet_with_strong_rule(X, y, Xy,
   214     1        4        4.0     0.0          active_set_init=active_set_init, coef_init=coef_init,
   215     1  1867135  1867135.0    24.6          alpha_init=alpha_init, R_init=R_init)
----
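
If the np.asfortranarray pass at line 200 is indeed caused by the layout
change, one way to avoid it would be to do the copy and the layout
conversion in a single step. The following is a minimal sketch, not
scikit-learn's center_data: center_keep_fortran is a hypothetical helper
that always centers and leaves out the normalize and fit_intercept options.

```python
import numpy as np


def center_keep_fortran(X, y):
    """Hypothetical helper: center X and y, returning X in Fortran order."""
    # A single copy that also fixes dtype and layout.
    X = np.array(X, dtype=np.float64, order="F", copy=True)
    y = np.array(y, dtype=np.float64, copy=True)
    X_mean = X.mean(axis=0)
    y_mean = y.mean(axis=0)
    # In-place subtraction reuses the F-ordered buffer.
    X -= X_mean
    y -= y_mean
    return X, y, X_mean, y_mean


rng = np.random.RandomState(0)
X0 = np.asfortranarray(rng.rand(6, 3))
y0 = rng.rand(6)

Xc, yc, X_mean, y_mean = center_keep_fortran(X0, y0)
print(Xc.flags["F_CONTIGUOUS"])
# No data has to be moved by a later conversion.
print(np.shares_memory(np.asfortranarray(Xc), Xc))
```

With the result already F-ordered, a later np.asfortranarray(X) needs no
new buffer, so only one full pass over the data is paid for.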