Hi.
Does it work for larger alpha? And does the R implementation work with
the same alpha?
Can you reproduce with synthetic data?
It would be great if you could post a self-contained example as an
issue, preferably using synthetic data:
https://github.com/scikit-learn/scikit-learn/issues
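A self-contained example along those lines might look like the sketch below. It is only an illustration under stated assumptions: the data is random Gaussian noise (not the CSV from the original post), the sizes and the helper name `try_alphas` are made up, and the import falls back from `GraphicalLasso` (the estimator's name in scikit-learn 0.20 and later) to the older `GraphLasso`.

```python
import warnings

import numpy as np

try:
    # scikit-learn >= 0.20 calls the estimator GraphicalLasso
    from sklearn.covariance import GraphicalLasso
except ImportError:
    # older releases (as used in this thread) call it GraphLasso
    from sklearn.covariance import GraphLasso as GraphicalLasso


def try_alphas(X, alphas, mode="lars", max_iter=100):
    """Fit one model per alpha and record 'ok' or the error message.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    results = {}
    for alpha in alphas:
        model = GraphicalLasso(alpha=alpha, mode=mode, max_iter=max_iter)
        try:
            with warnings.catch_warnings():
                # hide convergence warnings to keep the output short
                warnings.simplefilter("ignore")
                model.fit(X)
            results[alpha] = "ok"
        except FloatingPointError as exc:
            # keep going so every alpha is reported, not just the first failure
            results[alpha] = "failed: %s" % exc
    return results


# Synthetic, standardized data (assumed sizes; adjust to mimic the real set)
rng = np.random.RandomState(0)
X = rng.randn(2000, 10)
X = (X - X.mean(axis=0)) / X.std(axis=0)

alphas = [1e-8, 1e-6, 1e-4, 1e-2, 1e-1]
results = try_alphas(X, alphas)
for alpha in alphas:
    print(alpha, results[alpha])
```

Catching the exception per alpha shows which regularization values fail, which also answers the "larger alpha" question in one run.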
Thanks,
Andy
On 09/17/2015 11:55 AM, conahorse wrote:
Hi everyone,
I am trying to apply |glasso| to a very simple, sparse dataset with
60+ features and 30k+ observations.
The data is available in CSV format, if you are interested in
reproducing the issue:
http://www.mediafire.com/download/ek8kk0pg3jpc6ll/weight_comp_simple_prop.df.train.csv
<https://www.mediafire.com/?ek8kk0pg3jpc6ll>
I am using the sklearn implementation
<http://scikit-learn.org/stable/modules/generated/sklearn.covariance.GraphLasso.html#sklearn.covariance.GraphLasso.mahalanobis>
with very few lines of code, by trying different values for the
regularization coefficient α:
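
    from sklearn.covariance import GraphLasso

    # scaled_train: the standardized training data (zero mean, unit variance)
    for alpha in [0.00000001, 0.0000001, 0.000001, 0.00001, 0.0001]:
        glasso_model = GraphLasso(alpha=alpha, mode='lars', max_iter=2000)
        glasso_model.fit(scaled_train)
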
What I am experiencing is that the model cannot fit a covariance
estimate: it stops by raising an exception complaining that the
result is not SPD:

    /usr/local/lib/python3.4/dist-packages/sklearn/covariance/graph_lasso_.py in graph_lasso(emp_cov, alpha, cov_init, mode, tol, max_iter, verbose, return_costs, eps, return_n_iter)
        245         e.args = (e.args[0]
        246                   + '. The system is too ill-conditioned for this solver',)
    --> 247         raise e
        248
        249     if return_costs:

    /usr/local/lib/python3.4/dist-packages/sklearn/covariance/graph_lasso_.py in graph_lasso(emp_cov, alpha, cov_init, mode, tol, max_iter, verbose, return_costs, eps, return_n_iter)
        236                 break
        237             if not np.isfinite(cost) and i > 0:
    --> 238                 raise FloatingPointError('Non SPD result: the system is '
        239                                          'too ill-conditioned for this solver')
        240         else:

    FloatingPointError: Non SPD result: the system is too ill-conditioned for this solver. The system is too ill-conditioned for this solver

If I compute an MLE of the covariance with another sklearn function,
|empirical_covariance|
<http://scikit-learn.org/stable/modules/generated/sklearn.covariance.empirical_covariance.html#sklearn.covariance.empirical_covariance>
(which is, by the way, the same function that the |graph_lasso|
procedure uses), this matrix is indeed PSD. So I suspect that the
problem lies somewhere in the computation in the code.
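A PSD check of this kind can be sketched with NumPy alone. The example below is an illustration on made-up data, not the dataset from this post, and `spd_report` is a hypothetical helper name. It also prints the condition number, since a matrix can be PSD and still be close to singular, which is the "ill-conditioned" case the error message mentions.

```python
import numpy as np


def spd_report(cov):
    """Return smallest eigenvalue, largest eigenvalue and condition number
    of a symmetric matrix (hypothetical helper for illustration)."""
    eigvals = np.linalg.eigvalsh(cov)  # ascending order for symmetric input
    smallest, largest = eigvals[0], eigvals[-1]
    # the ratio is only meaningful when all eigenvalues are positive
    cond = largest / smallest if smallest > 0 else np.inf
    return smallest, largest, cond


# Made-up data with one nearly duplicated column (an assumption, chosen
# to produce a covariance that is close to singular)
rng = np.random.RandomState(0)
X = rng.randn(500, 5)
X[:, 4] = X[:, 3] + 1e-6 * rng.randn(500)

X = X - X.mean(axis=0)
emp_cov = X.T @ X / X.shape[0]  # maximum-likelihood covariance estimate

smallest, largest, cond = spd_report(emp_cov)
print("smallest eigenvalue:", smallest)
print("largest eigenvalue: ", largest)
print("condition number:   ", cond)
```

A very large condition number would suggest that the solver is struggling with near-collinear features rather than with a covariance that is genuinely not PSD.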
I am standardizing the data (zero mean, unit variance) before
applying the method, but the problem still persists.
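For reference, the standardization step can be sketched as follows. This is a minimal NumPy version on a toy matrix, with `standardize` as a hypothetical helper name; it also flags constant columns, which cannot be scaled to unit variance. Whether the dataset in this post contains such columns is not known.

```python
import numpy as np


def standardize(X):
    """Center each column and scale it to unit variance.

    Returns the scaled data and a boolean mask of constant columns
    (hypothetical helper for illustration).
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    constant = std == 0
    safe_std = np.where(constant, 1.0, std)  # avoid dividing by zero
    return (X - mean) / safe_std, constant


# Toy data: the middle column is constant
X = np.array([[1.0, 0.0, 2.0],
              [3.0, 0.0, 4.0],
              [5.0, 0.0, 9.0]])
scaled, constant = standardize(X)
print("constant columns:", np.flatnonzero(constant))
print("column means:", scaled.mean(axis=0))
print("column stds: ", scaled.std(axis=0))
```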
The same data works fine with the R package glasso, so it may be an
sklearn issue. I am using Python 3.4.
Any ideas? Am I missing some key point in applying the glasso?
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general