Hi.
Does it work for larger alpha? And does the R implementation work with the same alpha?
Can you reproduce with synthetic data?
It would be great if you could post a self-contained example as an issue, preferably using synthetic data:
https://github.com/scikit-learn/scikit-learn/issues
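A self-contained example along those lines might look like the sketch below. It is only a sketch under assumptions: make_sparse_data, fit_graph_lasso and sweep_alphas are hypothetical helper names, the synthetic data (mostly zeros, Gaussian elsewhere) is a guess at what "sparse" means for the dataset in question, and the sizes are arbitrary. The estimator is imported lazily under both its original name and the name used by newer scikit-learn releases.

```python
import numpy as np


def make_sparse_data(n_samples=2000, n_features=20, density=0.05, seed=0):
    """Gaussian values with most entries zeroed out (hypothetical stand-in data)."""
    rng = np.random.default_rng(seed)
    values = rng.standard_normal((n_samples, n_features))
    mask = rng.random((n_samples, n_features)) < density
    return values * mask


def fit_graph_lasso(X, alpha):
    """Fit the estimator with the settings from the original post."""
    try:
        # Class name in newer scikit-learn releases.
        from sklearn.covariance import GraphicalLasso as Model
    except ImportError:
        # Class name used in the original post.
        from sklearn.covariance import GraphLasso as Model
    return Model(alpha=alpha, mode="lars", max_iter=2000).fit(X)


def sweep_alphas(X, alphas, fit=fit_graph_lasso):
    """Return {alpha: "ok" or the error message} for each alpha tried."""
    outcomes = {}
    for alpha in alphas:
        try:
            fit(X, alpha)
            outcomes[alpha] = "ok"
        except FloatingPointError as exc:
            # This is the exception type shown in the reported traceback.
            outcomes[alpha] = "failed: {}".format(exc)
    return outcomes
```

Calling sweep_alphas(make_sparse_data(), [1e-8, 1e-4, 1e-2, 1e-1]) would show which alphas fit and which raise, which also answers the question about larger alpha.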

Thanks,
Andy


On 09/17/2015 11:55 AM, conahorse wrote:
Hi everyone,

I am trying to apply |glasso| to a very simple, sparse dataset with 60+ features and 30k+ observations. If you are interested in reproducing the issue, the data is available in CSV format at http://www.mediafire.com/download/ek8kk0pg3jpc6ll/weight_comp_simple_prop.df.train.csv <https://www.mediafire.com/?ek8kk0pg3jpc6ll>.

I am using the sklearn implementation <http://scikit-learn.org/stable/modules/generated/sklearn.covariance.GraphLasso.html#sklearn.covariance.GraphLasso.mahalanobis> with very few lines of code, trying different values for the regularization coefficient α:

    from sklearn.covariance import GraphLasso

    # scaled_train is the standardized training data
    for alpha in [0.00000001, 0.0000001, 0.000001, 0.00001, 0.0001]:
        glasso_model = GraphLasso(alpha=alpha, mode='lars', max_iter=2000)
        glasso_model.fit(scaled_train)

The model fails to fit a covariance estimate: it stops with an exception complaining that the result is not SPD:

    /usr/local/lib/python3.4/dist-packages/sklearn/covariance/graph_lasso_.py in graph_lasso(emp_cov, alpha, cov_init, mode, tol, max_iter, verbose, return_costs, eps, return_n_iter)
        245         e.args = (e.args[0]
        246                   + '. The system is too ill-conditioned for this solver',)
    --> 247         raise e
        248
        249     if return_costs:

    /usr/local/lib/python3.4/dist-packages/sklearn/covariance/graph_lasso_.py in graph_lasso(emp_cov, alpha, cov_init, mode, tol, max_iter, verbose, return_costs, eps, return_n_iter)
        236                 break
        237             if not np.isfinite(cost) and i > 0:
    --> 238                 raise FloatingPointError('Non SPD result: the system is '
        239                                          'too ill-conditioned for this solver')
        240         else:

    FloatingPointError: Non SPD result: the system is too ill-conditioned for this solver. The system is too ill-conditioned for this solver

If I compute a maximum-likelihood estimate of the covariance with another sklearn function, empirical_covariance <http://scikit-learn.org/stable/modules/generated/sklearn.covariance.empirical_covariance.html#sklearn.covariance.empirical_covariance> (which is, by the way, the same function that the |graph_lasso| procedure uses), the resulting matrix is indeed PSD. So I suspect that the problem lies somewhere in the solver's computation.
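Since the error message talks about ill-conditioning and not only about positive definiteness, it may help to look at the eigenvalues of the empirical covariance directly: a matrix can be PSD and still have a huge condition number. The following is a minimal NumPy sketch; covariance_diagnostics is a hypothetical helper name, and np.cov with bias=True is assumed here as a stand-in for the maximum-likelihood estimate.

```python
import numpy as np


def covariance_diagnostics(X):
    """Return (smallest eigenvalue, largest eigenvalue, condition number)."""
    X = np.asarray(X, dtype=float)
    # bias=True divides by n, as a maximum-likelihood estimate does.
    cov = np.cov(X, rowvar=False, bias=True)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigvals = np.linalg.eigvalsh(cov)
    smallest, largest = eigvals[0], eigvals[-1]
    # A non-positive smallest eigenvalue is reported as infinite conditioning.
    condition = largest / smallest if smallest > 0 else np.inf
    return smallest, largest, condition
```

A smallest eigenvalue close to zero, or a very large condition number, would suggest nearly collinear features, which is the kind of input on which a small alpha is most likely to cause trouble.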

I am now standardizing the data (zero mean, unit variance) before applying the method, but the problem still persists.
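For reference, the standardization step can be sketched in plain NumPy as below. This is an illustration and not the code actually used; standardize is a hypothetical helper name. It also reports constant columns, which is an assumption about something worth ruling out with sparse data, since a zero variance would otherwise produce a division by zero.

```python
import numpy as np


def standardize(X):
    """Scale columns to zero mean and unit variance.

    Returns the scaled array and the indices of constant columns, which
    are only centered because their variance is zero.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    constant = np.flatnonzero(std == 0)
    # Divide constant columns by 1 instead of 0 to avoid NaN or inf values.
    safe_std = np.where(std == 0, 1.0, std)
    return (X - mean) / safe_std, constant
```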

The same data works fine with the R package glasso, so it may be an sklearn issue. I am using Python 3.4.

Any ideas? Am I missing some key point in applying the glasso?




_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
