Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-07 Thread Olivier Grisel
A quick remark: Instead of: %pylab inline --no-import-all you can just do: %matplotlib inline -- Olivier

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-07 Thread Olivier Grisel
2013/11/7 Vlad Niculae zephy...@gmail.com: Hi everybody, I just updated the gist quite a lot, please take a look: http://nbviewer.ipython.org/7224672 I'll go to sleep and interpret it with a fresh eye tomorrow, but what's interesting at the moment is: KKT's performance is quite constant,

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-07 Thread Vlad Niculae
The regularization is the same; I think the higher residuals come from the fact that the gradient is raveled, so compared to `n_targets` independent problems it will take different steps. I don't think there are any convergence issues, because I made the solvers print a warning in case they don't
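
[For readers following along, here is a minimal, hypothetical sketch (not the code from the gist) of what solving all targets jointly with a raveled gradient looks like with SciPy's L-BFGS-B. The array names and sizes are made up for illustration.]

import numpy as np
from scipy.optimize import fmin_l_bfgs_b

rng = np.random.RandomState(0)
n_samples, n_features, n_targets = 50, 20, 5
X = rng.rand(n_samples, n_features)
Y = np.dot(X, np.abs(rng.randn(n_features, n_targets)))

def loss_and_grad(w_flat):
    # W holds the coefficients for all targets; L-BFGS-B only sees the
    # raveled vector, so every target shares one optimization trajectory
    # instead of n_targets independent ones.
    W = w_flat.reshape(n_features, n_targets)
    R = np.dot(X, W) - Y
    loss = 0.5 * np.sum(R ** 2)
    grad = np.dot(X.T, R)
    return loss, grad.ravel()

w0 = np.zeros(n_features * n_targets)
bounds = [(0, None)] * w0.size  # non-negativity constraint on every coefficient
w_opt, f_opt, info = fmin_l_bfgs_b(loss_and_grad, w0, bounds=bounds)
W_opt = w_opt.reshape(n_features, n_targets)
print("residual:", np.linalg.norm(np.dot(X, W_opt) - Y))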

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-07 Thread Vlad Niculae
Come to think of it, Olivier, what do you mean when you say L-BFGS-B has higher residuals? I fail to see this trend; what I see is that L1 > L2 > no reg. in terms of residuals, with different methods coming very close to one another for the same regularisation objective. Could you be more specific?

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-07 Thread Vlad Niculae
Also I found this pretty big difference in timing when computing elementwise norms and products.

In [1]: X = np.random.randn(1000, 900)
In [2]: %timeit np.linalg.norm(X, 'fro')
100 loops, best of 3: 4.8 ms per loop
In [3]: %timeit np.sqrt(np.sum(X ** 2))
100 loops, best of 3: 4.5 ms per loop

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-07 Thread Vlad Niculae
In reply to Olivier's previous comment, as it's not at all obvious from the plots, I chose a case where lbfgsb-l1 seems very far away and printed the residuals of it and of pg-l1:

In [227]: tall_med[tall_med['solver'] == 'lbfgsb-l1']['residual']
Out[227]:
2580.9370832
2650.9405044
272

Re: [Scikit-learn-general] Prototype Extraction algorithm implementation

2013-11-07 Thread Lars Buitinck
2013/11/7, Andy t3k...@gmail.com: Could you please give a link to the reference paper? I couldn't find it. Could you maybe also give a quick description of the algorithm? I'm afraid I'm not familiar with it (by that name). Maybe it's this one? (Didn't read it yet.)

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-07 Thread Lars Buitinck
2013/11/7, Vlad Niculae zephy...@gmail.com: Also I found this pretty big difference in timing when computing elementwise norms and products. This is a known problem with np.linalg.norm, and so is the memory consumption. You should use sklearn.utils.extmath.norm for the Frobenius norm. Also

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-07 Thread Vlad Niculae
This is a known problem with np.linalg.norm, and so is the memory consumption. You should use sklearn.utils.extmath.norm for the Frobenius norm. Hmm. Indeed I missed that, but still, this is a bit odd. sklearn.utils.extmath.norm is slower than raveling on my Anaconda with MKL/Accelerate setup:
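
[A self-contained way to reproduce this comparison, as a sketch rather than the exact benchmark quoted above. sklearn.utils.extmath.norm existed in scikit-learn releases from this era but has since been removed, hence the guarded import; timings depend heavily on the BLAS backend in use (MKL, Accelerate, OpenBLAS, ...).]

import timeit
import numpy as np

X = np.random.randn(1000, 900)

candidates = {
    "np.linalg.norm(X, 'fro')": lambda: np.linalg.norm(X, 'fro'),
    "np.sqrt(np.sum(X ** 2))": lambda: np.sqrt(np.sum(X ** 2)),
}
try:
    # Available at the time of this thread; removed in later releases.
    from sklearn.utils.extmath import norm as skl_norm
    candidates["sklearn.utils.extmath.norm(X)"] = lambda: skl_norm(X)
except ImportError:
    pass

for label, fn in candidates.items():
    # best of 3 repeats, 100 calls each, reported per call
    best = min(timeit.repeat(fn, number=100, repeat=3)) / 100
    print("%-32s %.2f ms per call" % (label, best * 1e3))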

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-07 Thread Mathieu Blondel
Thanks for the awesome work, Vlad! It's nice to see good progress. On Thu, Nov 7, 2013 at 7:12 PM, Vlad Niculae zephy...@gmail.com wrote: The regularization is the same; I think the higher residuals come from the fact that the gradient is raveled, so compared to `n_targets` independent

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-07 Thread Lars Buitinck
2013/11/7 Vlad Niculae zephy...@gmail.com: This is a known problem with np.linalg.norm, and so is the memory consumption. You should use sklearn.utils.extmath.norm for the Frobenius norm. Hmm. Indeed I missed that, but still, this is a bit odd. sklearn.utils.extmath.norm is slower than

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-07 Thread Lars Buitinck
2013/11/7 Mathieu Blondel math...@mblondel.org: Do we need two different regularization parameters for coefficients and components? MiniBatchDictionaryLearning seems to have only one alpha. For reproducing results from the literature this is useful. E.g. Hoyer only regularizes one of the matrices.

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-07 Thread Mathieu Blondel
On Thu, Nov 7, 2013 at 11:57 PM, Lars Buitinck larsm...@gmail.com wrote: For reproducing results from literature this is useful. E.g. Hoyer only regularizes one of the matrices. For efficient grid-search with shared values, we could do this: if self.alpha_comp is None and self.alpha_coef is

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-07 Thread Vlad Niculae
I feel like this would go against 'explicit is better than implicit', but without it grid search would indeed be awkward. Maybe: if self.alpha_coef == 'same': alpha_coef = self.alpha_comp ? On Thu, Nov 7, 2013 at 4:19 PM, Mathieu Blondel math...@mblondel.org wrote: On Thu, Nov 7, 2013 at

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-07 Thread Mathieu Blondel
On Fri, Nov 8, 2013 at 12:28 AM, Vlad Niculae zephy...@gmail.com wrote: I feel like this would go against explicit is better than implicit, but without it grid search would indeed be awkward. Maybe: if self.alpha_coef == 'same': alpha_coef = self.alpha_comp ? Sounds good to me!
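
[A hypothetical sketch of the 'same' convention agreed on above; this is illustrative only, not actual scikit-learn code. The estimator keeps two regularization parameters, with the coefficient one defaulting to 'same' so a grid search only has to vary alpha_comp.]

class HypotheticalNMF(object):
    """Illustrative estimator skeleton for the alpha_coef='same' idea."""

    def __init__(self, alpha_comp=0.0, alpha_coef='same'):
        self.alpha_comp = alpha_comp
        self.alpha_coef = alpha_coef

    def fit(self, X, y=None):
        alpha_comp = self.alpha_comp
        # 'same' explicitly asks to reuse the component penalty for the
        # coefficients, so a single grid over alpha_comp tunes both.
        alpha_coef = (self.alpha_comp if self.alpha_coef == 'same'
                      else self.alpha_coef)
        # ... the actual factorization would use alpha_comp / alpha_coef here
        return self

[With this convention, a grid search needs only one parameter grid, e.g. GridSearchCV(HypotheticalNMF(), {'alpha_comp': [0.1, 1.0, 10.0]}), while still allowing the two penalties to be set independently when reproducing results from the literature.]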

Re: [Scikit-learn-general] Online learning

2013-11-07 Thread Jim
Andy t3kcit@... writes: I would venture that which one is better would depend on the nature of your data. Do you know the number of types beforehand? And do all types have 1000 categories? The number of types is defined; however, the number of categories keeps increasing... but as I see

[Scikit-learn-general] Custom function in decision-tree based classifiers

2013-11-07 Thread Thomas Dent
Hi, the only current options for deciding on feature splits in trees / forests are 'entropy' and 'gini'. Two questions on this: - is anyone planning on implementing others? - how feasible would it be to have the option of passing a custom function to the tree or forest to use in splitting?

Re: [Scikit-learn-general] Custom function in decision-tree based classifiers

2013-11-07 Thread Gilles Louppe
Hi Thomas, Indeed, gini and entropy are the only supported impurity criteria for classification. I don't think we have plans right now to add others - which one do you have in mind? How feasible would it be to have the option of passing a custom function to the tree or forest to use in splitting?