A quick remark:
Instead of:
%pylab inline --no-import-all
you can just do:
%matplotlib inline
--
Olivier
2013/11/7 Vlad Niculae zephy...@gmail.com:
Hi everybody,
I just updated the gist quite a lot, please take a look:
http://nbviewer.ipython.org/7224672
I'll go to sleep and interpret it with a fresh eye tomorrow, but
what's interesting at the moment is:
- KKT's performance is quite constant;
- the regularization is the same; I think the higher residuals come from
the fact that the gradient is raveled, so compared to `n_targets`
independent problems, it will take different steps;
- I don't think there are any convergence issues, because I made the
solvers print a warning in case they don't converge.
Come to think of it, Olivier, what do you mean when you say L-BFGS-B
has higher residuals? I fail to see this trend; what I see is that L1 >
L2 > no reg. in terms of residuals, with different methods coming
very close to one another for the same regularisation objective.
Could you be more specific?
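The ordering above can be sanity-checked on a toy problem: the unregularized least-squares fit minimizes the training residual by definition, so any penalized fit can only match or exceed it. A minimal sketch (solver choices and alpha values are illustrative, not the benchmark setup from the gist):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.RandomState(0)
X = rng.randn(100, 20)
y = X @ rng.randn(20) + 0.1 * rng.randn(100)

def residual(w):
    # Training residual ||y - Xw||_2 for a coefficient vector w.
    return np.linalg.norm(y - X @ w)

# Unregularized least squares: the smallest possible training residual.
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Penalized fits (alphas picked arbitrarily for illustration).
w_l2 = Ridge(alpha=1.0, fit_intercept=False).fit(X, y).coef_
w_l1 = Lasso(alpha=0.1, fit_intercept=False).fit(X, y).coef_

r_ols, r_l2, r_l1 = residual(w_ols), residual(w_l2), residual(w_l1)

# Regularization can only raise the training residual, never lower it.
assert r_ols <= r_l2 + 1e-10 and r_ols <= r_l1 + 1e-10
```

Whether L1 sits above L2 for a given alpha depends on how the penalties are scaled, which is why comparing residuals only makes sense at matched regularisation objectives, as above.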
Also, I found this pretty big difference in timing when computing
elementwise norms and products.
In [1]: X = np.random.randn(1000, 900)
In [2]: %timeit np.linalg.norm(X, 'fro')
100 loops, best of 3: 4.8 ms per loop
In [3]: %timeit np.sqrt(np.sum(X ** 2))
100 loops, best of 3: 4.5 ms per loop
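For reference, the two computations agree numerically; a third variant using a BLAS dot on the raveled array avoids materializing the `X ** 2` temporary, which is where the memory difference comes from (a quick sketch, not a benchmark):

```python
import numpy as np

X = np.random.randn(1000, 900)

# Three equivalent ways to get the Frobenius norm of X.
frob_linalg = np.linalg.norm(X, 'fro')
frob_sq = np.sqrt(np.sum(X ** 2))   # allocates an X-sized temporary
x = X.ravel()                       # a view, no copy for contiguous X
frob_dot = np.sqrt(np.dot(x, x))    # BLAS dot, no large temporary

# All three agree to floating-point tolerance.
assert np.allclose([frob_linalg, frob_sq], frob_dot)
```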
In reply to Olivier's previous comment, as it's not at all obvious
from the plots, I chose a case where lbfgsb-l1 seems very far away and
printed the residuals of it and of pg-l1:
In [227]:
tall_med[tall_med['solver'] == 'lbfgsb-l1']['residual']
Out[227]:
2580.9370832
2650.9405044
272
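For anyone following along without the notebook, the selection above is plain boolean indexing on a long-format results frame. A toy sketch with made-up residual values (the real `tall_med` comes from the benchmark gist):

```python
import pandas as pd

# Toy long-format results table; solver names follow the discussion,
# residual values are invented for illustration only.
tall_med = pd.DataFrame({
    'solver':   ['lbfgsb-l1', 'pg-l1', 'lbfgsb-l1', 'pg-l1'],
    'residual': [2580.94,     2570.12, 2650.94,     2640.03],
})

# All residuals recorded for one solver:
lbfgs_res = tall_med[tall_med['solver'] == 'lbfgsb-l1']['residual']
print(lbfgs_res.tolist())  # [2580.94, 2650.94]
```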
2013/11/7, Andy t3k...@gmail.com:
Could you please give a link to the reference paper? I couldn't find it.
Could you maybe also give a quick description of the algorithm? I'm
afraid I'm not familiar with it (by that name).
Maybe it's this one? (Didn't read it yet.)
2013/11/7, Vlad Niculae zephy...@gmail.com:
Also I found this pretty big difference in timing when computing
elementwise norms and products.
This is a known problem with np.linalg.norm, and so is the memory
consumption. You should use sklearn.utils.extmath.norm for the
Frobenius norm.
Also
This is a known problem with np.linalg.norm, and so is the memory
consumption. You should use sklearn.utils.extmath.norm for the
Frobenius norm.
Hmm. Indeed I missed that, but still, this is a bit odd:
sklearn.utils.extmath.norm is slower than raveling on my Anaconda
setup with MKL (Accelerate):
Thanks for the awesome work Vlad! It's nice to see good progress.
On Thu, Nov 7, 2013 at 7:12 PM, Vlad Niculae zephy...@gmail.com wrote:
The regularization is the same, I think the higher residuals come from
the fact that the gradient is raveled, so compared to `n_targets`
independent
2013/11/7 Vlad Niculae zephy...@gmail.com:
This is a known problem with np.linalg.norm, and so is the memory
consumption. You should use sklearn.utils.extmath.norm for the
Frobenius norm.
Hmm. Indeed I missed that, but still, this is a bit odd.
sklearn.utils.extmath.norm is slower than
2013/11/7 Mathieu Blondel math...@mblondel.org:
Do we need two different regularization parameters for coefficients and
components? MiniBatchDictionaryLearning seems to have only one alpha.
For reproducing results from literature this is useful. E.g. Hoyer
only regularizes one of the matrices.
On Thu, Nov 7, 2013 at 11:57 PM, Lars Buitinck larsm...@gmail.com wrote:
For reproducing results from literature this is useful. E.g. Hoyer
only regularizes one of the matrices.
For efficient grid-search with shared values, we could do this:
if self.alpha_comp is None and self.alpha_coef is
I feel like this would go against "explicit is better than implicit",
but without it grid search would indeed be awkward. Maybe:
if self.alpha_coef == 'same':
alpha_coef = self.alpha_comp
?
On Thu, Nov 7, 2013 at 4:19 PM, Mathieu Blondel math...@mblondel.org wrote:
On Fri, Nov 8, 2013 at 12:28 AM, Vlad Niculae zephy...@gmail.com wrote:
I feel like this would go against explicit is better than implicit,
but without it grid search would indeed be awkward. Maybe:
if self.alpha_coef == 'same':
alpha_coef = self.alpha_comp
?
Sounds good to me!
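The 'same' sentinel agreed on above could look like this in a fit method. The class and attribute names here are hypothetical, mirroring the discussion rather than any actual scikit-learn API:

```python
class TwoAlphaEstimator:
    """Hypothetical estimator with separate penalties for components
    and coefficients, where alpha_coef defaults to mirroring alpha_comp."""

    def __init__(self, alpha_comp=1.0, alpha_coef='same'):
        # Store parameters untouched in __init__ (scikit-learn convention);
        # sentinels are resolved only at fit time.
        self.alpha_comp = alpha_comp
        self.alpha_coef = alpha_coef

    def fit(self, X=None):
        # Resolving 'same' here means grid-searching alpha_comp alone
        # varies both penalties together.
        self.alpha_coef_ = (self.alpha_comp if self.alpha_coef == 'same'
                            else self.alpha_coef)
        return self

shared = TwoAlphaEstimator(alpha_comp=0.3).fit()
split = TwoAlphaEstimator(alpha_comp=0.3, alpha_coef=0.7).fit()
print(shared.alpha_coef_, split.alpha_coef_)  # 0.3 0.7
```

Keeping the raw parameter on `self` and writing the resolved value to a trailing-underscore attribute keeps `get_params`/`set_params` round-tripping intact, which is what grid search relies on.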
Andy t3kcit@... writes:
I would venture that which one is better would depend on the nature of
your data.
Do you know the number of types beforehand? And do all types have 1000
categories?
The number of types is fixed; however, the number of categories keeps
increasing...but as I see
Hi,
the only current options for deciding on feature splits in trees / forests are
'entropy' and 'gini'. Two questions on this:
- is anyone planning on implementing others?
- how feasible would it be to have the option of passing a custom function to
the tree or forest to use in splitting?
Hi Thomas,
Indeed, gini and entropy are the only supported impurity criteria for
classification. I don't think we have plans right now to add others.
Which one do you have in mind?
how feasible would it be to have the option of passing a custom function to
the tree or forest to use in splitting?
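For concreteness, both supported criteria are simple functions of the class proportions at a node. A self-contained sketch (not the actual Cython implementation inside sklearn.tree):

```python
import numpy as np

def gini(counts):
    """Gini impurity: 1 - sum_k p_k ** 2."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(counts):
    """Entropy impurity: -sum_k p_k * log2(p_k), with 0 log 0 = 0."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]  # drop empty classes so log2 is well-defined
    return -np.sum(p * np.log2(p))

# A pure node scores 0 under both criteria; a 50/50 split is the worst
# case (gini 0.5, entropy 1.0 for two classes).
assert gini([10, 0]) == 0.0 and entropy([10, 0]) == 0.0
assert gini([5, 5]) == 0.5 and entropy([5, 5]) == 1.0
```

As for custom split functions: the splitter evaluates the criterion at every candidate threshold in tight Cython loops, so calling back into an arbitrary Python function at each candidate would likely dominate the runtime; that is presumably why the criteria are baked in.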