Hi Thomas,
Indeed, gini and entropy are the only supported impurity criteria for
classification. I don't think we have any plans to add others right now;
which one do you have in mind?
> how feasible would it be to have the option of passing a custom function
> to the tree or forest to use for splitting?
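For reference, both supported criteria are simple functions of the class
proportions p_k at a node; a standalone sketch of what they compute (not
the actual Cython implementation the trees use):

    import numpy as np

    def gini(y):
        # Gini impurity: 1 - sum_k p_k ** 2 over class proportions p_k.
        p = np.bincount(y).astype(float)
        p = p[p > 0] / p.sum()
        return 1.0 - np.sum(p ** 2)

    def entropy(y):
        # Shannon entropy: -sum_k p_k * log2(p_k).
        p = np.bincount(y).astype(float)
        p = p[p > 0] / p.sum()
        return -np.sum(p * np.log2(p))

    print(gini(np.array([0, 0, 1, 1, 1, 2])))     # 0.6111...
    print(entropy(np.array([0, 0, 1, 1, 1, 2])))  # 1.4591...

A split is then chosen to maximize the impurity decrease between the
parent node and the weighted impurities of its children.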
Hi,
The only current options for deciding on feature splits in trees / forests
are 'entropy' and 'gini'. Two questions on this:
- is anyone planning on implementing others?
- how feasible would it be to have the option of passing a custom function
  to the tree or forest to use for splitting?
W
Andy writes:
> I would venture that which one is better would depend on the nature of
> your data.
> Do you know the number of types beforehand? And do all types have 1000
> categories?
The number of types is defined; however, the number of categories keeps
increasing... but as I see it is un
On Fri, Nov 8, 2013 at 12:28 AM, Vlad Niculae wrote:
> I feel like this would go against "explicit is better than implicit",
> but without it grid search would indeed be awkward. Maybe:
>
>     if self.alpha_coef == 'same':
>         alpha_coef = self.alpha_comp
>
> ?
>
Sounds good to me!
Mathieu
I feel like this would go against "explicit is better than implicit",
but without it grid search would indeed be awkward. Maybe:
    if self.alpha_coef == 'same':
        alpha_coef = self.alpha_comp
?
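Concretely, the resolution could live in fit; a sketch that assumes only
the attribute names from the snippet above, everything else illustrative:

    def _resolve_alphas(self):
        # 'same' ties the coefficient penalty to the component penalty,
        # so a grid search only has to vary alpha_comp.
        alpha_comp = self.alpha_comp
        if self.alpha_coef == 'same':
            alpha_coef = self.alpha_comp
        else:
            alpha_coef = self.alpha_coef
        return alpha_comp, alpha_coef

With alpha_coef='same', a grid over {'alpha_comp': [0.1, 1, 10]} then
searches the tied setting with a single parameter.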
On Thu, Nov 7, 2013 at 4:19 PM, Mathieu Blondel wrote:
>
> On Thu, Nov 7, 2013 at 11:57 PM, Lars Buitinck wrote:
On Thu, Nov 7, 2013 at 11:57 PM, Lars Buitinck wrote:
> For reproducing results from literature this is useful. E.g. Hoyer
> only regularizes one of the matrices.
>
For efficient grid-search with shared values, we could do this:
    if self.alpha_comp is None and self.alpha_coef is None:
        alpha_
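The message is cut off there, but the shape being described is presumably
a fallback to one shared value when neither parameter is set; a
hypothetical completion (the single 'alpha' attribute is an assumed name):

    # Assumed: a single shared penalty 'alpha' that both parameters
    # inherit unless they are set explicitly.
    alpha_comp = self.alpha_comp
    alpha_coef = self.alpha_coef
    if alpha_comp is None and alpha_coef is None:
        alpha_comp = alpha_coef = self.alpha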
2013/11/7 Mathieu Blondel:
> Do we need two different regularization parameters for coefficients and
> components? MiniBatchDictionaryLearning seems to have only one "alpha".
For reproducing results from literature this is useful. E.g. Hoyer
only regularizes one of the matrices.
2013/11/7 Vlad Niculae:
>> This is a known problem with np.linalg.norm, and so is the memory
>> consumption. You should use sklearn.utils.extmath.norm for the
>> Frobenius norm.
>
> Hmm. Indeed I missed that, but still, this is a bit odd.
> sklearn.utils.extmath.norm is slower than raveling on my anaconda with
> MKL accelerate setup.
Thanks for the awesome work, Vlad! It's nice to see good progress.
On Thu, Nov 7, 2013 at 7:12 PM, Vlad Niculae wrote:
> The regularization is the same; I think the higher residuals come from
> the fact that the gradient is raveled, so compared to `n_targets`
> independent problems, it will take different steps.
> This is a known problem with np.linalg.norm, and so is the memory
> consumption. You should use sklearn.utils.extmath.norm for the
> Frobenius norm.
Hmm. Indeed I missed that, but still, this is a bit odd.
sklearn.utils.extmath.norm is slower than raveling on my anaconda with
MKL accelerate setup.
2013/11/7, Vlad Niculae:
> Also I found this pretty big difference in timing when computing
> elementwise norms and products.
This is a known problem with np.linalg.norm, and so is the memory
consumption. You should use sklearn.utils.extmath.norm for the
Frobenius norm.
Also note that the curren
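For anyone following along, the gain comes from doing one BLAS pass over
the data instead of materializing temporaries; roughly what the helper
amounts to (a sketch, not the actual source):

    import numpy as np
    from scipy import linalg

    def frobenius_norm(X):
        # One BLAS nrm2 call over the flattened data; avoids the
        # intermediate arrays that np.linalg.norm allocates.
        x = np.asarray(X).ravel()
        nrm2, = linalg.get_blas_funcs(['nrm2'], (x,))
        return nrm2(x)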
2013/11/7, Andy:
> Could you please give a link to the reference paper? I couldn't find it.
> Could you maybe also give a quick description of the algorithm, I'm
> afraid I'm not familiar with it (by that name).
Maybe it's this one? (Didn't read it yet.)
http://www.cs.kent.edu/~dragan/ST/papers/
In reply to Olivier's previous comment: since it's not at all obvious
from the plots, I chose a case where lbfgsb-l1 seems very far away and
printed its residuals alongside those of pg-l1:
In [227]: tall_med[tall_med['solver'] == 'lbfgsb-l1']['residual']
Out[227]:
2580.9370832
2650.9405044
272
Also I found this pretty big difference in timing when computing
elementwise norms and products.
In [1]: X = np.random.randn(1000, 900)
In [2]: %timeit np.linalg.norm(X, 'fro')
100 loops, best of 3: 4.8 ms per loop
In [3]: %timeit np.sqrt(np.sum(X ** 2))
100 loops, best of 3: 4.5 ms per loop
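The raveling mentioned elsewhere in the thread is presumably something
along these lines, which skips the X ** 2 temporary by handing the
flattened data straight to BLAS:

    x = X.ravel()
    np.sqrt(np.dot(x, x))  # same value as np.linalg.norm(X, 'fro')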
Come to think of it, Olivier, what do you mean when you say L-BFGS-B
has higher residuals? I fail to see this trend; what I see is that
L1 > L2 > no reg. in terms of residuals, with different methods coming
very close to one another for the same regularisation objective.
Could you be more specific?
The regularization is the same; I think the higher residuals come from
the fact that the gradient is raveled, so compared to `n_targets`
independent problems, it will take different steps.
I don't think there are any convergence issues, because I made the
solvers print a warning in case they don't converge.
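A toy version of the point (illustrative only, not the benchmark code):
raveling W makes L-BFGS-B build one curvature approximation across all
targets, whereas per-column solves each build their own, so the iterates
differ even though the objective is separable:

    import numpy as np
    from scipy.optimize import fmin_l_bfgs_b

    rng = np.random.RandomState(0)
    X = rng.randn(50, 10)
    Y = rng.randn(50, 3)

    def f_joint(w_flat):
        # 0.5 * ||X W - Y||_F^2 with W raveled into a single vector.
        W = w_flat.reshape(10, 3)
        R = np.dot(X, W) - Y
        return 0.5 * np.sum(R ** 2), np.dot(X.T, R).ravel()

    W_joint = fmin_l_bfgs_b(f_joint, np.zeros(30))[0].reshape(10, 3)

    def f_col(w, y):
        # The same objective, one column of Y at a time.
        r = np.dot(X, w) - y
        return 0.5 * np.dot(r, r), np.dot(X.T, r)

    W_cols = np.column_stack([
        fmin_l_bfgs_b(f_col, np.zeros(10), args=(y,))[0] for y in Y.T])

On this unregularized toy both reach the same least-squares solution; with
bounds or an L1 term and a finite iteration budget, the step sequences
(and the residuals at stopping time) can diverge.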
2013/11/7 Vlad Niculae:
> Hi everybody,
>
> I just updated the gist quite a lot, please take a look:
> http://nbviewer.ipython.org/7224672
>
> I'll go to sleep and interpret it with a fresh eye tomorrow, but
> what's interesting at the moment is:
>
> KKT's performance is quite constant,
> PG with
A quick remark:
Instead of:
%pylab inline --no-import-all
you can just do:
%matplotlib inline
--
Olivier