On 12/04/2012 02:05 AM, Afik Cohen wrote:
> Andreas Mueller writes:
>
>> On 12/03/2012 09:39 PM, Afik Cohen wrote:
>>> No, we aren't doing multi-label classification, just multiclass. He was
>>> saying we could just use SGDClassifier directly, which is true, but
>>> AFAIK there is no way to get per-class probability outputs.

It's probably better to train a linear classifier on the text features
alone and a second (potentially non-linear classifier such as GBRT or
ExtraTrees) on the predict_proba outcome of the text classifier + your
additional low-dim features.

This is some kind of stacking method (a sort …
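A minimal sketch of that two-stage setup, with synthetic stand-ins for the text features and the extra low-dimensional features (LogisticRegression stands in here for whatever linear text model is used; all sizes and names are illustrative, not the poster's actual pipeline):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-ins: 100 "text" features plus 3 extra low-dim features.
X, y = make_classification(n_samples=300, n_features=100,
                           n_informative=20, n_classes=3, random_state=0)
X_extra = np.random.RandomState(0).rand(300, 3)

X_tr, X_te, Xe_tr, Xe_te, y_tr, y_te = train_test_split(
    X, X_extra, y, random_state=0)

# Stage 1: linear model on the high-dimensional "text" features alone.
text_clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Stage 2: non-linear model on the class probabilities + extra features.
Z_tr = np.hstack([text_clf.predict_proba(X_tr), Xe_tr])
Z_te = np.hstack([text_clf.predict_proba(X_te), Xe_te])
stacked = ExtraTreesClassifier(n_estimators=100, random_state=0).fit(Z_tr, y_tr)

print(stacked.score(Z_te, y_te))
```

One caveat worth noting: in practice the stage-one probabilities fed to the second model should come from out-of-fold predictions (e.g. via cross-validation), not from the same data the text classifier was fit on, otherwise the second stage trains on over-optimistic probabilities.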
I have updated the virtualenvs of the jenkins vm to use:
- ubuntu LTS matplotlib 0.99.1 on python 2.6
- latest stable matplotlib 1.2.0 on python 2.7
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
> Have you scaled your additional features to the [0-1] range as the
> probability features from the text classifier?

Until now I performed Scaler() (I'm on 0.12 atm) on the new feature
space. Should I do this on my appended features only? But well, they are
not exactly between 0 and 1 then. I…
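For putting only the appended features on the same [0, 1] footing as the probability columns, MinMaxScaler is the usual tool — the Scaler mentioned above (from 0.12) was renamed StandardScaler in later releases and standardizes to zero mean / unit variance rather than to [0, 1]. A small sketch with made-up numbers:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

proba_features = np.array([[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]])  # already in [0, 1]
extra = np.array([[10.0, -3.0], [25.0, 0.0], [40.0, 3.0]])       # arbitrary scale

# Scale only the appended features so every column lives in [0, 1].
extra_scaled = MinMaxScaler().fit_transform(extra)
X = np.hstack([proba_features, extra_scaled])
print(X.min(), X.max())  # 0.0 1.0
```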
On 04.12.2012 12:35, Olivier Grisel wrote:
> I have updated the virtualenvs of the jenkins vm to use:
>
> - ubuntu LTS matplotlib 0.99.1 on python 2.6
> - latest stable matplotlib 1.2.0 on python 2.7

Thanks a lot :)
thanks!
2012/12/4 Philipp Singer :
>
> I use a linear SVM for learning my probabilities for the samples (I have
> used grid search for determining the optimal parameters). Then I append
> the additional features and do, as suggested, gradient boosting or extra
> tree classification. What do you mean by testing ju…
On 04.12.2012 15:15, Olivier Grisel wrote:
> 2012/12/4 Philipp Singer :
>>
>>> Have you scaled your additional features to the [0-1] range as the
>>> probability features from the text classifier?
>>
>> Until now I performed Scaler() (I'm on 0.12 atm) on the new feature
>> space. Should I do t…

Tree-based models such as ExtraTrees do not need scaling at all, so
the difference you see is probably just cross-validation variance
(especially with such a small number of samples).

Scaling is only useful for models that make a prior assumption on the
feature distributions, such as the L2 regularizer (…
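A quick illustration of why tree models are indifferent to feature scale: splits are threshold comparisons, so rescaling every feature by a positive constant leaves the learned partition, and hence the predictions, unchanged. Synthetic data below; a single DecisionTreeClassifier stands in for the ensemble case (the factor 1024 is a power of two, so the rescaling is exact in floating point):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Fit the same tree on raw and on rescaled copies of the data: the
# split thresholds scale along with the features, so the partition of
# the samples (and the predictions) is identical.
pred_raw = DecisionTreeClassifier(random_state=0).fit(X, y).predict(X)
X_scaled = X * 1024.0
pred_scaled = DecisionTreeClassifier(random_state=0).fit(X_scaled, y).predict(X_scaled)

print(np.array_equal(pred_raw, pred_scaled))  # True
```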
>>> Will this let us run SGDClassifier and show us per-class probability
>>> outputs? Again, that's the only reason we've been using
>>> OneVsRestClassifier. Let me explain what I mean by per-class
>>> probability, just in case it isn't clear:
>>>
>>> SGDClassifier's predict_proba() returns probabilities like
>>> [(0.4, 0.5), (0.7, 0.3), (0.8, 0.2), (0.9, 0.1), (0.6, 0.4)] for five
>>> classes, showing the probability that the input does not belong/does
>>> belong to that class, respectively.
>>>
>> Yes, if you don't normalize.
>> You are aware that this is inconsistent when you are doing multi-class …
Doug,
You will be happy to hear that this is now freshly fixed in master.
Attributes are now flat in case of single output problems (as you
expected) and nested for multi-output problems (as before).
Best,
Gilles
On 30 November 2012 17:31, Gael Varoquaux wrote:
>> I guess transforming it would …