> It's probably better to train a linear classifier on the text features
> alone and a second (potentially non-linear) classifier such as GBRT or
> ExtraTrees on the predict_proba outcome of the text classifier + your
> additional low-dim features.
>
> This is some kind of stacking method (a sort of ensemble method). It
> should keep the text features from overwhelming the final classifier if
> the other features are informative.

Hey Olivier!

Thanks for the hints. I just tried it, but unfortunately the results are 
much worse than using my textual features alone.

Just to be sure I am doing it right:

First I create my textual features using a vectorizer. Then I fit a 
linear SVC on these features (on the training data, of course) and call 
predict_proba on those same training samples, which gives me a probability 
distribution of dimension 7 (I have 7 classes).
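
In code, the first stage looks roughly like the sketch below. The toy 
corpus and all the names are placeholders for my real setup, and since 
sklearn's LinearSVC itself has no predict_proba, a LogisticRegression 
stands in as the probabilistic linear text model (SVC(kernel='linear', 
probability=True) would be another option):

    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    # Toy stand-in corpus; the real data has 7 classes, not 2.
    texts_train = ["cheap pills buy now", "meeting at noon",
                   "buy cheap meds", "lunch meeting tomorrow",
                   "discount pills offer", "agenda for the meeting"]
    y_train = np.array([0, 1, 0, 1, 0, 1])

    vectorizer = TfidfVectorizer()
    X_text = vectorizer.fit_transform(texts_train)

    # Probabilistic linear text classifier (LinearSVC has no predict_proba).
    text_clf = LogisticRegression()
    text_clf.fit(X_text, y_train)

    # predict_proba on the same training samples: one column per class
    # (7 columns in the real setup).
    proba_train = text_clf.predict_proba(X_text)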

Then I append my 15 additional features and fit another classifier on 
the combined data. (I tried several scaling/normalization ideas without 
any improvement.)
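
The second stage then just stacks the probability columns next to the 
extra features and fits e.g. a GBRT on that. The random extra_train 
below is only a placeholder for my real 15 features:

    from sklearn.ensemble import GradientBoostingClassifier

    # Placeholder for the 15 real low-dimensional features.
    rng = np.random.RandomState(0)
    extra_train = rng.rand(len(y_train), 15)

    # 7 probability columns + 15 extra columns = 22 columns in the real setup.
    X_stacked = np.hstack([proba_train, extra_train])

    second_clf = GradientBoostingClassifier()
    second_clf.fit(X_stacked, y_train)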

I apply the same procedure to the test data. (By the way, I use 
cross-validation.)
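
Concretely, something like this, reusing the already-fitted objects from 
the sketch above (texts_test and extra_test are again placeholders):

    # Apply the fitted transforms/models to the test data.
    texts_test = ["buy discount pills now", "agenda for today's meeting"]
    extra_test = rng.rand(len(texts_test), 15)

    X_text_test = vectorizer.transform(texts_test)   # transform, not fit_transform
    proba_test = text_clf.predict_proba(X_text_test)
    X_stacked_test = np.hstack([proba_test, extra_test])
    predictions = second_clf.predict(X_stacked_test)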

While I get an F1 score of 0.85 using the textual data alone, the 
combined approach yields only 0.4.

Regards,
Philipp

