By the way, you could also use a different multiclass strategy, such as
error-correcting output codes (available in sklearn) or a binary tree of
classifiers (which you would have to implement yourself).
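
A minimal sketch of the output-code idea, assuming sklearn's
OutputCodeClassifier with a code_size below 1 so that fewer binary
classifiers (and hence fewer weight vectors) are trained than with
one-vs-rest. The synthetic data, the LogisticRegression base estimator and
code_size=0.5 are illustrative assumptions, not values from this thread, and
a shorter code may cost some accuracy:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier, OutputCodeClassifier

# Small synthetic stand-in for the real problem (sizes are illustrative only).
X, y = make_classification(
    n_samples=600, n_features=50, n_informative=20,
    n_classes=20, n_clusters_per_class=1, random_state=0,
)

base = LogisticRegression(max_iter=200)

# One-vs-rest fits one binary classifier per class: n_classes weight vectors.
ovr = OneVsRestClassifier(base).fit(X, y)

# Output codes with code_size < 1 fit int(n_classes * code_size) binary
# classifiers, so the stored coefficients shrink by roughly that factor.
ecoc = OutputCodeClassifier(base, code_size=0.5, random_state=0).fit(X, y)

n_ovr = len(ovr.estimators_)
n_ecoc = len(ecoc.estimators_)
print("one-vs-rest binary classifiers:", n_ovr)
print("output-code binary classifiers:", n_ecoc)
```

Whether the smaller code keeps the accuracy you need is something you would
have to check on your own data.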



Ark <[email protected]> wrote:

>> 
>> The size is dominated by the n_features * n_classes coef_ matrix,
>> which you can't get rid of just like that. What does your problem
>> look like?
>> 
>
>Document classification of ~3000 categories with ~12000 documents.
>The number of features comes out to 500,000 [in which case the dumped
>joblib classifier is 10g]. If I use SelectKBest to select the 200,000
>best features, the size comes down to 4.8g while maintaining the
>accuracy at 97%. But I am not sure whether there is another
>alternative that does not sacrifice accuracy.
>
>
>
>------------------------------------------------------------------------------
>Everyone hates slow websites. So do we.
>Make your web apps faster with AppDynamics
>Download AppDynamics Lite for free today:
>http://p.sf.net/sfu/appdyn_d2d_feb
>_______________________________________________
>Scikit-learn-general mailing list
>[email protected]
>https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

-- 
This message was sent from my Android mobile phone with K-9 Mail.
