You could try some backward feature selection, like recursive feature elimination, or just drop features with negligible coefficients. A group L1 penalty on the weights would probably be the way to go, but we don't have that
...
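
A minimal sketch of the coefficient-dropping idea, assuming a fitted multiclass linear model (e.g. LogisticRegression); the 1e-4 threshold is a made-up placeholder you would tune against held-out accuracy:

    import numpy as np

    def prune_small_coefs(clf, threshold=1e-4):
        # Keep only features whose largest absolute coefficient across
        # all classes exceeds the (hypothetical) threshold.
        keep = np.abs(clf.coef_).max(axis=0) > threshold
        clf.coef_ = clf.coef_[:, keep]  # shrinks the n_classes x n_features matrix
        return keep                     # mask to apply to X at predict time

    # mask = prune_small_coefs(clf)
    # y_pred = clf.predict(X_test[:, mask])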



Ark <[email protected]> wrote:

>> 
>> The size is dominated by the n_features * n_classes coef_ matrix,
>> which you can't get rid of just like that. What does your problem
>> look like?
>> 
>
>Document classification of ~3000 categories with ~12000 documents.
>The number of features comes out to be 500,000, in which case the
>dumped joblib classifier is 10 GB. If I use SelectKBest to select the
>200,000 best features, the size comes down to 4.8 GB while maintaining
>97% accuracy. But I am not sure whether there is another alternative
>that wouldn't sacrifice accuracy.
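
For reference, a sketch of the SelectKBest step described above; chi2 as the score function is my assumption, since the thread doesn't say which one was used:

    from sklearn.feature_selection import SelectKBest, chi2

    selector = SelectKBest(chi2, k=200000)  # keep the 200,000 best features
    X_train_sel = selector.fit_transform(X_train, y_train)
    # The same fitted selector must be applied at predict time:
    # X_test_sel = selector.transform(X_test)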

-- 
This message was sent from my Android phone with K-9 Mail.
