Re: [Scikit-learn-general] Packaging large objects

Ark Thu, 21 Feb 2013 14:46:54 -0800

> 
> The size is dominated by the n_features * n_classes coef_ matrix,
> which you can't get rid of just like that. What does your problem look
> like?
>


Document classification with ~3000 categories and ~12000 documents.
The number of features comes out to 500,000 [in which case the joblib
dump of the classifier is 10 GB]. If I use SelectKBest to keep the 200,000 best
features, the size comes down to 4.8 GB while accuracy stays at 97%. But I am not
sure whether there is another alternative that does not sacrifice accuracy.
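
For reference, a rough back-of-the-envelope check of those sizes. This is
only a sketch: it assumes coef_ is stored as a dense float64 array
(8 bytes per entry) and that there are exactly 3000 classes. The helper name
coef_size_bytes is made up for illustration and is not part of scikit-learn.

```python
def coef_size_bytes(n_features, n_classes, itemsize=8):
    """Bytes needed for a dense n_classes x n_features coefficient matrix.

    itemsize=8 assumes float64 entries (an assumption, not checked here).
    """
    return n_features * n_classes * itemsize


# All features: 500,000 features x 3000 classes
full = coef_size_bytes(500_000, 3000)

# After SelectKBest: 200,000 features x 3000 classes
reduced = coef_size_bytes(200_000, 3000)

print("full:    %.1f GB" % (full / 1e9))
print("reduced: %.1f GB" % (reduced / 1e9))
```

Under these assumptions the reduced model lands at 4.8 GB, which matches the
dump size I see, and the full model is in the same range as the 10 GB dump
(the class count is only approximately 3000). So the dense coef_ matrix does
seem to account for essentially all of the file size.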


