Hi,
As soon as the number of trees and features goes up, 70 GB of RAM is gone
and I get out-of-memory errors.
The file size is 700 MB. The dataframe quickly shrinks from 14 to 2 columns, but
there is a ton of text ...
With 10 estimators and 100 features per word, I can't handle ~900k
records ...
The training set, about 15% of the data, does perfectly fine, but when the
test set comes, that is where it breaks.

I can split the data and process it in parallel, but I believe that will simply
skew the results...
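
A minimal sketch of the splitting idea, applied to the prediction step only. It assumes the model is already fitted and that its predict function treats each row independently, in which case predicting chunk by chunk should give the same output as predicting everything at once. The helper predict_in_chunks and the stand-in toy_predict are hypothetical names for illustration, not part of any library.

```python
def predict_in_chunks(predict, rows, n_rows, chunk_size=10_000):
    """Call predict() on consecutive row slices and join the results.

    predict    -- any callable mapping a slice of rows to a sequence of labels
    rows       -- a sliceable container (list, array, sparse matrix, ...)
    n_rows     -- number of rows, passed explicitly because some containers
                  do not support len()
    chunk_size -- rows per call; smaller values lower peak memory
    """
    out = []
    for start in range(0, n_rows, chunk_size):
        chunk = rows[start:start + chunk_size]
        out.extend(predict(chunk))
    return out


def toy_predict(chunk):
    # Hypothetical stand-in for a fitted model: labels each text by its length.
    return [1 if len(text) > 5 else 0 for text in chunk]


texts = ["short", "a longer text", "tiny", "another long one", "mid"]

full = toy_predict(texts)                                   # all rows at once
chunked = predict_in_chunks(toy_predict, texts, len(texts), chunk_size=2)

print(chunked == full)
```

With a fitted scikit-learn estimator, predict would presumably be its predict method and n_rows the feature matrix's shape[0]. This only covers prediction: fitting separate models on separate chunks is a different thing and would not be expected to match a single model fitted on all the data.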

Any ideas?


-- 
Aleksandar Kacanski
_______________________________________________
scikit-learn mailing list
[email protected]
https://mail.python.org/mailman/listinfo/scikit-learn