Hi,

As soon as the number of trees and features goes up, 70 GB of RAM is gone and I get out-of-memory errors. The file size is 700 MB. The dataframe quickly shrinks from 14 to 2 columns, but there is a ton of text ... with 10 estimators and 100 features per word I can't get through ~900k records ... The training set, about 15% of the data, does perfectly fine, but when the test set comes, that's it.
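For concreteness, here is a rough sketch of the kind of setup I mean (simplified; the TfidfVectorizer / RandomForestClassifier choice and all the names below are stand-ins, not my exact code):

    # Rough sketch only -- the vectorizer, classifier and column names
    # are placeholders for illustration.
    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_extraction.text import TfidfVectorizer

    # Toy stand-in for the real dataframe (~900k rows of free text + label).
    df = pd.DataFrame({
        "text": ["free text record %d" % i for i in range(1000)],
        "label": np.random.randint(0, 2, size=1000),
    })

    # Vectorize the text column (sparse output).
    vect = TfidfVectorizer(max_features=100)
    X = vect.fit_transform(df["text"])

    clf = RandomForestClassifier(n_estimators=10, n_jobs=-1)

    # Fitting on ~15% of the rows works fine...
    train_mask = np.random.rand(len(df)) < 0.15
    clf.fit(X[train_mask], df["label"].values[train_mask])

    # ...but predicting on the full set is where memory runs out.
    pred = clf.predict(X)

(In the real run it is the predict step over all ~900k rows that dies.)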
I could split the data and multiprocess it, but I believe that would simply skew the results ... Any ideas?

-- Aleksandar Kacanski
