I'm pickling a random forest model (128 estimators, trained on 50k
examples) and the resulting .pkl file is on the order of 200 MB.
Is that expected? The whole dataset is only about 400 KB...
Here's the code that reproduces it:
import pickle
from sklearn.ensemble import RandomForestClassifier

# Toy data: 50k examples, 3 small integer features, binary labels
X = [[i % 6, i % 7, i % 8] for i in range(50000)]
y = [i % 5 > 0 for i in range(50000)]
clf = RandomForestClassifier(n_estimators=128)
clf.fit(X, y)
with open("test.pkl", "wb") as f:
    pickle.dump(clf, f)
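
For reference, here is a minimal sketch of how one might compare the raw
pickle against a compressed dump, using the joblib copy bundled with
scikit-learn (the compressed filename is just illustrative, and the
compress= argument assumes a recent enough joblib):

import os
from sklearn.externals import joblib

# Dump the fitted forest again with zlib compression (level 3),
# then print both file sizes for comparison.
joblib.dump(clf, "test_compressed.pkl", compress=3)
for name in ("test.pkl", "test_compressed.pkl"):
    print("%s: %.1f MB" % (name, os.path.getsize(name) / 1e6))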
Regards,
Dmitry