Which version of scikit-learn are you using?
In 0.17 we stopped storing data point indices in the trees, which greatly reduced the model size in some cases.
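To answer the version question and get a feel for where the bytes go, something like the sketch below may help. It is only an illustration: the synthetic data from `make_classification` stands in for the real dataset (only the 21 features and 50 trees come from the thread), the sample count and `max_depth=5` are arbitrary choices, and Python 3 `pickle` is used in place of `cPickle`. It prints the installed version and compares the pickled size of a fully grown forest with a depth-limited one.

```python
import pickle

import sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

print("scikit-learn version:", sklearn.__version__)

# Synthetic stand-in data (assumption): 21 features as in the thread,
# sample count chosen arbitrarily.
X, y = make_classification(n_samples=2000, n_features=21, random_state=0)

# 50 fully grown trees, as in the original question.
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Every fitted tree reports how many nodes it holds.
node_counts = [est.tree_.node_count for est in forest.estimators_]
total_nodes = sum(node_counts)
pickled_size = len(pickle.dumps(forest, protocol=pickle.HIGHEST_PROTOCOL))

print("total nodes:", total_nodes)
print("pickled bytes:", pickled_size)
print("bytes per node: %.1f" % (pickled_size / total_nodes))

# Limiting depth gives fewer nodes, so the pickle shrinks as well.
shallow = RandomForestClassifier(
    n_estimators=50, max_depth=5, random_state=0
).fit(X, y)
shallow_size = len(pickle.dumps(shallow, protocol=pickle.HIGHEST_PROTOCOL))
print("pickled bytes with max_depth=5:", shallow_size)
```

If the node count is large, the size is probably dominated by the trees themselves rather than by the serialization format, and limiting tree growth is one way to trade accuracy for a smaller file.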


On 04/10/2016 09:28 AM, Piotr Płoński wrote:
Thanks for the comments! I put more details about my problem here: http://stackoverflow.com/questions/36523989/why-sklearn-randomforest-model-take-a-lot-of-disk-space-after-save

Indeed, saving with joblib takes less space, but the model still uses a lot of disk space.

Best,
Piotr
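For reference, the joblib suggestion quoted below could be tried roughly as follows. This is a sketch under assumptions, not the exact code used in the thread: it imports the standalone `joblib` package (at the time of this thread joblib was bundled as `sklearn.externals.joblib`), uses Python 3 `pickle` in place of `cPickle`, picks `compress=3` arbitrarily, and fits on synthetic data because the real dataset is not available.

```python
import os
import pickle
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (assumption), 21 features as in the thread.
X, y = make_classification(n_samples=2000, n_features=21, random_state=0)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmpdir:
    pickle_path = os.path.join(tmpdir, "rf.pkl")
    joblib_path = os.path.join(tmpdir, "rf.joblib")

    # Plain pickle, no compression.
    with open(pickle_path, "wb") as f:
        pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)

    # joblib with compression enabled; higher levels are smaller but slower.
    joblib.dump(model, joblib_path, compress=3)

    pickle_size = os.path.getsize(pickle_path)
    joblib_size = os.path.getsize(joblib_path)

    # Reload to check that the compressed file round-trips.
    restored = joblib.load(joblib_path)

same_predictions = bool((restored.predict(X) == model.predict(X)).all())

print("pickle bytes:", pickle_size)
print("joblib (compress=3) bytes:", joblib_size)
print("predictions identical after reload:", same_predictions)
```

Compression only shrinks the file on disk; the model needs the same amount of memory once loaded.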

2016-04-10 15:24 GMT+02:00 Mathieu Blondel <math...@mblondel.org>:

    You may also want to save your model using joblib (possibly with
    compression enabled) instead of cPickle.

    Mathieu

    On Sun, Apr 10, 2016 at 9:13 AM, Piotr Płoński
    <pplonsk...@gmail.com> wrote:

        Hi All,

        I am saving a RandomForestClassifier model from the sklearn
        library with the code below:

            import cPickle
            with open('/tmp/rf.model', 'wb') as f:
                cPickle.dump(RF_model, f)

        It takes a lot of space on my hard drive. There are only 50
        trees in the model, yet it takes over 50 MB on disk
        (the analyzed dataset is ~20 MB, with 21 features). Does anybody
        have an idea why? I observe similar behavior for
        ExtraTreesClassifier.

        Best,

        Piotr



        
        _______________________________________________
        Scikit-learn-general mailing list
        Scikit-learn-general@lists.sourceforge.net
        <mailto:Scikit-learn-general@lists.sourceforge.net>
        https://lists.sourceforge.net/lists/listinfo/scikit-learn-general



    