We also have similar issues. It'd be great to hear any cool solutions :-)
On Thu, 24 Mar 2016 at 12:47 Keith Lehman <kleh...@intercapenergy.com>
wrote:
> Thanks Sebastian.
>
> This is basically what we are doing too. The hard, time-consuming part is
> determining which attributes of each scikit-learn object need to be saved
> and how best to extract them.
>
> - Keith
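
One rough way to cut down on that per-object work, assuming the usual
scikit-learn convention that attributes learned during fit end in a
trailing underscore (coef_, intercept_, ...), is to collect them
generically instead of listing them by hand. The fitted_state helper and
the ToyEstimator class below are made-up names for illustration, not
scikit-learn API; the toy class only stands in for a fitted estimator so
the snippet runs without any data.

```python
import json


def fitted_state(estimator):
    """Collect JSON-friendly copies of attributes ending in one underscore."""
    state = {}
    for name, value in vars(estimator).items():
        # Skip private attributes, constructor settings, and dunder names.
        if name.startswith("_") or not name.endswith("_") or name.endswith("__"):
            continue
        # NumPy arrays and scalars expose tolist(); plain values pass through.
        state[name] = value.tolist() if hasattr(value, "tolist") else value
    return state


class ToyEstimator:
    """Hypothetical stand-in for a fitted estimator, used only for this demo."""

    def __init__(self):
        self.fit_intercept = True   # constructor setting, not learned
        self.coef_ = [0.5, -1.25]   # learned value
        self.intercept_ = 3.0       # learned value
        self._cache = object()      # private, skipped


print(json.dumps(fitted_state(ToyEstimator()), sort_keys=True))
```

This only covers values that are plain numbers, strings, lists, or that
expose tolist(). Nested estimators, sparse matrices, or tree structures
would still need handling of their own.
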
>
> -----Original Message-----
> From: Sebastian Raschka [mailto:se.rasc...@gmail.com]
> Sent: Wednesday, March 23, 2016 4:05 PM
> To: scikit-learn-general@lists.sourceforge.net
> Subject: Re: [Scikit-learn-general] Scikit-learn standards for
> serializing/saving objects
>
> I have also had issues with pickle in the past and have to admit that I
> don't really trust pickle files ;). Maybe I am too paranoid, but I am
> always afraid of corrupting or losing the data.
> It's probably not the most elegant solution, but I typically store estimator
> settings and model parameters as JSON files, since they remain human-readable
> in the worst case (with "reproducible research" in mind ;)).
>
>
> For example:
>
>
> # Model fitting and saving params to JSON
>
> from sklearn.linear_model import LinearRegression
> from sklearn.datasets import load_diabetes
>
> diabetes = load_diabetes()
> X, y = diabetes.data, diabetes.target
> regr = LinearRegression()
> regr.fit(X, y)
>
> import json
>
> with open('./params.json', 'w', encoding='utf-8') as outfile:
>     json.dump(regr.get_params(), outfile)
>
> with open('./weights.json', 'w', encoding='utf-8') as outfile:
>     json.dump(regr.coef_.tolist(), outfile, separators=(',', ':'),
>               sort_keys=True, indent=4)
>
> with open('./intercept.json', 'w', encoding='utf-8') as outfile:
>     json.dump(regr.intercept_, outfile)
>
>
> # In a new session: load the params from the JSON files
>
>
> import json
> import codecs
> import numpy as np
> from sklearn.linear_model import LinearRegression
> from sklearn.datasets import load_diabetes
>
> diabetes = load_diabetes()
> X, y = diabetes.data, diabetes.target
>
> with codecs.open('./params.json', 'r', encoding='utf-8') as infile:
>     params = json.load(infile)
>
> with codecs.open('./weights.json', 'r', encoding='utf-8') as infile:
>     weights = json.load(infile)
>
> with codecs.open('./intercept.json', 'r', encoding='utf-8') as infile:
>     intercept = json.load(infile)
>
> regr = LinearRegression()
> regr.set_params(**params)
> regr.intercept_, regr.coef_ = intercept, np.array(weights)
>
> regr.predict(X[:10])
>
> array([ 206.11706979, 68.07234761, 176.88406035, 166.91796559,
> 128.45984241, 106.34908972, 73.89417947, 118.85378669,
> 158.81033076, 213.58408893])
>
>
> In any case, I know that this isn't pretty, and I would also be looking
> forward to a better solution!
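
A possible variation on the three-file layout above, sketched on the
assumption that a single file is easier to keep consistent: bundle the
settings, weights, and intercept into one JSON document. save_bundle and
load_bundle are made-up helper names, and the values are stand-ins so the
snippet runs on its own; with the example above they would come from
regr.get_params(), regr.coef_.tolist(), and float(regr.intercept_).

```python
import json
import os
import tempfile


def save_bundle(path, params, weights, intercept):
    """Write settings and fitted values into a single JSON document."""
    bundle = {"params": params, "coef_": weights, "intercept_": intercept}
    with open(path, "w", encoding="utf-8") as outfile:
        json.dump(bundle, outfile, sort_keys=True, indent=4)


def load_bundle(path):
    """Read the bundle back as a plain dictionary."""
    with open(path, "r", encoding="utf-8") as infile:
        return json.load(infile)


# Stand-in values (assumptions for the demo, not real model output).
params = {"fit_intercept": True}
weights = [0.5, -1.25]
intercept = 3.0

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "model.json")
    save_bundle(path, params, weights, intercept)
    restored = load_bundle(path)

# True when the round trip preserved every value.
print(restored == {"params": params, "coef_": weights, "intercept_": intercept})
```

Restoring would then follow the same steps as in the example above:
set_params(**restored["params"]), then assign coef_ and intercept_.
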
>
> Best,
> Sebastian Raschka
>
>
> > On Mar 23, 2016, at 12:47 PM, Keith Lehman <kleh...@intercapenergy.com>
> > wrote:
> >
> > Hi:
> >
> > I’m fairly new to scikit-learn, Python, and machine learning. This
> > community has built a great set of libraries, though, and that is actually
> > a large part of the reason why my company selected Python to experiment
> > with ML.
> >
> > As we are developing our product, however, we keep running into trouble
> > saving various objects. When possible, we use pickle to save the objects,
> > but this can cause problems in development: objects saved during a debug
> > session cannot be loaded outside of the debugger. The reason appears to be
> > that even when pickling a “pickleable” object (such as a trained
> > LinearRegression), pickle finds and saves more primitive objects that have
> > been instantiated within the debug environment. Dill and cPickle have the
> > same issue. My question is: does the scikit-learn community plan to add
> > standard load/save or dump/dumps and load/loads methods that would not
> > create these dependencies?
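
One way to check that suspicion, as a rough diagnostic only: pickle stores
classes and functions by reference (module plus name), so listing those
references shows what a fresh process would have to import. The
pickled_globals helper below is a made-up name, and the sample object is a
stand-in built from standard-library types so the snippet runs anywhere.

```python
import pickle
import pickletools
from collections import OrderedDict
from fractions import Fraction


def pickled_globals(obj):
    """Return sorted (module, name) pairs that a pickle of obj refers to."""
    # Protocol 3 writes class and function references as GLOBAL opcodes,
    # which pickletools reports as a single "module name" string.
    payload = pickle.dumps(obj, protocol=3)
    found = set()
    for opcode, arg, _pos in pickletools.genops(payload):
        if opcode.name == "GLOBAL":
            module, name = arg.split(" ", 1)
            found.add((module, name))
    return sorted(found)


# Stand-in object (an assumption for the demo); in practice this would be
# the trained estimator that fails to load outside the debugger.
sample = {"order": OrderedDict(a=1), "ratio": Fraction(1, 3)}

refs = pickled_globals(sample)
for module, name in refs:
    print(module, name)
```

Any entry whose module is __main__, or a module that only exists inside
the debugger, would be a candidate for the load failure, since a fresh
process has no way to import it.
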
> >
> > If there is a better forum for posting questions like these, please let
> > me know and I’ll be happy to post there instead.
> >
> > Thanks!
> >
> > Keith Lehman
> > Cell: 617-834-2863
> > Skype: k.lehman
> > e-mail: kleh...@intercapenergy.com
> >
> > ------------------------------------------------------------------------------
> > Transform Data into Opportunity.
> > Accelerate data analysis in your applications with Intel Data
> > Analytics Acceleration Library.
> > Click to learn more.
> > http://pubads.g.doubleclick.net/gampad/clk?id=278785351&iu=/4140
> > _______________________________________________
> > Scikit-learn-general mailing list
> > Scikit-learn-general@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>
>