I have also had issues with pickle in the past, and I have to admit that I
don't really trust pickle files ;). Maybe I am too paranoid, but I am always
afraid of corrupting or losing the data.
It is probably not the most elegant solution, but I typically store estimator
settings and model parameters as JSON files, since they remain human-readable
in the worst-case scenario (with "reproducible research" in mind ;)).
For example:
# Model fitting and saving params to JSON
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_diabetes
diabetes = load_diabetes()
X, y = diabetes.data, diabetes.target
regr = LinearRegression()
regr.fit(X, y)
import json
with open('./params.json', 'w', encoding='utf-8') as outfile:
    json.dump(regr.get_params(), outfile)
with open('./weights.json', 'w', encoding='utf-8') as outfile:
    json.dump(regr.coef_.tolist(), outfile, separators=(',', ':'),
              sort_keys=True, indent=4)
with open('./intercept.json', 'w', encoding='utf-8') as outfile:
    json.dump(regr.intercept_, outfile)
# In a new session: load the params from the JSON files
import json
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_diabetes
import numpy as np
diabetes = load_diabetes()
X, y = diabetes.data, diabetes.target
# Context managers close each file once it has been read
with open('./params.json', 'r', encoding='utf-8') as infile:
    params = json.load(infile)
with open('./weights.json', 'r', encoding='utf-8') as infile:
    weights = json.load(infile)
with open('./intercept.json', 'r', encoding='utf-8') as infile:
    intercept = json.load(infile)
regr = LinearRegression()
regr.set_params(**params)
regr.intercept_, regr.coef_ = intercept, np.array(weights)
regr.predict(X[:10])
array([ 206.11706979, 68.07234761, 176.88406035, 166.91796559,
128.45984241, 106.34908972, 73.89417947, 118.85378669,
158.81033076, 213.58408893])
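
The same round trip can also be sketched with a single JSON file and a quick
check that the restored model predicts the same values as the fitted one. The
helper names save_linear_model and load_linear_model, the file name
linreg.json, and the temporary directory are my own illustrative choices, not
part of scikit-learn. This assumes a single-target LinearRegression, so that
intercept_ is a scalar.

```python
import json
import os
import tempfile

import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression


def save_linear_model(model, path):
    """Hypothetical helper: write settings and fitted weights to one JSON file."""
    payload = {
        "params": model.get_params(),
        "coef": model.coef_.tolist(),
        # Assumes a single regression target, so intercept_ is a scalar
        "intercept": float(model.intercept_),
    }
    with open(path, "w", encoding="utf-8") as outfile:
        json.dump(payload, outfile, indent=4, sort_keys=True)


def load_linear_model(path):
    """Hypothetical helper: rebuild a LinearRegression from the JSON file."""
    with open(path, "r", encoding="utf-8") as infile:
        payload = json.load(infile)
    model = LinearRegression()
    model.set_params(**payload["params"])
    model.coef_ = np.array(payload["coef"])
    model.intercept_ = payload["intercept"]
    return model


X, y = load_diabetes(return_X_y=True)
regr = LinearRegression().fit(X, y)

# Write to a throwaway directory so the example leaves the working dir alone
path = os.path.join(tempfile.mkdtemp(), "linreg.json")
save_linear_model(regr, path)
restored = load_linear_model(path)

# Compare predictions of the original and the restored model
same_predictions = np.allclose(regr.predict(X[:10]), restored.predict(X[:10]))
print(same_predictions)
```

Keeping everything in one file means the settings and the weights cannot get
out of sync, at the cost of a slightly more nested JSON structure.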
In any case, I know that this isn't pretty, and I would also be looking forward
to a better solution!
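
One possible direction, sketched here under stated assumptions rather than as
an existing scikit-learn feature: dump every fitted attribute (the ones with a
trailing underscore) that is a NumPy array or a plain scalar, and set them
again on a fresh estimator. The function names fitted_state_to_json and
restore_fitted_state are hypothetical. This only works for estimators whose
fitted state consists of arrays and scalars (linear models, for instance); it
would not cover estimators that hold nested objects such as trees.

```python
import json

import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge


def fitted_state_to_json(model):
    """Hypothetical helper: serialize array/scalar fitted attributes to JSON."""
    state = {}
    for name, value in vars(model).items():
        # Fitted attributes end with "_" by scikit-learn convention
        if not name.endswith("_") or name.startswith("_"):
            continue
        if isinstance(value, np.ndarray):
            state[name] = {"kind": "ndarray", "dtype": str(value.dtype),
                           "data": value.tolist()}
        elif isinstance(value, np.generic):
            # NumPy scalar -> plain Python scalar
            state[name] = {"kind": "scalar", "data": value.item()}
        elif value is None or isinstance(value, (bool, int, float, str)):
            state[name] = {"kind": "scalar", "data": value}
        else:
            raise TypeError("cannot serialize attribute %r" % name)
    return json.dumps(state, indent=4, sort_keys=True)


def restore_fitted_state(model, text):
    """Hypothetical helper: set the serialized attributes on an estimator."""
    for name, entry in json.loads(text).items():
        if entry["kind"] == "ndarray":
            value = np.array(entry["data"], dtype=entry["dtype"])
        else:
            value = entry["data"]
        setattr(model, name, value)
    return model


X, y = load_diabetes(return_X_y=True)
model = Ridge(alpha=0.5).fit(X, y)

text = fitted_state_to_json(model)
restored = restore_fitted_state(Ridge(**model.get_params()), text)

predictions_match = np.allclose(model.predict(X[:10]), restored.predict(X[:10]))
print(predictions_match)
```

Raising on unsupported attribute types is a deliberate choice in this sketch:
a loud failure seems safer than a JSON file that silently misses part of the
model.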
Best,
Sebastian Raschka
> On Mar 23, 2016, at 12:47 PM, Keith Lehman <[email protected]> wrote:
>
> Hi:
>
> I’m fairly new to scikit-learn, Python, and machine learning. This community
> has built a great set of libraries though, and is actually a large part of
> the reason why my company has selected Python to experiment with ML.
>
> As we are developing our product, however, we keep running into trouble
> saving various objects. When possible, we use pickle to save the objects, but
> this can cause problems in development: objects saved during a debug session
> cannot be loaded outside of the debugger. The reason appears to be that,
> even when pickling a “pickleable” object (such as a trained
> LinearRegression), pickle finds and saves more primitive objects that have
> been instantiated within the debug environment. Dill and cPickle have the
> same issue. My question is: does the scikit-learn community plan to add
> standard load/save or dump/dumps and load/loads methods that would not create
> these dependencies?
>
> If there is a better forum for posting questions like these, please let me
> know and I’ll be happy to post there instead.
>
> Thanks!
>
> Keith Lehman
>
> ------------------------------------------------------------------------------
> Transform Data into Opportunity.
> Accelerate data analysis in your applications with
> Intel Data Analytics Acceleration Library.
> Click to learn more.
> http://pubads.g.doubleclick.net/gampad/clk?id=278785351&iu=/4140
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general