>  if the unpickling failed, 
> what would you do?

One lesson “scientific research” taught me is to store the code and dataset, 
along with a “make” file, under version control (git) :). If unpickling 
failed, I would just re-run my make file to reconstruct the model and pickle 
the objects again.
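As a rough sketch of that “rebuild and re-pickle” step (using a trivial stand-in class instead of an actual scikit-learn estimator; the class, weights, and file name are purely illustrative):

```python
import pickle

# Stand-in for a real training step (e.g., fitting a scikit-learn
# estimator on the versioned dataset). Everything here is made up
# for illustration.
class Model:
    def __init__(self, weights):
        self.weights = weights

def train():
    # In the real setup, the make file would re-run the full
    # training script against the versioned code and data.
    return Model(weights=[0.5, -1.2, 3.0])

# Rebuild the model from source + data and pickle it fresh,
# instead of trying to rescue an incompatible old pickle.
model = train()
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# Sanity check: the new pickle loads under the current environment.
with open("model.pkl", "rb") as f:
    restored = pickle.load(f)
print(restored.weights)
```

The point is that the pickle is a disposable build artifact: if a library upgrade breaks it, you regenerate it from the versioned inputs rather than patching it.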

> I imagine that the ideal way of storing each model (coefficients, 
> parameters, etc.) would differ from model to model.  But having 
> that knowledge stored in the scikit-learn code base would, from a 
> user standpoint at least, be better than trying to maintain it 
> outside.

I think for “simple” linear models it would not be a bad idea to save the 
weight coefficients in a log file or similar. In that case your model is 
really not that dependent on changes in the scikit-learn code base (for 
example, imagine that you trained a model 10 years ago, published the results 
in a research paper, and today someone asks you about this model). You know 
how logistic regression, SVMs, etc. work, so in the worst case you just use 
those weights directly to make predictions on new data. And in a typical 
“model persistence” scenario you don’t “update” your model anyway, so 
“efficiency” would not be that big of a deal in that worst case.
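For instance, a plain-text record of a logistic regression’s weights is enough to reproduce its predictions with nothing but the standard library. The JSON snippet and numbers below are made up for illustration; in practice such a log could be dumped from a fitted model’s `coef_` and `intercept_`:

```python
import json
import math

# Hypothetical log entry, e.g. dumped long ago from a fitted
# scikit-learn LogisticRegression (coef_ and intercept_).
# The values are made up for this sketch.
logged = json.loads('{"weights": [2.0, -1.0], "bias": 0.5}')

def predict_proba(x, weights, bias):
    # Logistic regression by hand: sigmoid of the weighted sum.
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Predict on a new sample using only the logged numbers,
# with no dependency on any particular scikit-learn version.
p = predict_proba([1.0, 1.0], logged["weights"], logged["bias"])
print(round(p, 4))  # -> 0.8176, probability of the positive class
```

So even if the pickle format or the class layout changes, the logged coefficients alone keep the published results reproducible.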


> On Aug 19, 2015, at 12:16 AM, Stefan van der Walt <stef...@berkeley.edu> 
> wrote:
> 
> Hi Sebastian
> 
> On 2015-08-18 20:47:12, Sebastian Raschka <se.rasc...@gmail.com> 
> wrote:
>> Stefan, I have no experience with this problem in particular 
>> since I am not pickling objects that often. However, I deployed 
>> a webapp some time ago on Pythonanywhere 
>> (http://raschkas.pythonanywhere.com) and meanwhile they 
>> upgraded their scikit-learn module; I was curious and just 
>> checked it out: it seems that it still works.
> 
> It would depend a lot on whether and how the underlying class code 
> got modified.  One question to ask is: if the unpickling failed, 
> what would you do?
> 
> I imagine that the ideal way of storing each model (coefficients, 
> parameters, etc.) would differ from model to model.  But having 
> that knowledge stored in the scikit-learn code base would, from a 
> user standpoint at least, be better than trying to maintain it 
> outside.
> 
> It's a non-trivial problem, since you'll have to track any changes 
> in API carefully and somehow determine which versions are 
> compatible with which (or can be made compatible with a few basic 
> assumptions w.r.t. parameters etc.).
> 
> Stéfan
> 
> ------------------------------------------------------------------------------
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

