> if the unpickling failed, what would you do?

One lesson "scientific research" taught me is to store the code and the dataset along with a "make" file under version control (git) :). I would just run my make file to reconstruct the model and pickle the objects.
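Just to illustrate what I mean, the make target would only need to call a small training script, roughly like the sketch below (dataset path, column names, and estimator are placeholders, of course):

    # train.py -- called by a make target such as "model.pkl: train.py data/train.csv"
    import pickle

    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    # data/train.csv is versioned together with the code
    df = pd.read_csv("data/train.csv")
    X = df.drop("label", axis=1).values
    y = df["label"].values

    clf = LogisticRegression()
    clf.fit(X, y)

    with open("model.pkl", "wb") as f:
        pickle.dump(clf, f)

So if a pickle ever stops loading after a scikit-learn upgrade, re-running make simply re-fits and re-pickles the model from the exact same code and data.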
> I imagine that the ideal way of storing each model (coefficients,
> parameters, etc.) would differ from model to model. But having
> that knowledge stored in the scikit-learn code base would, from a
> user standpoint at least, be better than trying to maintain it
> outside.

I think for "simple" linear models it would not be a bad idea to save the weight coefficients in a log file or similar. That way the model is not really dependent on changes in the scikit-learn code base (imagine you trained a model 10 years ago, published the results in a research paper, and today someone asks you about that model). You know how logistic regression, an SVM, etc. work, so in the worst case you just take those weights and make the predictions on new data yourself. And in a typical "model persistence" scenario you don't "update" the model anyway, so efficiency would not be a big concern in that worst case. (A small sketch of what I mean is at the bottom of this mail, below the quoted message.)

> On Aug 19, 2015, at 12:16 AM, Stefan van der Walt <stef...@berkeley.edu> wrote:
>
> Hi Sebastian
>
> On 2015-08-18 20:47:12, Sebastian Raschka <se.rasc...@gmail.com> wrote:
>> Stefan, I have no experience with this problem in particular
>> since I am not pickling objects that often. However, I deployed
>> a webapp some time ago on Pythonanywhere
>> (http://raschkas.pythonanywhere.com) and meanwhile they
>> upgraded their scikit-learn module; I was curious and just
>> checked it out: it seems that it still works.
>
> It would depend a lot on whether and how the underlying class code
> got modified. One question to ask is: if the unpickling failed,
> what would you do?
>
> I imagine that the ideal way of storing each model (coefficients,
> parameters, etc.) would differ from model to model. But having
> that knowledge stored in the scikit-learn code base would, from a
> user standpoint at least, be better than trying to maintain it
> outside.
>
> It's a non-trivial problem, since you'll have to track any changes
> in API carefully and somehow determine which versions are
> compatible with which (or can be made compatible with a few basic
> assumptions w.r.t. parameters etc.).
>
> Stéfan
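Here is the "just use the weights" idea as a rough sketch for a binary logistic regression (the toy iris data and file names are only for illustration):

    import json

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    # fit a throwaway binary model so the export step has something to save
    iris = load_iris()
    X, y = iris.data[iris.target < 2], iris.target[iris.target < 2]
    clf = LogisticRegression().fit(X, y)

    # dump the raw weights to a plain-text file
    with open("weights.json", "w") as f:
        json.dump({"coef": clf.coef_.tolist(),
                   "intercept": clf.intercept_.tolist()}, f)

    # ... years later, no pickle (and no particular scikit-learn version) needed
    with open("weights.json") as f:
        w = json.load(f)
    coef = np.array(w["coef"])
    intercept = np.array(w["intercept"])

    def predict(X_new):
        # binary logistic regression decision rule: class 1 if X @ w.T + b > 0
        return (X_new.dot(coef.T) + intercept > 0).astype(int).ravel()

    assert (predict(X) == clf.predict(X)).all()

For more complicated estimators (tree ensembles, kernel SVMs, pipelines) this of course gets messy quickly, which is why the re-fit-from-versioned-code-and-data route is my usual fallback.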