Depending on how productionized or robust you want the model to be, you
might pick a language-agnostic format and wrap the fit/predict methods in a
web service. A couple of ideas that have worked well in various projects:

For the web service:
1. Flask for very lightweight, barebones implementations
2. Tornado for more protocol support and various bells and whistles
3. Zato is a newer entrant that splits the difference: great visibility and
protocol support, but also easy to get started with
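
To make the "wrap predict in a web service" idea concrete, here is a minimal
sketch. It deliberately uses only the standard library's http.server as a
stand-in for Flask/Tornado/Zato, so it runs without extra installs; the
MeanModel class, the /predict path, and the {"rows": ...} JSON layout are all
assumptions made up for illustration, not part of scikit-learn or any of the
frameworks above. Any object with a predict method could be dropped in.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class MeanModel:
    """Hypothetical stand-in for a fitted estimator with a predict method."""

    def predict(self, rows):
        return [sum(row) / len(row) for row in rows]


def make_handler(model):
    """Build a request handler class that serves predictions from `model`."""

    class PredictHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path != "/predict":  # hypothetical endpoint name
                self.send_error(404)
                return
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length))
            predictions = list(model.predict(payload["rows"]))
            body = json.dumps({"predictions": predictions}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # silence per-request logging in this sketch

    return PredictHandler


def query(port, rows):
    """What a client in any language would do: POST JSON, read JSON back."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(
            "POST",
            "/predict",
            body=json.dumps({"rows": rows}),
            headers={"Content-Type": "application/json"},
        )
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()


def demo():
    """Start the service on a free local port, send one request, shut down."""
    server = HTTPServer(("127.0.0.1", 0), make_handler(MeanModel()))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return query(server.server_port, [[1.0, 3.0], [2.0, 4.0]])
    finally:
        server.shutdown()
        server.server_close()
```

Since the request and response are plain JSON over HTTP, the Java or PHP side
only needs an HTTP client and a JSON parser; the model stays in a Python
process.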

For format, I'd recommend just sticking to something that pandas has robust
support for:
http://pandas.pydata.org/pandas-docs/stable/io.html

One other small suggestion: as mentioned, you can pickle the models.
However, instead of keeping the pickles as files on disk, I'd recommend
versioning them as BLOBs in a database. For example, a Postgres bytea
column will happily take the output of pickle.dumps (even compressed) via
psycopg2 or SQLAlchemy.
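
A rough sketch of the versioned-BLOB idea follows. To keep it runnable
without a database server, it swaps in the standard library's sqlite3 for
Postgres; the table name, column names, and helper functions are hypothetical,
chosen only for this example. With psycopg2 the shape should be much the same,
with a bytea column, %s placeholders, and psycopg2.Binary wrapping the bytes.

```python
import pickle
import sqlite3
import zlib


def open_store(path=":memory:"):
    """Open the (hypothetical) model store and create its table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS models ("
        " name TEXT NOT NULL,"
        " version INTEGER NOT NULL,"
        " payload BLOB NOT NULL,"
        " PRIMARY KEY (name, version))"
    )
    return conn


def save_model(conn, name, model):
    """Store a compressed pickle of `model` as the next version of `name`."""
    blob = zlib.compress(pickle.dumps(model, protocol=pickle.HIGHEST_PROTOCOL))
    row = conn.execute(
        "SELECT COALESCE(MAX(version), 0) + 1 FROM models WHERE name = ?",
        (name,),
    ).fetchone()
    version = row[0]
    conn.execute(
        "INSERT INTO models (name, version, payload) VALUES (?, ?, ?)",
        (name, version, sqlite3.Binary(blob)),
    )
    conn.commit()
    return version


def load_model(conn, name, version=None):
    """Load one version of `name`; the latest version if none is given."""
    if version is None:
        row = conn.execute(
            "SELECT payload FROM models WHERE name = ?"
            " ORDER BY version DESC LIMIT 1",
            (name,),
        ).fetchone()
    else:
        row = conn.execute(
            "SELECT payload FROM models WHERE name = ? AND version = ?",
            (name, version),
        ).fetchone()
    if row is None:
        raise KeyError((name, version))
    # Only unpickle blobs that you wrote yourself; pickle can run arbitrary code.
    return pickle.loads(zlib.decompress(row[0]))
```

Usage would look something like `conn = open_store()`, then
`save_model(conn, "my_model", model)` after each retraining and
`load_model(conn, "my_model")` in the service at startup, which makes rolling
back to an earlier version a one-argument change.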

Thanks,
Michael J. Bommarito II, CEO
Bommarito Consulting, LLC
*Web:* http://www.bommaritollc.com
*Mobile:* +1 (646) 450-3387


On Mon, Jun 16, 2014 at 11:12 AM, Maheshakya Wijewardena <
pmaheshak...@gmail.com> wrote:

> Hi George,
>
> You can use Python pickle to save your model.
>
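> import pickle
>
> # Write the fitted model to disk in binary mode.
> with open('my_model.pickle', 'wb') as handle:
>     pickle.dump(model, handle)
>
> then it can be loaded again with:
>
> # Read it back; the context manager closes the file afterwards.
> with open('my_model.pickle', 'rb') as handle:
>     model = pickle.load(handle)
>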
> Note that this only works from Python. To use the model from Java, you
> will have to use Jython or some other supported bridge to call Python
> functions from Java.
> See:
> http://stackoverflow.com/questions/8898765/calling-python-in-java
>
> Hope this helps.
>
> Regards,
> Maheshakya
>
>
> On Mon, Jun 16, 2014 at 8:20 PM, George Bezerra <gbeze...@gmail.com>
> wrote:
>
>> Hi!
>>
>> I have a trained scikit-learn model that I would like to export to
>> production. The idea is to have this model loaded into memory and
>> accessible through another language, such as Java or PHP. The application
>> would query the model with some input data and the model would spit out the
>> result.
>>
>> Any ideas/experience with this?
>>
>> Thanks!
>>
>> --
>> George Bezerra
>>
>>
>> ------------------------------------------------------------------------------
>> HPCC Systems Open Source Big Data Platform from LexisNexis Risk Solutions
>> Find What Matters Most in Your Big Data with HPCC Systems
>> Open Source. Fast. Scalable. Simple. Ideal for Dirty Data.
>> Leverages Graph Analysis for Fast Processing & Easy Data Exploration
>> http://p.sf.net/sfu/hpccsystems
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>>
>
>
> --
> Undergraduate,
> Department of Computer Science and Engineering,
> Faculty of Engineering.
> University of Moratuwa,
> Sri Lanka
>
>