MLLib in Production

2014-12-10 Thread Klausen Schaefersinho
Hi,


I would like to use Spark to train a model, but then use the model somewhere
else, e.g. in a servlet to do some classification in real time.

What is the best way to do this? Can I just copy the model file or something
and load it in the servlet? Can anybody point me to a good tutorial?


Cheers,


Klaus



-- 
“Overfitting” is not about an excessive amount of physical exercise...


Re: MLLib in Production

2014-12-10 Thread Simon Chan
Hi Klaus,

PredictionIO is an open source product based on Spark MLlib for exactly
this purpose.
This is the tutorial for classification in particular:
http://docs.prediction.io/classification/quickstart/

You can add custom serving logic and retrieve prediction results elsewhere
through the REST API/SDKs.

Simon



Re: MLLib in Production

2014-12-10 Thread Yanbo Liang
Hi Klaus,

There is no ideal built-in method yet, only workarounds.
Train the model on a Spark or YARN cluster, then use RDD.saveAsTextFile to
store the model's weights and intercept on HDFS.
Later, load the weights and intercept files from HDFS, construct a GLM model
from them, and call model.predict() to get what you want.
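
For example, with a logistic regression model it could look roughly like the
sketch below (the HDFS paths, the model type and the iteration count are only
placeholders; any of the 1.x linear models with a public (weights, intercept)
constructor can be rebuilt the same way):

import org.apache.spark.SparkContext
import org.apache.spark.mllib.classification.{LogisticRegressionModel, LogisticRegressionWithSGD}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Training job: fit the model and persist its parameters as text on HDFS.
def trainAndStore(sc: SparkContext, data: RDD[LabeledPoint]): Unit = {
  val model = LogisticRegressionWithSGD.train(data, 100)
  // One weight per line; a single partition keeps the ordering stable.
  sc.parallelize(model.weights.toArray.toSeq, 1)
    .saveAsTextFile("hdfs:///models/lr/weights")
  sc.parallelize(Seq(model.intercept), 1)
    .saveAsTextFile("hdfs:///models/lr/intercept")
}

// Serving side: rebuild the model from the stored parameters and predict.
// Once the numbers are loaded, predict() itself needs no cluster at all.
def loadAndPredict(sc: SparkContext, features: Array[Double]): Double = {
  val weights = sc.textFile("hdfs:///models/lr/weights").collect().map(_.toDouble)
  val intercept = sc.textFile("hdfs:///models/lr/intercept").first().toDouble
  val model = new LogisticRegressionModel(Vectors.dense(weights), intercept)
  model.predict(Vectors.dense(features))
}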

The Spark community also has ongoing work on exporting models via PMML.




Re: MLLib in Production

2014-12-10 Thread Sonal Goyal
You can also serialize the model and use it in other places.
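
For the linear models, which are plain Serializable objects, that can be as
simple as Java serialization. A minimal sketch (the model type and file path
are just examples, not a recommended layout):

import java.io.{FileInputStream, FileOutputStream, ObjectInputStream, ObjectOutputStream}
import org.apache.spark.mllib.classification.LogisticRegressionModel
import org.apache.spark.mllib.linalg.Vectors

// Write the trained model to a file with plain Java serialization.
def saveModel(model: LogisticRegressionModel, path: String): Unit = {
  val out = new ObjectOutputStream(new FileOutputStream(path))
  try out.writeObject(model) finally out.close()
}

// Read it back wherever the spark-mllib jar is on the classpath, e.g. in a servlet.
def loadModel(path: String): LogisticRegressionModel = {
  val in = new ObjectInputStream(new FileInputStream(path))
  try in.readObject().asInstanceOf[LogisticRegressionModel] finally in.close()
}

// Scoring a single feature vector then needs no SparkContext:
// val model = loadModel("/opt/models/lr.model")
// model.predict(Vectors.dense(0.5, 1.2, -0.3))

One caveat: the serialized form is tied to the Spark and Scala versions on the
classpath, so the training and serving sides need to match.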

Best Regards,
Sonal
Founder, Nube Technologies http://www.nubetech.co

http://in.linkedin.com/in/sonalgoyal





Re: MLLib in Production

2014-12-10 Thread Ganelin, Ilya
Hi all, I've been storing the userFeatures and productFeatures vectors that the
model generates internally, serialized on disk, and importing them in a
separate job.
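
Roughly like this for an ALS model (paths and hyperparameters below are
placeholders; the dot product at the end is equivalent to what
MatrixFactorizationModel.predict computes from the same factors):

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._  // pair-RDD implicits on older Spark versions
import org.apache.spark.mllib.recommendation.{ALS, Rating}
import org.apache.spark.rdd.RDD

// Training job: fit ALS and persist the two factor matrices as object files.
def trainAndStore(sc: SparkContext, ratings: RDD[Rating]): Unit = {
  val model = ALS.train(ratings, rank = 10, iterations = 10, lambda = 0.01)
  model.userFeatures.saveAsObjectFile("hdfs:///models/als/userFeatures")
  model.productFeatures.saveAsObjectFile("hdfs:///models/als/productFeatures")
}

// Separate job: reload the factors and score a (user, product) pair directly.
// Assumes both ids were present in the training data.
def score(sc: SparkContext, user: Int, product: Int): Double = {
  val userFeatures: RDD[(Int, Array[Double])] =
    sc.objectFile("hdfs:///models/als/userFeatures")
  val productFeatures: RDD[(Int, Array[Double])] =
    sc.objectFile("hdfs:///models/als/productFeatures")
  val u = userFeatures.lookup(user).head
  val p = productFeatures.lookup(product).head
  u.zip(p).map { case (a, b) => a * b }.sum
}

lookup() still goes through Spark, so for low-latency serving you would
typically collect the factors into an in-memory map or a key-value store first.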
