Hi Chirag,

Could you please provide more information on your Java server environment?


On Fri, Nov 7, 2014 at 9:57 AM, chirag lakhani <chirag.lakh...@gmail.com> wrote:

> Thanks for letting me know about this, it looks pretty interesting.  From
> reading the documentation, it seems that the server must be built on a Spark
> cluster, is that correct?  Is it possible to deploy it on a Java server?
> That is how we are currently running our web app.
> On Tue, Nov 4, 2014 at 7:57 PM, Simon Chan <simonc...@gmail.com> wrote:
>> The latest version of PredictionIO, which is now under the Apache 2
>> license, supports the deployment of MLlib models in production.
>> The "engine" you build will include a few components, such as:
>> - Data - includes Data Source and Data Preparator
>> - Algorithm(s)
>> - Serving
>> I believe that you can do the feature vector creation inside the Data
>> Preparator component.
>> Currently, the package comes with two templates: 1) Collaborative
>> Filtering Engine Template - with MLlib ALS; 2) Classification Engine
>> Template - with MLlib Naive Bayes. The latter may be useful to you, and
>> you can customize the Algorithm component, too.
>> I have just created a doc: http://docs.prediction.io/0.8.1/templates/
>> Love to hear your feedback!
>> Regards,
>> Simon
>> On Mon, Oct 27, 2014 at 11:03 AM, chirag lakhani <
>> chirag.lakh...@gmail.com> wrote:
>>> Would pipelining include model export?  I didn't see that in the
>>> documentation.
>>> Are there ways that this is being done currently?
>>> On Mon, Oct 27, 2014 at 12:39 PM, Xiangrui Meng <men...@gmail.com>
>>> wrote:
>>>> We are working on the pipeline features, which would make this
>>>> procedure much easier in MLlib. This is still a WIP and the main JIRA
>>>> is at:
>>>> https://issues.apache.org/jira/browse/SPARK-1856
>>>> Best,
>>>> Xiangrui
>>>> On Mon, Oct 27, 2014 at 8:56 AM, chirag lakhani
>>>> <chirag.lakh...@gmail.com> wrote:
>>>> > Hello,
>>>> >
>>>> > I have been prototyping a text classification model that my company
>>>> > would like to eventually put into production.  Our technology stack is
>>>> > currently Java-based, but we would like to be able to build our models
>>>> > in Spark/MLlib and then export something like a PMML file which can be
>>>> > used for model scoring in real time.
>>>> >
>>>> > I have been using scikit-learn, where I am able to take the training
>>>> > data, convert the text data into a sparse data format, and then take
>>>> > the other features and use the dictionary vectorizer to do one-hot
>>>> > encoding for the other categorical variables.  All of those things
>>>> > seem to be possible in MLlib, but I am still puzzled about how they can
>>>> > be packaged in such a way that the incoming data can first be made into
>>>> > feature vectors and then evaluated as well.
>>>> >
>>>> > Are there any best practices for this type of thing in Spark?  I hope
>>>> > this is clear, but if anything is confusing then please let me know.
>>>> >
>>>> > Thanks,
>>>> >
>>>> > Chirag
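
For reference, the original question above (turn incoming data into a feature vector, then evaluate it on a Java server) can be sketched in plain Java without a Spark dependency. This is only an illustration under stated assumptions, not how PredictionIO or MLlib does it: the class name HashedLinearScorer, the "text=" and "category=" key prefixes, the use of String.hashCode() for the hashing trick, and the logistic model form are all hypothetical. Whatever hash function and vector size were used at training time would have to be reproduced exactly at scoring time, and the weights would have to be exported from the trained model by some separate step.

```java
import java.util.Locale;

/** Sketch: score one request in plain Java using weights exported from training. */
final class HashedLinearScorer {
    private final double[] weights;  // one weight per feature slot (assumed exported offline)
    private final double intercept;

    HashedLinearScorer(double[] weights, double intercept) {
        this.weights = weights.clone();
        this.intercept = intercept;
    }

    /** Maps a string key to a slot in [0, numFeatures) (the "hashing trick"). */
    static int slot(String key, int numFeatures) {
        // Assumption: training used the same hash; floorMod keeps the index non-negative.
        return Math.floorMod(key.hashCode(), numFeatures);
    }

    /** Builds a dense vector: term counts for the text plus one slot for the category. */
    static double[] featurize(String text, String category, int numFeatures) {
        double[] v = new double[numFeatures];
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                v[slot("text=" + token, numFeatures)] += 1.0;
            }
        }
        // One-hot style encoding of a single categorical variable.
        v[slot("category=" + category, numFeatures)] += 1.0;
        return v;
    }

    /** Returns a logistic score in (0, 1) for one incoming record. */
    double score(String text, String category) {
        double[] x = featurize(text, category, weights.length);
        double margin = intercept;
        for (int i = 0; i < x.length; i++) {
            margin += weights[i] * x[i];
        }
        return 1.0 / (1.0 + Math.exp(-margin));
    }

    public static void main(String[] args) {
        // Placeholder weights; real values would come from the trained model.
        double[] w = new double[1 << 10];
        HashedLinearScorer scorer = new HashedLinearScorer(w, 0.0);
        System.out.println(scorer.score("example request text", "mobile"));
    }
}
```

Keeping featurization and scoring in one class means the web app only needs the exported weights and the vector size. The trade-off is that any change to tokenization or hashing on the training side has to be mirrored here by hand.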

Donald Szeto
