ML PipelineModel to be scored locally

Simone Miraglia Wed, 20 Jul 2016 07:09:10 -0700

Hi all,

I am working on the following use case involving ML Pipelines.


1. I created a Pipeline composed of a set of stages
2. I called the "fit" method on my training set
3. I validated my model by calling "transform" on my test set
4. I stored my fitted Pipeline to a shared folder
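
A minimal PySpark sketch of the four steps above, assuming a Spark version whose Python API supports pipeline persistence (2.0 or later). The stages (Tokenizer, HashingTF, LogisticRegression), the "text" and "label" column names, and the function name fit_validate_save are placeholders chosen for illustration; the actual stages are not stated here.

```python
def fit_validate_save(training, test, model_dir):
    """Fit a pipeline, score the test set, and persist the fitted model.

    `training` and `test` are assumed to be Spark DataFrames with
    "text" and "label" columns (hypothetical schema).
    """
    # Imported inside the function so this file loads without Spark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import HashingTF, Tokenizer

    # 1. Build a Pipeline from a set of stages (illustrative stages only).
    tokenizer = Tokenizer(inputCol="text", outputCol="words")
    hashing_tf = HashingTF(inputCol="words", outputCol="features")
    lr = LogisticRegression(maxIter=10)
    pipeline = Pipeline(stages=[tokenizer, hashing_tf, lr])

    # 2. Fit on the training set; the result is a PipelineModel.
    model = pipeline.fit(training)

    # 3. Validate by transforming the test set.
    predictions = model.transform(test)

    # 4. Store the fitted PipelineModel, e.g. on a shared folder.
    model.write().overwrite().save(model_dir)
    return model, predictions
```

Every step here goes through DataFrames, which is exactly what the serving side described next wants to avoid.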

Then I have a very low-latency interactive application (say, a kind of web
service) that should work as follows:
1. The app receives a request
2. A score needs to be computed using my fitted PipelineModel
3. The app sends the score to the caller, in a synchronous fashion

Is there a way to call the .transform method of the PipelineModel over a
single Row?

I definitely do not want to parallelize a single record into a DataFrame,
nor rely on Spark Streaming, because of the latency requirements.
I would like to use something similar to the mllib .predict(Vector) method,
which does not rely on a SparkContext and performs all the computation locally.
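
To make the request concrete, here is a plain-Python sketch of what local, single-record scoring could look like. It involves no Spark at all: the stage functions, the weights, and the "text" field are made up for illustration, and score_locally is a hypothetical helper, not an existing Spark API.

```python
import math
from typing import Callable, Dict, List

Record = Dict[str, object]
Stage = Callable[[Record], Record]


def tokenize(record: Record) -> Record:
    """Stand-in for a feature stage: lowercase and split the "text" field."""
    out = dict(record)
    out["words"] = str(record["text"]).lower().split()
    return out


def make_scorer(weights: Dict[str, float], intercept: float) -> Stage:
    """Stand-in for a fitted model: logistic score over per-word weights."""
    def score(record: Record) -> Record:
        margin = intercept + sum(weights.get(w, 0.0) for w in record["words"])
        out = dict(record)
        out["probability"] = 1.0 / (1.0 + math.exp(-margin))
        return out
    return score


def score_locally(stages: List[Stage], record: Record) -> Record:
    """Apply each stage in order to one record, entirely in-process."""
    for stage in stages:
        record = stage(record)
    return record


# Hypothetical fitted stages and a single incoming request.
stages = [tokenize, make_scorer({"spark": 1.5, "slow": -2.0}, intercept=0.0)]
result = score_locally(stages, {"text": "Spark is fast"})
```

The point of the sketch is the call shape: one record in, one scored record out, synchronously, with no cluster or job scheduling in the request path.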

Thanks in advance
Best
