Hi all,

I am working on the following use case involving ML Pipelines.

1. I created a Pipeline composed of a set of stages
2. I called the "fit" method on my training set
3. I validated my model by calling "transform" on my test set
4. I stored my fitted Pipeline to a shared folder

Then I have a very low-latency interactive application (say, a kind of web
service) that should work as follows:
1. The app receives a request
2. A score needs to be computed using my fitted PipelineModel
3. The app sends the score to the caller, in a synchronous fashion

Is there a way to call the .transform method of the PipelineModel on a
single Row?

I definitely do not want to parallelize a single record into a DataFrame,
nor rely on Spark Streaming, because of the latency requirements.
I would like to use something similar to the mllib .predict(Vector) method,
which does not rely on the SparkContext and performs all the computation
locally.
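
To illustrate the kind of call I have in mind, here is a minimal sketch in
plain Python. It is not a Spark API: LocalLogisticModel, handle_request, and
the coefficient values are hypothetical names made up for this example. It
assumes the last stage is a logistic regression whose coefficients have been
copied out of the fitted model, and it ignores any feature-transformation
stages that come before it.

```python
import math
from typing import Dict, Sequence


class LocalLogisticModel:
    """Hypothetical local scorer; holds coefficients copied from a fitted model."""

    def __init__(self, weights: Sequence[float], intercept: float) -> None:
        self.weights = list(weights)
        self.intercept = intercept

    def predict(self, features: Sequence[float]) -> float:
        """Score one feature vector in-process, without any cluster round trip."""
        if len(features) != len(self.weights):
            raise ValueError("feature vector length does not match the model")
        margin = self.intercept + sum(
            w * x for w, x in zip(self.weights, features)
        )
        # Numerically stable sigmoid: never exponentiate a large positive number.
        if margin >= 0:
            return 1.0 / (1.0 + math.exp(-margin))
        e = math.exp(margin)
        return e / (1.0 + e)


def handle_request(model: LocalLogisticModel, features: Sequence[float]) -> Dict[str, float]:
    """Synchronous request handler: score one record and return the result."""
    return {"score": model.predict(features)}


# Illustrative values only; real coefficients would come from the fitted model.
model = LocalLogisticModel(weights=[0.5, -0.25], intercept=0.0)
response = handle_request(model, [2.0, 4.0])
```

The point is that scoring a single record is a function call inside the
serving process, so the latency does not include any job scheduling.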

Thanks in advance
Best
