Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/3099#issuecomment-61934791
@jegonzal @jkbradley (This may be slightly off-topic.) To serve models,
which are usually long pipelines in practice, I'm thinking of the following:
1. Serialize to a portable format, such as PMML. For example, the Google
Prediction API accepts PMML with transformations. @selvinsource is working on
#3062 and @srowen is helping.
2. Code generation. This is less interoperable than PMML, but faster and
perhaps easier to implement. For example, we could do the following:
~~~
val model = pipeline.fit(dataset)
val code = model.generateCode(schema)
~~~
Then the compiled code can make predictions on plain Java input, without any
dependency on Spark.
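To make point 2 concrete, here is a hypothetical sketch of what such generated
code might look like for a small pipeline (a scaler step followed by logistic
regression). The class name, method, and all parameter values are made up for
illustration; they are what `generateCode` might bake in from a fitted model:

~~~java
// Hypothetical output of code generation for a fitted two-stage pipeline:
// a StandardScaler transform followed by a LogisticRegressionModel, compiled
// into a plain Java class with no Spark dependency. All values are invented.
public class GeneratedPipelineModel {
    // Parameters baked in from the fitted pipeline (illustrative values).
    private static final double[] MEAN = {1.0, 2.0};
    private static final double[] STD = {0.5, 1.5};
    private static final double[] WEIGHTS = {0.8, -0.3};
    private static final double INTERCEPT = 0.1;

    /** Returns the predicted probability of the positive class. */
    public static double predict(double[] features) {
        double margin = INTERCEPT;
        for (int i = 0; i < features.length; i++) {
            // Inlined StandardScaler transform, then dot product with weights.
            double scaled = (features[i] - MEAN[i]) / STD[i];
            margin += WEIGHTS[i] * scaled;
        }
        return 1.0 / (1.0 + Math.exp(-margin)); // logistic link
    }

    public static void main(String[] args) {
        System.out.println(predict(new double[] {1.0, 2.0}));
    }
}
~~~

Since the transforms and coefficients are inlined as constants, the class can
be shipped and called from any JVM service with no runtime beyond the JDK.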