Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/3099#issuecomment-61934791
@jegonzal @jkbradley (This may be slightly off-topic.) To serve models,
which are usually long pipelines in practice, I'm thinking of the following:
1. Serialize to a portable format, such as PMML. For example, the Google
Prediction API accepts PMML with transformations. @selvinsource is working on
#3062 and @srowen is helping.
2. Code generation. This is less interoperable than PMML, but faster and
perhaps easier to implement. For example, we could do the following:
~~~
val model = pipeline.fit(dataset)
val code = model.generateCode(schema)
~~~
Then the compiled code can make predictions on plain Java input, without any
dependency on Spark.
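To make point 2 concrete, here is a hypothetical sketch of what such generated
code might look like for a small pipeline (a scaler step followed by logistic
regression). The class name, method, and all parameter values are made up for
illustration; they are what `generateCode` might bake in from a fitted model:

~~~java
// Hypothetical output of code generation for a fitted two-stage pipeline:
// a StandardScaler transform followed by a LogisticRegressionModel, compiled
// into a plain Java class with no Spark dependency. All values are invented.
public class GeneratedPipelineModel {
    // Parameters baked in from the fitted pipeline (illustrative values).
    private static final double[] MEAN = {1.0, 2.0};
    private static final double[] STD = {0.5, 1.5};
    private static final double[] WEIGHTS = {0.8, -0.3};
    private static final double INTERCEPT = 0.1;

    /** Returns the predicted probability of the positive class. */
    public static double predict(double[] features) {
        double margin = INTERCEPT;
        for (int i = 0; i < features.length; i++) {
            // Inlined StandardScaler transform, then dot product with weights.
            double scaled = (features[i] - MEAN[i]) / STD[i];
            margin += WEIGHTS[i] * scaled;
        }
        return 1.0 / (1.0 + Math.exp(-margin)); // logistic link
    }

    public static void main(String[] args) {
        System.out.println(predict(new double[] {1.0, 2.0}));
    }
}
~~~

Since the transforms and coefficients are inlined as constants, the class can
be shipped and called from any JVM service with no runtime beyond the JDK.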