[ https://issues.apache.org/jira/browse/FLINK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948444#comment-14948444 ]
ASF GitHub Bot commented on FLINK-1966:
---------------------------------------

Github user chiwanpark commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1186#discussion_r41497773

    --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/regression/MultipleLinearRegression.scala ---
    @@ -124,6 +121,52 @@ class MultipleLinearRegression extends Predictor[MultipleLinearRegression] {
         }
       }

    +
    +  override def toPMML(): PMML = {
    +    weightsOption match {
    +      case None => {
    +        throw new RuntimeException("The MultipleLinearRegression has not been fitted to the " +
    +          "data. This is necessary to learn the weight vector of the linear function.")
    +      }
    +      case Some(weights) => {
    +        val model = weights.collect().head
    +        val pmml = new PMML()
    +        pmml.setHeader(new Header().setDescription("Multiple Linear Regression"))
    +
    +        // define the fields
    +        val target = FieldName.create("prediction")
    +        val fields = scala.Array.ofDim[FieldName](model.weights.size)
    +        Range(0, model.weights.size).foreach(index =>
    +          fields(index) = FieldName.create("field_" + index)
    +        )
    +
    +        // define the data dictionary, mining schema and regression table
    +        val dictionary = new DataDictionary()
    +        val miningSchema = new MiningSchema()
    +        val regressionTable = new RegressionTable().setIntercept(model.intercept)
    +        Range(0, model.weights.size).foreach(index => {
    +          miningSchema.addMiningFields(
    +            new MiningField(fields(index)).setUsageType(FieldUsageType.ACTIVE)
    +          )
    +          regressionTable.addNumericPredictors(
    +            new NumericPredictor(fields(index), model.weights(index))
    +          )
    +          dictionary.addDataFields(
    +            new DataField(fields(index), OpType.CONTINUOUS, DataType.DOUBLE)
    +          )
    +        })
    --- End diff --

    We can simplify this using the `zipWithIndex` method on `fields`.

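For illustration, the suggested pattern might look roughly like the sketch below. It is not the code from the pull request: `FieldName` and `NumericPredictor` here are hypothetical stand-in case classes (not the PMML library types), and `ZipWithIndexSketch` and `predictors` are made-up names, chosen so that the snippet runs on its own. The point is only that `zipWithIndex` pairs each field with its position, so the loop body no longer looks up `fields(index)`.

```scala
// Hypothetical stand-ins for the PMML classes used in the diff, so that the
// sketch compiles without any PMML library on the classpath.
final case class FieldName(value: String)
final case class NumericPredictor(field: FieldName, coefficient: Double)

object ZipWithIndexSketch {
  // Builds one predictor per weight by pairing each field with its index.
  def predictors(weights: Vector[Double]): Vector[NumericPredictor] = {
    // One field name per weight: field_0, field_1, ...
    val fields = Vector.tabulate(weights.size)(i => FieldName("field_" + i))
    // zipWithIndex yields (field, index) pairs, replacing the manual Range loop.
    fields.zipWithIndex.map { case (field, index) =>
      NumericPredictor(field, weights(index))
    }
  }

  def main(args: Array[String]): Unit =
    predictors(Vector(0.5, -1.25, 2.0)).foreach(println)
}
```

In the actual `toPMML` method the same `foreach { case (field, index) => ... }` shape would presumably wrap the three `add...` calls from the diff, with `field` used in place of `fields(index)`.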
> Add support for predictive model markup language (PMML)
> -------------------------------------------------------
>
>                 Key: FLINK-1966
>                 URL: https://issues.apache.org/jira/browse/FLINK-1966
>             Project: Flink
>          Issue Type: Improvement
>          Components: Machine Learning Library
>            Reporter: Till Rohrmann
>            Assignee: Sachin Goel
>            Priority: Minor
>              Labels: ML
>
> The predictive model markup language (PMML) [1] is a widely used language to
> describe predictive and descriptive models as well as pre- and
> post-processing steps. It thus provides an easy way to export models to and
> import models from other ML tools.
> Resources:
> [1] http://journal.r-project.org/archive/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)