Jeremy Freeman created SPARK-6345:
-------------------------------------

             Summary: Model update propagation during prediction in Streaming Regression
                 Key: SPARK-6345
                 URL: https://issues.apache.org/jira/browse/SPARK-6345
             Project: Spark
          Issue Type: Bug
          Components: MLlib, Streaming
            Reporter: Jeremy Freeman


During streaming regression analyses (Streaming Linear Regression and Streaming 
Logistic Regression), model updates based on training data are not reflected in 
subsequent calls to predictOn or predictOnValues, even though the updates 
themselves occur successfully. The cause may be recent changes to how the model 
is declared. I have a working fix prepared and will submit it ASAP, alongside 
expanded test coverage.

A temporary workaround is to retrieve the updated model within a foreachRDD, as in:

{code}
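// Train the model on the labeled stream.
model.trainOn(trainingData)
testData.foreachRDD { rdd =>
    // Fetch the current model on every batch so predictions use the latest update.
    val latest = model.latestModel()
    val predictions = rdd.map(lp => latest.predict(lp.features))
}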
{code}
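
To illustrate why fetching the model inside the per-batch closure picks up 
updates, here is a minimal Spark-free sketch. It is a toy illustration of the 
workaround pattern only, not a diagnosis of the underlying cause and not MLlib 
code: LinearModel, ToyStreamingModel and StaleModelDemo are hypothetical names, 
and update() merely stands in for a training batch.

```scala
// Toy stand-in for a fitted model; hypothetical, not Spark's API.
final case class LinearModel(weight: Double) {
  def predict(x: Double): Double = weight * x
}

// Toy stand-in for a streaming model whose fitted model is replaced over time.
final class ToyStreamingModel(initial: LinearModel) {
  private var current: LinearModel = initial

  // Simulates one training batch replacing the fitted model.
  def update(newWeight: Double): Unit = current = LinearModel(newWeight)

  def latestModel(): LinearModel = current
}

object StaleModelDemo {
  // Returns (prediction from the setup-time snapshot, prediction from a per-call lookup).
  def run(): (Double, Double) = {
    val model = new ToyStreamingModel(LinearModel(0.0))

    // Captured once at setup: keeps pointing at the initial model.
    val snapshot = model.latestModel()
    val stalePredict: Double => Double = x => snapshot.predict(x)

    // Looked up on each call: mirrors calling latestModel() inside foreachRDD.
    val freshPredict: Double => Double = x => model.latestModel().predict(x)

    model.update(2.0) // a training batch arrives after setup

    (stalePredict(3.0), freshPredict(3.0))
  }
}
```

In this sketch the snapshot-based prediction still uses the initial weight after 
the update, while the per-call lookup uses the new weight, which is the behavior 
the foreachRDD workaround relies on.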

Note that this does not affect Streaming KMeans, which works as expected for 
combinations of training and prediction.


