Jeremy Freeman created SPARK-6345:
-------------------------------------
Summary: Model update propagation during prediction in Streaming Regression
Key: SPARK-6345
URL: https://issues.apache.org/jira/browse/SPARK-6345
Project: Spark
Issue Type: Bug
Components: MLlib, Streaming
Reporter: Jeremy Freeman
During streaming regression analyses (Streaming Linear Regression and Streaming
Logistic Regression), model updates based on training data are not reflected in
subsequent calls to predictOn or predictOnValues, even though the updates
themselves occur successfully. The cause may be recent changes to how the model
is declared. I have a working fix ready and will submit it ASAP, along with
expanded test coverage.
A temporary workaround is to retrieve the updated model within a foreachRDD, as in:
{code}
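model.trainOn(trainingData)
testData.foreachRDD { rdd =>
  // Fetch the model inside the closure so each batch sees the latest update
  val latest = model.latestModel()
  val predictions = rdd.map(lp => latest.predict(lp.features))
  // ... use predictions here (e.g. print or save)
}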
{code}
Note that this does not affect Streaming KMeans, which works as expected for
combinations of training and prediction.
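The cause is not confirmed in this report, but one way this kind of symptom can arise is a prediction function that captures the model once, when it is built, instead of looking it up on every call. The plain-Scala sketch below illustrates only that general pattern, and why fetching the model inside the per-batch closure (as in the workaround above) avoids it. All names here (Model, StreamingModel, setModel, eagerPredictor, lazyPredictor, StalePredictionDemo) are hypothetical stand-ins and are not part of the Spark API.

```scala
// Hypothetical stand-ins for illustration only; none of these names come from Spark.
final case class Model(weight: Double) {
  def predict(x: Double): Double = weight * x
}

class StreamingModel(initial: Model) {
  private var current: Model = initial

  // Stands in for what a training step would do after each batch.
  def setModel(m: Model): Unit = { current = m }

  // Captures the model once, when the predictor is built:
  // later updates are invisible to the returned function.
  def eagerPredictor: Double => Double = {
    val snapshot = current
    x => snapshot.predict(x)
  }

  // Looks the model up on every call: later updates are picked up.
  def lazyPredictor: Double => Double =
    x => current.predict(x)
}

object StalePredictionDemo {
  // Returns (prediction from the eager predictor, prediction from the lazy predictor)
  // after the model has been updated from weight 0.0 to weight 2.0.
  def run(): (Double, Double) = {
    val sm = new StreamingModel(Model(0.0))
    val eager = sm.eagerPredictor
    val deferred = sm.lazyPredictor
    sm.setModel(Model(2.0)) // simulated training update
    (eager(3.0), deferred(3.0))
  }
}
```

Here StalePredictionDemo.run() returns (0.0, 6.0): the eager predictor still uses the initial weight, while the deferred one sees the update. This is the same reason the workaround calls model.latestModel() inside foreachRDD rather than outside it.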