Jeremy Freeman created SPARK-2438:
-------------------------------------
Summary: Streaming + MLLib
Key: SPARK-2438
URL: https://issues.apache.org/jira/browse/SPARK-2438
Project: Spark
Issue Type: Improvement
Components: MLlib
Reporter: Jeremy Freeman
This ticket tracks progress on developing streaming analyses in MLlib.
Many streaming applications benefit from or require fitting models online,
where the parameters of a model (e.g. regression, clustering) are updated
continually as new data arrive. This can be accomplished by incorporating MLlib
algorithms into model-updating operations over DStreams. In some cases this can
be achieved using existing updaters (e.g. those based on SGD), but other
cases will require custom update rules (e.g. for KMeans). The goal is to have
streaming versions of many common algorithms, in particular regression,
classification, clustering, and possibly dimensionality reduction.
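
To make the two kinds of update concrete, here is a minimal sketch in plain
Python. It does not use Spark or MLlib: each "batch" is a list standing in for
the records that arrive in one DStream interval, and the function names
(sgd_update, kmeans_update) and the step size are hypothetical, chosen only for
illustration. The k-means rule shown is a simple running-mean update, one
possible custom rule and not necessarily the one this ticket will adopt.

```python
# Illustrative sketch only: plain Python, no Spark or MLlib calls.
# Each batch stands in for the records arriving in one DStream interval.

def sgd_update(weights, batch, step_size=0.1):
    """One gradient step of least-squares regression on a batch.

    batch is a list of (features, label) pairs; returns new weights.
    """
    n = len(batch)
    if n == 0:
        return list(weights)  # empty interval: model is unchanged
    grad = [0.0] * len(weights)
    for features, label in batch:
        prediction = sum(w * x for w, x in zip(weights, features))
        error = prediction - label
        for j, x in enumerate(features):
            grad[j] += error * x / n  # average gradient over the batch
    return [w - step_size * g for w, g in zip(weights, grad)]


def kmeans_update(centers, counts, batch):
    """Running-mean k-means update (an assumed custom rule).

    Each point moves its nearest center toward itself by 1/count,
    so every center stays the mean of all points assigned to it so far.
    """
    centers = [list(c) for c in centers]  # copy so inputs are not mutated
    counts = list(counts)
    for point in batch:
        # Index of the nearest center by squared Euclidean distance.
        k = min(
            range(len(centers)),
            key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])),
        )
        counts[k] += 1
        centers[k] = [c + (p - c) / counts[k] for p, c in zip(point, centers[k])]
    return centers, counts


if __name__ == "__main__":
    # Simulated stream: the same small batch of y = 2x data arrives repeatedly.
    stream = [[([1.0], 2.0), ([2.0], 4.0)] for _ in range(50)]
    weights = [0.0]
    for batch in stream:
        weights = sgd_update(weights, batch)
    print("regression weights:", weights)
```

In both functions the model state (weights, or centers plus counts) is carried
from one batch to the next, which is the part that would map onto a
model-updating operation over a DStream.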