Andre Schumacher created SPARK-6462:
---------------------------------------

             Summary: UpdateStateByKey should allow inner join of new with old keys
                 Key: SPARK-6462
                 URL: https://issues.apache.org/jira/browse/SPARK-6462
             Project: Spark
          Issue Type: Improvement
          Components: Streaming
    Affects Versions: 1.3.0
            Reporter: Andre Schumacher


In a nutshell: provide an inner join instead of a cogroup for updateStateByKey in StateDStream.

Details:

It is common to read data (say weblog data) from a streaming source (say Kafka) and, in each batch, update the state of only a relatively small number of keys. If only the state changes need to be propagated to a downstream sink, then one could avoid filtering out unchanged state in the user program and instead provide this functionality in the API (say by adding an updateStateChangesByKey method).

Note that this is related but not identical to: https://issues.apache.org/jira/browse/SPARK-2629
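
To make the difference concrete, here is a minimal sketch in plain Python (dicts only, no Spark) of a cogroup-style state update next to a join-style one. The names cogroup_update, inner_join_update and running_sum are hypothetical and are not part of the Spark API. The join variant visits only keys present in both the old state and the new batch, which is one possible reading of "inner join"; how keys that are new to the batch should be handled is left aside here.

```python
from typing import Callable, Dict, List, Optional

State = Dict[str, int]
Batch = Dict[str, List[int]]
UpdateFn = Callable[[List[int], Optional[int]], Optional[int]]


def running_sum(new_values: List[int], old: Optional[int]) -> Optional[int]:
    """Example update function: add the batch's values to the old state."""
    return (old or 0) + sum(new_values)


def cogroup_update(state: State, batch: Batch, update: UpdateFn) -> State:
    """Cogroup-style step: visit every key found in the state or the batch."""
    result: State = {}
    for key in state.keys() | batch.keys():
        new_state = update(batch.get(key, []), state.get(key))
        if new_state is not None:  # returning None drops the key
            result[key] = new_state
    return result


def inner_join_update(state: State, batch: Batch, update: UpdateFn) -> State:
    """Hypothetical join-style step: visit only keys in both state and batch."""
    changes: State = {}
    for key in state.keys() & batch.keys():
        new_state = update(batch[key], state[key])
        if new_state is not None:
            changes[key] = new_state
    return changes


state = {"a": 1, "b": 2, "c": 3}
batch = {"b": [10, 5]}

full = cogroup_update(state, batch, running_sum)
changed = inner_join_update(state, batch, running_sum)
print(sorted(full.items()))     # every key is revisited and re-emitted
print(sorted(changed.items()))  # only the key touched by this batch
```

With the cogroup-style step the update function runs for "a" and "c" even though the batch carries no data for them, and the user program has to filter them out before writing to the sink. The join-style step returns only the entries that actually changed.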