Hi guys,

I'm sure some of you have a similar use case, and I'd like to know how you
deal with it. In our application, we want to check the previous state of
some keys and compare it with their current state.

AFAIK, Spark Streaming does not offer key-value access to state. So what I
am currently doing is storing the previous and current data together as one
data type in the state. I call updateStateByKey in every interval and work
on the state (which holds the previous and current data) of the newly
generated DStream. But it has limitations:

1. I cannot access keys that do not appear in this time interval.
2. I cannot update key A's state from key B's if only key B appears in this
time interval.
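
A minimal sketch of the pattern described above, written in Python for
brevity. It is an illustration under assumptions, not the actual application
code: the names update_prev_and_current and has_changed are hypothetical, the
state is assumed to be a (previous, current) pair, and a key with no new data
is assumed to keep its state unchanged. The function has the
(new_values, last_state) shape that PySpark's updateStateByKey expects, so it
can be exercised without a cluster.

```python
from typing import Optional, Sequence, Tuple

# Per-key state: (previous value, current value). Either may be None
# until the key has been seen often enough.
State = Tuple[Optional[int], Optional[int]]


def update_prev_and_current(new_values: Sequence[int],
                            state: Optional[State]) -> State:
    """Update function for a single key.

    new_values: the values that arrived for this key in the current interval.
    state: the (previous, current) pair stored for this key, or None if the
           key has no state yet.
    """
    prev, curr = state if state is not None else (None, None)
    if not new_values:
        # Assumption: a key without new data keeps its state as it is.
        return (prev, curr)
    # Assumption: the last value of the interval becomes the new "current",
    # and the old "current" becomes "previous".
    return (curr, new_values[-1])


def has_changed(state: State) -> bool:
    """Compare the previous and current value held in one key's state."""
    prev, curr = state
    return prev is not None and prev != curr


# Hypothetical wiring into a streaming job (pairs is assumed to be a DStream
# of (key, value) tuples, and checkpointing must be enabled):
#   state_stream = pairs.updateStateByKey(update_prev_and_current)
#   changed_keys = state_stream.filter(lambda kv: has_changed(kv[1]))
```

Note that the function only ever receives the values and state of one key,
which is where the second limitation comes from.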

Am I doing something wrong? Any suggestions? Thank you.

Cheers,

Fang, Yan
yanfang...@gmail.com
+1 (206) 849-4108
