Hello everyone,

I am new to stream processing and need a clarification on the table-table join example in the state management document.
http://samza.apache.org/learn/documentation/0.9/container/state-management.html

> Implementation: The job subscribes to the change streams for the user
> profiles database and the user settings database, both partitioned by
> user_id. The job keeps a key-value store keyed by user_id, which contains the
> latest profile record and the latest settings record for each user_id. When a
> new event comes in from either stream, the job looks up the current value in
> its store, updates the appropriate fields (depending on whether it was a
> profile update or a settings update), and writes back the new joined record
> to the store. The changelog of the store doubles as the output stream of the
> task.

I understand that the job stores the latest profile and settings records in the local key-value store (for performance).

I don't understand where to store joined results. Should I store them in the local kv store or external database? How can other tasks or services fetch the joined results if they are stored in the local kv store?

Sincerely,
Shouichi

--
Shouichi Kamiya
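
P.S. To make the question concrete, here is how I read the quoted passage, as a minimal sketch in plain Python. This is not the Samza API: `JoinTask`, `process`, `store`, and `changelog` are hypothetical names, the dict stands in for the local key-value store, and the list stands in for the store's changelog stream.

```python
from dataclasses import dataclass, field


@dataclass
class JoinTask:
    """Toy model of the table-table join task (hypothetical, not Samza code)."""

    # Stand-in for the local key-value store, keyed by user_id.
    store: dict = field(default_factory=dict)
    # Stand-in for the store's changelog, which doubles as the output stream.
    changelog: list = field(default_factory=list)

    def process(self, source, user_id, record):
        """Handle one change event from the 'profile' or 'settings' stream."""
        if source not in ("profile", "settings"):
            raise ValueError(f"unknown source stream: {source}")
        # Look up the current joined record, or start an empty one.
        current = self.store.get(user_id, {"profile": None, "settings": None})
        # Copy before updating so earlier changelog entries are not mutated.
        joined = dict(current)
        joined[source] = record
        # Write the joined record back; every write also lands in the changelog.
        self.store[user_id] = joined
        self.changelog.append((user_id, joined))
        return joined


task = JoinTask()
task.process("profile", "u1", {"name": "Alice"})
task.process("settings", "u1", {"theme": "dark"})
```

In this sketch the joined record lives in the local store, and each write is also appended to `changelog`, which is my reading of "The changelog of the store doubles as the output stream of the task."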