Where to store joined results in table-table join example

Shouichi Kamiya Tue, 29 Sep 2015 11:21:14 -0700

Hello everyone,

I am new to stream processing and need clarification on the
table-table join example in the state management documentation:


http://samza.apache.org/learn/documentation/0.9/container/state-management.html

> Implementation: The job subscribes to the change streams for the user 
> profiles database and the user settings database, both partitioned by 
> user_id. The job keeps a key-value store keyed by user_id, which contains the 
> latest profile record and the latest settings record for each user_id. When a 
> new event comes in from either stream, the job looks up the current value in 
> its store, updates the appropriate fields (depending on whether it was a 
> profile update or a settings update), and writes back the new joined record 
> to the store. The changelog of the store doubles as the output stream of the 
> task.
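
To check my reading of this paragraph, here is a minimal sketch of what I
think it describes. It is plain Java, not the Samza API: the HashMap stands
in for the local key-value store, the list stands in for the store's
changelog, and the class, method, and stream names are hypothetical ones I
made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TableTableJoinSketch {
    /** Joined record: latest profile and latest settings for one user_id. */
    static final class Joined {
        final String profile;   // null until a profile update has been seen
        final String settings;  // null until a settings update has been seen

        Joined(String profile, String settings) {
            this.profile = profile;
            this.settings = settings;
        }

        @Override
        public String toString() {
            return "Joined{profile=" + profile + ", settings=" + settings + "}";
        }
    }

    // Assumption: stand-in for the task's local key-value store, keyed by user_id.
    final Map<String, Joined> store = new HashMap<>();
    // Assumption: stand-in for the store's changelog; every write is appended here.
    final List<String> changelog = new ArrayList<>();

    /** Handle one change event from either input stream. */
    void process(String stream, String userId, String value) {
        // Look up the current joined value, or start from an empty one.
        Joined current = store.get(userId);
        if (current == null) {
            current = new Joined(null, null);
        }
        // Update only the side that changed; any stream other than
        // "profiles" is treated as the settings stream in this sketch.
        Joined updated = "profiles".equals(stream)
                ? new Joined(value, current.settings)
                : new Joined(current.profile, value);
        // Write back the joined record; the write is also logged.
        store.put(userId, updated);
        changelog.add(userId + "=" + updated);
    }

    public static void main(String[] args) {
        TableTableJoinSketch task = new TableTableJoinSketch();
        task.process("profiles", "u1", "name=Alice");
        task.process("settings", "u1", "lang=en");
        for (String entry : task.changelog) {
            System.out.println(entry);
        }
    }
}
```

In this sketch the joined record lives in the same store as the inputs, and
the list of writes is the only thing visible outside the task.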

I understand that the job keeps the latest profile and settings
records in the local key-value store (for performance). What I don't
understand is where the joined results should be stored. Should I store
them in the local key-value store or in an external database? And if
they are stored in the local key-value store, how can other tasks or
services fetch them?

Sincerely,
Shouichi

-- 
Shouichi Kamiya
