eduwercamacaro commented on code in PR #20749:
URL: https://github.com/apache/kafka/pull/20749#discussion_r2455820031


##########
streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java:
##########
@@ -157,9 +159,14 @@ public RocksDBStore(final String name,
     @Override
     public void init(final StateStoreContext stateStoreContext,
                      final StateStore root) {
+        initialized.set(true);
         // open the DB dir
         metricsRecorder.init(metricsImpl(stateStoreContext), stateStoreContext.taskId());
-        openDB(stateStoreContext.appConfigs(), stateStoreContext.stateDir());
+        if (!open) {
+            preInit(stateStoreContext);

Review Comment:
   Thank you for your time reviewing this PR.
   
   That is a valid concern. 
   
   As the PR stands, it is not clear when each method in the state store 
lifecycle is called. That makes things considerably harder for anyone who 
writes a state store, including the existing `RocksDBStore`. 
   
   I agree that we should absorb a little more complexity on our end so that 
individual state store implementations have less to deal with. 
   
   I was thinking that we could document the `preInit` method and specify that 
it is called immediately after the state store is built (in the topology 
builder). That would let state stores open the resources they will need 
during the `init` phase. `RocksDBStore`, for example, would open the store 
during this phase.
   
   Although I am not familiar with custom state stores, I would assume that 
they could also open a database connection during this phase. 
   
   Since the state store has already been opened during the `preInit` phase, 
the `init` method can concentrate solely on preparing it for use in 
StreamThread processing (for example, registering metrics in the process 
context). This matters because `init` runs during the rebalance process, so 
the less work it has to do there, the better.
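
A minimal sketch of that split, assuming a hypothetical `TwoPhaseStoreSketch` class with plain strings standing in for the state directory and task id. None of these names are Kafka Streams APIs; the `if (!open)` fallback in `init` only mirrors the guard shown in the diff above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a two-phase store lifecycle.
// This is NOT the RocksDBStore implementation and not a Kafka Streams API.
class TwoPhaseStoreSketch {
    private boolean open = false;
    private boolean initialized = false;
    // Records what each phase did, in order, so the lifecycle is observable.
    private final List<String> events = new ArrayList<>();

    // Phase 1: assumed to run once, right after the store is built.
    // Expensive resources (e.g. a DB handle) would be opened here.
    void preInit(final String stateDir) {
        if (open) {
            return; // already opened; calling again is a no-op
        }
        events.add("open:" + stateDir);
        open = true;
    }

    // Phase 2: assumed to run during the rebalance, so it stays lightweight.
    void init(final String taskId, final String stateDir) {
        if (!open) {
            // Fallback for callers that never invoked preInit.
            preInit(stateDir);
        }
        events.add("registerMetrics:" + taskId);
        initialized = true;
    }

    boolean isOpen() {
        return open;
    }

    boolean isInitialized() {
        return initialized;
    }

    List<String> events() {
        return events;
    }
}
```

With this shape, a caller that runs `preInit` first pays the opening cost up front, and `init` only records the metrics registration. A caller that skips `preInit` still ends up with an open store, at the price of opening it inside `init`.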
   
   With a clearer lifecycle, state store authors would find it easier to 
implement or evolve their custom stores.



