wuchong opened a new issue, #2220: URL: https://github.com/apache/fluss/issues/2220
### Search before asking

- [x] I searched in the [issues](https://github.com/apache/fluss/issues) and found nothing similar.

### Fluss version

0.8.0 (latest release)

### Please describe the bug 🐞

PR [#1375](https://github.com/apache/fluss/pull/1375) changes the KV snapshot commit to ZooKeeper to be asynchronous by moving the call to `completedSnapshotStore.add(completedSnapshot)` into an `ioExecutor`. However, `CompletedSnapshotStore` is **not thread-safe**, and this change introduces a **concurrency issue**, as the store may now be accessed simultaneously from different I/O executor threads.

We should make `CompletedSnapshotStore` thread-safe. Otherwise, race conditions or data corruption may occur.

See `CoordinatorEventProcessor#tryProcessCommitKvSnapshot`:

```java
// commit the kv snapshot asynchronously
ioExecutor.execute(
        () -> {
            try {
                TableBucket tb = event.getTableBucket();
                CompletedSnapshot completedSnapshot =
                        event.getAddCompletedSnapshotData().getCompletedSnapshot();
                // add completed snapshot
                CompletedSnapshotStore completedSnapshotStore =
                        completedSnapshotStoreManager.getOrCreateCompletedSnapshotStore(tb);
                // this involves IO operation (ZK), so we do it in ioExecutor
                completedSnapshotStore.add(completedSnapshot);
                coordinatorEventManager.put(
                        new NotifyKvSnapshotOffsetEvent(
                                tb, completedSnapshot.getLogOffset()));
                callback.complete(new CommitKvSnapshotResponse());
            } catch (Exception e) {
                callback.completeExceptionally(e);
            }
        });
```

### Solution

_No response_

### Are you willing to submit a PR?

- [ ] I'm willing to submit a PR!
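
For illustration only, here is a minimal sketch of one possible way to make a store safe to call from several I/O executor threads: guard every read and write of its internal state with a single lock. `GuardedSnapshotStore`, `addConcurrently`, and the retention limit are hypothetical stand-ins invented for this example. They are not the real `CompletedSnapshotStore` API, and this is not necessarily how the issue should be (or was) fixed in Fluss.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class GuardedSnapshotStoreSketch {

    /** Hypothetical stand-in for a per-bucket snapshot store; NOT the Fluss class. */
    static final class GuardedSnapshotStore {
        private final ReentrantLock lock = new ReentrantLock();
        private final ArrayDeque<Long> retained = new ArrayDeque<>();
        private final int maxRetained;

        GuardedSnapshotStore(int maxRetained) {
            this.maxRetained = maxRetained;
        }

        /** Adds a snapshot id and evicts the oldest ones, all under one lock. */
        void add(long snapshotId) {
            lock.lock();
            try {
                retained.addLast(snapshotId);
                while (retained.size() > maxRetained) {
                    retained.removeFirst();
                }
            } finally {
                lock.unlock();
            }
        }

        /** Returns a defensive copy so callers never see the deque mid-update. */
        List<Long> retainedSnapshots() {
            lock.lock();
            try {
                return new ArrayList<>(retained);
            } finally {
                lock.unlock();
            }
        }
    }

    /** Submits {@code adds} concurrent add() calls, mimicking commits run on an ioExecutor. */
    static List<Long> addConcurrently(int threads, int adds, int maxRetained) {
        GuardedSnapshotStore store = new GuardedSnapshotStore(maxRetained);
        ExecutorService ioExecutor = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < adds; i++) {
            final long id = i;
            ioExecutor.execute(() -> store.add(id));
        }
        ioExecutor.shutdown();
        try {
            if (!ioExecutor.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("executor did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return store.retainedSnapshots();
    }

    public static void main(String[] args) {
        // With the lock in place, exactly maxRetained entries survive.
        System.out.println(addConcurrently(4, 1000, 3).size());
    }
}
```

Note that a lock only serializes mutations of the store. It does not by itself guarantee that snapshots for the same bucket are added in the order their commit events were processed, so ordering would need to be considered separately.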
