platinumhamburg commented on PR #1946: URL: https://github.com/apache/fluss/pull/1946#issuecomment-3506180702
> @platinumhamburg This is 🔥 I’m curious though: doesn’t the aggregation merge engine require some kind of transaction mechanism to be introduced first? If a job fails and restarts, does it ensure deterministic results?

Thanks @polyzos, your concern is valid. Currently, the KV engine does not have distributed transaction capabilities. A KV transaction commit relies solely on the local High Watermark, and even its idempotency mechanism is based only on the LogTablet's idempotency manager. Therefore, the KV engine currently provides only at-least-once semantics and cannot provide exactly-once semantics.

This is not an issue introduced by the Aggregation Merge Engine itself, but aggregation operations are particularly sensitive to this limitation, because a write that is replayed after a restart is merged into the stored value a second time.

After careful evaluation, I agree that the reliability of the Aggregation Merge Engine does depend on distributed transactions. However, I still believe it can be treated as a relatively independent functional unit. Distributed transaction support for the KV engine is a broader topic that probably deserves a separate discussion or issue to address comprehensively.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
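To illustrate why aggregation is more sensitive to at-least-once delivery than a plain upsert, here is a minimal toy sketch. It does not use any Fluss API: the in-memory dict standing in for the KV store, and the names `apply_batch`, `sum_merge`, `last_value_merge`, and `run_with_replay`, are all hypothetical and exist only for this illustration. It assumes that a job restart causes the same batch to be written twice.

```python
from typing import Callable, Dict, List, Tuple

Record = Tuple[str, int]          # (primary key, value); hypothetical record shape
Merge = Callable[[int, int], int]  # (stored value, incoming value) -> new stored value


def apply_batch(state: Dict[str, int], batch: List[Record], merge: Merge) -> Dict[str, int]:
    """Merge each record into the toy KV state using the given merge function."""
    for key, value in batch:
        state[key] = merge(state[key], value) if key in state else value
    return state


def sum_merge(old: int, new: int) -> int:
    """Aggregating merge: not idempotent, a replayed record is added again."""
    return old + new


def last_value_merge(old: int, new: int) -> int:
    """Upsert-style merge: replaying the same batch leaves the result unchanged."""
    return new


def run_with_replay(batch: List[Record], merge: Merge) -> Dict[str, int]:
    """Simulate at-least-once delivery: the batch is applied, then applied again
    after a (simulated) failure and restart."""
    state: Dict[str, int] = {}
    apply_batch(state, batch, merge)
    apply_batch(state, batch, merge)  # replay after restart
    return state


batch = [("user-1", 5), ("user-1", 3)]

once_sum = apply_batch({}, batch, sum_merge)
replayed_sum = run_with_replay(batch, sum_merge)

once_last = apply_batch({}, batch, last_value_merge)
replayed_last = run_with_replay(batch, last_value_merge)

print("sum, applied once:   ", once_sum)
print("sum, with replay:    ", replayed_sum)
print("last value, once:    ", once_last)
print("last value, replayed:", replayed_last)
```

In this sketch the last-value result is the same with or without the replay, while the sum is double-counted, which is the kind of non-determinism the question above is about.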
