Samrat002 commented on PR #6: URL: https://github.com/apache/flink-connector-redis-streams/pull/6#issuecomment-4413202218
https://github.com/apache/flink-connector-redis-streams/pull/6#issuecomment-4398129085

> Is it really the job parallelism, or the source parallelism?

It's the source parallelism, not the job parallelism. Will fix that wording in the PR description.

> Why is ACK-ing needed in the first place? My AI tells me ACKing is only needed when using consumer groups. But why do we need consumer groups for Flink?

Yes, this is true. XACK is consumer-group-only. We use consumer groups for the reasons defined below. In short: server-side in-flight tracking, finer-grained recovery, native operational visibility, and multi-consumer composition. None of these is achievable with XREAD.

> Can't we do the assignment of Flink splits to source instances ourselves, persist the last consumed message ID in Flink state, and on a failure we recover from the last message ID per partition?

Yes, that's exactly the XREAD model, and it works. The trade-offs of choosing this approach are defined below. It's strictly simpler in the connector code, but it loses the PEL-bounded recovery window, native operational visibility, multi-consumer composition, and within-job XAUTOCLAIM failover. For a Redis Streams connector specifically (as opposed to a generic log connector), I think those losses outweigh the simplification.

I'll update the PR description to make this reasoning explicit upfront.
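
To make the difference between the two recovery models concrete, here is a toy in-memory sketch. It is not connector code and does not talk to Redis: `ToyStream`, `read_after`, `read_group`, `ack`, and `claim_pending` are hypothetical stand-ins that only loosely mirror XREAD, XREADGROUP with `>`, XACK, and XAUTOCLAIM (the real XAUTOCLAIM also takes a min-idle-time and a cursor, which this sketch ignores). The subtask names and the checkpointed ID are likewise made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def id_key(entry_id: str) -> tuple[int, int]:
    """Turn a Redis-style 'ms-seq' entry ID into a sortable tuple."""
    ms, seq = entry_id.split("-")
    return int(ms), int(seq)


@dataclass
class ToyStream:
    """Hypothetical stand-in for one stream plus one consumer group."""

    entries: list[tuple[str, str]] = field(default_factory=list)  # (id, payload)
    last_delivered: str = "0-0"  # group cursor, advanced by read_group
    pending: dict[str, str] = field(default_factory=dict)  # toy PEL: id -> consumer

    def add(self, entry_id: str, payload: str) -> None:
        # Entries are assumed to be appended in increasing ID order.
        self.entries.append((entry_id, payload))

    def read_after(self, last_id: str, count: int) -> list[tuple[str, str]]:
        """XREAD-like: entries newer than last_id; no server-side state changes."""
        newer = [e for e in self.entries if id_key(e[0]) > id_key(last_id)]
        return newer[:count]

    def read_group(self, consumer: str, count: int) -> list[tuple[str, str]]:
        """XREADGROUP-like ('>'): deliver new entries and record them as pending."""
        batch = self.read_after(self.last_delivered, count)
        for entry_id, _ in batch:
            self.pending[entry_id] = consumer
        if batch:
            self.last_delivered = batch[-1][0]
        return batch

    def ack(self, *entry_ids: str) -> int:
        """XACK-like: drop entries from the pending list; return how many were dropped."""
        return sum(1 for i in entry_ids if self.pending.pop(i, None) is not None)

    def claim_pending(self, new_consumer: str) -> list[tuple[str, str]]:
        """Simplified XAUTOCLAIM-like: hand every pending entry to new_consumer."""
        claimed = [(i, p) for i, p in self.entries if i in self.pending]
        for entry_id, _ in claimed:
            self.pending[entry_id] = new_consumer
        return claimed


def demo() -> tuple[list[str], list[str]]:
    stream = ToyStream()
    for n in range(1, 6):
        stream.add(f"{n}-0", f"event-{n}")

    # XREAD model: the last consumed ID lives only in Flink state (assumed "2-0").
    # Recovery re-reads everything after that ID.
    checkpointed_id = "2-0"
    xread_replay = [i for i, _ in stream.read_after(checkpointed_id, count=10)]

    # Consumer-group model: the server tracks what is in flight.
    stream.read_group("subtask-0", count=4)  # delivers 1-0 .. 4-0
    stream.ack("1-0", "2-0")                 # acknowledged once the checkpoint completes
    # subtask-0 fails; another subtask takes over only the unacknowledged entries.
    group_replay = [i for i, _ in stream.claim_pending("subtask-1")]
    return xread_replay, group_replay
```

In this toy run, the XREAD-style recovery replays every entry after the checkpointed ID, while the group-style recovery replays only the entries that were delivered but never acknowledged, and the pending list itself is what an operator could inspect to see in-flight work.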
