Hi Apache Kafka Community,

I'm working on a problem involving cross-cluster data flow with
exactly-once guarantees and would love to get your thoughts or experiences
on this.
The Setup:

   -

   I have a set of *consumers reading from a source topic* located on Kafka
   cluster *K1*.
   -

   There's a *producer writing to a target topic* on a *different Kafka
   cluster*, let's call it *K2*.
   -

   My goal is to *only commit offsets on K1* *after* a successful write to
   the topic on *K2*.

The Challenge:

If the producer gets stuck in a transaction (e.g., due to a GC pause), and
the consumer is removed from the group (e.g., no heartbeats during the
session timeout), then the *transaction coordinator on K2 has no context to
determine the validity of the transaction*.

If both the source and target topics were on the *same cluster*, I could
have used sendOffsetsToTransaction() with ConsumerGroupMetadata. This
allows the broker to gate the commit based on the consumer group's
generation ID. But since the clusters are different, this isn’t possible.
Workarounds Considered:

   1.

   *Static partition-to-consumer assignment*:
   -

      Assign each consumer to a fixed set of partitions to avoid partition
      migration and generation ID mismatch.
      -

      Downside: Doesn't scale well and is brittle in dynamic environments.
      2.

   *One producer per partition with deterministic transactional.id
   <http://transactional.id>* (e.g., transactional.id = topic-partition):
   -

      If a partition is reassigned, a new producer with the same
      transactional.id starts, causing fencing of the old one.
      -

      Downside: The number of producers scales linearly with partitions,
      which isn’t ideal in large-scale systems.

The Ask:

Both approaches are limiting in terms of *scalability and resilience*. Has
anyone faced this issue or found an elegant solution for achieving
*transactional
guarantees across Kafka clusters*?

Would love to hear if:

   -

   There are *patterns or tools* in the ecosystem that help with this.
   -

   Anyone has explored *outbox patterns*, *exactly-once delivery bridges*,
   or *custom fencing mechanisms* to solve similar problems.
   -

   There's *any roadmap* for Kafka itself to support cross-cluster
   sendOffsetsToTransaction or a similar API.

Any insights, suggestions, or shared pain would be incredibly helpful!

Thanks in advance,
Pritam Kumar

Reply via email to