wombatu-kun commented on issue #15844:
URL: https://github.com/apache/iceberg/issues/15844#issuecomment-4552327124

   The control topic carries only commit-coordination events, not your table's 
row data. On each commit cycle (`iceberg.control.commit.interval-ms`, default 5 
minutes) the coordinator and workers exchange a round of events on it: the 
coordinator sends `StartCommit`, each worker replies with `DataWritten` (which 
carries the *metadata* of the data/delete files it wrote to object storage — 
file paths, record counts, column stats — not the rows themselves) and 
`DataComplete`, and the coordinator finishes with `CommitToTable` and 
`CommitComplete`. Once a commit is done those events are never read again: the 
durable commit position is stored in the Iceberg table snapshot (as a snapshot 
property), and on restart the coordinator resumes from the control-topic offset 
it last committed — so nothing older than the last completed commit is ever 
needed.
   
   The connector doesn't manage the control topic's retention itself. When the 
topic is auto-created (via `auto.create.topics.enable`) it inherits your 
broker's default topic settings — i.e. `cleanup.policy=delete` with the 
broker's default `retention.ms`. If that default is very large (or `-1` for 
infinite), the topic simply keeps every coordination event indefinitely, which 
is the growth you're seeing. With 5 sinks pointing at the same control topic, 
all of their event rounds land on that one topic, so it fills up that much 
faster. (Sharing a single control topic is fine — each task only acts on events 
tagged with its own group id — it just multiplies the volume on that one topic.)
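   
   As a rough illustration of how sharing multiplies the volume, the sketch 
below counts events per day on the shared topic. The task count per sink 
(`workers_per_sink=4`) is an assumption picked for illustration, as is the 
simplification that every worker sends exactly one `DataWritten` and one 
`DataComplete` per cycle; real counts vary with how many tables each task 
writes to.
   
   ```bash
   # Back-of-the-envelope estimate, not a measurement.
   # Every value below is an assumption chosen for illustration.
   sinks=5                     # connectors sharing the control topic
   workers_per_sink=4          # hypothetical task count per connector
   commit_interval_ms=300000   # iceberg.control.commit.interval-ms (default, 5 min)

   # Number of commit cycles each sink runs in 24 hours.
   cycles_per_day=$(( 86400000 / commit_interval_ms ))

   # StartCommit + (DataWritten + DataComplete) per worker
   # + CommitToTable + CommitComplete
   events_per_cycle=$(( 1 + 2 * workers_per_sink + 2 ))

   events_per_day=$(( sinks * cycles_per_day * events_per_cycle ))

   echo "cycles/day per sink: ${cycles_per_day}"
   echo "events/cycle per sink: ${events_per_cycle}"
   echo "events/day on the shared topic: ${events_per_day}"
   ```
   
   Each event is small, but with no retention limit that count only ever 
accumulates.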
   
   The fix is to put a finite retention on the control topic. It only needs to 
comfortably exceed your commit interval plus the commit timeout 
(`iceberg.control.commit.timeout-ms`, default 30 seconds), with some margin for 
a coordinator restart. For a 5-minute commit interval, an hour or a few hours 
is plenty.
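   
   If you'd rather derive the number than pick one, here is a minimal sizing 
sketch. It is not something the connector does for you: the `safety_factor` of 
10 and the round-up to a whole hour are arbitrary assumptions, and the 30-second 
value for `iceberg.control.commit.timeout-ms` is assumed to be the default, so 
substitute your own settings.
   
   ```bash
   # Hypothetical sizing helper; not part of the connector or of Kafka.
   commit_interval_ms=300000   # iceberg.control.commit.interval-ms (default, 5 min)
   commit_timeout_ms=30000     # iceberg.control.commit.timeout-ms (assumed default, 30 s)
   safety_factor=10            # assumption: generous margin for restarts and slow cycles

   # Lower bound: one full commit cycle plus its timeout.
   min_needed_ms=$(( commit_interval_ms + commit_timeout_ms ))

   # Apply the margin, then round up to a whole number of hours.
   hour_ms=3600000
   padded_ms=$(( min_needed_ms * safety_factor ))
   retention_ms=$(( (padded_ms + hour_ms - 1) / hour_ms * hour_ms ))

   echo "retention.ms=${retention_ms}"
   ```
   
   With the default values this prints `retention.ms=3600000`, i.e. 1 hour.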
   
   Check the topic's current effective retention:
   
   ```bash
   bin/kafka-configs.sh \
     --command-config command-config.props \
     --bootstrap-server "${CONNECT_BOOTSTRAP_SERVERS}" \
     --describe \
     --entity-type topics \
     --entity-name control-iceberg
   ```
   
   Set a bounded retention on the existing topic (1 hour shown here):
   
   ```bash
   bin/kafka-configs.sh \
     --command-config command-config.props \
     --bootstrap-server "${CONNECT_BOOTSTRAP_SERVERS}" \
     --alter \
     --entity-type topics \
     --entity-name control-iceberg \
     --add-config retention.ms=3600000
   ```
   
   Or create it up front with retention instead of relying on auto-creation:
   
   ```bash
   bin/kafka-topics.sh \
     --command-config command-config.props \
     --bootstrap-server "${CONNECT_BOOTSTRAP_SERVERS}" \
     --create \
     --topic control-iceberg \
     --partitions 1 \
     --config retention.ms=3600000
   ```
   
   `retention.ms=3600000` is 1 hour. Raise it if you want more headroom; 
whatever value you pick, keep it well above `iceberg.control.commit.interval-ms` 
(default `300000` = 5 min). After this change, Kafka ages out old coordination 
records and the topic size plateaus instead of growing without bound.
   
   I've opened a PR to document all of this in the connector docs: 
https://github.com/apache/iceberg/pull/16576
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
