wombatu-kun commented on issue #15844:
URL: https://github.com/apache/iceberg/issues/15844#issuecomment-4552327124
The control topic carries only commit-coordination events, not your table's
row data. On each commit cycle (`iceberg.control.commit.interval-ms`, default 5
minutes) the coordinator and workers exchange a round of events on it: the
coordinator sends `StartCommit`, each worker replies with `DataWritten` (which
carries the *metadata* of the data/delete files it wrote to object storage —
file paths, record counts, column stats — not the rows themselves) and
`DataComplete`, and the coordinator finishes with `CommitToTable` and
`CommitComplete`. Once a commit is done those events are never read again: the
durable commit position is stored in the Iceberg table snapshot (as a snapshot
property), and on restart the coordinator resumes from the control-topic offset
it last committed — so nothing older than the last completed commit is ever
needed.
The connector doesn't manage the control topic's retention itself. When the
topic is auto-created (via `auto.create.topics.enable`) it inherits your
broker's default topic settings — i.e. `cleanup.policy=delete` with the
broker's default `retention.ms`. If that default is very large (or `-1` for
infinite), the topic simply keeps every coordination event indefinitely, which
is the growth you're seeing. With 5 sinks pointing at the same control topic,
all of their event rounds land on that one topic, so it fills up that much
faster. (Sharing a single control topic is fine — each task only acts on events
tagged with its own group id — it just multiplies the volume on that one topic.)
The fix is to put a finite retention on the control topic. It only needs to
comfortably exceed your commit interval and commit timeout
(`iceberg.control.commit.timeout-ms`), with some margin for a coordinator
restart. For a 5-minute commit interval, one to a few hours is plenty.
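As a rough sketch of that sizing rule, the helper below derives a retention value from the commit interval and commit timeout. The function name `control_topic_retention_ms`, the safety factor of 4, and the one-hour floor are illustrative assumptions, not connector settings, and the 30-second timeout in the usage line is only an example input.

```bash
# Hypothetical helper: suggest a retention.ms for the control topic.
# Args: commit interval (ms), commit timeout (ms), optional safety factor (default 4).
control_topic_retention_ms() {
  ctr_interval_ms=$1
  ctr_timeout_ms=$2
  ctr_factor=${3:-4}
  ctr_floor_ms=3600000   # assumed minimum of 1 hour
  # Cover several full commit cycles, including a timed-out one.
  ctr_candidate=$(( (ctr_interval_ms + ctr_timeout_ms) * ctr_factor ))
  if [ "$ctr_candidate" -lt "$ctr_floor_ms" ]; then
    ctr_candidate=$ctr_floor_ms
  fi
  echo "$ctr_candidate"
}

# Example: 5-minute interval, 30-second timeout (example inputs only)
control_topic_retention_ms 300000 30000
```

With short intervals the floor dominates, so the result stays at one hour; longer commit intervals scale the suggestion up proportionally.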
Check the topic's current effective retention:
```bash
bin/kafka-configs.sh \
--command-config command-config.props \
--bootstrap-server ${CONNECT_BOOTSTRAP_SERVERS} \
--describe \
--entity-type topics \
--entity-name control-iceberg
```
Set a bounded retention on the existing topic (1 hour shown here):
```bash
bin/kafka-configs.sh \
--command-config command-config.props \
--bootstrap-server ${CONNECT_BOOTSTRAP_SERVERS} \
--alter \
--entity-type topics \
--entity-name control-iceberg \
--add-config retention.ms=3600000
```
Or create it up front with retention instead of relying on auto-creation:
```bash
bin/kafka-topics.sh \
--command-config command-config.props \
--bootstrap-server ${CONNECT_BOOTSTRAP_SERVERS} \
--create \
--topic control-iceberg \
--partitions 1 \
--config retention.ms=3600000
```
`retention.ms=3600000` is 1 hour. Raise it if you want more headroom; whatever
value you pick, keep it well above `iceberg.control.commit.interval-ms` (default
`300000` = 5 min). After this, Kafka will age out old coordination records and
the topic size will plateau instead of growing without bound.
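To confirm the plateau, one option is to sample the topic's on-disk size before and after the change. The sketch below assumes `bin/kafka-log-dirs.sh` is present in your Kafka distribution and that its output contains compact JSON with a `"size":<bytes>` entry per partition replica. The helper name `sum_partition_sizes` is hypothetical, and because it counts every replica, the result is the total footprint across brokers.

```bash
# Hypothetical helper: sum every "size":<bytes> field found on stdin.
sum_partition_sizes() {
  grep -o '"size":[0-9]*' \
    | cut -d: -f2 \
    | awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Assumed usage against a live cluster (same connection settings as above):
# bin/kafka-log-dirs.sh \
#   --command-config command-config.props \
#   --bootstrap-server "${CONNECT_BOOTSTRAP_SERVERS}" \
#   --describe \
#   --topic-list control-iceberg | sum_partition_sizes
```

Running it a few times, spaced further apart than the retention period, should show the byte count levelling off rather than climbing.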
I've opened a PR to document all of this in the connector docs:
https://github.com/apache/iceberg/pull/16576