kevinjqliu opened a new pull request, #15340:
URL: https://github.com/apache/iceberg/pull/15340
Follow-up to #15124. I noticed an issue when rerunning the quickstart Docker
containers (`docker compose -f
docker/iceberg-flink-quickstart/docker-compose.yml up -d --build`).
## Repro
To reproduce, start the quickstart containers with the command above, then run
the Flink SQL statements below in the SQL client (`docker exec -it jobmanager
./bin/sql-client.sh`).
Rerun the same `docker compose` command and the same Flink SQL statements;
without this PR, `CREATE TABLE` fails on the second run.
Flink SQL:
```
CREATE CATALOG iceberg WITH (
  'type' = 'iceberg',
  'catalog-impl' = 'org.apache.iceberg.rest.RESTCatalog',
  'uri' = 'http://iceberg-rest:8181',
  'io-impl' = 'org.apache.iceberg.aws.s3.S3FileIO',
  's3.endpoint' = 'http://minio:9000'
);

CREATE DATABASE IF NOT EXISTS `iceberg`.demo;

CREATE TABLE IF NOT EXISTS `iceberg`.`demo`.sample (
  id BIGINT COMMENT 'unique id',
  data STRING COMMENT 'payload',
  ts TIMESTAMP(3) COMMENT 'event time'
);

INSERT INTO `iceberg`.`demo`.sample VALUES
  (1, 'alpha', TIMESTAMP '2026-02-16 10:00:00'),
  (2, 'bravo', TIMESTAMP '2026-02-16 10:01:00'),
  (3, 'charlie', TIMESTAMP '2026-02-16 10:02:00');

SELECT * FROM `iceberg`.`demo`.sample;
```
## Summary
Fix the Flink quickstart `docker-compose.yml` so that `docker compose up -d
--build` is safe to rerun without breaking the Iceberg REST catalog.
## Problem
The `create-bucket` init container ran `mc rm -r --force minio/warehouse` every
time it started, wiping all S3 data (metadata JSON, Parquet files, Avro
manifests). The Iceberg REST catalog's SQLite database, however, persisted
inside its still-running container, so the catalog kept stale references to the
deleted metadata files. Any subsequent table operation then failed with:
```
NotFoundException: Location does not exist:
s3://warehouse/demo/sample/metadata/00001-....metadata.json
```
## Changes
- **Idempotent bucket creation**: Replace destructive `mc rm -r --force` +
`mc mb` with `mc mb --ignore-existing` to create the bucket only if it doesn't
exist
- **Prevent re-execution on rerun**: Add `tail -f /dev/null` to keep the
`create-bucket` container alive, so `docker compose up` treats it as already
running
- **Healthcheck-gated startup**: Add a healthcheck (`mc ls minio/warehouse`)
to `create-bucket` and update `iceberg-rest` to depend on `service_healthy`,
ensuring the bucket is verified to exist before the catalog starts
- **Fix deprecated CLI**: Replace `mc policy set` with `mc anonymous set` to
avoid deprecation warnings
- **Remove redundant retry loop**: The `until` loop in `create-bucket` is no
longer needed because the service now waits for `minio` through `depends_on`
with `condition: service_healthy`
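
For readers who want to picture the result, the changes listed above could look roughly like the fragment below. This is a minimal sketch and not the actual diff: the `minio/mc` image, the `admin`/`password` credentials, the `public` access level, and the healthcheck timings are assumptions chosen for illustration, and unrelated keys on each service are omitted.

```yaml
services:
  create-bucket:
    image: minio/mc                      # assumed image name
    depends_on:
      minio:
        condition: service_healthy       # replaces the old `until` retry loop
    entrypoint: >
      /bin/sh -c "
      mc alias set minio http://minio:9000 admin password &&
      mc mb --ignore-existing minio/warehouse &&
      mc anonymous set public minio/warehouse &&
      tail -f /dev/null
      "
    # The first three commands are idempotent; `tail -f /dev/null` keeps the
    # container alive so a second `docker compose up` does not re-run it.
    healthcheck:
      test: ["CMD", "mc", "ls", "minio/warehouse"]   # passes once the bucket exists
      interval: 2s                       # assumed timings
      timeout: 5s
      retries: 15

  iceberg-rest:
    # image, environment, and ports are unchanged and omitted from this sketch
    depends_on:
      create-bucket:
        condition: service_healthy       # start the catalog only after the bucket is verified
```

The point of the design is that nothing in the init path deletes data any more, so the S3 contents and the catalog's SQLite state stay consistent across reruns.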
## Behavior
| Command | Before | After |
|---|---|---|
| `docker compose up -d --build` (first) | Works | Works |
| `docker compose up -d --build` (rerun) | **Broken**: S3 wiped, catalog has stale refs | Works: no-op, state preserved |
| `docker compose down && docker compose up -d --build` | Works (fresh start) | Works (fresh start) |