[
https://issues.apache.org/jira/browse/TIKA-4725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18080018#comment-18080018
]
Tim Allison commented on TIKA-4725:
-----------------------------------
Claude's proposal for item 2 (versioning Docker releases outside of a Tika release):
# Adopt `<version>` + `<version>-rN` Docker tagging
## Problem
Today we tag `apache/tika:<tika>.<docker-build-number>` (e.g.
`3.3.0.0`). The trailing `.N` is a Tika-only convention; users naturally
type `apache/tika:<tika-version>` and miss it. In May 2026 a CI workflow
in tika-main pushed `apache/tika:4.0.0-alpha-1` while the manually
published image lives at `apache/tika:4.0.0-alpha-1.0`. Both tags
currently exist on Docker Hub, pointing at different images, one of
which is broken.
We need a scheme that's intuitive at the headline tag, supports
Docker-only rebuilds (OS bumps, CVE fixes) without bumping the Tika
version, and lets users pull a specific older OS by tag alone.
## Proposal
Each push emits two tags per image variant (default and `-full`); both
tags for a variant point at the same manifest digest:
- **Mutable headline:** `apache/tika:<tika-version>` and
`apache/tika:<tika-version>-full`. Retagged in place on rebuilds.
- **Immutable build tag:** `apache/tika:<tika-version>-r<N>` and
`apache/tika:<tika-version>-r<N>-full`. `-r0` is the initial release;
`N` increments for each subsequent Docker-only rebuild. Never reassigned.
`-rN` follows the Alpine apk and Bitnami image conventions.
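The tag pair for one push could be computed along these lines. This is a minimal Python sketch for illustration only, not part of `docker-tool.sh`; the function name `tags_for_push` and its parameters are hypothetical.

```python
def tags_for_push(tika_version: str, rebuild: int, full: bool = False) -> dict[str, str]:
    """Return the mutable headline tag and the immutable build tag for one image variant."""
    if rebuild < 0:
        raise ValueError("rebuild number must be >= 0")
    repo = "apache/tika"
    suffix = "-full" if full else ""
    return {
        # Retagged in place on every Docker-only rebuild.
        "headline": f"{repo}:{tika_version}{suffix}",
        # Created once per rebuild and never reassigned.
        "immutable": f"{repo}:{tika_version}-r{rebuild}{suffix}",
    }
```

Calling it once with `full=False` and once with `full=True` yields the four tags of a single push.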
### Example
Initial release: push tags `4.0.0`, `4.0.0-r0`, `4.0.0-full`,
`4.0.0-r0-full` (two digests in total: `4.0.0` and `4.0.0-r0` share one,
`4.0.0-full` and `4.0.0-r0-full` share the other).
CVE rebuild a month later: push `4.0.0-r1`, `4.0.0-r1-full` (new
digests). Retag `4.0.0` and `4.0.0-full` to those new digests. `4.0.0-r0`
remains reachable for anyone wanting the original OS.
Bit-level immutability is always available via a `@sha256:…` digest pin.
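The release-then-rebuild sequence above can be simulated with an in-memory tag table. This is only a sketch: `TagStore` is a hypothetical stand-in for a registry's tag-to-digest mapping, the digests are made-up placeholders, and a real registry would not enforce the immutability rule by itself.

```python
import re

# Matches the immutable forms "<version>-rN" and "<version>-rN-full".
IMMUTABLE = re.compile(r"-r\d+(-full)?$")


class TagStore:
    """In-memory stand-in for a registry's tag -> digest mapping."""

    def __init__(self):
        self.tags = {}

    def push(self, tag, digest):
        # An immutable -rN tag may be created once and never reassigned.
        current = self.tags.get(tag)
        if IMMUTABLE.search(tag) and current is not None and current != digest:
            raise ValueError(f"{tag} is immutable and already points at {current}")
        self.tags[tag] = digest


store = TagStore()

# Initial release (digests are placeholders, not real image digests).
for tag in ("4.0.0", "4.0.0-r0"):
    store.push(tag, "sha256:base-r0")
for tag in ("4.0.0-full", "4.0.0-r0-full"):
    store.push(tag, "sha256:full-r0")

# CVE rebuild: new immutable tags; headline tags move to the new digests.
for tag in ("4.0.0", "4.0.0-r1"):
    store.push(tag, "sha256:base-r1")
for tag in ("4.0.0-full", "4.0.0-r1-full"):
    store.push(tag, "sha256:full-r1")
```

After the rebuild, `4.0.0` resolves to the `-r1` digest while `4.0.0-r0` still resolves to the original one.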
## Migration
- Past releases stay as they are; no renumbering.
- The new scheme starts with 4.0.0 and is applied retroactively to 4.0.0-alpha-1.
- One-line note in `CHANGES.md` and a short Available Tags blurb in
`README.md`.
- `docker-tool.sh publish` adds a second `--tag` per `buildx build` for
the `-rN` form.
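As a rough illustration of the last bullet, the command line with one `--tag` per tag could be assembled as below. This Python sketch does not reflect how `docker-tool.sh` is actually written; `buildx_command` is a hypothetical helper, and the use of `--push` and a `.` build context are assumptions.

```python
def buildx_command(tags: list[str], context: str = ".") -> list[str]:
    """Assemble a `docker buildx build` argv with one --tag per tag."""
    cmd = ["docker", "buildx", "build"]
    for tag in tags:
        cmd += ["--tag", tag]
    # Assumption: the image is pushed directly from the build step.
    cmd += ["--push", context]
    return cmd


argv = buildx_command(["apache/tika:4.0.0", "apache/tika:4.0.0-r0"])
print(" ".join(argv))
```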
## Open questions
1. **Is `<N>` a manual CLI arg, or derived from
`git tag --list "<tika-version>-r*"`?**
2. **Sliding aliases (`4`, `4.0`)?** Independent decision; can adopt
later.
3. **Existing colliding tags:** retag `apache/tika:4.0.0-alpha-1` to the
same digest as `apache/tika:4.0.0-alpha-1.0`, freeze the `.0` tag
forever, and discontinue the `.N` scheme.
4. **Disable `docker-release.yml` in tika-main**, the source of the
May 4 collision, so tika-docker is the single publishing pipeline.
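For open question 1, if `<N>` were derived rather than passed on the command line, the next value could be computed from the existing tag names roughly as follows. This is a Python sketch under the assumption that tags are named exactly `<tika-version>-r<N>`; `next_rebuild_number` is a hypothetical helper, and its input stands in for the lines printed by `git tag --list`.

```python
import re


def next_rebuild_number(tika_version: str, existing_tags: list[str]) -> int:
    """Return 0 if no -rN tag exists for this version, else the highest N plus one."""
    pattern = re.compile(re.escape(tika_version) + r"-r(\d+)")
    numbers = []
    for tag in existing_tags:
        match = pattern.fullmatch(tag.strip())
        if match:
            numbers.append(int(match.group(1)))
    return max(numbers) + 1 if numbers else 0
```

The exact-match regex matters because a glob such as `4.0.0-r*` would also match a release-candidate tag like `4.0.0-rc1`, which must not be counted as a rebuild.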
> Tweaks to github publishing
> ---------------------------
>
> Key: TIKA-4725
> URL: https://issues.apache.org/jira/browse/TIKA-4725
> Project: Tika
> Issue Type: Improvement
> Reporter: Tim Allison
> Priority: Minor
>
> Many, many thanks to [~ndipiazza] , we have automated pushes to docker hub.
> During the 4.0.0-alpha-1 release, I noticed two things we might want to clean
> up.
> # The 4.0.0-alpha-1 image was released May 4 at 7:45pm (not sure which tz,
> guessing utc?), which was probably during one of my early attempts to make
> the release. The vote email didn't go out until May 5 at 6:41am (ET). The
> push shouldn't happen until after a successful vote, no?
> # How do we handle versioning of docker releases outside of a Tika release?
> For example, if we get a request to bump the base image or an OS dependency?
> We had a homegrown versioning system to tack another number: 3.3.0.0 – was
> the first release with Tika 3.3.0. If we had to bump the underlying os and
> republish with Tika 3.3.0, that would be bumped to 3.3.0.1.