Hi everyone,

Allow me to share a great milestone we recently reached: our WSL image [0] is now bit-for-bit reproducible [1]!

Here is a quick summary of the issues we faced and the fixes we applied:

-----

## Installing packages from an archived repo snapshot

To ensure that the exact same versions / releases of packages get installed in the image across builds, we defined an archived snapshot of our repositories (via our "Arch Linux Archive" service [2][3]) as the source from which packages are downloaded during the build. The date of the daily archived repo snapshot is derived from the version of the built image (which is date-based). The same date is also used as the `SOURCE_DATE_EPOCH` (SDE) timestamp (see the points below for SDE usage).
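
As a rough illustration, the date-based derivation could look like the sketch below. The `IMAGE_VERSION` value and its `YYYY.MM.DD` format are assumptions made for the example (the actual build script may use a different scheme), and so are the variable names. The URL layout is the one the Arch Linux Archive uses for its daily repo snapshots. The sketch relies on GNU `date`.

```shell
#!/bin/sh
set -eu

# Hypothetical date-based image version (assumed YYYY.MM.DD format).
IMAGE_VERSION="2025.04.01"

# 2025.04.01 -> 2025/04/01 (path of the daily snapshot in the archive)
snapshot_path="$(printf '%s' "$IMAGE_VERSION" | tr '.' '/')"
# 2025.04.01 -> 2025-04-01 (form understood by GNU date)
iso_date="$(printf '%s' "$IMAGE_VERSION" | tr '.' '-')"

# Midnight UTC of the snapshot day, in seconds since the Unix epoch.
SOURCE_DATE_EPOCH="$(date --utc --date="${iso_date} 00:00:00" +%s)"
export SOURCE_DATE_EPOCH

# Mirror entry pointing at the archived snapshot; $repo and $arch are kept
# literal because pacman expands them itself.
snapshot_server="https://archive.archlinux.org/repos/${snapshot_path}/\$repo/os/\$arch"

printf 'SOURCE_DATE_EPOCH=%s\n' "$SOURCE_DATE_EPOCH"
printf 'Server = %s\n' "$snapshot_server"
```

Deriving both values from a single date keeps the package set and the timestamps in sync: rebuilding a given image version always points at the same snapshot and the same SDE.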

## Normalize filesystem `mtimes` with SDE

With SDE set in our build script, we normalized `mtimes` of the rootFS against it as a post-build operation to avoid non-deterministic timestamps on that front:

```
find "$BUILDDIR" -exec touch --no-dereference --date="@$SOURCE_DATE_EPOCH" {} +
```
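
To double-check the result of this step, one possible approach is to list the distinct `mtimes` left in the tree and confirm that only SDE remains. The following is a minimal sketch run against a throwaway directory; the directory layout and the SDE value are made up for the example, and it assumes GNU `find` and `touch`.

```shell
#!/bin/sh
set -eu

# Throwaway stand-in for the real rootFS (hypothetical layout).
BUILDDIR="$(mktemp -d)"
SOURCE_DATE_EPOCH=1743465600  # arbitrary example value (2025-04-01 00:00:00 UTC)

mkdir -p "$BUILDDIR/etc"
echo "demo" > "$BUILDDIR/etc/hostname"
ln -s hostname "$BUILDDIR/etc/hostname.link"

# Same normalization as above.
find "$BUILDDIR" -exec touch --no-dereference --date="@$SOURCE_DATE_EPOCH" {} +

# Collect the distinct mtimes (whole seconds) of every entry, symlinks included.
distinct_mtimes="$(find "$BUILDDIR" -printf '%T@\n' | cut -d. -f1 | sort -u)"

if [ "$distinct_mtimes" = "$SOURCE_DATE_EPOCH" ]; then
    echo "all mtimes normalized"
else
    echo "unexpected mtimes found:" >&2
    echo "$distinct_mtimes" >&2
fi

rm -rf "$BUILDDIR"
```

The `--no-dereference` flag matters for the symlink: without it, `touch` would update the link's target instead of the link itself.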

## Get rid of Pacman logs during build (timestamps recording)

Pacman records each operation with its associated timestamp in its logfile (`/var/log/pacman.log`). Since we don't particularly need Pacman logs to be recorded during the image build, we redirected them to `/dev/null`:

```
pacman \
    [...]
    --logfile /dev/null \
    [...]
```

## Normalize packages' installation date in Pacman's local package DB

Pacman records each package's installation date, which results in non-deterministic timestamps in the metadata stored in the local package database. Fortunately, Jelle van der Waa recently added support in Pacman for honoring SDE on that front [4]. With the related commit included in our latest pacman package release, simply exporting SDE in our build script was enough to normalize packages' installation dates in Pacman's local package database.
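
One way to verify this is to scan the `%INSTALLDATE%` fields of the `desc` files in the local package database and flag any value that differs from SDE. The sketch below does so against a fake database containing a single made-up package; the `check_install_dates` helper and the package name are hypothetical and not part of the actual build script.

```shell
#!/bin/sh
set -eu

# Print every %INSTALLDATE% value that differs from the expected timestamp
# and return non-zero if any was found.
# $1: path to a pacman local DB directory, $2: expected timestamp
check_install_dates() {
    awk -v expected="$2" '
        prev == "%INSTALLDATE%" && $0 != expected { bad = 1; print FILENAME ": " $0 }
        { prev = $0 }
        END { exit bad }
    ' "$1"/*/desc
}

# Fake local DB with a single hypothetical package entry.
db="$(mktemp -d)"
trap 'rm -rf "$db"' EXIT
mkdir -p "$db/demo-pkg-1.0-1"
printf '%%NAME%%\ndemo-pkg\n\n%%INSTALLDATE%%\n1743465600\n' > "$db/demo-pkg-1.0-1/desc"

if check_install_dates "$db" 1743465600; then
    result="normalized"
else
    result="mismatch"
fi
echo "$result"
```

Against a real rootFS, the first argument would be the `var/lib/pacman/local` directory inside the build directory and the second one `$SOURCE_DATE_EPOCH`.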

## Delete Pacman keyring

Pacman's OpenPGP keys (generated with `pacman-key`) introduce non-deterministic data in the local Pacman keyring. Fortunately, WSL has a built-in "oobe" (Out Of the Box Experience) mechanism [5] that allows a script to be run automatically at the first boot of the image. We therefore took advantage of this mechanism to completely delete Pacman's keyring as a post-build operation and automatically recreate it at the first boot of the image via the "oobe" script (by running `pacman-key --init && pacman-key --populate archlinux` from it).
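
The two halves of this approach can be sketched as follows, against a throwaway directory standing in for the rootFS. The `/etc/oobe.sh` path and the script content are assumptions made for the example (see the merge request [7] for the actual implementation). `/etc/pacman.d/gnupg` is Pacman's default keyring location, and the `[oobe]` section with its `command` key follows the WSL documentation [5].

```shell
#!/bin/sh
set -eu

# Throwaway stand-in for the real rootFS (hypothetical layout).
BUILDDIR="$(mktemp -d)"
trap 'rm -rf "$BUILDDIR"' EXIT
mkdir -p "$BUILDDIR/etc/pacman.d/gnupg"
echo "non-deterministic key material" > "$BUILDDIR/etc/pacman.d/gnupg/pubring.gpg"

# Post-build step: drop the keyring generated during the build.
rm -rf "$BUILDDIR/etc/pacman.d/gnupg"

# First-boot script (hypothetical path and content) that recreates the keyring.
cat > "$BUILDDIR/etc/oobe.sh" <<'EOF'
#!/bin/sh
set -eu
pacman-key --init && pacman-key --populate archlinux
EOF
chmod +x "$BUILDDIR/etc/oobe.sh"

# WSL distribution config pointing the "oobe" mechanism at that script.
cat > "$BUILDDIR/etc/wsl-distribution.conf" <<'EOF'
[oobe]
command = /etc/oobe.sh
EOF
```

The image then ships without any key material, and each installation generates its own keyring the first time it is started.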

## Normalize tar's `mtimes` and ordering

Our rootFS for the WSL image is archived with `tar`, to which we added a few options to normalize `mtimes` (against SDE) as well as ordering, in order to avoid non-determinism on that front:

```
tar \
    [...]
    --mtime="@$SOURCE_DATE_EPOCH" \
    --clamp-mtime \
    --sort=name \
    [...]
```

## Delete tar's `atimes` / `ctimes`

Finally, we got rid of tar's `atimes` / `ctimes` to avoid non-deterministic timestamps in the archive's metadata:

```
tar \
    [...]
    --pax-option=delete=atime,delete=ctime \
    [...]
```
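
To see the combined effect of the `tar` options from the last two sections, the following sketch archives the same small tree twice, perturbing its timestamps in between, and compares the checksums of the two archives. The tree, the SDE value and the `make_archive` helper are made up for the example. `--format=posix` is spelled out here as an assumption, since `--pax-option` only applies to pax archives. GNU `tar` is required.

```shell
#!/bin/sh
set -eu

SOURCE_DATE_EPOCH=1743465600  # arbitrary example value (2025-04-01 00:00:00 UTC)
export SOURCE_DATE_EPOCH

# $1: directory to archive, $2: output tarball
make_archive() {
    tar \
        --create \
        --file "$2" \
        --directory "$1" \
        --format=posix \
        --mtime="@$SOURCE_DATE_EPOCH" \
        --clamp-mtime \
        --sort=name \
        --pax-option=delete=atime,delete=ctime \
        .
}

workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT
mkdir -p "$workdir/rootfs/etc"
echo "demo" > "$workdir/rootfs/etc/hostname"

make_archive "$workdir/rootfs" "$workdir/first.tar"

# Perturb the timestamps, as a second build at a later time would.
sleep 1
touch "$workdir/rootfs/etc/hostname"
cat "$workdir/rootfs/etc/hostname" > /dev/null

make_archive "$workdir/rootfs" "$workdir/second.tar"

first_sum="$(sha256sum < "$workdir/first.tar")"
second_sum="$(sha256sum < "$workdir/second.tar")"

if [ "$first_sum" = "$second_sum" ]; then
    echo "archives are identical"
else
    echo "archives differ" >&2
fi
```

Note that `--clamp-mtime` only rewrites `mtimes` that are newer than SDE, so files older than SDE keep their original timestamp.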

-----

For more details, you can take a look at the related merge requests, which respectively test the image reproducibility from a dedicated GitLab CI stage [6] and actually make the image reproducible (including all the above fixes) [7]. At the moment, the result of the dedicated CI stage verifying the image reproducibility is non-blocking for releases, but we intend to make it mandatory soon (provided that we manage to maintain full reproducibility until then).

The additional good news is that, since it's built in a similar way, most of those fixes should also apply to our Docker image [8]. The exception is the Pacman keyring issue, which will need to be dealt with differently, somehow... (since, as far as I know, OCI images don't have any built-in / straightforward mechanism to run a script or commands automatically at first boot, like the WSL "oobe" mechanism).

This represents a meaningful achievement for our overall "reproducible builds" efforts and is encouraging for future related work on our other releng images!

[0] https://gitlab.archlinux.org/archlinux/archlinux-wsl
[1] https://reproducible-builds.org/
[2] https://archive.archlinux.org
[3] https://wiki.archlinux.org/title/Arch_Linux_Archive
[4] https://gitlab.archlinux.org/pacman/pacman/-/commit/f4bdb77470528019aaba4d8b8f947e918c6db17d
[5] https://learn.microsoft.com/en-us/windows/wsl/build-custom-distro#add-the-wsl-distribution-configuration-file
[6] https://gitlab.archlinux.org/archlinux/archlinux-wsl/-/merge_requests/74
[7] https://gitlab.archlinux.org/archlinux/archlinux-wsl/-/merge_requests/76
[8] https://gitlab.archlinux.org/archlinux/archlinux-docker

--
Regards,
Robin Candau / Antiz

