This patch-set contains basic changes needed in order to support building of
reproducible binaries. The set contains the following patches:


Using this patch set while building core-image-minimal (two clean builds, same
machine/OS, same date, two different folders, at two different times) I got the
following results:


(Some binaries, e.g. ext4, differ, but the difference is due to conversion to

Comparing Debian packages in tmp/deploy/deb:

Same:  4005
Different:  38
Total: 4043

(The remaining packages that still differ can be dealt with on an individual 
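
A comparison like the one above can be scripted. The following is only a minimal sketch of one way to count identical and differing packages, assuming both builds deploy same-named files under the same relative paths; the function names and the "*.deb" default pattern are illustrative and not part of the patch set.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(dir_a: Path, dir_b: Path, pattern: str = "*.deb"):
    """Compare files present in both trees by content hash.

    Returns (same, different) as sorted lists of relative paths.
    Files that exist in only one tree are ignored in this sketch.
    """
    files_a = {p.relative_to(dir_a): p for p in dir_a.rglob(pattern)}
    files_b = {p.relative_to(dir_b): p for p in dir_b.rglob(pattern)}
    same, different = [], []
    for rel in sorted(files_a.keys() & files_b.keys()):
        if sha256_of(files_a[rel]) == sha256_of(files_b[rel]):
            same.append(rel)
        else:
            different.append(rel)
    return same, different
```

Calling compare_trees() on the tmp/deploy/deb directories of the two builds and printing len(same), len(different) and their sum gives the three numbers reported above.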

Although the patches contain commit messages explaining the purpose and 
a somewhat more detailed description of selected patches seems prudent:


This patch creates a new class "reproducible_build.bbclass",
introducing two new variables:

BUILD_REPRODUCIBLE_BINARIES: "0" (default) business as usual, "1" turn on
various pieces of code to improve reproducible builds

REPRODUCIBLE_TIMESTAMP_ROOTFS: Catch-all timestamp for various rootfs files,
pre-linker, etc. If needed, timestamps can be made more granular later on;
right now we use a single value.

Having a new variable BUILD_REPRODUCIBLE_BINARIES serves several purposes:
1. Lets the user decide (there are minor trade-offs).
2. Setting it to "0" is guaranteed to cause zero regressions.
3. Setting it to "1" will force the environment to contain SOURCE_DATE_EPOCH.

BUILD_REPRODUCIBLE_BINARIES is globally exported, which will initially force
all kinds of rebuilds. I know of no simple way around this, though. This
variable is needed in numerous places: configuration, compilation, rootfs
creation, packaging, etc. REPRODUCIBLE_TIMESTAMP_ROOTFS does not need to be
globally exported; it is exported locally based on need.
Once these variables are "official", various classes and recipes can be
modified to conditionally support binary reproducibility.

Setting SOURCE_DATE_EPOCH is essential for binary reproducibility.
We need to set a recipe-specific SOURCE_DATE_EPOCH in each recipe environment
for various tasks. One way would be to modify all recipes one by one, but that
is not realistic. So SOURCE_DATE_EPOCH is determined automatically in this
class: after sources are unpacked (but before they are patched), we try to
determine the value for SOURCE_DATE_EPOCH.

There are 4 ways to determine SOURCE_DATE_EPOCH:
1. Use the value from the src-date-epoch.txt file if this file exists. This
   file was most likely created in the previous build by one of the methods
   2, 3, 4 below. (But it could actually be provided by a recipe via SRC_URI.)

If the file does not exist:
2. Use .git last commit date timestamp (git does not allow checking out files 
and preserving their
3. Use "known" files such as NEWS, CHANGELOG, ...
4. Use the youngest file of the source tree.

Once the value of SOURCE_DATE_EPOCH is determined, it is stored in the recipe
source tree in a text file "src-date-epoch.txt".
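
The lookup order described above can be sketched roughly as follows. This is not the code from reproducible_build.bbclass, only an illustration under stated assumptions: the exact list of "known" files, the choice of the newest mtime among them, and the use of "git log -1 --pretty=%ct" for the last commit date are my guesses at reasonable details, and all function names are hypothetical.

```python
import os
import subprocess
from pathlib import Path

EPOCH_FILE = "src-date-epoch.txt"
# Assumption: an illustrative list of "known" files, not the class's own.
KNOWN_FILES = ("NEWS", "ChangeLog", "Changelog", "CHANGES")


def from_epoch_file(srcdir):
    """Method 1: reuse a previously stored value, if the file exists."""
    path = Path(srcdir) / EPOCH_FILE
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def from_git(srcdir):
    """Method 2: committer timestamp of the last commit, if .git exists."""
    if not (Path(srcdir) / ".git").exists():
        return None
    try:
        out = subprocess.run(
            ["git", "log", "-1", "--pretty=%ct"],
            cwd=srcdir, check=True, capture_output=True, text=True,
        ).stdout.strip()
        return int(out)
    except (OSError, subprocess.CalledProcessError, ValueError):
        return None


def from_known_files(srcdir):
    """Method 3: newest mtime among the "known" files that are present."""
    paths = [Path(srcdir) / name for name in KNOWN_FILES]
    mtimes = [int(p.stat().st_mtime) for p in paths if p.is_file()]
    return max(mtimes) if mtimes else None


def from_youngest_file(srcdir):
    """Method 4: mtime of the youngest file in the tree (0 if empty)."""
    newest = 0
    for root, _dirs, files in os.walk(srcdir):
        for name in files:
            newest = max(newest, int(os.lstat(os.path.join(root, name)).st_mtime))
    return newest


def determine_source_date_epoch(srcdir):
    """Try the four methods in order and store the result in the tree."""
    cached = from_epoch_file(srcdir)
    if cached is not None:
        return cached
    value = from_git(srcdir)
    if value is None:
        value = from_known_files(srcdir)
    if value is None:
        value = from_youngest_file(srcdir)
    (Path(srcdir) / EPOCH_FILE).write_text(f"{value}\n")
    return value
```

Because the result is written back into the source tree, a second call returns the stored value even if file timestamps have changed in the meantime.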

If this file is found by another recipe task, the value is placed in the
task environment. This is done in an anonymous python function, so it is
guaranteed to exist for all tasks. (If the file is not found, SOURCE_DATE_EPOCH
is set to 0.)
This can be optimized in the future, as some tasks (all tasks before fetch,
tasks such as package QA, rm_work, ...) do not need SOURCE_DATE_EPOCH in the
environment.
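
The consumer side can be sketched in plain Python as below. In the class itself this happens inside an anonymous python function operating on the BitBake datastore; the helper names here are hypothetical, and treating a malformed file the same as a missing one is my assumption.

```python
import os
from pathlib import Path


def read_source_date_epoch(srcdir, filename="src-date-epoch.txt"):
    """Return the stored epoch, or 0 if the file is missing.

    Assumption: unparsable content is also mapped to 0.
    """
    try:
        return int((Path(srcdir) / filename).read_text().strip())
    except (OSError, ValueError):
        return 0


def task_environment(srcdir, base_env=None):
    """Return a copy of the environment with SOURCE_DATE_EPOCH set."""
    env = dict(os.environ if base_env is None else base_env)
    env["SOURCE_DATE_EPOCH"] = str(read_source_date_epoch(srcdir))
    return env
```

A task would then run its commands with the returned mapping as its environment, so every tool that honours SOURCE_DATE_EPOCH sees the same value.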

These are backports of existing patches. They ensure the compiled .pyc files
contain a timestamp based on SOURCE_DATE_EPOCH (if defined in the environment).
(May not be needed in the future, my understanding is support for 
upstreamed in master)


This patch contains several changes and was created by squashing several
commits. It includes several tweaks to improve reproducibility:

We want to set KBUILD_BUILD_TIMESTAMP to some reproducible value. Normally,
we would use the value of SOURCE_DATE_EPOCH. However, local kernel sources
are not obtained the usual way via do_unpack, and hence we end up with
SOURCE_DATE_EPOCH set to 0. In this case we obtain the timestamp from the top
entry of the git repo, or (if there is no git repo) fall back to
REPRODUCIBLE_TIMESTAMP_ROOTFS as the last resort.
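
The fallback chain for the kernel timestamp can be illustrated like this. It is a sketch only, not the shell code in kernel.bbclass: the function names are hypothetical, and reading the "top entry" of the repo with "git log -1 --pretty=%ct" is my assumption about a reasonable way to get it.

```python
import subprocess


def git_top_commit_timestamp(srcdir):
    """Committer timestamp of the top commit, or None if unavailable."""
    try:
        out = subprocess.run(
            ["git", "log", "-1", "--pretty=%ct"],
            cwd=srcdir, check=True, capture_output=True, text=True,
        ).stdout.strip()
        return int(out)
    except (OSError, subprocess.CalledProcessError, ValueError):
        return None


def kernel_build_timestamp(source_date_epoch, srcdir, rootfs_timestamp,
                           git_lookup=git_top_commit_timestamp):
    """Pick the epoch to derive KBUILD_BUILD_TIMESTAMP from.

    Order: SOURCE_DATE_EPOCH if non-zero, then the git top commit,
    then REPRODUCIBLE_TIMESTAMP_ROOTFS as the last resort.
    """
    if source_date_epoch != 0:
        return source_date_epoch
    timestamp = git_lookup(srcdir)
    if timestamp is not None:
        return timestamp
    return rootfs_timestamp
```

The git_lookup parameter exists only so the selection logic can be exercised without a real repository.
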
Kernel and kernel modules contain hard coded paths referencing the host
build system. This is usually because the source code contains __FILE__
at some place. This prevents binary reproducibility. However, some compilers
allow remapping of the __FILE__ value. If we detect the compiler is capable
of doing this, we replace the source path $(S) part of __FILE__ by a string 
This works very well for oe-embedded cross-compilers, but it is not guaranteed
to work for external toolchains. Hence the check for the option being
supported. Note that 
is done regardless of the value of BUILD_REPRODUCIBLE_BINARIES.

When compressing vmlinux.gz, use the gzip "-n" option, as recommended in all
guidelines, to achieve binary reproducibility.
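
The effect of "-n" (no original file name and no timestamp in the gzip header) can be demonstrated with Python's gzip module. This is only an illustration of why the option matters, not what the kernel class runs; the helper name is hypothetical.

```python
import gzip
import io


def gzip_reproducible(data: bytes, level: int = 9) -> bytes:
    """Compress data with an empty name and a zero mtime in the header,
    which mirrors what "gzip -n" leaves out."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename="", mode="wb", compresslevel=level,
                       fileobj=buf, mtime=0) as gz:
        gz.write(data)
    return buf.getvalue()
```

With the name and mtime fixed, compressing the same input at different times yields byte-identical output; without mtime=0 the header would carry the current time and the two results would differ.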

Support building of reproducible images by setting

This is mostly for convenience so the user does not have to modify

Please note setting LDCONFIGDEPEND = ""
This prevents building of ldconfig cache, which (when built) breaks binary

Also, it should avoid a reproducibility issue with etc/passwd, where
two different builds can lead to two different values, for example:

build 1:

build 2:

Juro Bystricky (11):
  reproducible_build.bbclass: initial support for binary reproducibility
  image-prelink.bbclass: support binary reproducibility
  rootfs-postcommands.bbclass: support binary reproducibility improve reproducibility
  image.bbclass: support binary reproducibility
  cpio: provide cpio-replacement-native
  image_types.bbclass: improve cpio image reproducibility
  python2.7: improve reproducibility
  python3: improve reproducibility
  kernel.bbclass: improve reproducibility
  poky-reproducible.conf: Initial version

 meta-poky/conf/distro/include/reproducible-group   |  50 ++++++++++
 meta-poky/conf/distro/include/reproducible-passwd  |  25 +++++
 meta-poky/conf/distro/poky-reproducible.conf       |  38 ++++++++
 meta/classes/base.bbclass                          |   4 +
 meta/classes/image-prelink.bbclass                 |  12 ++-
 meta/classes/image.bbclass                         |  16 ++-
 meta/classes/image_types.bbclass                   |  14 ++-
 meta/classes/kernel.bbclass                        |  39 +++++++-
 meta/classes/reproducible_build.bbclass            | 108 +++++++++++++++++++++
 meta/classes/rootfs-postcommands.bbclass           |  27 +++++-
 meta/recipes-core/busybox/              |   7 ++
 .../python/                 |   1 +
 .../python/python/reproducible.patch               |  34 +++++++
 .../python/                 |   1 +
 .../support_SOURCE_DATE_EPOCH_in_py_compile.patch  |  97 ++++++++++++++++++
 meta/recipes-devtools/python/      |   1 +
 meta/recipes-devtools/python/      |   1 +
 meta/recipes-extended/cpio/             |   2 +
 18 files changed, 467 insertions(+), 10 deletions(-)
 create mode 100644 meta-poky/conf/distro/include/reproducible-group
 create mode 100644 meta-poky/conf/distro/include/reproducible-passwd
 create mode 100644 meta-poky/conf/distro/poky-reproducible.conf
 create mode 100644 meta/classes/reproducible_build.bbclass
 create mode 100644 meta/recipes-devtools/python/python/reproducible.patch
 create mode 100644 


Openembedded-core mailing list
