Re: rust non-determinism

2024-08-06 Thread Richard Purdie
On Tue, 2024-08-06 at 11:48 +0200, Bernhard M. Wiedemann wrote:
> On 06/08/2024 10.03, Richard Purdie wrote:
> > 
> > We found we had to disable both lto and use codegen-units=1 to make
> > things always reproduce. It took us months to work that out!
> 
> This sounds similar to the issue I see.
> Do you have a simple reproducer?
> When I try rustc with a hello-world.rs, the binary only varies by
> build path when I add -g without -C strip=debuginfo,
> but that is independent of lto and codegen-units.
> but that is independent of lto and codegen-units.
> 
> Maybe it needs some library?

We've struggled to reduce it down outside of the rust build environment
so sadly we don't have a simple reproducer, no.

Cheers,

Richard


Re: rust non-determinism

2024-08-06 Thread Richard Purdie
On Tue, 2024-08-06 at 09:06 +0200, Bernhard M. Wiedemann via rb-general wrote:
> On 05/08/2024 18.37, John Gilmore wrote:
> > Bernhard M. Wiedemann  wrote:
> > > => https://github.com/rust-lang/rust/issues/128675
> > 
> > Two Rustc developers closed it within 8 hours as "already completed",
> > even though it isn't.
> > 
> > They also said "CGU partitioning is very deliberately designed to be
> > deterministic."  Implying that therefore there is no bug, because design
> > and implementation are the same thing?
> > 
> > Rust has 36 open reproducibility bugs (not including this closed one).
> > It'd be worth seeing what other ones they closed unfixed.
> 
> my workaround for openSUSE at
> https://github.com/Firstyear/cargo-packaging/pull/11 was also rejected.
> 
> It is a complex ecosystem. pop-launcher pulls in 258 modules and many
> have their own build.rs that could break reproducibility. There are
> also build tools such as `crate` and `just` that can cause issues.
> In the past there were also LLVM bugs that caused non-determinism.
> 
> So it is often hard to pinpoint the source of non-determinism with
> precision and confidence.
> 
> And it does not help that rust HashMaps have non-deterministic order by 
> default.
> 
> 
> I'm coming closer to thinking that llvm's LTO and rust's codegen-units=16
> might be deterministic by themselves, but (similar to PGO[1]) amplify
> other sources of non-determinism, which makes them harder to debug.
> 
> If that is indeed the case, we can disable them while debugging
> reproducibility and re-enable them once the package-specific issue is
> fixed.

You might find this interesting:

https://git.yoctoproject.org/poky/commit/?id=e2e7017350d0b5324811fef3b841f98b00273887

Yocto Project has been chasing a problem where rust itself was
reproducible but rustdoc was not. We found it only happened if the
build ran in paths of different lengths. We had a pid in the build path,
so it sometimes happened and sometimes did not; we've since fixed our
test builds so the paths always differ in length.

We found we had to disable both lto and use codegen-units=1 to make
things always reproduce. It took us months to work that out!

Exactly where the bug is, we're not sure but it does mean we have a
working workaround for now.
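For anyone wanting to try the same workaround in a plain cargo project, the equivalent profile settings might look like this (a sketch using standard Cargo.toml keys; whether it fully cures the issue outside our environment is exactly what we haven't pinned down):

```shell
# Append release-profile overrides mirroring the workaround above:
# no LTO, and a single codegen unit instead of the default 16.
cat >> Cargo.toml <<'EOF'

[profile.release]
lto = false
codegen-units = 1
EOF
# Then rebuild twice and compare, e.g.: cargo build --release
```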

Cheers,

Richard





Re: May 2024: whatsrc.org distro status

2024-06-07 Thread Richard Purdie
On Tue, 2024-06-04 at 00:54 +0200, kpcyrd wrote:
> On 5/31/24 1:56 PM, Richard Purdie wrote:
> > > > Taking bash as an example, you can see information from the
> > > > layer
> > > > index:
> > > > 
> > > > https://layers.openembedded.org/layerindex/recipe/81/
> > > > 
> > > > which doesn't list the checksum but does point at the recipe:
> > > > 
> > > > https://git.openembedded.org/openembedded-core/tree/meta/recipes-extended/bash/bash_5.2.21.bb?h=master
> > > > 
> > > > which does list it:
> > > > 
> > > > SRC_URI[tarball.sha256sum] =
> > > > "c8e31bdc59b69aaffc5b36509905ba3e5cbb12747091d27b4b977f078560d5b8"
> 
> Thanks for your pointers, Yocto Project is now listed too:
> 
> https://whatsrc.org/artifact/sha256:c8e31bdc59b69aaffc5b36509905ba3e5cbb12747091d27b4b977f078560d5b8

Looks good thanks!

I'm interested in how you were able to get to some of the data as I
think variable expansion may be limiting access to some of it. I did
have a look at your code but I'm not that experienced with rust. This
is a great start though.

Cheers,

Richard


Re: May 2024: whatsrc.org distro status

2024-05-31 Thread Richard Purdie
On Fri, 2024-05-31 at 13:05 +0200, kpcyrd wrote:
> On 5/30/24 12:41 AM, Richard Purdie wrote:
> > There is such data in Yocto Project too, although it would be
> > spread
> > into the layers that contain the software components in question.
> > 
> > Taking bash as an example, you can see information from the layer
> > index:
> > 
> > https://layers.openembedded.org/layerindex/recipe/81/
> > 
> > which doesn't list the checksum but does point at the recipe:
> > 
> > https://git.openembedded.org/openembedded-core/tree/meta/recipes-extended/bash/bash_5.2.21.bb?h=master
> > 
> > which does list it:
> > 
> > SRC_URI[tarball.sha256sum] =
> > "c8e31bdc59b69aaffc5b36509905ba3e5cbb12747091d27b4b977f078560d5b8"
> 
> Thanks!
> 
> From a user's point of view, what's the best way to determine what's
> currently available in Yocto?

The layer index listed above is our search engine for what is available
in the system, so the answer is either that or the layers themselves.
The layer index just reads data from the layers.

> In Debian this is decided through /source/Sources.xz, in Alpine it's 
> APKINDEX.tar.gz, in Arch Linux it's core.db/extra.db (although
> whatsrc 
> uses https://gitlab.archlinux.org/archlinux/packaging/state instead).
> 
> From what I read, Yocto doesn't have this "traditional" concept of a 
> package manager, do users clone this repository[1] (similar to how 
> nixpkgs works) or do they download some kind of index from somewhere?
> 
> [1]: https://git.openembedded.org/openembedded-core/tree/

Users would clone openembedded-core as per that link and then add the
additional layers they want to use alongside that. Those additional
layers can bring in extra software. We don't ship binaries, we ship
pointers to source code.

The layers users often add are indexed in the layer index which is how
people find things: https://layers.openembedded.org/

> I'm trying to be as canonical as possible, for example with Alpine
> and 
> WolfiOS I could also take snapshots of their aports repository[2] (or
> equivalent[3]), but instead I download and parse their
> APKINDEX.tar.gz 
> since it records the specific git commit a package was built from.
> 
> [2]: https://gitlab.alpinelinux.org/alpine/aports
> [3]: https://github.com/wolfi-dev/os

The openembedded-core repository is canonical for the core software.
The meta-openembedded layer contains a lot of additional software and
then there are extra layers on top. The metadata in those layers is the
canonical definition. Other layers can redefine things if they so wish
and there is a priority system to determine which layer's version
ultimately gets built for a given component if there are conflicts.

bitbake is the tool which understands these layers, knows how to
combine everything and pull data out. It does have a query/control API
called tinfoil where you can effectively custom query/extract data from
the layers.

"poky" is an easier to get started with merged git repo of bitbake +
opembedded-core so you can get running with one clone (from times
before submodules existed!).

Cheers,

Richard






Re: May 2024: whatsrc.org distro status

2024-05-29 Thread Richard Purdie
On Wed, 2024-05-29 at 18:38 +0200, kpcyrd wrote:
> Dear list,
> 
> As of May 2024, I have imported source code data from the following 
> distributions:
> 
> - Alpine Linux edge
> - Arch Linux
> - Debian sid, stable, stable-updates, stable-backports, stable-
> security
> - Fedora rawhide
> - Gentoo
> - Guix
> - Homebrew
> - Kali Linux Rolling
> - openSUSE Tumbleweed
> - Ubuntu 24.04 (jammy, jammy-updates, jammy-security, jammy-
> backports)
> - Void Linux
> - WolfiOS
> 
> In total, at the time of writing, I've collected and indexed 224,790 
> unique source code archives, and 33,193 dependency lockfiles 
> (Cargo.lock, go.sum, package-lock.json, ...).
> 
> Think of this as "myspace for source code", you can check which 
> operating systems a specific tar-file is friends with.
> 
> Or, put differently, each operating system gets a vote what they 
> consider the source code for a given software release.

There is such data in Yocto Project too, although it would be spread
into the layers that contain the software components in question.

Taking bash as an example, you can see information from the layer
index:

https://layers.openembedded.org/layerindex/recipe/81/

which doesn't list the checksum but does point at the recipe:

https://git.openembedded.org/openembedded-core/tree/meta/recipes-extended/bash/bash_5.2.21.bb?h=master

which does list it:

SRC_URI[tarball.sha256sum] = 
"c8e31bdc59b69aaffc5b36509905ba3e5cbb12747091d27b4b977f078560d5b8"

Our tools parse the metadata and validate that the sources match the
checksums, or we use specific git (or other source control) revisions.
I'm wondering if there is interest in exporting that information for
comparison somehow...
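The validation step amounts to something like this hypothetical shell helper (our real fetcher does the equivalent inside bitbake):

```shell
# Verify a fetched source archive against the checksum recorded in the
# recipe metadata; once it matches, where it was downloaded from no
# longer matters.
verify_source() {  # usage: verify_source <file> <expected-sha256>
    actual=$(sha256sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$2" ] || { echo "checksum mismatch for $1" >&2; return 1; }
}
# e.g. for the bash recipe above:
# verify_source bash-5.2.21.tar.gz c8e31bdc59b69aaffc5b36509905ba3e5cbb12747091d27b4b977f078560d5b8
```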

Cheers,

Richard



Re: Bootstrapping and autotools

2024-04-19 Thread Richard Purdie
On Fri, 2024-04-19 at 00:11 +0200, kpcyrd wrote:
> does somebody know what the bootstrapping status of autotools is?
> 
> Also, is something considered "bootstrapped from source" if the build
> is making use of a pre-generated `./configure` script that is 25k
> lines long?
> 
> In my opinion something is only "bootstrapped from source" if
> autoreconf is used as part of the build instead of executing some
> pre-compiled `./configure` script. This however means that, to
> compile bash, one needs to compile autotools first.

FWIW Yocto Project/OpenEmbedded runs autoreconf on pretty much
everything we can. We bootstrap our own autoconf/automake native tools
too. To do that you effectively need m4, perl and gnu-config; we assume
perl comes from the host system.
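As a minimal illustration of the principle (a sketch, assuming autoconf and m4 are installed), the smallest possible project regenerates ./configure from source rather than shipping it:

```shell
set -e
# Skip gracefully on systems without autoconf installed.
command -v autoreconf >/dev/null 2>&1 || { echo 'autoconf not installed'; exit 0; }
cd "$(mktemp -d)"
# The smallest possible input: ./configure is generated, never shipped.
cat > configure.ac <<'EOF'
AC_INIT([demo],[1.0])
AC_OUTPUT
EOF
autoreconf --force --install
test -x configure && echo 'configure regenerated from source'
```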

Cheers,

Richard


Re: Arch Linux minimal container userland 100% reproducible - now what?

2024-04-03 Thread Richard Purdie
On Tue, 2024-04-02 at 10:11 -0700, John Gilmore wrote:
> James Addison wrote that local storage can contain errors.  I agree.
> 
> > My guess is that we could get into near-unsolvable philosophical territory
> > along this path, but I think it's worth being skeptical of the notions that
> > local-storage is always trustworthy and that the network should always be
> > avoided.
> 
> For me, the distinction is that the local storage is under the direct
> control of the person trying to rebuild, while the network and the
> servers elsewhere in the network are not.  If local storage is
> unreliable, you can fix or replace it, and continue with your work.
> 
> I am looking for reproducibility that is completely doable by the person
> trying to do it, at any time after when they obtain a limited number of
> key items by any means: the bootable binary of the OS release, and what
> the GPL calls the "Corresponding Source".
> 
> And, I am very happy to be seeing lots of incremental progress along the way!

FWIW Yocto Project/OpenEmbedded is able to do something like this.

The builds are "cross" and sufficiently isolated from the host that the
host OS doesn't influence the output. By that I mean we build a cross
compiler and then use the cross compiler to build the target. 

Whilst the intermediate cross compiler may differ bitwise depending on
the host compiler, the generated target output should always be the
same. I say "should" as there can be theoretical contamination sources
but we test this on our infrastructure with diverse hosts (Debian,
Ubuntu, Fedora, Alma, Rocky and OpenSUSE systems of differing versions)
and check we always get the same output. This is what our reproducible
claim is measuring, that this output doesn't differ between those
systems.

The build system doesn't allow network access outside the initial
"fetch" step and it verifies some form of checksum of every external
source input.

The inputs can be fetched from their upstream location, or from a
mirror. The project maintains a mirror but users can also have a local
one of their own. Since the inputs are checksum verified, it doesn't
really matter where.

So the things needed to build a given output are:

* the metadata (build instructions)
* the build system itself
* sources or a sources mirror (which is verified against the metadata)
* some kind of host to run the build

For the host to run the build, it can be an off-the-shelf
ubuntu/debian/fedora/whatever system, or it can be one of our own
output images, leading to effective self-hosting.

Each of the above is something someone can easily archive and restore
without significant effort or specialist knowledge.

This can therefore all be done by anyone, meaning someone building a
product using embedded linux (our target users) can rebuild their
output incorporating any security fixes needed for example, years from
now.

I'd note this isn't theoretical, there are companies doing this today
using the self hosting images so there isn't a dependency on any other
distro either.

Cheers,

Richard



Re: Two questions about build-path reproducibility in Debian

2024-03-07 Thread Richard Purdie
On Wed, 2024-03-06 at 14:57 +, Holger Levsen wrote:
> On Tue, Mar 05, 2024 at 11:51:16PM +0000, Richard Purdie wrote:
> > FWIW Yocto Project is a strong believer in build reproducibility
> > independent of build path and we've been quietly chipping away at
> > those
> > issues.
> [...] 
> > OpenEmbedded-Core (around 1000 pieces of software) is 100%
> > reproducible
> > and we have the tests to prove it running daily, building in
> > different
> > build paths and comparing the output.
> 
> that's awesome!
> 
> btw, https://www.yoctoproject.org/reproducible-build-results/ (linked
> from https://reproducible-builds.org/who/projects/#Yocto%20Project)
> doesn't show any results?

We made changes to the website and that stopped working. We had noticed
and raised it with the website people but I've used your question to
encourage them to get it fixed :).

It now shows "36754 out of 36754 (100.00%) packages tested were
reproducible" :)

The tests were always running, the webpage was just broken.

> > We're working on our wider layers too, e.g. meta-openembedded has
> > another 2000+ pieces of software and less than 100 are not
> > reproducible.
> 
> nice.
> 
> we had 35000 pieces of software in Debian of which ~2000 were not
> reproducible with nondeterministic build paths. Now with build paths
> as part of the build environment it's less than half.

Very nice too! :)

FWIW we've made reproducibility an unconditional thing in our
configuration and processes now so everyone sees the common errors and
we're all using the same build command lines and so on.

The idea behind getting meta-openembedded tested was to ensure (and
demonstrate) our tools and tests could be used against arbitrary layers
which should encourage people to test their own software.

Lots of small steps which should help the overall ecosystem and goal.

Cheers,

Richard







Re: Two questions about build-path reproducibility in Debian

2024-03-05 Thread Richard Purdie
On Tue, 2024-03-05 at 08:08 -0800, John Gilmore wrote:
> > > But today, if you're building an executable for others, it's common
> > > to build using a container/chroot or similar that makes it easy to
> > > implement "must compile with these paths", while *fixing* this is
> > > often a lot of work.
> 
> I know that my opinion is not popular, but let me try again before we lay 
> this decision to rest.
> 
> In avoiding fixing directory dependencies, you can move the complexity 
> around, but in doing so you didn't reduce the complexity.

FWIW Yocto Project is a strong believer in build reproducibility
independent of build path and we've been quietly chipping away at those
issues.

There are issues we resolve by using carefully selected compiler
options or environment variables like SOURCE_DATE_EPOCH but also things
we do highlight to upstreams and ask if they'd mind improving them. In
general once they're aware of the issues, they do try and help. We have
identified several regressions in rust in that regard in the last few
versions for example and also helped test fixes.

OpenEmbedded-Core (around 1000 pieces of software) is 100% reproducible
and we have the tests to prove it running daily, building in different
build paths and comparing the output.

We're working on our wider layers too, e.g. meta-openembedded has
another 2000+ pieces of software and less than 100 are not
reproducible.

So even if debian doesn't do this, there is interest elsewhere and I
believe good progress is being made.

Cheers,

Richard




Re: Reproducibility terminology/definitions

2023-11-09 Thread Richard Purdie
On Thu, 2023-11-09 at 22:12 +0100, kpcyrd wrote:
>  > - $E$, the set of all possible software environments
> 
> I'm not aware of any project having this in scope. It's crucial for 
> projects to document their build environment (see buildinfo files) and 
> matching it when reproducing the build. If you use a different compiler 
> version, or a different linker version/configuration, you're almost 
> guaranteed to get mismatching binaries. Knowing two different compiler 
> versions produce the same binary is also not inherently useful.

FWIW I think Yocto Project could claim to do this. We don't care about
the build location within reason. We build our own cross compilers and
have a fairly small host dependency list. We can also build a tarball
of our dependencies, so we can effectively self-bootstrap.

Our infrastructure runs regular reproducibility tests on "random"
distro bases (various versions of Debian, Ubuntu, Fedora, opensuse) to
ensure we stay reproducible too.

So it definitely is possible!

Cheers,

Richard




Pitfall of using shortened git hashes compiled into code

2023-09-17 Thread Richard Purdie
We recently noticed igt-gpu-tools failed our reproducibility tests with
seemingly no changes made to it.

The change was the string g2b29e8ac becoming g2b29e8ac0:

http://autobuilder.yocto.io/pub/repro-fail/oe-reproducible-20230917-br60if6q/packages/diff-html/

Investigating showed this comes from VCS_TAG.

What appears to have happened is we pulled new revisions into the git
tree and even though we didn't use them, "g2b29e8ac" was no longer
unique so the hash was lengthened. This resulted in significant changes
to the binary output.

I'm not sure if there is a general recommendation against using short
hashes but if not, there should be!
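The general recommendation would be: embed either the full object name or an abbreviation with an explicitly pinned minimum length. A sketch (assuming standard git; --short=N still lengthens on ambiguity, but far less readily than the default):

```shell
set -e
cd "$(mktemp -d)" && git init -q demo && cd demo
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m 'initial'
# Default abbreviation: length chosen by git, grows with the repository.
git rev-parse --short HEAD
# Safer for strings compiled into binaries (VCS_TAG and friends):
git rev-parse HEAD            # full 40-character object name
git rev-parse --short=12 HEAD # pinned 12-character minimum
```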

Cheers,

Richard




Re: breaking CI if build is not reproducible?

2023-06-07 Thread Richard Purdie
On Wed, 2023-06-07 at 15:50 +0200, Martin Monperrus wrote:
> We're researching build reproducibility.
>  
>  Are you aware of any project where reproducibility is checked in a
> continuous integration pipeline?
>  
>  (For instance, by building twice in CI and comparing the output)
>  
>  If yes, thanks to share here or via private email.

The Yocto Project does, here:

https://autobuilder.yoctoproject.org/typhoon/#/builders/117

Cheers,

Richard


Re: SBOMs - Anywhere?

2023-02-25 Thread Richard Purdie
On Sat, 2023-02-25 at 15:56 +, Anthony Harrison wrote:
> This post follows on a chat with Chris Lamb and he felt it was worthy
> of a wider readership and debate.
> As I am sure everyone is aware, there is a growing interest in
> Software Bills of Materials (SBOMs) as a way of improving software
> security and resilience. In the last two years, the US through the
> Exec Order, the EU through the proposed Cyber Resilience Act (CRA)
> and this month the UK has issued a consultation paper looking at
> software security and SBOMs appear very prominently in each
> publication. There are various initiatives ongoing related to
> increasing the adoption of SBOMs, including the SBOM Everywhere
> initiative which is run by the OpenSSF - this is where I 'met' Chris.
> I am also involved in a number of other groups looking at
> vulnerability management and SBOMs as well as looking at the wider
> use of SBOMs (primarily in the US). 
> 
> What is very clear is that SBOMs are starting to be produced but as I
> found out at FOSDEM, they are only really being used
> internally within development teams (which is a great start) rather
> than being used by end user organisations. The two use cases are
> license management and some limited use for identifying
> vulnerabilities in a build. It is also very clear from work I have
> been doing over the past few months that whilst there are some
> tools starting to appear, they are very variable in quality or are
> very focused on particular use cases (e.g. SBOMs for a container
> image). 
> 
> To try and fill the gap, I have created a number of tools in Python
> to help generate and consume SBOMs. All are available on PyPI. Based
> on the discussions at FOSDEM, I am going to update a number of them
> to add some new features. All of my tools work with SPDX (TagValue,
> JSON, YAML) and CycloneDX (JSON) formats - the SBOM world is going to
> have two competing standards for the foreseeable future. My tools
> include:
>  * SBOM4PYTHON - produces an SBOM for an INSTALLED Python module. It
> identifies all of the dependent modules including those which are
> indirect.
>  * SBOM4FILES - produces an SBOM for files in a directory. It
> attempts to determine the type of file, the licence information and
> copyright information
>  * SBOM4RUST - produces an SBOM from a Cargo.lock file showing the
> relationships between the components
>  * SBOMDIFF - compares two SBOMs and reports the differences.
> Currently just looks at changes to a component version, licence and
> identifies any new/deleted components
>  * SBOM2DOC - produces some human readable output of the content of
> an SBOM. Can produce PDF and Markdown formatted documents as well as
> to the console.
>  * SBOM-MANAGER - a repository for storing and interrogating SBOMs
> 
> More tools are in the pipeline, including one to generate an SBOM
> from an installed platform distribution or package (currently works
> for Debian systems, work in progress for RPM based systems) and an
> audit tool. I hope to publish these in the next couple of weeks.
> 
> As SBOMS are starting to be produced, I can already start to see some
> interesting characteristics which aren't immediately obvious without
> an SBOM. So I believe SBOMs can form a valuable addition to the
> development lifecycle
> 
> So should Reproducible Builds start creating and using SBOMs (and
> delivering them with builds)?

The tools sound like something the ecosystem could benefit from so I'm 
happy to see those being worked on.

You make a good point about older releases. We have a "no new features"
rule for our LTS releases but the point you have here about improving
existing software is an interesting one. There is a case for us adding
this to our LTS with that reasoning.

For my part I've helped ensure SBoM generation is a first class citizen
within the Yocto Project and we've also worked hard to ensure our core
is entirely reproducible and we have tools to let others check their
own layers which add to the core.

Cheers,

Richard


Re: Reproducible builds bug reported to Rust compiler as https://github.com/rust-lang/rust/issues/97955

2022-06-11 Thread Richard Purdie
On Sat, 2022-06-11 at 11:39 +0200, Jelle van der Waa wrote:
> Hey,
> 
> On 10/06/2022 19:55, Richard Purdie wrote:
> > On Fri, 2022-06-10 at 13:52 -0400, David A. Wheeler wrote:
> > > All, FYI:
> > > 
> > > The current LLVM-based Rust compiler generates builds that
> > > aren't (easily) reproducible, at least in part because full paths
> > > to the source code is in the panic and debug strings recorded in
> > > the generated executable. I was made aware of this via bunny's
> > > "Rust: A Critical Retrospective" 
> > > <https://www.bunniestudios.com/blog/?p=6375>.
> 
> Yes, this is the case if you share a debug build with someone. I'd say 
> this is not that bad. GCC by default also suffers from this issue, just 
> make any debug build and the path will be included:
> 
> [jelle@t14s][/tmp/dontincludethisdir]%gcc -lhidapi-hidraw 
> -I/usr/include/hidapi -ggdb ../foo.c -o test
> [jelle@t14s][/tmp/dontincludethisdir]%strings test| grep dontincl
> /tmp/dontincludethisdir
> 
> In practice you want to reproduce a production release builds which 
> strip this information.

That isn't strictly true, you can also map things to the target
locations. In Yocto Project/OpenEmbedded we use something like:

DEBUG_PREFIX_MAP = " \
-fmacro-prefix-map=${WORKDIR}=/usr/src/debug/${PN}/${EXTENDPE}${PV}-${PR} \
-fdebug-prefix-map=${WORKDIR}=/usr/src/debug/${PN}/${EXTENDPE}${PV}-${PR} \
-fdebug-prefix-map=${STAGING_DIR_HOST}= \
-fdebug-prefix-map=${STAGING_DIR_NATIVE}= \
"

which maps our build paths to our target system paths. We put the
source in to /usr/src/debug/XXX in our debug packages. This way you get
fully reproducible sources, reproducible packages, small target
packages yet full debug info available too.

I think newer gcc has simplified this a bit; we use the above since
there used to be issues where both macro-prefix-map and
debug-prefix-map were needed.

Cheers,

Richard



Re: Reproducible builds bug reported to Rust compiler as https://github.com/rust-lang/rust/issues/97955

2022-06-10 Thread Richard Purdie
On Fri, 2022-06-10 at 13:52 -0400, David A. Wheeler wrote:
> All, FYI:
> 
> The current LLVM-based Rust compiler generates builds that
> aren't (easily) reproducible, at least in part because full paths
> to the source code is in the panic and debug strings recorded in
> the generated executable. I was made aware of this via bunny's
> "Rust: A Critical Retrospective".
> 
> I've filed this as a bug report to the Rust compiler developers:
> https://github.com/rust-lang/rust/issues/97955
> 
> Filing a bug report is obviously not the same as getting it fixed.
> But filing a bug report to the right place *is* a good first step.
> If you know of other build tools where this is a problem, I encourage
> filing bug reports with them too.

Interestingly we haven't seen that in Yocto Project and we do have some
rust libraries in our system. That suggests it can be done through
configuration somehow...

Cheers,

Richard


Re: Call for real-world scenarios prevented by RB practices

2022-03-25 Thread Richard Purdie
Hi,

On Tue, 2022-03-22 at 12:46 +, Chris Lamb wrote:
> Just wondering if anyone on this list is aware of any real-world
> instances where RB practices have made a difference and flagged
> something legitimately "bad"?
> 
> Pretty sure that everyone here believes that reproducible builds *can*
> detect such issues (and might even prevent them from being attempted
> in the first place), but would be interested in anything that has been
> made public that can be specifically credited to reproducible builds
> practices or similar.

As we underwent this work with Yocto Project (which cross compiles), the biggest
thing we found was floating dependencies, for example where code would change
configuration depending on whether the host has /usr/bin/sendmail or not.

A second class of issues for us were build paths leaking into output binaries.
Our reproducible build tests always build in two different locations so we're
pretty happy to have that class of issues resolved now.

I pulled out a few random examples of things we've fixed:

A fun rpm issue where rpms built on an aarch64 host differed from those built on
an x86_64 system:

https://git.yoctoproject.org/poky/commit/?id=d441b484ebb4cdde228cedb3378019ffbdc391ac

Broken file ownership in output packages from build races:

https://git.yoctoproject.org/poky/commit/?id=fb1fe1a60d93eb05010c7b6a8077eddd4f910e95

gtk+3 shipping a file but also sometimes regenerating it during build:

https://git.yoctoproject.org/poky/commit/?id=df2c56f4d55d5edce77548cde0e1dc4d83503844

Makefile globbing leading to varying binary output:

https://git.yoctoproject.org/poky/commit/?id=2aac5700b80f8e207db0a212d97dd9a650151dc7

Host paths to grep/python leaking into the output:

https://git.yoctoproject.org/poky/commit/?id=fe8c75f72e1a75c2f964e1e41c3dad84a6b87c38

Sendmail path issue example:

https://git.yoctoproject.org/poky/commit/?id=9f0b69e91c51bad8722e765df95b545e3ecec9b1

Most of these aren't malicious but they are "bad" in the sense that we wanted to
identify and fix them.

Cheers,

Richard





Re: Reproducible tarballs on Github?

2021-10-24 Thread Richard Purdie
On Sat, 2021-10-23 at 21:41 -0400, David A. Wheeler wrote:
> 
> > On Oct 23, 2021, at 3:23 PM, Arthur Gautier  wrote:
> > 
> > I would expect Github to use the tar implementation of git-archive (or
> > libgit2). git-archive is specifically designed to be reproducible.
> 
> I don’t know if it does, but that does seem likely.
> 
> > All I'm suggesting is to checksum the inflated version of the archive
> > and not the compressed one.
> 
> Checksumming the inflated version makes sense to me, so that improved/varying
> compression doesn’t matter (since it produces the same result).
> 
> Sounds like maybe GitHub doesn’t need to change anything.
> If someone thinks GitHub *does* need to change something, I’d like to know
> exactly what practical change is desired.

Yocto Project has struggled with this too FWIW. The tarballs github serves
are (or were?) dynamically generated and could change checksum over time (as
caches were invalidated and they were rebuilt?). For something like YP where we
list the checksums of the input source archives to validate that the inputs were
the same, this was an issue. As such we only support official tarball releases
of github projects and not the dynamically generated tarballs. We added checks
to try and ensure we don't use the "bad" urls.

I've not heard of this being an issue for us for a while but that could be we
now just don't use the problematic dynamic tarballs. We did raise it with github
and were told they reserved the right to change the output and we shouldn't
depend on checksums like that. They do not use git-archive or certainly didn't
at the time.
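Arthur's earlier suggestion, checksumming the inflated archive rather than the compressed bytes, sidesteps part of this. A small illustration (a local archive standing in for a github one):

```shell
set -e
cd "$(mktemp -d)"
mkdir demo && printf 'source file\n' > demo/file.txt
tar --sort=name --mtime=@0 --owner=0 --group=0 -cf demo.tar demo
# Two compressions of the same tree: the .tar.gz bytes differ
# (header flags, compressor effort)...
gzip -1 -c demo.tar > demo-fast.tar.gz
gzip -9 -c demo.tar > demo-best.tar.gz
# ...but the inflated tar stream, and hence its checksum, is identical:
zcat demo-fast.tar.gz | sha256sum
zcat demo-best.tar.gz | sha256sum
```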

Cheers,

Richard







Re: Recoding the configuration for live-build images

2021-09-04 Thread Richard Purdie
On Sat, 2021-09-04 at 09:18 +0200, Jan Nieuwenhuizen wrote:
> John Gilmore writes:
> 
> Hi!
> 
> > Does the GNU Mes bootstrap-reducing team have a plan to replace Grub and
> > the Linux kernel and init (and perhaps a BIOS?) with something tiny that
> > runs on bare metal and implements a file system, the mount command, and
> > processes?  Many realtime OS's are much smaller than Linux or BSD and
> > yet have those capabilities.  eCos might be a great start, and is free,
> > highly portable, and includes a POSIX layer (and TCP/IP for debugging),
> > though it currently lacks fork/exec/wait.  The original V7 UNIX kernel
> > would work, if process sizes and filename sizes are patched, and a few
> > device drivers written for modern disk and CDROM drives.  Such a
> > bootstrap kernel would enable the Scheme bootstrap programs to run well
> > enough to build gcc, then use gcc to build the Linux kernel, then boot
> > it, and continue building.
> 
> This is a valid concern and these are nice pointers.  In my FOSDEM21
> talk (https://fosdem.org/2021/schedule/event/gnumes/) I mention the fact
> that we still have work to do here.  However, there are no concrete
> plans that I know of just yet, at least not as a basis for real world
> bootstrap of a full GNU/Linux system.
> 
> GNU Mes is being deployed in the GNU Guix bootstrap and a new effort has
> just started to port the reduced binary seed bootstrap to NixOS.  We are
> still working to integrate the "full source bootstrap" into GNU Guix.
> The plans after that are to replace critical usage of GNU Guile with GNU
> Mes.  There is also work on ARM, RISC-V and the Hurd going on.  This
> probably means, e.g., backporting RISC-V support to gcc-4.6, in short:
> lots of work todo here.
> 
> The fact remains that the team is still quite small and GNU Guix and
> even NixOS I think are niche distributions.  It would be amazing to get
> a GNU Mes based reduced binary seed bootstrap into Debian.  That could
> increase our exposure a lot and may free up some development time for
> doing more wild stuff.  Alas, except for chats some of us had I do not
> know of concrete plans here either.

You may want to consider the Yocto Project in that area. The strength YP has is
that it is a cross-compiled and customised Linux (and other RTOS) built
reproducibly from source with tightly controlled host dependencies. It can
"self-host", i.e. build itself reproducibly from within its own tools too.

As such, there would be a very specific target that would need to be built to
achieve bootstrap and the system could take it from there. There are also
probably ways to minimise the bootstrap needed, that just hasn't been too much
of a focus for the project.

Cheers,

Richard









Re: Help us map the reproducible builds ecosystem

2021-08-05 Thread Richard Purdie
On Thu, 2021-08-05 at 14:38 +, Holger Levsen wrote:
> On Thu, Aug 05, 2021 at 02:51:17PM +0100, Chris Lamb wrote:
> > There is definitely an argument to be as complete as possible, but I
> > think the best thing from the perspective of the ecosystem map is to
> > be as consistent as possible across similar entities.
> 
> I'm not sure there is so much consistancy...
>  
> 
> > Therefore we should _probably_ stick to "% reproducible of packages",
> > as this is a number that most, if not all, distributions have.
> 
> tails doesnt have any packages, nor yocto, and the java world has artifacts

For Yocto, we can generate packages and we do run tests of our default config
on those:

https://www.yoctoproject.org/reproducible-build-results/

so 36 exclusions (known issues, effectively golang) of 34170 packages.

This is for our core layer only though. There are many layers/configurations 
and our aim is to provide the tools and configurations that let anyone build 
something which is reproducible and prove it through testing themselves. We 
therefore use our core test as an indication that we can build reproducible 
things and that our testing process works.
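For context, those figures work out as a simple calculation (nothing more than arithmetic on the numbers above):

```shell
# 36 known-issue exclusions (effectively golang) out of 34170 core packages:
awk 'BEGIN { printf "%.2f%% reproducible\n", (34170 - 36) * 100 / 34170 }'
# prints "99.89% reproducible"
```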


From Yocto, this then feeds into Linux distros from our members/users like 
Wind River, MontaVista, ENEA, Automotive Grade Linux (AGL), OpenBMC and
many other projects, some we know, many we don't. An example end result would
be reproducible binaries in your car, aeroplane and TV :)

Not sure how that looks on your map! We struggle a lot to know where Yocto
ends up too.

FWIW we've been trying to get people to add known users to
https://wiki.yoctoproject.org/wiki/Project_Users but that is ongoing and
currently far from being even remotely complete.

Cheers,

Richard




Re: Please review the draft for March's report

2021-04-06 Thread Richard Purdie
On Tue, 2021-04-06 at 10:39 -0400, David A. Wheeler wrote:
> Press releases are not the best way to learn technical details :-).
> 
> I suggest adding a link to more details e.g.:
> 
> See "What is sigstore" (https://sigstore.dev/what_is_sigstore/)
> for more details.
> 
> I think mentioning sigstore is valuable. Reproducible builds let you verify that
> a given build *is* generated from a given source; sigstore can let you
> verify that you got the *correct* source or build.

An interesting aside, but the Yocto Project sidesteps this issue by
encoding the checksums of the source tarballs in its recipes
for software.

Whilst that doesn't guarantee it is the correct source, it means
that you're all using the same source, and given the breadth of
use of the project, you'd assume differences would be noticed 
and can certainly be audited.
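The mechanism is simple enough to sketch in shell: a hypothetical fetch step that refuses a tarball whose sha256 doesn't match the value recorded in the recipe (the function name and demo file here are invented, not Yocto's actual fetcher):

```shell
#!/bin/sh
# Sketch: refuse a fetched tarball whose sha256 doesn't match the
# checksum pinned in the recipe. Names are illustrative only.
verify_checksum() {
    actual=$(sha256sum "$1" | cut -d' ' -f1)
    if [ "$actual" = "$2" ]; then
        echo "OK: $1 matches pinned checksum"
    else
        echo "MISMATCH: $1 ($actual != $2)" >&2
        return 1
    fi
}

# Demo payload with a known checksum (sha256 of "hello\n"):
demo=$(mktemp)
printf 'hello\n' > "$demo"
verify_checksum "$demo" \
    "5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03"
```

A silently re-rolled release tarball changes the hash, so the build fails loudly instead of quietly consuming different source.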

You'd be amazed how often we find projects that rebuild their
release tarballs :/

Cheers,

Richard




Re: How could we accelerate *deployment* of verified reproducible builds?

2021-01-30 Thread Richard Purdie
On Sat, 2021-01-30 at 12:22 +, Holger Levsen wrote:
> On Fri, Jan 29, 2021 at 05:39:01PM -0500, David A. Wheeler wrote:
> > What would be especially helpful for accelerating deployment of
> > verified reproducible builds in a few key places? E.g., what tools,
> > infrastructure, people paid to do XYZ?
> 
> first, having verified reproducible builds! then, we can deploy them.

I believe Yocto Project can do this today. Obviously it's easy to say
that, so I intend to prove it :).

To have a verified build you need to share some kind of configuration
and something to verify against.

The binaries I'm going to 'share' for verification are a linux kernel
bzImage and a tarball of a busybox based rootfs linked against musl as
a libc. The configuration you need for this is:

Poky repo: git://git.yoctoproject.org/poky
Poky revision: 36aef08dcd5e45c4138ccd72e8de01157f7213c4

and the configuration:

PACKAGE_CLASSES = "package_ipk"
TCLIBC = "musl"
INHERIT += "rm_work"
DEBUG_FLAGS = "${DEBUG_PREFIX_MAP}"
DISTRO_FEATURES_remove = "opengl"
EXTRA_IMAGEDEPENDS_qemux86-64 = ""
IMAGE_FSTYPES_qemux86-64 = "tar.bz2"
IMAGE_CMD_tar_qemux86-64 = "${IMAGE_CMD_TAR} --format=gnu --sort=name --numeric-owner -cf ${IMGDEPLOYDIR}/${IMAGE_NAME}${IMAGE_NAME_SUFFIX}.tar -C ${IMAGE_ROOTFS} . || [ $? -eq 1 ]"

which gives the sha256sum of the output binaries:

99c6d9ba1162043f348cef1b8385a03e4246323182bdada0200ab69bfe61ecd4  
core-image-minimal-qemux86-64.tar.bz2
e36841e544c8ffe628d56e08e0c3965fb350b05c6dc553390283b8330e2ebdcd  bzImage

I've put a small bit of shell script at the end of the mail which takes
this information and builds that result. To verify, run the script with
a parameter specifying which directory to build in. I will warn
that this will download source code for everything it needs and it will
build and use its own compiler to build the output. As such it needs
network bandwidth, disk space and will take a while.

You might wonder why I'm not being specific about which distro to use
or the path it should run in. It doesn't matter, it handles that. It
will run on most recent Linux systems and it should tell you if there
are any dependencies it needs which are missing (it needs python 3.6+
and to be able to compile things with gcc 6+).

You could extract the output tar.bz2 and chroot into it (e.g. "sudo
chroot <extracted-dir> /bin/ash") if you wanted.

You may wonder why I'm specifying the tar command to use. I actually
found a bug when testing this as the output differed, using GNU tar
format on one machine and not on another. We'll get that fixed; we tend
to diff packages rather than full images, which is why that hadn't been
spotted. I added some config to avoid that issue here for now.
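To illustrate why those tar flags matter, here is a standalone sketch (paths and file contents invented) showing that pinning the format, member ordering, ownership and mtimes makes two archive runs over the same tree bit-identical:

```shell
#!/bin/sh
# Build the same tree into two tarballs with determinism flags
# (GNU format, sorted names, numeric ownership, fixed mtime) and compare.
set -e
work=$(mktemp -d)
mkdir -p "$work/rootfs/etc" "$work/rootfs/bin"
echo "hostname=demo" > "$work/rootfs/etc/hostname"
echo "#!/bin/sh"     > "$work/rootfs/bin/init"

mktar() {
    tar --format=gnu --sort=name --numeric-owner \
        --owner=0 --group=0 --mtime='2021-01-30 00:00Z' \
        -cf "$1" -C "$work/rootfs" .
}

mktar "$work/a.tar"
mktar "$work/b.tar"
if cmp -s "$work/a.tar" "$work/b.tar"; then
    echo "identical"
else
    echo "different"
fi
```

Without --sort=name the member order depends on the filesystem's directory order, which is one way the "same" image can differ between machines.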

If you want to rerun this with more speed, the standard "cache" options
for Yocto Project can be added:

DL_DIR = "/media/sources/"
SSTATE_DIR = "/media/sstate/"

where downloads are placed into DL_DIR and our cache artefacts (sstate)
are placed in the second location. If you place those outside the build
directory it will make subsequent executions much faster.

The other configuration options are mainly to try and cut down the
build time a bit: musl builds more quickly than glibc, and it's
faster/simpler without added qemu dependencies or debug information.

I'm taking a bit of a risk by making a bold claim quite publicly and
this could go wrong, but I think it's interesting to explore :)

Cheers,

Richard


#!/bin/bash
VBDIR=$1
if [ -z "$VBDIR" -o -e "$VBDIR" ]; then
echo "Please specify an empty directory to run the build in"
exit 1
fi
mkdir -p $VBDIR
cd $VBDIR
git clone git://git.yoctoproject.org/poky
cd $VBDIR/poky 
git checkout 36aef08dcd5e45c4138ccd72e8de01157f7213c4
. $VBDIR/poky/oe-init-build-env $VBDIR/poky/build
printf 'PACKAGE_CLASSES = "package_ipk"
TCLIBC = "musl"
INHERIT += "rm_work"
DL_DIR = "/media/sources/"
SSTATE_DIR = "/media/sstate/master"
DEBUG_FLAGS = "${DEBUG_PREFIX_MAP}"
DISTRO_FEATURES_remove = "opengl"
EXTRA_IMAGEDEPENDS_qemux86-64 = ""
IMAGE_FSTYPES_qemux86-64 = "tar.bz2"
IMAGE_CMD_tar_qemux86-64 = "${IMAGE_CMD_TAR} --format=gnu --sort=name --numeric-owner -cf ${IMGDEPLOYDIR}/${IMAGE_NAME}${IMAGE_NAME_SUFFIX}.tar -C ${IMAGE_ROOTFS} . || [ $? -eq 1 ]"
' > $VBDIR/poky/build/conf/auto.conf
bitbake core-image-minimal || exit 1
sha256sum $VBDIR/poky/build/tmp/deploy/images/qemux86-64/bzImage
sha256sum $VBDIR/poky/build/tmp/deploy/images/qemux86-64/core-image-minimal-qemux86-64.tar.bz2
echo $VBDIR/poky/build/tmp/deploy/images/qemux86-64/core-image-minimal-qemux86-64.tar.bz2 ready!






Re: Attack on SolarWinds could have been countered by reproducible builds

2020-12-21 Thread Richard Purdie
On Mon, 2020-12-21 at 15:57 -0500, David A. Wheeler wrote:
> I think these things need to happen in stages. Broadly:
> 1. Get key applications & libraries reproducible (assuming toolchains
> are okay)
> 2. Establish independent processes that *check* that the binaries are
> what they’re supposed to be.
> 3. Extend the work to more/all applications/libraries in given
> domains.
> 4. Work on verifying underlying toolchains, and again, creating
> independent processes that *check* the toolchain results (DDC &
> bootstrapping).
> 
> The long-term goal should be that “we can ensure that all OSS
> compiled code is accurately represented by its source code”. The
> source code may include malicious statements, but source code is what
> developers review, so we’ve fundamentally changed the game to ensure
> that “what is reviewed is what is run”.

Not sure it's so long term for some of us!

With Yocto Project, what we now effectively have is a build from
"scratch" environment where the inputs are checksum validated and the
output bitwise reproducible.

I say "scratch" since we do assume a working host compiler and basic
tools (we have a list) which are used to build the cross compiler.

We are host system independent in that it doesn't matter which distro
you build on, or in which path; the output tarball containing "Linux"
is the same for anything inside OE-Core, with a small number of
exceptions. OE-Core is about 800 pieces of software generating ~11,000
packages, of which we have about 65 marked as not reproducible at
present. We're obviously working on improving those 65, and the
techniques used will "just work" to a large extent throughout our wider
layers of other software; we're just not testing that until we sort
out the core.

The net result is multiple people on multiple different platforms can
run the build and generate the same result consistently. Our
autobuilder does run that exact test regularly.

Cheers,

Richard





Re: Reproducible Builds Verification Format

2020-05-12 Thread Richard Purdie
On Tue, 2020-05-12 at 11:00 -1000, Paul Spooren wrote:
> at the RB Summit 2019 in Marrakesh there were some intense discussions about
> *rebuilders* and a *verification format*. While first discussed only with
> participants of the summit, it should now be shared with a broader audience!
> 
> A quick introduction to the topic of *rebuilders*: Open source projects usually
> offer compiled packages, which is great in case I don't want to compile every
> installed application. However it raises the questions if distributed packages
> are what they claim. This is where *reproducible builds* and *rebuilders* join
> the stage. The *rebuilders* try to recreate offered binaries following the
> upstream build process as close as necessary.
> 
> To make the results accessible, store-able and create tools around them, they
> should all follow the same schema, hello *reproducible builds verification
> format* (rbvf). The format tries to be as generic as possible to cover all 
> open
> source projects offering precompiled source code. It stores the rebuilder
> results of what is reproducible and what not.
> 
> Rebuilders should publish those files publicly and sign them. Tools then 
> collect
> those files and process them for users and developers.
> 
> Ideally multiple institutions spin up their own rebuilders so users can trust
> those rbuilders and only install packages verified by them.
> 
> The format is just a draft, please join in and share you thoughts. I'm happy 
> to
> extend, explain and discuss all the details. Please find it here[0].
> 
> As a proof of concept, there is already a *collector* which compares upstream
> provided packages of Archlinux and OpenWrt with the results of rebuilders.
> Please see the frontend here[1].
> 
> If you already perform any rebuilds of your project, please contact me on how 
> to
> integrate the results in the collector!

I'm not sure how relevant this is but I can mention what the Yocto
Project is doing. We're not a traditional distro in that we don't ship
binaries. We do however care a lot about getting consistent results.

Whilst we don't ship binaries, we do cache build artefacts in our
"sstate". This means something can be reused if it's in the cache rather
than building it again.

We thought long and hard about how to prove reproducibility and we
ended up adding a new "selftest" to our autobuilder:

http://git.yoctoproject.org/cgit.cgi/poky/tree/meta/lib/oeqa/selftest/cases/reproducible.py

This takes one set of artefacts from our "sstate" binary cache (if
available) and builds another set locally, both in different build
directories. It then compares the build results and flags up
differences, saving the diffoscope html output and the differing
binaries somewhere we can analyse them.
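The comparison step can be sketched as a small script (directory layout and file names invented here) that walks two build trees and lists any files that differ, the differing pairs being what would then be handed to diffoscope:

```shell
#!/bin/sh
# Sketch: compare two build output trees file-by-file and report
# the paths that differ. Stand-in trees: one file matches, one doesn't.
a=$(mktemp -d); b=$(mktemp -d)
echo "same payload"  > "$a/libfoo.so"; echo "same payload"  > "$b/libfoo.so"
echo "build id 1111" > "$a/app";       echo "build id 2222" > "$b/app"

report_diffs() {
    ( cd "$1" && find . -type f ) | sort | while read -r f; do
        cmp -s "$1/$f" "$2/$f" || echo "DIFFERS: $f"
    done
}

report_diffs "$a" "$b"
```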

The sstate can be built and come from any worker in our cluster so many
different host distros and in arbitrary paths.

We have these tests passing for our deb and ipk package backends for
the 'core-image-minimal', 'core-image-sato' and 'core-image-full-cmdline'
images. Sato is an X11 based desktop target.

I'd say these count as "rebuilds", even if we are testing them against
ourselves and not worrying about signing (sstate can be signed but it's
not particularly relevant here as we trust ourselves). Not sure they're
useful from a statistics perspective but we are running this quite
heavily, day in, day out.

Cheers,

Richard








Re: [rb-general] Please review the draft for September's report

2019-10-03 Thread Richard Purdie
On Thu, 2019-10-03 at 11:20 +0100, Chris Lamb wrote:
> Hi all,
> 
> Please review the draft for September's Reproducible Builds report:
> 
>   https://reproducible-builds.org/reports/2019-09/?draft
> 
> … or, via the Git respository itself:
> 
>   
> https://salsa.debian.org/reproducible-builds/reproducible-website/blob/master/_reports/2019-09.md
> 
> I intend to publish it no earlier than:
> 
>   $ date -d 'Sat, 05 Oct 2019 18:30:00 +'
> 
>   https://time.is/compare/1830_05_Oct_2019_in_UTC
> 

A while ago I mentioned here that the Yocto Project is reproducible, at
least for its major components. I was asked something like "ok, prove
it". It's taken us a while to get back to it, but we now have.

I'm not a regular contributor so don't have access to the above however
I am pleased to be able to say that:

"""
The Yocto Project[1] has now implemented and is regularly running
tests[2] on its reproducibility. These can be seen as a line item in
our "oe-selftest" test runs on our autobuilder[3]. Right now these are
for our minimal images but the tests are generic, available for our
users to use against their own images and our plan is to extend this to
cover all our core recipes over the next months. This functionality
will be in our upcoming 3.0 release.

[1] https://www.yoctoproject.org
[2] http://git.yoctoproject.org/cgit.cgi/poky/tree/meta/lib/oeqa/selftest/cases/reproducible.py
[3] https://autobuilder.yoctoproject.org/typhoon/#/console
"""

I was thinking this might be worth a mention! :)

Cheers,

Richard

___
rb-general@lists.reproducible-builds.org mailing list

To change your subscription options, visit 
https://lists.reproducible-builds.org/listinfo/rb-general.

To unsubscribe, send an email to 
rb-general-unsubscr...@lists.reproducible-builds.org.

Re: [rb-general] Reproducible build of a GCC cross-compiler?

2019-06-12 Thread richard . purdie
On Wed, 2019-06-12 at 14:42 +0200, Sebastian Huber wrote:
> Hello Richard,
> 
> On 07/06/2019 14:42, richard.pur...@linuxfoundation.org wrote:
> > > How did you propagate the settings to the newly built cross
> > > compiler
> > > invoked to build the target libraries?
> > We already pass in various compiler flags when building things so
> > we
> > included the right options for reproducibility as well.
> > 
> > > It would be nice if I could build the cross compiler and the
> > > target
> > > libraries at an arbitrary location and then install the binaries
> > > and
> > > sources at some prefix. If I debug an application it will find
> > > the
> > > Newlib sources in the prefix automatically.
> > Yocto Project's builds and SDKs can be run or installed at
> > arbitrary
> > locations without changing the build output result. We also have
> > fairly
> > good debug symbol/debug source handling for the target (split into
> > separate but linked objects which can be optionally installed).
> 
> do you have a log of the cross compiler build which shows the GCC 
> configure command line? An example for a proven reproducible build
> would be a big help for me.

There is a lot more to it than that; I'd suggest you make a Yocto
Project test build following our quick start.

Cheers,

Richard



Re: [rb-general] Reproducible build of a GCC cross-compiler?

2019-06-07 Thread richard . purdie
On Fri, 2019-06-07 at 14:29 +0200, Sebastian Huber wrote:
> On 07/06/2019 13:54, Richard Purdie wrote:
> > On Fri, 2019-06-07 at 13:41 +0200, Sebastian Huber wrote:
> > > Hello,
> > > 
> > > has someone tried to do a reproducible build of a GCC cross-
> > > compiler?
> > > I
> > > am not interested in a native GCC. I guess it needs some tweaks
> > > in
> > > the
> > > build system of GCC, e.g.
> > > 
> > > * use of -frandom-seed=???
> > > 
> > > * export SOURCE_DATE_EPOCH=???
> > > 
> > > * -ffile-prefix-map=???=???
> > > 
> > > * etc.
> > > 
> > > It seems there is no magic configuration option in
> > > 
> > > https://gcc.gnu.org/install/configure.html
> > > 
> > > to enable a reproducible build.
> > 
> > The Yocto Project builds and uses gcc cross compilers and we
> > believe
> > our builds to be reproducible when configured to be.
> > 
> > We do use SOURCE_DATE_EPOCH and -ffile-prefix-map=.
> 
> Interesting, a "-ffile-prefix-map=." is not documented. What I find
> is this:
> 
> https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html#Debugging-Options
> 
> "-fdebug-prefix-map=old=new

Sorry, "." was just the end of the sentence. I just mean that we use
that commandline option to ensure the output is reproducible.

> How did you propagate the settings to the newly built cross compiler 
> invoked to build the target libraries?

We already pass in various compiler flags when building things so we
included the right options for reproducibility as well.

> It would be nice if I could build the cross compiler and the target 
> libraries at an arbitrary location and then install the binaries and 
> sources at some prefix. If I debug an application it will find the 
> Newlib sources in the prefix automatically.

Yocto Project's builds and SDKs can be run or installed at arbitrary
locations without changing the build output result. We also have fairly
good debug symbol/debug source handling for the target (split into
separate but linked objects which can be optionally installed).

Cheers,

Richard






Re: [rb-general] Reproducible build of a GCC cross-compiler?

2019-06-07 Thread Richard Purdie
On Fri, 2019-06-07 at 13:41 +0200, Sebastian Huber wrote:
> Hello,
> 
> has someone tried to do a reproducible build of a GCC cross-compiler? 
> I 
> am not interested in a native GCC. I guess it needs some tweaks in
> the 
> build system of GCC, e.g.
> 
> * use of -frandom-seed=???
> 
> * export SOURCE_DATE_EPOCH=???
> 
> * -ffile-prefix-map=???=???
> 
> * etc.
> 
> It seems there is no magic configuration option in
> 
> https://gcc.gnu.org/install/configure.html
> 
> to enable a reproducible build.

The Yocto Project builds and uses gcc cross compilers and we believe
our builds to be reproducible when configured to be.

We do use SOURCE_DATE_EPOCH and -ffile-prefix-map=.
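As a hedged sketch of what this looks like at the compiler-invocation level (the variable names, paths and flag values below are invented for illustration, not Yocto's actual configuration): the build exports a fixed timestamp and maps the build directory out of embedded paths. It prints the command rather than running gcc, so no toolchain is needed to inspect it:

```shell
#!/bin/sh
# Sketch of the reproducibility knobs on a compiler invocation.
# BUILDDIR and the flag values are hypothetical examples.
BUILDDIR="/home/builder/tmp/work"     # machine-specific build location

# Pin the timestamp that __DATE__/__TIME__ and friends would embed:
export SOURCE_DATE_EPOCH=1612051200

# Map the build path out of debug info, and fix gcc's random seed
# (used for generated symbol names) to a stable value:
REPRO_FLAGS="-ffile-prefix-map=${BUILDDIR}=/usr/src/debug -frandom-seed=fixed"

echo "gcc ${REPRO_FLAGS} -g -O2 -c main.c -o main.o"
```

With those in place, the same source compiled in two different directories should yield the same object file, which is the property the tests below verify at image scale.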

We're in the final stages of putting together tests which actually test
and prove that in our CI setup, I'm looking forward to being able to
announce those here soon.

Cheers,

Richard




Re: [rb-general] Definition of "reproducible build"

2019-02-14 Thread Richard Purdie
On Thu, 2019-02-14 at 12:25 -0800, John Gilmore wrote:
> > I like the idea, however what you are proposing is basically a new
> > distro/fork, where you would remove all unreproducible packages, as
> > every distro still has some unreproducible bits.
> 
> I suggest going the other way -- produce a distro that is "80%
> reproducible" from its source code USB stick and its binary boot USB
> stick.  You'd already have the global reproducibility structure and
> scripts written and working, even before the last packages are
> individually reproducible.  That global reproducibility tech would be
> immediately adoptable by any distro.  The output of the reproduction
> scripts would be a bootable binary that does boot and run!  It would
> still have differences from the "release master" bootable binary, but
> those differences would be irrelevant to the functioning of the binary,
> and would be clearly visible with "diff -r".
> 
> (For one thing, this would cause the distros to actually produce a
> "source code USB stick image".  Currently most of them don't.  They
> instead require you to download thousands of separate source packages or
> tarballs, and have no scripts readily visible for building those into a
> bootable binary image.)
> 
> After accomplishing that, then the focus could go on the 20% (or 10% or
> whatever) of packages that aren't yet reproducible.  And, people making
> small distros could cut out such packages to make a 100% reproducible
> distro, as Holger suggested.

FWIW, the Yocto Project supports that today in the form of our
"build-appliance" images. They contain all the sources and tools to rebuild
the image.

We don't go for full reproducibility "out of the box" at a timestamp
level, but you can configure the build to do that. Even out of the box
we're way better than 80% though!

Cheers,

Richard



