Some of these dreams and the outlines of these concepts have been around quite a bit longer than this year, even.  I think some differential diagnosis about what makes this draft different, and why it makes the choices it does, would be useful.

Some things I'd like to see identified and explicitly discussed more frequently in this concept space:

- What's the "primary key"?  In other words, how can I meaningfully expect to identify this one attestation record, or this one build instruction document?

- What are the "secondary keys" I could plausibly expect to select on if I have a zillion of these, and want to find those that should or should not align in results?

- What parts of this info do we expect to be useful, and why?  (What user story caused a certain piece of info to seem relevant and actionable enough to include?)

- What things *could* we imagine someone proposing to put in this info which we might reject because we don't believe it would be useful, and why?
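To make the "key" questions concrete, here is a minimal sketch of what a rebuild attestation record might look like with its identifying fields called out explicitly.  All field names here are my invention for illustration, not taken from the rbvf draft:

```python
# Hypothetical rebuild attestation record.  Every field name below is
# illustrative only -- none are taken from the actual rbvf draft.
record = {
    # Candidate "primary key": just enough to pin down exactly one
    # rebuild attempt of exactly one artifact.
    "artifact_sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "rebuilder_id": "rebuilder.example.org",
    # Candidate "secondary keys": fields one would select or group on
    # when comparing a zillion records from different rebuilders.
    "distribution": "archlinux",
    "package": "hello",
    "version": "2.10-3",
    # The payload: the actual result being attested.
    "reproducible": True,
}

# A compound primary key: which artifact, as rebuilt by whom.
primary_key = (record["artifact_sha256"], record["rebuilder_id"])
```

Two records from different rebuilders that share the same `artifact_sha256` but disagree on `reproducible` are exactly the interesting case such keys let you find.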

The motivations of "a generic way to compare results" are good.  But good intentions can only carry us so far.  These four things are some of the first considerations I have when looking at a format proposal.  Without some thought about the "keys", I don't know how it will deliver on "comparability" at scale.  Without some meta-documentation of not just the data that goes _in_, but also the kind of data that _doesn't_, I worry that the spec will become a kitchen sink, sopping up more data with time regardless of its relevance, and correspondingly becoming less and less useful over time.

I don't know if these are the only four questions to ask, nor will I claim they are perfect, but they're some of the first things that come to my mind as heuristics, and I share them in the hope that they can be a useful whetstone for someone else's thoughts.



As an incidental aside, I think what's currently listed in that github link as "origin_uri" may be mistaken in its conception of "URI".  The examples are such things as "http://ftp.us.debian.org/" and "https://download.docker.com/", and I'm sure these are _locations_, not _identifiers_ -- URLs, not URIs.

And I would question (begging forgiveness from anyone who knows my refrain already) whether "locations" as any sort of primary key are a sturdy idea to build upon.  They're terribly centralized, and they provide very little insurance against mutability events, which can make every other document that refers to them instantly useless.  Content-addressing may have some potential to address this; git (at least in concept) has shown us the way...
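The content-addressing idea above can be sketched in a few lines: derive the identifier from the bytes themselves, so it stays valid no matter which mirror serves them or whether the original URL ever changes.  This is only an illustration of the principle, not anything from the rbvf draft:

```python
import hashlib


def content_address(data: bytes) -> str:
    """Derive a location-independent identifier from the bytes themselves,
    in the spirit of git's object model: same content, same address,
    regardless of where (or whether) the content is currently hosted."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Two mirrors serving identical bytes yield the identical address,
# so documents referring to it never dangle when a mirror moves.
addr = content_address(b"example package contents")
```

A URL can still be carried alongside as a *hint* for where to fetch the bytes; the point is only that the hint should not double as the key.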



Cheers to all hopeful rebuilders :)


On 5/12/20 11:00 PM, Paul Spooren wrote:
Hi all,

at the RB Summit 2019 in Marrakesh there were some intense discussions about
*rebuilders* and a *verification format*. While first discussed only with
participants of the summit, it should now be shared with a broader audience!

A quick introduction to the topic of *rebuilders*: Open source projects usually
offer compiled packages, which is great in case I don't want to compile every
installed application. However, it raises the question of whether distributed
packages are what they claim to be. This is where *reproducible builds* and
*rebuilders* join the stage. The *rebuilders* try to recreate offered binaries,
following the upstream build process as closely as necessary.

To make the results accessible, storable, and usable by tooling, they
should all follow the same schema: hello, *reproducible builds verification
format* (rbvf). The format tries to be as generic as possible to cover all open
source projects offering precompiled packages. It stores the rebuilder
results of what is reproducible and what is not.

Rebuilders should publish those files publicly and sign them. Tools then collect
those files and process them for users and developers.

Ideally, multiple institutions spin up their own rebuilders so users can trust
those rebuilders and only install packages verified by them.

The format is just a draft, so please join in and share your thoughts. I'm happy
to extend, explain and discuss all the details. Please find it here[0].

As a proof of concept, there is already a *collector* which compares upstream
provided packages of Archlinux and OpenWrt with the results of rebuilders.
Please see the frontend here[1].

If you already perform any rebuilds of your project, please contact me about how
to integrate the results into the collector!

Best,
Paul


[0]: https://github.com/aparcar/reproducible-builds-verification-format
[1]: https://rebuild.aparcar.org/

