On 20/09/15 20:43, Jérémy Bobbio wrote:
> Ximin Luo:
>> With our current .buildinfo setup, the above process is more
>> complicated, because we *only* store hashes of the binary build
>> environment.
> [..]
> The idea to put a hash of the binary package in the
> Build-Environment is a late addition to the original idea. 

Sure, I realised after I posted that the binary hashes hadn't been implemented 
yet. That's a side issue though.

> In any case, we currently don't have code to store any hash of the
> Build-Environment. If we wanted to store hashes of binary packages, then
> we would need to have them in /var/lib/dpkg/status and it's not done
> yet, even if Guillem said this would be a good thing to have.

`apt-cache show [pkg]` will list hashes of binaries. Is there some reason we 
can't just do this?
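
To make that concrete, here is a rough Python sketch of pulling the hashes out of `apt-cache show` output. It assumes the output is in the usual control-file stanza format and that stanzas coming from a Packages index carry a `SHA256` field; stanzas without one are skipped. The function names are made up for illustration and are not part of any existing tool.

```python
import subprocess


def parse_stanzas(text):
    """Split control-file style text into a list of {field: value} dicts."""
    stanzas, current, last = [], {}, None
    for line in text.splitlines():
        if not line.strip():              # a blank line ends a stanza
            if current:
                stanzas.append(current)
            current, last = {}, None
        elif line[0] in " \t":            # continuation of the previous field
            if last is not None:
                current[last] += "\n" + line.strip()
        else:
            key, _, value = line.partition(":")
            last = key.strip()
            current[last] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


def binary_hashes(show_output):
    """Map (package, version) -> SHA256 for stanzas that carry a hash."""
    wanted = ("Package", "Version", "SHA256")
    return {
        (s["Package"], s["Version"]): s["SHA256"]
        for s in parse_stanzas(show_output)
        if all(k in s for k in wanted)
    }


def apt_cache_show(pkg):
    """Run `apt-cache show PKG`; only usable on a system that has apt."""
    result = subprocess.run(
        ["apt-cache", "show", pkg], check=True, capture_output=True, text=True
    )
    return result.stdout
```

On a Debian system, `binary_hashes(apt_cache_show("hello"))` would then give one entry per version that apt knows a hash for.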

>> Currently, to run a DDC test, we would have to read the buildinfo
>> file, find the hashes of the binary build-deps, lookup the source
>> packages that corresponds to these hashes, find a different binary
>> build-deps for these hashes, and run our DDC-checker. This takes many
>> round trips, and contacting external infrastructure that isn't
>> necessary.
> You would not need to lookup the source packages using hashes. Using
> package and version gives you enough info to retrieve a specific source
> package from the archive.
>> If .buildinfo files contained source hashes, the DDC-checker could
>> work more directly, without requiring a remote repository of source
>> hash <-> binary hash mappings.
> I'm interested in `.buildinfo` in the context of the Debian project. The
> Debian archive is designed to be immutable. A specific version of a
> package will always correspond to the same source and binary files.
> So I don't see why one would do complex “source hash - binary hash
> mapping” when you can just rely on what is in the archive (and what has
> been archived by snapshot.debian.org).

It's a good principle to design something to rely on as little external 
infrastructure as possible. Just because we already depend on some 
infrastructure doesn't mean we need to add more dependencies on it.

Suppose someone did a source-only mirror in the future, because binaries are 
too costly to store. Then, the .buildinfo files (with source hashes) can still 
be used against this mirror.

The "intuitive meaning" that we would like a .buildinfo file to have is to 
describe immutably the input and the output. For testing and verification 
purposes, the input is the *source code* of the build-deps and of the target.

Getting reproducible builds to work is IMO fixing a massive bug that has 
existed for decades. Normally, when you run a fixed program against fixed 
input, what do you expect? Fixed output. Binary-hash-only .buildinfo files 
would only help to prove that this bug doesn't exist. *But that's not an 
incredible achievement.* Great, f(x) == g(y) when f == g and x == y, whoopee? 
We should aim higher, to be able to generate fixed-binary proofs for when only 
the source code (and not necessarily the binaries) matches.
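
A minimal sketch of the distinction I mean, in Python. The record layout and field names here are hypothetical; they are not the .buildinfo format, only a way to show which hashes are compared. Two builds count as DDC-style evidence only when the sources match, the outputs match, and the binary build-deps actually differed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildRecord:
    # Hypothetical fields, not the real .buildinfo layout.
    source_hashes: frozenset           # sources of the target and its build-deps
    builddep_binary_hashes: frozenset  # binary build-deps that were installed
    output_hashes: frozenset           # binary packages that were produced


def compare_builds(a, b):
    """Classify what a pair of build records tells us."""
    if a.source_hashes != b.source_hashes:
        return "incomparable"          # different input, nothing to conclude
    if a.output_hashes != b.output_hashes:
        return "mismatch"              # same source, different binaries
    if a.builddep_binary_hashes != b.builddep_binary_hashes:
        return "ddc-evidence"          # same source, diverse toolchain, same output
    return "plain-reproduction"        # f(x) == f(x), the weaker statement
```

With binary-hash-only records, only the last case can be checked directly; the "ddc-evidence" case needs the source hashes to be in the record.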

> If, by building something that ought to match a specific package version,
> you get a different result, then you will have to investigate in any case.
> Implementation-wise, getting the hash of the .dsc in the .buildinfo is
> going to be very tricky. dpkg does not know about what's available in
> the archive. It just knows about packages which are or were installed.

`apt-cache showsrc [pkg]` has the right information in there, but it's a bit 
messy. I need to test this without a deb-src line, though.
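
For illustration, a rough Python sketch of extracting the .dsc hash from `apt-cache showsrc`-style output. It assumes the stanza has `Package`, `Version` and a multi-line `Checksums-Sha256` field whose lines read "hash size filename", and it says nothing about whether that output is available without a deb-src line. The function name is made up.

```python
def dsc_hashes(showsrc_output):
    """Map (source package, version) -> SHA256 of its .dsc file."""
    found = {}
    fields = {}
    in_sha256 = False

    def flush():
        # Record the stanza only if all three pieces were seen.
        if all(k in fields for k in ("Package", "Version", "dsc")):
            found[(fields["Package"], fields["Version"])] = fields["dsc"]
        fields.clear()

    for line in showsrc_output.splitlines():
        if not line.strip():              # a blank line ends a stanza
            flush()
            in_sha256 = False
        elif line[0] in " \t":            # continuation line of a multi-line field
            parts = line.split()
            if in_sha256 and len(parts) == 3 and parts[2].endswith(".dsc"):
                fields["dsc"] = parts[0]
        else:
            key, _, value = line.partition(":")
            in_sha256 = key == "Checksums-Sha256"
            if key in ("Package", "Version"):
                fields[key] = value.strip()
    flush()
    return found
```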


