Re: Three bytes in a zip file

2024-03-12 Thread Larry Doolittle
Chris -

On Wed, Mar 13, 2024 at 01:01:47AM +, Chris Lamb wrote:
> >   TZ=UTC zip -X --latest-time "$zipfile" fab/*
> >   # Note the -X flag; to be pedantic about timestamps,
> >   # that means you should unpack with TZ=UTC unzip "$zipfile".  See
> >   # 
> > https://lists.reproducible-builds.org/pipermail/rb-general/2023-April/002927.html
> Ah, interesting! Does that -X mean that
>   https://reproducible-builds.org/docs/archives/
> ... is incomplete?

Yes.  The -X isn't needed, sometimes, and then when you least expect it, it is.
Classic reproducible-builds gotcha.

> I'm happy to update this document myself if need be. :)

Go for it.

  - Larry


Re: Three bytes in a zip file

2024-03-12 Thread Chris Lamb
Hey,

>   echo "Forcing timestamp $SOURCE_DATE_EPOCH"
>   touch --date="@$SOURCE_DATE_EPOCH" fab/*
>   TZ=UTC zip -X --latest-time "$zipfile" fab/*
>   # Note the -X flag; to be pedantic about timestamps,
>   # that means you should unpack with TZ=UTC unzip "$zipfile".  See
>   # 
> https://lists.reproducible-builds.org/pipermail/rb-general/2023-April/002927.html

Ah, interesting! Does that -X mean that

  https://reproducible-builds.org/docs/archives/

... is incomplete? I'm happy to update this document myself if need be. :)
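
For illustration only, the same normalization can be sketched with Python's standard zipfile module instead of Info-ZIP. This is a minimal sketch under stated assumptions, not what anyone in this thread uses: deterministic_zip is a hypothetical helper, the fixed 0644 mode is an arbitrary choice, and the epoch is expected to come from SOURCE_DATE_EPOCH. It writes entries in sorted order with one UTC-derived timestamp and sets no timestamp or uid/gid extra fields, so two runs over the same contents should yield identical bytes.

```python
import io
import time
import zipfile

# 1980-01-01 00:00:00 UTC: zip (DOS) timestamps cannot represent earlier dates.
DOS_EPOCH_MIN = 315532800


def deterministic_zip(files, epoch):
    """Return zip bytes for a {name: bytes} mapping with normalized metadata.

    Hypothetical helper for illustration; `epoch` is seconds since 1970 (UTC).
    """
    # Zip timestamps carry no time zone, so derive them from UTC explicitly
    # instead of relying on the local zone of the build machine.
    date_time = time.gmtime(max(epoch, DOS_EPOCH_MIN))[:6]
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in sorted(files):  # fixed member order
            info = zipfile.ZipInfo(name, date_time=date_time)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.create_system = 3  # always record "Unix", whatever the host
            info.external_attr = 0o644 << 16  # fixed permissions (assumption)
            zf.writestr(info, files[name])
    return buf.getvalue()
```

With SOURCE_DATE_EPOCH exported, a caller could pass int(os.environ["SOURCE_DATE_EPOCH"]) as the epoch argument.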


Best wishes,

-- 
  o
⬋   ⬊  Chris Lamb
   o o reproducible-builds.org 
⬊   ⬋
  o


Reproducible Arch Linux in 2024/Q1 (irregular status update)

2024-03-12 Thread kpcyrd

hello,

since there's currently a lengthy discussion about the relevance of 
build paths in reproducible builds, I took some time to do a status 
update on the implementation of reproducible builds in Arch Linux.


There has been a system in place since late 2017 (run by 
reproducible-builds.org) that builds Arch Linux packages twice and 
compares the resulting binary packages. Since early 2020 there's also a 
system that tries to 100%-match the official packages (as distributed 
to and installed by users) using buildinfo files:


https://reproducible.archlinux.org

The last few years have been fairly event-less (aka things are going 
mostly fine):


- In February 2022 there was a regression that caused the pacman 
packaging tools to no longer be reproducible in many cases
- In August 2023 I submitted a fix that was accepted by pacman upstream
- The fix was released in Arch Linux in February 2024, with pacman 6.0.2-9

I also mentioned this fix the last time I wrote about reproducible Arch 
Linux, in August 2023:


https://lists.reproducible-builds.org/pipermail/rb-general/2023-August/003059.html

The email back then mentions 86% reproducible, while Arch Linux is now 
overall 88.8% reproducible (hopefully reaching 89% soon).


There are also other groups rebuilding Arch Linux. In total, this is the 
list of known instances, each with its own build servers:


- https://reproducible.archlinux.org/
- https://reproducible.crypto-lab.ch/
- https://wolfpit.net/rebuild/
- https://rebuilder.pitastrudl.me/
- https://r-b.engineering.nyu.edu/

As of today, it's not yet possible to install a fully reproducible Arch 
Linux system, however out of the packages used in 
docker.io/library/archlinux there's only one unreproducible package left 
(according to the instance at reproducible.archlinux.org):


- libcap 2.69-3

It was built with pacman 6.0.2-8 (prior to the fix).

To make the connection back to the original topic (how much do 
build paths matter): there are still a few other problems Arch Linux 
struggles with that can't be solved with a normalized build path. 
Besides the issue mentioned above, this is the current list of top root 
causes of Linux software not being reproducible in practice (in no 
particular order):


1) Build outputs of ghc, the Haskell compiler, are currently not 
deterministic with concurrency enabled. This bug has a lot of impact on 
the total-percentage number of Arch Linux, but it's possible to install 
and use Arch Linux without having any Haskell packages installed. 
https://gitlab.haskell.org/ghc/ghc/-/issues/12935


2) Build outputs of cgo (which Arch Linux uses for most Go packages) 
often have a mismatching `GO BUILDID`.


3) Timestamps embedded in .jar files (unreproducible zip files are a big 
thing for some reason).


4) Missing dependency lockfiles (Cargo.lock, yarn.lock, ...). Some 
distros like Debian do not make use of these files, but in Arch Linux 
they are used to declare a resolved dependency tree to another 
ecosystem, like crates.io or npm. If this file is missing, the 
dependency tree might get resolved at build time (which is not 
guaranteed to match the versions you'd get when resolving dependencies 
in a week or three).


5) Binaries with the build time embedded in them (as part of their 
version output or user-agent strings).


6) Binaries with the hostname of the build server embedded in them.

7) Binaries with the Linux kernel version of the build server embedded 
in them.


8) Ordering issues, e.g. a list of strings being embedded in a different 
order each build.


9) Documentation with the build time embedded in it.

Most unreproducible packages fall into one of those buckets. The only 
build-path-related problem in Arch Linux is randomized filenames or 
directory names that sometimes get embedded into the binary.
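
As a rough illustration of how buckets 5), 8) and 9) are usually addressed, here is a minimal Python sketch. It assumes the build honors the SOURCE_DATE_EPOCH environment variable; build_date and render_plugin_list are hypothetical names made up for this example and are not taken from any Arch Linux package.

```python
import os
import time


def build_date(environ=os.environ):
    """Return the date string to embed, preferring SOURCE_DATE_EPOCH."""
    epoch = environ.get("SOURCE_DATE_EPOCH")
    # Fall back to the current time only when no fixed epoch was provided.
    seconds = int(epoch) if epoch is not None else int(time.time())
    # Format in UTC so the build server's time zone does not leak in.
    return time.strftime("%Y-%m-%d", time.gmtime(seconds))


def render_plugin_list(plugins):
    """Serialize names in sorted order instead of iteration order."""
    return "\n".join(sorted(plugins))
```

A rebuild that exports the same SOURCE_DATE_EPOCH then embeds the same date, and the list no longer depends on set or directory iteration order.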


Anyway, cheers
kpcyrd


Re: Two questions about build-path reproducibility in Debian

2024-03-12 Thread Vagrant Cascadian
On 2024-03-12, Holger Levsen wrote:
> On Mon, Mar 11, 2024 at 06:24:22PM +, James Addison via rb-general wrote:
>> Please find below a draft of the message I'll send to each affected 
>> bugreport.
>
> looks good to me, thank you for doing this!
>  
>> Note: I confused myself when writing this; in fact Salsa-CI reprotest _does_
>> continue to test build-path variance, at least until we decide otherwise.
>
> this is in fact a bug and should be fixed with the next reprotest release.

That is not a reprotest bug, but an infrastructure issue for the
debian-specific salsa-ci configuration. Reprotest is not a
debian-specific tool.

Reprotest should continue to vary build paths by default; reprotest
historically and currently defaults to enabling all variations and
making an exception does not seem worth the opinionated change of
behavior. By design, reprotest makes it easy to configure which
variations to enable and disable as needed.


live well,
  vagrant




Reproducible Builds in February 2024

2024-03-12 Thread Chris Lamb

o
  ⬋   ⬊  February 2024 in Reproducible Builds
 o o
  ⬊   ⬋  https://reproducible-builds.org/reports/2024-02/
o


Welcome to the February 2024 report from the Reproducible Builds
project. In our reports, we try to outline what we have been up to over
the past month as well as mentioning some of the important things
happening in software supply-chain security.

§


Reproducible Builds at FOSDEM 2024
----------------------------------

Core Reproducible Builds developer Holger Levsen presented in the main
track at FOSDEM [2] on Saturday 3rd February this year in Brussels,
Belgium. That wasn't the only talk related to Reproducible Builds,
however: please see our comprehensive FOSDEM 2024 news post [3] for the
full details and links.

 [2] https://fosdem.org/2024/
 [3] 
https://reproducible-builds.org/news/2024/02/08/reproducible-builds-at-fosdem-2024/

§


Maintainer Perspectives on Open Source Software Security
--------------------------------------------------------

Bernhard M. Wiedemann spotted that a recent report entitled "Maintainer
Perspectives on Open Source Software Security" [5] written by Stephen
Hendrick and Ashwin Ramaswami of the Linux Foundation [6] sports an
infographic which mentions that "56% of [polled] projects support
reproducible builds" [4].

 [4] 
https://www.linuxfoundation.org/hubfs/LF%20Research/MaintainerSecurityBPs_Infographic.pdf
 [5] 
https://www.linuxfoundation.org/research/maintainer-perspectives-on-security?hsLang=en
 [6] https://www.linuxfoundation.org/

§


Three new reproducibility-related academic papers
-------------------------------------------------

A total of three separate scholarly papers related to Reproducible
Builds have appeared this month:

"Signing in Four Public Software Package Registries: Quantity, Quality,
and Influencing Factors" [7] by Taylor R. Schorlemmer, Kelechi G. Kalu,
Luke Chigges, Kyung Myung Ko, Eman Abdul-Muhd, Abu Ishgair, Saurabh
Bagchi, Santiago Torres-Arias and James C. Davis (Purdue University [8],
Indiana, USA) is concerned with the problem that:

> Package maintainers can guarantee package authorship through
> software signing [but] it is unclear how common this practice is,
> and whether the resulting signatures are created properly. Prior
> work has provided raw data on signing practices, but measured single
> platforms, did not consider time, and did not provide insight on
> factors that may influence signing. We lack a comprehensive,
> multi-platform understanding of signing adoption and relevant
> factors. This study addresses this gap.

(arXiv [9], full PDF [10])

 [ 7] https://arxiv.org/abs/2401.14635
 [ 8] https://www.purdue.edu/
 [ 9] https://arxiv.org/abs/2401.14635
 [10] https://arxiv.org/pdf/2401.14635.pdf

"Reproducibility of Build Environments through Space and Time" [11] by
Julien Malka, Stefano Zacchiroli and Théo Zimmermann (Institut
Polytechnique de Paris, France [12]) addresses:

> [The] principle of reusability […] makes it harder to reproduce
> projects’ build environments, even though reproducibility of build
> environments is essential for collaboration, maintenance and
> component lifetime. In this work, we argue that functional package
> managers provide the tooling to make build environments reproducible
> in space and time, and we produce a preliminary evaluation to
> justify this claim.

The abstract continues with the claim that "Using historical data, we
show that we are able to reproduce build environments of about 7
million Nix [13] packages, and to rebuild 99.94% of the 14 thousand
packages from a 6-year-old Nixpkgs revision." (arXiv [14], full PDF
[15])

 [11] https://arxiv.org/abs/2402.00424
 [12] https://www.ip-paris.fr/
 [13] https://nixos.org/
 [14] https://arxiv.org/abs/2402.00424
 [15] https://arxiv.org/pdf/2402.00424.pdf

"Options Matter: Documenting and Fixing Non-Reproducible Builds in
Highly-Configurable Systems" [16] by Georges Aaron Randrianaina, Djamel
Eddine Khelladi, Olivier Zendra and Mathieu Acher (Inria centre at
Rennes University, France [17]):

> This paper thus proposes an approach to automatically identify
> configuration options causing non-reproducibility of builds. It
> begins by building a set of builds in order to detect
> non-reproducible ones through binary comparison. We then develop
> automated techniques that combine statistical learning with
> symbolic reasoning to analyze over 20,000 configuration options.
> Our methods are designed to both detect options causing
> non-reproducibility, and remedy non-reproducible configurations,
> two tasks that are challenging and costly to perform manually.

(HAL Portal [18], full PDF [19])

 [16] https://inria.hal.science/hal-04441579v2
 

Re: Two questions about build-path reproducibility in Debian

2024-03-12 Thread Holger Levsen
On Mon, Mar 11, 2024 at 06:24:22PM +, James Addison via rb-general wrote:
> Please find below a draft of the message I'll send to each affected bugreport.

looks good to me, thank you for doing this!
 
> Note: I confused myself when writing this; in fact Salsa-CI reprotest _does_
> continue to test build-path variance, at least until we decide otherwise.

this is in fact a bug and should be fixed with the next reprotest release.


-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

Historians have a word for Germans who joined the Nazi party, not because they
hated Jews,  but out of hope for  restored patriotism,  or a sense of economic
anxiety,  or a hope  to preserve their  religious values,  or dislike of their
opponents,  or raw  political opportunism,  or convenience,  or ignorance,  or 
greed.
That word is "Nazi". Nobody cares about their motives anymore.




Re: Two questions about build-path reproducibility in Debian

2024-03-12 Thread James Addison via rb-general
Hi folks,

On Wed, 6 Mar 2024 at 01:04, James Addison  wrote:
> [ ... snip ...]
>
> The Debian bug severity descriptions[1] provide some more nuance, and that
> reassures me that wishlist should be appropriate for most of these bugs
> (although I'll inspect their contents before making any changes).

Please find below a draft of the message I'll send to each affected bugreport.

Note: I confused myself when writing this; in fact Salsa-CI reprotest _does_
continue to test build-path variance, at least until we decide otherwise.

--- BEGIN DRAFT ---
Because Debian builds packages from a fixed build path, customized build paths
are _not_ currently evaluated by the 'reprotest' utility in Salsa-CI, or during
package builds on the Reproducible Builds team's package test infrastructure
for Debian[1].

This means that this package will pass current reproducibility tests; however
we still believe that source code and/or build steps embed the build path into
binary package output, making it more difficult than necessary for independent
consumers to confirm whether their local compilations produce identical binary
artifacts.

As a result, this bugreport will remain open and be assigned the 'wishlist'
severity[2].

...

[1] - https://tests.reproducible-builds.org/debian/reproducible.html

[2] - https://www.debian.org/Bugs/Developer#severities
--- END DRAFT ---
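
To illustrate what "embedding the build path into binary package output" means in practice, here is a minimal Python sketch that scans an unpacked build tree for a literal build path. It is not part of reprotest or of the Debian test infrastructure; files_embedding_path is a hypothetical helper, and it only sees uncompressed bytes, so paths inside compressed members would be missed.

```python
from pathlib import Path


def files_embedding_path(root, build_path):
    """Return sorted relative paths of files under root containing build_path.

    Hypothetical helper for illustration: a plain byte search, nothing more.
    """
    needle = str(build_path).encode()
    hits = []
    for path in Path(root).rglob("*"):
        # Read each regular file whole; fine for a sketch, not for huge trees.
        if path.is_file() and needle in path.read_bytes():
            hits.append(path.relative_to(root).as_posix())
    return sorted(hits)
```

Any file it reports is a candidate for differing between two builds that were run from different directories.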