Fwd: an update, vsobfs-Linux (Re: verifiable source-only bootstrap from scratch)

2024-05-30 Thread Bernhard M. Wiedemann via rb-general

Hi,

I'm forwarding an update on the
Verifiable Source-only Bootstrap From Scratch (VSOBFS) project.

I mirrored it to
https://www.zq1.de/~bernhard/mirror/rbzfp7h25zcnmxu4wnxhespe64addpopah5ckfpdfyy4qetpziitp5qd.onion/


 Forwarded Message 
Subject: an update, vsobfs-Linux (Re: verifiable source-only bootstrap 
from scratch)

Date: Thu, 30 May 2024 10:07:39 +0200
From: aho...@0w.se
To: Bernhard M. Wiedemann 

Dear Bernhard,

I share this update with you because of the interest and the friendly
attitude you showed earlier by mirroring the vsobfs tor web site.

The core data of the vsobfs project is immutable by design, but there 
are now some additions, among others a verifiable OS disk image with a 
Linux kernel instead of Minix-vmd (building upon, not skipping the 
previous steps on Minix-vmd).


The new image might feel more conventional than the Minix-vmd one,
depending on one's preferences. There is still no GNU or LLVM toolchain
there, and the kernel is remarkably old for this very reason: modern
kernels depend on specific compilers. At the same time, it should be
straightforward to build binutils and gcc there with tinycc (I did this
earlier, reproducibly, in a comparable Linux setup).

The update is presented at the same site as earlier:
http://rbzfp7h25zcnmxu4wnxhespe64addpopah5ckfpdfyy4qetpziitp5qd.onion

Kind regards,
 an


Re: Which conferences are folks attending these days?

2024-04-23 Thread Bernhard M. Wiedemann via rb-general

On 18/04/2024 15.45, Chris Lamb wrote:


To that end, what conferences are folks on this list still going to,
and, hopefully, still getting something from? I mean, there must be
some exceptions other than FOSDEM… :)


My list has become rather short:
rb conf (if within Europe)
openSUSE conf, Nuremberg
and a mini-openSUSE conf in Berlin, co-located with SUSECon


OBS/rpm & java-21 success

2024-03-31 Thread Bernhard M. Wiedemann via rb-general

Hi,

today I want to share with you two successes on our path to total 
reproducibility in openSUSE:


Through the persistence of my colleague Jan Zerebecki and the help of 
mls (SUSE's rpm maintainer) we made nice progress on

https://bugzilla.opensuse.org/show_bug.cgi?id=1148824
to finally normalize mtimes in official openSUSE Tumbleweed rpms.

Together with a workaround for
https://github.com/rpm-software-management/rpm/issues/2965
this allowed me to create rpms that are bit-identical to the ones pulled from 
build.opensuse.org and processed with rpm --delsign


Now everything that was reproducible in my QA-tests is also 
reproducible+verifiable in practice.



The other success is that I saw 2 bit-identical java-21-openjdk rpm 
builds, but only when both were done on 1-core VMs, so there might only 
be some raciness left. [1]

javadoc output still has an issue from filesystem-readdir-order.
We have a build-tool workaround for that in place [2]


Ciao
Bernhard M.


[1] 
https://rb.zq1.de/compare.factory-20240331/diffs/java-21-openjdk-compare.out
[2] 
https://github.com/bmwiedemann/openSUSE/blob/54e27e1/packages/_/_project/_config#L19-L20


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: Arch Linux minimal container userland 100% reproducible - now what?

2024-03-24 Thread Bernhard M. Wiedemann via rb-general

On 21/03/2024 21.38, kpcyrd wrote:
- libjpeg-turbo: this package contains a .jar file that is built by 
CMake and contains timestamps of the buildtime, but there's no way in 
CMake to pass --date to the jar executable to normalize this


You could use strip-nondeterminism for post-processing there.
For some reason it is reproducible in my openSUSE tests without us doing 
any extra steps.

https://ismypackagereproducibleyet.org/?pkg=libjpeg-turbo


- librsvg: the 3 rebuilders I've checked produced a .text section that is 6 bytes shorter (0x2dda2c vs 0x2dda26), I didn't investigate further yet, the diff is quite long because a lot of addresses are mismatching as a consequence 


My notes have https://gitlab.gnome.org/GNOME/librsvg/-/issues/1015 which 
turned out to be from pango mis-rendering text when font files were absent.


Ciao
Bernhard M.


Re: Why is not everything reproducible yet?

2024-02-14 Thread Bernhard M. Wiedemann via rb-general



On 14/02/2024 16.19, Santiago Torres-Arias wrote:

1. can we study the conflicting interests (i.e., above) that stop
reproducibility from happening?


Yes, that should be possible. The above summarized my experience from 
the 1000 patches and bug-reports I did and the interactions with various 
upstreams.

The links are public and recorded in the monthly reports
https://salsa.debian.org/reproducible-builds/reproducible-website/-/tree/master/_reports
and earlier weekly posts
https://salsa.debian.org/reproducible-builds/reproducible-website/-/tree/master/_blog/posts

I can probably provide more input for such a study.


2. Are misunderstandings about reproducibility getting in the way of
pushing for it (e.g., the notion that docker containers are
inherently reproducible)? Is the perfect the enemy of the good?
What notions of reproducibility exist and how can we build a
roadmap from the weak to the strong?


There are some.
One is the confusion with what we started to call "repeatable builds" = 
the ability to do a second build with the same explicit 
inputs. SBOMs help with repeatable builds, but if they become embedded 
in the build output, they can even hinder some side-benefits of 
reproducible builds, because every minor change in inputs now causes a 
change in output.


The other thing was
https://web.archive.org/web/20200807033032/https://blog.cmpxchg8b.com/2020/07/you-dont-need-reproducible-builds.html
that gained some anti-r-b mindshare, even though it neglected several 
important aspects. E.g. it mentions the risk of stealing source-code 
which obviously does not apply to FLOSS.




3. What other uses of r-b exist beyond the malicious toolchain example?
can we use them as leverage to increase interest in the space?


At a past r-b summit we collected
https://reproducible-builds.org/docs/buy-in/
e.g. in openSUSE we always pushed for some level of binary equivalence 
to do build-tree-pruning in our open-build-service to save build power, 
shorten rebuild time and save bandwidth for mirrors and users that do 
not need to update unchanged packages.
We also publish updates as delta-rpm-packages that would probably be more 
compact with fewer random variations.


The page also lists the QA aspect. I did find a dozen corruption bugs 
that went unnoticed for years.


e.g. https://gitlab.gnome.org/GNOME/libxslt/-/issues/37 had this 
memorable quote from upstream:
This was caused by an interesting bug in libxml2's streaming XPath engine. I'm still puzzled why it took so long to discover this issue. 


So for your study, you could find this link in _reports/2020-04.md

another corruption bug in
_reports/2023-10.md:* 
[`OpenRGB`](https://gitlab.com/CalcProgrammer1/OpenRGB/-/issues/3675) 
([corruption-related 
issue](https://gitlab.com/CalcProgrammer1/OpenRGB/-/merge_requests/2103))



One benefit not listed is that with r-b it is possible to say "version 
1.2.3 has hash abcdef" and you can provide a signature of the file, 
without uploading the file itself. With content-addressable storage such 
as IPFS, you can then also link to such an artifact and anyone else can 
provide the correct file.
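
A minimal sketch of the "version 1.2.3 has hash abcdef" idea: a rebuilt or mirrored artifact is checked against a published digest. The function name `matches_published_digest` and the sample bytes are made up for this illustration, SHA-256 is an assumed choice of hash, and the signature over the digest is left out.

```python
import hashlib


def matches_published_digest(artifact: bytes, published_hex: str) -> bool:
    """Return True if the artifact hashes to the published SHA-256 digest."""
    return hashlib.sha256(artifact).hexdigest() == published_hex.lower()


# Hypothetical stand-in for the bytes of a reproducibly built release file.
release = b"example-package 1.2.3\n"

# The publisher only needs to announce (and sign) this short string.
published = hashlib.sha256(release).hexdigest()

# Anyone who rebuilds or mirrors the file can check it independently.
rebuilt = b"example-package 1.2.3\n"
tampered = b"example-package 1.2.3 \n"
print(matches_published_digest(rebuilt, published))
print(matches_published_digest(tampered, published))
```

This only works when the build is bit-reproducible: a single differing byte, such as an embedded timestamp, changes the digest.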


e.g. in
http://bafybeiezodttpdsrhy7gj7zuzklbs3exh42a4ezorsepnn74ar2gkicujy.ipfs.cf-ipfs.com/
if we had reproducible ISOs, I could build and sign them in a 
low-bandwidth place but build+upload from another.




Ciao
Bernhard M.




Re: Potential issues with the snippet to parse SOURCE_DATE_EPOCH in C

2024-01-21 Thread Bernhard M. Wiedemann via rb-general



On 19/01/2024 21.03, Chris Lamb wrote:

Was there any reason to reject >ULONG_MAX? I'm touching this code,
and don't see a reason for it; it looks very arbitrary; especially
since some systems can have 32-bit long, but 64-bit time_t. Should I
just drop that check, or keep it? And why?


There is another issue with ULONG_MAX: with a 32-bit long it allows 
representing timestamps up to 2106, but a 32-bit time_t is a signed int32, 
so it will roll over in 2038 back to 1901.
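
The three years mentioned here can be checked with a little arithmetic. The sketch below converts the limits of a signed and an unsigned 32-bit counter of seconds to UTC dates; the helper name `utc_from_epoch` is only for this illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def utc_from_epoch(seconds: int) -> datetime:
    """Convert seconds since the Unix epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


INT32_MAX = 2**31 - 1    # largest value of a signed 32-bit time_t
INT32_MIN = -(2**31)     # value a signed 32-bit counter wraps around to
UINT32_MAX = 2**32 - 1   # ULONG_MAX on systems with a 32-bit long

print(utc_from_epoch(INT32_MAX).date())   # last day a signed 32-bit time_t can hold
print(utc_from_epoch(INT32_MIN).date())   # where it lands after the rollover
print(utc_from_epoch(UINT32_MAX).date())  # upper bound allowed by a ULONG_MAX check
```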


At some (not too far) point in time, programs and libraries compiling 
with glibc should start to build with -D_TIME_BITS=64 
-D_FILE_OFFSET_BITS=64 to get 64-bit time_t everywhere. [1]


Ciao
Bernhard M.

[1] 
https://www.reddit.com/r/linux/comments/19a95cl/today_is_y2k38_commemoration_day_t14/




Why is not everything reproducible yet?

2023-12-20 Thread Bernhard M. Wiedemann via rb-general

Sometimes people wonder:
Why is not everything reproducible yet?

And the general reason is that there are other interests that result in 
added non-determinism.

I collected some, with examples:



Performance (PGO, benchmarking, -march=native, parallelism/races)
 https://build.opensuse.org/request/show/1130552
 https://github.com/bmwiedemann/theunreproduciblepackage/tree/master/pgo


Simplicity (e.g. using random UUIDs instead of hashed inputs)
 https://github.com/ipxe/ipxe/pull/1082


Security (Signatures):
 https://bugzilla.opensuse.org/show_bug.cgi?id=1217690
 https://bugzilla.opensuse.org/show_bug.cgi?id=1208478
 https://bugzilla.opensuse.org/show_bug.cgi?id=1081723


Traceability of provenance (date+user+hostname):

https://github.com/bmwiedemann/theunreproduciblepackage/tree/master/timestamp


repeatable builds:
 https://github.com/rpm-software-management/rpm/issues/2343


Portability:
 https://github.com/ipxe/ipxe/pull/1082#issuecomment-1862899660
 - see also the code-monster we need to support SOURCE_DATE_EPOCH with 
sh on *NIX
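
To illustrate the "Simplicity" item in the list above: a random UUID is the easy choice, while an identifier derived from the build inputs stays the same across rebuilds. This is a sketch using Python's standard `uuid` module, not the code from the linked iPXE pull request; `NAMESPACE` and the input strings are made up for the example.

```python
import uuid

# Hypothetical namespace for this example; any fixed UUID would do.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.invalid")


def random_id() -> uuid.UUID:
    """The simple choice: a fresh random UUID on every build."""
    return uuid.uuid4()


def derived_id(build_inputs: str) -> uuid.UUID:
    """The reproducible choice: a name-based UUID hashed from the inputs."""
    return uuid.uuid5(NAMESPACE, build_inputs)


print(random_id() == random_id())                        # two builds disagree
print(derived_id("pkg-1.0") == derived_id("pkg-1.0"))    # two builds agree
print(derived_id("pkg-1.0") == derived_id("pkg-1.1"))    # changed input, new id
```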



Ciao
Bernhard M.




Re: scheme and lisp

2023-11-23 Thread Bernhard M. Wiedemann via rb-general



On 23/11/2023 17.53, Ludovic Courtès wrote:

The implementations are also very different: for instance, Chez
implements a native ahead-of-time compiler whereas Guile has bytecode
compilation plus just-in-time compilation.  Thus problems and solutions
for one implementation are unlikely to translate to other
implementations.


Thanks for that insight.


That said I’m surprised about Emacs, this needs more investigation…


Here is the diff from our emacs-29.1
https://rb.zq1.de/compare.factory-20230830/diffs/emacs-compare.out
it has 2 different variations in .eln and .pdmp files.
The .eln one probably comes from ASLR [1].
While the .pdmp has some sequences of very random bytes.

Ciao
Bernhard M.

[1] https://github.com/bmwiedemann/theunreproduciblepackage/tree/master/aslr




scheme and lisp

2023-11-23 Thread Bernhard M. Wiedemann via rb-general

Hi,

in openSUSE there are some packages that so far refuse to build 
reproducibly. The common theme around them is that they use scheme or 
lisp to produce binaries with a 'dump' command.


e.g. for scheme48 I extracted this reproducer:

pushd ~/rpmbuild/BUILD/scheme48-*/ps-compiler
../go -h 2000 -a batch <<- 'EOF'
,config ,load ../scheme/prescheme/interface.scm
,config ,load ../scheme/prescheme/package-defs.scm
,exec ,load load-ps-compiler.scm
,in prescheme-compiler prescheme-compiler
,user (define prescheme-compiler ##)
,dump ../ps-compiler.image "(Pre-Scheme)"
,exit
EOF

I also know that guile implements scheme and builds reproducibly (with 
-j1). So there must be a way to do it right.


The list of our packages I think are affected by this is:
clisp
scheme48
chezscheme
emacs
maxima
scsh
xindy

Most distros seem to be affected by this:
http://ismypackagereproducibleyet.org/?pkg=scheme48
http://ismypackagereproducibleyet.org/?pkg=emacs surprisingly shows as 
reproducible in Archlinux, but I could not figure out why.

maxima also shows as green there.

Can we get them reproducible? Or can we drop+replace these 
implementations with guile?


I'd appreciate some insight from this knowledgeable crowd.

Ciao
Bernhard M.




Re: Reproducibility terminology/definitions

2023-11-11 Thread Bernhard M. Wiedemann via rb-general



On 08/11/2023 16.38, Pol Dellaiera wrote:

you define functions doing I/O as Impure functions.
But without I/O, no build output can be written, so all builds must use 
impure functions.



In practice we see non-determinism from approx 10 sources, such as 
documented in https://github.com/bmwiedemann/theunreproduciblepackage/


e.g.
https://github.com/bmwiedemann/theunreproduciblepackage/blob/master/race/race.2.sh

#!/bin/sh
racepart()
{
    input=$1
    sleep 0.1
    echo $input
}
for i in $(seq 1 10) ; do
    # & backgrounds the process to do parallel processing
    racepart $i &
done
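
The same kind of race can be sketched in Python, together with one way to keep the parallelism while making the output order deterministic. This is an illustration, not part of theunreproduciblepackage; the function names and the random sleep are assumptions made for the example.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def racepart(value: int) -> int:
    """Simulate a build step whose duration varies from run to run."""
    time.sleep(random.uniform(0.0, 0.01))
    return value


def completion_order(values):
    """Collect results as workers finish: the order depends on scheduling."""
    with ThreadPoolExecutor(max_workers=len(values)) as pool:
        futures = [pool.submit(racepart, v) for v in values]
        return [f.result() for f in as_completed(futures)]


def input_order(values):
    """Collect results in submission order: parallel, but deterministic."""
    with ThreadPoolExecutor(max_workers=len(values)) as pool:
        return list(pool.map(racepart, values))


values = list(range(1, 11))
print(completion_order(values))  # may differ between runs
print(input_order(values))       # same on every run
```

Both variants run the work in parallel; only the way results are gathered differs.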




LibreOffice success story

2023-11-07 Thread Bernhard M. Wiedemann via rb-general

Dear fellow R-B-ings

Just 2 weeks ago, when I re-reviewed the remaining ~120 major issues in 
openSUSE, I pretty much skipped over LibreOffice (and only this one), 
noting it down as "various issues", because some years ago, when I had 
previously taken a closer look, there had been so many issues in files 
of various formats and additionally the build of that large package took 
hours, so that I had quickly stopped debugging and left it "for later".


Then on the r-b-summit I met Thorsten from LibreOffice upstream and 
together we reviewed the old diff I had from 2022 [1]. We noticed that 
there was an issue with docs, but I had recently made

https://code.opensuse.org/package/doxygen/blob/master/f/reproducible.patch
that turned out to solve this exact issue. It still needs upstreaming.

There was a timestamp-issue from long-orphaned clucene and we found 3 
others that Thorsten quickly fixed:

https://gerrit.libreoffice.org/q/topic:reprobuild

Now there were only mtimes left in .jar and .zip files that were easily 
normalized with strip-nondeterminism.


So today I hold in my hands the first two bit-identical LibreOffice rpm 
packages.

And this is the success I wanted to share with you all today.

It makes me feel as if we can solve anything.

Ciao
Bernhard M.


[1] https://rb.zq1.de/compare.factory-20230109/diffs/libreoffice-compare.out




Re: Bug#1051801: document DEB_BUILD_OPTIONS value nopgo

2023-09-13 Thread Bernhard M. Wiedemann via rb-general

On 11/09/2023 09.25, Helmut Grohne wrote:

It also
is unclear how it affects reproducible builds since such builds depend
on the performance characteristics of the system performing the build.


It is worth noting that the performance (execution time) of a 
build-system does not matter for profiling, so it is possible to achieve 
reproducible builds with PGO enabled. It is just hard.


See https://build.opensuse.org/request/show/499887
linked in 
https://github.com/bmwiedemann/theunreproduciblepackage/tree/master/pgo


In that gzip patch we even had to hide the name of the temporary file 
from gzip to not get variations from a 'tolower' call that would be 
optimized for different amounts of upper/lower-case letters.


Parallel builds and variations in ordering are also problematic, because 
some of the performance-counter-logic in gcc is not commutative, so 
calling A and then B produces different results from calling B 
and then A.
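
To make "not commutative" concrete, here is a toy single-slot value tracker in the style of a majority-vote counter. It is not gcc's profiling code, and no claim is made that gcc's counters work this way; it only shows that an order-sensitive update rule yields different profile data for the same set of calls.

```python
def track_dominant_value(observations):
    """Toy profiler: keep one candidate value and a counter.

    A matching observation increments the counter, a different one
    decrements it, and an empty slot adopts the observed value.
    """
    candidate, count = None, 0
    for value in observations:
        if count == 0:
            candidate, count = value, 1
        elif value == candidate:
            count += 1
        else:
            count -= 1
    return candidate, count


# Same two calls, different order, different recorded profile.
print(track_dominant_value(["A", "B"]))
print(track_dominant_value(["B", "A"]))
```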



With all that said, in openSUSE we also have a %do_profiling value to 
disable PGO for gcc, python and some others, because these profiling 
runs are too large to make deterministic.


Ciao
Bernhard M.


Re: Reproducible Arch Linux (August 2023)

2023-08-25 Thread Bernhard M. Wiedemann via rb-general

On 25/08/2023 06.56, kpcyrd wrote:


It seems the order for this has an impact on the elf binary.

```
find . -type f -perm -u+w -print0 2>/dev/null | while IFS= read -rd '' binary ; do
```


This should be trivial to fix with

find . -type f -perm -u+w -print0 2>/dev/null |
 sort -z |
 while IFS= read -rd '' binary ; do
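
The same idea, sketched in Python for build scripts that walk a tree themselves: collect the paths first, then sort them, so the processing order no longer depends on readdir order. The helper name and the sample file names are made up; sorting here is by code point, which may differ from what `sort -z` does in a given locale.

```python
import os
import tempfile


def files_in_stable_order(root):
    """List files under root in an order independent of readdir order."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    return sorted(paths)


# Demonstration on a throwaway tree with hypothetical file names.
with tempfile.TemporaryDirectory() as root:
    for rel in ("b.so", "a.so", os.path.join("lib", "c.so")):
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as handle:
            handle.write("placeholder\n")
    listing = [os.path.relpath(p, root) for p in files_in_stable_order(root)]

print(listing)
```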



For some reason, this did not show up in our
https://github.com/openSUSE/brp-check-suse/blob/master/brp-15-strip-debug#L22

but then I run most tests without debuginfo (for performance) and only 
use debuginfo in verification builds.



Ciao
Bernhard M.


Re: trying to reproduce hello-traditional from Debian. .buildinfo file? next steps?

2023-08-02 Thread Bernhard M. Wiedemann via rb-general

On 02/08/2023 11.26, Carles Pina i Estany wrote:


Hi,

This is Debian specific but I cannot find a reproducible builds Debian
specific mailing list. Let me know if I should ask elsewhere. Feel free
to send me some pointers to read it myself.

TL;DR: I'm trying to build hello-traditional from Debian and have the
same result as Debian. I cannot do it. Pointers welcome. I thought of
using the .buildinfo file to reproduce the build environment and deps
but unsure of the best way and if this is the way.

I'm trying to reproduce the build of the package hello-traditional. I
understand from here:
https://tests.reproducible-builds.org/debian/rb-pkg/bookworm/amd64/hello-traditional.html

That should be reproducible.

I've done:
$ sbuild --no-clean --arch-any --arch-all --no-source --dist=stable 
--arch=amd64 
http://deb.debian.org/debian/pool/main/h/hello-traditional/hello-traditional_2.10-6.dsc


https://www.reddit.com/r/reproduciblebuilds/comments/tqrf9q/the_binary_that_varies_from_full_moon/

It might be something else, but since you mentioned "hello", there is a 
chance that this amazing story is relevant.



Ciao
Bernhard M.


Re: Introducing: Semantically reproducible builds

2023-05-29 Thread Bernhard M. Wiedemann via rb-general



On 29/05/2023 06.10, Vagrant Cascadian wrote:

Do such tools actually exist, or are we talking about something
theoretical here?


https://github.com/openSUSE/build-compare/ has been in use for 13 years.

And strip-nondeterminism can be used to build another such tool.

They will only ever be able to normalize or ignore certain known classes 
of differences. It is good enough to avoid review of many diffs.


e.g. https://rb.zq1.de/compare.factory/report-202303.txt has
not-bit-by-bit-identical: 673
build-compare-failed: 483

So for 190 packages build-compare found that they only had insignificant 
diffs and were considered semantically equivalent, so I could spend more 
time debugging the other 483 diffs.



I very much worry that the meaning of Reproducible Builds may gradually
get whittled down


I share this concern, which is why I have been calling this 
semi-reproducible to distinguish it from bit-reproducible / 
fully-reproducible.
That 'semi-' prefix should give people a good hint of what it is and if 
not, encourage them to ask for details. "sort-of-reproducible" or 
"almost-but-not-quite-reproducible" could also be an option :-)



Ciao
Bernhard M.




Re: Introducing: Semantically reproducible builds

2023-05-29 Thread Bernhard M. Wiedemann via rb-general



On 29/05/2023 05.25, David A. Wheeler wrote:

If you have tips on common likely errors, please post, I think
that would be of interest to many.


https://github.com/openSUSE/build-compare/issues/53
https://github.com/openSUSE/build-compare/issues/33
https://github.com/openSUSE/build-compare/pull/36
https://github.com/openSUSE/build-compare/pull/28

We use bash there to not add dependencies.
Looking at the bugs, those were mostly problems of tracking state in 
variables.


It would be less troublesome if we did not use it like diffoscope to 
report all diffs, but instead exited on the first relevant diff to keep it 
simple.


The cleaner way is to use strip-nondeterminism to remove all these 
insignificant bits during build and make the resulting bit-reproducible 
output the official binary.


As a *recipient* who has no control over the build process used by
someone else to create their package, I need some workable
alternatives to estimate risk.


A recipient could still use strip-nondeterminism (and custom sed) on 
both files before calling diff.

Testing for bit-identity is trivial.
Testing for semantic equivalence is not.

To ensure that the filters did not remove significant parts
(e.g. sed 's/.*//'), they should then use the filtered version in production.
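
A small sketch of why the filter itself needs scrutiny. Two artifacts are compared after applying the same filter to both; a careful filter hides only the timestamp, while an over-eager one (comparable to the sed example above) hides everything. The timestamp pattern, the function names and the sample bytes are assumptions for this illustration, and strip-nondeterminism is not involved.

```python
import re

# Assumed noise pattern for this sketch: an ISO-like build timestamp.
TIMESTAMP = re.compile(rb"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")


def strip_timestamps(data: bytes) -> bytes:
    """Replace embedded build timestamps with a fixed placeholder."""
    return TIMESTAMP.sub(b"1970-01-01T00:00:00", data)


def strip_everything(data: bytes) -> bytes:
    """An over-eager filter: empties every line, like sed 's/.*//'."""
    return re.sub(rb".*", b"", data)


def equivalent(a: bytes, b: bytes, normalize) -> bool:
    """Compare two artifacts after applying the same filter to both."""
    return normalize(a) == normalize(b)


build_1 = b"hello v1.0 built 2023-05-29T06:10:00\n"
build_2 = b"hello v1.0 built 2023-05-29T07:45:12\n"
modified = b"evil  v1.0 built 2023-05-29T07:45:12\n"

print(equivalent(build_1, build_2, strip_timestamps))   # only noise differs
print(equivalent(build_1, modified, strip_timestamps))  # real difference survives
print(equivalent(build_1, modified, strip_everything))  # bad filter hides it
```

Deploying the filtered output, rather than the unfiltered original, means a filter that removes too much breaks visibly instead of silently passing the comparison.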



Ciao
Bernhard M.




Re: Introducing: Semantically reproducible builds

2023-05-28 Thread Bernhard M. Wiedemann via rb-general
I agree that it is good to give it a name (I have called it 
semi-reproducible before), but we should be clear on communicating the 
disadvantages.


In openSUSE we have been working towards repeatable semantically 
reproducible builds for over a decade [1] using our open-build-service 
and a tool called build-compare to filter out "insignificant" diffs.


However, while working with the tool, I already found three (3) bugs in 
build-compare that made it report packages with significant differences 
as 'identical'.
And if you don't rely on such tools, you need expensive manual reviews 
every time that cannot be automated and might also miss issues.


I have manually reviewed hundreds of package diffs in the past and it 
took many hours, so I'm not eager to repeat that.



Another disadvantage of such binaries is that you don't have a single 
correct SHAsum that can be signed, communicated and compared easily.

You always need the full binary to compare to your rebuild.

The cleaner way is to use strip-nondeterminism to remove all these 
insignificant bits during build and make the resulting bit-reproducible 
output the official binary.


Ciao
Bernhard M.

[1] 
https://github.com/openSUSE/build-compare/commit/5cba04fb8def5d88423737a1a1957730e2217357




Re: Three bytes in a zip file

2023-04-07 Thread Bernhard M. Wiedemann via rb-general



On 06/04/2023 10.28, Larry Doolittle wrote:

I'm trying to make a process to generate byte-for-byte reproducible zip files.


Try adding the -X option to the zip call.
It will suppress adding of extended attributes (atime/ctime).
And with
https://github.com/distropatches/zip/commit/501ae4e93fd6fa2f7d20d00d1b011f9006802eae
it will also normalize mtime.
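
For cases where the archive is written from a script rather than with the zip command, the same normalization can be sketched with Python's standard `zipfile` module. This is an analogue, not the zip CLI or the linked patch; `build_zip`, the fixed timestamp and the file names are choices made for the example.

```python
import io
import zipfile

FIXED_MTIME = (1980, 1, 1, 0, 0, 0)  # earliest timestamp the zip format can store


def build_zip(files: dict) -> bytes:
    """Create a zip archive whose bytes depend only on names and contents."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in sorted(files):  # fixed member order
            info = zipfile.ZipInfo(name, date_time=FIXED_MTIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.create_system = 3            # always record "Unix"
            info.external_attr = 0o644 << 16  # fixed permissions
            archive.writestr(info, files[name])
    return buffer.getvalue()


first = build_zip({"b.txt": b"world\n", "a.txt": b"hello\n"})
second = build_zip({"a.txt": b"hello\n", "b.txt": b"world\n"})
print(first == second)
```

The compressed bytes can still vary between different zlib versions, so this covers timestamps, ordering and attributes only.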


Ciao
Bernhard M.




Re: verifiable source-only bootstrap from scratch

2023-03-13 Thread Bernhard M. Wiedemann via rb-general



On 09/03/2023 23.34, Vagrant Cascadian wrote:

On 2023-03-08, aho...@0w.se wrote:

We seem to be the first project offering bootstrappable and verifiable
builds without any binary seeds.

The project's website is at [1]

...

[1] the site is available through the Tor/onion network
(for the advantages of convenient and privacy-friendly hosting) at
http://rbzfp7h25zcnmxu4wnxhespe64addpopah5ckfpdfyy4qetpziitp5qd.onion/


Is there a URL other than via tor .onion network to read up on what this
project is actually doing?

While I applaud and support the use of tor, exclusively using tor is a
bit of a surprise and seems to severely limit the scope of people who
will even read about it at all.

live well,
   vagrant


I created a mirror for easier access:

https://www.zq1.de/~bernhard/mirror/rbzfp7h25zcnmxu4wnxhespe64addpopah5ckfpdfyy4qetpziitp5qd.onion/

Ciao
Bernhard M.




Re: SBOMs - Anywhere?

2023-03-03 Thread Bernhard M. Wiedemann via rb-general



On 25/02/2023 16.56, Anthony Harrison wrote:
More tools are in the pipeline, including one to generate an SBOM from 
an installed platform distribution or package (currently works for 
Debian systems, work in progress for RPM based systems) and an audit 
tool. I hope to publish these in the next couple of weeks.


I want to mention that we can already generate [1] and publish [2] SBOMs 
in our Open-Build-Service to meet SLSA level 4 requirements.



[1] https://github.com/openSUSE/obs-build/search?q=SBOM
[2] 
https://github.com/openSUSE/open-build-service/blob/1e051bb20fb385695399c79dd8c9920d5fa18273/src/backend/bs_regpush#L717




Re: How to talk to skeptics?

2022-12-21 Thread Bernhard M. Wiedemann via rb-general



On 18/12/2022 02.09, Martin via rb-general wrote:

Controlling hardware is essential


https://www.bunniestudios.com/blog/?p=5706

Covers the topic of why open-source hardware is not enough to build 
trustable devices.


TLDR: there are ways to subvert silicon that cannot be detected, even 
with an electron microscope, even if you know where to look.


One way out is FPGAs in which you place processor cores randomly, so 
attackers cannot know what to subvert at the time of fabrication.


However, this is orthogonal to reproducible+bootstrappable builds.

Ideally you have all of them, but having some of them is better than 
having none.


Ciao
Bernhard M.




How to talk to skeptics?

2022-12-14 Thread Bernhard M. Wiedemann via rb-general

Hi,

a colleague of mine is rather skeptical of bootstrapping and 
reproducible-builds.


E.g. he wrote

https://fy.blackhats.net.au/blog/html/2021/05/12/compiler_bootstrapping_can_we_trust_rust.html

and the effect can also be seen in his packaging such as
https://build.opensuse.org/package/show/openSUSE:Factory/rust1.65
that ships with two gigabytes of bootstrap compiler binaries for various 
architectures instead of using our existing rust packages of version N-1 
"because compilation takes twice as long".


He also once pointed me to
https://blog.cmpxchg8b.com/2020/07/you-dont-need-reproducible-builds.html

In the end, it would be useful to collect some well-worded / 
well-thought-out counter-arguments on r-b.o (if we don't have that already)


https://reproducible-builds.org/docs/buy-in/ could provide some input.

Any thoughts and/or volunteers?

Ciao
Bernhard M.




scons discussion

2022-09-26 Thread Bernhard M. Wiedemann via rb-general

Hi,

there is an interesting rb-related argument in a PR, but I want to avoid 
having too many rb-people hop in there. Instead, I would appreciate it if 
you could contribute to a cohesive argument at

https://etherpad.opensuse.org/p/scons-rb-argument

that then gets posted there by 1 representative.


Ciao
Bernhard M.




Re: Fw: Build Reproducibility in Debian - Opinion Needed

2022-08-25 Thread Bernhard M. Wiedemann via rb-general
Muhammad Hassan wrote:
> Do you feel there is potential for detecting build unreproducibility 
> statically (without executing adversarial rebuilds)?

Yes, there are a number of potentially troublesome strings listed in
https://github.com/bmwiedemann/reproducibleopensuse/blob/master/howtodebug#L31

If one of these gets added, it may be harmless, but would warrant a
rebuild test or closer inspection of the source.


On 24/08/2022 19.37, Chris Lamb wrote:
> Other avenues requiring a single build would include all the instrumentation
> approach (eg. strace/systemtap, etc.) taken by a few projects. I think
> Bernhard might be able to speak better on this, and there are some
> academic projects in this area as well.

My strace approach uses
https://github.com/bmwiedemann/reproducibleopensuse/blob/master/stracebuild
to trigger
https://github.com/bmwiedemann/reproducible-faketools/blob/master/bin/rpmbuild-strace

I use that to find where unreproducible files come from with
https://github.com/bmwiedemann/reproducibleopensuse/blob/master/autoprovenance

It seems strace cannot see time syscalls, maybe because, thanks to the
linux-vdso.so.1 shortcut, those never reach the kernel.

It would be possible to see accesses to /dev/[u]random and readdir syscalls.


I have also played a bit with ptrace-based
https://github.com/dettrace/dettrace
but it needed regular updates as Linux keeps introducing new syscalls.



Ciao
Bernhard M.




Re: Fwd: enabling link time optimizations in package builds

2022-06-28 Thread Bernhard M. Wiedemann via rb-general


On 17/06/2022 11.12, Chris Lamb wrote:
> Hi Roland,
> 
>> would enabling LTO cause reproducibility issues?
>> If I remember correctly, Bernhard mentioned some issues, which got 
>> 'solved' by using less parallel builds (-j1 or -j4?).
> 
> Good question. There was definitely at least one LTO-related issue in the
> past. Take, for instance, this bug report from 2015 about "FAT" LTO objects:
> 
>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66305
> 
> I don't know whether this is still valid and/or we would encounter it
> with Doko's proposal, however.

I think it can still be an issue.

We strip the unreproducible sections at the end of builds:
https://github.com/openSUSE/brp-check-suse/pull/29/files

The downside is that you don't get extra optimization when linking
external .a files, but such static linking is discouraged anyway for
maintainability.


The other thing we had to do was that we started to use -flto=auto
instead of -flto=$cpus
to avoid embedding this detail into debuginfo and such.

Ciao
Bernhard M.




Re: rb meetup at openSUSE conference in Nuremberg

2022-05-20 Thread Bernhard M. Wiedemann via rb-general


On 10/03/2022 04.23, Bernhard M. Wiedemann wrote:
> Hi,
> 
> I submitted a rb workshop session proposal for
> https://events.opensuse.org/conferences/oSC22
> 
> Even if that is not accepted, this conf would be an opportunity for a
> small meetup of rb people.
> 
> Who would be interested to join? Add yourself to
> https://dudle.inf.tu-dresden.de/NUE-rb-meetup-2022/
> (times are in CEST and are just rough indicators)

2 more weeks to go until openSUSE conference

Meanwhile the rb workshop is scheduled for 2022-06-03 10:00 CEST
https://events.opensuse.org/conferences/oSC22/program/proposals/3707

But the proposed meetup should be independent of that.
Feel free to still add yourself to the dudle linked above.

Ciao
Bernhard M.




Re: JDK 19+21 early-access build is reproducible

2022-05-08 Thread Bernhard M. Wiedemann via rb-general


On 06/05/2022 22.48, John Neffenger wrote:
> Starting yesterday, for the first time, the JDK can create reproducible
> builds of the JDK!

That is great news. Thank you John, Magnus and Andrew for taking care of
this.

I tried to get a double-build test working on openSUSE with
https://github.com/openjdk/jdk/releases/tag/jdk-19%2B21 to either
confirm the reproducibility or find some remaining diff for you,
however building JDK is still hard, so

https://build.opensuse.org/package/show/home:bmwiedemann:java/java-19-openjdk

errors out atm with

 checking for java...
/home/abuild/rpmbuild/BUILD/jdk-jdk-19-21/build/bootcycle-build/images/jdk/bin/java
 checking for javac...
/home/abuild/rpmbuild/BUILD/jdk-jdk-19-21/build/bootcycle-build/images/jdk/bin/javac
 checking for javah... no
 configure: error: Java 1.6 or later is required to build java-access-bridge


I found a javah in 1_8_0 but that is rejected as bootstrap java version
(needs to be 18 or 19)

Can you point me to the scripts that build your official Linux binaries
or do you have hints on how to fix my build?

You can also try this locally on Debian or openSUSE with an account from
https://idp-portal.suse.com/ and
osc co home:bmwiedemann:java/java-19-openjdk && cd $_
osc build --noservice standard


Ciao
Bernhard M.




Re: Call for real-world scenarios prevented by RB practices

2022-03-24 Thread Bernhard M. Wiedemann


On 22/03/2022 13.46, Chris Lamb wrote:
> Just wondering if anyone on this list is aware of any real-world
> instances where RB practices have made a difference and flagged
> something legitimately "bad"?

Maybe not "bad" as in "malicious", but certainly I detected and fixed
some bad quality issues in openSUSE over the years.

Some where corrupted data made it into packages:
https://bazaar.launchpad.net/~intltool/intltool/trunk/revision/748
http://lists.gnu.org/archive/html/bug-bash/2018-07/msg00010.html
https://bugzilla.opensuse.org/show_bug.cgi?id=1192192
https://bugzilla.opensuse.org/show_bug.cgi?id=1103093

https://gitlab.gnome.org/GNOME/libxslt/-/issues/37
notable because the maintainer wrote:
> I'm still puzzled why it took so long to discover this issue

Also a bunch of year 2020 bugs such as
https://rt.cpan.org/Public/Bug/Display.html?id=124543
https://rt.cpan.org/Public/Bug/Display.html?id=124524


and https://bugzilla.opensuse.org/show_bug.cgi?id=1100677 has a whole
class with a dozen members. Most of those could have caused crashes on
older user machines.


Ciao
Bernhard M.




Re: Strange things with timestamps on Debian (sudo)

2022-03-17 Thread Bernhard M. Wiedemann


On 16/03/2022 17.54, Marc Haber wrote:
> [tl;dr building with faketime yields Debian package with timestamps
> different from building without faketime, causing reprotest to fail]

It might be a problem with how faketime works:
https://github.com/wolfcw/libfaketime/issues/183

For openSUSE, I usually build in
kvm -rtc base=2037-09-04T00:00:00

and still get reproducible sudo-1.9.9 package binaries.

Ciao
Bernhard M.




rb meetup at openSUSE conference in Nuremberg

2022-03-09 Thread Bernhard M. Wiedemann
Hi,

I submitted a rb workshop session proposal for
https://events.opensuse.org/conferences/oSC22

Even if that is not accepted, this conf would be an opportunity for a
small meetup of rb people.

Who would be interested to join? Add yourself to
https://dudle.inf.tu-dresden.de/NUE-rb-meetup-2022/
(times are in CEST and are just rough indicators)


Ciao
Bernhard M.




Re: Thinking of our next summit this year

2022-03-01 Thread Bernhard M. Wiedemann


On 01/03/2022 17.59, Mattia Rizzolo wrote:
> Hello everybody,
> 
> in the past month or two we have seen how, at least in Europe and in the
> Americas, rules slowly opened up so that people could move around again.
> As such, some of us were thinking if this could be a good idea to
> physically meet each other again.
> 
> Roughly speaking, we were wondering how you would *feel* about traveling
> this coming September.
> I want to stress that nobody has started planning for anything at this
> moment, and this email is only soliciting feedback on whether you'd feel
> comfortable traveling (potentially abroad) and meeting others.

I would like to meet, but I don't want to fly.

After our 2019 meeting in Marrakesh, I noticed that just these two
flights of 2500 km produced more greenhouse gases than all of my 200
office visits that year combined.

And while the last 2 years did not feel like it, climate change is still
the bigger problem.

So if we can create something, where I can join a local group between
France, Poland, Austria and Denmark, I'm interested.

Winter might not be the smartest choice, though, given the seasonality
of Covid. October or March seem better.


Ciao
Bernhard M.




Re: SOURCE_DATE_EPOCH and timezone with FAT images

2022-02-23 Thread Bernhard M. Wiedemann


On 23/02/2022 11.53, Thomas Schmitt wrote:

> When trying to convince programmers of the additional complexity it does
> not really help that
>   https://reproducible-builds.org/docs/source-date-epoch/
> says:
>   "At present, we do not have a proposal that includes anything
>resembling a "time zone"."
> Although i agree that time representations are tricky, i think it is
> necessary to specify a rule for the special case where the timezone
> unavoidably influences the storage bytes of the reproducible software.

The most practical approach is to add to the build scripts
export TZ=UTC
(or UTC0)

If it is a regional project, hardcoding that local timezone would also
yield reproducible results.
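
A minimal Python sketch of why pinning the timezone matters here: FAT
stores broken-down local time, so the same SOURCE_DATE_EPOCH value encodes
to different bytes depending on the UTC offset of the build machine. The
helper name fat_timestamp and the sample epoch are assumptions made for
illustration; the bit layout is the usual 16-bit FAT date and time words.

```python
from datetime import datetime, timedelta, timezone


def fat_timestamp(epoch: int, utc_offset_hours: int = 0) -> tuple[int, int]:
    """Encode an epoch as 16-bit FAT (date, time) words for a given UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    t = datetime.fromtimestamp(epoch, tz)
    # date word: 7 bits years since 1980, 4 bits month, 5 bits day
    date = ((t.year - 1980) << 9) | (t.month << 5) | t.day
    # time word: 5 bits hour, 6 bits minute, 5 bits seconds/2
    clock = (t.hour << 11) | (t.minute << 5) | (t.second // 2)
    return date, clock


epoch = 1645613580  # 2022-02-23 10:53:00 UTC, stands in for SOURCE_DATE_EPOCH
print(fat_timestamp(epoch, 0))  # build machine at UTC
print(fat_timestamp(epoch, 1))  # build machine one hour east of UTC
```

With the offset fixed to 0 (the effect of TZ=UTC), every builder writes the
same two words; with differing offsets the time word differs.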

What is that "additional complexity" you speak of?

Ciao
Bernhard M.




reproducible gcc

2022-02-14 Thread Bernhard M. Wiedemann
Hi Vagrant,

I missed you in IRC where you wondered why openSUSE's gcc shows as
reproducible in http://ismypackagereproducibleyet.org/?pkg=gcc

Our 'gcc' is just a meta-package that pulls in gcc11 or whatever version
is current.

Since around gcc9, we were also able to build gcc reproducibly if we
disabled profiling. We just don't do that for official builds, because
there is an 8% performance impact.
In general, compiler devs are very interested in deterministic results.

However, my notes show, gcc11 only becomes reproducible with
deterministic filesystem readdir order (which we have in production[1])
Thus I don't spend much time on the related bug:
https://bugzilla.opensuse.org/show_bug.cgi?id=1188621
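
As a small illustration of the readdir issue (a generic sketch, not taken
from gcc or obs-build): directory listings come back in whatever order the
filesystem chooses, so a tool that wants stable output has to sort them
itself. The function name sorted_listing is made up for this example.

```python
import os
import tempfile


def sorted_listing(path: str) -> list[str]:
    """Return directory entries in a stable order, independent of readdir."""
    return sorted(os.listdir(path))


with tempfile.TemporaryDirectory() as tmp:
    # create files in an order that differs from the sorted order
    for name in ("c.o", "a.o", "b.o"):
        open(os.path.join(tmp, name), "w").close()
    print(sorted_listing(tmp))
```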

So http://ismypackagereproducibleyet.org/?pkg=gcc11
in green here means, I was able to build reproducibly in some way - but
not in the default way.

Ciao
Bernhard M.


[1] https://github.com/openSUSE/obs-build/pull/634




Re: Fwd: Announcing SupplyChainSecurityCon, and other Open Source Summit NA 2022 Events

2022-02-11 Thread Bernhard M. Wiedemann


On 21/01/2022 02.21, Chris Lamb wrote:
> Thanks for forwarding this. Do you, or anyone else on this list in
> fact, have an intention to submit or attend at this point?

Since it has been silent so far, and I hope I do not have to travel to
Texas for this presentation,
I have now submitted a presentation for SupplyChainSecurityCon:

reproducible builds: unexpected benefits and problems

I have now worked on openSUSE reproducible builds for 6 years and would
like to share some insights on where it can help, how to best debug
non-determinism and what unexpected problems showed up with
reproducible-builds.

I plan for a good part of Q&A + discussion at the end.




Fwd: Announcing SupplyChainSecurityCon, and other Open Source Summit NA 2022 Events

2022-01-19 Thread Bernhard M. Wiedemann



 Forwarded Message 
Subject: Announcing SupplyChainSecurityCon, and other Open Source
Summit NA 2022 Events
Date: Wed, 19 Jan 2022 07:33:24 -0800
From: The Linux Foundation
Reply-To: no-re...@linuxfoundation.org
To: linuxcom...@lsmod.de



Announcing SupplyChainSecurityCon, and other Open Source Summit NA 2022
Events

Open Source Summit is the premier event for open source developers,
technologists, and community leaders to collaborate, share information,
solve problems, and gain knowledge, furthering open source innovation
and ensuring a sustainable open source ecosystem.

The open source ecosystem continues to evolve, and as such, Open Source
Summit will do so as well. Thus, as we move forward, we view Open Source
Summit as a conference umbrella, composed of a collection of events that
will always cover the most important projects, technologies and topics
in open source today - in one place.

Open Source Summit North America is Composed of 13 Events.

Introducing SupplyChainSecurityCon.

Last year, OSPOCon was introduced under the OS Summit umbrella, and this
year we’re announcing that SupplyChainSecurityCon is joining as well at
Open Source Summit North America.

Co-hosted by CNCF and OpenSSF, this event gathers security practitioners,
open source developers, and others interested in software supply chain
security to explore the security threats affecting the software supply
chain, share best practices and mitigation tactics, and increase
knowledge about how to best secure open source software.

Save the Date.

  * Open Source Summit North America: June 21-24 • Austin, Texas, USA
  * Open Source Summit Europe


Re: Reproducible tarballs on Github?

2021-10-23 Thread Bernhard M. Wiedemann


On 23/10/2021 20.14, David A. Wheeler wrote:
> 
> A given version of tar should produce deterministic results. However, if
> tar is updated, it’s not really
> reasonable to expect that the result will be identical.

> It’s reasonable for GitHub to change its default tar implementation. What 
> would you suggest as an alternative?

In principle it is possible to define unit-tests that check that a set
of given inputs will produce a certain set of outputs.
Then when you change the implementation, it ensures that (at least
these) outputs are still the same.
The downside is that it can make changes harder (e.g. because you need
to keep the old ordering of elements), but the upside is that you can be
pretty sure that outputs are correct.
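
Such a test could look roughly like the following Python sketch. It uses
the standard tarfile module rather than whatever GitHub runs, and the names
make_tar and digest are invented for the example. A real regression test
would compare against a digest recorded from a known-good release; here only
self-consistency is checked.

```python
import hashlib
import io
import tarfile


def make_tar(files: dict[str, bytes], mtime: int = 0) -> bytes:
    """Build an uncompressed tar with sorted members and normalized metadata."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):  # fixed member order
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = mtime  # fixed timestamp
            info.mode = 0o644
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


first = digest(make_tar({"README": b"hello\n", "src/main.c": b"int main(){}\n"}))
second = digest(make_tar({"src/main.c": b"int main(){}\n", "README": b"hello\n"}))
print(first == second)
```

When the implementation changes, the recorded digest either still matches
or the test fails and forces a conscious decision.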


One related thing I wondered: are there verification efforts that check
that release tarballs correspond to a git commit?
In some cases with automake/autoconf it will usually not be a perfect match.
The situation is better for projects that gpg-sign their tarballs, but
verification cannot hurt even in those cases.






Re: Reproducible Builds Summit 2022 - we haven't forgotten about it!

2021-10-21 Thread Bernhard M. Wiedemann


On 21/10/2021 13.13, Mattia Rizzolo wrote:
> "Thanks" to COVID-19, we, of course, skipped 2020, and we were
> previously hoping to run an event in 2021. But too much of the world is
> still not really viable when it comes to travelling and meeting people
> etc., even before considering our own personal risks.

Back in 2019 in Marrakesh I was thinking that the other major problem
our world has to solve - the climate crisis - means that I should rather
take the train to more local conferences instead of airplanes.
So can we plan that event in a way that I can join either from home or
join a sub-group somewhere within 800km of Berlin?

I remember that Gunner was not a friend of online collaboration tools
(such as trello), but I really would love to see that work some way.

Ciao
Bernhard M.





Re: Reproducible builds on Java

2021-09-06 Thread Bernhard M. Wiedemann


On 06/09/2021 11.17, Magnus Ihse Bursie wrote:
> I'm working for Oracle in the Build Group for OpenJDK [1], which is
> primary responsible for creating a built artifact of the OpenJDK source
> code. I also have a general interest in all things about building in
> general, so I've been lurking on this list for a while. :-)

I highly appreciate your efforts to move Java/OpenJDK in the right
direction.


> I have recently looked through the tests.reproducible-builds.org site
> for Java-specific problems listed. Most of them seem to be
> project-specific, or due to tools outside of the JDK. But one about
> javadoc caught my eye [5]. I talked to one of the javadoc engineers, and
> he think that any nondeterminism is due to old versions (prior to JDK
> 9), which should have been fixed since long. I don't understand how to
> get any further from this page to get more details on the problem. If
> someone could help me by pointing to specific bug reports on javadoc
> generating nondeterministic output, that'd be very helpful.


I checked my javadoc-related bugreports and there are 2 atm
https://issues.apache.org/jira/browse/MJAVADOC-619 marked as fixed in
2021-05

https://bugzilla.opensuse.org/show_bug.cgi?id=1174795 still open about
readdir-order-related non-determinism [1]


https://rb.zq1.de/compare.factory-20210830/diffs/jansi-compare.out
was built with java-11-openjdk-11.0.11.0
and has at least 2 different issues, but some of them are filtered out
by our build-compare tool.

The visible diff there might be the same cause as
https://bugzilla.opensuse.org/show_bug.cgi?id=1174795


One invisible diff is

/usr/share/javadoc/jansi/org/fusesource/jansi/class-use/Ansi.Attribute.html
-
+
 Uses of Class org.fusesource.jansi.Ansi.Attribute (jansi API
Reference (1.17.1))
 
-
+


I guess, there are already ways to override the date, but it does not
(yet) use SOURCE_DATE_EPOCH for that and we cannot manage to patch 200
java packages to use the right overrides.
Experience has shown that doing individual patches is an infinite effort
because any number of software can still be written.
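
For reference, honoring SOURCE_DATE_EPOCH in a tool usually takes only a
few lines. This is a generic Python sketch of the pattern, not javadoc
code; the function name build_date is an assumption.

```python
import os
import time
from datetime import datetime, timezone


def build_date(environ=os.environ) -> datetime:
    """Use SOURCE_DATE_EPOCH if set, otherwise fall back to the current time."""
    epoch = int(environ.get("SOURCE_DATE_EPOCH", time.time()))
    return datetime.fromtimestamp(epoch, timezone.utc)


# a generator would embed this instead of "now" in its output
print(build_date({"SOURCE_DATE_EPOCH": "1630281600"}).strftime("%Y-%m-%d"))
```

Doing this once in the toolchain avoids patching every package that calls
it.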


I also tried to build jansi with openJDK-17 (and 13+15), but that failed
with
[javac] error: Source option 6 is no longer supported. Use 7 or later.

other javadoc-affected packages:
apache-commons-compress
maven-common-artifact-filters
jansi
jchardet
jdeparser
jnr-a64asm
jnr-process
mysql-connector-java
jnr-unixsocket
docker-client-java

So I guess, we should find one among those that can be built with
openJDK-17 and check how it behaves then.

Ciao
Bernhard M.

[1]
https://github.com/bmwiedemann/theunreproduciblepackage/tree/master/readdir





Re: Recoding the configuration for live-build images (Was: Third status update about reproducible live-build ISO images in Jenkins)

2021-09-01 Thread Bernhard M. Wiedemann


On 31/08/2021 15.53, Chris Lamb wrote:
> Indeed, needing to
> extract parts of the ISO to recreate it is slightly sub-optimal, if
> only because it would require someone to download it first before
> attempting to recreate it (rather than just possessing the minuscule
> .buildinfo file containing the inputs and output hashes).

There are ways to read files off a remote iso without downloading the
whole thing:
https://github.com/bmwiedemann/curlwwwfs + fuseiso
or maybe
https://github.com/higlass/simple-httpfs

However, it would also be possible to place them as tarball next to it,
but then you add other challenges in toolchains and workflows, if you
think about the separate .buildinfo vs the ArchLinux embedded one.
How do you find the right buildinfo? What if someone only fetches the
binary?





Re: i-probably-didnt-backdoor-this: Reproducible Builds for upstreams

2021-08-19 Thread Bernhard M. Wiedemann


On 20/08/2021 01.16, kpcyrd wrote:
> 
> I uploaded a github repo that distributes a Hello World in various
> formats (ELF binary, Docker image, 3rd party(!) Arch Linux package) and
> documented every file and command needed to reproduce the artifacts
> bit-for-bit:
> 
> https://github.com/kpcyrd/i-probably-didnt-backdoor-this

This is nice work towards binaries we can trust.

Ciao
Bernhard M.





Re: Packaging Con 2021

2021-08-12 Thread Bernhard M. Wiedemann


On 12/08/2021 08.48, Wolf Vollprecht wrote:

> we're organizing a conference about software package management: 
> https://packaging-con.org/
> 
> It's the first time that this event happens: we are trying to bring together 
> many different package management communities, compare approaches, share 
> lessons learned, etc.
> 
> It would be really interesting to learn more about reproducible builds there, 
> and the specific challenges you guys had to overcome. I think it could be 
> very valuable for other package management communities. The call for 
> presentations ends on August 31.

I just submitted
https://pretalx.com/packagingcon-2021/me/submissions/BGXP3D/
before reading your mail.

Abstract:
> Why everyone should do reproducible builds and how can package managers help 
> in getting there.


Description:
> Distributors and users alike are worried these days about supply chain 
> attacks as those on SolarWinds. 
> 
> For FLOSS developers, reproducible-builds is an easy way to let people verify 
> that the published packages indeed correspond to their public sources.
> 
> This presentation will answer the Why? What? and How?


If lamby or someone else wants to do a joint session, it might make the
preparation a bit more tricky, but could make the outcome more
interesting to users.
Or I cover the basics and others add lightning-talk slots about their
specifics (would be great if the conference organizers get those
scheduled after it)





Re: Help us map the reproducible builds ecosystem

2021-08-05 Thread Bernhard M. Wiedemann


On 05/08/2021 17.18, Santiago Torres-Arias wrote:
> Part of what I'm hoping is to involve r-b within IronHacks in
> the forseeable future: so as to encourage a hackathon on finding and
> cataloging reproducibility issues.

Well, there is this high-level collection of the 10 basic issues I found
https://github.com/bmwiedemann/theunreproduciblepackage/
with descriptions, simplified examples and links to real fixes.

and there is the more low-level collection from mostly lamby
https://salsa.debian.org/reproducible-builds/reproducible-notes/
(also look at history because solved issues are dropped)

How would the catalog you envision differ from those?





Re: apk/dex differences, diffoscope can't really tell what's going on. Any ideas?

2021-05-29 Thread Bernhard M. Wiedemann


On 29/05/2021 14.30, Marcus Hoffmann via rb-general wrote:
> we're trying to hunt down an unreproducible apk build.
> 
> We currently have a diff between two dex files which diffoscope can't
> really tell us anything about:
> https://bubu1.eu/diffoscope_dex.html
> 
> Anyone got any idea what's going on here?
> 
> (File are https://bubu1.eu/classes.dex and
> https://bubu1.eu/classes_fynn.dex)

They differ in
"pg-map-id":"xxx"

and the 24 differing bytes starting at offset 8 could be a 192 bit
checksum over the remaining content.

If in doubt, check the code creating it for "pg-map-id" and for what
goes after the dex\n035\000 magic header.
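
For what it is worth, the published dex format describes the bytes after
the 8-byte magic as a 4-byte little-endian Adler-32 checksum followed by a
20-byte SHA-1 signature, which together make up exactly those 24 bytes. The
Python sketch below checks both fields on that assumption; the function
names and the fake sample are invented for illustration and no real dex
file is parsed here.

```python
import hashlib
import struct
import zlib

MAGIC = b"dex\n035\x00"


def dex_header_fields(data: bytes) -> tuple[bytes, int, bytes]:
    """Split out magic (8 bytes), Adler-32 (4 bytes LE) and SHA-1 (20 bytes)."""
    (checksum,) = struct.unpack_from("<I", data, 8)
    return data[0:8], checksum, data[12:32]


def dex_header_is_consistent(data: bytes) -> bool:
    """Recompute both fields from the file contents and compare."""
    _, checksum, signature = dex_header_fields(data)
    return (
        zlib.adler32(data[12:]) & 0xFFFFFFFF == checksum  # covers signature + rest
        and hashlib.sha1(data[32:]).digest() == signature  # covers rest only
    )


def fake_dex(body: bytes) -> bytes:
    """Build a minimal stand-in with valid checksum fields (not a real dex)."""
    signature = hashlib.sha1(body).digest()
    checksum = zlib.adler32(signature + body) & 0xFFFFFFFF
    return MAGIC + struct.pack("<I", checksum) + signature + body


print(dex_header_is_consistent(fake_dex(b'"pg-map-id":"xxx"')))
```

Since both fields are derived from the remaining content, any change such
as a different pg-map-id would change all 24 bytes as a side effect.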

https://gitlab.torproject.org/tpo/applications/tor-browser-build/-/issues/40085
seems related.

https://speakerdeck.com/jakewharton/diffusing-changes-in-your-apks-droidcon-toronto-2019
also has something.





Re: Possible new category for non-reproducible builds: --build-id=sha1

2021-04-24 Thread Bernhard M. Wiedemann


On 24/04/2021 17.59, Roland Clobus wrote:
> I've looked the reproducible report for apt-cacher-ng [1].
> It looks like it is caused by a linker flag: -Wl,--build-id=sha1


man ld says
> --build-id=style
> If style is omitted, "sha1" is used.


So this is just the default made explicit.

If you see --build-id=uuid it is bad, because it will use randomness
instead of hashing inputs.

If you see variations in build-id with sha1 mode, it means there were
already variations in inputs before and those inputs should be made
deterministic.
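
A toy Python model of the difference (this is not how ld is implemented,
just the principle): a hash-style ID is a pure function of its input,
while a uuid-style ID is fresh on every run. Both function names are made
up.

```python
import hashlib
import uuid


def build_id_sha1(contents: bytes) -> str:
    """Hash-style ID: identical input bytes always give the identical ID."""
    return hashlib.sha1(contents).hexdigest()


def build_id_uuid() -> str:
    """uuid-style ID: random, so two builds never agree."""
    return uuid.uuid4().hex


binary = b"same linked output"
print(build_id_sha1(binary) == build_id_sha1(binary))  # stable
print(build_id_uuid() == build_id_uuid())              # differs per call
```

So a varying sha1 build-id is a symptom of varying input, not a cause.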


Ciao
Bernhard M.





Re: Attack on SolarWinds could have been countered by reproducible builds

2021-04-16 Thread Bernhard M. Wiedemann


On 14/04/2021 19.02, Chris Lamb wrote:
> A quick update: as permitted by IEEE, the paper is now available in an
> open access / preprint capacity:
> 
>https://ieeexplore.ieee.org/document/9403390
>https://arxiv.org/abs/2104.06020


I reviewed the latter and found some issues:

> doing so is inefficient when source code is available for audit

was very confusing to read. I read it multiple times and understood it
as "source code makes audit inefficient" until some time later
re-reading with more context.

Should be something about "auditing source-code is more efficient than
auditing binaries"


> The mechanics of reproducibility testing suggest that this issue would not 
> have been readily discovered another way.

not sure if mechanics are people here or mechanisms - and not sure how
either would suggest something.
Why not "We believe that..." or "Our experience (in rb) leads us to
think..." ?


> However, this has not yet been achieved, partly because time and effort are 
> not inexhaustible or fungible resources in volunteer communities

This is hard to parse, not only because of the double-negation ("not
in-"). Does it mean: Engineers have limited time and volunteers even
more so? And 'fungible' means you can not just put a noob's hour in and
achieve as much as an expert-hour?



For the list of common issues: code compiling with -march=native is a
common occurrence that also is a bug found easily by rb. I often find
that in our HPC and science package sections.


In the debugging section, you only mentioned looking at diffoscope
output. Did you consider adding some of the other useful ways mentioned
in section 2 of
https://github.com/bmwiedemann/reproducibleopensuse/blob/devel/howtodebug ?




and some grammar fixes:

-a extremely mature
+an extremely mature

-tool that recursively unpacks a large number of archive formats and
translate tens of binary formats
+tool that recursively unpacks a large number of archive formats and
translates tens of binary formats


Ciao
Bernhard M.





Re: Please review the draft for March's report

2021-04-06 Thread Bernhard M. Wiedemann
On 06/04/2021 02.24, Daniel Shahaf wrote:
> I don't understand from that post what's so significant about sigstore,
> even after having followed the link to upstream's press release.  

I think, the problem that it tries to address is that most (90%?) of
upstreams publish just tarballs/zipfiles without a cryptographic
signature. E.g. [1]
So as a packager, I download the file and have no way to verify that I
got what the author meant to publish.

Now, if you have a third party that also downloads the file and
publishes a signature over what it got, you at least have another data
point that helps you verify that your local wifi or a rogue mirror did
not mitm your transfer or that at least everyone gets the same version
(you could call it "reproducible downloads").
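
The core of that check fits in a few lines of Python. This sketch only
compares plain SHA-256 digests and does not model sigstore's actual API or
signature format; the function names and sample data are hypothetical.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def download_matches(local: bytes, attested_digests: list[str]) -> bool:
    """True if at least one third party attested and all agree with our copy."""
    mine = sha256_of(local)
    return bool(attested_digests) and all(d == mine for d in attested_digests)


tarball = b"pretend this is foo-1.0.tar.gz"
third_party = [sha256_of(tarball)]  # what an independent downloader published
print(download_matches(tarball, third_party))
print(download_matches(tarball + b"tampered", third_party))
```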

If such a 3rd-party signing key ever leaks, you do not
want years of signatures to become worthless => this is why they make
keys short-lived - similar to how you can make syslogs tamper-proof.


[1]  https://ftp.gnu.org/gnu/autoconf/
 https://avahi.org/download/
 https://download.gnome.org/sources/audiofile/0.3/
 http://mirror.synyx.de/apache/httpd/mod_fcgid/


Re: How could we accelerate *deployment* of verified reproducible builds?

2021-01-30 Thread Bernhard M. Wiedemann


On 30/01/2021 17.27, David A. Wheeler wrote:
> Technically correct, the best kind of correct :-). And to be fair, there 
> *are* some reproducible builds (as others have noted).

on that topic, openSUSE is somewhere around 96% verifiable (modulo some
missing mtime normalization) and I am also constantly verifying with my
rebuilder. Visible as "verified" : 1
in https://rb.zq1.de/compare.factory/reproducible.json


> But I want to see them accelerated into more key places. An unfair counter 
> statement could be “you’ve been at this a while, why aren’t you done?”. I 
> think that’s unfair because it’s not so easy; there are many little things 
> that have to be done (timestamps set, collections forced into specific 
> orders, etc.). But what would it take to accelerate things?

There are some hard problems, when reproducibility collides with other
desired properties of software.

E.g.
1) performance: gcc PGO makes gcc run 8% faster

2) security: tigervnc signs .jar files with random temp privkey,
   libcamera also does sigs to not trust third-party modules,
openbuildservice signs kernel modules with a secret key for secure-boot
other packages generate random DH-params and re-using them can make
attacks easier with pre-computing


3) simplicity/maintainability/portability/reliability (e.g. when
software needs to work with non-GNU date, patches can get rather messy
and introduce problems, also in some places we added a y2038 problem of
strtol/time_t with SDE patches)
Many upstreams also don't like the concept of the SDE (SOURCE_DATE_EPOCH)
environment variable that influences results - paradoxically because they
consider that less reproducible than explicit command line args or config
entries.


https://rb.zq1.de/compare.factory/graph.png shows another aspect of why
we are not done yet. You see, the number of reproducible packages is
constantly increasing and I do a dozen patches each month to make
unreproducible packages reproducible, but that just keeps the number of
unreproducible packages constant around 500.


To make progress, we wanted to concentrate on core packages first, e.g.
for openSUSE, we have
https://rb.zq1.de/compare.factory-20210129/unreproduciblerings.txt that
shows for ring0 (bootstrap) just
bison
gcc10
python38

- all of them suffer from PGO [1], python also suffers from
non-determinism in .pyc files [3]

Our (SUSE) compiler guys said, the merges of counters used in .gcda
files used for PGO are non-commutative, so if you do A and then B in a
profiling run, you get different optimizations than if you first did B
and then A. Redesigning that could improve PGO determinism in many places.
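
To show what an order-dependent merge can look like, here is a deliberately
simplified Python toy. It is a hypothetical model and not gcc's actual
.gcda merge code: a value profile with a fixed number of slots keeps the
first values it sees and drops later ones once it is full.

```python
def merge_profiles(runs: list[dict[str, int]], slots: int = 1) -> dict[str, int]:
    """Merge per-run value counts into a table with a limited number of slots."""
    table: dict[str, int] = {}
    for run in runs:
        for value, count in run.items():
            if value in table:
                table[value] += count
            elif len(table) < slots:
                table[value] = count
            # else: table is full, this sample is silently dropped
    return table


run_a = {"x": 5}
run_b = {"y": 5}
print(merge_profiles([run_a, run_b]))  # A first: only "x" survives
print(merge_profiles([run_b, run_a]))  # B first: only "y" survives
```

An optimizer reading the merged table would then specialize for a different
value depending on which profiling run finished first.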


Another approach could use a variant of dettrace[2] to make
non-deterministic behaviour reproducible in build envs.
For that we would need to make working OS packages.
That would again result in a trade-off (especially for large packages
like libreoffice or gcc), because it will slow down the build and that
slows down update-cycles for users.

I also wanted to use dettrace in autoclassify
(https://trello.com/c/yKaKMjNq/67-use-dettrace-in-autoclassify)
to better pin down the source of non-determinism, but any extra
contributor could do these things.

There are also about a dozen other things listed in my trello board (java
and python toolchain improvements could make big impact). That could get
us beyond 98%. Maybe even to 100% verifiable for the core packages.

Once packages are verifiable, verifying is easy.


Ciao
Bernhard M.


[1] https://github.com/bmwiedemann/theunreproduciblepackage/tree/master/pgo
[2] https://github.com/dettrace/dettrace
[3] https://trello.com/c/I9voedvB/7-pyc-rb





Re: Hi, intro, and introducing System Transparency

2021-01-22 Thread Bernhard M. Wiedemann
Thanks for the interesting concept.

On 17/01/2021 13.40, Fredrik Strömberg wrote:
> https://system-transparency.org/
> https://mullvad.net/en/blog/2019/6/3/system-transparency-future/
> https://mullvad.net/nl/blog/2019/8/7/open-source-firmware-future/

in https://mullvad.net/media/system-transparency-rev5.pdf
you wrote
> The goal of the provisioning ritual is to convince future auditors that the 
> stated hardware specifications are correct; that the boot ROM was programmed 
> with an artifact with a specific checksum; and, finally, to tie the platform 
> to a newly generated public key contained in the platform TPM. Assurance that 
> the platform has not been tampered with after the provisioning ritual is 
> provided by tamper detection switches connected to the casing and TPM; 
> through the use of an enclosure PUF; or similar measures.

That reminded me very much of the design the DCI used to secure digital
projectors+media blocks that were allowed to receive encrypted video
content.


However, for a VPN, I'd prefer the tor design, because it is clear that
it is possible to monitor incoming and outgoing traffic at the ISP or
routers and correlate some of it via timing and packet sizes (harder if
you use padding and random delays).
Can be combined though. E.g. tor over VPN.

Of course, a lot depends on your threat model - are your users evading
the nosy neighbor, RIAA or the Mossad?






Re: Attack on SolarWinds could have been countered by reproducible builds

2020-12-27 Thread Bernhard M. Wiedemann


On 21/12/2020 22.28, Richard Purdie wrote:
> OE-Core is about 800 pieces of software generating ~11,000
> packages of which we have about 65 marked as not reproducible at
> present. We're obviously working on improving those 65, and the
> techniques used will "just work" to a large extend throughout our wider
> layers of other software, we're just note testing that until we sort
> the core.

do you have pointers to the list of unreproducible packages and how to
do test builds?


In http://git.openembedded.org/openembedded-core/
meta/lib/oeqa/selftest/cases/reproducible.py exclude_packages maybe?


>   'acpica-src',
>   'babeltrace2-ptest',
>   'bootchart2-doc',
>   'cups',
>   'cwautomacros',
>   'dtc',
>   'efivar',
>   'epiphany',
>   'gcr',
>   'git',
>   'glide',
>   'go-dep',
>   'go-helloworld',
>   'go-runtime',
>   'go_',
>   'groff',
https://build.opensuse.org/request/show/645935
>   'gst-devtools',
>   'gstreamer1.0-python',
>   'gtk-doc',
https://bugzilla.gnome.org/show_bug.cgi?id=784177
>   'igt-gpu-tools',
>   'kernel-devsrc',
>   'libaprutil',
>   'libcap-ng',
>   'libhandy-1-src',
>   'libid3tag',
>   'libproxy',
>   'libsecret-dev',
>   'libsecret-src',
>   'lttng-tools-dbg',
>   'lttng-tools-ptest',
>   'ltp',
>   'meson',
>   'ovmf-shell-efi',
>   'parted-ptest',
>   'perf',
https://elixir.bootlin.com/linux/latest/source/tools/perf/pmu-events/jevents.c#L1168
>   'python3-cython',
>   'qemu',
>   'quilt-ptest',
>   'rsync',
>   'ruby',
https://github.com/ruby/io-console/commit/679a941d05d869f5e575730f6581c027203b7b26
>   'spirv-tools-dev',
>   'swig',
>   'syslinux-misc',
>   'systemd-bootchart',
>   'valgrind-ptest',
>   'vim',
>   'watchdog',
>   'xmlto',
>   'xorg-minimal-fonts'

I found some relevant patches and pointers in our packages, linked above.





rb-debugging meeting minutes

2020-12-07 Thread Bernhard M. Wiedemann
A bit delayed are the meeting minutes from our IRC meeting on debugging
reproducibility issues:

http://meetbot.debian.net/reproducible-builds/2020/reproducible-builds.2020-11-16-18.08.html



Here is a dump of the etherpad notes for that topic:

bmwiedemann once wrote
https://github.com/bmwiedemann/reproducibleopensuse/blob/devel/howtodebug
which contains some openSUSE-specific parts, but also plenty
distribution-agnostic steps

do similar guides exist for Arch and Debian?
https://wiki.debian.org/ReproducibleBuilds/Howto - a bit out-of-date and
unmaintained

How do you do

double-build (with some variations)

Debian: reprotest ; has --auto-build for autoclassify and can keep
diffs with only one source of non-determinism

verification build (compare local build result with official build)

Archlinux: rebuilderd + archlinux-repro ; no reprotest yet

nix: first "nix-build '<nixpkgs>' -A packagename" to fetch the
official build from the cache, then the same command but with an
additional '--check' parameter to rebuild and verify. No reprotest yet.

compare noarch builds between i586 and x86_64 (aka amd64)

  Debian does this on tests.reproducible-builds.org i386 tests (one
build with 686-pae kernel, one with amd64 kernel)






Re: To do tasks in Reproducible Builds

2020-08-12 Thread Bernhard M. Wiedemann


On 12/08/2020 19.44, jathan wrote:
> On 12/08/2020 10:54, Holger Levsen wrote:
>> Hi Jathan,
>>
>> On Fri, Aug 07, 2020 at 04:10:25PM -0500, jathan wrote:
>>> I was visiting the Reproducible Builds websitesite and the Debian Salsa
>>> repo looking for some list of "To do tasks" in the team. Do we have
>>> something to view which tasks need to be done with priorities or how do
>>> you organise in that way please?
>>
>> did you see https://reproducible-builds.org/contribute/ ?
>>
>>
> Hi Holger!
> 
> Thanks a lot for your response :) Yes I did. I read that page, but I
> found it more in the way about how someone can collaborate in RB. I was
> looking for some real time resource in which you can see tasks to be
> done, in progress and done, something similar like a Kanban board or
> some kind of tasks tracking. I will check more detailed anyway the
> resources available at the contribution page. Thank you for pointing this!

I think, overall it would be good to get some toolchain issues fixed,
that affect multiple packages:

https://trello.com/c/pHLGpzDQ/39-mono
https://trello.com/c/9aSypA7E/71-rust-libgit2
https://trello.com/c/kfKHyItI/70-go-buildid
https://trello.com/c/I9voedvB/7-pyc-rb

java has plenty. In openSUSE, xmvn is the worst
https://bugzilla.opensuse.org/show_bug.cgi?id=1162112

When I last looked through the list of compilers and interpreters, those
that built reproducibly were rather few. perl, bash, llvm, ocaml and
ruby2.7 are the positive exceptions now.
http://ismypackagereproducibleyet.org/?pkg=perl
Maybe we can also find more/better datasources for this tool?


There are also pretty hard challenges, e.g. when you look at how clisp
and emacs create their binaries through memory dumps.

Or when you want to make gcc build reproducibly even with
profile-guided-optimizations and its huge profile run that is a whole
gcc build.
When I talked to our (SUSE) compiler guys, they said there might be some
gcc patches possible to make profiling react less to variations in
ordering, so that running independent A,B would yield the same as B,A.


Ciao
Bernhard M.





Re: JavaScript information for website

2020-07-28 Thread Bernhard M. Wiedemann



Am 24.07.20 um 22:10 schrieb John Scott:
> I'm not subscribed, please keep me CC'd.
>
> I'm working on adding metadata to the JavaScript on
> reproducible-builds.org so software like LibreJS can know it's free
> and where the source code can be obtained.
>
> However, modernizr.min.js, popper.min.js, and run_prettify.js are
> minimized and don't say what version they are. Would anyone be able
> to identify this so I can locate the corresponding source?

On that topic: did someone test if JS-minifier code is deterministic?

and is there continuous testing somewhere to verify that published
non-source minified versions correspond to the respective sources?
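
A first, cheap check of the determinism question could look like the sketch below. It assumes the minifier can be called as a command that reads source on stdin and writes the result to stdout; the command itself is a placeholder, and passing a few runs on one machine says nothing about other versions or environments:

```python
import hashlib
import subprocess


def is_deterministic(cmd, input_bytes, runs=3):
    """Run cmd several times on the same input; True if stdout never changes."""
    digests = set()
    for _ in range(runs):
        result = subprocess.run(cmd, input=input_bytes, capture_output=True, check=True)
        digests.add(hashlib.sha256(result.stdout).hexdigest())
    return len(digests) == 1
```

The same hash comparison, run against the published .min.js instead of a second local run, would answer the second question for a given minifier version.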


links I found on that topic in our blog posts:
https://blog.bitpay.com/npm-package-vulnerability-copay/
https://www.theregister.co.uk/2018/11/26/npm_repo_bitcoin_stealer/
https://diff.intrinsic.com/
https://extensionworkshop.com/documentation/publish/add-on-policies/


Re: Please review the draft for May's report

2020-06-09 Thread Bernhard M. Wiedemann



Am 08.06.20 um 07:52 schrieb Daniel Shahaf:
> Besides, there was no question, no concrete request, no clickable
> URL…

https://walletscrutiny.com/ was mentioned, though.
IMHO an interesting and worthwhile project. It probably could use more
automation in verifying reproducibility.

How would the app-update workflow work in a perfect world, where we do
not have to trust the app builder?

Maybe like this:
1. developer pushes a signed git tag to the official repo

2. multiple independent builders build binaries and sign some
"buildinfo" about source+binary hashes, publish it to some
buildinfo-collection place.

3. after N trusted rebuilders agreed on what the correct binary should
be, the app-store (e.g. F-Droid) publishes the binary for all users

3b. in theory, this could use anonymous uploads, where anyone can
upload a binary to server.domain.tld/public/HASH as long as the HASH
of the upload is the correct one.

4. F-Droid client pulls new app version and signed buildinfo files and
checks if F-Droid server did the right thing
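
Step 3 above could be sketched roughly as follows. This is only an illustration: the function name and the shape of the input are assumptions, and checking the signatures on the buildinfo files is assumed to have happened before this point:

```python
from collections import Counter


def agreed_binary_hash(attestations, trusted, threshold):
    """Return the one binary hash backed by >= threshold trusted rebuilders, else None.

    attestations: dict mapping rebuilder name -> binary hash it reported.
    trusted: set of rebuilder names this client chooses to trust.
    """
    votes = Counter(h for name, h in attestations.items() if name in trusted)
    winners = [h for h, count in votes.items() if count >= threshold]
    # Refuse to pick if nothing (or more than one hash) reaches the threshold.
    return winners[0] if len(winners) == 1 else None
```

The client in step 4 could run the same check on its side, with its own set of trusted rebuilders, instead of relying on the store's decision.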


Re: rebuilding Maven Central Repository artifacts: welcome reproducible-central

2020-04-16 Thread Bernhard M. Wiedemann


On 03/04/2020 06.03, Hervé Boutemy wrote:
> The big question is: where is the database that tells that a binary
> artifact is reproducible? Who should one trust for such a database?
> based on what proof?

There was the idea that rebuilders sign their buildinfo files
that contain what sources produced what binaries in what env.

Then the database would just collect (links to) those signed snippets
in a similar way to
https://keybase.io/bmwiedemann doing it for associating accounts via
signed messages.

That could allow (tools of) users to decide which set of rebuilders to
trust.

Just my 0.02 EUR

Ciao
Bernhard M.


ismypackagereproducibleyet.org is working

2020-04-16 Thread Bernhard M. Wiedemann

Hi,

As discussed at the last summit in Marrakesh, I made a working draft for
https://ismypackagereproducibleyet.org/

With only 92 lines of code, it is still rather simple.
It is free of JavaScript.

It does not map package names as you will see at
https://ismypackagereproducibleyet.org/?pkg=ruby2.6
https://ismypackagereproducibleyet.org/?pkg=ruby2.7
and
https://ismypackagereproducibleyet.org/?pkg=firefox
https://ismypackagereproducibleyet.org/?pkg=MozillaFirefox


more interesting output examples:
https://ismypackagereproducibleyet.org/?pkg=perl
https://ismypackagereproducibleyet.org/?pkg=glibc


More distributions can easily be added, if they produce a .json output
like the others. See Makefile for existing ones and
https://rb.zq1.de/spec/json-format.txt for my old spec. Of the fields,
only "package" and "status" are used atm, so for Debian there is only
1 result shown per package name.
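
To give an idea of how little is needed on the consuming side, here is a sketch of a lookup. It assumes the .json file is an array of objects that carry at least "package" and "status"; the status strings in the usage example are made up and the spec linked above remains the reference:

```python
import json


def statuses_for(package, json_text):
    """Return the sorted set of 'status' values recorded for one package name."""
    records = json.loads(json_text)
    return sorted({r["status"] for r in records if r.get("package") == package})
```

For example, statuses_for("perl", data) would return a single entry even if several records with the same status exist for that name, which matches the one-result-per-package behaviour described above.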

I welcome PRs at
https://github.com/bmwiedemann/ismypackagereproducibleyet

have a lot of phun
Bernhard M.


Re: [rb-general] Request for help to make mariadb-10.3 in Debian reproducible

2020-02-10 Thread Bernhard M. Wiedemann

On 2/9/20 9:57 AM, Chris Lamb wrote:

Hi Otto,


Unfortunately none of the changes I made seemed to solve this..
10.3.22 is still unreproducible in unstable due to RocksDB, TokuDB and
Mroonga.


I had that old patch in rocksdb
https://github.com/facebook/rocksdb/pull/2848

but that is long merged and now mariadb-10.3.20 builds reproducibly in 
openSUSE. That is using only common nondeterminisms of hostname, CPU, 
date/year, filesystem, parallelism, randomness, ASLR, PIDs

so other parameters are normalized by the build environment.

Ciao
Bernhard M.
___
rb-general@lists.reproducible-builds.org mailing list

To change your subscription options, visit 
https://lists.reproducible-builds.org/listinfo/rb-general.

To unsubscribe, send an email to 
rb-general-unsubscr...@lists.reproducible-builds.org.


Re: [rb-general] Reproducible system images

2019-12-16 Thread Bernhard M. Wiedemann
On 15/12/2019 09.12, Lars Wirzenius wrote:
> Hi,
> 
> One of my hobby projects is vmdb2 (https://vmdb2.liw.fi/), which
> creates disk images with Debian installed. I was wondering whether it
> would be possible to generate system images reproducibly.
> 
> A quick experiment with debootstrap, which creates the initial
> directory tree from which my software produces the disk image, isn't
> reproducible. The main difference is the etc/machine-id file it
> generates, which contains randomly generated content. The other
> differences are log files, cache files, and file mtime timestamps. All
> of those would be possible to work on to make them reproducible.
> 
> vmdb2 could make machine-id be all zeroes, which would mean a new id
> gets generated upon first boot, and written to the file. I'm not
> entirely sure of the security and other implications this has.
> 
> What do others on the list think? Is reproducible system images a goal
> worth pursuing?

Others worked on this before:
https://wiki.debian.org/ReproducibleInstalls

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=900918

and I looked into openSUSE's installation-images package, that has
similar problems.
There were also several post-install scripts creating files in
unreproducible ways. For normal packages that is not a problem, but for
images it is.

e.g.
https://gitlab.com/graphviz/graphviz/merge_requests/1290

and various acceleration caches.
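
For the mtime part of the differences, one common approach is to clamp timestamps in the tree before the image is created. A minimal sketch, assuming a reference timestamp such as SOURCE_DATE_EPOCH is available; the function name is made up, and symlinks are skipped to keep it short:

```python
import os


def clamp_mtimes(root, source_date_epoch):
    """Set mtimes newer than source_date_epoch back to it; return how many changed."""
    paths = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        paths.extend(os.path.join(dirpath, name) for name in dirnames + filenames)
    changed = 0
    for path in paths:
        if os.path.islink(path):
            continue  # leave symlinks alone in this sketch
        if os.stat(path).st_mtime > source_date_epoch:
            os.utime(path, (source_date_epoch, source_date_epoch))
            changed += 1
    return changed
```

Files that are older than the reference keep their mtime, so only timestamps created during the build are touched. Content differences such as machine-id, logs and caches still need their own fixes.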


[rb-general] reproducible builds on reddit

2019-12-13 Thread Bernhard M. Wiedemann
Hi,

because there are people out there that might not like Mailing Lists for
discussion or getting the latest updates, I opened
https://www.reddit.com/r/reproduciblebuilds/

You are all invited to join and post there.

Ciao
Bernhard M.


Re: [rb-general] What is the goal of reproducible builds?

2019-12-09 Thread Bernhard M. Wiedemann
Am 09.12.19 um 16:50 schrieb Santiago Torres-Arias:
>>> It all boils down as to where did a backdooring compiler come from, and how 
>>> is it backdooring the build.
>> Backdooring a compiler can be as simple as adding an optimization without 
>> fully understanding the impact
>> (See GCC optimizations + Linux kernel to see some amazing examples)
> Sure, my questions are: 
> - how did this backdooring optimization get there?
> - is it in the source code of a known compiler? (e.g., somebody broke 
> into GNU's gcc repository), 

I think this was referring to something like
https://www.redhat.com/en/blog/security-flaws-caused-compiler-optimizations

In that case, the aggressive optimizations were probably not an
intentional backdoor.
And because they were in the upstream gcc source code, they were in many
distributions.
Also the code that became insecure contained minor issues, that through
these optimizations became major issues.

However, this one falls into what I wrote on the etherpad:

> What are non-goals?
> Reproducible builds does not (intend to) help with vulnerabilities and other 
> issues that exist in the source code. Other methods exist to address those. 
> E.g. `print "2+2=5"` is wrong, yet perfectly reproducible.


[rb-general] What is the goal of reproducible builds?

2019-12-09 Thread Bernhard M. Wiedemann
TLDR:
The goal of reproducible builds is to reduce the likelihood of running
software that was corrupted (during build)


At https://etherpad.opensuse.org/p/reproduciblebuilds-goal
I added a small FAQ around it. You are welcome to contribute there with
refinements or extra questions+answers (because discussions on mailing
lists are often not easy to condense into such a document).
We can still use the ML to discuss how we discuss this :-)


background:
At the summit we had a session on how/what the r-b/verification
User-Experience (e.g. of apt) should be and found that it should be
shaped by the goal of r-b.
Since I could not find this goal documented yet, I am sending this mail
to get it fleshed out and then added to the website - on main page and
in docs/ .


Ciao
Bernhard M.


Re: [rb-general] progress in rpm and openSUSE in 2019

2019-11-30 Thread Bernhard M. Wiedemann



On 30/11/2019 13.20, Holger Levsen wrote:
>> I also added an rbplot.pl script to graph the package status over
>> time like Debian does. http://rb.zq1.de/compare.factory/graph.png
>> shows the current state.
> wow, very nice!
>
> any idea what has changed in 2019-08 where the 'gap' became
> slightly bigger (a bit more orange and quite some more red)?

The extra red probably came from LTO-related build failures
or packages that started to fail for other reasons, but in my env
there had still been the old working build.

The extra unreproducible packages probably came because some packages
had been built with more determinism than normal and after the
rebuild, the tweaks were not applied again.
https://rb.zq1.de/compare.factory/reproducible.json lists the
"opensusetweaks" determinism-bits, but I did not yet get to
incorporate it into the graph.


>> https://bugzilla.opensuse.org/show_bug.cgi?id=1133809 tracks
>> progress towards bit-reproducible OBS Factory pkgs.
>
> no news since 2019-08-22 :/

it is not obvious in bugzilla, but the "History" link shows that at
2019-09-10, a "depends on"
https://bugzilla.opensuse.org/show_bug.cgi?id=1148824
has been added - because we found that for some files the rsync used
in pushing to mirrors did not push files that kept their mtime (due to
S_D_E) when the content did actually change in relevant ways.
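
The effect can be shown without rsync. By default rsync decides with a quick check on size and mtime; the sketch below mimics that and contrasts it with a content comparison. It assumes the changed file also kept its size, and the function names are only for illustration:

```python
import hashlib
import os


def quick_check_same(a, b):
    """Mimic a size+mtime comparison: cheap, but blind to same-size content changes."""
    sa, sb = os.stat(a), os.stat(b)
    return (sa.st_size, int(sa.st_mtime)) == (sb.st_size, int(sb.st_mtime))


def content_same(a, b):
    """Compare by SHA-256 of the content, independent of timestamps."""
    def digest(path):
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()
    return digest(a) == digest(b)
```

Two files with equal size and a normalized mtime but different bytes pass the first check and fail the second. On the rsync side, the --checksum option trades speed for this kind of content comparison.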

Maybe we can also have a discussion on how to best set S_D_E for image
builds.
Or with respect to toolchain updates.




[rb-general] progress in rpm and openSUSE in 2019

2019-11-15 Thread Bernhard M. Wiedemann
Hi

like last year (see
https://lists.reproducible-builds.org/pipermail/rb-general/2018-December/001301.html
)
as preparation to our summit in December, I wanted to once again collect
last year's changes in rpm and openSUSE that were relevant to
reproducible builds.

The collection below is in a slightly raw form, but I hope still useful
to many of us.


The period is roughly covered by
https://salsa.debian.org/reproducible-builds/reproducible-website/tree/master/_blog/posts/188.md
to _reports/2019-11.md


In my reproducibleopensuse repo, I added

https://github.com/bmwiedemann/reproducibleopensuse/blob/master/howtodebug

with details on how to find, debug and fix reproducibility issues.
I also added an rbplot.pl script to graph the package status over time
like Debian does.
http://rb.zq1.de/compare.factory/graph.png shows the current state.


I started a binary archive to be able to rebuild with old binaries. This
made it possible to verify that published (reproducible) packages were not
tampered with during build.
https://lizards.opensuse.org/2019/04/03/experimental-opensuse-mirror-via-ipfs/

Slightly related is also this package source mirror:
https://github.com/bmwiedemann/openSUSE/
It is not yet used for r-b, but at some point could provide a way to
snapshot a whole distribution source tree with a simple "git tag"



I discovered one issue in OBS
https://github.com/openSUSE/open-build-service/issues/6690 new binaries
published under old names confuse our other tools


https://build.opensuse.org/package/rdiff/openSUSE:Leap:15.1/post-build-checks?linkrev=base=26
added FORCE_SOURCE_DATE=1 for latex
and fixed suse-ignored-rpaths.conf (the old version caused i586 or
x86_64 builds to be unverifiable)


https://github.com/openSUSE/brp-check-suse/pull/10 was merged,
allowing for bit-reproducible .a files


https://github.com/openSUSE/pesign-obs-integration/pull/13 pass through
rpm %licence filetype tag

https://github.com/openSUSE/pesign-obs-integration/pull/14 to better
keep rpm bits was merged, but then reverted because it caused trouble
for VirtualBox.


A number of fixes have been done in rpm. Like dpkg in Debian, rpm is the
low-level package manager used in openSUSE, Mandriva, Fedora, Qubes OS
and various derivatives. rpm also includes rpmbuild.


https://github.com/rpm-software-management/rpm/pull/656 properly
initialize some rpm metadata

https://github.com/rpm-software-management/rpm/pull/785 (allow for
unreproducible Build Date and make it the default)

https://github.com/rpm-software-management/rpm/pull/931 toolchain, keep
at least one changelog entry

https://github.com/rpm-software-management/rpm/pull/935 regression-fix
to allow to override the Build Date header again

https://github.com/rpm-software-management/rpm/pull/936 fix header
generation order



In 2019-07, openSUSE enabled builds with Link Time Optimization (LTO) in
all packages. This introduced some unreproducibility that has now all
been fixed.

https://bugzilla.opensuse.org/show_bug.cgi?id=1140896 =
https://bugzilla.opensuse.org/show_bug.cgi?id=1141319 -flto introduces the
number of CPUs, which causes variations in rpm OPTFLAGS and debuginfo in
.a files and similar
https://bugzilla.opensuse.org/show_bug.cgi?id=1141323 packages embed
CFLAGS with -flto : fldigi gmp haproxy ImageMagick lyx neovim tboot tcl znc

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91307
LTO-induced indeterminism from global constructors

https://bugzilla.opensuse.org/show_bug.cgi?id=1143905 fwupd computed a
hash over unreproducible LTO data

https://github.com/openSUSE/brp-check-suse/pull/29 I proposed to strip
LTO data from .o files

https://build.opensuse.org/request/show/732635 updated rpm to use
-flto=auto (since 2019-10-21) to not embed the number of CPUs anymore in
the resulting artifacts.



https://github.com/openSUSE/build-compare/pull/31 ignore javadoc
dc.created date


https://github.com/openSUSE/osc/issues/547 report multibuild dep bug

https://github.com/openSUSE/obs-build/pull/510 use gzip -n in Debian
package build



https://github.com/bmwiedemann/theunreproduciblepackage/ got 8 commits,
including one on how floating point introduces non-determinism.
It also adds notes on solutions to some issues.


https://bugzilla.opensuse.org/show_bug.cgi?id=1133809 tracks progress
towards bit-reproducible OBS Factory pkgs.



And finally we had some toolchain and high profile openSUSE packages
patched:


https://github.com/python/cpython/pull/12341 a toolchain patch was
merged to sort readdir when building C-extensions for python.
openSUSE python + python3 packages got backports
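
The general idea behind this kind of readdir fix, as a sketch (this is not the code from the cpython patch, and the function name is made up): directory listings come back in filesystem order, so anything that feeds them into a compiler or linker should sort first.

```python
import os


def list_sources(directory, suffix=".c"):
    """Return matching file names in a stable order, independent of readdir order."""
    return sorted(name for name in os.listdir(directory) if name.endswith(suffix))
```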


We added a pip install macro to handle python's wheel (.whl) files
without creating unreproducible .pyc files
https://bugzilla.opensuse.org/show_bug.cgi?id=1094323


https://build.opensuse.org/request/show/705693 gettext-runtime use SDE
for mtime to make acl package build reproducible

fix build time race in MozillaFirefox + Thunderbird translations

Re: [rb-general] submission: Reproducible Toolchains For The Win!

2019-08-13 Thread Bernhard M. Wiedemann
On 13/08/2019 02.18, Vagrant Cascadian wrote:
> On 2019-08-01, Bernhard M. Wiedemann wrote:
>> On 31/07/2019 16.50, Vagrant Cascadian wrote:
>> solved:
>>
>> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=32342 gzip
>> https://git.savannah.gnu.org/cgit/make.git/commit/?id=eedea52afb2069e54188508cd87cb7724b30dd6a

> It's probably going to be mostly the GNU side of toolchains due to the
> nature of the conference, but can still be really useful to have
> examples of general issues affecting other toolchains, even for those
> that are not specifically GNU.

gzip and gnu-make lines above still apply then.

And there have been several gcc patches and issues

[gcc/nvme-cli](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91307) report
LTO-induced indeterminism from global constructors

> _reports/2019-06.md:* Richard Biener submitted a patch for the [GCC
GNU Compiler Collection](https://gcc.gnu.org/) to [fix differences in
the runtime debugging info between
builds](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90778) in its D
programming language support.




Re: [rb-general] submission: Reproducible Toolchains For The Win!

2019-08-01 Thread Bernhard M. Wiedemann
On 31/07/2019 16.50, Vagrant Cascadian wrote:

> This talk will mention some of the past and current issues in
> toolchains needed to realize Reproducible Builds in the real world.
> Let's work together to fix outstanding issues and further these
> efforts!

in case you need input, here are 9 entries tagged "toolchain" from my
list of patches:

solved:

https://debbugs.gnu.org/cgi/bugreport.cgi?bug=32342 gzip
https://gitlab.kitware.com/cmake/cmake/merge_requests/432
https://github.com/rpm-software-management/rpm/pull/536
https://github.com/rpm-software-management/rpm/pull/485
https://bugreports.qt.io/browse/QTBUG-62511
https://git.savannah.gnu.org/cgit/make.git/commit/?id=eedea52afb2069e54188508cd87cb7724b30dd6a

open:

https://github.com/ImageMagick/ImageMagick/pull/1270
https://github.com/python/cpython/pull/12341
https://issues.apache.org/jira/browse/MJAVADOC-619 maven-javadoc-plugin
plus more missing java patches
and there is still the whole python py_compile mess left




Re: [rb-general] "Reproducible Builds - aiming for bullseye" comments and a purpose

2019-07-22 Thread Bernhard M. Wiedemann
On 22/07/2019 03.47, jathan wrote:
> I would propose myself to do these trainings, but I still find
> it very difficult and my technical knowledge is not enough. It would be
> good if Holger, lamby, Vagrant or any other member of the team with the
> experience and confidence they already have, could do this kind of
> sessions. They would really help us a lot and make the Reproducible Team
> inside Debian look less hermetic and more friendly to people who have
> no idea about Reproducible Builds. Please answer me what do you think
> about it,

I did such an (unrecorded) session at the last openSUSE conf, based on
https://raw.githubusercontent.com/bmwiedemann/reproducibleopensuse/devel/howtodebug

maybe someone could translate it into a Debian version.
Or we make a generic version with links to distribution details.

I think, much of the problems are distribution-agnostic
and just some solutions are specific to distributions (e.g. dh_*)

[rb-general] using consistent terms

2019-07-10 Thread Bernhard M. Wiedemann
Hi,

while scanning for typos, I noticed that there are several variants of
our favourite terms used.

e.g. in reproducible-website repo:
find -type f -name \*.md |
  xargs grep -h -v strip-nondeterminism |
  egrep -o '(un|in|non|not)-?(determin|reproduc)' |
  sort | uniq -c | sort -n
  1 in-reproduc
  1 not-reproduc
  2 indetermin
  2 undetermin
 32 non-reproduc
 39 non-determin
 47 nondetermin
185 unreproduc


So should we standardise on "unreproducible" and "nondeterministic" ?

The other question would be British or American English?
or we don't care...

Ciao
Bernhard M.

Re: [rb-general] Reproducible builds and distributed CI

2019-06-19 Thread Bernhard M. Wiedemann
I share Arnout's concerns.

With openSUSE we have the https://openbuildservice.org/  (OBS)
and there I had previously entertained similar thoughts.


On 19/06/2019 12.50, Arnout Engelen wrote:
> On Wed, Jun 19, 2019 at 12:29 PM Lars Wirzenius  wrote:
>> On Sun, May 19, 2019 at 01:09:40PM +0300, Lars Wirzenius wrote:
>> * Is the approach of at-least-N bitwise identical builds sensible,
>>   assuming sufficient build workers being available? Or are there
>>   security aspects and risks there that I am missing?
> 
> This is indeed an aspect that needs thought here. In its simplest
> implementation, where anyone can freely join the builder pool,
> this will obviously not work: an attacker could start a ton of build
> nodes (buying them, using a botnet, ...), and inject its malware
> when it controls at-least-N of the build nodes.
> 
> A "trust but verify" approach where you put your reputation on the
> line when providing build nodes (or get penalized in some other way
> when foul play is detected) could perhaps work.
> 
> Perhaps there are other creative mitigations?

If you want to be sure, then among the N there needs to be at least 1
that you trust. In a way, that is similar to the Debian and openSUSE
model where you have one trusted official build and rebuilders that
verify it.
Or you do it airport-security style of randomly checking 10% so anyone
doing anything malicious has a *risk* of being detected and penalized.

I guess, some trust-karma system could help there to reduce risk.
It assumes that those builders with a long track record of producing
correct builds are more likely to do the next build correctly.
And if it is hacked and produces malicious results, there are still
others that hopefully produce another result.
You still need a good way to handle disagreement - e.g. just because N-1
produce one result, it does not have to mean that the other one is wrong.
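
To put a number on the airport-security idea: if every build is independently picked for verification with a fixed probability, the chance of catching at least one of k malicious builds is 1 - (1 - p)^k. A small sketch, with the independence of the checks being an assumption of this model:

```python
def detection_probability(check_rate, malicious_builds):
    """Chance that at least one of `malicious_builds` is picked for verification."""
    return 1 - (1 - check_rate) ** malicious_builds
```

With a 10% check rate a single malicious build is caught with 10% probability, and it takes 22 of them before the chance of detection rises above 90%. So random checks deter repeated tampering much better than a one-off attack.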


> Another attack vector you should think about is how to isolate the
> build itself: it'd be bad if someone could hack the build nodes by
> submitting a malicious build. I bet there's prior art on this though,
> as this is something basically all PAAS providers have to deal with
> somehow.

In OBS, build workers use qemu/KVM to not have to trust the software
they build. You still have to make sure to keep it updated, because there
have been kvm bugs (e.g. in the emulated floppy controller) that
allowed code to break out.

> On the one
>   hand, how difficult is it to build something reproducibly?

Not that hard. You just have to avoid the 10 sources of non-determinism
that I collected in
https://github.com/bmwiedemann/theunreproduciblepackage

Most hello world programs have a reproducible build.
Most huge software collections (Firefox, Libreoffice, python3, openjdk)
have issues.

Re: [rb-general] Reproducible build of a GCC cross-compiler?

2019-06-12 Thread Bernhard M. Wiedemann
On 12/06/2019 15.07, Bernhard M. Wiedemann wrote:

> I'll also test a cross-ppc64le-gcc8 build to see if it behaves worse,
> but would not expect so.

For the record: the cross compilation was also reproducible (without PGO).

Re: [rb-general] Reproducible build of a GCC cross-compiler?

2019-06-12 Thread Bernhard M. Wiedemann
On 12/06/2019 14.42, Sebastian Huber wrote:
> do you have a log of the cross compiler build which shows the GCC
> configure command line? An example for a proven reproducible build would
> be a big help for me.

This week, I confirmed that openSUSE's gcc9 + gcc8 can build
reproducibly (non-cross) when we disable Profile Guided Optimization
(PGO) with
%do_profiling 0

Unfortunately, this also loses 8% compiler performance, so we leave PGO
enabled in our official builds for now.


gcc9 needed 1 new fix:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90778
in addition to previous fixes like
https://gcc.gnu.org/ml/gcc-patches/2018-06/msg00516.html
https://build.opensuse.org/package/view_file/devel:gcc/gcc8/gcc8-reproducible-builds-buildid-for-checksum.patch

plus all the normalizations we apply in our build env (constant path,
user, umask, locale, timezone)

https://build.opensuse.org/package/show/devel:gcc/gcc9
has build logs


I'll also test a cross-ppc64le-gcc8 build to see if it behaves worse,
but would not expect so.

Ciao
Bernhard M.




Re: [rb-general] openSUSE reproducibility (Re: Core Debian reproducibility: how close?

2019-04-30 Thread Bernhard M. Wiedemann
On 29/04/2019 15.17, Holger Levsen wrote:
> On Wed, Mar 06, 2019 at 04:06:16PM +0100, Bernhard M. Wiedemann wrote:
>> Without these macros, mtimes of files are not normalized. This is
>> because that had some negative effects on python .pyc files.
>> Build Date and Build Host rpm headers would be less risky to normalize,
>> but some people still like to have these and also there is no advantage
>> in normalizing them as long as mtimes vary.
> 
> are there any plans to fix this and enable those macros? (eg a bug where
> you track this.)

opened one now:
https://bugzilla.opensuse.org/show_bug.cgi?id=1133809

Thanks for the nudge.

Ciao
Bernhard M.

Re: [rb-general] Reproducible builds discussed in Apache Software Foundation (ASF) legal-discuss mailing list

2019-01-24 Thread Bernhard M. Wiedemann
On 23/01/2019 00.08, David A. Wheeler wrote:
> FYI, the "legal-disc...@apache.org" mailing list is having an active 
> discussion about doing reproducible builds for Apache Software Foundation 
> (ASF) projects under the topic "RE: Binary channels".  You can see that here:
> https://lists.apache.org/list.html?legal-disc...@apache.org
> 
> Their legal group is concerned about binaries released by the ASF - 
> officially the ASF only releases source code, but in practice they release 
> binaries - and how do they know they're okay?  One answer is to use 
> reproducible builds.  I've been advocating for reproducible builds from the 
> ASF, and thought you'd like to know. 

even if they did not distribute binaries of their software, others will
do that and will face similar r-b issues.
So it is good to solve r-b issues at the root (aka upstream).

One example of how _not_ to do it can be seen in:

https://bitbucket.org/berkeleylab/gasnet/pull-requests/253

[rb-general] __DATE__ and other toolchain patches

2018-12-25 Thread Bernhard M. Wiedemann
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

regarding https://github.com/uoaerg/wavemon/pull/59

I thought we no longer need this kind of patch, since gcc natively
supports SOURCE_DATE_EPOCH to override __DATE__ and __TIME__,
and everyone interested in reproducible builds sets this variable.
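
For tools that embed a timestamp themselves rather than via __DATE__,
the same convention can be followed by hand. A minimal sketch in Python,
not taken from wavemon or gcc; the function names build_timestamp and
format_build_date are made up for illustration:

```python
import os
import time


def build_timestamp(environ=os.environ):
    """Return the timestamp (seconds since the epoch) to embed in output.

    Uses SOURCE_DATE_EPOCH when it is set, so that rebuilding the same
    source gives the same value; otherwise falls back to the current time.
    """
    value = environ.get("SOURCE_DATE_EPOCH")
    if value is not None:
        return int(value)
    return int(time.time())


def format_build_date(timestamp):
    # Format in UTC so the result does not depend on the builder's timezone.
    return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(timestamp))


if __name__ == "__main__":
    print(format_build_date(build_timestamp()))
```

Formatting in UTC matters here, because a local-time rendering of the
same epoch value would still vary between build hosts.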


I guess we could even frame this more broadly and ask: if there is a
toolchain fix merged upstream
(think https://bugs.launchpad.net/intltool/+bug/1687644 ),
do we still want to fix issues that would already have gone away if
everyone had that fix in their distribution?


For my part, I very much prefer to push toolchain fixes, because you fix
the issue only once and not in an infinite number of places using that code.
-BEGIN PGP SIGNATURE-

iF0EARECAB0WIQRk4KvQEtfG32NHprVJNgs7HfuhZAUCXCKTpAAKCRBJNgs7Hfuh
ZBXZAJ9FMYJKtDHEFaz8wa39nCjJdJiNaQCeKt8tZ28xEfJvMfVmmK/6crmVXNY=
=5MKL
-END PGP SIGNATURE-

[rb-general] transitive collision resistance [was: rb formalism]

2018-12-21 Thread Bernhard M. Wiedemann
somewhat offtopic

On 20/12/2018 09.59, Daniel Shahaf wrote:
> Hash functions are usually defined in terms of collision resistance.
> The constructions above have not been proven to be collision resistant,
> and moreover, they might not *be* collision resistant — even if h() is.
> Therefore, we should assume they are not collision resistant.

https://en.wikipedia.org/wiki/Collision_resistance says:

> Collision resistance is a property of cryptographic hash functions: a hash 
> function H is collision resistant if it is hard to find two inputs that hash 
> to the same output; that is, two inputs a and b such that H(a) = H(b), and a 
> ≠ b.



While I agree that you can certainly find collisions when you do
crc16(H(a),H(b))
or
H(crc16(a),crc16(b))

I fail to see how that would be possible when both layers are
cryptographic hash functions like SHA-256, as in
H(H(a),H(b))

especially since hash functions usually work internally in rounds, and
you have to complete a sufficient number of rounds to get resistance.
Adding another full hash around it is like doubling the number of
rounds, and the concatenation in the middle does not weaken that.


One can even construct a general proof:
Given a H where it is not possible to collide H(a) = H(b) with a ≠ b
Then it is also not possible to collide
H(H(a),H(a2)) = H(H(b),H(b2)) with a ≠ b  or  a2 ≠ b2

By the premise, a ≠ b implies H(a) ≠ H(b), and a2 ≠ b2 implies
H(a2) ≠ H(b2), so we have
H(a) ≠ H(b)  or  H(a2) ≠ H(b2)
so that H(a),H(a2) ≠ H(b),H(b2) - assuming hash output of fixed length
and then the premise applies again to the outer H
to prove the conclusion.
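
The crc16 cases above can be tried out directly. This is only a sketch:
crc16 here is the CRC-CCITT variant from Python's binascii.crc_hqx (my
choice, the thread does not name a variant), and combine and
find_collision are helper names made up for illustration.

```python
import binascii
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def crc16(data: bytes) -> bytes:
    # CRC-CCITT as provided by binascii; always 2 bytes of output.
    return binascii.crc_hqx(data, 0).to_bytes(2, "big")


def combine(outer, inner, a: bytes, b: bytes) -> bytes:
    """Compute outer(inner(a) || inner(b)).

    The inner digests have fixed length, so the concatenation
    can be split in only one way.
    """
    return outer(inner(a) + inner(b))


def find_collision(func, limit=70000):
    """Brute-force two distinct inputs with the same func output."""
    seen = {}
    for i in range(limit):
        msg = str(i).encode()
        out = func(msg)
        if out in seen:
            return seen[out], msg
        seen[out] = msg
    return None


if __name__ == "__main__":
    # A 16-bit output must repeat within 65537 distinct inputs.
    a, b = find_collision(crc16)
    # An inner collision carries through the strong outer hash ...
    print(combine(sha256, crc16, a, b"x") == combine(sha256, crc16, b, b"x"))
    # ... and a weak outer hash collides no matter how strong the inner one is.
    print(find_collision(lambda m: combine(crc16, sha256, m, b"x")) is not None)
```

The same brute force is hopeless against combine(sha256, sha256, ...),
which is consistent with the argument above but of course no proof.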




Re: [rb-general] rb formalism

2018-12-19 Thread Bernhard M. Wiedemann
On 18/12/2018 15.44, Eric Myhre wrote:
> I think it's fairly open to interpretation.  Implementing it as
> h(h(➡),■) would be more or less the same semantics, no?

you could even use h(h(➡),h(■))
so that you only have to hash ■ output data once.
A bit like .buildinfo files
or foo.tar.xz.sha256.asc signatures
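
A rough sketch of that idea, assuming SHA-256 for h and treating the
inputs and the output as byte streams; hash_stream and record_hash are
names made up for illustration, not part of any .buildinfo tooling:

```python
import hashlib


def hash_stream(chunks) -> bytes:
    """Hash a possibly large artifact chunk by chunk, in one pass."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)
    return h.digest()


def record_hash(input_digest: bytes, output_digest: bytes) -> str:
    """h(h(inputs), h(output)) computed from the two fixed-size digests."""
    return hashlib.sha256(input_digest + output_digest).hexdigest()


if __name__ == "__main__":
    # The output is read once; afterwards only its 32-byte digest is needed.
    output_digest = hash_stream([b"large build ", b"output"])
    for inputs in (b"inputs, rebuild 1", b"inputs, rebuild 2"):
        print(record_hash(hash_stream([inputs]), output_digest))
```

The output digest can then be stored or signed on its own, and any
number of records can refer to it without re-reading the output.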




Re: [rb-general] Core Debian reproducibility: how close?

2018-10-23 Thread Bernhard M. Wiedemann
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 23/10/2018 14.51, David A. Wheeler wrote:
> How close is the core of Debian to being reproducibly built? By
> core I mean the packages that you always have to install no matter
> what.

Coincidentally, I just answered a similar question for openSUSE:
https://lists.opensuse.org/opensuse-factory/2018-10/msg00242.html

Of 107 core devel pkgs, 4 are very bad

Of 2444 DVD pkgs, 49 are very bad
120 more have reproducibility issues that can be auto-filtered.

Not all of them are strictly required/core, but things like Firefox,
Thunderbird, libreoffice would be good to get fixed some day, too.

Usually, around 95% of packages can be built with bit-identical results.

As detailed in https://www.suse.com/c/?p=42014 I also compared
official builds with local ones and already found several bugs with
it, so reproducibility is not just theoretical.


Ciao
Bernhard M.
-BEGIN PGP SIGNATURE-

iF0EARECAB0WIQRk4KvQEtfG32NHprVJNgs7HfuhZAUCW8+QbgAKCRBJNgs7Hfuh
ZAKRAKC8hGw0IqsH8yQ7HWpAA6Isf6bCqQCfRsHKacLpW48D3znPUZDChsrGBr4=
=s3Sb
-END PGP SIGNATURE-