Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-19 Thread Ludovic Courtès
Hi,

Skyler Ferris  skribis:

> In short, I'm not sure that we actually get any value from checking the 
> PGP signature for most projects. Either HTTPS is good enough or the 
> attacker won. 99% of the time HTTPS is good enough (though it is notable 
> that the remaining 1% has a disproportionate impact on the affected 
> population).

When checking PGP signatures, you end up with a trust-on-first-use
model: the first time, you download a PGP key that you know nothing
about and you authenticate code against that, which gives no
information.

On subsequent releases though, you can ensure (ideally) that releases
still originate from the same party.

HTTPS has nothing to do with that: it just proves that the web server
holds a valid certificate for its domain name.
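To make the contrast concrete, here is a minimal, self-contained sketch of what a detached-signature check establishes; the key, user ID, and file names are throwaway demo values, not anything from a real upstream:

```shell
set -e

# Throwaway keyring so the demo never touches the user's real one.
export GNUPGHOME="$(mktemp -d)"
chmod 700 "$GNUPGHOME"

# Generate a demo signing key non-interactively (stand-in for a maintainer key).
gpg --batch --pinentry-mode loopback --passphrase '' --quiet \
    --quick-generate-key 'Demo Maintainer <demo@example.org>' default default never

# "Release": sign an artifact with a detached signature, as upstreams do.
echo 'pretend release contents' > "$GNUPGHOME/release.tar"
gpg --batch --pinentry-mode loopback --passphrase '' --quiet \
    --output "$GNUPGHOME/release.tar.sig" --detach-sign "$GNUPGHOME/release.tar"

# Verification succeeds only if the artifact matches a key we already hold;
# it says nothing about which server the artifact came from (unlike HTTPS),
# and it is this key continuity that persists across releases.
gpg --verify "$GNUPGHOME/release.tar.sig" "$GNUPGHOME/release.tar"
```

The point of the sketch: the check binds the artifact to a key, not to a domain name, which is exactly the property HTTPS cannot give.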

But really, the gold standard, if I dare forego any form of modesty, is
the ‘.guix-authorizations’ model as it takes care of key distribution as
well as authorization delegation and revocation.

  https://doi.org/10.22152/programming-journal.org/2023/7/1

Ludo’.



Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-14 Thread Skyler Ferris
On 4/13/24 05:47, Giovanni Biscuolo wrote:
> Hello Skyler,
>
> Skyler Ferris  writes:
>
>> On 4/12/24 23:50, Giovanni Biscuolo wrote:
>>> general reminder: please remember the specific scope of this (sub)thread
> [...]
>
>>> (https://yhetil.org/guix/8734s1mn5p@xelera.eu/)
>>>
>>> ...and if needed read that message again to understand the context,
>>> please.
>>>
>> I assume that this was an indirect response to the email I sent
>> previously where I discussed the problems with PGP signatures on release
>> files.
> No, believe me! I'm sorry I gave you this impression. :-)
>
>> I believe that this was in scope
To be clear: not only did I not mean to say - even indirectly - that you
were out of scope _or_ that you did not understand the context.
>
> Also, I really did not mean to /appear/ as the "coordinator" of this
> (sub)thread and even less to /appear/ as the one who decides what's in
> scope and what's OT; obviously everyone is absolutely free to decide
what is in scope and that she or he understood the context.
>
>> because of the discussion about whether to use VCS checkouts which
>> lack signatures or release tarballs which have signatures.
> I still have not commented what you discussed just because I lack time,
> not interest;  if I can I'll do it ASAP™ :-(
>
> [...]
>
> Thanks! Gio'
>
Thanks for clarifying! Misunderstandings happen sometimes. I look 
forward to hearing your thoughts if you're able to find time to share 
them! =)




Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-13 Thread Skyler Ferris
Hi all,

On 4/11/24 06:49, Andreas Enge wrote:
> Am Thu, Apr 11, 2024 at 02:56:24PM +0200 schrieb Ekaitz Zarraga:
>> I think it's just better to
>> obtain the exact same code that is easy to find
> The exact same code as what? Actually I often wonder when looking for
> a project and end up with a Github repository how I could distinguish
> the "original" from its clones in a VCS. With the signature by the
> known (this may also be a wrong assumption, admittedly) maintainer
> there is at least some form of assurance of origin.

I think this assumption deserves a lot more scrutiny than it typically 
gets (this is a general statement not particular to your message; even 
the tails project gets this part of security wrong and they are 
generally diligent in their efforts). I find it difficult to download 
PGP keys with any degree of confidence. Often, I see a file with a 
signature and a key served by the same web page, all coming from the 
same server. PGP keys are only useful if the attacker compromised the 
information that the user is receiving from the web page (for example, 
by gaining control of the web server or compromising the HTTPS session). 
In the typical scenario I have encountered, the attacker would also 
replace the key and signature with ones that they generated themselves.

In short, I'm not sure that we actually get any value from checking the 
PGP signature for most projects. Either HTTPS is good enough or the 
attacker won. 99% of the time HTTPS is good enough (though it is notable 
that the remaining 1% has a disproportionate impact on the affected 
population).

Some caveats:

It's difficult for me to use web of trust effectively because I haven't 
met anyone who uses PGP keys IRL. I'm ultimately trusting my internet 
connection and servers which are either semi-centralized (there are not 
that many open keyservers, it's an oligopoly for lack of a better term) 
or have the problem described above. So maybe everyone else is using web 
of trust effectively and I don't know what I'm talking about. =)

The key download could be compared to the "trust on first use" model 
that SSH uses. It's not clear to me how effective a simple text box 
saying "we rotated our keys so you need to re-download it!" would be, 
but I suspect that most people would download without a second thought. 
It might be interesting to add public keys and signature locations to 
package definitions and have Guix re-verify the signature when it 
downloads the source. This would provide more scrutiny when keys are 
rotated (because of the review process) and would prevent harm from the 
situation where the package author is re-downloading the key each time 
the software is updated.
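The "signature metadata in package definitions" idea could, at its simplest, mean pinning a key fingerprint and refusing any downloaded key that does not match it. A toy sketch with a throwaway key standing in for the upstream maintainer's; in a real workflow the pinned fingerprint would come from the package definition, and all names here are invented:

```shell
set -e
export GNUPGHOME="$(mktemp -d)"
chmod 700 "$GNUPGHOME"

# Throwaway key standing in for the upstream maintainer's key.
gpg --batch --pinentry-mode loopback --passphrase '' --quiet \
    --quick-generate-key 'Upstream Demo <upstream@example.org>' default default never
gpg --armor --export 'upstream@example.org' > "$GNUPGHOME/upstream-key.asc"

# The fingerprint that would be recorded in the package definition.
expected=$(gpg --list-keys --with-colons | awk -F: '/^fpr/ { print $10; exit }')

# At download time: inspect the fetched key *without* importing it,
# then compare its fingerprint against the pinned one.
actual=$(gpg --show-keys --with-colons "$GNUPGHOME/upstream-key.asc" \
         | awk -F: '/^fpr/ { print $10; exit }')

if [ "$expected" = "$actual" ]; then
    echo 'key matches pinned fingerprint'
else
    echo 'MISMATCH: refusing downloaded key' >&2
    exit 1
fi
```

This is the kind of check that would catch the "we rotated our keys, please re-download" scenario, because a silent key swap no longer goes unnoticed.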

The review process also adds a significant layer of protection because 
an attacker would need to compromise the HTTPS session of the reviewer 
in addition to the original package author (assuming that the signature 
is re-checked by the reviewer; I'm not sure how often this happens in 
practice). In principle it should be difficult for an attacker to 
predict who will be reviewing which issue. However, if the pool of 
reviewers is small it would be easier for the attacker to predict this 
or just compromise all of the reviewers. Also, if there was some way for 
the attacker to launch a general attack on people working out of the 
Guix repository then the value of this protection becomes negligible.

The above two paragraphs are somewhat at odds: if Guix has the public 
key baked in and knows where to download the signature, some reviewers 
might not double-check the key that they get from the website because 
Guix is doing it for them. On one hand, I generally think that 
automating security makes it worse because once it's automated there's a 
system of rules for attackers to manipulate. On the other hand, if we 
assume people aren't doing the things they need to then no amount of 
technical support will give us a secure system. How much is reasonable 
to expect of people? From my extremely biased perspective, it's 
difficult to say.

>> and everybody is reading.
> This is a steep claim! I agree that nobody reads generated files in
> a release tarball, but I am not sure how many other files are actually
> read.
>
> Andreas

I would guess that the level of protection is strongly correlated 
with the popularity of the project among developers who need to add 
features or fix bugs. I don't think anybody reads a source repository 
"cover to cover", but we rummage around in the code on an as-needed 
basis. It would probably be difficult to sneak something into core 
projects like glibc or gcc, but pretty easy to sneak something into 
"emojis-but-cooler.js". It would be better to have comprehensive audits 
of all the projects, but that's not something Guix can manage by itself. 
It could make it easier to free up resources for that task, but I digress.

While it is hyperbolic to say that "with enough eyes, all bugs are 
shallow" there is a 

Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-13 Thread Skyler Ferris
Hi again,

On 4/12/24 23:50, Giovanni Biscuolo wrote:
> Hello,
>
> general reminder: please remember the specific scope of this (sub)thread
>
> --8<---cut here---start->8---
>
>   Please consider that this (sub)thread is _not_ specific to xz-utils but
>   to the specific attack vector (matrix?) used to inject a backdoor in a
>   binary during a build phase, in a _very_ stealthy way.
>
>   Also, since Guix _is_ downstream, I'd like this (sub)thread to
>   concentrate on what *Guix* can/should do to strengthen the build process
>   /independently/ of what upstreams (or other distributions) can/should
>   do.
>
> --8<---cut here---end--->8---
> (https://yhetil.org/guix/8734s1mn5p@xelera.eu/)
>
> ...and if needed read that message again to understand the context,
> please.
>
>
I assume that this was an indirect response to the email I sent 
previously where I discussed the problems with PGP signatures on release 
files. I believe that this was in scope because of the discussion about 
whether to use VCS checkouts which lack signatures or release tarballs 
which have signatures. If the signatures on the release tarballs are not 
providing us with additional confidence then we are not losing anything 
by switching to the VCS checkout. Analysis of the effectiveness of what 
upstream projects are doing is relevant when trying to determine what we 
are capable of doing. I also pointed out that a change to Guix such as 
adding signature metadata to packages could help make up for problems 
with upstream workflows and how the review process provides additional 
confidence, demonstrating how this analysis is relevant to what we 
currently do or could possibly do. Please let me know if you think that 
this is incorrect.

Additionally, I need to correct something that I previously said. I 
stated this:

On 4/12/24 17:14, Skyler Ferris wrote:
> even the tails project gets this part of security wrong and they are 
> generally diligent in their efforts

without first double-checking the current state of the project. While 
this was true at one point, they have since updated their website and 
clearly explain the problem and what their new verification method is 
able to protect against at 
https://tails.net/contribute/design/download_verification/. I apologize 
for disseminating outdated information.




Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-13 Thread Giovanni Biscuolo
Hello Skyler,

Skyler Ferris  writes:

> On 4/12/24 23:50, Giovanni Biscuolo wrote:

>> general reminder: please remember the specific scope of this (sub)thread

[...]

>> (https://yhetil.org/guix/8734s1mn5p@xelera.eu/)
>>
>> ...and if needed read that message again to understand the context,
>> please.
>>
> I assume that this was an indirect response to the email I sent 
> previously where I discussed the problems with PGP signatures on release 
> files.

No, believe me! I'm sorry I gave you this impression. :-)

> I believe that this was in scope

To be clear: not only did I not mean to say - even indirectly - that you
were out of scope _or_ that you did not understand the context.

Also, I really did not mean to /appear/ as the "coordinator" of this
(sub)thread and even less to /appear/ as the one who decides what's in
scope and what's OT; obviously everyone is absolutely free to decide
what is in scope and that she or he understood the context.

> because of the discussion about whether to use VCS checkouts which
> lack signatures or release tarballs which have signatures.

I still have not commented what you discussed just because I lack time,
not interest;  if I can I'll do it ASAP™ :-(

[...]

Thanks! Gio'

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


signature.asc
Description: PGP signature


Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-13 Thread Giovanni Biscuolo
Hi Attila,

sorry for the delay in my reply,

I'm asking myself if this (sub)thread should be "condensed" in a
dedicated RFC (are RFCs official workflows in Guix, now?); if so, I
volunteer to file such an RFC in the next weeks.

Attila Lendvai  writes:

>> Are there other issues (different from the "host cannot execute target
>> binary") that makes relesase tarballs indispensable for some upstream
>> projects?
>
>
> i didn't mean to say that tarballs are indispensable. i just wanted to
> point out that it's not as simple as going through each package
> definition and robotically changing the source origin from tarball to
> git repo. it costs some effort, but i don't mean to suggest that it's
> not worth doing.

OK, understood, thanks!

[...]

> i think a good first step would be to reword the packaging guidelines
> in the doc to strongly prefer VCS sources instead of tarballs.

I agree.

>> Even if We™ (ehrm) find a solution to the source tarball reproducibility
>> problem (potentially allowing us to patch all the upstream makefiles
>> with specific phases in our packages definitions) are we really going to
>> start our own (or one managed by the reproducible build community)
>> "reproducible source tarballs" repository? Is this feaseable?
>
> but why would that be any better than simply building from git? which,
> i think, would even take less effort.

I agree, I was just brainstorming.

[...]

Thanks, Gio'

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


signature.asc
Description: PGP signature


Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-13 Thread Giovanni Biscuolo
Hello,

general reminder: please remember the specific scope of this (sub)thread

--8<---cut here---start->8---

 Please consider that this (sub)thread is _not_ specific to xz-utils but
 to the specific attack vector (matrix?) used to inject a backdoor in a
 binary during a build phase, in a _very_ stealthy way.

 Also, since Guix _is_ downstream, I'd like this (sub)thread to
 concentrate on what *Guix* can/should do to strengthen the build process
 /independently/ of what upstreams (or other distributions) can/should
 do.

--8<---cut here---end--->8---
(https://yhetil.org/guix/8734s1mn5p@xelera.eu/)

...and if needed read that message again to understand the context,
please.

Andreas Enge  writes:

> Am Thu, Apr 11, 2024 at 02:56:24PM +0200 schrieb Ekaitz Zarraga:
>> I think it's just better to
>> obtain the exact same code that is easy to find
>
> The exact same code as what?

The same code as what is contained in the official repository upstream
uses to track their code: that is the one and _only_ code that is
/pragmatically/ open to scrutiny by other upstream and _downstream_
contributors.

> Actually I often wonder when looking for a project and end up with a
> Github repository how I could distinguish the "original" from its
> clones in a VCS.

Actually it's a little bit of "intelligence work" but it's something
that usually downstream should really do: have a reasonable level of
trust that the origin is really the upstream one.

But here we are /brainstorming/ about the very issue that led to the
backdoor injection, and that issue is how to avoid "backdoor injections
via build subversion exploiting semi-binary seeds in release tarballs".
(see the scope above)

> With the signature by the known (this may also be a wrong assumption,
> admittedly) maintainer there is at least some form of assurance of
> origin.

We should definitely drop the idea of "trust by authority" as a
sufficient requisite for verifiability, which is one of the assumptions
behind reproducible builds.

The XZ backdoor injection absolutely demonstrates that one, and just
one, _co-maintainer_ was able to hide a trojan in the _signed_ release
tarball and the payload in the git archive (as a heavily obfuscated
binary), so it was _the origin_ that was "infected".

It's NOT important _who_ injected the backdoor (and it _was_ upstream),
but _how_.

In other words, we need a _pragmatic_ way (possibly with helping tools)
to "challenge the upstream authority" :-)

>> and everybody is reading.
>
> This is a steep claim! I agree that nobody reads generated files in
> a release tarball, but I am not sure how many other files are actually
> read.

Let's say that at least /someone/ should be _able_ to read the files,
but in the attack we are considering /no one/ is _pragmatically_ able to
read the (auto)generated semi-binary seeds in the release tarballs.

Security is a complex system, especially when considering the entire
supply chain: let's focus on this _specific_ weakness of the supply
chain. :-)


Ciao! Gio'

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


signature.asc
Description: PGP signature


Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-13 Thread Giovanni Biscuolo
Hello,

Ludovic Courtès  writes:

> Ekaitz Zarraga  skribis:
>
>> On 2024-04-04 21:48, Attila Lendvai wrote:
>>> all in all, just by following my gut instincts, i was advocating
>>> for building everything from git even before the exposure of this
>>> backdoor. in fact, i found it surprising as a guix newbie that not
>>> everything is built from git (or their VCS of choice).
>>
>> That has happened to me too.
>> Why not use Git directly always?
>
> Because it create{s,d} a bootstrapping issue.  The
> “builtin:git-download” method was added only recently to guix-daemon and
> cannot be assumed to be available yet:
>
>   https://issues.guix.gnu.org/65866

Fortunately this will help a lot with the "everything built from git"
part of the "wishlist", but what about the non-zero number of packages
hosted in "other upstream VCSs"?

[...]

> I think we should gradually move to building everything from
> source—i.e., fetching code from VCS and adding Autoconf & co. as inputs.
>
> This has been suggested several times before.  The difficulty, as you
> point out, will lie in addressing bootstrapping issues with core
> packages: glibc, GCC, Binutils, Coreutils, etc.  I’m not sure how to do
> that but…

does it have to be an "all or nothing" choice?  I mean "continue using
release tarballs" vs "use git" for "all"?

If using git is unfeasible for bootstrapping reasons [1], why not
continue using release tarballs with some _extra_ verification steps,
and possibly add some automation to "lint" to help contributors and
committers check that there are no "quasi-binary" seeds [2] hidden in
release tarballs?

WDYT?

[...]

Grazie! Gio'



[1] or other reasons specific to a package that should be documented
when needed, at least with a comment in the package definition

[2] the autogenerated files that are not pragmatically verifiable

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


signature.asc
Description: PGP signature


Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-12 Thread Ludovic Courtès
Hi!

Andreas Enge  skribis:

> Am Wed, Apr 10, 2024 at 03:57:20PM +0200 schrieb Ludovic Courtès:
>> I think we should gradually move to building everything from
>> source—i.e., fetching code from VCS and adding Autoconf & co. as inputs.
>
> the big drawback of this approach is that we would lose maintainers'
> signatures, right?

Yes.  But as Attila wrote, one can hope that they provide a way to
authenticate at least part of their VCS history, for example with signed
tags.  (Ideally everyone would use ‘guix git authenticate’ of course.)

> Would the suggestion to use signed tarballs, but to autoreconf the
> generated files, not be a better compromise between trusting and
> distrusting upstream maintainers?

IMO starting from an authenticated VCS checkout is clearer.

Ludo’.



Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-12 Thread Attila Lendvai
> > I think we should gradually move to building everything from
> > source—i.e., fetching code from VCS and adding Autoconf & co. as inputs.
> 
> 
> the big drawback of this approach is that we would lose maintainers'
> signatures, right?


it's possible to sign git commits and (annotated) tags, too.

it's good practice to enable signing by default.

admittedly though, few people sign all their commits, and even fewer sign their 
tags.
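For reference, "enable signing by default" plus later verification amounts to the following; the key, repository, and identities are throwaway demo values created just for the sketch:

```shell
set -e

# Throwaway GPG keyring and signing key for the demo.
export GNUPGHOME="$(mktemp -d)"
chmod 700 "$GNUPGHOME"
gpg --batch --pinentry-mode loopback --passphrase '' --quiet \
    --quick-generate-key 'Demo Dev <dev@example.org>' default default never
fpr=$(gpg --list-secret-keys --with-colons | awk -F: '/^fpr/ { print $10; exit }')

# A demo repository configured to sign every commit by default.
repo="$(mktemp -d)"
git -C "$repo" init -q
git -C "$repo" config user.name 'Demo Dev'
git -C "$repo" config user.email 'dev@example.org'
git -C "$repo" config user.signingkey "$fpr"
git -C "$repo" config commit.gpgsign true    # sign commits by default

echo 'hello' > "$repo/README"
git -C "$repo" add README
git -C "$repo" commit -qm 'signed by default'
git -C "$repo" tag -s v1.0 -m 'signed annotated tag'

# Both the commit and the annotated tag can now be verified.
git -C "$repo" verify-commit HEAD
git -C "$repo" verify-tag v1.0
```

Note the asymmetry the sketch makes visible: a signed annotated tag covers one release point, while `commit.gpgsign` covers the whole history going forward.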

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“Never appeal to a man's "better nature". He may not have one. Invoking his 
self-interest gives you more leverage.”
— Robert Heinlein (1907–1988), 'Time Enough For Love' (1973)




Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-11 Thread Ekaitz Zarraga

Hi,


and everybody is reading.


This is a steep claim! I agree that nobody reads generated files in
a release tarball, but I am not sure how many other files are actually
read.


Yea, it is. I'd also love to know how effective reading a release 
tarball is vs a VCS repo. The quality of the reading is also very 
important. I simply don't even try to read a tarball; not having the 
history makes understanding very difficult. If I find a piece of code 
that seems odd, I would like to `git blame` it and see what the reason 
for its inclusion was, who included it, and so on.


It's not much, but it's better than nothing. Although, I'd understand if 
you told me the history might be misleading, too.




Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-11 Thread Andreas Enge
Am Thu, Apr 11, 2024 at 02:56:24PM +0200 schrieb Ekaitz Zarraga:
> I think it's just better to
> obtain the exact same code that is easy to find

The exact same code as what? Actually I often wonder when looking for
a project and end up with a Github repository how I could distinguish
the "original" from its clones in a VCS. With the signature by the
known (this may also be a wrong assumption, admittedly) maintainer
there is at least some form of assurance of origin.

> and everybody is reading.

This is a steep claim! I agree that nobody reads generated files in
a release tarball, but I am not sure how many other files are actually
read.

Andreas




Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-11 Thread Ekaitz Zarraga

Hi,

On 2024-04-11 14:43, Andreas Enge wrote:

Hello,

Am Wed, Apr 10, 2024 at 03:57:20PM +0200 schrieb Ludovic Courtès:

I think we should gradually move to building everything from
source—i.e., fetching code from VCS and adding Autoconf & co. as inputs.


the big drawback of this approach is that we would lose maintainers'
signatures, right?

Would the suggestion to use signed tarballs, but to autoreconf the
generated files, not be a better compromise between trusting and
distrusting upstream maintainers?

Andreas



Probably not, because the release tarballs might contain code that is 
not present in the Git history and there are not that many eyes checking 
them. This time it was autoconf, but it might be anything else.


The maintainers' machines can be hijacked too... I think it's just 
better to obtain the exact same code that is easy to find and everybody 
is reading.




Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-11 Thread Andreas Enge
Hello,

Am Wed, Apr 10, 2024 at 03:57:20PM +0200 schrieb Ludovic Courtès:
> I think we should gradually move to building everything from
> source—i.e., fetching code from VCS and adding Autoconf & co. as inputs.

the big drawback of this approach is that we would lose maintainers'
signatures, right?

Would the suggestion to use signed tarballs, but to autoreconf the
generated files, not be a better compromise between trusting and
distrusting upstream maintainers?

Andreas




Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-10 Thread Ludovic Courtès
Hi,

Ekaitz Zarraga  skribis:

> On 2024-04-04 21:48, Attila Lendvai wrote:
>> all in all, just by following my gut instincts, i was advocating
>> for building everything from git even before the exposure of this
>> backdoor. in fact, i found it surprising as a guix newbie that not
>> everything is built from git (or their VCS of choice).
>
> That has happened to me too.
> Why not use Git directly always?

Because it create{s,d} a bootstrapping issue.  The
“builtin:git-download” method was added only recently to guix-daemon and
cannot be assumed to be available yet:

  https://issues.guix.gnu.org/65866

> In the bootstrapping it's also a problem, as all those tools
> (autotools) must be bootstrapped, and they require other programs
> (compilers) that actually use them. And we'll be forced to use git,
> too, or at least clone the bootstrapping repos, git-archive them
> ourselves and host them properly signed. At least, we could challenge
> them using git (similar to what we do with the substitutes), which we
> cannot do right now with the release tarballs against the actual code
> of the repository.

I think we should gradually move to building everything from
source—i.e., fetching code from VCS and adding Autoconf & co. as inputs.

This has been suggested several times before.  The difficulty, as you
point out, will lie in addressing bootstrapping issues with core
packages: glibc, GCC, Binutils, Coreutils, etc.  I’m not sure how to do
that but…

> In live-bootstrap they just write the build scripts by hand, and
> ignore whatever the ./configure script says. That's also a reasonable
> way to tackle the bootstrapping, but it's a hard one. Thankfully, we
> are working together in this Bootstrapping effort so we can learn from
> them and adapt their recipes to our Guix commencement.scm module. This
> would be some effort, but it's actually doable.

… live-bootstrap can probably be a good source of inspiration to find a
way to build those core packages (or some of them) straight from a VCS
checkout.  And here the trick will be to find a way to do that in a
concise and maintainable way (generating config.h and Makefiles by hand
may prove unmaintainable in practice.)

Ludo’.



Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-05 Thread Jan Wielkiewicz
On Thu, 04 Apr 2024 12:34:42 +0200
Giovanni Biscuolo  wrote:

> Hello everybody,
> 
> I know for sure that Guix maintainers and developers are working on
> this; I'm just asking you to find some time to inform and possibly
> discuss with users (also on guix-devel) what measures GNU Guix - the
> software distribution - can/should deploy to try to avoid this kind
> of attack.

What about integrating ClamAV into the build farms (if this isn't a
thing already)? ClamAV could scan source files and freshly-built
packages and perhaps detect obvious malware. AFAIK it can also detect
CVEs. Guix already has ClamAV packaged so this shouldn't be that hard.

--

Jan Wielkiewicz



Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-05 Thread Attila Lendvai
> Are there other issues (different from the "host cannot execute target
> binary") that makes relesase tarballs indispensable for some upstream
> projects?


i didn't mean to say that tarballs are indispensable. i just wanted to point 
out that it's not as simple as going through each package definition and 
robotically changing the source origin from tarball to git repo. it costs some 
effort, but i don't mean to suggest that it's not worth doing.


> So, while "almost all the world" is applying wrong solutions to the
> source tarball reproducibility problem, what can Guix do?


AFAIU the plan is straightforward: change all package definitions to point to 
the (git) repos of the upstream, and ignore any generated ./configure scripts 
if it happens to be checked into the repo.

it involves quite some work, both in quantity, and also some thinking around 
surprises.

i think a good first step would be to reword the packaging guidelines in the 
doc to strongly prefer VCS sources instead of tarballs.


> Even if We™ (ehrm) find a solution to the source tarball reproducibility
> problem (potentially allowing us to patch all the upstream makefiles
> with specific phases in our packages definitions) are we really going to
> start our own (or one managed by the reproducible build community)
> "reproducible source tarballs" repository? Is this feaseable?


but why would that be any better than simply building from git? which, i think, 
would even take less effort.


> > but these generated man files are part of the release tarball, so
> > cross compilation works fine using the tarball.
> 
> 
> AFAIU in this case there is an easy alternative: distribute the
> (generated) man files as code tracked in the DVCS (e.g. git) repo
> itself.


yes, that would work in this case (although, that man page is guaranteed to go 
stale). my proposal was to simply drop the generated man file. it adds very 
little value (although it's not zero; web search, etc).

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“It is easy to be conspicuously 'compassionate' if others are being forced to 
pay the cost.”
— Murray N. Rothbard (1926–1995)




Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-05 Thread Giovanni Biscuolo
Hi Attila and guix-security team,

Attila Lendvai  writes:

>> Are really "configure scripts containing hundreds of thousands of lines
>> of code not present in the upstream VCS" the norm?
>
> pretty much for all C and C++ projects that use autoconf... which is
> numerous, especially among the core GNU components.

OK, thank you for the confirmation.

[...]

>> ...or is it better to completely avoid release tarballs as our sources
>> uris?
>
> yes, and this^ would guarantee the previous point, but it's not always 
> trivial.
>
> as an example see this: https://issues.guix.gnu.org/61750

[...]

> it breaks crosscompilation, because the host cannot execute the target
> binary.

OK thanks, I missed that.

In general, is there really no other solution for projects than to
distribute some artifacts "out of band" or give up cross-compiling?!?

Are there other issues (different from the "host cannot execute target
binary") that makes relesase tarballs indispensable for some upstream
projects?

AFAIU the only thing that /could/ "save" source tarballs is their
/scientific/ reproducibility.  In this direction there is a very
interesting patchset from Janneke Nieuwenhuizen to try to get a
reproducible _Guix_ release tarball:

https://issues.guix.gnu.org/70169
«Reproducible `make dist' tarball in defiance of Autotools and Gettext»

Obviously, having a reproducible tarball makes _practical_ the otherwise
"pragmatically impossible" task of reproducing a release tarball to
check that it corresponds to the same **build** (make dist) performed in
the official DVCS repo; only this could "save" the whole "build software
from release tarballs" workflow.

...but /in general/ here we are _downstream_, we have absolutely no
control over upstream, and it's _very_ unlikely that we'll see a *good*
solution to the tarball reproducibility problem applied "in the wild
upstream" soon.

I said "a **good* solution" because some proposals I'm reading about are
/bad/ _complications_ that absolutely are NOT really solving the source
tarball reproduciblity problem [1]; for example:

1. build the tarball on the RM host using a docker container
(an unreproducible build) and call it "a reproducible release tarball":
https://medium.com/@lanoxx/creating-reproducible-release-tarballs-fa2e2ce745a7

2. have a CI system based on github actions [2] and call it "fully
verifiable": https://externals.io/message/122811#122814 (from
php.internals mailing list)

So, while "almost all the world" is applying _wrong_ solutions to the
source tarball reproducibility problem, what can Guix do?

Even if We™ (ehrm) find a solution to the source tarball reproducibility
problem (potentially allowing us to patch all the upstream makefiles
with specific phases in our packages definitions) are we really going to
start our own (or one managed by the reproducible build community)
"reproducible source tarballs" repository?  Is this feasible?

I think there is no solution that can "pragmatically save" the source
tarballs of all the software packaged in Guix (and all other
distributions part of the reproducible builds effort).

> but these generated man files are part of the release tarball, so
> cross compilation works fine using the tarball.

AFAIU *in this case* there is an easy alternative: distribute the
(generated) man files as *code* tracked in the DVCS (e.g. git) repo
itself.  IMHO it's likely that this workflow could fix most if not all
of the cross-compilation issues, no?

In general, AFAIU it's against reproducibility to distribute
pre-generated (compiled? transpiled?) artifacts in a tarball when they
are not present in the official DVCS repo, especially since the tarballs
themselves are _not_ reproducible (and in likely 99.9% of cases they are
not).
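The check this implies can be sketched in a few lines of shell
(directory and file names are invented for the example): unpack the
release tarball next to a checkout of the corresponding tag and diff the
two trees; anything present only in the tarball never went through
review in the repository:

```shell
# Simulate a VCS checkout and an unpacked release tarball that ships
# one extra, generated file on top of the tracked sources.
mkdir -p checkout unpacked-tarball
echo 'AC_INIT([demo], [1.0])' > checkout/configure.ac
cp checkout/configure.ac unpacked-tarball/
echo '#!/bin/sh' > unpacked-tarball/configure   # only in the tarball

diff -rq checkout unpacked-tarball || true
# → Only in unpacked-tarball: configure
```

In the xz case, the tarball's `build-to-host.m4` is precisely a file
that diverged from what the repository on GitHub would have produced.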

> all in all, just by following my gut instincts, i was advocating for
> building everything from git even before the exposure of this
> backdoor. in fact, i found it surprising as a guix newbie that not
> everything is built from git (or their VCS of choice).

Given the current situation so clearly exposed by the "xz backdoor"
case, this is something Guix should seriously consider.

I mean: Guix should seriously consider to drop source tarballs and
_also_ all pre-compiled artifacts distributed only via that tarballs.

I don't like this proposal, but I see no other "pragmatically possible"
solution.

AFAIU there is no need to rush, but I'm afraid that the class of attacks
we can call "supply-chain backdoor injection via pragmatically
unverifiable source tarballs" is hard to deploy, yet unfortunately not
_too_ hard.

[...]

Thanks! Gio'


[1] this boils down to the unfortunate fact that "reproducibility" is a
very misunderstood concept [1.1], even by some very skilled (experienced?)
programmers

[1.1] because it's strictly related to good _redistribution_ of
_trusted_ software, not to good programming

[2]
https://docs.github.com/en/actions/learn-github-actions/understanding-github-actions#runners
«each workflow run executes in a fresh, newly-provisioned 

Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-04 Thread Ricardo Wurmus
[mu4e must have changed the key bindings for replies, so here is my mail
again, this time as a wide reply.]

Giovanni Biscuolo  writes:

> So AFAIU using a fixed "autoreconf -fi" should mitigate the risks of
> tampered .m4 macros (and other possibly tampered build configuration
> script)?
>
> IMHO "ignoring" (deleting) pre-built build scripts in Guix
> build-system(s) should be considered... or is /already/ so?

The gnu-build-system has a bootstrap phase, but it only does something
when a configure script does not already exist.  We sometimes force it
to bootstrap the build system when we patch configure.ac.

In previous discussions there were no big objections to always
bootstrapping the build system files from autoconf/automake sources.

This particular backdoor relied on a number of obfuscations:

- binary test data.  Nobody ever looks at binaries.

- incomprehensibility of autotools output.  This one is fundamentally a
  social problem and easily extends to other complex build systems.  In
  the xz case, the instructions for assembling the shell snippets to
  inject the backdoor could hide in plain sight, just because configure
  scripts are expected to be near incomprehensible.  They contain no
  comments, are filled to the brim with portable (lowest common
  denominator) shell magic, and contain bizarrely named variables.

Not using generated output is a good idea anyway and removes the
requirement to trust that the release tarballs are faithful derivations
from the autotools sources, but given the bland complexity of build system
code (whether that's recursive Makefiles, CMake cruft, or the infamous
gorilla spit[1] of autotools) I don't see a good way out.

[1] 
https://www.gnu.org/software/autoconf/manual/autoconf-2.65/autoconf.html#History

> Given the above observation that «it is pragmatically impossible [...]
> to peer review a tarball prepared in this manner», I strongly doubt
> that a possible Makefile tampering _in_the_release_tarball_ is easy to
> peer review; I'd ask: is such an "automated analysis" (see above)
> feasible in a dedicated build-system phase?

I don't think it's feasible.  Since Guix isn't a regular user (the
target audience of configure scripts) it has no business depending on
generated configure scripts.  It should build these from source.

> In other words: what if the backdoor was injected directly in the source
> code of the *official* release tarball signed with a valid GPG signature
> (and obviously with a valid sha256 hash)?

A malicious maintainer can sign bad release tarballs.  A malicious
contributor can push signed commits that contain backdoors in code.

> Do upstream developer communities peer review release tarballs or they
> "just" peer review the code in the official DVCS?

Most do neither.  I'd guess that virtually *nobody* reviews tarballs
beyond automated tests (like what the GNU maintainers' GNUmakefile /
maint.mk does when preparing a release).

> Also, in (info "(guix) origin Reference") I see that Guix packages can have a
> list of uri(s) for the origin of source code, see xz as an example [7]:
> are they intended to be multiple independent sources to be compared in
> order to prevent possible tampering or are they "just" alternatives to
> be used if the first listed uri is unavailable?

They are alternative URLs, much like what the mirror:// URLs do.

> If the case is the first, a solution would be to specify multiple
> independent release tarballs for each package, so that it would be
> harder to compromise two release sources, but that is not something under
> Guix control.

We have hashes for this purpose.  A tarball that was modified since the
package definition has been published would have a different hash.  This
is not a statement about tampering, but only says that our expectations
(from the time of packaging) have not been met.
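The mechanics of that check can be sketched like this (hypothetical file
contents; `sha256sum` stands in for Guix's fixed-output hash
verification):

```shell
# The hash recorded at packaging time pins the exact bytes of the
# source; a tarball silently replaced upstream fails the check.
printf 'pkg 1.0 contents\n' > pkg-1.0.tar
expected=$(sha256sum pkg-1.0.tar | cut -d' ' -f1)  # stored in the package definition

printf 'pkg 1.0 contents, modified\n' > pkg-1.0.tar  # replaced after packaging
actual=$(sha256sum pkg-1.0.tar | cut -d' ' -f1)

if [ "$expected" = "$actual" ]; then echo OK; else echo 'hash mismatch'; fi
# → hash mismatch
```

A mismatch only proves that the bytes changed since packaging, not that
either version was trustworthy to begin with.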

> All in all: should we really avoid the "pragmatically impossible to be
> peer reviewed" release tarballs?

Yes.

-- 
Ricardo



Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-04 Thread Ekaitz Zarraga

Hi,

I just want to add some perspective from the bootstrapping.

On 2024-04-04 21:48, Attila Lendvai wrote:


all in all, just by following my gut instincts, i was advocating for building 
everything from git even before the exposure of this backdoor. in fact, i found 
it surprising as a guix newbie that not everything is built from git (or their 
VCS of choice).


That has happened to me too.
Why not use Git directly always?

For bootstrapping this is also a problem, as all those tools (autotools)
must themselves be bootstrapped, and they require other programs
(compilers) that in turn use them. And we'd be forced to use git, too,
or at least clone the bootstrapping repos, git-archive them ourselves
and host them properly signed. At least we could then challenge them
using git (similarly to what we do with substitutes), which we cannot
currently do with the release tarballs against the actual code of the
repository.
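A sketch of what such a challenge could look like (assuming git is
available; repository name and committer identity are invented): a
tarball produced by `git archive` is derived purely from the commit, so
anyone can regenerate it and compare hashes:

```shell
# Create a toy repository and derive an archive from the same commit
# twice; both runs yield identical bytes.
git init -q archive-demo
cd archive-demo
git config user.email demo@example.org
git config user.name Demo
echo 'hello' > f.txt
git add f.txt
git commit -qm 'initial commit'

git archive --format=tar HEAD | sha256sum
git archive --format=tar HEAD | sha256sum   # same commit, same hash
```

(Byte-identical output is only guaranteed within one git version; a real
challenge scheme would have to pin the tool version as well.)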


In live-bootstrap they just write the build scripts by hand, and ignore 
whatever the ./configure script says. That's also a reasonable way to 
tackle the bootstrapping, but it's a hard one. Thankfully, we are 
working together in this Bootstrapping effort so we can learn from them 
and adapt their recipes to our Guix commencement.scm module. This would 
be some effort, but it's actually doable.


Hope this adds something useful to the discussion,

Ekaitz




Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-04 Thread Attila Lendvai
> Are really "configure scripts containing hundreds of thousands of lines
> of code not present in the upstream VCS" the norm?


pretty much for all C and C++ projects that use autoconf... which are
numerous, especially among the core GNU components.


> If so, can we consider hundreds of thousand of lines of configure
> scripts and other (auto)generated files bundled in release tarballs
> "pragmatically impossible" to be peer reviewed?


yes.


> Can we consider that artifacts as sort-of-binary and "force" our
> build-systems to regenerate all them?


that would be a good practice.


> ...or is it better to completely avoid release tarballs as our sources
> uris?


yes, and this^ would guarantee the previous point, but it's not always trivial.

as an example see this: https://issues.guix.gnu.org/61750

in short: when building shepherd from git the man files need to be generated 
using the program help2man. this invokes the binary with --help and formats the 
output as a man page. the usefulness of this is questionable, but the point is 
that it breaks crosscompilation, because the host cannot execute the target 
binary.

but these generated man files are part of the release tarball, so cross 
compilation works fine using the tarball.

all in all, just by following my gut instincts, i was advocating for building 
everything from git even before the exposure of this backdoor. in fact, i found 
it surprising as a guix newbie that not everything is built from git (or their 
VCS of choice).

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“For if you [the rulers] suffer your people to be ill-educated, and their 
manners to be corrupted from their infancy, and then punish them for those 
crimes to which their first education disposed them, what else is to be 
concluded from this, but that you first make thieves [and outlaws] and then 
punish them.”
— Sir Thomas More (1478–1535), 'Utopia', Book 1




Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-04 Thread Giovanni Biscuolo
Hi Attila,

Attila Lendvai  writes:

>> Also, in (info "(guix) origin Reference") I see that Guix packages
>> can have a list of uri(s) for the origin of source code, see xz as an
>> example [7]: are they intended to be multiple independent sources to
>> be compared in order to prevent possible tampering or are they "just"
>> alternatives to be used if the first listed uri is unavailable?
>
> a source origin is identified by its cryptographic hash (stored in its
> sha256 field); i.e. it doesn't matter *where* the source archive was
> acquired from. if the hash matches the one in the package definition,
> then it's the same archive that the guix packager has seen while
> packaging.

Ehrm, you are right, mine was a stupid question :-)

We *are* already verifying that tarballs had not been tampered
with... by other people but the release manager :-(

[...]

Happy hacking! Gio'

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


signature.asc
Description: PGP signature


Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-04 Thread Giovanni Biscuolo
Hello,

a couple of additional (IMO) useful resources...

Giovanni Biscuolo  writes:

[...]

> Let me highlight this: «It is pragmatically impossible [...] to peer
> review a tarball prepared in this manner.»
>
> There is no doubt that the release tarball is a very weak "trusted
> source" (trusted by peer review, not by authority) than the upstream
> DVCS repository.

This kind of attack was described by Daniel Stenberg in his «HOWTO
backdoor curl» article in 2021.03.30 as "skip-git-altogether" method:

https://daniel.haxx.se/blog/2021/03/30/howto-backdoor-curl/
--8<---cut here---start->8---

The skip-git-altogether methods

As I’ve described above, it is really hard even for a skilled developer
to write a backdoor and have that landed in the curl git repository and
stick there for longer than just a very brief period.

If the attacker instead can just sneak the code directly into a release
archive then it won’t appear in git, it won’t get tested and it won’t
get easily noticed by team members!

curl release tarballs are made by me, locally on my machine. After I’ve
built the tarballs I sign them with my GPG key and upload them to the
curl.se origin server for the world to download. (Web users don’t
actually hit my server when downloading curl. The user visible web site
and downloads are hosted by Fastly servers.)

An attacker that would infect my release scripts (which btw are also in
the git repository) or do something to my machine could get something
into the tarball and then have me sign it and then create the “perfect
backdoor” that isn’t detectable in git and requires someone to diff the
release with git in order to detect – which usually isn’t done by anyone
that I know of.

[...] I of course do my best to maintain proper login sanitation,
updated operating systems and use of safe passwords and encrypted
communications everywhere. But I’m also a human so I’m bound to do
occasional mistakes.

Another way could be for the attacker to breach the origin download
server and replace one of the tarballs there with an infected version,
and hope that people skip verifying the signature when they download it
or otherwise notice that the tarball has been modified. I do my best at
maintaining server security to keep that risk to a minimum. Most people
download the latest release, and then it’s enough if a subset checks the
signature for the attack to get revealed sooner rather than later.

--8<---cut here---end--->8---

Unfortunately, in that section Stenberg misses one attack vector he had
mentioned earlier in the article, in the section named "The tricking a
user method":

--8<---cut here---start->8---

We can even include more forced “convincing” such as direct threats
against persons or their families: “push this code or else…”. This way
of course cannot be protected against using 2fa, better passwords or
things like that.

--8<---cut here---end--->8---

...and an attack vector involving more subtle ways (let's call it
distributed social engineering) to convince the upstream developer,
other contributors and/or third parties that the project needs a
co-maintainer authorized to publish _official_ release tarballs.

Following Stenberg's attacks classification, since the supply-chain
attack was intended to install a backdoor in the _sshd_ service, and
_not_ in xz-utils or liblzma, we can classify this attack as:

  skip-git-altogether to install a backdoor further-down-the-chain,
  precisely in a _dependency_ of the attacked one, during a period of
  "weakness" of the upstream maintainers

Stenberg closes his article with this update and one related reply to a
comment:

--8<---cut here---start->8---

Dependencies

Added after the initial post. Lots of people have mentioned that curl
can get built with many dependencies and maybe one of those would be an
easier or better target. Maybe they are, but they are products of their
own individual projects and an attack on those projects/products would
not be an attack on curl or backdoor in curl by my way of looking at it.

In the curl project we ship the source code for curl and libcurl and the
users, the ones that builds the binaries from that source code will get
the dependencies too.

[...]

 Jean Hominal says: 
 April 1, 2021 at 14:04 

 I think the big difference why you “missed” dependencies as an attack
 vector is because today, most application developers ship their
 dependencies in their application binaries (by linking statically or
 shipping a container) – in such a case, I would definitely count an
 attack on such a dependency, that is then shipped as part of the
 project’s artifacts, as a successful attack on the project.

 However, as you only ship a source artifact – of course, dependencies
 *are* out of scope in your case.

 Daniel Stenberg says: 
 April 1, 2021 at 15:05 

 Jean: Right. 

Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-04 Thread Attila Lendvai
> Also, in (info "(guix) origin Reference") I see that Guix packages can have a
> list of uri(s) for the origin of source code, see xz as an example [7]:
> are they intended to be multiple independent sources to be compared in
> order to prevent possible tampering or are they "just" alternatives to
> be used if the first listed uri is unavailable?


a source origin is identified by its cryptographic hash (stored in its sha256 
field); i.e. it doesn't matter *where* the source archive was acquired from. if 
the hash matches the one in the package definition, then it's the same archive 
that the guix packager has seen while packaging.
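In other words, the hash makes the download location irrelevant; a rough
illustration (file names invented, `sha256sum` standing in for the
origin's hash check):

```shell
# The same bytes from two independent mirrors hash identically;
# a tampered copy, wherever it comes from, does not.
printf 'release 1.0\n' > from-mirror-a.tar
printf 'release 1.0\n' > from-mirror-b.tar
printf 'release 1.0 + payload\n' > tampered.tar

sha256sum from-mirror-a.tar from-mirror-b.tar tampered.tar
# the first two hashes are identical; the third differs
```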

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“We’ll know our disinformation program is complete when everything the American 
public believes is false.”
— William Casey (1913–1987), the director of CIA 1981-1987




backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)

2024-04-04 Thread Giovanni Biscuolo
Hello everybody,

I know for sure that Guix maintainers and developers are working on
this, I'm just asking to find some time to inform and possibly discuss
with users (also in guix-devel) on what measures GNU Guix - the software
distribution - can/should deploy to try to avoid this kind of attacks.

Please consider that this (sub)thread is _not_ specific to xz-utils but
to the specific attack vector (matrix?) used to inject a backdoor in a
binary during a build phase, in a _very_ stealthy way.

Also, since Guix _is_ downstream, I'd like this (sub)thread to
concentrate on what *Guix* can/should do to strengthen the build process
/independently/ of what upstreams (or other distributions) can/should
do.

First of all, I understand the xz backdoor attack was complex (both
socially and technically) and all the details are still under scrutiny,
but AFAIU the way the backdoor has been injected by "infecting" the
**build phase** of the software (and obfuscating the payload in
binaries) is very alarming and is something all distributions aiming at
reproducible builds must examine (and actually _are_ examining) very
carefully.

John Kehayias  writes:

[...]

> On Fri, Mar 29, 2024 at 01:39 PM, Felix Lechner via Reports of security 
> issues in Guix itself and in packages provided by Guix wrote:
>
>> Hi Ryan,
>>
>> On Fri, Mar 29 2024, Ryan Prior wrote:

[...]

>>> Guix currently packages xz-utils 5.2.8 as "xz" using the upstream
>>> tarball. [...] Should we switch from using upstream tarballs to some
>>> fork with more responsible maintainers?
>>
>> Guix's habit of building from tarballs is a poor idea because tarballs
>> often differ.

First of all: should software be considered reproducible if it produces
different binaries when compiled from the source code repository (git or
another VCS) than when compiled from the official released source
tarball?

My first thought is no.

>> For example, maintainers may choose to ship a ./configure script that
>> is otherwise not present in Git (although a configure.ac might be).
>> Guix should build from Git.

Two useful pointers explaining how the backdoor has been injected are
[1] (general workflow) and [2] (payload obfuscation)

The first and *indispensable* condition for the attack to be successful
is this:

--8<---cut here---start->8---

* The release tarballs upstream publishes don't have the same code that
 GitHub has. This is common in C projects so that downstream consumers
 don't need to remember how to run autotools and autoconf. The version
 of build-to-host.m4 in the release tarballs differs wildly from the
 upstream on GitHub.

[...]

* Explain dist tarballs, why we use them, what they do, link to
  autotools docs, etc

 * "Explaining the history of it would be very helpful I think. It also
 explains how a single person was able to insert code in an open source
 project that no one was able to peer review. It is pragmatically
 impossible, even if technically possible once you know the problem is
 there, to peer review a tarball prepared in this manner."

--8<---cut here---end--->8---
(from [1])

Let me highlight this: «It is pragmatically impossible [...] to peer
review a tarball prepared in this manner.»

There is no doubt that the release tarball is a very weak "trusted
source" (trusted by peer review, not by authority) than the upstream
DVCS repository.

It's *very* noteworthy that the backdoor was discovered thanks to a
performance issue and _not_ during a peer review of the source
code... the _build_ code *is* source code, no?

It's not the first time a source release tarball of free software has
been compromised [3], but the way the compromise worked in this case is
something new (or at least never spotted before, right?).

> We discussed a bit on #guix today about this. A movement to sourcing
> more directly from Git in general has been discussed before, though
> has some hurdles.

Could someone knowledgeable about the details please describe the
hurdles of sourcing from a DVCS (possibly one other than git)?

> I will let someone more knowledgeable about the details chime in, but
> yes, something we should do.

I'm definitely _not_ the knowledgeable one, but I'd like to share the
result of my researches.

Is it possible to enhance our build-system(s) (e.g. gnu-build-system) so
they can /ignore/ pre-built .m4 or similar scripts and rebuild them
during the build process?

Richard W.M. Jones on fedora-devel ML proposed [4]:

--8<---cut here---start->8---

(1) We should routinely delete autoconf-generated cruft from upstream
projects and regenerate it in %prep. It is easier to study the real
source rather than dig through the convoluted, generated shell script in
an upstream './configure' looking for back doors. For most projects,
just running "autoreconf -fiv" is enough.

--8<---cut here---end--->8---
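The proposed %prep rule boils down to something like the following
sketch (toy project with invented file names; the `autoreconf` step
itself is commented out since it requires Autoconf/Automake to be
installed):

```shell
# Delete every pre-generated build-system file shipped in the tarball,
# so the build can only rely on the real sources.
mkdir -p upstream-1.0
cd upstream-1.0
echo 'AC_INIT([demo], [1.0])' > configure.ac   # the real source
touch configure aclocal.m4 Makefile.in          # generated files from the tarball

rm -f configure aclocal.m4 Makefile.in          # drop the cruft
# autoreconf -fiv                               # regenerate from configure.ac
ls
# → configure.ac
```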

There is an interesting