Re: Security lessons from liblzma - libsystemd

2024-04-15 Thread Michael Paquier
On Fri, Apr 12, 2024 at 09:00:11AM -0700, Andres Freund wrote:
> I'm actually fairly bothered by us linking to libxml2. It was effectively
> unmaintained for most of the last decade, with just very occasional drive-by
> commits. And it's not that there weren't significant bugs or such. Maintenance
> has picked up some, but it's still not well maintained, I'd say.  If I wanted
> to attack postgres, it's where I'd start.

Indeed, libxml2 worries me to, as much as out-of-core extensions.
There are a bunch of these out there, some of them not that
maintained, and they could face similar attacks.
--
Michael


signature.asc
Description: PGP signature


Re: Security lessons from liblzma - libsystemd

2024-04-12 Thread Andres Freund
Hi,

On 2024-04-04 01:10:20 +0200, Peter Eisentraut wrote:
> On 03.04.24 23:19, Magnus Hagander wrote:
> > When the code is this simple, we should definitely consider carrying it
> > ourselves. At least if we don't expect to need *other* functionality
> > from the same library in the future, which I doubt we will from
> > libsystemd.
> 
> Well, I've long had it on my list to do some integration to log directly to
> the journal, so you can preserve metadata better.  I'm not sure right now
> whether this would use libsystemd, but it's not like there is absolutely no
> other systemd-related functionality that could be added.

Interesting. I think that'd in all likelihood end up using libsystemd.


> Personally, I think this proposed change is trying to close a barndoor after
> a horse has bolted.  There are many more interesting and scary libraries in
> the dependency tree of "postgres", so just picking off one right now doesn't
> really accomplish anything.  The next release of libsystemd will drop all
> the compression libraries as hard dependencies, so the issue in that sense
> is gone anyway.  Also, fun fact: liblzma is also a dependency via libxml2.

I agree that doing this just because of future risk in liblzma is probably not
worth it. Despite soon not being an unconditional dependency of libsystemd
anymore, I'd guess that in a few months it's going to be one of the better
vetted libraries out there.  But I don't think that means it's not worth
reducing the dependencies that we have unconditionally loaded.

Having a dependency to a fairly large library (~900k), which could be avoided
with ~30 LOC, is just not worth it. Independent of liblzma. Not from a
performance POV, nor from a risk POV.

I'm actually fairly bothered by us linking to libxml2. It was effectively
unmaintained for most of the last decade, with just very occasional drive-by
commits. And it's not that there weren't significant bugs or such. Maintenance
has picked up some, but it's still not well maintained, I'd say.  If I wanted
to attack postgres, it's where I'd start.

My worry level, in decreasing order, about postmaster-level dependencies:
- libxml2 - effectively unmaintained, just about maintained today
- gssapi - heavily undermaintained from what I can see, lots of complicated code
- libldap - undermaintained, lots of creaky old code
- libpam - undermaintained, lots of creaky old code
- the rest

Greetings,

Andres Freund




Re: Security lessons from liblzma - libsystemd

2024-04-12 Thread Magnus Hagander
On Thu, Apr 4, 2024 at 1:10 AM Peter Eisentraut 
wrote:

> On 03.04.24 23:19, Magnus Hagander wrote:
> > When the code is this simple, we should definitely consider carrying it
> > ourselves. At least if we don't expect to need *other* functionality
> > from the same library in the future, which I doubt we will from
> libsystemd.
>
> Well, I've long had it on my list to do some integration to log directly
> to the journal, so you can preserve metadata better.  I'm not sure right
> now whether this would use libsystemd, but it's not like there is
> absolutely no other systemd-related functionality that could be added.
>

Ah interesting. I hadn't thought of that use-case.




> Personally, I think this proposed change is trying to close a barndoor
> after a horse has bolted.  There are many more interesting and scary
> libraries in the dependency tree of "postgres", so just picking off one
> right now doesn't really accomplish anything.  The next release of
> libsystemd will drop all the compression libraries as hard dependencies,
> so the issue in that sense is gone anyway.  Also, fun fact: liblzma is
> also a dependency via libxml2.
>


To be clear, I didn't mean to single out this one, just saying that it's
something we should keep in consideration in general when adding library
dependencies. Every new dependency, no matter how small, increases the
management and risks for it. And we should just be aware of that and weigh
them against each other.

As in we should *consider* it, that doesn't' mean we should necessarily
*do* it.

(And yes, there are many scary dependencies down the tree)

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Security lessons from liblzma

2024-04-09 Thread Joe Conway

On 4/8/24 22:57, Bruce Momjian wrote:

On Sat, Mar 30, 2024 at 04:50:26PM -0400, Robert Haas wrote:

An awful lot of what we do operates on the principle that we know the
people who are involved and trust them, and I'm glad we do trust them,
but the world is full of people who trusted somebody too much and
regretted it afterwards. The fact that we have many committers rather
than a single maintainer probably reduces risk at least as far as the
source repository is concerned, because there are more people paying
attention to potentially notice something that isn't as it should be.


One unwritten requirement for committers is that we are able to
communicate with them securely.  If we cannot do that, they potentially
could be forced by others, e.g., governments, to add code to our
repositories.

Unfortunately, there is on good way for them to communicate with us
securely once they are unable to communicate with us securely.  I
suppose some special word could be used to communicate that status ---
that is how it was done in non-electronic communication in the past.


I don't know how that really helps. If one of our committers is under 
duress, they probably cannot risk outing themselves anyway.


The best defense, IMHO, is the fact that our source code is open and can 
be reviewed freely.


The trick is to get folks to do the review.

I know, for example, at $past_employer we had a requirement to get 
someone on our staff to review every single commit in order to maintain 
certain certifications. Of course there is no guarantee that such 
reviews would catch everything, but maybe we could establish post commit 
reviews by contributors in a more rigorous way? Granted, getting more 
qualified volunteers is not a trivial problem...


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Security lessons from liblzma

2024-04-08 Thread Bruce Momjian
On Sat, Mar 30, 2024 at 04:50:26PM -0400, Robert Haas wrote:
> An awful lot of what we do operates on the principle that we know the
> people who are involved and trust them, and I'm glad we do trust them,
> but the world is full of people who trusted somebody too much and
> regretted it afterwards. The fact that we have many committers rather
> than a single maintainer probably reduces risk at least as far as the
> source repository is concerned, because there are more people paying
> attention to potentially notice something that isn't as it should be.

One unwritten requirement for committers is that we are able to
communicate with them securely.  If we cannot do that, they potentially
could be forced by others, e.g., governments, to add code to our
repositories.

Unfortunately, there is on good way for them to communicate with us
securely once they are unable to communicate with us securely.  I
suppose some special word could be used to communicate that status ---
that is how it was done in non-electronic communication in the past.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Security lessons from liblzma

2024-04-08 Thread Jacob Champion
On Fri, Apr 5, 2024 at 5:14 PM Michael Paquier  wrote:
> Saying that, my spidey sense tingles at the recent commit
> 3311ea86edc7, that had the idea to introduce a 20k line output file
> based on a 378 line input file full of random URLs.  In my experience,
> tests don't require to be that large to be useful, and the input data
> is very hard to parse.

That's a good point. I've proposed a patch over at [1] to shrink it
substantially.

--Jacob

[1] 
https://www.postgresql.org/message-id/CAOYmi%2B%3Dkx14ui_A__4L_XcFePSuUuR1kwJfUKxphuZU_i6%3DWpA%40mail.gmail.com




Re: Security lessons from liblzma - libsystemd

2024-04-08 Thread Étienne BERSAC
Hi,

> There are many more interesting and scary libraries in the dependency
> tree of "postgres", so just picking off one right now doesn't really
> accomplish anything.  The next release of libsystemd will drop all
> the compression libraries as hard dependencies, so the issue in that
> sense is gone anyway.  Also, fun fact: liblzma is also a dependency
> via libxml2.

Having an audit of all libraries linked to postgres and their level of
trust should help to point the next weak point. I'm pretty sure we have
several of these tiny libraries maintained by a lone out of time hacker
linked somewhere. What is the next xz ?

Regards,
Étienne 
-- 
DALIBO




Re: Security lessons from liblzma

2024-04-07 Thread Ranier Vilela
Hi,

Sorry, without any connection with the technical part of the thread.
But I couldn't help but record this, and congratulate Andres Freund, for
the excellent work.
It's not every day that a big press, in Brazil or around the world,
publishes something related to technology people.

Today I came across a publication in Brazil's most widely circulated
newspaper, calling him a hero for preventing the attack.

programador-conserta-falha-no-linux-evita-ciberataque-e-vira-heroi-da-internet.shtml


Congratulations Andres Freund, nice work.

best regards,
Ranier Vilela


Re: Security lessons from liblzma

2024-04-05 Thread Michael Paquier
On Fri, Apr 05, 2024 at 11:40:46AM -0700, Jacob Champion wrote:
> On Fri, Apr 5, 2024 at 6:24 AM Robert Haas  wrote:
>> I'm also sort of afraid that we're getting sucked into thinking real
>> hard about this SSL certificate issue rather than trying to brainstorm
>> all the other places that might be problematic. The latter might be a
>> more fruitful exercise (or maybe not, what do I know?).
> 
> +1. Don't get me wrong: I spent a lot of time refactoring the sslfiles
> machinery a while back, and I would love for it to all go away. I
> don't really want to interrupt any lines of thought that are moving in
> that direction. Please continue.

There are a few things that I've not seen mentioned on this thread.

Any random byte sequences included in the tree should have ways to be
regenerated.  One problem with xz was that the binary blobs were
directly part of the tree, with the input file and the test crafted so
as the test would skip portions of the input, one line in ./configure
being enough to switch the backdoor to evil mode (correct me if I read
that wrong).  There could be other things that one would be tempted to
introduce for Postgres as test data to introduce a backdoor with a
slight tweak of the source tarball.  To name a few:
- Binary WAL sequences, arguing that this WAL data is useful even with
alignment restrictions.
- Data for COPY.
- Physical dumps, in raw or just SQL data.

Anything like that can also be used in some test data provided by
somebody in a proposed patch or a bug report, for the sake of
reproducing an issue.  If there is an attack on this data, another
contributor could run it and get his/her own host powned.  One thing
that would be easy to hide is something that reads on-disk file in a
large dump file, with something edited in its inner parts.

An extra thing is if we should expand the use of signed commits and
potentially physical keys for committers, provided by pg-infra which
would be the source of trust?  Some have mentioned that in the past,
and this could reduce the risk of committing problematic code if a
committer's host is powned because the physical key would be required.

Saying that, my spidey sense tingles at the recent commit
3311ea86edc7, that had the idea to introduce a 20k line output file
based on a 378 line input file full of random URLs.  In my experience,
tests don't require to be that large to be useful, and the input data
is very hard to parse.
--
Michael


signature.asc
Description: PGP signature


Re: Security lessons from liblzma

2024-04-05 Thread Jacob Champion
On Fri, Apr 5, 2024 at 6:24 AM Robert Haas  wrote:
> I wonder how hard it would be to just code up our own binary to do
> this. If it'd be a pain to do that, or to maintain it across SSL
> versions, then it's a bad plan and we shouldn't do it. But if it's not
> that much code, maybe it'd be worth considering.

I think my biggest concern, other than the maintenance costs, would be
the statement "we know SSL works on Windows because we test it against
some certificates we hand-rolled ourselves." We can become experts in
certificate formats, of course, but... does it buy us much? If someone
comes and complains that a certificate doesn't work correctly (as they
have *very* recently [3]), I would like to be able to write a test
that uses what OpenSSL actually generates as opposed to learning how
to make it myself first.

> I'm also sort of afraid that we're getting sucked into thinking real
> hard about this SSL certificate issue rather than trying to brainstorm
> all the other places that might be problematic. The latter might be a
> more fruitful exercise (or maybe not, what do I know?).

+1. Don't get me wrong: I spent a lot of time refactoring the sslfiles
machinery a while back, and I would love for it to all go away. I
don't really want to interrupt any lines of thought that are moving in
that direction. Please continue.

_And also:_ the xz attack relied on a long chain of injections, both
technical and social. I'm still wrapping my head around Russ Cox's
writeup [1, 2], but the "hidden blob of junk" is only a single part of
all that. I'm not even sure it was a necessary part; it just happened
to work well for this particular project and line of attack.

I've linked Russ Cox in particular because Golang has made a bunch of
language-level decisions with the supply chain in mind, including the
philosophy that a build should ideally not be able to run arbitrary
code at all, and therefore generated files _must_ be checked in. I
remember $OLDJOB having buildbots that would complain if the contents
of the file you checked in didn't match what was (reproducibly!)
generated. I think there's a lot more to think about here.

--Jacob

[1] https://research.swtch.com/xz-timeline
[2] https://research.swtch.com/xz-script
[3] 
https://www.postgresql.org/message-id/flat/17760-b6c61e752ec07060%40postgresql.org




Re: Security lessons from liblzma

2024-04-05 Thread Robert Haas
On Thu, Apr 4, 2024 at 4:48 PM Daniel Gustafsson  wrote:
> AFAIK we haven't historically enforced that installations have the openssl
> binary in PATH, but it would be a pretty low bar to add.  The bigger issue is
> likely to find someone to port this to Windows, it probably won't be too hard
> but as with all things building on Windows, we need someone skilled in that
> area to do it.

I wonder how hard it would be to just code up our own binary to do
this. If it'd be a pain to do that, or to maintain it across SSL
versions, then it's a bad plan and we shouldn't do it. But if it's not
that much code, maybe it'd be worth considering.

I'm also sort of afraid that we're getting sucked into thinking real
hard about this SSL certificate issue rather than trying to brainstorm
all the other places that might be problematic. The latter might be a
more fruitful exercise (or maybe not, what do I know?).

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Security lessons from liblzma

2024-04-04 Thread Tom Lane
Jelte Fennema-Nio  writes:
> Okay turns out even generating them in parallel isn't any faster than
> running that sequentially. I guess it's because of the strong random
> generation being the slow part. Command I used was the following and
> took ~5 seconds on my machine:

> make -C src/test/ssl sslfiles-clean && make -C src/test/ssl sslfiles -j20

Just for comparison's sake, this takes about 2 minutes on mamba's
host.  Now that's certainly museum-grade hardware, but I don't
think it's even the slowest machine in the buildfarm.  On a
Raspberry Pi 4B, it was about 25 seconds.

(I concur with your result that parallelism helps little.)

regards, tom lane




Re: Security lessons from liblzma

2024-04-04 Thread Jelte Fennema-Nio
On Thu, 4 Apr 2024 at 23:06, Daniel Gustafsson  wrote:
>
> > On 4 Apr 2024, at 23:02, Jelte Fennema-Nio  wrote:
>
> > How about we make it meson/make targets, so they are simply cached
> > just like any of our other build artefacts are cached. Then only clean
> > builds are impacted, not every test run.
>
> They already are (well, make not meson yet), they're just not hooked up to any
> top-level commands.  Running "make ssfiles-clean ssfiles" in src/test/ssl
> regenerates all the files from the base config files that define their
> contents.

Okay turns out even generating them in parallel isn't any faster than
running that sequentially. I guess it's because of the strong random
generation being the slow part. Command I used was the following and
took ~5 seconds on my machine:

make -C src/test/ssl sslfiles-clean && make -C src/test/ssl sslfiles -j20

And I think that's actually a good thing, because that would mean
total build time is pretty much not impacted if we'd include it as
part of the regular clean build. Since building these certs are
bottlenecked on randomness, not on CPU (as pretty much all of our
other build artifacts are). So they should pipeline pretty very well
with the other items, assuming build concurrency is set slightly
higher than the number of cores.

As a proof of concept I ran the above command in a simple bash loop constantly:

while true; do make -C src/test/ssl sslfiles-clean && make -C
src/test/ssl sslfiles -j20; done

And then ran a clean (parallel) build in another shell:

ninja -C build clean && ninja -C build

And total build time went from 41 to 43 seconds. To be clear, that's
while constantly running the ssl file creation. If I only run the
creation once, there's no noticeable increase in build time at all.




Re: Security lessons from liblzma

2024-04-04 Thread Tom Lane
Jelte Fennema-Nio  writes:
> How about we make it meson/make targets, so they are simply cached
> just like any of our other build artefacts are cached. Then only clean
> builds are impacted, not every test run.

Every buildfarm and CI run is "clean" in those terms.

regards, tom lane




Re: Security lessons from liblzma

2024-04-04 Thread Daniel Gustafsson
> On 4 Apr 2024, at 23:02, Jelte Fennema-Nio  wrote:

> How about we make it meson/make targets, so they are simply cached
> just like any of our other build artefacts are cached. Then only clean
> builds are impacted, not every test run.

They already are (well, make not meson yet), they're just not hooked up to any
top-level commands.  Running "make ssfiles-clean ssfiles" in src/test/ssl
regenerates all the files from the base config files that define their
contents.
 
--
Daniel Gustafsson





Re: Security lessons from liblzma

2024-04-04 Thread Jelte Fennema-Nio
On Thu, 4 Apr 2024 at 22:56, Daniel Gustafsson  wrote:
>
> > On 4 Apr 2024, at 22:47, Tom Lane  wrote:
> >
> > Robert Haas  writes:
> >> On Thu, Apr 4, 2024 at 4:25 PM Daniel Gustafsson  wrote:
> >>> I don't disagree, like I said that very email: it's non-trivial and I 
> >>> wish we
> >>> could make it better somehow, but I don't hav an abundance of good ideas.
> >
> >> Is the basic issue that we can't rely on the necessary toolchain to be
> >> present on every machine where someone might try to build PostgreSQL?
> >
> > IIUC, it's not really that, but that regenerating these files is
> > expensive; multiple seconds even on fast machines.  Putting that
> > into tests that are run many times a day is unappetizing.
>
> That's one aspect of it.  We could cache the results of course to amortize the
> cost over multiple test-runs but at the end of the day it will add time to
> test-runs regardless of what we do.

How about we make it meson/make targets, so they are simply cached
just like any of our other build artefacts are cached. Then only clean
builds are impacted, not every test run.




Re: Security lessons from liblzma

2024-04-04 Thread Greg Sabino Mullane
>
> It would be better if we created the required test files as part of the
> test run.  (Why not?  Too slow?)  Alternatively, I have been thinking
> that maybe we could make the output more reproducible by messing with
> whatever random seed OpenSSL uses.  Or maybe use a Python library to
> create the files.  Some things to think about.
>

I think this last idea is the way to go. I've hand-crafted GIF images and
PGP messages in the past; surely we have enough combined brain power around
here to craft our own SSL files? It may even be a wheel that someone has
invented already.

Cheers,
Greg


Re: Security lessons from liblzma

2024-04-04 Thread Bruce Momjian
On Thu, Apr  4, 2024 at 10:56:01PM +0200, Daniel Gustafsson wrote:
> > On 4 Apr 2024, at 22:47, Tom Lane  wrote:
> > 
> > Robert Haas  writes:
> >> On Thu, Apr 4, 2024 at 4:25 PM Daniel Gustafsson  wrote:
> >>> I don't disagree, like I said that very email: it's non-trivial and I 
> >>> wish we
> >>> could make it better somehow, but I don't hav an abundance of good ideas.
> > 
> >> Is the basic issue that we can't rely on the necessary toolchain to be
> >> present on every machine where someone might try to build PostgreSQL?
> > 
> > IIUC, it's not really that, but that regenerating these files is
> > expensive; multiple seconds even on fast machines.  Putting that
> > into tests that are run many times a day is unappetizing.
> 
> That's one aspect of it.  We could cache the results of course to amortize the
> cost over multiple test-runs but at the end of the day it will add time to
> test-runs regardless of what we do.
> 
> One thing to consider would be to try and rearrange/refactor the tests to
> require a smaller set of keys and certificates.  I haven't looked into what
> sort of savings that could yield (if any) but if we go the route of
> regeneration at test-time we shouldn't leave potential savings on the table.

Rather then everyone testing it on every build, couldn't we have an
automated test every night that checked binary files.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Security lessons from liblzma

2024-04-04 Thread Daniel Gustafsson
> On 4 Apr 2024, at 22:47, Tom Lane  wrote:
> 
> Robert Haas  writes:
>> On Thu, Apr 4, 2024 at 4:25 PM Daniel Gustafsson  wrote:
>>> I don't disagree, like I said that very email: it's non-trivial and I wish 
>>> we
>>> could make it better somehow, but I don't hav an abundance of good ideas.
> 
>> Is the basic issue that we can't rely on the necessary toolchain to be
>> present on every machine where someone might try to build PostgreSQL?
> 
> IIUC, it's not really that, but that regenerating these files is
> expensive; multiple seconds even on fast machines.  Putting that
> into tests that are run many times a day is unappetizing.

That's one aspect of it.  We could cache the results of course to amortize the
cost over multiple test-runs but at the end of the day it will add time to
test-runs regardless of what we do.

One thing to consider would be to try and rearrange/refactor the tests to
require a smaller set of keys and certificates.  I haven't looked into what
sort of savings that could yield (if any) but if we go the route of
regeneration at test-time we shouldn't leave potential savings on the table.

--
Daniel Gustafsson





Re: Security lessons from liblzma

2024-04-04 Thread Daniel Gustafsson
> On 4 Apr 2024, at 22:40, Robert Haas  wrote:
> 
> On Thu, Apr 4, 2024 at 4:25 PM Daniel Gustafsson  wrote:
>> I don't disagree, like I said that very email: it's non-trivial and I wish we
>> could make it better somehow, but I don't hav an abundance of good ideas.
> 
> Is the basic issue that we can't rely on the necessary toolchain to be
> present on every machine where someone might try to build PostgreSQL?

AFAIK we haven't historically enforced that installations have the openssl
binary in PATH, but it would be a pretty low bar to add.  The bigger issue is
likely to find someone to port this to Windows, it probably won't be too hard
but as with all things building on Windows, we need someone skilled in that
area to do it.

>> Removing the generated versions and creating them when running tests makes
>> sneaking in malicious content harder since it then has to be submitted in
>> clear-text *only*.  The emphasis added since it's like that today as well: 
>> *I*
>> fully trust our team of committers to not accept a binary file in a patch
>> without replacing with a regenerated version, but enforcing it might make it
>> easier for a wider community to share that level of trust?
> 
> To be honest, I'm not at all sure that I would have considered
> regenerating a binary file to be a must-do kind of a thing, so I guess
> that's a lesson learned for me. Trust is a really tricky thing in
> cases like this. It's not just about whether some committer is
> secretly a malicious actor; it's also about whether everyone
> understands the best practices and follows them consistently. In that
> regard, I don't even trust myself. I hope that it's unlikely that I
> would mistakenly commit something malicious, but I think it could
> happen, and I think it could happen to anyone else, too.

It absolutelty could.  Re-reading Ken Thompsons Turing Lecture "Reflections on
Trusting Trust" at periodic intervals is a good reminder to self just how
complicated this is.

--
Daniel Gustafsson





Re: Security lessons from liblzma

2024-04-04 Thread Tom Lane
Robert Haas  writes:
> On Thu, Apr 4, 2024 at 4:25 PM Daniel Gustafsson  wrote:
>> I don't disagree, like I said that very email: it's non-trivial and I wish we
>> could make it better somehow, but I don't hav an abundance of good ideas.

> Is the basic issue that we can't rely on the necessary toolchain to be
> present on every machine where someone might try to build PostgreSQL?

IIUC, it's not really that, but that regenerating these files is
expensive; multiple seconds even on fast machines.  Putting that
into tests that are run many times a day is unappetizing.

regards, tom lane




Re: Security lessons from liblzma

2024-04-04 Thread Robert Haas
On Thu, Apr 4, 2024 at 4:25 PM Daniel Gustafsson  wrote:
> I don't disagree, like I said that very email: it's non-trivial and I wish we
> could make it better somehow, but I don't hav an abundance of good ideas.

Is the basic issue that we can't rely on the necessary toolchain to be
present on every machine where someone might try to build PostgreSQL?

> Removing the generated versions and creating them when running tests makes
> sneaking in malicious content harder since it then has to be submitted in
> clear-text *only*.  The emphasis added since it's like that today as well: *I*
> fully trust our team of committers to not accept a binary file in a patch
> without replacing with a regenerated version, but enforcing it might make it
> easier for a wider community to share that level of trust?

To be honest, I'm not at all sure that I would have considered
regenerating a binary file to be a must-do kind of a thing, so I guess
that's a lesson learned for me. Trust is a really tricky thing in
cases like this. It's not just about whether some committer is
secretly a malicious actor; it's also about whether everyone
understands the best practices and follows them consistently. In that
regard, I don't even trust myself. I hope that it's unlikely that I
would mistakenly commit something malicious, but I think it could
happen, and I think it could happen to anyone else, too.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Security lessons from liblzma

2024-04-04 Thread Daniel Gustafsson
> On 4 Apr 2024, at 21:38, Robert Haas  wrote:

> Essentially, your argument is the same as his, namely: hey, don't
> worry, you could totally verify these test files if you wanted to! But
> of course, nobody did, because it was hard, and everybody had better
> things to do with their time. And I think the same thing is probably
> true here: nobody really is going to verify much about these files.

I don't disagree, like I said that very email: it's non-trivial and I wish we
could make it better somehow, but I don't hav an abundance of good ideas.

Removing the generated versions and creating them when running tests makes
sneaking in malicious content harder since it then has to be submitted in
clear-text *only*.  The emphasis added since it's like that today as well: *I*
fully trust our team of committers to not accept a binary file in a patch
without replacing with a regenerated version, but enforcing it might make it
easier for a wider community to share that level of trust?

--
Daniel Gustafsson





Re: Security lessons from liblzma

2024-04-04 Thread Robert Haas
On Wed, Apr 3, 2024 at 3:42 PM Daniel Gustafsson  wrote:
> > So you just have to "trust" that what's there now makes sense.
>
> Not entirely, you can review the input files which are used to generate the
> test data and verify that those make sense..

Yeah, I mean, in theory I suppose that's true, but look at this commit:

https://git.tukaani.org/?p=xz.git;a=commitdiff;h=6e636819e8f070330d835fce46289a3ff72a7b89

In case the link stops working for some reason, the commit message is
as follows:

JT> Tests: Update two test files.
JT>
JT> The original files were generated with random local to my machine.
JT> To better reproduce these files in the future, a constant seed was used
JT> to recreate these files.

Essentially, your argument is the same as his, namely: hey, don't
worry, you could totally verify these test files if you wanted to! But
of course, nobody did, because it was hard, and everybody had better
things to do with their time. And I think the same thing is probably
true here: nobody really is going to verify much about these files.

I just went and ran openssl x509 -in $f -text on each .crt file and it
worked, so all of those files look basically like certificates. But
like, hypothetically, how do I know that the modulus chosen for a
particular certificate was chosen at random, rather than maliciously?
Similarly for the key files. Are there padding bytes in any of these
files that could be used to store evil information? I don't know that,
either. I'm not sure how far it's worth continuing down this path of
paranoia; I doubt that Daniel Gustafsson is a clever alias for a
nefarious cabal of state-sponsored hackers, and the hypotheses that
supposedly-random values were chosen non-randomly borders on
unfalsifiable. Nonetheless, I think Peter is correct to point out that
these are the kinds of files about which it is reasonable to be
concerned, because they seem to have properties quite similar to those
of the files used in an actual attack.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Security lessons from liblzma - libsystemd

2024-04-03 Thread Peter Eisentraut

On 03.04.24 23:19, Magnus Hagander wrote:
When the code is this simple, we should definitely consider carrying it 
ourselves. At least if we don't expect to need *other* functionality 
from the same library in the future, which I doubt we will from libsystemd.


Well, I've long had it on my list to do some integration to log directly 
to the journal, so you can preserve metadata better.  I'm not sure right 
now whether this would use libsystemd, but it's not like there is 
absolutely no other systemd-related functionality that could be added.


Personally, I think this proposed change is trying to close a barndoor 
after a horse has bolted.  There are many more interesting and scary 
libraries in the dependency tree of "postgres", so just picking off one 
right now doesn't really accomplish anything.  The next release of 
libsystemd will drop all the compression libraries as hard dependencies, 
so the issue in that sense is gone anyway.  Also, fun fact: liblzma is 
also a dependency via libxml2.






Re: Security lessons from liblzma - libsystemd

2024-04-03 Thread Andres Freund
Hi,

On 2024-04-03 17:58:55 -0400, Tom Lane wrote:
> Magnus Hagander  writes:
> > On Wed, Apr 3, 2024 at 7:57 PM Andres Freund  wrote:
> >> Openssh has now integrated [1] a patch to remove the dependency on
> >> libsystemd
> >> for triggering service manager readyness notifications, by inlining the
> >> necessary function. That's not hard, the protocol is pretty simple.
> >> I suspect we should do the same. We're not even close to being a target as
> >> attractive as openssh, but still, it seems unnecessary.
> 
> > +1.
> 
> I didn't read the patch, but if it's short and stable enough then this
> seems like a good idea.

It's basically just checking for an env var, opening the unix socket indicated
by that, writing a string to it and closing the socket again.


> (If openssh and we are using such a patch, that will probably be a big
> enough stake in the ground to prevent somebody deciding to change the
> protocol ...)

One version of the openssh patch to remove liblzma was submitted by one of the
core systemd devs, so I think they agree that it's a stable API.  The current
protocol supports adding more information by adding attributes, so it should
be extensible enough anyway.


> >> An argument could be made to instead just remove support, but I think it's
> >> quite valuable to have intra service dependencies that can rely on the
> >> server actually having started up.
> 
> > If we remove support we're basically just asking most of our linux
> > packagers to add it back in, and they will add it back in the same way we
> > did it. I think we do everybody a disservice if we do that. It's useful
> > functionality.
> 
> Yeah, that idea seems particularly silly in view of the desire
> expressed earlier in this thread to reduce the number of patches
> carried by packagers.  People packaging for systemd-using distros
> will not consider that this functionality is optional.

Yep.

Greetings,

Andres Freund




Re: Security lessons from liblzma - libsystemd

2024-04-03 Thread Tom Lane
Magnus Hagander  writes:
> On Wed, Apr 3, 2024 at 7:57 PM Andres Freund  wrote:
>> Openssh has now integrated [1] a patch to remove the dependency on
>> libsystemd
>> for triggering service manager readyness notifications, by inlining the
>> necessary function. That's not hard, the protocol is pretty simple.
>> I suspect we should do the same. We're not even close to being a target as
>> attractive as openssh, but still, it seems unnecessary.

> +1.

I didn't read the patch, but if it's short and stable enough then this
seems like a good idea.  (If openssh and we are using such a patch,
that will probably be a big enough stake in the ground to prevent
somebody deciding to change the protocol ...)

>> An argument could be made to instead just remove support, but I think it's
>> quite valuable to have intra service dependencies that can rely on the
>> server actually having started up.

> If we remove support we're basically just asking most of our linux
> packagers to add it back in, and they will add it back in the same way we
> did it. I think we do everybody a disservice if we do that. It's useful
> functionality.

Yeah, that idea seems particularly silly in view of the desire
expressed earlier in this thread to reduce the number of patches
carried by packagers.  People packaging for systemd-using distros
will not consider that this functionality is optional.

regards, tom lane




Re: Security lessons from liblzma - libsystemd

2024-04-03 Thread Magnus Hagander
On Wed, Apr 3, 2024 at 7:57 PM Andres Freund  wrote:

> Hi,
>
> As most will know by now, the way xz debacle was able to make sshd
> vulnerable
> was through a dependency from sshd to libsystemd and then from libsystemd
> to
> liblzma. One lesson from this is that unnecessary dependencies can still
> increase risk.
>

Yeah, I think that's something to consider for every dependency added. I
think we're fairly often protected against "adding too many libraries"
because many libraries simply don't exist for all the platforms we want to
build on. But it's nevertheless something to think about each time.


It's worth noting that we have an optional dependency on libsystemd as well.
>
> Openssh has now integrated [1] a patch to remove the dependency on
> libsystemd
> for triggering service manager readyness notifications, by inlining the
> necessary function. That's not hard, the protocol is pretty simple.
>
> I suspect we should do the same. We're not even close to being a target as
> attractive as openssh, but still, it seems unnecessary.
>

+1.

When the code is this simple, we should definitely consider carrying it
ourselves. At least if we don't expect to need *other* functionality from
the same library in the future, which I doubt we will from libsystemd.


An argument could be made to instead just remove support, but I think it's
> quite valuable to have intra service dependencies that can rely on the
> server
> actually having started up.
>
>
If we remove support we're basically just asking most of our linux
packagers to add it back in, and they will add it back in the same way we
did it. I think we do everybody a disservice if we do that. It's useful
functionality.

//Magnus


Re: Security lessons from liblzma

2024-04-03 Thread Daniel Gustafsson
> On 3 Apr 2024, at 20:09, Peter Eisentraut  wrote:
> 
> On 30.03.24 00:14, Daniel Gustafsson wrote:
>> One take-away for me is how important it is to ship recipes for regenerating
>> any testdata which is included in generated/compiled/binary format.  Kind of
>> how we in our tree ship the config for test TLS certificates and keys which 
>> can
>> be manually inspected, and used to rebuild the testdata (although the risk 
>> for
>> injections in this particular case seems low).  Bad things can still be
>> injected, but formats which allow manual review at least goes some way 
>> towards
>> lowering risk.
> 
> I actually find the situation with the test TLS files quite unsatisfactory.  
> While we have build rules, the output files are not reproducible, both 
> because of some inherent randomness in the generation, and because different 
> OpenSSL versions produce different details.

This testdata is by nature not reproducible, and twisting arms to make it so
will only result in testing against synthetic data which risk hiding bugs IMO.

> So you just have to "trust" that what's there now makes sense.

Not entirely, you can review the input files which are used to generate the
test data and verify that those make sense..

> Of course, you can use openssl tools to unpack these files, but that is 
> difficult and unreliable unless you know exactly what you are looking for.

..and check like you mention above, but it's as you say not entirely trivial.  
It
would be nice to improve this but I'm not sure how.  Document how to inspect
the data in src/test/ssl/README perhaps?

> It would be better if we created the required test files as part of the test 
> run.  (Why not?  Too slow?)

The make sslfiles step requires OpenSSL 1.1.1, which is higher than what we
require to be installed to build postgres.  The higher-than-base requirement is
due to it building test data only used when running 1.1.1 or higher, so
technically it could be made to work if anyone is interesting in investing time
in 1.0.2.

Time is one aspect, on my crusty old laptop it takes ~2 seconds to regenerate
the files.  That in itself isn't that much, but we've rejected test-time
additions far less than that.  We could however make CI and the Buildfarm run
the regeneration and leave it up to each developer to decide locally?  Or
remove them and regenerate on the first SSL test run and then use the cached
ones after that?

On top of time I have a feeling the regeneration won't run on Windows.  When
it's converted to use Meson then that might be fixed though.

--
Daniel Gustafsson





Re: Security lessons from liblzma

2024-04-03 Thread Peter Eisentraut

On 30.03.24 00:14, Daniel Gustafsson wrote:

One take-away for me is how important it is to ship recipes for regenerating
any testdata which is included in generated/compiled/binary format.  Kind of
how we in our tree ship the config for test TLS certificates and keys which can
be manually inspected, and used to rebuild the testdata (although the risk for
injections in this particular case seems low).  Bad things can still be
injected, but formats which allow manual review at least goes some way towards
lowering risk.


I actually find the situation with the test TLS files quite 
unsatisfactory.  While we have build rules, the output files are not 
reproducible, both because of some inherent randomness in the 
generation, and because different OpenSSL versions produce different 
details.  So you just have to "trust" that what's there now makes sense. 
 Of course, you can use openssl tools to unpack these files, but that 
is difficult and unreliable unless you know exactly what you are looking 
for.  Also, for example, do we even know whether all the files that are 
there now are even used by any tests?


A few years ago, some guy on the internet sent in a purported update to 
one of the files [0], which I ended up committing, but I remember that 
that situation gave me quite some pause at the time.


It would be better if we created the required test files as part of the 
test run.  (Why not?  Too slow?)  Alternatively, I have been thinking 
that maybe we could make the output more reproducible by messing with 
whatever random seed OpenSSL uses.  Or maybe use a Python library to 
create the files.  Some things to think about.


[0]: 
https://www.postgresql.org/message-id/FEF81714-D479-4512-839B-C769D2605F8A%40yesql.se






Re: Security lessons from liblzma - libsystemd

2024-04-03 Thread Andres Freund
Hi,

As most will know by now, the way xz debacle was able to make sshd vulnerable
was through a dependency from sshd to libsystemd and then from libsystemd to
liblzma. One lesson from this is that unnecessary dependencies can still
increase risk.

It's worth noting that we have an optional dependency on libsystemd as well.

Openssh has now integrated [1] a patch to remove the dependency on libsystemd
for triggering service manager readyness notifications, by inlining the
necessary function. That's not hard, the protocol is pretty simple.

I suspect we should do the same. We're not even close to being a target as
attractive as openssh, but still, it seems unnecessary.

Intro into the protocol is at [2], with real content and outline of the
relevant code at [3].


An argument could be made to instead just remove support, but I think it's
quite valuable to have intra service dependencies that can rely on the server
actually having started up.

Greetings,

Andres Freund

[1] https://bugzilla.mindrot.org/show_bug.cgi?id=2641
[2] 
https://www.freedesktop.org/software/systemd/man/devel/systemd.html#Readiness%20Protocol
[3] https://www.freedesktop.org/software/systemd/man/devel/sd_notify.html#Notes




Re: Security lessons from liblzma

2024-04-01 Thread Andrew Dunstan



On 2024-03-31 Su 17:12, Andres Freund wrote:

Hi,

On 2024-03-31 12:18:29 +0200, Michael Banck wrote:

If you ask where they are maintained, the answer is here:

https://salsa.debian.org/postgresql/postgresql/-/tree/17/debian/patches?ref_type=heads

the other major versions have their own branch.

Luckily these are all quite small, leaving little space to hide stuff.  I'd
still like to get rid of at least some of them.

I've previously proposed a patch to make pkglibdir configurable, I think we
should just go for that.

For the various defines, ISTM we should just make them #ifndef guarded, then
they could be overridden by defining them at configure time. Some of them,
like DEFAULT_PGSOCKET_DIR seem to be overridden by just about every
distro. And others would be nice to easily override anyway, I e.g. dislike the
default DEFAULT_PAGER value.



+1 to this proposal.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Security lessons from liblzma

2024-04-01 Thread Tom Lane
Bruce Momjian  writes:
> On Mon, Apr  1, 2024 at 03:17:55PM -0400, Tom Lane wrote:
>> AFAIK, every open-source distro makes all the pieces needed to
>> rebuild their packages available to users.  It wouldn't be much
>> of an open-source situation otherwise.  You do have to learn
>> their package build process.

> I wasn't clear if all the projects provide a source tree that can be
> verified against the project's source tree, and then independent
> patches, or if the patches were integrated and therefore harder to
> verify against the project source tree.

In the systems I'm familiar with, an SRPM-or-equivalent includes the
pristine upstream tarball and then some patch files to apply to it.
The patch files have to be maintained anyway, and if you don't ship
them then you're not shipping "source".

regards, tom lane




Re: Security lessons from liblzma

2024-04-01 Thread Bruce Momjian
On Mon, Apr  1, 2024 at 03:17:55PM -0400, Tom Lane wrote:
> Bruce Momjian  writes:
> > I was more asking if users have access to patches so they could recreate
> > the build by using the Postgres git tree and supplied OS-specific
> > patches.
> 
> AFAIK, every open-source distro makes all the pieces needed to
> rebuild their packages available to users.  It wouldn't be much
> of an open-source situation otherwise.  You do have to learn
> their package build process.

I wasn't clear if all the projects provide a source tree that can be
verified against the project's source tree, and then independent
patches, or if the patches were integrated and therefore harder to
verify against the project source tree.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Security lessons from liblzma

2024-04-01 Thread Bruce Momjian
On Fri, Mar 29, 2024 at 06:37:24PM -0400, Bruce Momjian wrote:
> You might have seen reports today about a very complex exploit added to
> recent versions of liblzma.  Fortunately, it was only enabled two months
> ago and has not been pushed to most stable operating systems like Debian
> and Ubuntu.  The original detection report is:
> 
> https://www.openwall.com/lists/oss-security/2024/03/29/4

I was watching this video about the exploit:

https://www.youtube.com/watch?v=bS9em7Bg0iU

and at 2:29, they mention "hero software developer", our own Andres
Freund as the person who discovered the exploit.  I noticed the author's
name at the openwall email link above, but I assumed it was someone else
with the same name.  They mentioned it was found while researching
Postgres performance, and then I noticed the email address matched!

I thought the analogy he uses at the end of the video is very clear.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Security lessons from liblzma

2024-04-01 Thread Tom Lane
Bruce Momjian  writes:
> I was more asking if users have access to patches so they could recreate
> the build by using the Postgres git tree and supplied OS-specific
> patches.

AFAIK, every open-source distro makes all the pieces needed to
rebuild their packages available to users.  It wouldn't be much
of an open-source situation otherwise.  You do have to learn
their package build process.

regards, tom lane




Re: Security lessons from liblzma

2024-04-01 Thread Bruce Momjian
On Sun, Mar 31, 2024 at 02:12:57PM -0700, Andres Freund wrote:
> Hi,
> 
> On 2024-03-31 12:18:29 +0200, Michael Banck wrote:
> > If you ask where they are maintained, the answer is here:
> >
> > https://salsa.debian.org/postgresql/postgresql/-/tree/17/debian/patches?ref_type=heads
> >
> > the other major versions have their own branch.
> 
> Luckily these are all quite small, leaving little space to hide stuff.  I'd
> still like to get rid of at least some of them.
> 
> I've previously proposed a patch to make pkglibdir configurable, I think we
> should just go for that.
> 
> For the various defines, ISTM we should just make them #ifndef guarded, then
> they could be overridden by defining them at configure time. Some of them,
> like DEFAULT_PGSOCKET_DIR seem to be overridden by just about every
> distro. And others would be nice to easily override anyway, I e.g. dislike the
> default DEFAULT_PAGER value.

I realize we can move some changes into our code, but packagers are
still going to need a way to do immediate adjustments to match their OS
in time frames that don't match the Postgres release schedule.

I was more asking if users have access to patches so they could recreate
the build by using the Postgres git tree and supplied OS-specific
patches.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Security lessons from liblzma

2024-04-01 Thread David E. Wheeler
On Apr 1, 2024, at 06:55, walt...@technowledgy.de wrote:

> Also a configurable directoy to look up extensions, possibly even to be 
> changed at run-time like [2]. The patch says this:
> 
>> This directory is prepended to paths when loading extensions (control and 
>> SQL files), and to the '$libdir' directive when loading modules that back 
>> functions. The location is made configurable to allow build-time testing of 
>> extensions that do not have been installed to their proper location yet.
> 
> This seems like a great thing to have. This might also be relevant in light 
> of recent discussions in the ecosystem around extension management.
> 
> All the path-related issues have in common, that while it's easy to move 
> files around to their proper locations later, they all need to adjust 
> pg_config's output.

Funny timing, I was planning to resurrect this old patch[1] and propose that 
patch this week. One of motivators is the increasing use of Docker images in 
Kubernetes to run Postgres, where there’s a desire to keep the core service and 
extensions immutable, and to have a second directory mounted to a persistent 
volume into which other extensions can be installed and preserved independently 
of the Docker image.

The current approach involves symlinking shenanigans[2] that complicate things 
pretty substantially, making it more difficult to administer. A second 
directory fit for purpose would be far better.

There are some other motivators, so I’ll do some additional diligence and start 
a separate thread (or reply to the original[3]).

Best,

David

[1] https://commitfest.postgresql.org/5/170/
[2] https://speakerdeck.com/ongres/postgres-extensions-in-kubernetes?slide=14
[3] 
https://www.postgresql.org/message-id/flat/51AE0845.8010600%40ocharles.org.uk





Re: Security lessons from liblzma

2024-04-01 Thread walther
I looked through the repositories of 19 linux distros [1] to see what 
kind of patches are applied often. Many of them share the same package 
managers / repositories and thus the same patches. I made sure to look 
at some smaller, "other" distros as well, to see what kind of problems 
appear outside the mainstream distros.


Andres Freund:

I've previously proposed a patch to make pkglibdir configurable, I think we
should just go for that.


+1. Other paths, which some distros need to configure are pkgincludedir 
and pgxs (independently of pkglibdir).


Also a configurable directoy to look up extensions, possibly even to be 
changed at run-time like [2]. The patch says this:



This directory is prepended to paths when loading extensions (control and SQL 
files), and to the '$libdir' directive when loading modules that back 
functions. The location is made configurable to allow build-time testing of 
extensions that do not have been installed to their proper location yet.


This seems like a great thing to have. This might also be relevant in 
light of recent discussions in the ecosystem around extension management.


All the path-related issues have in common, that while it's easy to move 
files around to their proper locations later, they all need to adjust 
pg_config's output.



Andres Freund:

For the various defines, ISTM we should just make them #ifndef guarded, then
they could be overridden by defining them at configure time. Some of them,
like DEFAULT_PGSOCKET_DIR seem to be overridden by just about every
distro. And others would be nice to easily override anyway, I e.g. dislike the
default DEFAULT_PAGER value.


DEFAULT_PAGER is also overriden by a few distros. DEFAULT_EDITOR by one 
as well. As you said DEFAULT_PGSOCKET_DIR in **a lot** of them.



Andres Freund:

postgresql-17-rpm-pgsql.patch:

We should probably make this stuff properly configurable. The current logic
with inferring whether to add /postgresql is just weird. Perhaps a configure
option that defaults to the current logic when set to an empty string but can
be overridden?


+1 for that option to force the suffix, no matter whether postgres/pgsql 
are in the path already.



Some other commonly patched issues are:
- Building only libpq, only libecpg - or disabling both and building 
only the server code. This comes up often, it's not supported nicely in 
our build system, yet. I think meson already has some build targets for 
parts of that, but it's very hard / impossible to then install only this 
subset of the build. It would be great to be able to build and install 
only the frontend code (including only the frontend headers) or only the 
backend code (including headers) etc.
- Related to the above is pg_config and how to deal with it when 
installing separate client and server packages, in different locations, 
too. Some distros provide a second version of pg_config as pg_server_config.


Those two issues and the path-related issues above make it harder than 
it should be to provide separate packages for each major version, which 
can be installed at the same time (versioned prefixes, multiple server 
packages, but maybe only a single libpq package etc.).



Some small patches that might not be widespread, but could possibly 
still be upstreamed:

- jit-s390x [3] (Alpine, Debian, Fedora, openSUSE)
- pg_config --major-version option for extension builds [4] (Alpine)
- Some fixes for man pages [5] (AlmaLinux, CentOS, Fedora)


TLDR: I think it should be possible to make the build system more 
flexible in some areas without introducing distro specific things in 
core. This should eliminate the need for many of the same patches across 
the board for a lot of distros.


Best,

Wolfgang

[1]: ALT Linux, Adélie Linux, AlmaLinux, Alpine Linux, Arch Linux, 
CentOS, Crux, Debian, Fedora, Gentoo, GoboLinux, Guix, Mandriva, NixOS, 
OpenWRT, Rocky Linux, Ubuntu, Void Linux, openSUSE


[2]: 
https://salsa.debian.org/postgresql/postgresql/-/blob/17/debian/patches/extension_destdir?ref_type=heads


[3]: 
https://salsa.debian.org/postgresql/postgresql/-/blob/17/debian/patches/jit-s390x?ref_type=heads


[4]: 
https://gitlab.alpinelinux.org/alpine/aports/-/blob/master/main/postgresql16/pg_config-add-major-version.patch?ref_type=heads


[5]: 
https://gitlab.com/redhat/centos-stream/rpms/postgresql/-/blob/c9s/postgresql-man.patch





Re: Security lessons from liblzma

2024-03-31 Thread Andres Freund
Hi,

On 2024-03-31 12:18:29 +0200, Michael Banck wrote:
> If you ask where they are maintained, the answer is here:
>
> https://salsa.debian.org/postgresql/postgresql/-/tree/17/debian/patches?ref_type=heads
>
> the other major versions have their own branch.

Luckily these are all quite small, leaving little space to hide stuff.  I'd
still like to get rid of at least some of them.

I've previously proposed a patch to make pkglibdir configurable, I think we
should just go for that.

For the various defines, ISTM we should just make them #ifndef guarded, then
they could be overridden by defining them at configure time. Some of them,
like DEFAULT_PGSOCKET_DIR seem to be overridden by just about every
distro. And others would be nice to easily override anyway, I e.g. dislike the
default DEFAULT_PAGER value.


On 2024-03-31 16:55:26 +0100, Devrim Gündüz wrote:
> Here are the v17 patches:
>
> https://git.postgresql.org/gitweb/?p=pgrpms.git;a=tree;f=rpm/redhat/main/non-common/postgresql-17/main

A bit bigger/more patches, but still not too bad.

postgresql-17-conf.patch

Uncomments a few values to their default, that's a bit odd.


postgresql-17-ecpg_config.h:
postgresql-17-pg_config.h:

Hm, wonder if we should make this easier somehow. Perhaps we ought to support
installing the various *config.h headers into a different directory than the
architecture independent stuff?  At least on debian based systems it seems we
ought to support installing pg_config.h etc into /usr/include/ or
something along those lines.


postgresql-17-rpm-pgsql.patch:

We should probably make this stuff properly configurable. The current logic
with inferring whether to add /postgresql is just weird. Perhaps a configure
option that defaults to the current logic when set to an empty string but can
be overridden?


Greetings,

Andres Freund




Re: Security lessons from liblzma

2024-03-31 Thread Tom Lane
Michael Banck  writes:
> On Sun, Mar 31, 2024 at 01:05:40PM -0400, Joe Conway wrote:
>> But it has always bothered me how many patches get applied to the upstream
>> tarballs by the OS package builders.

> I think this more an artifact of how RHEL development works, i.e. trying
> to ship the same major version of glibc for 10 years, but still fix lots
> of bugs and possibly some performance improvements your larger customers
> ask for. So I guess a lot of those 1000 patches are just cherry-picks /
> backports of upstream commits from newer releases.

Yeah.  Also, precisely because they keep supporting versions that are
out-of-support according to upstream, the idea that all the patches
can be moved upstream isn't going to work for them, and they're
unlikely to be excited about partial solutions.

The bigger problem though is: if we do this, are we going to take
patches that we fundamentally don't agree with?  For instance,
if a packager chooses to rip out the don't-run-server-as-root check.
(Pretty sure I've heard of people doing that.)  That would look like
blessing things we don't think are good ideas, and it would inevitably
lead to long arguments with packagers about why-dont-you-do-this-some-
other-way.  I'm not excited about that prospect.

regards, tom lane




Re: Security lessons from liblzma

2024-03-31 Thread Michael Banck
Hi,

On Sun, Mar 31, 2024 at 01:05:40PM -0400, Joe Conway wrote:
> But it has always bothered me how many patches get applied to the upstream
> tarballs by the OS package builders. Some of them, e.g. glibc on RHEL 7,
> include more than 1000 patches that you would have to manually vet if you
> cared enough and had the skills. Last time I looked at the openssl package
> sources it was similar in volume and complexity. They might as well be
> called forks if everyone were being honest about it...

I think this more an artifact of how RHEL development works, i.e. trying
to ship the same major version of glibc for 10 years, but still fix lots
of bugs and possibly some performance improvements your larger customers
ask for. So I guess a lot of those 1000 patches are just cherry-picks /
backports of upstream commits from newer releases.

I guess it would be useful to maybe have another look at the patches
that are being applied for apt/yum.postgresql.org for the 18 release
cycle, but I do not think those are a security problem. Not sure about
RPM builds, but at least in theory the APT builds should be
reproducible.

What would be a significant gain in security/trust was an easy
service/recipe on how to verify the reproducibility (i) by independently
building packages (and maybe the more popular extensions) and comparing
them to the {apt,yum}.postgresql.org repository packages (ii) by being
able to build the release tarballs reproducibly.


Michael




Re: Security lessons from liblzma

2024-03-31 Thread Joe Conway

On 3/31/24 11:49, Tom Lane wrote:

Joe Conway  writes:
I am saying maybe those patches should be eliminated in favor of our 
tree including build options that would produce the same result.


I don't really see how that can be expected to work sanely.
It turns the responsibility for platform-specific build issues
on its head, and it doesn't work at all for issues discovered
after we make a release.  The normal understanding of how you
can vet a distro's package is that you look at the package
contents (the SRPM in Red Hat world and whatever the equivalent
concept is elsewhere), check that the contained tarball
matches upstream and that the patches and build instructions
look sane, and then build it locally and check for a match to
the distro's binary package.  Even if we could overcome the
obstacles to putting the patch files into the upstream tarball,
we're surely not going to include the build instructions, so
we'd not have moved the needle very far in terms of whether the
packager could do something malicious.


True enough I guess.

But it has always bothered me how many patches get applied to the 
upstream tarballs by the OS package builders. Some of them, e.g. glibc 
on RHEL 7, include more than 1000 patches that you would have to 
manually vet if you cared enough and had the skills. Last time I looked 
at the openssl package sources it was similar in volume and complexity. 
They might as well be called forks if everyone were being honest about it...


I know our PGDG packages are no big deal compared to that, fortunately.

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Security lessons from liblzma

2024-03-31 Thread Devrim Gündüz
Hi,

On Sat, 2024-03-30 at 21:52 -0400, Bruce Momjian wrote:
> How would someone access the out-of-tree patches? 

Here are the v17 patches:

https://git.postgresql.org/gitweb/?p=pgrpms.git;a=tree;f=rpm/redhat/main/non-common/postgresql-17/main

You can replace -17 with -16 (and etc) for the other major releases.

Please note that both Debian folks and me build about 300 other packages
to support the ecosystem. Just saying.

Regards,
-- 
Devrim Gündüz
Open Source Solution Architect, PostgreSQL Major Contributor
Twitter: @DevrimGunduz , @DevrimGunduzTR




Re: Security lessons from liblzma

2024-03-31 Thread Tom Lane
Joe Conway  writes:
> I am saying maybe those patches should be eliminated in favor of our 
> tree including build options that would produce the same result.

I don't really see how that can be expected to work sanely.
It turns the responsibility for platform-specific build issues
on its head, and it doesn't work at all for issues discovered
after we make a release.  The normal understanding of how you
can vet a distro's package is that you look at the package
contents (the SRPM in Red Hat world and whatever the equivalent
concept is elsewhere), check that the contained tarball
matches upstream and that the patches and build instructions
look sane, and then build it locally and check for a match to
the distro's binary package.  Even if we could overcome the
obstacles to putting the patch files into the upstream tarball,
we're surely not going to include the build instructions, so
we'd not have moved the needle very far in terms of whether the
packager could do something malicious.

regards, tom lane




Re: Security lessons from liblzma

2024-03-31 Thread Devrim Gündüz
Hi,

On Sun, 2024-03-31 at 08:15 -0400, Joe Conway wrote:
> 
> I am saying maybe those patches should be eliminated in favor of our 
> tree including build options that would produce the same result.

Works for me, as a long as I can commit them and upcoming potential
patches to PostgreSQL git repo.

OTOH, we also carry non-patches like README files, systemd unit files,
pam files, setup script, etc., which are very RPM specific.

Regards,
-- 
Devrim Gündüz
Open Source Solution Architect, PostgreSQL Major Contributor
Twitter: @DevrimGunduz , @DevrimGunduzTR




Re: Security lessons from liblzma

2024-03-31 Thread Joe Conway

On 3/30/24 21:52, Bruce Momjian wrote:

On Sat, Mar 30, 2024 at 07:54:00PM -0400, Joe Conway wrote:

Virtually every RPM source, including ours, contains out of tree patches
that get applied on top of the release tarball. At least for the PGDG
packages, it would be nice to integrate them into our git repo as build
options or whatever so that the packages could be built without any patches
applied to it. Add a tarball that is signed and traceable back to the git
tag, and we would be in a much better place than we are now.


How would someone access the out-of-tree patches?  I think Debian
includes the patches in its source tarball.


I am saying maybe those patches should be eliminated in favor of our 
tree including build options that would produce the same result.


For example, these patches are applied to our release tarball files when 
the RPM is being built for pg16 on RHEL 9:


-
https://git.postgresql.org/gitweb/?p=pgrpms.git;a=blob;f=rpm/redhat/main/non-common/postgresql-16/main/postgresql-16-rpm-pgsql.patch;h=d9b6d12b7517407ac81352fa325ec91b05587641;hb=HEAD

https://git.postgresql.org/gitweb/?p=pgrpms.git;a=blob;f=rpm/redhat/main/non-common/postgresql-16/main/postgresql-16-var-run-socket.patch;h=f2528efaf8f4681754b20283463eff3e14eedd39;hb=HEAD

https://git.postgresql.org/gitweb/?p=pgrpms.git;a=blob;f=rpm/redhat/main/non-common/postgresql-16/main/postgresql-16-conf.patch;h=da28ed793232316dd81fdcbbe59a6479b054a364;hb=HEAD

https://git.postgresql.org/gitweb/?p=pgrpms.git;a=blob;f=rpm/redhat/main/non-common/postgresql-16/main/postgresql-16-perl-rpath.patch;h=748c42f0ec2c9730af3143e90e5b205c136f40d9;hb=HEAD
-

Nothing too crazy, but wouldn't it be better if no patches were required 
at all?


Ideally we should have reproducible builds so that starting with our 
tarball (which is traceable back to the git release tag) one can easily 
obtain the same binary as what the RPMs/DEBs deliver.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Security lessons from liblzma

2024-03-31 Thread Michael Banck
On Sat, Mar 30, 2024 at 09:52:47PM -0400, Bruce Momjian wrote:
> On Sat, Mar 30, 2024 at 07:54:00PM -0400, Joe Conway wrote:
> > Virtually every RPM source, including ours, contains out of tree patches
> > that get applied on top of the release tarball. At least for the PGDG
> > packages, it would be nice to integrate them into our git repo as build
> > options or whatever so that the packages could be built without any patches
> > applied to it. Add a tarball that is signed and traceable back to the git
> > tag, and we would be in a much better place than we are now.
> 
> How would someone access the out-of-tree patches?  I think Debian
> includes the patches in its source tarball.

If you ask where they are maintained, the answer is here:

https://salsa.debian.org/postgresql/postgresql/-/tree/17/debian/patches?ref_type=heads

the other major versions have their own branch.


Michael




Re: Security lessons from liblzma

2024-03-30 Thread Bruce Momjian
On Sat, Mar 30, 2024 at 07:54:00PM -0400, Joe Conway wrote:
> Virtually every RPM source, including ours, contains out of tree patches
> that get applied on top of the release tarball. At least for the PGDG
> packages, it would be nice to integrate them into our git repo as build
> options or whatever so that the packages could be built without any patches
> applied to it. Add a tarball that is signed and traceable back to the git
> tag, and we would be in a much better place than we are now.

How would someone access the out-of-tree patches?  I think Debian
includes the patches in its source tarball.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Security lessons from liblzma

2024-03-30 Thread Tom Lane
Joe Conway  writes:
> On 3/30/24 19:54, Joe Conway wrote:
>> Tom follows this, at least last time I checked:
>> https://wiki.postgresql.org/wiki/Release_process

> Reading through that, I wonder if this part is true anymore:

>In principle this could be done anywhere, but again there's a concern
>about reproducibility, since the results may vary depending on
>installed bison, flex, docbook, etc versions. Current practice is to
>always do this as pgsql on borka.postgresql.org, so it can only be
>done by people who have a login there. In detail:

The reproducibility argument would still apply to the docs (in
whatever form we're packaging them), but hopefully not to the
basic source tarball.

> Maybe if we split out the docs from the release tarball, we could also 
> add the script (mk-release) to our git repo?

If memory serves, the critical steps are already in our source tree,
as "make dist" (but I'm not sure how that's going to work if we want
to get away from using autoconf/make).  It's not clear to me how much
of the rest of mk-release is relevant to people who might be trying to
generate things elsewhere.  I'd like mk-release to continue to work
for older branches, too, so it's going to be some sort of hybrid mess
for a few years here.

regards, tom lane




Re: Security lessons from liblzma

2024-03-30 Thread Joe Conway

On 3/30/24 19:54, Joe Conway wrote:

On 2024-03-30 16:50:26 -0400, Robert Haas wrote:

or what Tom does when he builds the release tarballs.


Tom follows this, at least last time I checked:

https://wiki.postgresql.org/wiki/Release_process


Reading through that, I wonder if this part is true anymore:

  In principle this could be done anywhere, but again there's a concern
  about reproducibility, since the results may vary depending on
  installed bison, flex, docbook, etc versions. Current practice is to
  always do this as pgsql on borka.postgresql.org, so it can only be
  done by people who have a login there. In detail:

Maybe if we split out the docs from the release tarball, we could also 
add the script (mk-release) to our git repo?


Some other aspects of that wiki page look out of date too. Perhaps it 
needs an overall update? Maybe Tom and/or Magnus could weigh in here.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Security lessons from liblzma

2024-03-30 Thread Joe Conway

On 3/30/24 17:12, Andres Freund wrote:

Hi,

On 2024-03-30 16:50:26 -0400, Robert Haas wrote:

We might also want to move toward signing commits and tags. One of the
meson maintainers was recommending that on-list not long ago.


I don't know how valuable the commit signing really is, but I strongly agree
that we should sign both tags and tarballs.


+1



We should think about weaknesses that might occur during the packaging
process, too. If someone who alleges that their packaging PG is really
packaging PG w/badstuff123.patch, how would we catch that?


I don't think we realistically can. The environment, configure and compiler
options will influence things too much to do any sort of automatic
verification.

But that shouldn't stop us from ensuring that at least the packages
distributed via *.postgresql.org are reproducibly built.

Another good avenue for introducing an attack would be to propose some distro
specific changes to the packaging teams - there's a lot fewer eyes there.  I
think it might be worth working with some of our packagers to integrate more
of their changes into our tree.


Huge +1 to that. I have thought many times, and even said to Devrim, a 
huge number of people trust him to not be evil.


Virtually every RPM source, including ours, contains out of tree patches 
that get applied on top of the release tarball. At least for the PGDG 
packages, it would be nice to integrate them into our git repo as build 
options or whatever so that the packages could be built without any 
patches applied to it. Add a tarball that is signed and traceable back 
to the git tag, and we would be in a much better place than we are now.



I can't for example verify what the infrastructure team is doing,


Not sure what you feel like you should be able to follow -- anything 
specific?



or what Tom does when he builds the release tarballs.


Tom follows this, at least last time I checked:

https://wiki.postgresql.org/wiki/Release_process


This one however, I think we could improve upon. Making sure the tarball
generation is completely binary reproducible and providing means of checking
that would surely help. This will be a lot easier if we, as dicussed
elsewhere, I believe, split out the generated docs into a separately
downloadable archive. We already stopped including other generated files
recently.


again, big +1


We shouldn't make the mistake of assuming that bad things can't happen to
us.


+1


and again with the +1 ;-)

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Security lessons from liblzma

2024-03-30 Thread Bruce Momjian
On Sat, Mar 30, 2024 at 04:50:26PM -0400, Robert Haas wrote:
> On Fri, Mar 29, 2024 at 7:00 PM Andres Freund  wrote:
> > I am doubtful that every committer would find something sneaky hidden in
> > e.g. one of the test changes in a large commit. It's not too hard to hide
> > something sneaky. I comparison to that hiding something in configure.ac 
> > seems
> > less likely to succeed IMO, that imo tends to be more scrutinized. And 
> > hiding
> > just in configure directly wouldn't get you far, it'd just get removed when
> > the committer or some other committer at a later time, regenerates 
> > configure.
> 
> I agree with this. If I were trying to get away with a malicious
> commit, I'd look for files that other people would be unlikely to
> examine closely, or would have difficulty examining closely. Test data
> or test scripts seem like great possibilities. And I also would like
> it to be part of some relatively large commit that is annoying to read
> through visually. We don't have a lot of binary format files in the
> tree, which is good, but there's probably some things like Unicode
> tables and ECPG expected output files that very, very few people ever
> actually examine. If we had a file in the tree that looked based on
> the name like an expected output file for a test, but there was no
> corresponding test, how many of us would notice that? How many of us
> would scrutinize it? Imagine hiding something bad in the middle of
> that file somewhere.

So, in this case, the hooks were in 'configure', but not configure.ac,
and the exploit was in a test file which was in the tarball but _not_ in
the git tree.  So, they used the obfuscation of 'configure's syntax, and
the lack of git oversight by not putting the test files in the git tree.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Security lessons from liblzma

2024-03-30 Thread Andres Freund
Hi,

On 2024-03-30 16:50:26 -0400, Robert Haas wrote:
> We might also want to move toward signing commits and tags. One of the
> meson maintainers was recommending that on-list not long ago.

I don't know how valuable the commit signing really is, but I strongly agree
that we should sign both tags and tarballs.


> We should think about weaknesses that might occur during the packaging
> process, too. If someone who alleges that their packaging PG is really
> packaging PG w/badstuff123.patch, how would we catch that?

I don't think we realistically can. The environment, configure and compiler
options will influence things too much to do any sort of automatic
verification.

But that shouldn't stop us from ensuring that at least the packages
distributed via *.postgresql.org are reproducibly built.

Another good avenue for introducing an attack would be to propose some distro
specific changes to the packaging teams - there's a lot fewer eyes there.  I
think it might be worth working with some of our packagers to integrate more
of their changes into our tree.


> I can't for example verify what the infrastructure team is doing, or what
> Tom does when he builds the release tarballs.

This one however, I think we could improve upon. Making sure the tarball
generation is completely binary reproducible and providing means of checking
that would surely help. This will be a lot easier if we, as dicussed
elsewhere, I believe, split out the generated docs into a separately
downloadable archive. We already stopped including other generated files
recently.


> We shouldn't make the mistake of assuming that bad things can't happen to
> us.

+1

Greetings,

Andres Freund




Re: Security lessons from liblzma

2024-03-30 Thread Robert Haas
On Fri, Mar 29, 2024 at 7:00 PM Andres Freund  wrote:
> I am doubtful that every committer would find something sneaky hidden in
> e.g. one of the test changes in a large commit. It's not too hard to hide
> something sneaky. I comparison to that hiding something in configure.ac seems
> less likely to succeed IMO, that imo tends to be more scrutinized. And hiding
> just in configure directly wouldn't get you far, it'd just get removed when
> the committer or some other committer at a later time, regenerates configure.

I agree with this. If I were trying to get away with a malicious
commit, I'd look for files that other people would be unlikely to
examine closely, or would have difficulty examining closely. Test data
or test scripts seem like great possibilities. And I also would like
it to be part of some relatively large commit that is annoying to read
through visually. We don't have a lot of binary format files in the
tree, which is good, but there's probably some things like Unicode
tables and ECPG expected output files that very, very few people ever
actually examine. If we had a file in the tree that looked based on
the name like an expected output file for a test, but there was no
corresponding test, how many of us would notice that? How many of us
would scrutinize it? Imagine hiding something bad in the middle of
that file somewhere.

Maybe we need some kind of tool that scores files for risk. Longer
files are riskier. Binary files are riskier, as are text files that
are something other than plain English/C code/SGML. Files that haven't
changed in a long time are not risky, but files with few recent
changes are riskier than files with many recent changes, especially if
only 1 or 2 committers made all of those changes, and especially if
those commits also touched a lot of other files. Of course, if we had
a tool like this that were public, I suppose anyone targeting PG would
look at the tool and try to find ways around its heuristics. But maybe
we should have something and not disclose the whole algorithm
publicly, or even if we do disclose it all, having something is
probably better than having nothing. It might force a hypothetical bad
actor to do things that would be more likely to be noticed by the
humans.

We might also want to move toward signing commits and tags. One of the
meson maintainers was recommending that on-list not long ago.

We should think about weaknesses that might occur during the packaging
process, too. If someone who alleges that their packaging PG is really
packaging PG w/badstuff123.patch, how would we catch that?

An awful lot of what we do operates on the principle that we know the
people who are involved and trust them, and I'm glad we do trust them,
but the world is full of people who trusted somebody too much and
regretted it afterwards. The fact that we have many committers rather
than a single maintainer probably reduces risk at least as far as the
source repository is concerned, because there are more people paying
attention to potentially notice something that isn't as it should be.
But it's also more potential points of compromise, and a lot of things
outside of that repository are not easy to audit. I can't for example
verify what the infrastructure team is doing, or what Tom does when he
builds the release tarballs. It seems like a stretch to imagine
someone taking over Tom's online identity while simultaneously
rendering him incommunicado ... but at the same time, the people
behind this attack obviously put a lot of work into it and had a lot
of resources available to craft the attack. We shouldn't make the
mistake of assuming that bad things can't happen to us.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Security lessons from liblzma

2024-03-29 Thread Tom Lane
Daniel Gustafsson  writes:
> One take-away for me is how important it is to ship recipes for regenerating
> any testdata which is included in generated/compiled/binary format.

IMO that's a hard, no-exceptions requirement just for
maintainability's sake, never mind security risks.

regards, tom lane




Re: Security lessons from liblzma

2024-03-29 Thread Daniel Gustafsson
> On 29 Mar 2024, at 23:59, Andres Freund  wrote:
> On 2024-03-29 18:37:24 -0400, Bruce Momjian wrote:

>> Now, we don't take pull requests, and all our committers are known
>> individuals, but this might have cautionary lessons for us.
> 
> I am doubtful that every committer would find something sneaky hidden in
> e.g. one of the test changes in a large commit. It's not too hard to hide
> something sneaky.

One take-away for me is how important it is to ship recipes for regenerating
any testdata which is included in generated/compiled/binary format.  Kind of
how we in our tree ship the config for test TLS certificates and keys which can
be manually inspected, and used to rebuild the testdata (although the risk for
injections in this particular case seems low).  Bad things can still be
injected, but formats which allow manual review at least goes some way towards
lowering risk.

--
Daniel Gustafsson





Re: Security lessons from liblzma

2024-03-29 Thread Andres Freund
Hi,

On 2024-03-29 18:37:24 -0400, Bruce Momjian wrote:
> You might have seen reports today about a very complex exploit added to
> recent versions of liblzma.  Fortunately, it was only enabled two months
> ago and has not been pushed to most stable operating systems like Debian
> and Ubuntu.  The original detection report is:
> 
> https://www.openwall.com/lists/oss-security/2024/03/29/4
> 
> And this ycombinator discussion has details:
> 
> https://news.ycombinator.com/item?id=39865810
> 
> It looks like an earlier commit with a binary blob "test data"
> contained the bulk of the backdoor, then the configure script
> enabled it, and then later commits patched up valgrind errors
> caused by the backdoor. See the commit links in the "Compromised
> Repository" section.
> 
> and I think the configure came in through the autoconf output file
> 'configure', not configure.ac:
>
> This is my main take-away from this. We must stop using upstream
> configure and other "binary" scripts. Delete them all and run
> "autoreconf -fi" to recreate them. (Debian already does something
> like this I think.)

I don't think that's a useful lesson here, actually. In this case, if you ran
autoreconf -fi in a released tarball, it'd just recreate precisely what
the tarball already contained, including the exploit.

I think the issue in this case rather was that the tarball contains files that
are not in the release - a lot of them.  The attackers injected the
'activating' part of the backdoor into the release tarball, while it was not
present in the git tree. They did that because they knew that distributions
often build from release tarballs.

If the release pre-backdoor release tarball had been identical to the git
repository, this would likely have been noticed by packagers - but it was
normal for there to be a lot of new files.

We traditionally had also a lot of generated files in the tarball that weren't
in our git tree - but we removed a lot of that a few months ago, when we
stopped including bison/flex/other generated code.

We still include generated docs, but that's much harder to exploit at
scale.  However, they still make it harder to verify that the release is
exactly the same as the git tree.


> Now, we don't take pull requests, and all our committers are known
> individuals, but this might have cautionary lessons for us.

I am doubtful that every committer would find something sneaky hidden in
e.g. one of the test changes in a large commit. It's not too hard to hide
something sneaky. I comparison to that hiding something in configure.ac seems
less likely to succeed IMO, that imo tends to be more scrutinized. And hiding
just in configure directly wouldn't get you far, it'd just get removed when
the committer or some other committer at a later time, regenerates configure.

Greetings,

Andres Freund




Re: Security lessons from liblzma

2024-03-29 Thread Thomas Munro
On Sat, Mar 30, 2024 at 11:37 AM Bruce Momjian  wrote:
> You might have seen reports today about a very complex exploit added to
> recent versions of liblzma.  Fortunately, it was only enabled two months
> ago and has not been pushed to most stable operating systems like Debian
> and Ubuntu.  The original detection report is:
>
> https://www.openwall.com/lists/oss-security/2024/03/29/4

Incredible work from Andres.  The attackers made a serious strategic
mistake: they made PostgreSQL slightly slower.




Security lessons from liblzma

2024-03-29 Thread Bruce Momjian
You might have seen reports today about a very complex exploit added to
recent versions of liblzma.  Fortunately, it was only enabled two months
ago and has not been pushed to most stable operating systems like Debian
and Ubuntu.  The original detection report is:

https://www.openwall.com/lists/oss-security/2024/03/29/4

And this ycombinator discussion has details:

https://news.ycombinator.com/item?id=39865810

It looks like an earlier commit with a binary blob "test data"
contained the bulk of the backdoor, then the configure script
enabled it, and then later commits patched up valgrind errors
caused by the backdoor. See the commit links in the "Compromised
Repository" section.

and I think the configure came in through the autoconf output file
'configure', not configure.ac:

This is my main take-away from this. We must stop using upstream
configure and other "binary" scripts. Delete them all and run
"autoreconf -fi" to recreate them. (Debian already does something
like this I think.)

Now, we don't take pull requests, and all our committers are known
individuals, but this might have cautionary lessons for us.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.