Re: Glibc 2.28 breaks collation for PostgreSQL (and others?)

2019-04-08 Thread Christoph Berg
Re: Paul Gevers 2019-04-06 
> Regarding this PostgreSQL reindexing issue, is there anything we need to
> mention in the release-notes? If this isn't fleshed out, but the most
> likely answer is yes, then I'd appreciate receiving a bug against
> release-notes to remind us about it later on. Text can come later when
> it is clear what needs to be done.

Opened #926627 for that.

Note that I still need input on how to raise the message on the
packaging side.

Christoph



Re: Glibc 2.28 breaks collation for PostgreSQL (and others?)

2019-04-06 Thread Paul Gevers
Dear all,

Regarding this PostgreSQL reindexing issue, is there anything we need to
mention in the release-notes? If this isn't fleshed out, but the most
likely answer is yes, then I'd appreciate receiving a bug against
release-notes to remind us about it later on. Text can come later when
it is clear what needs to be done.

Paul



Re: Glibc 2.28 breaks collation for PostgreSQL (and others?)

2019-03-26 Thread Philipp Kern
On 3/26/2019 3:20 PM, Christoph Berg wrote:
> We were thinking about doing something like that, but that doesn't
> work for the general case - most libc upgrades do not break
> everything, and reindexing would be overkill. It might help for the
> 2.28 upgrade, but getting this to work consistently would require lots
> of scripting with lots of corner cases to cover. I don't think it is
> possible to get this working reliably now, especially as we would need
> to push that "fix" into stretch-proposed-updates as well. (Because
> libc6 will likely be upgraded first, before the new postgresql-common
> version could take action.)

Technically the latter could be solved by libc6 in testing adding a
breaks on postgresql-common. As neither postgresql-common nor
postgresql-client-common seem to depend on libc6 at all, it doesn't
immediately seem crazy to me to do that.

But I don't dispute that doing this properly could be complex. It's
unfortunate that this came up so late, given that it was already a
problem for users of testing.

Kind regards
Philipp Kern



Re: Glibc 2.28 breaks collation for PostgreSQL (and others?)

2019-03-26 Thread Christoph Berg
Re: Philipp Kern 2019-03-26 <66988de0-f9be-14c0-6b64-df64261fe...@philkern.de>
> I suspect this is why MySQL keeps a whole zoo of collations internally
> that never change.

DB2 and Oracle bundle ICU for that reason, afaict. (But bundling
software has other problems, as we all know...)

> Is there a way to have a startup script check for this case on the
> next (re)start and then reindex automatically, at the expense of
> greatly extended downtime? Say, as a first iteration, with a flag file
> that records the glibc major version at the time of the last restart?

We were thinking about doing something like that, but that doesn't
work for the general case - most libc upgrades do not break
everything, and reindexing would be overkill. It might help for the
2.28 upgrade, but getting this to work consistently would require lots
of scripting with lots of corner cases to cover. I don't think it is
possible to get this working reliably now, especially as we would need
to push that "fix" into stretch-proposed-updates as well. (Because
libc6 will likely be upgraded first, before the new postgresql-common
version could take action.)

> That's at least better than silent data corruption, even if still
> disruptive. On the other hand I guess you'd need to start the cluster
> for serving anyway for reindex to work and would then serve broken data
> in the meantime, too?

That's part of the problem, yes.

Christoph



Re: Glibc 2.28 breaks collation for PostgreSQL (and others?)

2019-03-26 Thread Philipp Kern
On 3/26/2019 9:45 AM, Christoph Berg wrote:
> Unfortunately not. PostgreSQL supports ICU, but not as the global
> locale for clusters/databases, which is still libc only. And even if
> it were supported, it would not be the default, and we would still be
> breaking all existing installations.

I suspect this is why MySQL keeps a whole zoo of collations internally
that never change.

>>> I've been thinking about this for some time, and the best I could come
>>> up with so far is "raise a debconf note that people need to invoke REINDEX
>>> DATABASE". The open question about this plan is, how should this note
>>> be triggered.
>>
>> That might not work for unique indices because locale data changes
>> could cause strings to sort the same that were distinct before the
>> update.
> 
> Well, that's not an argument for silently doing nothing. And I doubt
> that this case even exists: for any two distinct strings, the
> collation should output a consistent "less than" or "greater than"
> answer.
> 
> I forgot to mention Plan 3: Mention this in the release notes.
> That should be done anyway, the question being if that is enough.
> My suspicion is that few people actually read the release notes, so
> some notification from inside the system would be needed as well.
> Be it a debconf note, and/or a NEWS.Debian entry somewhere.

Is there a way to have a startup script check for this case on the
next (re)start and then reindex automatically, at the expense of
greatly extended downtime? Say, as a first iteration, with a flag file
that records the glibc major version at the time of the last restart?

That's at least better than silent data corruption, even if still
disruptive. On the other hand I guess you'd need to start the cluster
for serving anyway for reindex to work and would then serve broken data
in the meantime, too?
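
A minimal sketch of such a flag-file check, for illustration only. The
function names and the flag file location are hypothetical (nothing like
this exists in postgresql-common), and a missing flag file is treated as
"changed", which would also fire on fresh installations:

```shell
# Hypothetical helpers for the flag-file idea; names and paths are assumptions.

# Return 0 if the recorded libc version differs from the current one
# (or if nothing was recorded yet), 1 if it is unchanged.
#   $1 = flag file path
#   $2 = current libc version string, e.g. from: getconf GNU_LIBC_VERSION
libc_changed_since_last_start() {
    flag_file=$1
    current=$2
    if [ -f "$flag_file" ] && [ "$(cat "$flag_file")" = "$current" ]; then
        return 1
    fi
    return 0
}

# Record the current libc version for the next comparison.
#   $1 = flag file path, $2 = current libc version string
record_libc_version() {
    printf '%s\n' "$2" > "$1"
}
```

A startup script could call the check with "$(getconf GNU_LIBC_VERSION)"
as the second argument, reindex if it returns 0, and record the version
only once the reindex has finished.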

Kind regards
Philipp Kern



Re: Glibc 2.28 breaks collation for PostgreSQL (and others?)

2019-03-26 Thread Christoph Berg
Re: Florian Weimer 2019-03-25 <87o95yhp3h@mid.deneb.enyo.de>
> > For PostgreSQL, this means that the ordering of indexes on disk is
> > becoming corrupt, and all "text" (varchar, char, ...) indexes need to
> > be rebuilt. (And worse, if that is not done immediately, the tables
> > might become corrupt because some tuples aren't index-visible anymore
> > due to the incorrect btree ordering.)
> 
> That's fairly normal in a glibc update.  glibc upstream prefers it
> this way.  I've discussed it several times with other glibc
> maintainers.

Changes are normal. What's not normal here is the scale of the
changes: indexes will break for virtually all users.

> My understanding is that ICU provides versioned collation tables,
> which would allow you to avoid this issue.

Unfortunately not. PostgreSQL supports ICU, but not as the global
locale for clusters/databases, which is still libc only. And even if
it were supported, it would not be the default, and we would still be
breaking all existing installations.

> > I've been thinking about this for some time, and the best I could come
> > up with so far is "raise a debconf note that people need to invoke REINDEX
> > DATABASE". The open question about this plan is, how should this note
> > be triggered.
> 
> That might not work for unique indices because locale data changes
> could cause strings to sort the same that were distinct before the
> update.

Well, that's not an argument for silently doing nothing. And I doubt
that this case even exists: for any two distinct strings, the
collation should output a consistent "less than" or "greater than"
answer.

I forgot to mention Plan 3: Mention this in the release notes.
That should be done anyway, the question being if that is enough.
My suspicion is that few people actually read the release notes, so
some notification from inside the system would be needed as well.
Be it a debconf note, and/or a NEWS.Debian entry somewhere.

I deem this to be release-critical for PostgreSQL users. The reason
I'm asking here is to get input on which plan is best.

Christoph



Re: Glibc 2.28 breaks collation for PostgreSQL (and others?)

2019-03-25 Thread Florian Weimer
* Christoph Berg:

> with the update to glibc 2.28, collation aka sort ordering is
> changing:
>
> $ echo $LANG
> de_DE.utf8
> $ (echo 'a-a'; echo 'a a'; echo 'a+a'; echo 'aa') | sort
>
> stretch:
>   aa
>   a a
>   a-a
>   a+a
>
> buster:
>   a a
>   a+a
>   a-a
>   aa
>
> A vast number of locales is affected, including en_US, possibly all of
> them.
>
> For PostgreSQL, this means that the ordering of indexes on disk is
> becoming corrupt, and all "text" (varchar, char, ...) indexes need to
> be rebuilt. (And worse, if that is not done immediately, the tables
> might become corrupt because some tuples aren't index-visible anymore
> due to the incorrect btree ordering.)

That's fairly normal in a glibc update.  glibc upstream prefers it
this way.  I've discussed it several times with other glibc
maintainers.

My understanding is that ICU provides versioned collation tables,
which would allow you to avoid this issue.

> I've been thinking about this for some time, and the best I could come
> up with so far is "raise a debconf note that people need to invoke REINDEX
> DATABASE". The open question about this plan is, how should this note
> be triggered.

That might not work for unique indices because locale data changes
could cause strings to sort the same that were distinct before the
update.



Glibc 2.28 breaks collation for PostgreSQL (and others?)

2019-03-25 Thread Christoph Berg
Hi,

with the update to glibc 2.28, collation aka sort ordering is
changing:

$ echo $LANG
de_DE.utf8
$ (echo 'a-a'; echo 'a a'; echo 'a+a'; echo 'aa') | sort

stretch:
  aa
  a a
  a-a
  a+a

buster:
  a a
  a+a
  a-a
  aa

A vast number of locales is affected, including en_US, possibly all of
them.

For PostgreSQL, this means that the ordering of indexes on disk is
becoming corrupt, and all "text" (varchar, char, ...) indexes need to
be rebuilt. (And worse, if that is not done immediately, the tables
might become corrupt because some tuples aren't index-visible anymore
due to the incorrect btree ordering.)

https://postgresql.verite.pro/blog/2018/08/27/glibc-upgrade.html
https://www.postgresql.org/message-id/9cbd8ba7-899f-4ed3-92b1-902b0d245...@manitou-mail.org

The PostgreSQL project is discussing how this could be handled inside
the database, but a) it's totally unclear how this could be detected
generically, not just for this set of test strings, and b) Debian
needs a fix now, not something that might appear in PostgreSQL 12 or 13.
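
For this particular set of test strings, a change can at least be
spotted by recording the sorted order on one release and comparing it on
the next. A rough sketch follows; the function name is hypothetical, and
the check says nothing about strings outside the probe set, so it does
not solve the generic detection problem:

```shell
# Hypothetical helper: sort the four probe strings from the example above
# under the given locale and print them joined by "|", so the result can
# be stored on one release and compared on another.
#   $1 = locale name, e.g. de_DE.utf8
# Caveat: if the locale is not installed, sort silently falls back to
# the C locale.
collation_fingerprint() {
    printf '%s\n' 'a-a' 'a a' 'a+a' 'aa' | LC_ALL=$1 sort | paste -s -d '|' -
}
```

Running "collation_fingerprint de_DE.utf8" before and after the upgrade
and comparing the two strings would flag the change shown above.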

I've been thinking about this for some time, and the best I could come
up with so far is "raise a debconf note that people need to invoke REINDEX
DATABASE". The open question about this plan is, how should this note
be triggered.
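
To give an idea of what the user would have to run, here is a sketch
that prints (but does not execute) one reindexdb call per running
cluster. It assumes the column layout of "pg_lsclusters -h" (version,
cluster name, port, status, ...) and that reindexdb is invoked through
pg_wrapper, so that --cluster is accepted; the function name is
hypothetical:

```shell
# Hypothetical helper: read "pg_lsclusters -h"-style lines on stdin and
# print a reindexdb command for every cluster whose status is exactly
# "online". Clusters that are down or in recovery are skipped.
print_reindex_commands() {
    while read -r ver cluster port status rest; do
        if [ "$status" = online ]; then
            printf 'reindexdb --all --cluster %s/%s\n' "$ver" "$cluster"
        fi
    done
}
```

Usage would be "pg_lsclusters -h | print_reindex_commands"; the printed
commands would then have to be run as the database superuser.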

Plan 1: Check whether there are any PostgreSQL clusters in
/etc/postgresql/, and if so raise the warning from locales.postinst and
locales-all.postinst.

Plan 2: Add a trigger to postgresql-common that checks if
locales(-all) are being upgraded, and raise the warning from there.
(This plan has the downside that we'd need to fix postgresql-common in
stretch to have the same check.)

Plan 1 looks much better.
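
The check for Plan 1 could be sketched roughly as follows, assuming the
usual /etc/postgresql/VERSION/CLUSTER/postgresql.conf layout; the
function name is hypothetical and the debconf part is left out:

```shell
# Hypothetical helper for a locales postinst: return 0 if at least one
# cluster configuration exists below the given root, 1 otherwise.
#   $1 = configuration root (defaults to /etc/postgresql)
have_postgresql_clusters() {
    root=${1:-/etc/postgresql}
    for conf in "$root"/*/*/postgresql.conf; do
        # An unmatched glob stays literal, so test that the file exists.
        [ -e "$conf" ] && return 0
    done
    return 1
}
```

The postinst would raise the note only when this returns 0, so systems
without any PostgreSQL cluster are not bothered.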

I'm sorry that I didn't raise this earlier; I had hoped to come up
with a smarter solution that would spare users from having to run
commands manually.

Does that make sense? Are there any options that I missed? Are there
any other packages affected? How do we proceed?

Christoph