Help. The database was created using collation version 2.17, but the operating system provides version 2.34.

2024-06-20 Thread Dmitry O Litvintsev
Hello, 

I am in the process of migrating a DB to an Alma9 host. The database 
is rather large - a few TB. 

I have run pg_basebackup on Alma9 host and established replication from 
production to it. The idea is to quickly switch from master to this new host 
during downtime. 

Establishing replication went fine. Source postgresql version is 15.6, 
destination is 15.7

When I psql into replica I get:

WARNING:  database "xxx" has a collation version mismatch
DETAIL:  The database was created using collation version 2.17, but the 
operating system provides version 2.34.
HINT:  Rebuild all objects in this database that use the default collation and 
run ALTER DATABASE xxx REFRESH COLLATION VERSION, or build PostgreSQL with the 
right library version.

Looking up the issue the solution seems to be 

  REINDEX database xxx 
  ALTER DATABASE xxx REFRESH COLLATION VERSION

But this defeats the whole idea of having short downtime because REINDEX will 
take forever. 

What is this "or build PostgreSQL with the right library version"?
Is this about 15.7 vs 15.6 or is it about different glibc version between RH7 
and Alma9?

Is there a better way to handle it? I cannot afford long downtime. 
This came up rather unexpectedly and I am now in a tight situation, having to 
find a solution fast. I do not recall having a similar issue when going from RH6 
to RH7. 

Thank you for your help.




Re: Help. The database was created using collation version 2.17, but the operating system provides version 2.34.

2024-06-20 Thread Achilleas Mantzios - cloud

hi

On 6/20/24 10:23, Dmitry O Litvintsev wrote:

Hello,

I am in the process of migrating a DB to an Alma9 host. The database
is rather large - a few TB.

I have run pg_basebackup on Alma9 host and established replication from 
production to it. The idea is to quickly switch from master to this new host 
during downtime.

Establishing replication went fine. Source postgresql version is 15.6, 
destination is 15.7


Do you mean physical or logical replication? In the case of logical, how did 
you run initdb?


Did you build postgresql from source or using a RH package ?

sorry for not being able to provide anything helpful.



When I psql into replica I get:

WARNING:  database "xxx" has a collation version mismatch
DETAIL:  The database was created using collation version 2.17, but the 
operating system provides version 2.34.
HINT:  Rebuild all objects in this database that use the default collation and 
run ALTER DATABASE xxx REFRESH COLLATION VERSION, or build PostgreSQL with the 
right library version.

Looking up the issue the solution seems to be

   REINDEX database xxx
   ALTER DATABASE xxx REFRESH COLLATION VERSION

But this defeats the whole idea of having short downtime because REINDEX will 
take forever.

What is this "or build PostgreSQL with the right library version"?
Is this about 15.7 vs 15.6 or is it about different glibc version between RH7 
and Alma9?

Is there a better way to handle it? I cannot afford long downtime.
This came up rather unexpectedly and I am now in a tight situation having to 
find solution fast. I do not recall having similar issue when going from RH6 to 
RH7.

Thank you for your help.







Postgresql python in upgraded version 16.2

2024-06-20 Thread Šika Michal
TLP:AMBER

Hello all,
I upgraded Postgres from 13.5 to the newest 16.2, but Python was not 
upgraded.

I created this function pyver():
CREATE OR REPLACE FUNCTION pyver ()
 RETURNS TEXT
 AS $$
 import sys
 pyversion = sys.version
 return pyversion
 $$ LANGUAGE 'plpython3u';

On the Postgres 13.5 I get this result:
# psql -d database
psql (13.5)
Type "help" for help.

postgres@database # select pyver();
                  pyver
-------------------------------------------
 3.6.8 (default, Aug 13 2020, 07:36:02) +
 [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]
(1 row)

postgres@database #

On the Postgres 16.2 (upgraded from version 13.5) I get this result:
# psql -d database
psql (16.2)
Type "help" for help.

database=# select pyver();
                  pyver
-------------------------------------------
 3.6.8 (default, Jan  5 2024, 09:14:44) +
 [GCC 8.5.0 20210514 (Red Hat 8.5.0-20)]
(1 row)

database=#

It seems Python was not upgraded when I upgraded PostgreSQL. Is it possible to 
upgrade Python to the current 3.9.x version installed on RedHat 8.5?

Thanks
Michal


TLP:AMBER


Re: Help. The database was created using collation version 2.17, but the operating system provides version 2.34.

2024-06-20 Thread Kenneth Barber
> I am in the process of migrating a DB to an Alma9 host. The database
> is rather large - a few TB.
>
> I have run pg_basebackup on Alma9 host and established replication from 
> production to it. The idea is to quickly switch from master to this new host 
> during downtime.
>
> Establishing replication went fine. Source postgresql version is 15.6, 
> destination is 15.7

What replication did you use? If it's streaming replication, you will have
this problem. If you can use logical replication, something like

https://github.com/dimitri/pgcopydb

will let you use the logical-replication trick to lower downtime. It involves
copying the database using a parallel transfer (with some pg_dump/pg_restore
wrapping for the structural objects), and using logical replication to keep
the target up to date. It lets you keep writing to your source DB while it
moves, but at the cost of disabling DDL while it's happening.

Look at the pgcopydb clone --follow documentation for more info.

I'm sure there are other options people can recommend also.

I would test it thoroughly beforehand to make sure it's a fit.
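A minimal sketch of that invocation, assuming pgcopydb is installed and with placeholder connection strings (check the pgcopydb documentation for the full cutover procedure):

```shell
# Hedged sketch: initial parallel copy plus change-data-capture follow-up;
# the final cutover then happens during a short downtime window.
pgcopydb clone --follow \
  --source "postgres://user@old-rh7-host/xxx" \
  --target "postgres://user@new-alma9-host/xxx"
```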

> When I psql into replica I get:
>
> WARNING:  database "xxx" has a collation version mismatch
> DETAIL:  The database was created using collation version 2.17, but the 
> operating system provides version 2.34.
> HINT:  Rebuild all objects in this database that use the default collation 
> and run ALTER DATABASE xxx REFRESH COLLATION VERSION, or build PostgreSQL 
> with the right library version.
>
> Looking up the issue the solution seems to be
>
>   REINDEX database xxx
>   ALTER DATABASE xxx REFRESH COLLATION VERSION

Here we've taken two approaches when we just "copied the disks over"
so to speak:

* Backport the collation library (tricky, I can explain this deeper,
but it's tricky)
* Reindex after migration (slow but less tricky)

We had this problem going from xenial to focal, and we had to pin to
something compatible with the xenial libc.

> But this defeats the whole idea of having short downtime because REINDEX will 
> take forever.
>
> What is this "or build PostgreSQL with the right library version"?
> Is this about 15.7 vs 15.6 or is it about different glibc version between RH7 
> and Alma9?
>
> Is there a better way to handle it? I cannot afford long downtime.
> This came up rather unexpectedly and I am now in a tight situation having to 
> find solution fast. I do not recall having similar issue when going from RH6 
> to RH7.

Unfortunately you've hit a bad problem that a few of us have probably
already been through. At least you don't have thousands of these
things :-).

ken.




Re: Help. The database was created using collation version 2.17, but the operating system provides version 2.34.

2024-06-20 Thread Daniel Verite
Dmitry O Litvintsev wrote:

> WARNING:  database "xxx" has a collation version mismatch
> DETAIL:  The database was created using collation version 2.17, but the
> operating system provides version 2.34.
> HINT:  Rebuild all objects in this database that use the default collation
> and run ALTER DATABASE xxx REFRESH COLLATION VERSION, or build PostgreSQL
> with the right library version.

This upgrade spans the major collation change in GNU libc 2.28,
so indeed text indexes created under 2.17 are very likely unsafe to use
on your new server.
See https://wiki.postgresql.org/wiki/Locale_data_changes

>  REINDEX database xxx 
> ALTER DATABASE xxx REFRESH COLLATION VERSION
> But this defeats the whole idea of having short downtime because REINDEX
> will take forever. 

Indexes that don't involve collatable types (text, varchar), and those
that use the C collation, don't need to be reindexed.
You may be able to reduce the downtime significantly by reindexing only
the ones that matter.

The wiki page gives the query to obtain the list of affected indexes:

SELECT DISTINCT indrelid::regclass::text, indexrelid::regclass::text,
       collname, pg_get_indexdef(indexrelid)
FROM (SELECT indexrelid, indrelid, indcollation[i] coll
      FROM pg_index, generate_subscripts(indcollation, 1) g(i)) s
  JOIN pg_collation c ON coll = c.oid
WHERE collprovider IN ('d', 'c') AND collname NOT IN ('C', 'POSIX');
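With that list in hand, a hedged sketch of the targeted rebuild (the index name is hypothetical; REINDEX ... CONCURRENTLY requires PostgreSQL 12+):

```sql
-- Rebuild only the affected indexes, without blocking writes
REINDEX INDEX CONCURRENTLY some_text_idx;  -- hypothetical name; repeat per index
-- Once all affected indexes are rebuilt:
ALTER DATABASE xxx REFRESH COLLATION VERSION;
```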

> I do not recall having similar issue when going from RH6 to RH7.

This warning was added relatively recently, in PostgreSQL 15 (October 2022).


Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite




Re: Help. The database was created using collation version 2.17, but the operating system provides version 2.34.

2024-06-20 Thread Ron Johnson
On Thu, Jun 20, 2024 at 3:23 AM Dmitry O Litvintsev 
wrote:

> Hello,
>
> I am in the process of migrating a DB to an Alma9 host. The database
> is rather large - a few TB.
>
> I have run pg_basebackup on Alma9 host and established replication from
> production to it. The idea is to quickly switch from master to this new
> host during downtime.
>
> Establishing replication went fine. Source postgresql version is 15.6,
> destination is 15.7
>
> When I psql into replica I get:
>
> WARNING:  database "xxx" has a collation version mismatch
> DETAIL:  The database was created using collation version 2.17, but the
> operating system provides version 2.34.
> HINT:  Rebuild all objects in this database that use the default collation
> and run ALTER DATABASE xxx REFRESH COLLATION VERSION, or build PostgreSQL
> with the right library version.
>
> Looking up the issue the solution seems to be
>
>   REINDEX database xxx
>   ALTER DATABASE xxx REFRESH COLLATION VERSION
>
> But this defeats the whole idea of having short downtime because REINDEX
> will take forever.
>
> What is this "or build PostgreSQL with the right library version"?
> Is this about 15.7 vs 15.6 or is it about different glibc version between
> RH7 and Alma9?
>
> Is there a better way to handle it? I cannot afford long downtime.


You "only" need to REINDEX indices with TEXT (including CHAR and VARCHAR)
columns. That may be most of your indices, or very few.

I use this view and query to find such indices:

create or replace view dba.all_indices_types as
select tbcl.relnamespace::regnamespace::text||'.'||tbcl.relname as table_name
     , ndcl.relname as index_name
     , array_agg(ty.typname order by att.attnum) as index_types
from pg_class ndcl
inner join pg_index nd on (ndcl.oid = nd.indexrelid and ndcl.relkind = 'i')
inner join pg_class tbcl on (nd.indrelid = tbcl.oid and tbcl.relkind = 'r')
inner join pg_attribute att on att.attrelid = nd.indexrelid
inner join pg_type ty on att.atttypid = ty.oid
where tbcl.relnamespace::regnamespace::text != 'pg_catalog'
group by tbcl.relnamespace::regnamespace::text||'.'||tbcl.relname
       , ndcl.relname
order by 1, 2;

select *
from dba.all_indices_types
where index_types && '{"text","varchar","char"}';
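As a hedged follow-up, the same view can generate the per-index commands (schema qualification is omitted here, so verify the generated names before running them):

```sql
-- Emit one REINDEX statement per affected index
select format('REINDEX INDEX CONCURRENTLY %I;', index_name)
from dba.all_indices_types
where index_types && '{"text","varchar","char"}';
```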


Re: Transaction issue

2024-06-20 Thread Karsten Hilbert
On Wed, Jun 19, 2024 at 02:32:07PM -0700, Rich Shepard wrote:

> Yes, I see how this works if the transaction is committed. But before I
> commit the transaction I run a select statement to ensure the rows added are
> correct. Can I roll back a committed transaction? I've assumed not, so I won't
> commit the transaction without testing. And I'm not getting a detailed error
> message.

Shot in the dark: are you by any chance using tab completion
when running the SELECT before the COMMIT?

I've seen it happen a few times that tab completion ran
queries behind the scenes which failed and thereby set the
open transaction to the aborted state ...

And, no, I can't reproduce :/

Karsten
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B




Re: Transaction issue

2024-06-20 Thread Rich Shepard

On Thu, 20 Jun 2024, Karsten Hilbert wrote:


Shot in the dark: are you by any chance using tab completion
when running the SELECT before the COMMIT?


Karsten,

Nope. I prepare DDL, DML, and DQL scripts in emacs, then run them from the
psql prompt using \i .

Regards,

Rich




Re: Transaction issue

2024-06-20 Thread Rich Shepard

On Wed, 19 Jun 2024, Ron Johnson wrote:


In addition, manually run the "BEGIN;" before the "\i insert-blarge.sql"
command.

That way, insert-blarge.sql just inserts. My reasoning: since you control
the ROLLBACK, you should also control the BEGIN.


Ron,

Hadn't thought of doing that, but now will. Thanks for the excellent
recommendation.

Regards,

Rich




Re: Transaction issue

2024-06-20 Thread Rob Sargent



> On Jun 20, 2024, at 7:05 AM, Rich Shepard  wrote:
> 
> On Thu, 20 Jun 2024, Karsten Hilbert wrote:
> 
>> Shot in the dark: are you by any chance using tab completion
>> when running the SELECT before the COMMIT ?
> 
> Karsten,
> 
> Nope. I prepare DDL, DML, and DQL scripts in emacs, then run them from the
> psql prompt using \i .
> 

Is psql running in emacs (which is my preferred SQL shell: M-x sql-postgres)?

Also, change all semicolons to \p\g to confirm the error location. Maybe trap 
the output.

> Regards,
> 
> Rich
> 
> 




Re: Transaction issue

2024-06-20 Thread Rich Shepard

On Thu, 20 Jun 2024, Rob Sargent wrote:


Is psql running in emacs (which is my preferred sql shell. M-x sql-postgres)?


Rob,

Not when I'm entering new data or updating existing tables. Otherwise, yes.

Regards,

Rich




Re: Transaction issue

2024-06-20 Thread Adrian Klaver

On 6/20/24 07:47, Rich Shepard wrote:

On Thu, 20 Jun 2024, Rob Sargent wrote:

Is psql running in emacs (which is my preferred sql shell. M-x 
sql-postgres)?


Rob,

Not when I'm entering new data or updating existing tables. Otherwise, yes.


From one of my previous posts (modified):

At this point I think you need to create a simple test case where:


1) You have script with
BEGIN;

COMMIT; --optional


2) In psql do \i 


3) Do what you did before to 'recover' from the error.


Show us the content of the steps in your reply.
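For step 1, the script might look like this minimal sketch (table and values are hypothetical):

```sql
BEGIN;
INSERT INTO t (id, name) VALUES (1, 'test');  -- hypothetical table
SELECT * FROM t WHERE id = 1;                 -- verify before committing
ROLLBACK;                                     -- or COMMIT;
```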




Regards,

Rich




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Postgresql python in upgraded version 16.2

2024-06-20 Thread Adrian Klaver

On 6/19/24 22:05, Šika Michal wrote:

TLP:AMBER


Hello all,
I upgraded version of Postgres from 13.5 to newest 16.2. But Python was 
not upgraded.


I create procedure pyver():
CREATE OR REPLACE FUNCTION pyver ()
  RETURNS TEXT
  AS $$
      import sys
      pyversion = sys.version
      return pyversion
  $$ LANGUAGE 'plpython3u';

On the Postgres 13.5 I get this result:
# psql -d database
psql (13.5)
Type "help" for help.

postgres@database # select pyver();
                   pyver
-------------------------------------------
  3.6.8 (default, Aug 13 2020, 07:36:02) +
  [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]
(1 row)

postgres@database #

On the Postgres 16.2 (upgraded from version 13.5) I get this result:
# psql -d database
psql (16.2)
Type "help" for help.

database=# select pyver();
                   pyver
-------------------------------------------
  3.6.8 (default, Jan  5 2024, 09:14:44) +
  [GCC 8.5.0 20210514 (Red Hat 8.5.0-20)]
(1 row)

database=#

It seems Python was not upgraded when I upgraded PostgreSQL. Is it possible 
to upgrade Python to the current 3.9.x version installed on RedHat 8.5?


Where are you getting the packages from, the Red Hat repo or the PGDG repo?



Thanks
Michal

TLP:AMBER



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Transaction issue

2024-06-20 Thread Rich Shepard

On Thu, 20 Jun 2024, Adrian Klaver wrote:


From one of my previous posts(modified):
At this point I think you need to create a simple test case where:

1) You have script with
BEGIN;

COMMIT; --optional

2) In psql do \i 

3) Do what you did before to 'recover' from the error.

Show us the content of the steps in your reply.


Adrian,

I believe that I did this yesterday. Regardless, either leaving off `BEGIN;'
from the top of the script, or entering it prior to running the script,
resolves the issue.

I've not seen a problem since finding the solutions.

Best regards,

Rich




Autovacuum, dead tuples and bloat

2024-06-20 Thread Shenavai, Manuel
Hi everyone,

we can see in our database that the DB is 200 GB in size, with 99% bloat. After 
a VACUUM FULL the DB decreases to 2 GB.
DB total size: 200 GB
DB bloat: 198 GB
DB non-bloat: 2 GB

We further see that during bulk updates (i.e. a long-running transaction), the 
DB is still growing, i.e. the DB grew by +20 GB after the bulk updates.

My assumption is that after an autovacuum, the 99% bloat should be available 
for reuse while the DB size stays at 200 GB. In our case, I would only expect 
the DB to grow if the bulk updates exceed the current DB size (i.e. 
220 GB).

How could I verify my assumption?

I think of two possibilities:

  1.  My assumption is wrong and for some reason the dead tuples are not 
cleaned so that the space cannot be reused
  2.  The bulk-update indeed exceeds the current DB size. (Then the growth is 
expected).

Can you help me to verify these assumptions? Are there any statistics available 
that could help me with my verification?

Thanks in advance &
Best regards,
Manuel


Re: Autovacuum, dead tuples and bloat

2024-06-20 Thread Adrian Klaver

On 6/20/24 09:46, Shenavai, Manuel wrote:

Hi everyone,

we can see in our database, that the DB is 200GB of size, with 99% 
bloat. After vacuum full the DB decreases to 2GB.


DB total size: 200GB

DB bloat: 198 GB
DB non-bloat: 2GB

We further see, that during bulk updates (i.e. a long running 
transaction), the DB is still growing, i.e. the size of the DB growth by 
+20GB after the bulk updates.


How soon after the updates did you measure the above?



My assumption is, that after an autovacuum, the 99% bloat should be 
available for usage again. But the DB size would stay at 200GB. In our 
case, I would only expect a growth of the DB, if the bulk-updates exceed 
the current DB size (i.e. 220 GB).


Was the transaction completed(commit/rollback)?

Are there other transactions using the table or tables?



How could I verify my assumption?

I think of two possibilities:

 1. My assumption is wrong and for some reason the dead tuples are not
cleaned so that the space cannot be reused
 2. The bulk-update indeed exceeds the current DB size. (Then the growth
is expected).

Can you help me to verify these assumptions? Are there any statistics 
available that could help me with my verification?


Use:

https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ALL-TABLES-VIEW

Select the rows that cover the table or tables involved. Look at the 
vacuum/autovacuum/analyze fields.
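For example, something along these lines shows the relevant counters (the table name is hypothetical):

```sql
-- Dead vs. live tuples and the last (auto)vacuum/analyze times
SELECT relname, n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_all_tables
WHERE relname = 'mytable';  -- hypothetical table name
```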




Thanks in advance &

Best regards,

Manuel



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Autovacuum, dead tuples and bloat

2024-06-20 Thread Ron Johnson
On Thu, Jun 20, 2024 at 12:47 PM Shenavai, Manuel 
wrote:

> Hi everyone,
>
>
>
> we can see in our database, that the DB is 200GB of size, with 99% bloat.
> After vacuum full the DB decreases to 2GB.
>
> DB total size: 200GB
>
> DB bloat: 198 GB
>
> DB non-bloat: 2GB
>
>
>
> We further see, that during bulk updates (i.e. a long running
> transaction), the DB is still growing, i.e. the size of the DB growth by
> +20GB after the bulk updates.
>
>
>
> My assumption is, that after an autovacuum, the 99% bloat should be
> available for usage again. But the DB size would stay at 200GB. In our
> case, I would only expect a growth of the DB, if the bulk-updates exceed
> the current DB size (i.e. 220 GB).
>
>
That's also my understanding of how vacuum works.

Note: I disable autovacuum before bulk modifications, manually VACUUM
ANALYZE and then reenable autovacuum.  That way, autovacuum doesn't jump in
the middle of what I'm doing.

 How could I verify my assumption?
>
>
>
> I think of two possibilities:
>
>1. My assumption is wrong and for some reason the dead tuples are not
>cleaned so that the space cannot be reused
>2. The bulk-update indeed exceeds the current DB size. (Then the
>growth is expected).
>
>
>
> Can you help me to verify these assumptions? Are there any statistics
> available that could help me with my verification?
>

I've got a weekly process that deletes all records older than N days from a
set of tables:
db=# ALTER TABLE t1 SET (autovacuum_enabled = off);
db=# ALTER TABLE t2 SET (autovacuum_enabled = off);
db=# ALTER TABLE t3 SET (autovacuum_enabled = off);
db=# DELETE FROM t1 WHERE created_on < (CURRENT_TIMESTAMP - INTERVAL '90 DAY');
db=# DELETE FROM t2 WHERE created_on < (CURRENT_TIMESTAMP - INTERVAL '90 DAY');
db=# DELETE FROM t3 WHERE created_on < (CURRENT_TIMESTAMP - INTERVAL '90 DAY');
$ vacuumdb --jobs=3 -t t1 -t t2 -t t3
db=# ALTER TABLE t1 SET (autovacuum_enabled = on);
db=# ALTER TABLE t2 SET (autovacuum_enabled = on);
db=# ALTER TABLE t3 SET (autovacuum_enabled = on);

pgstattuple shows that the free percentage stays pretty constant.  That
seems to be what you're asking about.
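For reference, a hedged example of checking that with pgstattuple (a contrib extension; the table name is hypothetical):

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;
-- free_percent is reusable space; dead_tuple_percent is not-yet-vacuumed
SELECT free_percent, dead_tuple_percent
FROM pgstattuple('t1');  -- hypothetical table name
```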


Re: Autovacuum, dead tuples and bloat

2024-06-20 Thread Achilleas Mantzios

On 20/6/24 19:46, Shenavai, Manuel wrote:


Hi everyone,

we can see in our database, that the DB is 200GB of size, with 99% 
bloat. After vacuum full the DB decreases to 2GB.


DB total size: 200GB

DB bloat: 198 GB

DB non-bloat: 2GB

We further see, that during bulk updates (i.e. a long running 
transaction), the DB is still growing, i.e. the size of the DB growth 
by +20GB after the bulk updates.


My assumption is, that after an autovacuum, the 99% bloat should be 
available for usage again. But the DB size would stay at 200GB. In our 
case, I would only expect a growth of the DB, if the bulk-updates 
exceed the current DB size (i.e. 220 GB).


How could I verify my assumption?

I think of two possibilities:

 1. My assumption is wrong and for some reason the dead tuples are not
cleaned so that the space cannot be reused
 2. The bulk-update indeed exceeds the current DB size. (Then the
growth is expected).

You should base your assumptions on the official manual and other material 
such as books and articles from reputable sources; even reading the source 
code could be considered as a last resort.


For starters: do you have autovacuum enabled? If not, enable it.

Then monitor vacuum activity via pg_stat_user_tables, locate the tables where 
you would expect vacuum to have run but it did not, and then consider 
autovacuum tuning.


Watch the logs for lines such as:

 dead row versions cannot be removed yet, oldest xmin: 

Those rows are kept from being marked as removed because they are still 
visible to long-running transactions. Monitor for those transactions.
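A hedged way to spot such transactions from SQL (columns are from pg_stat_activity):

```sql
-- Long-open transactions that can hold back vacuum's xmin horizon
SELECT pid, state, xact_start, now() - xact_start AS xact_age,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start;
```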


You also have to monitor (if this is the case) for autovacuum workers being 
killed and not allowed to do their job.


Can you help me to verify these assumptions? Are there any statistics 
available that could help me with my verification?


Thanks in advance &

Best regards,

Manuel


--
Achilleas Mantzios
 IT DEV - HEAD
 IT DEPT
 Dynacom Tankers Mgmt (as agents only)