Help. The database was created using collation version 2.17, but the operating system provides version 2.34.
Hello,

I am in the process of migrating a DB to an Alma9 host. The database is rather large - a few TBs.

I have run pg_basebackup on the Alma9 host and established replication from production to it. The idea is to quickly switch from master to this new host during downtime. Establishing replication went fine. Source PostgreSQL version is 15.6, destination is 15.7.

When I psql into the replica I get:

WARNING:  database "xxx" has a collation version mismatch
DETAIL:  The database was created using collation version 2.17, but the operating system provides version 2.34.
HINT:  Rebuild all objects in this database that use the default collation and run ALTER DATABASE xxx REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.

Looking up the issue, the solution seems to be:

REINDEX DATABASE xxx;
ALTER DATABASE xxx REFRESH COLLATION VERSION;

But this defeats the whole idea of having short downtime, because REINDEX will take forever.

What is this "or build PostgreSQL with the right library version"? Is this about 15.7 vs 15.6, or is it about different glibc versions between RH7 and Alma9?

Is there a better way to handle it? I cannot afford long downtime. This came up rather unexpectedly and I am now in a tight situation, having to find a solution fast. I do not recall having a similar issue when going from RH6 to RH7.

Thank you for your help.
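In case it helps, this is how the mismatch can be seen directly in the catalog (a sketch using the PostgreSQL 15 catalog column and function named in the docs):

```sql
-- Compare the collation version recorded when the database was created
-- with the version the OS libc currently provides (PostgreSQL 15+).
SELECT datname,
       datcollate,
       datcollversion                            AS recorded_version,
       pg_database_collation_actual_version(oid) AS os_version
FROM pg_database
WHERE datcollversion IS DISTINCT FROM
      pg_database_collation_actual_version(oid);
```

Any rows returned are databases that will emit the warning above.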
Re: Help. The database was created using collation version 2.17, but the operating system provides version 2.34.
hi

On 6/20/24 10:23, Dmitry O Litvintsev wrote:

> I am in the process of migrating a DB to an Alma9 host. The database is
> rather large - a few TBs.
>
> I have run pg_basebackup on the Alma9 host and established replication
> from production to it. The idea is to quickly switch from master to this
> new host during downtime.
>
> Establishing replication went fine. Source PostgreSQL version is 15.6,
> destination is 15.7.

You mean physical replication or logical? In the case of logical, how did you run initdb? Did you build PostgreSQL from source or using a RH package?

Sorry for not being able to provide anything more helpful.
Postgresql python in upgraded version 16.2
TLP:AMBER

Hello all,

I upgraded Postgres from 13.5 to the newest 16.2, but Python was not upgraded. I created a function pyver():

CREATE OR REPLACE FUNCTION pyver () RETURNS TEXT AS $$
import sys
pyversion = sys.version
return pyversion
$$ LANGUAGE 'plpython3u';

On Postgres 13.5 I get this result:

# psql -d database
psql (13.5)
Type "help" for help.

postgres@database # select pyver();
                  pyver
------------------------------------------
 3.6.8 (default, Aug 13 2020, 07:36:02) +
 [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]
(1 row)

On Postgres 16.2 (upgraded from 13.5) I get this result:

# psql -d database
psql (16.2)
Type "help" for help.

database=# select pyver();
                  pyver
------------------------------------------
 3.6.8 (default, Jan 5 2024, 09:14:44) +
 [GCC 8.5.0 20210514 (Red Hat 8.5.0-20)]
(1 row)

It seems Python was not upgraded when I upgraded PostgreSQL. Is it possible to upgrade Python to the actual version 3.9.x installed on RedHat 8.5?

Thanks
Michal

TLP:AMBER
Re: Help. The database was created using collation version 2.17, but the operating system provides version 2.34.
> I am in the process of migrating a DB to an Alma9 host. The database
> is rather large - a few TBs.
>
> I have run pg_basebackup on the Alma9 host and established replication from
> production to it. The idea is to quickly switch from master to this new host
> during downtime.
>
> Establishing replication went fine. Source PostgreSQL version is 15.6,
> destination is 15.7.

What replication did you use? If it's streaming, you will have this problem. If you can use logical replication, something like:

https://github.com/dimitri/pgcopydb

will allow you to use the logical-replication trick to lower downtime. It involves copying the database using a parallel transfer (and some pg_dump/pg_restore wrapping for the structural stuff), and using logical replication to keep the target up to date. It will allow you to keep writing to your source DB while it moves, but at the cost of disabling DDL while it's happening. Look at the pgcopydb clone --follow documentation for more info.

I'm sure there are other options people can recommend as well. I would test it thoroughly beforehand to make sure it's a fit.

> When I psql into the replica I get:
>
> WARNING: database "xxx" has a collation version mismatch
> DETAIL: The database was created using collation version 2.17, but the
> operating system provides version 2.34.
> HINT: Rebuild all objects in this database that use the default collation
> and run ALTER DATABASE xxx REFRESH COLLATION VERSION, or build PostgreSQL
> with the right library version.
>
> Looking up the issue the solution seems to be
>
> REINDEX DATABASE xxx
> ALTER DATABASE xxx REFRESH COLLATION VERSION

Here we've taken two approaches when we just "copied the disks over", so to speak:

* Backport the collation library (tricky; I can explain this deeper, but it's tricky)
* Reindex after migration (slow, but less tricky)

We had this problem going from xenial to focal, and we had to pin to something compatible with the xenial libc.
> But this defeats the whole idea of having short downtime because REINDEX will
> take forever.
>
> What is this "or build PostgreSQL with the right library version"?
> Is this about 15.7 vs 15.6 or is it about different glibc versions between RH7
> and Alma9?
>
> Is there a better way to handle it? I cannot afford long downtime.
> This came up rather unexpectedly and I am now in a tight situation having to
> find a solution fast. I do not recall having a similar issue when going from
> RH6 to RH7.

Unfortunately you've hit a bad problem that a few of us have probably already been through. At least you don't have thousands of these things :-).

ken.
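For concreteness, a rough sketch of the pgcopydb route mentioned above (the connection strings and job counts are placeholders; verify the flags against the pgcopydb documentation for your version):

```shell
# Placeholder connection strings - adjust for your environment.
export PGCOPYDB_SOURCE_PGURI="postgres://user@rh7-host/xxx"
export PGCOPYDB_TARGET_PGURI="postgres://user@alma9-host/xxx"

# Copy schema and data in parallel, then keep following changes via
# logical replication until you are ready to cut over.
pgcopydb clone --follow --table-jobs 8 --index-jobs 8

# At cutover (brief downtime): stop writes on the source, let the
# logical replay catch up, then point applications at the target.
```

Because the target is rebuilt with initdb on the new OS rather than copied block-for-block, its indexes are created with the new glibc and the collation mismatch never arises.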
Re: Help. The database was created using collation version 2.17, but the operating system provides version 2.34.
Dmitry O Litvintsev wrote:

> WARNING: database "xxx" has a collation version mismatch
> DETAIL: The database was created using collation version 2.17, but the
> operating system provides version 2.34.
> HINT: Rebuild all objects in this database that use the default collation
> and run ALTER DATABASE xxx REFRESH COLLATION VERSION, or build PostgreSQL
> with the right library version.

This upgrade crosses the major collation change in GNU libc 2.28, so indeed text indexes created with 2.17 are very likely unsafe to use on your new server. See https://wiki.postgresql.org/wiki/Locale_data_changes

> REINDEX database xxx
> ALTER DATABASE xxx REFRESH COLLATION VERSION
>
> But this defeats the whole idea of having short downtime because REINDEX
> will take forever.

Indexes that don't involve collatable types (text, varchar) and those that use the C collation don't need to be reindexed. Maybe you can reduce the downtime significantly by including only the ones that matter. The wiki page gives the query to obtain the list of affected indexes:

SELECT DISTINCT indrelid::regclass::text,
       indexrelid::regclass::text,
       collname,
       pg_get_indexdef(indexrelid)
FROM (SELECT indexrelid, indrelid, indcollation[i] coll
      FROM pg_index,
           generate_subscripts(indcollation, 1) g(i)) s
JOIN pg_collation c ON coll = c.oid
WHERE collprovider IN ('d', 'c')
  AND collname NOT IN ('C', 'POSIX');

> I do not recall having a similar issue when going from RH6 to RH7.

This warning was added relatively recently, in PostgreSQL 15 (October 2022).

Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite
Re: Help. The database was created using collation version 2.17, but the operating system provides version 2.34.
On Thu, Jun 20, 2024 at 3:23 AM Dmitry O Litvintsev wrote:

> Looking up the issue the solution seems to be
>
> REINDEX DATABASE xxx
> ALTER DATABASE xxx REFRESH COLLATION VERSION
>
> But this defeats the whole idea of having short downtime because REINDEX
> will take forever.
>
> Is there a better way to handle it? I cannot afford long downtime.

You "only" need to REINDEX indices with TEXT (including CHAR and VARCHAR) columns. That may be most of your indices, or very few.
I use this view and query to find such indices:

create or replace view dba.all_indices_types as
select tbcl.relnamespace::regnamespace::text||'.'||tbcl.relname as table_name
     , ndcl.relname as index_name
     , array_agg(ty.typname order by att.attnum) as index_types
from pg_class ndcl
inner join pg_index nd on (ndcl.oid = nd.indexrelid and ndcl.relkind = 'i')
inner join pg_class tbcl on (nd.indrelid = tbcl.oid and tbcl.relkind = 'r')
inner join pg_attribute att on att.attrelid = nd.indexrelid
inner join pg_type ty on att.atttypid = ty.oid
where tbcl.relnamespace::regnamespace::text != 'pg_catalog'
group by tbcl.relnamespace::regnamespace::text||'.'||tbcl.relname
       , ndcl.relname
order by 1, 2;

select *
from dba.all_indices_types
where index_types && '{"text","varchar","char"}';
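To act on such a list, one could generate per-index REINDEX statements and run them ahead of the cutover (a sketch, not an official recipe; review the generated list first, and note that REINDEX CONCURRENTLY avoids blocking writes but is slower and needs extra disk space):

```sql
-- Emit one REINDEX statement per index whose definition involves a
-- libc ('d' = database default, 'c' = libc provider) collation other
-- than C/POSIX, i.e. the indexes at risk after the glibc change.
SELECT format('REINDEX INDEX CONCURRENTLY %s;', indexrelid::regclass)
FROM (SELECT DISTINCT indexrelid
      FROM (SELECT indexrelid, indcollation[i] coll
            FROM pg_index,
                 generate_subscripts(indcollation, 1) g(i)) s
      JOIN pg_collation c ON coll = c.oid
      WHERE collprovider IN ('d', 'c')
        AND collname NOT IN ('C', 'POSIX')) t;
```

Feeding the output back through psql (\gexec) runs the statements one by one.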
Re: Transaction issue
On Wed, Jun 19, 2024 at 02:32:07PM -0700, Rich Shepard wrote:

> Yes, I see how this works if the transaction is committed. But before I
> commit the transaction I run a select statement to ensure the rows added are
> correct. Can I rollback a committed transaction? I've assumed not, so I won't
> commit the transaction without testing. And I'm not getting a detailed error
> message.

Shot in the dark: are you by any chance using tab completion when running the SELECT before the COMMIT? I've seen it happen a few times that tab completion ran queries behind the scenes which failed, and thereby set the open transaction to the aborted state...

And, no, I can't reproduce :/

Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Re: Transaction issue
On Thu, 20 Jun 2024, Karsten Hilbert wrote:

> Shot in the dark: are you by any chance using tab completion when running
> the SELECT before the COMMIT?

Karsten,

Nope. I prepare DDL, DML, and DQL scripts in emacs, then run them from the psql prompt using \i .

Regards,

Rich
Re: Transaction issue
On Wed, 19 Jun 2024, Ron Johnson wrote:

> In addition, manually run the "BEGIN;" before the "\i insert-blarge.sql"
> command. That way, insert-blarge.sql just inserts. My reasoning: since you
> control the ROLLBACK, you should also control the BEGIN.

Ron,

Hadn't thought of doing that, but now I will. Thanks for the excellent recommendation.

Regards,

Rich
Re: Transaction issue
> On Jun 20, 2024, at 7:05 AM, Rich Shepard wrote:
>
> On Thu, 20 Jun 2024, Karsten Hilbert wrote:
>
>> Shot in the dark: are you by any chance using tab completion
>> when running the SELECT before the COMMIT?
>
> Karsten,
>
> Nope. I prepare DDL, DML, and DQL scripts in emacs, then run them from the
> psql prompt using \i .

Is psql running in emacs (which is my preferred sql shell, M-x sql-postgres)?

Also, change all semicolons to \p\g to confirm the error location. Maybe trap the output.
Re: Transaction issue
On Thu, 20 Jun 2024, Rob Sargent wrote:

> Is psql running in emacs (which is my preferred sql shell, M-x sql-postgres)?

Rob,

Not when I'm entering new data or updating existing tables. Otherwise, yes.

Regards,

Rich
Re: Transaction issue
On 6/20/24 07:47, Rich Shepard wrote:

> On Thu, 20 Jun 2024, Rob Sargent wrote:
>
>> Is psql running in emacs (which is my preferred sql shell, M-x sql-postgres)?
>
> Rob,
>
> Not when I'm entering new data or updating existing tables. Otherwise, yes.

From one of my previous posts (modified):

At this point I think you need to create a simple test case where:

1) You have a script with BEGIN; COMMIT; --optional
2) In psql do \i
3) Do what you did before to 'recover' from the error.

Show us the content of the steps in your reply.

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Postgresql python in upgraded version 16.2
On 6/19/24 22:05, Šika Michal wrote:

> I upgraded version of Postgres from 13.5 to newest 16.2. But Python was
> not upgraded.
>
> It seems Python was not upgraded when I upgraded PostgreSQL. Is it possible
> to upgrade Python to the actual version 3.9.x installed on RedHat 8.5?

Where are you getting the packages from, the Red Hat repo or the PGDG repo?

--
Adrian Klaver
adrian.kla...@aklaver.com
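As background: plpython3u is linked against one specific libpython at build time, so the interpreter version changes only when the plpython3 package itself is rebuilt against a newer Python, not when the server is upgraded. One way to see which libpython the module actually links (a sketch; the shared-object name and install paths depend on the packaging):

```shell
# Locate the server's library directory, then inspect the plpython
# module's shared-library dependencies for the embedded libpython.
ldd "$(pg_config --pkglibdir)/plpython3.so" | grep -i libpython
```

If that still shows libpython3.6, you would need a PostgreSQL/plpython build made against Python 3.9 (e.g. from a repo that ships one) rather than any change inside the database.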
Re: Transaction issue
On Thu, 20 Jun 2024, Adrian Klaver wrote:

> From one of my previous posts (modified):
>
> At this point I think you need to create a simple test case where:
> 1) You have a script with BEGIN; COMMIT; --optional
> 2) In psql do \i
> 3) Do what you did before to 'recover' from the error.
>
> Show us the content of the steps in your reply.

Adrian,

I believe that I did this yesterday. Regardless, either leaving off 'BEGIN;' from the top of the script, or entering it prior to running the script, resolves the issue. I've not seen a problem since finding the solutions.

Best regards,

Rich
Autovacuum, dead tuples and bloat
Hi everyone,

we can see in our database that the DB is 200GB in size, with 99% bloat. After a VACUUM FULL the DB decreases to 2GB.

DB total size: 200GB
DB bloat: 198GB
DB non-bloat: 2GB

We further see that during bulk updates (i.e. a long-running transaction), the DB is still growing, i.e. the size of the DB grew by +20GB after the bulk updates.

My assumption is that after an autovacuum, the 99% bloat should be available for usage again, while the DB size would stay at 200GB. In our case, I would only expect growth of the DB if the bulk updates exceed the current DB size (i.e. 220GB).

How could I verify my assumption? I think of two possibilities:

1. My assumption is wrong and for some reason the dead tuples are not cleaned, so that the space cannot be reused.
2. The bulk update indeed exceeds the current DB size (then the growth is expected).

Can you help me verify these assumptions? Are there any statistics available that could help me with my verification?

Thanks in advance & best regards,
Manuel
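One way to distinguish the two possibilities is the pgstattuple contrib extension, which reports exactly how much of a table is live data, dead tuples awaiting vacuum, and already-reclaimed free space (a sketch; 'my_big_table' is a placeholder name):

```sql
-- pgstattuple scans the table and breaks its size down by category.
CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT table_len,           -- physical size in bytes
       tuple_percent,       -- live data
       dead_tuple_percent,  -- dead, not yet vacuumed
       free_percent         -- vacuumed space available for reuse
FROM pgstattuple('my_big_table');
```

A high free_percent means vacuum is reclaiming space for reuse as expected; a high dead_tuple_percent means dead tuples are not being cleaned up.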
Re: Autovacuum, dead tuples and bloat
On 6/20/24 09:46, Shenavai, Manuel wrote:

> we can see in our database that the DB is 200GB in size, with 99% bloat.
> After a VACUUM FULL the DB decreases to 2GB.
>
> We further see that during bulk updates (i.e. a long-running transaction),
> the DB is still growing, i.e. the size of the DB grew by +20GB after the
> bulk updates.

How soon after the updates did you measure the above?

> My assumption is that after an autovacuum, the 99% bloat should be
> available for usage again, while the DB size would stay at 200GB. In our
> case, I would only expect growth of the DB if the bulk updates exceed
> the current DB size (i.e. 220GB).

Was the transaction completed (commit/rollback)? Are there other transactions using the table or tables?

> Can you help me verify these assumptions? Are there any statistics
> available that could help me with my verification?

Use:

https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ALL-TABLES-VIEW

Select the rows that cover the table or tables involved. Look at the vacuum/autovacuum/analyze fields.

--
Adrian Klaver
adrian.kla...@aklaver.com
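Concretely, a query along these lines pulls the relevant fields from that view ('my_big_table' is a placeholder name):

```sql
-- Dead-tuple counts and vacuum history for one table, from the
-- cumulative statistics system.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum,
       vacuum_count,
       autovacuum_count
FROM pg_stat_all_tables
WHERE relname = 'my_big_table';
```

A large, persistent n_dead_tup with an old (or null) last_autovacuum suggests vacuum is not keeping up or is being blocked.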
Re: Autovacuum, dead tuples and bloat
On Thu, Jun 20, 2024 at 12:47 PM Shenavai, Manuel wrote:

> My assumption is that after an autovacuum, the 99% bloat should be
> available for usage again, while the DB size would stay at 200GB. In our
> case, I would only expect growth of the DB if the bulk updates exceed
> the current DB size (i.e. 220GB).

That's also my understanding of how vacuum works.

Note: I disable autovacuum before bulk modifications, manually VACUUM ANALYZE, and then re-enable autovacuum. That way, autovacuum doesn't jump into the middle of what I'm doing.

> Can you help me verify these assumptions? Are there any statistics
> available that could help me with my verification?

I've got a weekly process that deletes all records older than N days from a set of tables.
db=# ALTER TABLE t1 SET (autovacuum_enabled = off);
db=# ALTER TABLE t2 SET (autovacuum_enabled = off);
db=# ALTER TABLE t3 SET (autovacuum_enabled = off);
db=# DELETE FROM t1 WHERE created_on < (CURRENT_TIMESTAMP - INTERVAL '90 DAY');
db=# DELETE FROM t2 WHERE created_on < (CURRENT_TIMESTAMP - INTERVAL '90 DAY');
db=# DELETE FROM t3 WHERE created_on < (CURRENT_TIMESTAMP - INTERVAL '90 DAY');
$ vacuumdb --jobs=3 -t t1 -t t2 -t t3
db=# ALTER TABLE t1 SET (autovacuum_enabled = on);
db=# ALTER TABLE t2 SET (autovacuum_enabled = on);
db=# ALTER TABLE t3 SET (autovacuum_enabled = on);

pgstattuple shows that the free percentage stays pretty constant. That seems to be what you're asking about.
Re: Autovacuum, dead tuples and bloat
On 20/6/24 19:46, Shenavai, Manuel wrote:

> My assumption is that after an autovacuum, the 99% bloat should be
> available for usage again, while the DB size would stay at 200GB.
>
> How could I verify my assumption? I think of two possibilities:
>
> 1. My assumption is wrong and for some reason the dead tuples are not
>    cleaned, so that the space cannot be reused.
> 2. The bulk update indeed exceeds the current DB size (then the growth
>    is expected).

Your only assumption should be the official manual, and other material such as books and articles from reputable sources; even reading the source as a last resort could be considered.

For starters: do you have autovacuum enabled? If not, then enable it. Then monitor vacuum via pg_stat_user_tables, locate the tables where you would expect vacuum to have happened but it did not, and then consider autovacuum tuning.

Watch the logs for lines such as: "dead row versions cannot be removed yet, oldest xmin". Those rows are held from being marked as removed because they are still visible to long-running transactions; monitor for those transactions. You also have to monitor (if this is the case) for autovacuum being killed and not allowed to do its job.

--
Achilleas Mantzios
IT DEV - HEAD
IT DEPT
Dynacom Tankers Mgmt (as agents only)
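To find the long-running transactions that can hold back the oldest xmin, something like this against pg_stat_activity works (a sketch; the one-hour threshold is an arbitrary example):

```sql
-- Sessions with an open transaction whose snapshot (backend_xmin)
-- can prevent vacuum from removing dead rows.
SELECT pid,
       now() - xact_start AS xact_age,
       backend_xmin,
       state,
       left(query, 60)    AS current_query
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
  AND xact_start < now() - interval '1 hour'
ORDER BY xact_start;
```

Ending or committing the sessions this reports (including forgotten "idle in transaction" ones) lets the next vacuum actually reclaim the dead tuples.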