Re: Issue with PostgreSQL Installer on Windows and Special Characters in the superuser password
On 5/31/24 14:18, Timo Schwaak wrote:
> Dear PostgreSQL community,
>
> I hope this is the right place for my concern. Today I discovered a strange
> behaviour of the Windows installer of PostgreSQL. While migrating an
> application database from MSSQL, the Postgres setup did not seem to complete
> correctly and had problems starting the Postgres service after installation.
> I found that the service did not seem to be created at all. After some trying
> around with help I found online, trying non-domain users and some other stuff
> that was proposed for the problem, I found the problem lay in my chosen
> password. My password was randomly generated with alphanumeric + special
> characters. While this did not work, switching to alphanumeric-only passwords
> seemed to solve the problem. I suspect that the quotation marks in particular
> are causing problems in the setup, as they seem to terminate some strings
> prematurely in the scripts used by the installer.
>
> My setup was as follows:
> OS: Windows Server 2019, also tested on Windows Server 2022
> PSQL version: 16.3 (latest installer from EDB)
>
> Steps to reproduce:
> 1. Run the installer
> 2. Enter any password that contains quotation marks
> 3. The installer should fail at the end and report some errors.
>
> Could you please help me investigate this, is this a known problem or is
> there even a known workaround? If this is not a known issue, can you help me
> point this report in the right direction so that it can be addressed?

1) Don't use passwords with quotation marks.

2) Contact https://www.enterprisedb.com/ and file an issue with them. The
installer is their product, not part of the community distribution.

3) Poke pgadmin-support here:

https://www.postgresql.org/list/pgadmin-support/

That is from EDB also, maybe someone over there can help.

4) Jump on this thread at pgsql-bugs about the installer:

https://www.postgresql.org/message-id/flat/DU0PR10MB71447B1C6946731AD6C73D13C0E72%40DU0PR10MB7144.EURPRD10.PROD.OUTLOOK.COM

Or start a new one here:

https://www.postgresql.org/account/login/?next=/account/submitbug/

> Best regards,
> Timo

--
Adrian Klaver
adrian.kla...@aklaver.com
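For what it's worth, workaround 1) above can be automated. This is a minimal sketch (not part of the installer or any EDB tooling) of generating a strong superuser password that sidesteps the quoting problem by restricting the alphabet to alphanumerics:

```python
import secrets
import string

# Alphanumeric only: avoids quotation marks and other characters that
# appear to break string handling in the installer's scripts.
ALPHABET = string.ascii_letters + string.digits

def safe_password(length: int = 24) -> str:
    """Return a cryptographically random alphanumeric password."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

pw = safe_password()
print(pw)
```

A 24-character alphanumeric password still has well over 120 bits of entropy, so nothing is lost security-wise by dropping the special characters.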
Re: Rules and Command Status - update/insert/delete rule with series of commands in action
On 5/30/24 15:34, johnlu...@hotmail.com wrote:
> On 5/30/24 4:56 PM, David G. Johnston wrote:
>> Very little interest exists in working on user-specified rules. They are
>> practically deprecated.
>
> Ah - pity - see my last comment to Adrian's
>
>>> Any interest in perhaps providing a choice via a configuration parameter?
>>
>> If anything is done it would have to be new syntax.
>
> A much bigger task surely.
>
> On 5/30/24 5:19 PM, Adrian Klaver wrote:
>> 2) Use INSTEAD OF triggers:
>
> Unfortunately the same functionality as in my example with the RULE is not
> supported for triggers on views: from the manual
>
>     INSTEAD OF triggers may only be defined on views, *and only at row level*;
>
> A RULE is essentially a statement-level operation, which is what I need for
> this particular case. A row-level trigger would not work because it cannot
> "see" the query causing it to be fired, and also (most importantly) is not
> fired at all if no rows match the original query, whereas a RULE is always
> in effect regardless of which rows are involved.
>
> I should add that the RULE I showed in my example is not the only RULE being
> used on this view - there are other conditional RULEs, and the combined
> effect is of being able to change the effect of the original statement into
> a set of new statements, one of which does what is needed.

Basically don't give the user what they asked for, give them some version
cobbled together on the fly. Who decides that that is what the user needs?
Seems to me this is the point at which to have a discussion with the
application developers about having the application ask the correct
questions, rather than going down the road of bait and switch.

> And if you are now inclined to say "well, maybe the application itself is
> poorly written and should be changed" - I would have to agree, but that is
> not mine to change.
>
> But I suppose that my next question, given what you both say about the RULE
> system being a dead-end, is whether there is any likelihood of supporting an
> INSTEAD OF trigger on a view at statement level? Maybe that stands more
> chance of going somewhere?
>
> Cheers,
> John Lumby

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Rules and Command Status - update/insert/delete rule with series of commands in action
On 5/30/24 11:32, johnlu...@hotmail.com wrote:
> Any thoughts? Any rationales one way or the other? Any interest in perhaps
> providing a choice via a configuration parameter?

1) As to using rules, stop now before you invest the time to find out it is
a road you don't want to go down.

2) Use INSTEAD OF triggers:

https://www.postgresql.org/docs/current/sql-createtrigger.html

> Cheers,
> John Lumby

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Dll libpq.dll 32 bits
On 5/29/24 11:18, José Mello Júnior wrote:

Again reply to list by using Reply All. Ccing list.

> For my projects in 32 bits, the language of PC-SOFT (wlanguage) use.

Are you talking about this?:

https://windev.com/pcsoft/index.html

> On Wed., May 29, 2024, 15:09, Adrian Klaver <adrian.kla...@aklaver.com> wrote:
>> On 5/29/24 10:59 AM, José Mello Júnior wrote:
>>
>> Reply to list also. Ccing list.
>>
>>> Sorry, for windows
>>
>> That is the OS, what I am after is what client on Windows do you need the
>> DLL for?
>>
>>> On Wed., May 29, 2024, 14:53, Adrian Klaver <adrian.kla...@aklaver.com> wrote:
>>>> On 5/29/24 10:49 AM, José Mello Júnior wrote:
>>>>> Where do i found this dll for acess postgresql 15?
>>>>
>>>> Access Postgres using what client?

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Dll libpq.dll 32 bits
On 5/29/24 10:59 AM, José Mello Júnior wrote:

Reply to list also. Ccing list.

> Sorry, for windows

That is the OS, what I am after is what client on Windows do you need the
DLL for?

> On Wed., May 29, 2024, 14:53, Adrian Klaver <adrian.kla...@aklaver.com> wrote:
>> On 5/29/24 10:49 AM, José Mello Júnior wrote:
>>> Where do i found this dll for acess postgresql 15?
>>
>> Access Postgres using what client?

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Dll libpq.dll 32 bits
On 5/29/24 10:49 AM, José Mello Júnior wrote:
> Where do i found this dll for acess postgresql 15?

Access Postgres using what client?

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Pgpool with high availability
On 5/28/24 22:15, vijay patil wrote:
> Hi Adrian,
>
> After modifying the pgpool.conf configuration to correct a subnet mistake,
> Previously, the wrong subnet was specified as /26, and it has now been
> corrected to /24. The configuration changes were as follows:

This seems to be a networking issue:

> [root@staging-ha0001 ~]# ping 10.127.1.18
> PING 10.127.1.18 (10.127.1.18) 56(84) bytes of data.
> From 10.127.1.10 icmp_seq=1 Destination Host Unreachable
<...>

Where are ha0001 and ha0003 on the network relative to ha0002? Also are they
reachable for other services?

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Pgpool with high availability
On 5/28/24 1:31 AM, vijay patil wrote:
> HI Team,
>
> "I'm encountering challenges while configuring Pgpool with high
> availability. The initial setup is completed, and Pgpool is operational on
> a single node, functioning without issues. However, upon attempting to
> start Pgpool on any additional nodes, particularly node 2, it becomes
> immediately unreachable.

And how are we supposed to arrive at an answer with essentially no
information provided? Need:

1) Configuration for initial setup.

2) A more detailed explanation of what "... upon attempting to start Pgpool
on any additional nodes" means? Include configuration changes.

3) The error messages.

4) Where the nodes are located?

> I'm seeking assistance to address this issue. My setup consists of three
> nodes, each hosting both PostgreSQL and Pgpool services."
>
> Thanks
> Vijay

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: expected authentication request from server, but received H
On 5/27/24 14:19, Peter J. Holzer wrote:
> On 2024-05-23 17:23:14 -0400, Tom Lane wrote:
>> "Peter J. Holzer" writes:
>>> One of our users reports getting the error message "expected
>>> authentication request from server, but received H" when trying to
>>> connect to the database.
>>
>> That's very bizarre, and I don't recall any similar reports in the
>> recent past.
>>
>>> Server is PostgreSQL 15.7 (Ubuntu 15.7-1.pgdg22.04+1)
>>> I haven't been able to find out what client version they are using yet,
>>> but I suspect it's quite old.
>>
>> It would definitely be interesting to find out what the client-side code
>> is, including the openssl version being used.
>
> They have psqlODBC 9.3 and 13.2 on a 64-bit Windows platform (probably
> Windows 10). The problem occurs with both. I tried to replicate it with
> psqlODBC 13.0 and couldn't.

Just to be clear that is 9.03 and 13.02.

> OpenSSL is bundled with psqlODBC, AFAICS. For 13.0 that would be 1.1.1.8
> (hmm, that doesn't look like an OpenSSL version. Would that be 1.1.1h in
> Linux terms?). But the fact that it happens with two very different
> versions but I can't replicate it with one in between (and quite close

The only thing I could find that relates to SSL directly in the move from
13.00 --> 13.02 is:

https://github.com/postgresql-interfaces/psqlodbc/commit/4eaa430e3ef06162a5655dc7baff9be40f2e1cce

Which happened at 13.01.

> to one of them) makes it unlikely (IMHO) to be just version related.
> There must be something else going on.
>
> I've asked them (their IT department, not the user) to try to disable any
> anti-virus software.

That's a bit of a reach, but always a possibility if something behaves
weirdly on Windows.

> hp

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Long running query causing XID limit breach
On 5/25/24 22:56, David HJ wrote:
> anyone know how to describe from this mailing list?

See here:

https://lists.postgresql.org/unsubscribe/

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Json table/column design question
On 5/23/24 09:06, Skorpeo Skorpeo wrote:
> Thank you for the valuable feedback. I see people are big fans of json here.

You can be a fan of JSON and still think it is not the correct way to store
data in a relational database. When you do that you often end up with
nested, possibly unstructured, data sets. That ends up with the exertion of
more effort to get the data out in any meaningful form and in a timely
manner than the time and effort it would take to enter it in a structured
way.

> On Thu, May 23, 2024 at 3:04 AM Laurenz Albe <laurenz.a...@cybertec.at> wrote:
>> On Wed, 2024-05-22 at 22:38 -0500, Skorpeo Skorpeo wrote:
>>> I was wondering if having unrelated columns in a table is a sound
>>> approach when using json. In other words, if I have two collections of
>>> unrelated json objects, for example "Users" and "Inventory", would it be
>>> ok to have one table with a "Users" column and a "Inventory" column? My
>>> concern is that from a row perspective the columns could be different
>>> lengths, such as more inventory items as users. And for any given row
>>> the data in one column would have no relation to another column. I would
>>> only query a single column at a time.
>>>
>>> Would this approach be ok or are there pitfalls such that it would be
>>> advantageous/recommended to have a separate table for each column?
>>
>> It doesn't matter much if you use one or two columns. But the word
>> "collection" makes me worry. Perhaps this article can give you some ideas:
>>
>> https://www.cybertec-postgresql.com/en/json-postgresql-how-to-use-it-right/
>>
>> Yours,
>> Laurenz Albe

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Restore of a reference database kills the auto analyze processing.
On 5/23/24 06:01, HORDER Philip wrote:
> Classified as: {OPEN}
>
> I actually hadn't thought to look for Postgres bugs, the system is so
> reliable for us, it just doesn't go wrong! I can trigger the fault by
> running a reload of the lfm database, and we've been running Postgres 10 &
> 13 for several years now without seeing this problem. Our symptoms match
> that bug description exactly, so I'm pretty confident this is the problem.
>
> I'll have to push through the right paperwork to get a Postgres update,
> and that will take a while

While you are doing the paperwork make sure it is for an upgrade to 15.7.
Also might mention that minor upgrades (x in X.x) are a good idea as
Postgres, like any software, does get bugs and the minor releases fix them.
Point the folks in charge at:

https://www.postgresql.org/docs/15/release.html

Also minor releases as a rule do not require dump/restore or pg_upgrade or
other heavy lifting. There have been exceptions, but they are rare.

> Thank you so much for your time.
>
> Phil Horder
> Database Mechanic
> Thales Land & Air Systems

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Restore of a reference database kills the auto analyze processing.
On 5/22/24 08:55, HORDER Philip wrote:
> Classified as: {OPEN}
>
>> https://www.postgresql.org/docs/15/release-15-5.html#id-1.11.6.7.4
>>
>> Fix race condition in database dropping that could lead to the autovacuum
>> launcher getting stuck
>
> Wow, that sounds like our problem!

Another thought I had is that since you are on Windows maybe there is an AV
program seeing the autovacuum as virus-like and killing it. Though I would
expect there to be more errors in that case. Anyway it might be worth
investigating should an upgrade not fix it.

> I will investigate. Maybe try and find the orphaned stats entry to prove it.

It would be a good idea to upgrade to 15.7 in any case to get all the other
bug/security fixes.

> Phil Horder
> Database Mechanic
> Thales Land & Air Systems

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Restore of a reference database kills the auto analyze processing.
On 5/22/24 01:33, HORDER Philip wrote:
> Classified as: {OPEN}
>
>> 2) There is a round of autovacuum immediately after the lfm is restored.
>
> Yes, some tables in the lfm database, but not all, an apparently random
> selection, anywhere between 2 and 21 tables, across the lfm schemas,
> public & pg_catalog.
>
>> 3) autovacuum then goes silent.
>
> Yes. Dead in a ditch. But with no errors.
>
>> 4) Before the next drop/create lfm you restart the Postgres server and
>> autovacuum starts again.
>
> I haven't restarted in a week, and the pattern remains, with a bit of
> analyze at each reload of lfm, and then nothing.
>
>> What is in the logs when you do the restart?
>
> Nothing notable:
> 1) denied connections, while restarting
> 2) authorized connections
> 3) auto analyze going into overdrive: See below
>
>> Is there some process that runs shortly after the drop/create lfm cycle?
>
> Not that I can see.

I was hoping more coffee would lead to enlightenment, it did not. It did
lead me to do what I should have done at the start, which is look at the
release notes for 15.x. You are on Postgres 15.3 and current is 15.7. On the
path from .5 --> .7 is:

https://www.postgresql.org/docs/15/release-15-5.html#id-1.11.6.7.4

Fix race condition in database dropping that could lead to the autovacuum
launcher getting stuck (Andres Freund, Will Mortensen, Jacob Speidel)

The race could lead to a statistics entry for the removed database remaining
present, confusing the launcher's selection of which database to process.

> Phil Horder
> Database Mechanic

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: search_path and SET ROLE
On 5/22/24 07:27, Ron Johnson wrote:
> PG 9.6.24 (Soon, I swear!)
>
> It seems that the search_path of the role that you SET ROLE to does not
> become the new search_path. Am I missing something, or is that PG's
> behavior?
>
> AS USER postgres
>
> $ psql -h 10.143.170.52 -Xac "CREATE ROLE dbagrp SUPERUSER INHERIT NOLOGIN;"
> CREATE ROLE dbagrp SUPERUSER INHERIT NOLOGIN;
> CREATE ROLE
>
> $ psql -h 10.143.170.52 -Xac "CREATE USER rjohnson IN GROUP dbagrp INHERIT;"
> CREATE USER rjohnson IN GROUP dbagrp INHERIT;
> CREATE ROLE
>
> [postgres@FISPMONDB001 ~]$ psql -h 10.143.170.52 -Xac "CREATE USER \"11026270\" IN GROUP dbagrp INHERIT PASSWORD '${NewPass}' VALID UNTIL '2024-06-30 23:59:59';"
> CREATE USER "11026270" IN GROUP dbagrp INHERIT PASSWORD 'linenoise' VALID UNTIL '2024-06-30 23:59:59';
> CREATE ROLE
>
> $ psql -h 10.143.170.52 -Xac "ALTER ROLE dbagrp set search_path = dbagrp, public, dba, cds, tms;"
> ALTER ROLE dbagrp set search_path = dbagrp, public, dba, cds, tms;
> ALTER ROLE
>
> AS USER rjohnson
>
> [rjohnson@fpslbxcdsdbppg1 ~]$ psql -dCDSLBXW
> psql (9.6.24)
> Type "help" for help.
>
> CDSLBXW=> SET ROLE dbagrp;
> SET
> CDSLBXW=# SHOW SEARCH_PATH;
>    search_path
> -----------------
>  "$user", public
> (1 row)
>
> Back to user postgres =
>
> $ psql -h 10.143.170.52 -Xac "ALTER ROLE rjohnson set search_path = dbagrp, public, dba, cds, tms;"
> ALTER ROLE rjohnson set search_path = dbagrp, public, dba, cds, tms;
> ALTER ROLE
>
> Back to user rjohnson =
>
> [rjohnson@fpslbxcdsdbppg1 ~]$ psql -dCDSLBXW
> psql (9.6.24)
> Type "help" for help.
>
> CDSLBXW=> SET ROLE dbagrp;
> SET
> CDSLBXW=# SHOW SEARCH_PATH;
>           search_path
> -------------------------------
>  dbagrp, public, dba, cds, tms
> (1 row)

https://www.postgresql.org/docs/current/sql-alterrole.html

"Whenever the role subsequently starts a new session, the specified value
becomes the session default, overriding whatever setting is present in
postgresql.conf or has been received from the postgres command line. This
only happens at login time; executing SET ROLE or SET SESSION AUTHORIZATION
does not cause new configuration values to be set.

Settings set for all databases are overridden by database-specific settings
attached to a role. Settings for specific databases or specific roles
override settings for all roles."

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Restore of a reference database kills the auto analyze processing.
On 5/21/24 13:44, HORDER Philip wrote:
> Classified as: {OPEN}
>
>>> 2024-05-15 03:31:31.290 GMT [4556]: [3-1] db=lfm,user=superuser,app=[unknown],client=::1 LOG: connection authorized: user=superuser database=lfm application_name=pg_restore
>>
>> That would be the lfm database being restored. What does the log show
>> after that as pertains to autovacuum?
>
> Yep, pg_restore recreates the dropped lfm database. And after that
> nothing. The log just holds connection requests, and a checkpoint every
> hour. That's it. No "automatic vacuum", or "automatic analyze" anywhere.
> And nothing any day since then, for a week.

Just for confirmation your settings are still?:

autovacuum_max_workers = 10
log_autovacuum_min_duration = 0

You said previously:

"The only way I can find of getting the analyzer back is to restart
Postgres."

To be clear this means:

1) The lfm database is dropped/created.

2) There is a round of autovacuum immediately after the lfm is restored.

3) autovacuum then goes silent.

4) Before the next drop/create lfm you restart the Postgres server and
autovacuum starts again.

What is in the logs when you do the restart?

Is there some process that runs shortly after the drop/create lfm cycle?

> Phil Horder
> Database Mechanic

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Restore of a reference database kills the auto analyze processing.
On 5/21/24 9:29 AM, HORDER Philip wrote:
> Classified as: {OPEN}
>
>> I am having a hard time figuring out how both of the above can be true.
>> Dropping and recreating the database would wipe out the statistics.
>
> We have multiple databases on the one Postgres server. The 'postgres'
> database contains our main application, with tables in schema 'a'. We have
> two other databases, one of which is 'lfm'. In PgAdmin, there is Databases
> (3), then those database names. As the superuser, I can connect to any of
> these databases.
>
> When we drop database lfm, and recreate it using pg_restore, the stats
> collection dies, for all databases on the server: 'postgres', 'lfm', 'lfm2'
>
> I haven't dropped database 'postgres'
> So the existing stats on tables in the 'postgres' database remain, but no
> further stats are collected.
>
> From int-postgres-log-21-05-2024.txt:
>
> 2024-05-15 03:31:31.290 GMT [4556]: [3-1] db=lfm,user=superuser,app=[unknown],client=::1 LOG: connection authorized: user=superuser database=lfm application_name=pg_restore

That would be the lfm database being restored. What does the log show after
that as pertains to autovacuum?

> Phil Horder
> Database Mechanic
> Thales Land & Air Systems
>
> {OPEN}
> The information contained in this e-mail is confidential. It is intended
> only for the stated addressee(s) and access to it by any other person is
> unauthorised. If you are not an addressee, you must not disclose, copy,
> circulate or in any other way use or rely on the information contained in
> this e-mail. Such unauthorised use may be unlawful. If you have received
> this e-mail in error, please inform the originator immediately and delete
> it and all copies from your system.
>
> Thales UK Limited. A company registered in England and Wales. Registered
> Office: 350 Longwater Avenue, Green Park, Reading, Berks RG2 6GF.
> Registered Number: 868273
>
> Please consider the environment before printing a hard copy of this e-mail.

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Restore of a reference database kills the auto analyze processing.
On 5/21/24 9:29 AM, HORDER Philip wrote:
> Classified as: {OPEN}
>
>> I am having a hard time figuring out how both of the above can be true.
>> Dropping and recreating the database would wipe out the statistics.
>
> We have multiple databases on the one Postgres server. The 'postgres'
> database contains our main application, with tables in schema 'a'. We have
> two other databases, one of which is 'lfm'.

My mistake, I was over concentrating on the log messages and did not pay
attention to the psql prompt which was showing postgres as the database.

> Phil Horder
> Database Mechanic
> Thales Land & Air Systems

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Restore of a reference database kills the auto analyze processing.
On 5/21/24 06:00, HORDER Philip wrote:
> Classified as: {OPEN}
>
>> Assuming clean shutdowns the statistics will survive restarts. They would
>> be wiped when you drop a database and start over, have an unclean
>> shutdown or you use one of the reset functions...
>
> Yes, stats are permanent, but are not being updated. We don't use any of
> the pg_stat_reset functions.
>
> I've left the system alone over the weekend. Here's the timeline:
>
> 14th May: Postgres working ok, 1767 log entries for "automatic analyze",
> mostly in database postgres.
> 03:30 Jenkins deployed an update, resulting in reload of lfm database.

This is where I am getting confused. In your original post you had:

"To apply an update, we:
stop the applications that use LFM,
set the user (LFU) to NOLOGIN
kill any left-over connections:
select pg_terminate_backend(pg_stat_activity.pid) FROM pg_stat_activity WHERE pg_stat_activity.datname = 'lfm' and usename = 'lfu';
drop the existing reference database using the dropDb utility.
reload the new file using pg_restore and the postgres super user.
set the user (LFU) to LOGIN"

In other words DROP DATABASE then CREATE DATABASE and reload the schema
objects and associated data. Yet your int stats output.txt file has things
like the following:

-[ RECORD 3 ]-------+------------------------------
relid               | 923130055
schemaname          | a
relname             | cr_pt_e_20240502_20240503
seq_scan            | 1264
seq_tup_read        | 8800722491
idx_scan            | 4601405
idx_tup_fetch       | 4415621
n_tup_ins           | 3851400
n_tup_upd           | 15790
n_tup_del           | 0
n_tup_hot_upd       | 0
n_live_tup          | 7166325
n_dead_tup          | 0
n_mod_since_analyze | 0
n_ins_since_vacuum  | 0
last_vacuum         |
last_autovacuum     | 2024-05-03 09:03:44.810654+00
last_analyze        | 2024-05-09 08:44:37.725674+00
last_autoanalyze    | 2024-05-03 09:03:58.838664+00
vacuum_count        | 0
autovacuum_count    | 1
analyze_count       | 1
autoanalyze_count   | 1

I am having a hard time figuring out how both of the above can be true.
Dropping and recreating the database would wipe out the statistics.

> Summary: since the reload of lfm database on 15th May, the only "automatic
> analyze" events logged have been for the lfm database, at the point of
> reload. No other stats analyze seems to have taken place, on any database
> on the server since that point, even partitions with over a million rows.

The above is confusing also. In your original post you only referred to the
postgres and lfm databases. What other databases are you referring to? As to
partitions, are you referring to partitions of tables in the lfm database or
something else?

> Apart from that, Postgres appears to be working normally. I'm sure that
> another restart of Postgres will restore the auto analyze, until the next
> pg_restore of lfm.
>
> So what's going on? How is it that we're breaking this important function
> that we shouldn't be able to have any effect on?
>
> Thanks for looking,
>
> Phil Horder
> Database Mechanic
> Thales Land & Air Systems

--
Adrian Klaver
adrian.kla...@aklaver.com
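As an aside for anyone following along: one way to verify that autovacuum has really gone quiet, rather than merely stopped logging, is to query the cumulative statistics views directly. A sketch (run in each database of interest; the ORDER BY choice is just for convenience):

```sql
-- Per-table activity: if autovacuum is stuck, these timestamps stop
-- advancing even as n_mod_since_analyze keeps growing.
SELECT relname,
       n_mod_since_analyze,
       last_autovacuum,
       last_autoanalyze
FROM pg_stat_user_tables
ORDER BY last_autoanalyze DESC NULLS LAST;

-- The launcher process itself should show up here when it is alive.
SELECT pid, backend_type, state
FROM pg_stat_activity
WHERE backend_type LIKE 'autovacuum%';
```

If the second query returns no 'autovacuum launcher' row, that points at the stuck-launcher bug fixed in 15.5 rather than at a logging problem.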
Re: Left join syntax error
On 5/18/24 08:04, Rich Shepard wrote:
> On Sat, 18 May 2024, Adrian Klaver wrote:
>> ... LEFT JOIN companies as c ON c.company_nbr = p.company_nbr;
>
> Adrian,
>
> Tried that:
>
> bustrac-# LEFT JOIN companies as c ON c.company_nbr = p.company_nbr;
> ERROR:  table name "c" specified more than once

Show the complete query. Take the error message as correct, you are
specifying 'companies as c' more than once.

> Thanks,
> Rich

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Left join syntax error
On 5/18/24 08:01, Rich Shepard wrote:
> On Sat, 18 May 2024, Ray O'Donnell wrote:
>> You need to include the alias for the table also - see "...from companies
>> as c..." in Shammat's example.
>
> Ray,
>
> That didn't work:
>
> bustrac-# FROM people as p, companies as c
> bustrac-# LEFT JOIN companies as c ON c.company_nbr = p.company_nbr;
> ERROR:  table name "c" specified more than once
>
> so I tried only the alias on the join line:
>
> bustrac-# LEFT JOIN c ON c.company_nbr = p.company_nbr;
> ERROR:  relation "c" does not exist
> LINE 3: LEFT JOIN c ON c.company_nbr = p.company_nbr;
>                   ^
> and that didn't work either.

The query needs to be:

SELECT p.lname, p.fname, p.job_title, p.company_nbr, p.email, c.company_name
FROM people as p
LEFT JOIN companies as c ON c.company_nbr = p.company_nbr;

Only reference companies as c once.

> Thanks,
> Rich

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: utf8 vs UTF-8
On 5/18/24 07:48, Troels Arvin wrote:
> Hello,
>
> Tom Lane wrote:
>>> test1 | loc_test | UTF8 | libc | en_US.UTF-8 | en_US.UTF-8
>>> test3 | troels   | UTF8 | libc | en_US.utf8  | en_US.utf8
>>
>> On most if not all platforms, both those spellings of the locale names
>> will be taken as valid. You might try running "locale -a" to get an idea
>> of which one is preferred according to your current libc installation
>
> "locale -a" on the Ubuntu system outputs this:
>
> C
> C.utf8
> en_US.utf8
> POSIX

If you expand that to locale -v -a you get:

locale: en_US.utf8
archive: /usr/lib/locale/locale-archive
-------------------------------------------------------------------
     title | English locale for the USA
    source | Free Software Foundation, Inc.
   address | https://www.gnu.org/software/libc/
     email | bug-glibc-loca...@gnu.org
  language | American English
 territory | United States
  revision | 1.0
      date | 2000-06-24
   codeset | UTF-8

> So at first, I thought en_US.utf8 would be the most correct locale
> identifier. However, when I look at Postgres' own databases, they have the
> slightly different locale string:
>
> psql --list | grep -E 'postgres|template'
>  postgres  | postgres | UTF8 | libc | en_US.UTF-8 | en_US.UTF-8 | ...
>  template0 | postgres | UTF8 | libc | en_US.UTF-8 | en_US.UTF-8 | ...
>  template1 | postgres | UTF8 | libc | en_US.UTF-8 | en_US.UTF-8 | ...
>
> Also, when I try to create a database with "en_US.utf8" as locale without
> specifying a template:
>
> troels=# create database test4 locale 'en_US.utf8';
> ERROR:  new collation (en_US.utf8) is incompatible with the collation of
> the template database (en_US.UTF-8)
> HINT:  Use the same collation as in the template database, or use
> template0 as template.

I'm going to say that is Postgres being exact to a fault.

> Given the locale of Postgres' own databases and Postgres' error message,
> I'm leaning to en_US.UTF-8 being the most correct locale to use. Because
> why would Postgres care about it, if utf8/UTF-8 doesn't matter?
>
>> but TBH, I doubt it's worth worrying about.
>
> But couldn't there be an issue, if for example the client's locale and the
> server's locale aren't exactly the same? I'm thinking maybe the client
> library has to perform unneeded translation of the stream of data to/from
> the database?

--
Adrian Klaver
adrian.kla...@aklaver.com
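As a side note on the naming question: the two spellings are aliases for the same locale at the libc level. A quick illustration using Python's locale module (which applies glibc-style alias tables; this says nothing about how Postgres itself compares the strings):

```python
import locale

# Both spellings of the codeset normalize to the same canonical name
# under the standard locale alias rules.
print(locale.normalize("en_US.utf8"))    # 'en_US.UTF-8'
print(locale.normalize("en_US.UTF-8"))   # 'en_US.UTF-8'
```

Which is why the installed data is identical either way; the incompatibility error above comes from Postgres comparing the locale strings textually, not from any real difference between the locales.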
Re: Left join syntax error
On 5/18/24 07:46, Rich Shepard wrote:
> On Sat, 18 May 2024, Shammat wrote:
>> Don't put the second table in the FROM part
>>
>> SELECT p.lname, p.fname, p.job_title, p.company_nbr, p.email, c.company_name
>> FROM people as p
>>   LEFT JOIN companies as c ON c.company_nbr = p.company_nbr
>
> Shammat,
>
> I tried this with this result:
>
> ERROR:  missing FROM-clause entry for table "c"
> LINE 3: LEFT JOIN companies ON c.company_nbr = p.company_nbr;

... LEFT JOIN companies as c ON c.company_nbr = p.company_nbr;

> Thanks for the suggestion.
>
> Regards,
> Rich

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Valid until
On 5/18/24 03:09, Rama Krishnan wrote:

Reply to list also. Ccing list

> Hi Adrian,
>
> I have modified the pg_hba entry from trust to md5 like below
>
> ```
> local all all md5
> ```

That would be the issue. trust ignores the password check.

> When i have tired with postgres user I am able to connect

Which is expected as postgres does not have a 'valid until' restriction.

> [postgres@postgres16 data]$ psql -U postgres -d postgres
> Password for user postgres:
> psql (16.2)
> Type "help" for help.
>
> postgres=# \du
>                          List of roles
>  Role name  |                  Attributes
> ------------+------------------------------------------------------------
>  pgbackrest | Replication
>  postgres   | Superuser, Create role, Create DB, Replication, Bypass RLS
>  test       | Password valid until 2023-05-13 00:00:00+00
>  user_name  | Password valid until 2024-05-13 00:00:00+00
>
> But when i tried with test or user_name user even though I am passing the
> correct value I am getting this error

Again as expected as the 'valid until' timestamp is in the past.

> ```
> [postgres@postgres16 data]$ psql -U test -d postgres
> Password for user test:
> psql: error: connection to server on socket "/run/postgresql/.s.PGSQL.5432" failed: FATAL: password authentication failed for user "test"
>
> postgres=# \c postgres user_name
> Password for user user_name:
> connection to server on socket "/run/postgresql/.s.PGSQL.5432" failed: FATAL: password authentication failed for user "user_name"
> ```
>
> Once i done the changes the valid until expiration date
>
> ```
> postgres=# alter user test VALID UNTIL '2024-05-19';
> ALTER ROLE
> postgres=> \du
>                          List of roles
>  Role name  |                  Attributes
> ------------+------------------------------------------------------------
>  pgbackrest | Replication
>  postgres   | Superuser, Create role, Create DB, Replication, Bypass RLS
>  test       | Password valid until 2024-05-19 00:00:00+00
>  user_name  | Password valid until 2024-05-13 00:00:00+00
> ```
>
> Finally it allows to connect test

Which is correct as the 'valid until' timestamp is in the future.

> ```
> [postgres@postgres16 data]$ psql -d postgres -U test
> Password for user test:
> psql (16.2)
> ```
>
> I believe this is a expected output of validunitl , Please correct me if i
> m wrong

The behavior is as referenced in the documentation:

https://www.postgresql.org/docs/current/sql-createrole.html

"VALID UNTIL 'timestamp'

The VALID UNTIL clause sets a date and time after which the role's password
is no longer valid. If this clause is omitted the password will be valid for
all time."

> Regards
> A.Rama Krishnan

--
Adrian Klaver
adrian.kla...@aklaver.com
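For the archives, the whole exchange boils down to a short sequence that can be reproduced on any server using password authentication (role name `demo` is hypothetical; run as a superuser):

```sql
-- Password already expired: password authentication for demo will fail.
CREATE ROLE demo LOGIN PASSWORD 'secret' VALID UNTIL '2023-01-01';

-- Move the expiry into the future: password authentication succeeds again.
ALTER ROLE demo VALID UNTIL '2030-01-01';

-- Inspect the current expiry timestamp.
SELECT rolname, rolvaliduntil FROM pg_authid WHERE rolname = 'demo';
```

Note that VALID UNTIL only governs password validity, so under trust (or peer, cert, etc.) authentication the expiry is never consulted, which is why the original test under trust appeared to ignore it.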
Re: Valid until
On 5/18/24 02:37, Rama Krishnan wrote: Reply to list also. Ccing list.

Please find the details below:

```
postgres=# \du
                                List of roles
 Role name  |                         Attributes
------------+------------------------------------------------------------
 pgbackrest | Replication
 postgres   | Superuser, Create role, Create DB, Replication, Bypass RLS
 test       | Password valid until 2023-05-13 00:00:00+00
 user_name  | Password valid until 2024-05-13 00:00:00+00

postgres=# show timezone;
 TimeZone
----------
 UTC
(1 row)

postgres=# select version();
                                                version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 16.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-20), 64-bit
(1 row)

postgres=# \q
[postgres@postgres16 log]$ psql -U username -d postgres
psql: error: connection to server on socket "/run/postgresql/.s.PGSQL.5432" failed: FATAL: role "username" does not exist
[postgres@postgres16 log]$ psql -U user_name -d postgres
psql (16.2)
Type "help" for help.
```

Regards A.Rama Krishnan

On Tue, May 14, 2024 at 8:57 PM Adrian Klaver <adrian.kla...@aklaver.com> wrote: On 5/14/24 00:57, Rama Krishnan wrote:
> Hi team,
>
> I have a question about the "valid until" parameter in the "create role" command.
>
> I recently set a user's password validity to "2024-05-13", but despite this, the user is still able to connect to the database. Is this a bug, or is the "valid until" parameter just for identification purposes, indicating when the password will expire?
>
> Could you please clarify this for me?

https://www.postgresql.org/docs/current/sql-createrole.html

"VALID UNTIL 'timestamp'

The VALID UNTIL clause sets a date and time after which the role's password is no longer valid. If this clause is omitted the password will be valid for all time."

When did the user log in? What is the TimeZone setting in the database?

> Thanks
>
> Ram

-- Adrian Klaver adrian.kla...@aklaver.com
Re: Restore of a reference database kills the auto analyze processing.
On 5/16/24 08:59, HORDER Philip wrote: Classified as: {OPEN} Adrian, Still your contention was that autovacuum quit running after the initial restore and that is not the case This Postgres server has been restarted a few times since 2nd May most recently on Tuesday 14th, hence the more recent analyze status. Assuming clean shutdowns the statistics will survive restarts. They would be wiped when you drop a database and start over, have an unclean shutdown or you use one of the reset functions from here: https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-STATS-FUNCTIONS 28.2.25. Statistics Functions We've had some problems with our data feeds on this integration system, but these are now running again. I'm planning to leave it all alone until I'm back in the office on Tuesday, and run this query again for a few tables and send you an update. I'm expecting no further stats analysis, (and the performance to be appalling). From here: https://www.postgresql.org/docs/current/runtime-config-logging.html log_autovacuum_min_duration In addition, when this parameter is set to any value other than -1, a message will be logged if an autovacuum action is skipped due to a conflicting lock or a concurrently dropped relation. Thanks for your time. Phil Horder Database Mechanic Thales Land & Air Systems -- Adrian Klaver adrian.kla...@aklaver.com
Re: Restore of a reference database kills the auto analyze processing.
On 5/16/24 07:38, HORDER Philip wrote: Classified as: {OPEN}

Did you have a chance to do the below?

Using psql do
\x
select * from pg_stat_all_tables where relname = 'a.accp';

Sorry, missed that bit. From this output you can see that no stats have been collected since the last two overnight updates.

Still your contention was that autovacuum quit running after the initial restore, and that is not the case. Assuming this is the instance you restored on May 2 2024, then approximate counts are:

autovacuum_count 1888/13 days = 145 autovacuums/day = 6/hour
autoanalyze_count 1715/13 = 132 autoanalyzes/day = 5.5/hour

```
postgres=# select * from pg_stat_all_tables where relname='accp';
-[ RECORD 1 ]-------+------------------------------
relid               | 811486381
schemaname          | a
relname             | accp
seq_scan            | 1654767
seq_tup_read        | 901811880
idx_scan            | 146070383
idx_tup_fetch       | 305949969
n_tup_ins           | 2056948
n_tup_upd           | 0
n_tup_del           | 1662699
n_tup_hot_upd       | 0
n_live_tup          | 294
n_dead_tup          | 75162
n_mod_since_analyze | 0
n_ins_since_vacuum  | 75162
last_vacuum         |
last_autovacuum     | 2024-05-15 03:25:16.007386+00
last_analyze        |
last_autoanalyze    | 2024-05-15 03:25:16.008873+00
vacuum_count        | 0
autovacuum_count    | 1888
analyze_count       | 19836
autoanalyze_count   | 1715
```

Phil Horder Database Mechanic Thales Land & Air Systems

{OPEN} The information contained in this e-mail is confidential. It is intended only for the stated addressee(s) and access to it by any other person is unauthorised. If you are not an addressee, you must not disclose, copy, circulate or in any other way use or rely on the information contained in this e-mail. Such unauthorised use may be unlawful. If you have received this e-mail in error, please inform the originator immediately and delete it and all copies from your system. Thales UK Limited. A company registered in England and Wales. Registered Office: 350 Longwater Avenue, Green Park, Reading, Berks RG2 6GF. Registered Number: 868273 Please consider the environment before printing a hard copy of this e-mail.
-- Adrian Klaver adrian.kla...@aklaver.com
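The per-day and per-hour figures in the message above are simple divisions over the roughly 13 days between the May 2 restore and the May 15 `last_autovacuum` timestamp; a quick check of the arithmetic:

```python
# Counts taken from the pg_stat_all_tables output in the thread.
days = 13  # May 2 2024 restore -> May 15 2024 stats

autovacuum_count = 1888
autoanalyze_count = 1715

print(round(autovacuum_count / days))           # 145 autovacuums/day
print(round(autovacuum_count / days / 24))      # 6 per hour
print(round(autoanalyze_count / days))          # 132 autoanalyzes/day
print(round(autoanalyze_count / days / 24, 1))  # 5.5 per hour
```

These per-table rates are what make the "autovacuum stopped entirely" contention hard to square with the statistics shown.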
Re: Restore of a reference database kills the auto analyze processing.
On 5/16/24 03:08, HORDER Philip wrote: Classified as: {OPEN} --oids have not been supported with pg_dump since v11 You're absolutely correct, this command came from my notes, which are obviously out of date. We're running the Postgres 15 pg_dump, and I've updated my notes. I'm double-checking the command options with the guy who creates these files, but he's on holiday. I expect we just dropped the OID option. FYI, --blobs is only needed when dumping a specific table or schema, otherwise it happens by default. Thanks, I didn't know that. Did you have a chance to do the below? > From yesterday, current stats for table a.accp, from pg_stat_all_tables: >"811486381""airscape""accp"16458538988177871456553503047055581967135016364880294000"2024-05-14 14:51:37.158892+00""2024-05-09 08:27:45.328468+00""2024-05-14 13:15:31.999198+00"01815170101653 I can't parse this out well enough to really say anything. Though the timestamps seem to indicate recent activity. Using psql do \x select * from pg_stat_all_tables where relname = 'a.accp'; to get an extended output that is easier to read. Phil Horder Database Mechanic Thales Land & Air Systems -- Adrian Klaver adrian.kla...@aklaver.com
Re: Restore of a reference database kills the auto analyze processing.
On 5/15/24 01:08, HORDER Philip wrote: Classified as: {OPEN} Backups of this db are created with:

```
pg_dump --file=fsm.dmp -Fc --blobs --oids --dbname=lfm --host=localhost --port= --username=superuser
```

From your original post: "Running Postgres 15.3 ..." --oids have not been supported with pg_dump since v11. When I try it:

```
pg_dump -V
pg_dump (PostgreSQL) 15.7 (Ubuntu 15.7-1.pgdg22.04+1)
pg_dump -d test -U postgres --oids -f test.sql
/usr/lib/postgresql/15/bin/pg_dump: unrecognized option '--oids'
```

So what version of pg_dump are you using? Or are you not working on a v15 instance of Postgres? FYI, --blobs is only needed when dumping a specific table or schema, otherwise it happens by default. Restore is run with:

```
dropdb --port= --maintenance-db=postgres --username=superuser --if-exists lfm
pg_restore -Fc --create --dbname=postgres --port= --username=superuser fsm.dmp
```

2) From earlier post: '... only analyzes tables in the new db at the point of reload, then shuts off again.' Provide that sequence of events from the Postgres log. Log file extract is attached, with object names obfuscated. Still working my way through that. From yesterday, current stats for table a.accp, from pg_stat_all_tables: "811486381""airscape""accp"16458538988177871456553503047055581967135016364880294000"2024-05-14 14:51:37.158892+00""2024-05-09 08:27:45.328468+00""2024-05-14 13:15:31.999198+00"01815170101653 I can't parse this out well enough to really say anything. Though the timestamps seem to indicate recent activity. Using psql do \x select * from pg_stat_all_tables where relname = 'a.accp'; to get an extended output that is easier to read. This table has a low row count, but high content turnover. It usually gets auto-analyzed every minute. For today, this hasn't been auto analysed since the update at 3am. 4) The autovacuum settings you have in effect: vacuum_cost_limit = 2000 log_autovacuum_min_duration = 0 autovacuum_max_workers = 10 all other vacuum settings are defaults. 
Phil Horder Database Mechanic Thales Land & Air Systems -- Adrian Klaver adrian.kla...@aklaver.com
Re: Valid until
On 5/14/24 00:57, Rama Krishnan wrote: Hi team, I have a question about the "valid until" parameter in the "create role" command. I recently set a user's password validity to "2024-05-13", but despite this, the user is still able to connect to the database. Is this a bug, or is the "valid until" parameter just for identification purposes, indicating when the password will expire? Could you please clarify this for me? https://www.postgresql.org/docs/current/sql-createrole.html " VALID UNTIL 'timestamp' The VALID UNTIL clause sets a date and time after which the role's password is no longer valid. If this clause is omitted the password will be valid for all time. " When did the user log in? What is the TimeZone setting in the database? Thanks Ram -- Adrian Klaver adrian.kla...@aklaver.com
Re: UTC is not a time zone?
On 5/13/24 11:18 AM, Christophe Pettus wrote: On May 13, 2024, at 11:17, Tom Lane wrote: What's causing that I can't say. It doesn't look like we log the errno anywhere when failing to read a zone file :-( File descriptor exhaustion? (Of course, that would mean something somewhere is leaking them, which is another problem.) If I am understanding this correctly: "The parent process would long since have cached the zone data in its memory, but this error is in a parallel worker process, which'd have to read the file for itself during startup. " Then this: "As an experiment, I'm just pounding the server with a single connection doing nothing but SET TIMEZONEs repeatedly. So far, no break, but it is *very* intermittent." May not induce the error unless there are parallel workers involved. -- Adrian Klaver adrian.kla...@aklaver.com
Re: UTC is not a time zone?
On 5/13/24 11:02 AM, Christophe Pettus wrote: On May 13, 2024, at 10:58, Adrian Klaver wrote: You sure the timezone file did not get changed under the connection? Yes (at least, nothing happened on the system that would indicate that). The system wasn't touched during the execution (and, as noted, it worked after as well as before). Hmm. What does pg_config --configure show for '--with-system-tzdata'? Why is SET TIMEZONE TO 'UTC' being constantly called? ORM noise. -- Adrian Klaver adrian.kla...@aklaver.com
Re: UTC is not a time zone?
On 5/13/24 10:50 AM, Christophe Pettus wrote: On May 13, 2024, at 10:48, Adrian Klaver wrote: Yes: https://www.postgresql.org/message-id/5DF49366-10D1-42A4-99BF-F9A7DC3AB0F4%40mailbox.org Answer: https://www.postgresql.org/message-id/1273542.1712326418%40sss.pgh.pa.us Thanks! Similar, but I don't think it's that. This was a sudden change in a long-running connection that had issued a ton of `SET TIMEZONE TO 'UTC';` before the failure. You sure the timezone file did not get changed under the connection? Why is SET TIMEZONE TO 'UTC' being constantly called? -- Adrian Klaver adrian.kla...@aklaver.com
Re: UTC is not a time zone?
On 5/13/24 10:37 AM, Christophe Pettus wrote: PostgreSQL 16.3 on MacOS Sonoma. A long-running process (which held a connection open the entire time) failed with: 2024-05-13 09:12:44.719 PDT,"cyan","cyan",35926,"[local]",664214f9.8c56,3,"SELECT",2024-05-13 06:26:17 PDT,3/60,0,ERROR,22023,"invalid value for parameter ""TimeZone"": ""UTC""","while setting parameter ""TimeZone"" to ""UTC"" parallel worker"," select count(1), count(1) filter (where visited > 0) from framework_seenchoice ",,,"","client backend",,0 It's not (easily) repeatable, and the system was not touched while the process was running (no installing new binaries, etc.). Does this look familiar to anyone? Yes: https://www.postgresql.org/message-id/5DF49366-10D1-42A4-99BF-F9A7DC3AB0F4%40mailbox.org Answer: https://www.postgresql.org/message-id/1273542.1712326418%40sss.pgh.pa.us -- Adrian Klaver adrian.kla...@aklaver.com
Re: Unexpected data when subscribing to logical replication slot
On 5/9/24 00:32, Daniel McKenzie wrote: We've had this running in live now for years without a hiccup so we are surprised to learn that we have this massive race condition and it just so happens that the hardware is fast enough to process the transaction before the .NET application can react to replication slot changes.

On a broad scale I'm going to say that over 'for years' there has been an increase in load on the Postgres server as well as the I/O system of the machine it is running on. What you are seeing now is the canary in the mine giving you the heads up that more trouble lies ahead as the hardware and software are reaching load limits.

On a finer scale my guess is that the following is happening when synchronous_commit = on:

1) Postgres session #1 does a data change.
2) This is replicated out and picked up by wal2json, which sees the new data.
3) The Postgres server waits for the confirmation that the WAL record has been written out to disk. Upon confirmation it commits on the server. This is the part that I am not sure of in relation to wal2json.
4) Postgres session #2 queries the database for the record. In the case where 3) has not completed, it sees the old values, as the data change in session #1 has not committed and therefore the new values are not seen by other sessions.

Daniel McKenzie Software Developer -- Adrian Klaver adrian.kla...@aklaver.com
Re: Unexpected data when subscribing to logical replication slot
On 5/9/24 00:32, Daniel McKenzie wrote: Asynchronous commit introduces the risk of data loss. There is a short time window between the report of transaction completion to the client and the time that the transaction is truly committed. To get anywhere with this issue you will need to provide the information Tomas Vondra requested upstream: " Where/how does the enrichment query run? How does the whole process look like? I guess an application is receiving decoded changes as JSON, and then querying the database? " and " Would be good to have some sort of reproducer - ideally a script that sets up an instance + replication, and demonstrates the issue. Or at least a sufficiently detailed steps to reproduce it without having to guess what exactly you did. " The documentation speaks about synchronous_commit changing how transactions change behaviour for the client. So in this case, my psql terminal is the client, and I would expect a faster commit (from its perspective) and then a period of risk (as a process usually done as part of the commit is now being done in the background) but it's not clear how that affects a replication slot subscriber. What we're struggling to understand is: why are we seeing any updates in the replication slot before they have been "truly committed"? From: https://www.postgresql.org/docs/current/wal-async-commit.html "As described in the previous section, transaction commit is normally synchronous: the server waits for the transaction's WAL records to be flushed to permanent storage before returning a success indication to the client. The client is therefore guaranteed that a transaction reported to be committed will be preserved, even in the event of a server crash immediately after. However, for short transactions this delay is a major component of the total transaction time. 
Selecting asynchronous commit mode means that the server returns success as soon as the transaction is logically completed, before the WAL records it generated have actually made their way to disk. This can provide a significant boost in throughput for small transactions."

It is about the state of the WAL record: with synchronous_commit=on the commit will not happen on the server, and be seen by the client, until the WAL record has been recorded as saved to disk. With synchronous_commit=off the commit happens as soon as COMMIT is reached in the transaction and the WAL record save happens after that. I don't use wal2json so I don't know how it deals with the above.

There appears to be a state of limbo between updating data and that data being available to query (and our subscriber is picking up changes in this period of time) but I can't pin down any documentation which describes it.

The answer awaits the full description of the process requested by Tomas Vondra. Best guess, the fact that synchronous_commit=off 'cures' it implies that in synchronous_commit=on mode you are picking up data on the receiving and sending end at different points in '... the server waits for the transaction's WAL records to be flushed to permanent storage before returning a success indication to the client.'

We've had this running in live now for years without a hiccup so we are surprised to learn that we have this massive race condition and it just

I would not say "...perhaps 1 in 50 times" is massive.

so happens that the hardware is fast enough to process the transaction before the .NET application can react to replication slot changes.

Daniel McKenzie Software Developer -- Adrian Klaver adrian.kla...@aklaver.com
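The suspected ordering can be shown with a toy Python model. This is purely illustrative: the event names are made up, and it is not how PostgreSQL or wal2json is implemented. It only demonstrates the claim that a subscriber's enrichment query sees the new value only if the commit became visible before the query ran:

```python
def visible_value(events):
    """Toy ordering model: walk a sequence of events and report which value
    a second session sees at the moment the subscriber runs its query.
    The new value 'Smith' is visible only if the commit preceded the query."""
    committed = False
    for ev in events:
        if ev == "commit_visible":
            committed = True
        if ev == "subscriber_query":
            return "Smith" if committed else "Jones"

# The race: the change is streamed to the subscriber, which queries the
# database before the commit has become visible -> stale read.
race = ["update", "wal_streamed", "subscriber_query", "commit_visible"]
print(visible_value(race))  # Jones -- the "1 in 50" stale read

# The common case on faster hardware: the commit wins the race.
ok = ["update", "wal_streamed", "commit_visible", "subscriber_query"]
print(visible_value(ok))    # Smith
```

Which ordering occurs on a given run depends on timing, which matches the intermittent "perhaps 1 in 50 times" behaviour described in the thread.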
Re: Unexpected data when subscribing to logical replication slot
On 5/8/24 08:24, Daniel McKenzie wrote: It's running both (in docker containers) and also quite a few more docker containers running various .NET applications. I think what you found is that the r7a.medium instance is not capable enough to do all that it is asked without introducing lag under load. Answering the questions posed by Tomas Vondra would help get to the actual cause of the lag. In the meantime my suspicion is this part: "For example, when I use a psql terminal to update a user's last name from "Jones" to "Smith" then I would expect the enrichment query to find "Smith" but it will sometimes still find "Jones". It finds the old data perhaps 1 in 50 times." If this is being run against the Postgres server my guess is that synchronous_commit=on is causing the commit on the server to wait for the WAL records to be flushed to disk and this is not happening in a timely manner in the '... 1 in 50 times' you mention. In that case you see the old values, not the new committed values. This seems to be confirmed when you set synchronous_commit=off and don't see old values. For completeness per: https://www.postgresql.org/docs/current/wal-async-commit.html "However, for short transactions this delay is a major component of the total transaction time. Selecting asynchronous commit mode means that the server returns success as soon as the transaction is logically completed, before the WAL records it generated have actually made their way to disk. This can provide a significant boost in throughput for small transactions. Asynchronous commit introduces the risk of data loss. There is a short time window between the report of transaction completion to the client and the time that the transaction is truly committed (that is, it is guaranteed not to be lost if the server crashes). ... " Daniel McKenzie Software Developer -- Adrian Klaver adrian.kla...@aklaver.com
Re: Unexpected data when subscribing to logical replication slot
We have found two things that appear to resolve the problem - * Using a more powerful EC2 instance. We can reproduce the issue with a r7a.medium instance but not with a r7a.large EC2 instance. * Changing the Postgres synchronous_commit parameter from "on" to "off". We cannot reproduce the issue with synchronous_commit set to "off". What is running on the EC2 instance?: 1) The Postgres server. 2) The replication receiver. 3) Both. We need help to understand this unexpected behaviour. We are using Postgres 14.10. Thanks, Daniel McKenzie -- Adrian Klaver adrian.kla...@aklaver.com
Re: Question regarding how databases support atomicity
On 5/7/24 9:48 AM, Siddharth Jain wrote: Thanks All for the kind responses. I understand how MVCC takes care of atomicity for updates to rows. I was developing a project where, let's say, data for each table is stored in its own folder together with metadata (we are not talking Postgres now). So if I have two tables A and B I have a folder structure like:

```
A
 \_ metadata.json
B
 \_ metadata.json
```

Now if I want to rename a table, I need to move the folder and also update the metadata accordingly. These are two separate operations but they need to be done atomically - all or none. In this case it is possible that we succeed in renaming the folder but fail to update the metadata for whatever reason. Then if we try to undo the folder rename we get another failure, for whatever reason. How do we deal with such scenarios? Are there no such scenarios in Postgres?

The only thing I can think of is creating a function in one of the untrusted languages, plpython3u or plperlu, to do the renaming. Then in, say, the plpython3u case wrap the actions in a try/except block. On a failure take the appropriate undo action.

On Fri, May 3, 2024 at 8:29 PM Tom Lane <t...@sss.pgh.pa.us> wrote: "David G. Johnston" <david.g.johns...@gmail.com> writes:
> On Friday, May 3, 2024, Siddharth Jain <siddh...@gmail.com> wrote:
>> The way I understand this is that if there is a failure in-between, we start undoing and reverting the previous operations one by one.
> Not in PostgreSQL. All work performed is considered provisional until a commit succeeds. At which point all provisional work, which had been tagged with the same transaction identifier, becomes reality to the rest of the system, by virtue of marking the transaction live.

Right. We don't use UNDO; instead, we use multiple versions of database rows (MVCC). A transaction doesn't modify the contents of existing rows, but just marks them as provisionally outdated, and then inserts new row versions that are marked provisionally inserted. 
Other transactions ignore the outdate markings and the uncommitted new rows, until the transaction commits, at which time the new versions become live and the old ones become dead. If the transaction never does commit -- either through ROLLBACK or a crash -- then the old row versions remain live and the new ones are dead. In either case, we don't have a consistency or correctness problem, but we do have dead rows that must eventually get vacuumed away to prevent indefinite storage bloat. That can be done by background housekeeping processes though (a/k/a autovacuum). I believe Oracle, for one, actually does use UNDO. I don't know what they do about failure-to-UNDO. regards, tom lane -- Adrian Klaver adrian.kla...@aklaver.com
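The try/except-with-undo approach suggested earlier in the thread can be sketched in plain Python, outside the database. The folder layout and the "name" field in metadata.json are hypothetical, taken from the question; writing to a temp file and `os.replace` keeps the metadata update itself atomic, and the except branch undoes the folder rename. As the question notes, if the undo itself fails you are stuck, which is exactly the gap that WAL/MVCC closes in a real database:

```python
import json
import os
import tempfile

def rename_table(root, old, new):
    """Rename a table folder and update its metadata.json, all-or-none:
    on any failure in step 2, undo step 1 and re-raise."""
    old_dir = os.path.join(root, old)
    new_dir = os.path.join(root, new)
    os.rename(old_dir, new_dir)               # step 1: move the folder
    try:
        meta_path = os.path.join(new_dir, "metadata.json")
        with open(meta_path) as f:
            meta = json.load(f)
        meta["name"] = new                    # step 2: update metadata
        tmp_path = meta_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(meta, f)
        os.replace(tmp_path, meta_path)       # atomic swap on POSIX
    except Exception:
        os.rename(new_dir, old_dir)           # undo step 1
        raise

# demo: create table A, rename it to B
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "A"))
with open(os.path.join(root, "A", "metadata.json"), "w") as f:
    json.dump({"name": "A"}, f)

rename_table(root, "A", "B")
with open(os.path.join(root, "B", "metadata.json")) as f:
    print(json.load(f)["name"])  # B
```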
Re: Restore of a reference database kills the auto analyze processing.
On 5/7/24 08:24, Adrian Klaver wrote: On 5/7/24 02:38, HORDER Philip wrote: Thanks for your time Adrian 1) What is the exact pg_restore command you are using? 2) From earlier post: '... only analyzes tables in the new db at the point of reload, then shuts off again.' Provide that sequence of events from the Postgres log. 3) Also statistics from https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ALL-TABLES-VIEW for that table after the reload. 4) The autovacuum settings you have in effect. -- Adrian Klaver adrian.kla...@aklaver.com
Re: Restore of a reference database kills the auto analyze processing.
On 5/7/24 02:38, HORDER Philip wrote: Thanks for your time Adrian Is there enough data processing? Yes, one table is receiving upwards of 20 million rows daily. We noticed the problem when fetch performance on this table degraded after updates. Autovacuum has thresholds for turning on, are you sure those thresholds are just not being met? Yes we're sure. Our biggest table is set for a fixed number of rows rather than a percentage, this gets an auto analyse about every 15 minutes. After an update this just stops, and there are no analyse entries in the log file. None at all, for any table. 1) What is the exact pg_restore command you are using? 2) From earlier post: '... only analyzes tables in the new db at the point of reload, then shuts off again.' Provide that sequence of events from the Postgres log. 3) Also statistics from https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ALL-TABLES-VIEW for that table after the reload. When we restart Postgres the auto analyse restarts and catches up with the backlog. Phil Horder Database Mechanic Thales Land & Air Systems -- Adrian Klaver adrian.kla...@aklaver.com
Re: problem loading shared lib pg_tde.so
On 5/6/24 07:42, Adrian Klaver wrote: On 5/6/24 04:05, Matthias Apitz wrote: I see three different versions of OpenSSL: OPENSSL_1_1_1d -- From error message OpenSSL 1.1.1l-fips -- SuSE 15 version OpenSSL 1.1.1t -- Your built version? Are you sure you are pointing at the same version in all cases? I should have added: what does the below return? ldd /usr/local/sisis-pap/pgsql-16.2/lib/pg_tde.so matthias -- Adrian Klaver adrian.kla...@aklaver.com
Re: problem loading shared lib pg_tde.so
On 5/6/24 04:05, Matthias Apitz wrote: I have a problem while loading the pg_tde.so shared lib. contrib/pg_tde was built with:

```
cd postgresql-16.2/contrib/pg_tde || exit
gmake clean
export LDFLAGS="-L/usr/local/sisis-pap/lib -L/usr/lib64"
export CFLAGS="-m64 -I/usr/local/sisis-pap/include"
export CPPFLAGS="-m64 -I/usr/local/sisis-pap/include"
./configure --prefix=/usr/local/sisis-pap/pgsql-16.2 \
  --libdir=/usr/local/sisis-pap/pgsql-16.2/lib --with-libcurl=/usr/local/sisis-pap/
gmake
gmake install
```

but the shared lib /usr/local/sisis-pap/pgsql-16.2/lib/pg_tde.so can't be loaded on startup of the server:

```
2024-05-06 11:18:45.967 CEST [15368] FATAL: could not load library "/usr/local/sisis-pap/pgsql-16.2/lib/pg_tde.so": /usr/lib64/libssh.so.4: undefined symbol: EVP_KDF_CTX_new_id, version OPENSSL_1_1_1d
2024-05-06 11:18:45.967 CEST [15368] LOG: database system is shut down
```

This is the OpenSSL version of SuSE Linux Enterprise 15 SP5:

```
# openssl version
OpenSSL 1.1.1l-fips 24 Aug 2021 SUSE release 150500.17.25.1
```

This is what we have compiled and PostgreSQL should use:

```
# export LD_LIBRARY_PATH=/usr/local/sisis-pap/lib
# /usr/local/sisis-pap/bin/openssl version
OpenSSL 1.1.1t 7 Feb 2023
```

I see three different versions of OpenSSL:

OPENSSL_1_1_1d -- From error message
OpenSSL 1.1.1l-fips -- SuSE 15 version
OpenSSL 1.1.1t -- Your built version?

Are you sure you are pointing at the same version in all cases? 
When I disable 'pg_tde' in data/postgresql.auto.conf the server starts fine:

```
vim /data/postgresql162/data/postgresql.auto.conf
# disabled
shared_preload_libraries = 'pg_tde'

# /etc/init.d/postgres162 start
```

starts fine and the postgres proc is using our libssl.so.1.1:

```
# lsof -p 17254 | egrep 'libssl'
postgres 17254 postgres mem REG 254,0 697248 1080241 /usr/local/sisis-pap/lib/libssl.so.1.1
# strings /usr/local/sisis-pap/lib/libssl.so.1.1 | grep EVP_KDF
(nix)
# strings /usr/lib64/libssh.so.4 | grep EVP_KDF
EVP_KDF_CTX_new_id
EVP_KDF_ctrl
EVP_KDF_CTX_free
EVP_KDF_derive
```

I have a completely different OpenSSL 3.0.x environment: all OpenSSL consumers use /usr/local/sisis-pap.sp01/lib/libssl.so.3, also PostgreSQL and pg_tde have been compiled against this; and this runs fine with 'pg_tde'. What does the above error mean? Thanks matthias -- Adrian Klaver adrian.kla...@aklaver.com
Re: \dt shows table but \d says the table doesn't exist ?
On 5/3/24 21:06, David G. Johnston wrote: On Friday, May 3, 2024, Tom Lane <mailto:t...@sss.pgh.pa.us>> wrote: By and large, I'd expect people using mixed-case table names to get accustomed pretty quickly to the fact that they have to double-quote those names in SQL. I don't see why it's a surprise that that is also true in \d commands. Every day the number of people increases who get mixed-case names in their DB because their client language preserves case and doesn’t require quoting. In a sense they do by making quoting the default, which people find out when they stray from the language/ORM/framework and get pointed at: https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS And it isn’t like any sane person is using case to distinguish names in their DB and so would be at risk of seeing multiple unwanted matches for any given pattern. Have you met people? I don’t think name pattern matching during object lookup is a good tool to wield to try and convince people that using camel case is a bad idea and they should use snake case instead. If they don’t write enough raw SQL to be annoyed by their choice more power to them, making \d more accessible for them is a win and snake case people won’t notice or care. David J. -- Adrian Klaver adrian.kla...@aklaver.com
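The client-side quoting being discussed follows the rule on the linked lexical-structure page: wrap the identifier in double quotes, doubling any embedded double quotes, so mixed-case names keep their case instead of being folded to lower case. A hypothetical helper (not any particular ORM's or driver's API) sketches it:

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier: wrap in double quotes and double any
    embedded double quotes, so the name is taken literally, case and all."""
    return '"' + name.replace('"', '""') + '"'

print(quote_ident("MyTable"))   # "MyTable"  -- case preserved, unlike unquoted MyTable -> mytable
print(quote_ident("my_table"))  # "my_table"
print(quote_ident('we"ird'))    # "we""ird"
```

Client languages that always emit quoted identifiers are why users end up with mixed-case names in the catalog without ever having typed a quote themselves.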
Re: How to Build Postgres in a Portable / Relocatable fashion?
On 5/3/24 17:35, Bruce Momjian wrote: On Fri, May 3, 2024 at 05:29:34PM -0700, Adrian Klaver wrote: From here: https://www.enterprisedb.com/ #1 IN POSTGRES THE MOST ADMIRED, DESIRED, & USED DATABASE. is not helpful. Though I would say the bigger issue is here: https://www.postgresql.org/download/ Where you have Packages and Installers and then further down 3rd party distributions It tends to imply that the 'Packages and Installers' are not third party, when they are to some degree or another. I think it is that way because we don't have any guarantees or regular communication with "3rd party distributions". I would say it is not clear enough that Core is responsible for the source releases, anything above that is some other groups responsibility. This becomes apparent when you have to tell people that fixing packaging issues requires reaching out to said groups through means that are not entirely clear. -- Adrian Klaver adrian.kla...@aklaver.com
Re: How to Build Postgres in a Portable / Relocatable fashion?
On 5/3/24 17:07, Bruce Momjian wrote: On Fri, May 3, 2024 at 11:27:12PM +, AJ ONeal wrote: Back in the good old days there was official 1st-party support for Linux: This was/is a third party site. Oh. I thought that EDB was *the* Postgres company like MySQL AB was *the* MySQL company. My mistake. How did you come to that conclusion? Is there something we could do to avoid this assumption. From here: https://www.enterprisedb.com/ #1 IN POSTGRES THE MOST ADMIRED, DESIRED, & USED DATABASE. is not helpful. Though I would say the bigger issue is here: https://www.postgresql.org/download/ Where you have Packages and Installers and then further down 3rd party distributions It tends to imply that the 'Packages and Installers' are not third party, when they are to some degree or another. -- Adrian Klaver adrian.kla...@aklaver.com
Re: How to Build Postgres in a Portable / Relocatable fashion?
On 5/3/24 15:57, AJ ONeal wrote: Back in the good old days there was official 1st-party support for Linux: Still there. The below is for the EDB installer, which at this point is only for MacOS and Windows as Unixen platforms have their own packaging. This was/is a third party site. https://www.enterprisedb.com/downloads/postgres-postgresql-downloads Those binaries could be |tar xvf|'d and used without any fuss or muss, from any location, on any distro. I'm the core maintainer of https://webinstall.dev/ and I'm thinking to break our "official builds only" rule because, well, I haven't been able to find any way to get up-to-date builds from an official channel - and |apt| is always half-a-decade out-of-date (part of the reason for Webi, but Webi is far worse in this case, stuck at v10). I don't see that. Using the PGDG repo on Ubuntu 22.04: apt list --installed | grep postgres postgresql-14/jammy-pgdg,now 14.11-1.pgdg22.04+1 amd64 [installed] postgresql-15-pgtap/jammy-pgdg,now 1.3.3-1.pgdg22.04+1 all [installed] postgresql-15/jammy-pgdg,now 15.6-1.pgdg22.04+1 amd64 [installed] postgresql-16-pgtap/jammy-pgdg,now 1.3.3-1.pgdg22.04+1 all [installed] postgresql-16-unit/jammy-pgdg,now 7.8-1.pgdg22.04+1 amd64 [installed] postgresql-16/jammy-pgdg,now 16.2-1.pgdg22.04+1 amd64 [installed,automatic] postgresql-client-14/jammy-pgdg,now 14.11-1.pgdg22.04+1 amd64 [installed,automatic] postgresql-client-15/jammy-pgdg,now 15.6-1.pgdg22.04+1 amd64 [installed,automatic] postgresql-client-16/jammy-pgdg,now 16.2-1.pgdg22.04+1 amd64 [installed,automatic] postgresql-client-common/jammy-pgdg,now 259.pgdg22.04+1 all [installed,automatic] postgresql-common/jammy-pgdg,now 259.pgdg22.04+1 all [installed,automatic] postgresql-plpython3-14/jammy-pgdg,now 14.11-1.pgdg22.04+1 amd64 [installed] postgresql-plpython3-15/jammy-pgdg,now 15.6-1.pgdg22.04+1 amd64 [installed] 
postgresql-plpython3-16/jammy-pgdg,now 16.2-1.pgdg22.04+1 amd64 [installed] postgresql-server-dev-16/jammy-pgdg,now 16.2-1.pgdg22.04+1 amd64 [installed] postgresql/jammy-pgdg,now 16+259.pgdg22.04+1 all [installed] Could I get some help on how to do that? Are the old build processes documented somewhere? Or are there some scripts in a far corner of the Internet that could still do that? As to Apt build scripts: https://wiki.postgresql.org/wiki/Apt/RepoDocs Or what options might I need to pass to ./configure to get it to build with relative locations? I'm not a C developer, and I'm not familiar with C build tools beyond |./configure; make; sudo make install|. I'd really appreciate some direction on this. Thanks. AJ ONeal -- Adrian Klaver adrian.kla...@aklaver.com
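For anyone landing on this thread with the same question, the PGDG apt repository that the listing above comes from can be enabled with a few commands. This is a sketch based on the apt.postgresql.org / wiki instructions linked above; the key URL, file paths, and target major version (16) are assumptions to verify against that page:

```shell
# Add the PGDG repository signing key (URL and path assumed per apt.postgresql.org docs)
sudo install -d /usr/share/postgresql-common/pgdg
sudo curl --fail -o /usr/share/postgresql-common/pgdg/apt.postgresql.org.asc \
  https://www.postgresql.org/media/keys/ACCC4CF8.asc

# Register the repository for this release's codename (jammy, bookworm, ...)
echo "deb [signed-by=/usr/share/postgresql-common/pgdg/apt.postgresql.org.asc] https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" | \
  sudo tee /etc/apt/sources.list.d/pgdg.list

# Install an up-to-date major version (16 at the time of this thread)
sudo apt update && sudo apt install postgresql-16
```

This gives current point releases via normal `apt upgrade`, which addresses the "apt is always out-of-date" complaint: the distro's own archive lags, the PGDG archive does not.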
Re: \dt shows table but \d says the table doesn't exist ?
On 5/3/24 14:06, Magnus Hagander wrote: On Fri, May 3, 2024 at 10:58 PM David Gauthier <mailto:dfgpostg...@gmail.com>> wrote: psql (15.3, server 14.5) on linux Someone else's DB which I've been asked to look at. \dt gives many tables, here are just 3... public | some_idIds | table | cron_user public | WarningIds | table | cron_user public | cpf_inv_driverIds | table | cron_user but \d public.some_idIds gives.. Did not find any relation named "public.some_idIds". Looks like you might need a \d "some_idIds" (include the quotes) since it has an uppercase characters? This: "Did not find any relation named "public.some_idIds"." to me indicates it did look for the properly cased name. -- Magnus Hagander Me: https://www.hagander.net/ <http://www.hagander.net/> Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/> -- Adrian Klaver adrian.kla...@aklaver.com
Re: \dt shows table but \d says the table doesn't exist ?
On 5/3/24 13:58, David Gauthier wrote: psql (15.3, server 14.5) on linux Someone else's DB which I've been asked to look at. \dt gives many tables, here are just 3... public | some_idIds | table | cron_user public | WarningIds | table | cron_user public | cpf_inv_driverIds | table | cron_user but \d public.some_idIds gives.. Did not find any relation named "public.some_idIds". Soo... what am I missing ? Try: select relname, length(relname) from pg_class where relname ilike '%some_idIds%'; Best guess there are hidden characters. owner is "cron_user". \dt shows cron_user is the owner of the table. -- Adrian Klaver adrian.kla...@aklaver.com
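The two diagnostics in this reply (case folding and hidden characters) can be run together. A sketch, using the table name from the post:

```sql
-- Unquoted identifiers are folded to lower case, so this actually
-- searches for "some_idids" and fails:
\d public.some_idIds

-- Double quotes preserve the mixed case as stored:
\d "public"."some_idIds"

-- If the quoted form still fails, look for invisible or multibyte
-- characters in the name: a difference between length() (characters)
-- and octet_length() (bytes) is a giveaway.
SELECT relname, length(relname) AS chars, octet_length(relname) AS bytes
FROM pg_class
WHERE relname ILIKE '%some_id%';
```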
Re: Restore of a reference database kills the auto analyze processing.
On 5/2/24 08:52, HORDER Philip wrote: Running Postgres 15.3 with PostGIS 3.3 On Windows 10 (yes, I know) It’s a single node db with no replication, topping out at about 200GB. We have a schema (A) in the default 'postgres' maintenance database, which our software services connect to, with one set of users (SU) The above is probably not a good idea. The 'postgres' database is generally taken to be a throwaway database for establishing an initial connection. Many utilities/tools use it for that purpose; having your data in it exposes that data. This works, some of the time. If I repeat the update process, somewhere around run #4 the auto analyzer stops working, and only analyzes tables in the new db at the point of reload, then shuts off again. All vacuum and analyze operations on the 'postgres' database just stop, even though there is still data processing into it. Is there enough data processing? Autovacuum has thresholds for turning on; are you sure those thresholds are just not being met? With log_autovacuum_min_duration = 0, we are logging all vacuum & analyze operations, so we can see when the entries shut off in the Postgres log files, e.g. 2024-05-02 14:52:01.597 GMT [6896]: [23-1] db=,user=,app=,client= LOG: automatic analyze of table "lfm.pg_catalog.pg_trigger" Except the above shows it working. What is the evidence it is not? The only way I can find of getting the analyzer back is to restart Postgres. Did you wait to see if activity after the pg_restore crossed the autovacuum thresholds? -- Adrian Klaver adrian.kla...@aklaver.com
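The thresholds question raised here can be checked from the cumulative statistics views. A sketch (table contents are whatever is in the target database):

```sql
-- Dead-tuple counts and the last time autovacuum/autoanalyze touched each table
SELECT relname, n_live_tup, n_dead_tup, n_mod_since_analyze,
       last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;

-- Autovacuum fires on a table roughly when:
--   n_dead_tup > autovacuum_vacuum_threshold
--                + autovacuum_vacuum_scale_factor * reltuples
-- (50 + 0.2 * reltuples with stock settings)
SHOW autovacuum_vacuum_threshold;
SHOW autovacuum_vacuum_scale_factor;
```

If the tables in question stay under that line after the restore, the absence of autovacuum log entries is expected behavior, not a failure.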
Re: Restore of a reference database kills the auto analyze processing.
On 5/2/24 8:52 AM, HORDER Philip wrote: Running Postgres 15.3 with PostGIS 3.3 On Windows 10 (yes, I know) It’s a single node db with no replication, topping out at about 200GB. We have a schema (A) in the default 'postgres' maintenance database, which our software services connect to, with one set of users (SU) We have another database, let’s call it LFM, which contains reference data for some COTS software. I don't know what's in it, we just get given updates for it in pg_backup binary files, about 2MB each. Do you mean pg_basebackup, pg_dump or something else? -- Adrian Klaver adrian.kla...@aklaver.com
Re: How to interpret 'depends on' errors in pg_restore?
On 5/2/24 02:20, Fire Emerald wrote: I hadn't used pg_dump/restore until today, and finally found my mistake, which led to the "problem" described below. The output "depends on" comes from the -l (l as Lima) flag; what I wanted was the -1 (number one) flag, which stands for single transaction in pg_restore. As -l does not execute anything, nothing was logged in the postgres server log and no error was shown anywhere. -l does indeed execute something per: https://www.postgresql.org/docs/current/app-pgrestore.html " -l --list List the table of contents of the archive. The output of this operation can be used as input to the -L option. Note that if filtering switches such as -n or -t are used with -l, they will restrict the items listed. " As an example: pg_restore -l redmine41_14_032124.out ; ; Archive created at 2024-03-21 01:00:01 PDT ; dbname: redmine ; TOC Entries: 455 ; Compression: -1 ; Dump Version: 1.14-0 ; Format: CUSTOM ; Integer: 4 bytes ; Offset: 8 bytes ; Dumped from database version: 14.11 (Ubuntu 14.11-1.pgdg22.04+1) ; Dumped by pg_dump version: 14.11 (Ubuntu 14.11-1.pgdg22.04+1) ; ; ; Selected TOC Entries: ; 209; 1259 17070 TABLE public ar_internal_metadata redmine 210; 1259 17075 TABLE public attachments redmine 211; 1259 17088 SEQUENCE public attachments_id_seq redmine 4069; 0 0 SEQUENCE OWNED BY public attachments_id_seq redmine ... Generally you want to redirect that to a file with -f with the goal of using it with: " -L list-file --use-list=list-file Restore only those archive elements that are listed in list-file, and restore them in the order they appear in the file. Note that if filtering switches such as -n or -t are used with -L, they will further restrict the items restored. list-file is normally created by editing the output of a previous -l operation. Lines can be moved or removed, and can also be commented out by placing a semicolon (;) at the start of the line. See below for examples. 
" You instead redirected the output to the target database and that led to your errors. Both chars looked so identical in my editors/shells that i thought i used -1, in fact using -l. It's always the tiny obvious thing, which we do not see. Best regards, Chris Am 28. März 2024 16:57:04 schrieb Fire Emerald : Am 28. März 2024 15:00:06 schrieb Tom Lane : Fire Emerald writes: Then i did a pg_restore -d target --verbose -Fc file.dump and saw in the output this: 5145 0 730750 TABLE subpartitions backends_y2024w03 userA ; depends on: 237 and so on ... That is not an error, it's just verbose display of one of the items in the dump. Well, I know it's not an error, but it's everything i got. There was no error shown. The command completed, but without anything imported. Nothing was restored. You would need to show us the actual errors. (Suggestion: leave off --verbose, it's just clutter.) A guess though is that the import failed because of foreign key constraints. --data-only mode is not good at ordering the table loads to ensure that FK constraints are satisfied on-the-fly. regards, tom lane As i said, the same import but with INSERT INTOs worked without any issues. So no, there are no FK constraints failing. *But* the target and source table had partitioned tables attached, using ATTACH PARTITION. The schema was like: db1 schema1 public table1 (links to the listed below) db1 schema1 subpartitions backends_y2024w03 db1 schema1 subpartitions backends_y2024w04 db1 schema1 subpartitions backends_y2024w05 The partitioning must be the problem somehow. -- Adrian Klaver adrian.kla...@aklaver.com
Re: Need help to make space on my database
On 4/29/24 08:51, Cocam' server wrote: > Did the above work for each table? Yes, except for the biggest table > Have you done something like?: > > select relname, n_dead_tup from pg_stat_all_tables where relname = > ''; I hadn't thought of that, but it seems that some tables have dead tuples You have something against providing actual numbers? The point is there is really nothing to be gained by doing VACUUM FULL if the dead tuples are some small percentage of the tables. -- Adrian Klaver adrian.kla...@aklaver.com
Re: Need help to make space on my database
On 4/29/24 08:04, Cocam' server wrote: When replying use Reply All to include the mailing list Ccing list > How much current free space do you have available on the disk? as we speak, I only have 6 GB available on the machine running the server > Did you VACUUM FULL a table at a time or all of them at once? I tried to make a VACUUM FULL. I also tried on the biggest tables (200 Mb and +) but not on all of them Did the above work for each table? Have you done something like?: select relname, n_dead_tup from pg_stat_all_tables where relname = ''; to see if there are any dead tuples to clean out. Or if you use the contrib extension pgstattuple: https://www.postgresql.org/docs/current/pgstattuple.html then: SELECT * FROM pgstattuple(''); This returns something like: -[ RECORD 1 ]--+ table_len | 3940352 tuple_count | 4310 tuple_len | 3755414 tuple_percent | 95.31 dead_tuple_count | 0 dead_tuple_len | 0 dead_tuple_percent | 0 free_space | 124060 free_percent | 3.15 The two biggest are these: state_groups_state | 5475 MB event_json | 2328 MB (I'd particularly like to make room on these two tables, which take up the most space) By the way, excuse me if I make a few mistakes (especially when replying), this is the first time I've used Postgres community support directly On Mon, Apr 29, 2024 at 16:37, Adrian Klaver wrote: On 4/29/24 07:33, Cocam' server wrote: Please reply to list also Ccing list > No, the aim is also to reallocate free space to the system for the other > tasks it performs. (That's why I said I'd like it returned to the OS) You led with: "I need help to make space on my database". How much current free space do you have available on the disk? Did you VACUUM FULL a table at a time or all of them at once? What are the individual table sizes? > > Le lun. 29 avr. 
2024 at 16:19, Adrian Klaver wrote: > > On 4/29/24 06:45, Cocam' server wrote: > > Hello. > > > > I need help to make space on my database. I have tables that are > several > > GB in size. I used to use the VACUUM FULL VERBOSE command; but > now, this > > command is too greedy in free space to be used and I'm looking > for a way > > to make free space (given back to the OS) > > > > Thanks in advance to everyone who responds > > Per > > https://www.postgresql.org/docs/current/sql-vacuum.html > > "VACUUM reclaims storage occupied by dead tuples. In normal PostgreSQL > operation, tuples that are deleted or obsoleted by an update are not > physically removed from their table; they remain present until a VACUUM > is done. Therefore it's necessary to do VACUUM periodically, especially > on frequently-updated tables. > > <...> > > Plain VACUUM (without FULL) simply reclaims space and makes it > available > for re-use. This form of the command can operate in parallel with > normal > reading and writing of the table, as an exclusive lock is not obtained. > However, extra space is not returned to the operating system (in most > cases); it's just kept available for re-use within the same table. > " > > So a regular VACUUM should work if all you want to do is give the > database the ability to recycle the vacuumed tuple space. > > -- > Adrian Klaver > adrian.kla...@aklaver.com > -- Adrian Klaver adrian.kla...@aklaver.com -- Adrian Klaver adrian.kla...@aklaver.com
Re: Need help to make space on my database
On 4/29/24 07:33, Cocam' server wrote: Please reply to list also Ccing list No, the aim is also to reallocate free space to the system for the other tasks it performs. (That's why I said I'd like it returned to the OS) You led with: "I need help to make space on my database". How much current free space do you have available on the disk? Did you VACUUM FULL a table at a time or all of them at once? What are the individual table sizes? On Mon, Apr 29, 2024 at 16:19, Adrian Klaver wrote: On 4/29/24 06:45, Cocam' server wrote: > Hello. > > I need help to make space on my database. I have tables that are several > GB in size. I used to use the VACUUM FULL VERBOSE command; but now, this > command is too greedy in free space to be used and I'm looking for a way > to make free space (given back to the OS) > > Thanks in advance to everyone who responds Per https://www.postgresql.org/docs/current/sql-vacuum.html "VACUUM reclaims storage occupied by dead tuples. In normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table; they remain present until a VACUUM is done. Therefore it's necessary to do VACUUM periodically, especially on frequently-updated tables. <...> Plain VACUUM (without FULL) simply reclaims space and makes it available for re-use. This form of the command can operate in parallel with normal reading and writing of the table, as an exclusive lock is not obtained. However, extra space is not returned to the operating system (in most cases); it's just kept available for re-use within the same table. " So a regular VACUUM should work if all you want to do is give the database the ability to recycle the vacuumed tuple space. -- Adrian Klaver adrian.kla...@aklaver.com -- Adrian Klaver adrian.kla...@aklaver.com
Re: Need help to make space on my database
On 4/29/24 06:45, Cocam' server wrote: Hello. I need help to make space on my database. I have tables that are several GB in size. I used to use the VACUUM FULL VERBOSE command; but now, this command is too greedy in free space to be used and I'm looking for a way to make free space (given back to the OS) Thanks in advance to everyone who responds Per https://www.postgresql.org/docs/current/sql-vacuum.html "VACUUM reclaims storage occupied by dead tuples. In normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table; they remain present until a VACUUM is done. Therefore it's necessary to do VACUUM periodically, especially on frequently-updated tables. <...> Plain VACUUM (without FULL) simply reclaims space and makes it available for re-use. This form of the command can operate in parallel with normal reading and writing of the table, as an exclusive lock is not obtained. However, extra space is not returned to the operating system (in most cases); it's just kept available for re-use within the same table. " So a regular VACUUM should work if all you want to do is give the database the ability to recycle the vacuumed tuple space. -- Adrian Klaver adrian.kla...@aklaver.com
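The trade-off quoted above can be made concrete. A sketch, using one of the table names reported in this thread (state_groups_state, from the Matrix/Synapse schema mentioned by the poster):

```sql
-- Plain VACUUM: dead-tuple space is reclaimed for reuse inside the table,
-- no exclusive lock, but (usually) nothing is handed back to the OS:
VACUUM VERBOSE state_groups_state;

-- VACUUM FULL: rewrites the table into a new file and returns the freed
-- space to the OS -- but it takes an ACCESS EXCLUSIVE lock and needs enough
-- free disk to hold the rewritten copy, which is exactly the "too greedy
-- in free space" problem the poster reports:
VACUUM FULL state_groups_state;

-- Compare sizes before and after:
SELECT pg_size_pretty(pg_total_relation_size('state_groups_state'));
```

With only a few GB free on disk, VACUUM FULL on a 5 GB table may simply not have room for its temporary copy; plain VACUUM has no such requirement.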
Re: Query Discrepancy in Postgres HLL Test
On 4/28/24 06:01, Ayush Vatsa wrote: Hi PostgreSQL Community, I'm currently delving into Postgres HLL (HyperLogLog) functionality and have encountered an unexpected behavior while executing queries from the "cumulative_add_sparse_edge.sql <https://github.com/citusdata/postgresql-hll/blob/master/sql/cumulative_add_sparse_edge.sql#L28-L36>" regress test. This particular test data file <https://github.com/citusdata/postgresql-hll/blob/master/sql/data/cumulative_add_sparse_edge.csv#L515-L516> involves three columns, with the last column representing an HLL (HyperLogLog) value derived from the previous HLL value and the current raw value. Upon manual inspection of the query responsible for deriving the last row's HLL value, I noticed a discrepancy. When executing the query: """ -- '\x148B481002' is second last rows hll value SELECT hll_add('\x148B481002.', hll_hashval(2561)); """ instead of obtaining the expected value (''\x148B481002''), I received a different output which is ('\x138b48000200410061008100a1 '). My initial assumption is that this could potentially be attributed to a precision error. However, I'm reaching out to seek clarity on why this disparity is occurring and to explore potential strategies for mitigating it (as I want the behaviour to be consistent to regress test file). I would say your best option is to file an issue here: https://github.com/citusdata/postgresql-hll/issues Regards Ayush Vatsa -- Adrian Klaver adrian.kla...@aklaver.com
Re: What linux version to install ?
On 4/25/24 00:46, Kashif Zeeshan wrote: I worked with both CentOS/RHEL 7/8. What does the above mean relative to the original question? -- Adrian Klaver adrian.kla...@aklaver.com
Re: Backup_Long Running
On 4/24/24 00:03, jaya kumar wrote: Hi Team, Production database Backup is running very long hours. Any option to reduce backup time? Kindly advise me. Hardware specifications? Network specifications? The actual pg_basebackup command used? Server(s) user load? DB size: 793 GB We are taking pg_basebackup backup. Backup File size=613G Backup Running Duration: 8 Hours -- Thanks & Regards, Jayakumar.S +91-9840864439. -- Adrian Klaver adrian.kla...@aklaver.com
Re: issue with reading hostname
On 4/22/24 14:54, Atul Kumar wrote: I mean, Once I change the hostname then how will the socket read the new hostname ? Does it require a postgres service restart ? The host name of the machine? On Tue, Apr 23, 2024 at 3:19 AM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote: On 4/22/24 14:37, Atul Kumar wrote: > Can we edit the socket to change the hostname in it ? On Ubuntu 22.04 install, given: srwxrwxrwx 1 postgres postgres 0 Apr 22 14:01 .s.PGSQL.5432= -rw--- 1 postgres postgres 68 Apr 22 14:01 .s.PGSQL.5432.lock The contents of .s.PGSQL.5432.lock(the file that indicates a Postgres instance has a lock on the socket) are: 862 /var/lib/postgresql/15/main 1713795311 5432 /var/run/postgresql There is no hostname to be changed as you are working with a local socket. > > Regards. > > On Tue, Apr 23, 2024 at 2:41 AM Ron Johnson mailto:ronljohnso...@gmail.com> > <mailto:ronljohnso...@gmail.com <mailto:ronljohnso...@gmail.com>>> wrote: > > On Mon, Apr 22, 2024 at 4:14 PM Atul Kumar mailto:akumar14...@gmail.com> > <mailto:akumar14...@gmail.com <mailto:akumar14...@gmail.com>>> wrote: > > Hi, > > I have postgresql version 15 running on centos7. > > I have below query that reads hostname from /tmp directory: > > psql postgres -A -t -p 5432 -h /tmp/ -c 'SELECT > pg_is_in_recovery();' > > > If you installed from the PGDG repository (possibly also the CENTOS > repos, but I'm not sure), then the domain socket also lives in : > /var/run/postgresql > > * I find that more expressive than /tmp. > * No need to specify the host when using sockets. > * Using a socket name makes parameterizing the hostname easier in > scripts. > > -- Adrian Klaver adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com> -- Adrian Klaver adrian.kla...@aklaver.com
Re: CLUSTER vs. VACUUM FULL
On 4/22/24 14:35, Ron Johnson wrote: > > On Stack Exchange, I've got a question on how to determine when to run > CLUSTER. It ties in strongly with this thread. > And the link is? It should have been the initial question of this thread and it explains what you are really after. Sorry. Got distracted by the answer. https://dba.stackexchange.com/questions/338870/when-to-rerun-cluster -- Adrian Klaver adrian.kla...@aklaver.com
Re: issue with reading hostname
On 4/22/24 14:37, Atul Kumar wrote: Can we edit the socket to change the hostname in it ? On Ubuntu 22.04 install, given: srwxrwxrwx 1 postgres postgres 0 Apr 22 14:01 .s.PGSQL.5432= -rw------- 1 postgres postgres 68 Apr 22 14:01 .s.PGSQL.5432.lock The contents of .s.PGSQL.5432.lock (the file that indicates a Postgres instance has a lock on the socket) are: 862 /var/lib/postgresql/15/main 1713795311 5432 /var/run/postgresql There is no hostname to be changed as you are working with a local socket. Regards. On Tue, Apr 23, 2024 at 2:41 AM Ron Johnson wrote: On Mon, Apr 22, 2024 at 4:14 PM Atul Kumar wrote: Hi, I have postgresql version 15 running on centos7. I have below query that reads hostname from /tmp directory: psql postgres -A -t -p 5432 -h /tmp/ -c 'SELECT pg_is_in_recovery();' If you installed from the PGDG repository (possibly also the CENTOS repos, but I'm not sure), then the domain socket also lives in: /var/run/postgresql * I find that more expressive than /tmp. * No need to specify the host when using sockets. * Using a socket name makes parameterizing the hostname easier in scripts. -- Adrian Klaver adrian.kla...@aklaver.com
Re: CLUSTER vs. VACUUM FULL
On 4/22/24 13:59, Ron Johnson wrote: On Mon, Apr 22, 2024 at 4:21 PM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote: [snip] Which gets us back to your comment upstream: "What the VACUUM docs "should" do, it seems, is suggest CLUSTER on the PK, if the PK is a sequence (whether that be an actual sequence, or a timestamp or something else that grows monotonically)." This is a case specific to you and this particular circumstance, not a general rule for VACUUM FULL. If for no other reason then it might make more sense for the application that the CLUSTER be done on some other index then the PK. On Stack Exchange, I've got a question on how to determine when to run CLUSTER. It ties in strongly with this thread.. And the link is? -- Adrian Klaver adrian.kla...@aklaver.com
Re: issue with reading hostname
On 4/22/24 13:13, Atul Kumar wrote: Hi, I have postgresql version 15 running on centos7. I have below query that reads hostname from /tmp directory: psql postgres -A -t -p 5432 -h /tmp/ -c 'SELECT pg_is_in_recovery();' so below are my questions: 1. Is the psql client reading the socket file that resides in the /tmp directory to fetch the hostname ? 2. I saw the socket file in /tmp and it is empty. Then how is the psql client still reading the socket file successfully for hostname ? this is my socket looks ( the size is 0 as the file is empty): srwxrwxrwx. 1 postgres postgres 0 Apr 22 12:47 .s.PGSQL.5432 Please help me clarify these doubts. https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS "host Name of host to connect to. If a host name looks like an absolute path name, it specifies Unix-domain communication rather than TCP/IP communication; the value is the name of the directory in which the socket file is stored. (On Unix, an absolute path name begins with a slash. On Windows, paths starting with drive letters are also recognized.) If the host name starts with @, it is taken as a Unix-domain socket in the abstract namespace (currently supported on Linux and Windows). The default behavior when host is not specified, or is empty, is to connect to a Unix-domain socket in /tmp (or whatever socket directory was specified when PostgreSQL was built). On Windows, the default is to connect to localhost. A comma-separated list of host names is also accepted, in which case each host name in the list is tried in order; an empty item in the list selects the default behavior as explained above. See Section 34.1.1.3 for details. " The simplistic explanation is that the socket is the "host". Regards. -- Adrian Klaver adrian.kla...@aklaver.com
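Two server-side checks that answer the same doubts from an open session (a sketch):

```sql
-- Where the server creates its Unix-domain socket file(s); this is the
-- directory you pass to -h for socket connections:
SHOW unix_socket_directories;

-- Whether the current connection arrived over a socket rather than TCP/IP:
-- inet_server_addr() returns NULL for a Unix-domain socket connection.
SELECT inet_server_addr();
```

This makes the point in the reply directly visible: the socket file itself carries no hostname, and for a local-socket connection there is no server IP address at all.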
Re: CLUSTER vs. VACUUM FULL
On 4/22/24 12:51, Ron Johnson wrote: On Mon, Apr 22, 2024 at 3:14 PM Adrian Klaver wrote: 1) If they are already in enough of a PK order that the CLUSTER time vs VACUUM FULL time would not be material as there is not much or any sorting to do then what does the CLUSTER gain you? Not much. Now they're just "slightly more ordered" instead of "slightly less ordered" for little if any extra effort. 2) What evidence is there that the records were still in PK order just because you deleted based on CREATED_ON? I understand the correlation between CREATED_ON and the PK, just not sure why that would necessarily translate to an on-disk order by PK? 1. Records are appended to tables in INSERT order, and INSERT order is highly correlated to synthetic PK, by the nature of sequences. Not something I would count on, see: https://www.postgresql.org/docs/current/sql-createsequence.html Notes for how that may not always be the case. Also any UPDATE or DELETE is going to change that. There is no guarantee of order for the data in the table. If there were, you would not need to run CLUSTER. 2. My original email showed that CLUSTER took just as long as VACUUM FULL. That means not many records had to be sorted, because... the on-disk order was strongly correlated to PK and CREATED_ON. > Will that happen *every time* in *every circumstance* in *every database*? No, and I never said it would. But it does in *my* database in *this* application. Which gets us back to your comment upstream: "What the VACUUM docs "should" do, it seems, is suggest CLUSTER on the PK, if the PK is a sequence (whether that be an actual sequence, or a timestamp or something else that grows monotonically)." This is a case specific to you and this particular circumstance, not a general rule for VACUUM FULL. If for no other reason than that it might make more sense for the application that the CLUSTER be done on some other index than the PK. -- Adrian Klaver adrian.kla...@aklaver.com
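The "how ordered is it already?" question at the center of this exchange is answerable from the planner statistics. A sketch (the table name is hypothetical; run ANALYZE first so the statistics are current):

```sql
-- pg_stats.correlation compares each column's value order to the physical
-- row order: near +1 or -1 means the heap is already close to that column's
-- order (CLUSTER would have little to sort); near 0 means a real sort.
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename = 'mytable'
ORDER BY abs(correlation) DESC NULLS LAST;
```

This turns the anecdote ("CLUSTER took just as long as VACUUM FULL, so it must have been nearly ordered") into something measurable before choosing between the two commands.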
Re: altering a column to to make it generated
On 4/22/24 12:42, Celia McInnis wrote: Can I alter a table column to now make it generated? I tried this unsuccessfully: create temp table tmp1 as select 15::numeric(6,1) as distance,'24:30'::interval,0::numeric(7,3) as avgspd; alter table tmp1 alter column avgspd type numeric(7,3) generated always as ((3600.*distance)/EXTRACT (EPOCH FROM rdrtime::interval)) STORED; ERROR: syntax error at or near "generated" LINE 1: ... table tmp1 alter column avgspd type numeric(7,3) generated ... I think that at least I have the right bracketing this time! :-) See David Johnston's comment. I keep on wanting to believe that ALTER TABLE supports GENERATED expressions, when it only supports GENERATED IDENTITY:) -- Adrian Klaver adrian.kla...@aklaver.com
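ALTER TABLE ... ALTER COLUMN ... TYPE cannot attach a generation expression, hence the syntax error. A workaround sketch for the example above is to drop the column and re-add it as generated (note the original CREATE TABLE also needs the interval column aliased; rdrtime is assumed here, matching the expression in the question):

```sql
CREATE TEMP TABLE tmp1 AS
  SELECT 15::numeric(6,1) AS distance,
         '24:30'::interval AS rdrtime,
         0::numeric(7,3) AS avgspd;

-- An existing column cannot be converted in place to GENERATED;
-- drop it and re-create it as a stored generated column instead:
ALTER TABLE tmp1 DROP COLUMN avgspd;
ALTER TABLE tmp1
  ADD COLUMN avgspd numeric(7,3)
  GENERATED ALWAYS AS ((3600. * distance) / EXTRACT(EPOCH FROM rdrtime)) STORED;

SELECT * FROM tmp1;
```

The generated value is computed for existing rows when the column is added, so no backfill step is needed.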
Re: CLUSTER vs. VACUUM FULL
On 4/22/24 11:45 AM, Ron Johnson wrote: On Mon, Apr 22, 2024 at 12:29 PM David G. Johnston wrote: On Mon, Apr 22, 2024, 08:37 Ron Johnson wrote: On Mon, Apr 22, 2024 at 10:25 AM Tom Lane wrote: Marcos Pegoraro writes: > But wouldn't it be good that VACUUM FULL uses that index defined by > Cluster, if it exists ? No ... what would be the difference then? What the VACUUM docs "should" do, it seems, is suggest CLUSTER on the PK, if the PK is a sequence (whether that be an actual sequence, or a timestamp or something else that grows monotonically). That's because the data is already roughly in PK order. If things are bad enough to require a vacuum full that doesn't seem like a good assumption. Sure it does. For example, I just deleted the oldest half of the records in 30 tables. Tables whose CREATED_ON timestamp value strongly correlates to the synthetic PK sequence values. Thus, the remaining records were still mostly in PK order. CLUSTERs on the PK values would have taken just about as much time as the VACUUM FULL statements which I /did/ run. 1) If they are already in enough of a PK order that the CLUSTER time vs VACUUM FULL time would not be material as there is not much or any sorting to do then what does the CLUSTER gain you? Unless this table then became read only, whatever small gain arose from the CLUSTER would fade away as UPDATEs and DELETEs were done. 2) What evidence is there that the records were still in PK order just because you deleted based on CREATED_ON? I understand the correlation between CREATED_ON and the PK, just not sure why that would necessarily translate to an on-disk order by PK? -- Adrian Klaver adrian.kla...@aklaver.com
Re: adding a generated column to a table?
On 4/22/24 09:05, Celia McInnis wrote: If I have a table containing a date field, say: create temporary table tmp1 as select now()::date as evtdate; SELECT 1 select DATE_PART('year', evtdate)::integer as year from tmp1; year -- 2024 (1 row) Is there some way of doing something like the following?: alter table tmp1 add column year integer generated always as DATE_PART('year', evtdate)::integer STORED; ERROR: syntax error at or near "DATE_PART" LINE 1: ... tmp1 add column year integer generated always as DATE_PART(... https://www.postgresql.org/docs/current/sql-createtable.html GENERATED ALWAYS AS ( generation_expr ) STORED So: generated always as (DATE_PART('year', evtdate)::integer) STORED -- Adrian Klaver adrian.kla...@aklaver.com
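Putting the correction together with the example from the question (a sketch):

```sql
CREATE TEMPORARY TABLE tmp1 AS SELECT now()::date AS evtdate;

-- The generation expression must be wrapped in its own parentheses:
ALTER TABLE tmp1
  ADD COLUMN year integer
  GENERATED ALWAYS AS (DATE_PART('year', evtdate)::integer) STORED;

SELECT evtdate, year FROM tmp1;
```

The only change from the failing statement is the pair of parentheses around the expression, which the CREATE TABLE grammar quoted above requires.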
Re: CLUSTER vs. VACUUM FULL
On 4/22/24 08:37, Ron Johnson wrote: On Mon, Apr 22, 2024 at 10:25 AM Tom Lane wrote: Marcos Pegoraro writes: > But wouldn't it be good that VACUUM FULL uses that index defined by > Cluster, if it exists ? No ... what would be the difference then? What the VACUUM docs "should" do, it seems, is suggest CLUSTER on the PK, if the PK is a sequence (whether that be an actual sequence, or a timestamp or something else that grows monotonically). Why? That would, per David Rowley's comments, impose a sort cost on top of the cost of hitting every heap page and rewriting it. You end up with a sorted table, granted, until such time as you start making changes to it. If you are to the point of running VACUUM FULL that indicates to me the table has seen a heavy load of changes that you want to clean out. Given the temporary nature of the effects of a CLUSTER under a change load I don't see why it would be the way to go to clean up a changing table. That's because the data is already roughly in PK order. -- Adrian Klaver adrian.kla...@aklaver.com
Re: error in trigger creation
On 4/21/24 14:21, Tom Lane wrote: Adrian Klaver writes: On 4/21/24 11:20, yudhi s wrote: So in this case i was wondering if "event trigger" can cause any additional threat and thus there is no such privilege like "create trigger" exist in postgres and so it should be treated cautiously? An event trigger runs as a superuser and executes a function that in turn can do many things, you do the math on the threat level. As a trivial example: an event trigger could prevent the legitimate superuser(s) from doing anything at all in that database, just by blocking all their commands. This might not even require malicious intent, merely faulty coding --- but the opportunity for malicious intent is staggeringly large. As an FYI to above: https://www.postgresql.org/docs/current/sql-createeventtrigger.html "Event triggers are disabled in single-user mode (see postgres). If an erroneous event trigger disables the database so much that you can't even drop the trigger, restart in single-user mode and you'll be able to do that." regards, tom lane -- Adrian Klaver adrian.kla...@aklaver.com
Re: error in trigger creation
On 4/21/24 11:20, yudhi s wrote: On Sun, Apr 21, 2024 at 8:13 PM Tom Lane wrote: So do you mean we should not create the event trigger using "security definer", but rather have the superuser do this each time we have to create an event trigger? Actually, I am not very much aware of the security part, but is it fine to give the superuser privilege to the application user (say app_user) from which normally scripts/procedures get executed by the application, but nobody (individual person) can log in using that user? Additionally, in other databases triggers are driven by some specific privileges (say for example in Oracle the "create trigger" privilege). And it doesn't need any superuser, and we were having many Which Postgres has https://www.postgresql.org/docs/current/ddl-priv.html TRIGGER Allows creation of a trigger on a table, view, etc. but you are talking about event triggers https://www.postgresql.org/docs/current/sql-createeventtrigger.html where "Only superusers can create event triggers." To paraphrase Henry Ford, you can have any user for an event trigger as long as the user is a superuser. applications in which the application user (which was used for app-to-app login) was having these privileges, similar to the "create table" privileges which come by default to the schema that owns the objects etc. So in this case I was wondering if an "event trigger" can cause any additional threat, and thus there is no such privilege like "create trigger" for it in Postgres, and so it should be treated cautiously? An event trigger runs as a superuser and executes a function that in turn can do many things, you do the math on the threat level. -- Adrian Klaver adrian.kla...@aklaver.com
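For reference, a minimal event trigger looks like the sketch below (the function and trigger names are hypothetical); per the documentation quoted above, it must be created by a superuser:

```sql
-- Hypothetical minimal event trigger: logs the tag of every DDL command.
CREATE OR REPLACE FUNCTION log_ddl() RETURNS event_trigger AS $$
BEGIN
  RAISE NOTICE 'DDL command executed: %', tg_tag;
END;
$$ LANGUAGE plpgsql;

CREATE EVENT TRIGGER log_ddl_trg ON ddl_command_end
  EXECUTE FUNCTION log_ddl();
```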
Re: Logging statement having any threat?
On 4/21/24 02:35, Lok P wrote: On Sat, Apr 20, 2024 at 10:02 PM Adrian Klaver wrote: Have you tried?: https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT " log_statement (enum) <...> The default is none. Only superusers and users with the appropriate SET privilege can change this setting. " Or https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-SET set_config ( setting_name text, new_value text, is_local boolean ) → text > > Now when we reach out to the infrastructure team , they are saying these > variables(pg_cluster_log_statement,pg_instance_log_statement) were Where are those variables coming from? I can not find them in RDS or Terraform docs. Thank You Adrian. Actually I was trying to understand if auto_explain can only work and help us see the slow SQL statements in the log after we set the "log_statement" parameter to non-default values (like all, mod, ddl)? And what is the exact threat with logging these queries?

log_statement = 'mod'

create role pwd_test with password 'test';
CREATE ROLE

tail -f /var/log/postgresql/postgresql-16-main.log
<...>
2024-04-21 09:04:17.746 PDT [9664] postgres@test LOG:  statement: create role pwd_test with password 'test';

I think I got the point as you mentioned: having access to the database itself lets someone see the object details. However, do you agree that in the case of RDS, logs are available through different mediums like CloudWatch, Datadog agent etc., so that may pose additional threats? Aah, the joys of managed services where you have to check even more layers when building out your security. Logging itself is not the issue, who has access to the logs is. The more access points the more difficult that gets. 
Dealing with this is going to require a system-wide review by all parties and coming up with an agreed-upon access policy that balances security with the need to monitor what is happening in the database. Otherwise troubleshooting issues will be a long drawn-out process which in itself could end up being a security issue. Because maybe some person doesn't have access to the database directly but still has permission to see the logs, so the appropriate access control needs to be put in place? And additionally I was trying to execute "SELECT set_config('log_statement', 'all', true);" but it says /permission denied to set parameter "log_statement"/. So it might need a higher-privileged user to run it. To answer your question on the variables: those we have in the terraform module; the terraform module is customized by the database infra team, so that might be why we are seeing them there, and they may not be exactly the same as shown in the RDS docs for Postgres. https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/USER_LogAccess.Concepts.PostgreSQL.html -- Adrian Klaver adrian.kla...@aklaver.com
Re: Logging statement having any threat?
On 4/20/24 07:02, Lok P wrote: Now when we reach out to the infrastructure team , they are saying these variables(pg_cluster_log_statement,pg_instance_log_statement) were removed due to potential security threat. So I want to understand from experts here , how this is really a security threat and if any option to get this logging enabled (which will help us debug performance issues) at same time addressing the threat too? I should have added to previous post, if you have access to the database the security wall has already been breached. Regards Lok -- Adrian Klaver adrian.kla...@aklaver.com
Re: Logging statement having any threat?
On 4/20/24 07:02, Lok P wrote: Hello All, It's Postgres version 15.4 on RDS, in which our dev team gets the infrastructure code from another third-party team which provides us base infrastructure code to build a Postgres database, in which we will be able to change DB parameter values etc., whatever is mentioned in the file with possible values. But surprisingly we don't see log_statement there. Below was our requirement: For debugging and evaluating performance we have pg_stat_statements, but it contains aggregated information about all the query executions. In case we just want to debug any point-in-time issues where a selected few queries were performing badly (maybe because of a plan change), we were planning to have the auto_explain extension added and set log_min_duration to ~5 seconds, so that all the queries going above that time period (5 seconds) will be logged and provide detailed information on the exact point of bottleneck. But we see the log_statement parameter has been removed from the base infrastructure script/terraform script given by the database team here, so that means we will get it as the default, which is "none", meaning no statements (SELECT/DML/DDL etc.) will be logged. Have you tried?: https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT " log_statement (enum) <...> The default is none. Only superusers and users with the appropriate SET privilege can change this setting. " Or https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-SET set_config ( setting_name text, new_value text, is_local boolean ) → text Now when we reach out to the infrastructure team, they are saying these variables (pg_cluster_log_statement, pg_instance_log_statement) were Where are those variables coming from? I can not find them in RDS or Terraform docs. removed due to potential security threat. 
So I want to understand from the experts here how this is really a security threat, and whether there is any option to get this logging enabled (which will help us debug performance issues) while at the same time addressing the threat? Regards Lok -- Adrian Klaver adrian.kla...@aklaver.com
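As a sketch of the session-level route suggested by the docs quotes above, assuming the role has the appropriate SET privilege (on RDS, auto_explain typically has to be enabled through the parameter group rather than LOAD):

```sql
-- Hedged sketch: per-session logging settings for debugging slow queries.
LOAD 'auto_explain';                                -- requires superuser; on RDS use the parameter group
SET auto_explain.log_min_duration = '5s';           -- log plans for queries over ~5 seconds
SELECT set_config('log_statement', 'mod', false);   -- false = applies for the whole session
```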
Re: [help] Error in database import
On 4/19/24 20:58, Tu Ho wrote: Hi, I am currently having a project where I need to combine 2 large databases. I was not however able to import an excel file .csv into the There are Excel files (.xls/.xlsx) and there are *.csv files. database. The error was ERROR: syntax error at or near "OIDS" LINE 1: ...ing Site" , "International Name(s)" ) FROM STDIN OIDS DELI... ^" I have no idea how to fix this because I used the "upload file" option. What should I do? Use a recent version of whatever client you are using, as: https://www.postgresql.org/docs/current/sql-copy.html "If 1, OIDs are included in the data; if 0, not. Oid system columns are not supported in PostgreSQL anymore, but the format still contains the indicator." Your client is using old syntax, as the last version of Postgres that allowed FROM STDIN OIDS was: https://www.postgresql.org/docs/11/sql-copy.html "where option can be one of: FORMAT format_name OIDS [ boolean ] " And Postgres 11 is ~6 months past EOL. -- Adrian Klaver adrian.kla...@aklaver.com
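For illustration, a modern COPY matching the column list visible in the truncated error, minus the removed OIDS clause, might look like the sketch below (the table name and options are assumptions, since the original command is cut off):

```sql
-- Hedged sketch: current COPY syntax with no OIDS option (removed after
-- PostgreSQL 11). Table name is hypothetical; columns are from the error.
COPY sites ("Landing Site", "International Name(s)")
  FROM STDIN WITH (FORMAT csv, HEADER true);
```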
Re: Can you refresh a materialized view from a materialized view on another server?
On 4/18/24 19:49, Michael Nolan wrote: My production server has a materialized view that is refreshed from a mysql_fdw several times a day. What I'd like to be able to do is refresh the materialized view on the testbed server from the one on the production server so that they are the same. (Refreshing it from the MySQL server will result in one that has records that have been added or updated rather than an exact copy of the one on the production server.) Use postgres_fdw to connect the test bed to the production server? Mike Nolan htf...@gmail.com -- Adrian Klaver adrian.kla...@aklaver.com
Re: Not able to grant access on pg_signal_backend on azure flexible server
On 4/18/24 22:48, Saksham Joshi wrote: Hi, We have created an Azure PostgreSQL flexible server, added an AD admin as a user, and created our database using this admin user. However, when we are running this command: 'Grant pg_signal_backend To adminUser' we are getting an error that says 'permission denied to grant role "pg_signal_backend"'. While this is strange, the admin user is in fact the owner of the said database and we don't have any other user that has the said privileges. https://www.postgresql.org/docs/current/predefined-roles.html "PostgreSQL provides a set of predefined roles that provide access to certain, commonly needed, privileged capabilities and information. Administrators (including roles that have the CREATEROLE privilege) can GRANT these roles to users and/or other roles in their environment, providing those users with access to the specified capabilities and information." -- Adrian Klaver adrian.kla...@aklaver.com
Re: Question on trigger
On 4/16/24 12:39, veem v wrote: On Sat, 13 Apr 2024 at 21:44, Adrian Klaver wrote: On 4/13/24 00:03, veem v wrote: > Thank you Adrian. > > So it seems the heavy DML tables will see an impact if having triggers > (mainly for-each-row triggers) created on them. > > And also the bulk DML/array-based insert (which inserts multiple rows in > one shot or one batch), in those cases it seems the trigger will not > make that happen as it will force it to make it happen row by row, as > the trigger is row based. Will test anyway though. You said you have triggers in the Oracle database and I assumed they worked and were not a show-stopping issue there. What makes you think that would be different in Postgres? What type of triggers were there in Oracle: per row, per statement or a mix? Actually we have row-level triggers in Oracle which are running for smaller-volume DML and are making the direct-path inserts happen as conventional row-by-row inserts in the presence of a trigger. So was wondering Not sure what the above means, you will need to provide a more detailed description. Though any DML you are doing on a table that has any sort of constraint, index, trigger, foreign key, default values, etc. is going to have more overhead than into an unencumbered table. FYI, some of the preceding are system triggers, for example foreign keys. if in Postgres we will be encountering a similar issue and batch inserts may be converted back to row by row automatically. And here we are going to process higher-volume DMLs in the PostgreSQL database. Hard to say with the information provided. Easiest way to find out is create a test setup and run the code. Though I guess, as I have not actually tried this, you could have a per-row trigger and a per-statement trigger for the same action and disable the per-row and enable the per-statement trigger for batch operations. Then once the batch operation is done reverse the process. Again something to test to verify. 
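The disable/enable dance suggested above could be sketched like this, with hypothetical trigger names:

```sql
-- Hedged sketch: swap a per-row audit trigger for a per-statement one
-- around a batch load, then restore. Trigger names are illustrative.
ALTER TABLE tab DISABLE TRIGGER tab_aud_row_trg;
ALTER TABLE tab ENABLE TRIGGER tab_aud_stmt_trg;
-- ... run the batch INSERT/UPDATE here ...
ALTER TABLE tab DISABLE TRIGGER tab_aud_stmt_trg;
ALTER TABLE tab ENABLE TRIGGER tab_aud_row_trg;
```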
-- Adrian Klaver adrian.kla...@aklaver.com
Re: constant crashing
On 4/14/24 14:50, jack wrote: Reply to list also Ccing list Hello, I am not sure what "locale" means. Go to the Settings app for whatever version of Windows you are on and search for locale. The Windows app is an in-house application which uses Actian-Zen SQL. The data is exported to simple ASCII in a tab-delimited format similar to CSV. And you know it is ASCII for a fact? Those files are then imported into the PostgreSQL table using COPY. Importing the data is not an issue. I am able to load all the data without any problems, even into 1 table which ends up with about 1.2 billion records. But when I try to update the data in that table I get many errors, essentially crashes. Repeating what has been asked and answered is not really going anywhere. There may be some control characters (garbage) in the data but that should not crash postgresql, especially if it can import the data without issues. Unless it does. That is the point of the questions, getting to what is actually causing the issue. Until the problem can be boiled down to a reproducible test case there really is not much hope of anything more than the 'yes you have a problem' answer. And there is a difference between dumping data into a table and then doing an UPDATE where the data strings are manipulated by functions. Anyway, I hope I answered your questions. Thanks for your help. On Sunday, April 14th, 2024 at 4:28 PM, Adrian Klaver wrote: On 4/14/24 13:18, jack wrote: The CSV files are being produced by another system, a Windows app on a Windows machine. I then copy them to a USB key and copy them onto the ubuntu machine. The data is then imported via the COPY command. The app? The locale in use on the Windows machine? The locale in use in the database? COPY master (field01,field02..fieldX) FROM '/data/file.text' DELIMITER E'\t' The fields are tab delimited. But importing the data works. I can get all the data into a single table without any problems. 
The issue is only when I start to update the single table. And that is why I started using smaller temporary tables for each CSV file, to do the updates in the smaller tables before I move them all to a single large table. The import is just dumping the data in; my suspicion is the problem is related to using string functions on the data. After all the data is loaded and updated, I run php programs on the large table to generate reports. All of which works well EXCEPT for performing the updates on the data. And I do not want to use perl or any outside tool. I want it all done in SQL because I am required to document all my steps so that someone else can take over, so everything needs to be as simple as possible. -- Adrian Klaver adrian.kla...@aklaver.com -- Adrian Klaver adrian.kla...@aklaver.com
Re: constant crashing
On 4/14/24 13:18, jack wrote: The CSV files are being produced by another system, a Windows app on a Windows machine. I then copy them to a USB key and copy them onto the ubuntu machine. The data is then imported via the COPY command. The app? The locale in use on the Windows machine? The locale in use in the database? COPY master (field01,field02..fieldX) FROM '/data/file.text' DELIMITER E'\t' The fields are tab delimited. But importing the data works. I can get all the data into a single table without any problems. The issue is only when I start to update the single table. And that is why I started using smaller temporary tables for each CSV file, to do the updates in the smaller tables before I move them all to a single large table. The import is just dumping the data in; my suspicion is the problem is related to using string functions on the data. After all the data is loaded and updated, I run php programs on the large table to generate reports. All of which works well EXCEPT for performing the updates on the data. And I do not want to use perl or any outside tool. I want it all done in SQL because I am required to document all my steps so that someone else can take over, so everything needs to be as simple as possible. -- Adrian Klaver adrian.kla...@aklaver.com
Re: constant crashing
On 4/14/24 12:22, jack wrote: Here is an excerpt of /var/log/postgresql/postgresql-16-main.log Where and how are the CSV files being produced? What is the database locale? -- Adrian Klaver adrian.kla...@aklaver.com
Re: constant crashing
On 4/14/24 09:20, jack wrote: The full error reads: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. error: connection to server was lost Look at the OS system log. PostgreSQL 16.2 I also believe it is a resource issue which can be rectified with a setting, but which setting? If you were updating 100 million records what settings would you adjust? Here are the updates I am performing on the 100 million records:

UPDATE table SET category_modified = UPPER(category);

UPDATE table SET category_modified = REGEXP_REPLACE(REPLACE(REPLACE(category_modified, ,'-'), '`', '-'), '\s{2,}', ' ', 'g') WHERE AND LENGTH(category_modified)>1 AND POSITION('--' IN category_modified)>0;

UPDATE table SET category_modified = REPLACE(category_modified,' ','-');

UPDATE table SET category_modified = CASE WHEN category_modified IS NOT NULL THEN regexp_replace(category_modified, '[^a-zA-Z]$', '') ELSE NULL END;

UPDATE table SET category_modified = regexp_replace(category_modified, '-{2,}', '-', 'g');

UPDATE table SET category_modified = SUBSTRING(category_modified FROM 1 FOR LENGTH(category_modified) - 1) WHERE LENGTH(category_modified)>1 AND category_modified LIKE '%-';

Is the above all being done in one script/transaction? Again what are the table definitions for the tables being copied into and/or modified? -- Adrian Klaver adrian.kla...@aklaver.com
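As an aside, each of the statements above rewrites the entire 100-million-row table. They could in principle be collapsed into a single pass, something like the sketch below; the combined expression only approximates the original sequence (the table name is a placeholder, and the regexes would need verifying against real data before use):

```sql
-- Hedged sketch: one full-table pass instead of six. Approximates the
-- chain above: uppercase, spaces to hyphens, collapse repeated hyphens,
-- strip trailing non-letters. Not a drop-in replacement.
UPDATE big_table
SET category_modified = regexp_replace(
        regexp_replace(
            replace(upper(category), ' ', '-'),
            '-{2,}', '-', 'g'),
        '[^A-Z]+$', '')
WHERE category IS NOT NULL;
```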
Re: constant crashing
On 4/14/24 07:24, jack wrote: Hello, I am trying to load about 1.4 billion records into a postgresql table. The data is in about 100 individual CSV files which I load individually into the table using the COPY command. After the data is loaded I perform updates to the data, and this is when I get errors. Some updates work, some crash. I am using psql and an SQL script file (update.sql) to perform the updating (UPDATE table SET field=UPPER(field), etc.). The errors are all "Server closed the connection unexpectedly" I have been working on this for just over 1 year now, documenting every step, and I am still unable to get this to work without it crashing somewhere along the way. I am beginning to wonder if postgreSQL is bi-polar. More information needed, to start: 1) Complete example of the code in the script. 2) Table definition(s) of those being copied into. 3) Sample of the data being copied. 4) The error message(s) generated. 5) Database locale Any help would be greatly appreciated. Thank you -- Adrian Klaver adrian.kla...@aklaver.com
Re: Question on trigger
On 4/13/24 00:03, veem v wrote: Thank you Adrian. So it seems the heavy DML tables will see an impact if having triggers (mainly for-each-row triggers) created on them. And also the bulk DML/array-based insert (which inserts multiple rows in one shot or one batch), in those cases it seems the trigger will not make that happen as it will force it to make it happen row by row, as the trigger is row based. Will test anyway though. You said you have triggers in the Oracle database and I assumed they worked and were not a show-stopping issue there. What makes you think that would be different in Postgres? What type of triggers were there in Oracle: per row, per statement or a mix? On Thu, 11 Apr 2024 at 22:00, Adrian Klaver wrote: On 4/11/24 07:31, veem v wrote: > Hi, We used to use Oracle database in which we had audit > triggers(something as below) mandated for all tables by the control > team. Now we are going to use the postgresql 15.4 database for one of > our applications. So,wanted to understand if there exists any downside > of such audit trigger setup for all the tables? Will it impact the bulk > data insert/update/delete OR slowdown of any of the DML operations > significantly (and thus will not be advisable to use for all tables but > selected ones)? Triggers are overhead in Postgres as they were in Oracle. If they didn't cause an issue in Oracle I would suspect that would also be the case in Postgres. To confirm you would need to create a test setup and run some common operations and see what the overhead is. 
Some potential performance improvements: https://www.postgresql.org/docs/current/sql-createtrigger.html <https://www.postgresql.org/docs/current/sql-createtrigger.html> "...a trigger that is marked FOR EACH STATEMENT only executes once for any given operation, regardless of how many rows it modifies (in particular, an operation that modifies zero rows will still result in the execution of any applicable FOR EACH STATEMENT triggers)." <...> "The REFERENCING option enables collection of transition relations, which are row sets that include all of the rows inserted, deleted, or modified by the current SQL statement. This feature lets the trigger see a global view of what the statement did, not just one row at a time. This option is only allowed for an AFTER trigger that is not a constraint trigger; also, if the trigger is an UPDATE trigger, it must not specify a column_name list. OLD TABLE may only be specified once, and only for a trigger that can fire on UPDATE or DELETE; it creates a transition relation containing the before-images of all rows updated or deleted by the statement. Similarly, NEW TABLE may only be specified once, and only for a trigger that can fire on UPDATE or INSERT; it creates a transition relation containing the after-images of all rows updated or inserted by the statement." As example: https://www.postgresql.org/docs/current/plpgsql-trigger.html <https://www.postgresql.org/docs/current/plpgsql-trigger.html> Example 43.7. 
Auditing with Transition Tables > > CREATE OR REPLACE TRIGGER TAB_AUD_TRG > BEFORE DELETE OR INSERT OR UPDATE > ON tab > FOR EACH ROW > BEGIN > IF inserting THEN > :NEW.create_timestamp := systimestamp; > :NEW.create_userid := sys_context('USERENV','SESSION_USER'); > :NEW.update_timestamp := systimestamp; > :NEW.update_userid := sys_context('USERENV','SESSION_USER'); > ELSIF updating THEN > IF updating('create_userid') OR updating('create_timestamp') THEN > :new.create_userid := :old.create_userid; > :new.create_timestamp := :old.create_timestamp; > END IF; > :NEW.update_timestamp := systimestamp; > :NEW.update_userid := sys_context('USERENV','SESSION_USER'); > END IF; > END; > / > > Regards > Veem -- Adrian Klaver adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com> -- Adrian Klaver adrian.kla...@aklaver.com
Re: Question on trigger
On 4/11/24 07:31, veem v wrote: Hi, We used to use Oracle database in which we had audit triggers (something as below) mandated for all tables by the control team. Now we are going to use the postgresql 15.4 database for one of our applications. So, wanted to understand if there exists any downside of such audit trigger setup for all the tables? Will it impact the bulk data insert/update/delete OR slowdown of any of the DML operations significantly (and thus will not be advisable to use for all tables but selected ones)? Triggers are overhead in Postgres as they were in Oracle. If they didn't cause an issue in Oracle I would suspect that would also be the case in Postgres. To confirm you would need to create a test setup and run some common operations and see what the overhead is. Some potential performance improvements: https://www.postgresql.org/docs/current/sql-createtrigger.html "...a trigger that is marked FOR EACH STATEMENT only executes once for any given operation, regardless of how many rows it modifies (in particular, an operation that modifies zero rows will still result in the execution of any applicable FOR EACH STATEMENT triggers)." <...> "The REFERENCING option enables collection of transition relations, which are row sets that include all of the rows inserted, deleted, or modified by the current SQL statement. This feature lets the trigger see a global view of what the statement did, not just one row at a time. This option is only allowed for an AFTER trigger that is not a constraint trigger; also, if the trigger is an UPDATE trigger, it must not specify a column_name list. OLD TABLE may only be specified once, and only for a trigger that can fire on UPDATE or DELETE; it creates a transition relation containing the before-images of all rows updated or deleted by the statement. 
Similarly, NEW TABLE may only be specified once, and only for a trigger that can fire on UPDATE or INSERT; it creates a transition relation containing the after-images of all rows updated or inserted by the statement." As example: https://www.postgresql.org/docs/current/plpgsql-trigger.html Example 43.7. Auditing with Transition Tables CREATE OR REPLACE TRIGGER TAB_AUD_TRG BEFORE DELETE OR INSERT OR UPDATE ON tab FOR EACH ROW BEGIN IF inserting THEN :NEW.create_timestamp := systimestamp; :NEW.create_userid := sys_context('USERENV','SESSION_USER'); :NEW.update_timestamp := systimestamp; :NEW.update_userid := sys_context('USERENV','SESSION_USER'); ELSIF updating THEN IF updating('create_userid') OR updating('create_timestamp') THEN :new.create_userid := :old.create_userid; :new.create_timestamp := :old.create_timestamp; END IF; :NEW.update_timestamp := systimestamp; :NEW.update_userid := sys_context('USERENV','SESSION_USER'); END IF; END; / Regards Veem -- Adrian Klaver adrian.kla...@aklaver.com
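For comparison, a rough PostgreSQL translation of the Oracle audit trigger above might look like the following sketch; the function and trigger names are illustrative, and it assumes the same audit columns exist on the table:

```sql
-- Hedged sketch: a PostgreSQL per-row equivalent of the Oracle audit
-- trigger shown above. Names and columns follow the Oracle example.
CREATE OR REPLACE FUNCTION tab_aud_fn() RETURNS trigger AS $$
BEGIN
  IF TG_OP = 'INSERT' THEN
    NEW.create_timestamp := clock_timestamp();
    NEW.create_userid := session_user;
  ELSIF TG_OP = 'UPDATE' THEN
    -- preserve the original creation audit columns
    NEW.create_timestamp := OLD.create_timestamp;
    NEW.create_userid := OLD.create_userid;
  END IF;
  NEW.update_timestamp := clock_timestamp();
  NEW.update_userid := session_user;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tab_aud_trg
  BEFORE INSERT OR UPDATE ON tab
  FOR EACH ROW EXECUTE FUNCTION tab_aud_fn();
```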
Re: Failure of postgres_fdw because of TimeZone setting
On 4/10/24 1:31 PM, Tom Lane wrote: Adrian Klaver writes: On 4/10/24 12:38, Adnan Dautovic wrote: By the way, the row count of pg_timezone_names is 385, but I do not know how that compares to a more standard installation. On my instance of Postgres 16.2, 1196. You're probably using a build with --with-system-tzdata pointing at a system tzdata tree that includes leap-second-aware zones. These tend to have duplicative entries like "America/New_York" and "posix/America/New_York". (There's also a subtree like "right/America/New_York", but we reject those because we don't do leap seconds.) The real number of distinct zones in a standard tzdata file set these days is a shade under 600. It's the PGDG package running on Ubuntu 22.04. regards, tom lane -- Adrian Klaver adrian.kla...@aklaver.com
Re: Failure of postgres_fdw because of TimeZone setting
On 4/10/24 12:38, Adnan Dautovic wrote: Hi, On 05. Apr 2024, at 16:13, Tom Lane wrote: Adnan Dautovic writes: By the way, the row count of pg_timezone_names is 385, but I do not know how that compares to a more standard installation. On my instance of Postgres 16.2, 1196. Kind regards, Adnan Dautovic -- Adrian Klaver adrian.kla...@aklaver.com
Re: Some advice need after a 20 year gap after Ingres/GUIs
On 4/10/24 03:11, John Bateson wrote: Good morning, The end of my working life was taken up with developing and supporting a Scientific Research establishment near Warrington in the UK. I had a small team of programmers who did an excellent job for me and with me. The software was Ingres and the main program supporting user administration on a Synchrotron was built under OpenROAD, with other developments around the on-site stores and finance using ABF, since the stores workers found this much faster than a GUI! There was also some web development using .NET practices. This was a quite complex system in the end with nearly 200 tables. We ended up using Ingres Replicator with the intention of running it from two sites 180 miles apart – at the time it was a pile of *notquitegoodenough*! This was early this century so Actian may have made some improvements since then… So much for the background, to establish that I am not a complete newbie, just out of the loop for a while. *Using Postgres and pgAdmin 4*. So, 20 years later I am developing (unpaid) a new project for some historic railways in the UK and linking these to the development and design of a range of kits for those interested in model railways. This is getting towards 20 tables so far. What I really need is a recommendation for the current best practice for an easy GUI that will allow me to press a button without opening up pgAdmin or a black screen, i.e. the traditional .EXE file I can put on a button on the screen on a Windows 11 based system. With Postgres or another similar client-server database there will be a need to set up the database server somewhere first, before you get to the GUI interface part. Are you planning on a single instance of Postgres that folks log in to from multiple locations? Or do you want a standalone setup that each user has on their machine? 
While my programming history goes back to MDBS-4 and beyond, I know I have some catching up to do and, while not a complete newbie, need something I can work on quickly, intuitively and inexpensively! So, a recommendation and possibly some consensus would be very much appreciated. And apologies if I have taken up too much of your time or have placed this in the wrong forum. John -- Adrian Klaver adrian.kla...@aklaver.com
Re: Issue with date/timezone conversion function
On 4/9/24 11:24, Lok P wrote: On Tue, Apr 9, 2024 at 10:33 PM Tom Lane <mailto:t...@sss.pgh.pa.us>> wrote: 'EST' is going to rotate to UTC-5, but that's probably not what you want in the summer. I'd suggest AT TIME ZONE 'America/New_York' or the like. See https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-TIMEZONES <https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-TIMEZONES> regards, tom lane Thank you so much. You are correct. AT TIME ZONE 'America/New_York' is giving the correct EST time conversion. But regarding why it looks to be shifting, i.e. the same time duration appears to be holding a different count of transactions while the base table is not getting updated/inserted/deleted for its historical create_timestamps, I suspect the conversion part below. The task is to count transactions in each ~15-minute interval and publish them in ordered fashion, i.e. something as below, but the way it's been written seems wrong. It's an existing script. It first gets the date component with truncated hour and then adds the time component to it to make ~15-minute intervals. Can it be written in some simpler way? 
9-apr-2024 14:00 12340
9-apr-2024 14:15 12312
9-apr-2024 14:30 12323
9-apr-2024 14:45 12304

DATE_TRUNC('hour', create_timestamp AT TIME ZONE 'America/New_York') +
(EXTRACT(MINUTE FROM create_timestamp AT TIME ZONE 'America/New_York') / 15 * 15) * INTERVAL '15 minute'

Something like?:

create table dt_bin_test(id integer, tz_fld timestamptz);

insert into dt_bin_test values
(1, '2024-04-01 8:15'),
(2, '2024-04-01 9:01'),
(3, '2024-04-01 9:16'),
(4, '2024-04-01 9:45'),
(5, '2024-04-01 8:15'),
(6, '2024-04-01 9:01');

select count(tz_fld), date_bin('15 minutes', tz_fld, '2024-01-01') as bin
from dt_bin_test
group by date_bin('15 minutes', tz_fld, '2024-01-01');

 count |          bin
-------+------------------------
     2 | 2024-04-01 09:00:00-07
     2 | 2024-04-01 08:15:00-07
     1 | 2024-04-01 09:15:00-07
     1 | 2024-04-01 09:45:00-07

-- Adrian Klaver adrian.kla...@aklaver.com
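Applied to the original question, a minimal sketch of the date_bin() approach with the timezone conversion folded in (transaction_tab and create_timestamp are the names from the thread; everything else is illustrative):

```sql
-- Bin each transaction into a 15-minute bucket in New York local time.
-- AT TIME ZONE converts the stored timestamptz (UTC) to a local timestamp,
-- and date_bin() replaces the DATE_TRUNC + EXTRACT arithmetic entirely.
select date_bin('15 minutes',
                create_timestamp at time zone 'America/New_York',
                timestamp '2024-01-01') as bucket,
       count(*)
from transaction_tab
group by bucket
order by bucket;
```

The third argument is just an origin point for the bins; any timestamp that falls on a 15-minute boundary works.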
Re: Issue with date/timezone conversion function
On 4/9/24 9:16 AM, Lok P wrote: On Tue, Apr 9, 2024 at 9:26 PM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote: On 4/9/24 08:43, Lok P wrote: > Hi All, > It's version 15.4 of the PostgreSQL database. Every "date/time" data type > attribute gets stored in the database in the UTC timezone only. One of the > support persons' local timezone is "asia/kolkata" and that support > person needs to fetch the count of transactions from a table- > transaction_tab and share it with another person/customer who is in the > EST timezone, so basically the transactions have to be shown or displayed > in the EST timezone. What is the datatype for the create_timestamp? What does SHOW timezone; return on the server? Thank you for the quick response. The column data type for "create_timestamp" is "timestamptz". SHOW timezone from the support user's client machine UI shows "Asia/Calcutta". Not having access to run "SHOW timezone" on the server currently, I will try to get it. Output from pg_settings shows setting as "Asia/Calcutta", reset_val as "Asia/Calcutta", boot_val as "GMT". In the pg_settings query what are the source, sourcefile, sourceline fields set to? -- Adrian Klaver adrian.kla...@aklaver.com
Re: Issue with date/timezone conversion function
On 4/9/24 08:43, Lok P wrote: Hi All, It's version 15.4 of the PostgreSQL database. Every "date/time" data type attribute gets stored in the database in the UTC timezone only. One of the support persons' local timezone is "asia/kolkata" and that support person needs to fetch the count of transactions from a table- transaction_tab and share it with another person/customer who is in the EST timezone, so basically the transactions have to be shown or displayed in the EST timezone. What is the datatype for the create_timestamp? What does SHOW timezone; return on the server? -- Adrian Klaver adrian.kla...@aklaver.com
Re: [Code: 0, SQL State: 0A000] when "typing" from pg_catalog
On 4/9/24 08:12, Thiemo Kellner wrote: Thanks for taking this up. Am 09.04.2024 um 17:09 schrieb Adrian Klaver: On 4/9/24 07:59, Thiemo Kellner wrote: [Code: 0, SQL State: 0A000] ERROR: References to other databases are not implemented: pg_catalog.pg_roles.rolname Position: 298 [Script position: 334 - 361] PG_CATALOG.PG_ROLES.ROLNAME := 'snowrunner_reader'; Is being seen as DB_NAME.TABLE_NAME.COLUMN_NAME. Yes, obviously, but why? With the information_schema view all is fine. And, I suppose, with all other objects in other schemas of the same database too. Because you did not do?: PG_CATALOG.PG_ROLES.ROLNAME%type -- Adrian Klaver adrian.kla...@aklaver.com
Re: [Code: 0, SQL State: 0A000] when "typing" from pg_catalog
On 4/9/24 07:59, Thiemo Kellner wrote: Hi I have the following function code. When trying to install, it gives me

[Code: 0, SQL State: 0A000] FEHLER: Verweise auf andere Datenbanken sind nicht implementiert: pg_catalog.pg_roles.rolname Position: 298 [Script position: 334 - 361]
[Code: 0, SQL State: 0A000] ERROR: References to other databases are not implemented: pg_catalog.pg_roles.rolname Position: 298 [Script position: 334 - 361]

To the best of my knowledge, pg_catalog is a schema, not a database, like information_schema. Am I missing something? And why is it not allowed to type from the catalogue? I presume this example is rather academic due to the name type.

PG_CATALOG.PG_ROLES.ROLNAME := 'snowrunner_reader'; Is being seen as DB_NAME.TABLE_NAME.COLUMN_NAME.

Kind regards Thiemo

create or replace function GRANT_SELECTS()
    returns void
    language plpgsql
as
$body$
declare
    C_SCHEMA_NAME constant INFORMATION_SCHEMA.SCHEMATA.SCHEMA_NAME%type := 'snowrunner';
--  C_ROLE_NAME constant name :=
    C_ROLE_NAME constant PG_CATALOG.PG_ROLES.ROLNAME := 'snowrunner_reader';
    V_SQL_STATEMENT text;
begin
    -- Check the existance of the schema
    perform 1 from INFORMATION_SCHEMA.SCHEMATA where SCHEMA_NAME = C_SCHEMA_NAME;
    if not found then
        raise exception 'Schema "%s" does not exist!', C_SCHEMA_NAME;
    end if;

    -- Check the existance of the role
    perform 1 from PG_CATALOG.PG_ROLES where ROLNAME = C_ROLE_NAME;
    if not found then
        raise exception 'Role "%s" does not exist!', C_ROLE_NAME;
    end if;

    -- Issue grants
    V_SQL_STATEMENT := format('grant select on all tables in schema %i to %i', C_SCHEMA_NAME, C_ROLE_NAME);
    raise info '%', V_SQL_STATEMENT;
    execute V_SQL_STATEMENT;

    V_SQL_STATEMENT := format('grant select on all views in schema %i to %i', C_SCHEMA_NAME, C_ROLE_NAME);
    raise info '%', V_SQL_STATEMENT;
    execute V_SQL_STATEMENT;

    V_SQL_STATEMENT := format('grant select on all materialized views in schema %i to %i', C_SCHEMA_NAME, C_ROLE_NAME);
    raise info '%', V_SQL_STATEMENT;
    execute V_SQL_STATEMENT;

    commit;
    return;
end;
$body$;

-- Adrian Klaver adrian.kla...@aklaver.com
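For reference, a corrected sketch of the function above (my own revision, not from the thread): it adds the missing %type that the error message points to, swaps %i for %I in format() (plain %i is not a recognized format() specifier), uses RAISE's % placeholder rather than "%s", and drops the COMMIT, which is not allowed inside a function. Note also that GRANT ... ON ALL TABLES IN SCHEMA already covers views and materialized views, so the separate "all views" statements are unnecessary (and not valid GRANT syntax):

```sql
create or replace function grant_selects()
    returns void
    language plpgsql
as
$body$
declare
    c_schema_name constant information_schema.schemata.schema_name%type := 'snowrunner';
    c_role_name   constant pg_catalog.pg_roles.rolname%type := 'snowrunner_reader';
    v_sql_statement text;
begin
    -- Check the existence of the schema
    perform 1 from information_schema.schemata where schema_name = c_schema_name;
    if not found then
        raise exception 'Schema "%" does not exist!', c_schema_name;
    end if;

    -- Check the existence of the role
    perform 1 from pg_catalog.pg_roles where rolname = c_role_name;
    if not found then
        raise exception 'Role "%" does not exist!', c_role_name;
    end if;

    -- ALL TABLES covers ordinary tables, views, and materialized views;
    -- %I quote-protects the identifiers.
    v_sql_statement := format('grant select on all tables in schema %I to %I',
                              c_schema_name, c_role_name);
    raise info '%', v_sql_statement;
    execute v_sql_statement;
end;
$body$;
```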
Re: Regarding: Replication of TRUNCATE commands is not working
On 4/9/24 05:07, Arvind Raghuwanshi wrote: Hi Laurenz, Thanks for the response. Question: What PostgreSQL version are you using? The feature was introduced in v11. Answer: I am using PostgreSQL version 16.0. db1=> SELECT version(); version - PostgreSQL 16.0 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit (1 row) Question: How exactly is the publication defined? Perhaps TRUNCATE is excluded. I am not using the subscribe/publication model, but I have created a To be clear you have not done CREATE PUBLICATION on the source machine, correct? What is the rest of your configuration per: https://www.postgresql.org/docs/current/logical-replication-config.html Also what exactly is being TRUNCATEd? replication slot on the source database and then I am calling the below function to get the details from the replication slot: select data from pg_logical_slot_peek_changes('db1_slot', NULL, NULL, 'pretty-print', '1'); In the case of TRUNCATE, the above function does not show any records; however it does show insert, update and delete events. As you mentioned TRUNCATE is excluded, is there any way to include TRUNCATE? Thanks Arvind On Tue, Apr 9, 2024 at 4:08 PM Laurenz Albe <mailto:laurenz.a...@cybertec.at>> wrote: On Tue, 2024-04-09 at 14:50 +0530, Arvind Raghuwanshi wrote: > I have tried to run the TRUNCATE command and found out that it's not getting replicated using logical replication for pgsql. > I have also checked the schema change using the pg_dump command but the schema change is also not getting detected for the TRUNCATE command. > > However on the pgsql logical replication doc page[1], it's mentioned that replication of TRUNCATE commands is supported. > > [1] PostgreSQL: Documentation: 16: 31.6. Restrictions > > Any idea how we can solve this? What PostgreSQL version are you using? The feature was introduced in v11. How exactly is the publication defined? Perhaps TRUNCATE is excluded. 
Yours, Laurenz Albe -- Adrian Klaver adrian.kla...@aklaver.com
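For what it's worth, with the built-in publish/subscribe model that Laurenz refers to, TRUNCATE replication is controlled by the publication's publish parameter (which includes truncate by default since v11). A minimal sketch, with illustrative names:

```sql
-- On the source database: publish the table, explicitly including truncate.
CREATE PUBLICATION pub_txn FOR TABLE public.transaction_tab
    WITH (publish = 'insert, update, delete, truncate');

-- On the subscriber (connection string is illustrative):
-- CREATE SUBSCRIPTION sub_txn
--     CONNECTION 'host=source-host dbname=db1 user=repl'
--     PUBLICATION pub_txn;
```

Whether a raw replication slot shows TRUNCATE events depends on the output plugin the slot was created with, which the thread does not specify.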
Re: PEM install error
On 4/8/24 07:24, mark bradley wrote: While installing PostgreSQL I am getting this error message during the PEM server portion. Do I need the PEM server? If so, what is the solution? PEM Server is an EDB product: https://www.enterprisedb.com/docs/pem/latest/managing_pem_server/ You should reach out to their tech support. Thanks! -- Adrian Klaver adrian.kla...@aklaver.com
Re: Moving delta data faster
On 4/6/24 13:04, yudhi s wrote: On Sat, Apr 6, 2024 at 10:25 PM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote: Your original problem description was: "Then subsequently these rows will be inserted/updated based on the delta number of rows that got inserted/updated in the source database. In some cases these changed data can flow multiple times per day to the downstream i.e. postgres database and in other cases once daily." If the above is not a hard rule, then yes up to some point just replacing the data in mass would be the simplest/fastest method. You could cut a step out by doing something like TRUNCATE target_tab and then COPY target_tab FROM 'source.csv' bypassing the INSERT INTO source_tab. Yes, actually I didn't realize that truncate table is transactional/online here in postgres. In other databases like Oracle it's downtime for the read queries on the target table, as data will be vanished from the target table post truncate (until the data load happens) and those are auto commit. Thanks Veem for sharing that option. I also think that truncate will be faster if the changes/delta are large, but if it's a handful of rows, like <5% of the rows in the table, then Upsert/Merge will perform better. And also the downside of the truncate option is, it requires bringing/exporting all the data from source to the S3 file, which may take longer as compared to bringing just the delta records. Correct me if I'm wrong. Since you still have not specified how the data is stored in S3 and how you propose to move them into Postgres I can't really answer. However I am still not able to understand why the upsert is less performant than merge, could you throw some light on this please? 
I have no idea how this works in the code, but my suspicion is it is due to the following: https://www.postgresql.org/docs/current/sql-insert.html#SQL-ON-CONFLICT "The optional ON CONFLICT clause specifies an alternative action to raising a unique violation or exclusion constraint violation error. For each individual row proposed for insertion, either the insertion proceeds, or, if an arbiter constraint or index specified by conflict_target is violated, the alternative conflict_action is taken. ON CONFLICT DO NOTHING simply avoids inserting a row as its alternative action. ON CONFLICT DO UPDATE updates the existing row that conflicts with the row proposed for insertion as its alternative action." vs this: "First, the MERGE command performs a join from data_source to target_table_name producing zero or more candidate change rows. For each candidate change row, the status of MATCHED or NOT MATCHED is set just once, after which WHEN clauses are evaluated in the order specified. For each candidate change row, the first clause to evaluate as true is executed. No more than one WHEN clause is executed for any candidate change row." Where ON CONFLICT attempts the INSERT then on failure does the UPDATE for the ON CONFLICT DO UPDATE case. MERGE on the other hand evaluates based on the join condition (ON tbl1.fld = tbl2.fld) and then based on MATCHED/NOT MATCHED takes the appropriate action for the first WHEN match. In other words it goes directly to the appropriate action. -- Adrian Klaver adrian.kla...@aklaver.com
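The two code paths side by side, as a minimal sketch (target_tab, staging_tab, and the columns are illustrative, not from the thread):

```sql
-- ON CONFLICT: attempt the INSERT; on a unique violation on id, do the UPDATE.
insert into target_tab as t (id, val)
select s.id, s.val
from staging_tab s
on conflict (id) do update
    set val = excluded.val;

-- MERGE: join first, then execute the first matching WHEN clause per row.
merge into target_tab t
using staging_tab s on t.id = s.id
when matched then
    update set val = s.val
when not matched then
    insert (id, val) values (s.id, s.val);
```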
Re: Moving delta data faster
On 4/6/24 08:47, yudhi s wrote: Thank you Adrian, Greg and Veem. I tried writing a small routine to see how the performance differs in these four approaches i.e. Upsert VS traditional update+insert VS Merge VS Truncate+load. Initially I was thinking Upsert would perform the same as Merge as the logic looks similar, but it seems it's the worst performing among all; not sure why, yet to know the reason though. Truncate+load seems to be the best performing among all. Hope I am doing it correctly. Please correct me if I'm wrong. Your original problem description was: "Then subsequently these rows will be inserted/updated based on the delta number of rows that got inserted/updated in the source database. In some cases these changed data can flow multiple times per day to the downstream i.e. postgres database and in other cases once daily." If the above is not a hard rule, then yes up to some point just replacing the data in mass would be the simplest/fastest method. You could cut a step out by doing something like TRUNCATE target_tab and then COPY target_tab FROM 'source.csv' bypassing the INSERT INTO source_tab. -- Adrian Klaver adrian.kla...@aklaver.com
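The truncate-and-reload path sketched out (the file path and table name are illustrative). Because TRUNCATE is transactional in PostgreSQL, readers keep seeing the old contents until the COMMIT:

```sql
begin;
truncate target_tab;
copy target_tab from '/tmp/source.csv' with (format csv, header true);
commit;
```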
Re: Failure of postgres_fdw because of TimeZone setting
On 4/5/24 02:39, Adnan Dautovic wrote: Dear Adrian, Adrian Klaver wrote: Define 'read-only', especially as it applies to the privileges on the public schema. I am not quite sure which information you are looking for exactly. According to this [1], I ran the following query:

WITH "names"("name") AS (
    SELECT n.nspname AS "name"
    FROM pg_catalog.pg_namespace n
    WHERE n.nspname !~ '^pg_' AND n.nspname <> 'information_schema'
)
SELECT "name",
       pg_catalog.has_schema_privilege(current_user, "name", 'CREATE') AS "create",
       pg_catalog.has_schema_privilege(current_user, "name", 'USAGE') AS "usage"
FROM "names";

And received the following result:

"name"   "create" "usage"
"public" true     true

Looks alright. The below is the issue. Per Tom Lane's comments on timezone, log into the remote server and do:

SHOW timezone;
Europe/Berlin

SET timezone = 'etc/UTC';
ERROR: invalid value for parameter "TimeZone": "etc/UTC" SQL state: 22023

SET timezone = 'UTC';
ERROR: invalid value for parameter "TimeZone": "UTC" SQL state: 22023

However, this led me to [2] and I find the output very interesting:

SELECT * FROM pg_timezone_names ORDER BY name;

The below is cut down from the actual output, as there should be at least: Europe/Berlin CEST 02:00:00 t present also?

"name"      "abbrev" "utc_offset" "is_dst"
"Turkey"    "+03"    "03:00:00"   false
"UCT"       "UCT"    "00:00:00"   false

Hmm, I get: UCT UTC 00:00:00 f; could be a version difference though.

"Universal" "UTC"    "00:00:00"   false
"W-SU"      "MSK"    "03:00:00"   false

And then attempting:

SET timezone = 'Universal';
SET
Query returned successfully in 100 msec.

Any ideas on how to proceed? 1) For the long term contact whomever is in charge of the remote server and ask them what they have done with the timezones, why, and whether they can fix it? 
2) In short term per the link from your first post and with no guarantees: https://github.com/postgres/postgres/blob/936e3fa3787a51397280c1081587586e83c20399/contrib/postgres_fdw/connection.c#L677 In the source code change do_sql_command(conn, "SET timezone = 'UTC'"); to do_sql_command(conn, "SET timezone = 'Universal'"); As from the link: "Set remote timezone; this is basically just cosmetic" Then recompile the extension. Kind regards, Adnan Dautovic [1]: https://stackoverflow.com/a/36095257 [2]: https://stackoverflow.com/a/32009497 -- Adrian Klaver adrian.kla...@aklaver.com
Re: Moving delta data faster
On 4/4/24 13:42, yudhi s wrote: On Thu, Apr 4, 2024 at 9:04 PM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote: On 4/3/24 22:24, yudhi s wrote: > > On Thu, Apr 4, 2024 at 10:16 AM Adrian Klaver mailto:adrian.kla...@aklaver.com> > <mailto:adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>>> wrote: S3 is not a database. You will need to be more specific about '... then from the S3 it will be picked and gets merged to the target postgres database.' The data from S3 will be dumped into the stage table and then the upsert/merge from that table to the actual table. The S3 --> staging table would be helped by having the data as CSV and then using COPY. The staging --> final table step could be done as either ON CONFLICT or MERGE, you would need to test in your situation to verify which works better. -- Adrian Klaver adrian.kla...@aklaver.com
Re: Moving delta data faster
On 4/3/24 22:24, yudhi s wrote: On Thu, Apr 4, 2024 at 10:16 AM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote: On 4/3/24 20:54, yudhi s wrote: > On Thu, Apr 4, 2024 at 2:41 AM Adrian Klaver mailto:adrian.kla...@aklaver.com> > <mailto:adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>>> wrote: > > > Thank you Adrian. > > And one thing i forgot to mention this target postgresql database would > be on AWS RDS whereas the source Oracle databases is on premise. I think > we don't have the FDW extension currently in place but we can get that. > I am just not able to understand clearly though, but do you mean export > the data from source using CSV and do truncate and import on target. And > as these data will be moved through the network won't that cause slowness? > > The source database here is Oracle database. Correct me if wrong, it > looks like foreign data wrapper is like a DB link. Or do you mean > writing a query on the target database (which can be UPSERT or MERGE) > but will be joining the table from the source database through the > DBlink/DDW? But my question was whether we should use UPSERT or MERGE > for comparing and loading the delta records to the target postgresql > database. Want to understand which is more performant , as I see in the > past Merge having performance issues in the past, but not very sure > about that. My motivation was to get some basic information about your setup and what you are trying to achieve. If I understand correctly you have: 1) An Oracle database with tables that you want to copy the complete data from to a Postgres database. For this sort of thing COPY(https://www.postgresql.org/docs/current/sql-copy.html <https://www.postgresql.org/docs/current/sql-copy.html>) on the Postgres end using CSV data generated from the source is probably the quickest bulk load method. 2) After the initial load you want to do follow up INSERT/UPDATEs based on a delta of the source tables relative to the initial load. 
This is still a bit of a mystery to me. How are you determining the delta: a) On the source end entirely or b) Target relative to source? Also what is the anticipated size of the delta per transfer? Additional information needed: 1) Network distance between source and target? 2) Network capacity? 3) Expected load on both source and target servers from other operations? Thank you. Actually I was trying to understand how to cater the delta load after the one-time load is done. The delta change in records is planned to be found based on the primary keys on the tables. If it finds the key it will update the records; if it does not find the key it will insert the rows. Basically the select query from the source database will fetch the data within a certain time interval (based on the latest update timestamp or create timestamp if they are available, or else a full dump) and put it on S3, and then from S3 it will be picked up and merged into the target postgres database. As upsert and merge both were looking similar, I was wondering what we should use here for loading the delta records? S3 is not a database. You will need to be more specific about '... then from the S3 it will be picked and gets merged to the target postgres database.' -- Adrian Klaver adrian.kla...@aklaver.com
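Putting the pieces of the discussion together, a minimal sketch of the staged delta load (all names and the file path are illustrative, not from the thread):

```sql
-- 1. Bulk-load the S3 extract into a staging table.
copy stage_tab (id, payload, update_ts)
    from '/tmp/delta.csv' with (format csv);

-- 2. Upsert from staging into the target, keyed on the primary key.
insert into target_tab as t (id, payload, update_ts)
select id, payload, update_ts
from stage_tab
on conflict (id) do update
    set payload   = excluded.payload,
        update_ts = excluded.update_ts;

-- 3. Clear staging for the next delta.
truncate stage_tab;
```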