Re: Issue with PostgreSQL Installer on Windows and Special Characters in the superuser password

2024-05-31 Thread Adrian Klaver

On 5/31/24 14:18, Timo Schwaak wrote:

Dear PostgreSQL community,

I hope this is the right place for my concern.
Today I discovered a strange behaviour of the Windows Installer of 
PostgreSQL.
While migrating an application database from MSSQL, the Postgres setup 
did not seem to complete correctly and had problems starting the 
Postgres service after installation. I found that the service did not 
seem to be created at all.
After some experimenting with fixes I found online - trying non-domain 
users and other things proposed for the problem - I found that the 
problem lay in my chosen password. My password was randomly 
generated with alphanumeric + special characters. While this did not 
work, switching to alphanumeric-only passwords seemed to solve the 
problem. I suspect that the quotation marks in particular are causing 
problems in the setup, as they seem to terminate some strings 
prematurely in the scripts used by the installer.

My setup was as follows:

OS: Windows Server 2019, also tested on Windows Server 2022
PSQL version: 16.3 (latest installer from EDB)
Steps to reproduce:
1. Run the installer
2. Enter any password that contains quotation marks
3. The installer should fail at the end and report some errors.


Could you please help me investigate this? Is this a known problem, or is 
there even a known workaround? If this is not a known issue, can you 
help me point this report in the right direction so that it can be 
addressed?


1) Don't use passwords with quotation marks when running the installer (a 
sketch of a possible workaround follows this list).

2) Contact https://www.enterprisedb.com/ and file an issue with them.
The installer is their product not part of the community distribution.

3) Poke pgadmin-support here:

https://www.postgresql.org/list/pgadmin-support/

That is from EDB also, maybe someone over there can help.

4) Jump on this thread at pgsql-bugs about installer :

https://www.postgresql.org/message-id/flat/DU0PR10MB71447B1C6946731AD6C73D13C0E72%40DU0PR10MB7144.EURPRD10.PROD.OUTLOOK.COM

Or start a new one here:

https://www.postgresql.org/account/login/?next=/account/submitbug/
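
For 1), a minimal sketch of a possible workaround, assuming the quoting 
problem is confined to the installer's scripts (the server itself accepts 
any characters in a password): install with an alphanumeric-only password, 
then change it from psql, where no script quoting is involved:

  \password postgres

or with plain SQL (a double quote needs no escaping inside a single-quoted 
literal):

  ALTER ROLE postgres PASSWORD 'S0me"random"value';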




Best regards,

Timo


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Rules and Command Status - update/insert/delete rule with series of commands in action

2024-05-31 Thread Adrian Klaver

On 5/30/24 15:34, johnlu...@hotmail.com wrote:


On 5/30/24 4:56 PM, David G. Johnston wrote:


Very little interest exists in working on user-specified rules.  They 
are practically deprecated.



Ah - pity - see my last comment to Adrian's




Any interest in perhaps providing a choice via a configuration
parameter?


 If anything is done it would have to be new syntax.



A much bigger task surely.


On 5/30/24 5:19 PM, Adrian Klaver wrote:


2) Use INSTEAD OF triggers:




Unfortunately the same functionality as in my example with the RULE is 
not supported for triggers on views :   from the manual


|INSTEAD OF| triggers may only be defined on views, *and only at row level*;


A RULE is essentially a statement-level operation, which is what I need 
for this particular case.  A row-level trigger would not work 
because it cannot "see" the query causing it to be fired, and also 
(most importantly) is not fired at all if no rows match the original 
query, whereas a RULE is always in effect regardless of which rows 
are involved.  I should add that the RULE I showed in my 
example is not the only RULE being used on this view - there are 
other conditional RULEs, and the combined effect is being able to 
change the effect of the original statement into a set of new 
statements, one of which does what is needed.
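
(For illustration only, a minimal sketch of the kind of conditional rules 
described above, using hypothetical names - a view orders_v over a table 
orders, plus a log table orders_log:)

CREATE RULE orders_v_upd_log AS
    ON UPDATE TO orders_v
    WHERE NEW.status IS DISTINCT FROM OLD.status
    DO ALSO
    INSERT INTO orders_log VALUES (OLD.id, OLD.status, NEW.status, now());

CREATE RULE orders_v_upd AS
    ON UPDATE TO orders_v
    DO INSTEAD
    UPDATE orders SET status = NEW.status WHERE id = OLD.id;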




Basically don't give the user what they asked for, give them some 
cobbled-together, on-the-fly version. Who decides what the user 
needs? Seems to me this is the point at which to have a discussion with the 
application developers about having the application ask the correct 
questions, rather than going down the road of bait and switch.





And if you are now inclined to say "well, maybe the application 
itself is poorly written and should be changed" - I would have to 
agree, but that is not mine to change.





But I suppose that my next question, given what you both say about the 
RULE system being a dead-end, is whether there is any likelihood of 
supporting an INSTEAD OF trigger on a view at statement level?  Maybe 
that stands more chance of going somewhere?




Cheers,     John Lumby




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Rules and Command Status - update/insert/delete rule with series of commands in action

2024-05-30 Thread Adrian Klaver

On 5/30/24 11:32, johnlu...@hotmail.com wrote:

Any thoughts?  Any rationales one way or the other?    Any interest 
in perhaps providing a choice via a configuration parameter?


1) As to using rules, stop now before you invest the time to find out it is 
a road you don't want to go down.


2) Use INSTEAD OF triggers:

https://www.postgresql.org/docs/current/sql-createtrigger.html
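
For illustration, a minimal sketch with hypothetical names (a view 
orders_v over a table orders); row level only, per the limitation 
discussed elsewhere in this thread:

CREATE FUNCTION orders_v_upd() RETURNS trigger AS $$
BEGIN
    UPDATE orders SET status = NEW.status WHERE id = OLD.id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_v_instead_upd
    INSTEAD OF UPDATE ON orders_v
    FOR EACH ROW EXECUTE FUNCTION orders_v_upd();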






Cheers, John Lumby





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Dll libpq.dll 32 bits

2024-05-29 Thread Adrian Klaver

On 5/29/24 11:18, José Mello Júnior wrote:

Again reply to list by using Reply All.
Ccing list.

For my 32-bit projects, I use the PC-SOFT language (WLanguage).


Are you talking about this?:

https://windev.com/pcsoft/index.html



On Wed, May 29, 2024 at 3:09 PM, Adrian Klaver 
<adrian.kla...@aklaver.com> wrote:




On 5/29/24 10:59 AM, José Mello Júnior wrote:

Reply to list also.
Ccing list.

 > Sorry, for windows

That is the OS, what I am after is what client on Windows do you need
the DLL for?

 >
 >
 >
 > On Wed, May 29, 2024 at 2:53 PM, Adrian Klaver
 > <adrian.kla...@aklaver.com> wrote:
 >
 >
 >
 >     On 5/29/24 10:49 AM, José Mello Júnior wrote:
 >      > Where do I find this DLL to access PostgreSQL 15?
 >
 >     Access Postgres using what client?
 >
 >     --
 >     Adrian Klaver
 > adrian.kla...@aklaver.com
 >

-- 
Adrian Klaver

adrian.kla...@aklaver.com



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Dll libpq.dll 32 bits

2024-05-29 Thread Adrian Klaver




On 5/29/24 10:59 AM, José Mello Júnior wrote:

Reply to list also.
Ccing list.


Sorry, for windows


That is the OS, what I am after is what client on Windows do you need 
the DLL for?






On Wed, May 29, 2024 at 2:53 PM, Adrian Klaver 
<adrian.kla...@aklaver.com> wrote:




On 5/29/24 10:49 AM, José Mello Júnior wrote:
 > Where do I find this DLL to access PostgreSQL 15?

Access Postgres using what client?

    -- 
    Adrian Klaver
    adrian.kla...@aklaver.com



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Dll libpq.dll 32 bits

2024-05-29 Thread Adrian Klaver




On 5/29/24 10:49 AM, José Mello Júnior wrote:

Where do I find this DLL to access PostgreSQL 15?


Access Postgres using what client?

--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Pgpool with high availability

2024-05-29 Thread Adrian Klaver

On 5/28/24 22:15, vijay patil wrote:

Hi Adrian,

I have modified the pgpool.conf configuration to correct a subnet 
mistake. Previously, the wrong subnet was specified as /26, and it has 
now been corrected to /24. The configuration changes were as follows:


This seems to be a networking issue:

[root@staging-ha0001 ~]# ping 10.127.1.18

PING 10.127.1.18 (10.127.1.18) 56(84) bytes of data.

From 10.127.1.10 icmp_seq=1 Destination Host Unreachable

<...>


Where are ha0001 and ha0003 on the network relative to ha0002?

Also are they reachable for other services?


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Pgpool with high availability

2024-05-28 Thread Adrian Klaver




On 5/28/24 1:31 AM, vijay patil wrote:


HI Team,

"I'm encountering challenges while configuring Pgpool with high 
availability. The initial setup is completed, and Pgpool is operational 
on a single node, functioning without issues. However, upon attempting 
to start Pgpool on any additional nodes, particularly node 2, it becomes 
immediately unreachable.


And how are we supposed to arrive at an answer with essentially no 
information provided?


Need:

1) Configuration for initial setup.

2) A more detailed explanation of what "... upon attempting
to start Pgpool on any additional nodes" means. Include configuration 
changes.


3) The error messages.

4) Where the nodes are located?




I'm seeking assistance to address this issue. My setup consists of three 
nodes, each hosting both PostgreSQL and Pgpool services."



Thanks

Vijay



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: expected authentication request from server, but received H

2024-05-27 Thread Adrian Klaver

On 5/27/24 14:19, Peter J. Holzer wrote:

On 2024-05-23 17:23:14 -0400, Tom Lane wrote:

"Peter J. Holzer"  writes:

One of our users reports getting the error message
"expected authentication request from server, but received H"
when trying to connect to the database.


That's very bizarre, and I don't recall any similar reports in the
recent past.


Server is PostgreSQL 15.7 (Ubuntu 15.7-1.pgdg22.04+1)
I haven't been able to find out what client version they are using yet,
but I suspect it's quite old.


It would definitely be interesting to find out what the client-side
code is, including the openssl version being used.


They have psqlODBC 9.3 and 13.2 on a 64-bit Windows platform (probably
Windows 10). The problem occurs with both. I tried to replicate it with
psqlODBC 13.0 and couldn't.


Just to be clear that is 9.03 and 13.02.



OpenSSL is bundled with psqlODBC, AFAICS. For 13.0 that would be 1.1.1.8
(hmm, that doesn't look like an OpenSSL version. Would that be 1.1.1h in
Linux terms?). But the fact that it happens with two very different
versions but I can't replicate it with one in between (and quite close


The only thing I could find that relates to SSL directly in the move 
from 13.00 --> 13.02 is:


https://github.com/postgresql-interfaces/psqlodbc/commit/4eaa430e3ef06162a5655dc7baff9be40f2e1cce

Which happened at 13.01.


to one of them) makes it unlikely (IMHO) to be just version related.
There must be something else going on. I've asked them (their IT
department, not the user) to try to disable any anti-virus software.
That's a bit of a reach, but always a possibility if something behaves
weirdly on Windows.

     hp



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Long running query causing XID limit breach

2024-05-26 Thread Adrian Klaver

On 5/25/24 22:56, David HJ wrote:

anyone know how to unsubscribe from this mailing list?


See here:

https://lists.postgresql.org/unsubscribe/


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Json table/column design question

2024-05-23 Thread Adrian Klaver

On 5/23/24 09:06, Skorpeo Skorpeo wrote:
Thank you for the valuable feedback.  I see people are big fans of json 
here.


You can be a fan of JSON and still think it is not the correct way to 
store data in a relational database. When you do that you often end up 
with nested, possibly unstructured, data sets. That ends up requiring 
more effort to get the data out in any meaningful form and in a timely 
manner than the time and effort it would take to enter it in a 
structured way.




On Thu, May 23, 2024 at 3:04 AM Laurenz Albe <laurenz.a...@cybertec.at> wrote:


On Wed, 2024-05-22 at 22:38 -0500, Skorpeo Skorpeo wrote:
 > I was wondering if having unrelated columns in a table is a sound
approach when
 > using json.  In other words, if I have two collections of
unrelated json objects,
 > for example "Users" and "Inventory", would it be ok to have one
table with a
 > "Users" column and a "Inventory" column?  My concern is that from
a row
 > perspective the columns could be different lengths, such as more
inventory
 > items as users.  And for any given row the data in one column
would have no
 > relation to another column. I would only query a single column at
a time.
 >
 > Would this approach be ok or are there pitfalls such that it would be
 > advantageous/recommended to have a separate table for each column?

It doesn't matter much if you use one or two columns.
But the word "collection" makes me worry.  Perhaps this article can give
you some ideas:

https://www.cybertec-postgresql.com/en/json-postgresql-how-to-use-it-right/

Yours,
Laurenz Albe



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Restore of a reference database kills the auto analyze processing.

2024-05-23 Thread Adrian Klaver

On 5/23/24 06:01, HORDER Philip wrote:

Classified as: {OPEN}

I actually hadn't thought to look for Postgres bugs, the system is so reliable 
for us, it just doesn't go wrong!

I can trigger the fault be running a reload of the lfm database, and we've been 
running Postgres 10 & 13 for several years now without seeing this problem.

Our symptoms match that bug description exactly, so I'm pretty confident this 
is the problem.
I'll have to push through the right paperwork to get a Postgres update, and 
that will take a while.


While you are doing the paperwork make sure it is for an upgrade to 15.7. 
You might also mention that minor upgrades (the x in X.x) are a good idea, as 
Postgres, like any software, does get bugs and the minor releases fix 
them. Point the folks in charge at:


https://www.postgresql.org/docs/15/release.html

Also minor releases as a rule do not require dump/restore or pg_upgrade 
or other heavy lifting. There have been exceptions, but they are rare.




Thank you so much for your time.

Phil Horder
Database Mechanic

Thales
Land & Air Systems




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Restore of a reference database kills the auto analyze processing.

2024-05-22 Thread Adrian Klaver

On 5/22/24 08:55, HORDER Philip wrote:

Classified as: {OPEN}


https://www.postgresql.org/docs/15/release-15-5.html#id-1.11.6.7.4



Fix race condition in database dropping that could lead to the autovacuum 
launcher getting stuck


Wow, that sounds like our problem!


Another thought I had is that since you are on Windows maybe there is an 
AV program seeing the autovacuum as virus-like and killing it. Though I 
would expect there to be more errors in that case. Anyway it might be 
worth investigating should an upgrade not fix it.




I will investigate.  Maybe try and find the orphaned stats entry to prove it.


It would be a good idea to upgrade to 15.7 in any case to get all the 
other bug/security fixes.




Phil Horder
Database Mechanic

Thales
Land & Air Systems




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Restore of a reference database kills the auto analyze processing.

2024-05-22 Thread Adrian Klaver

On 5/22/24 01:33, HORDER Philip wrote:

Classified as: {OPEN}



2) There is a round of autovacuum immediately after the lfm is restored.

Yes, some tables in the lfm database, but not all, an apparently random selection, 
anywhere between 2 and 21 tables, across the lfm schemas, public & pg_catalog.


3) autovacuum then goes silent.

Yes. Dead in a ditch. But with no errors.


4) Before the next drop/create lfm you restart the Postgres server and 
autovacuum starts again.

I haven't restarted in a week, and the pattern remains, with a bit of analyze 
at each reload of lfm, and then nothing.


What is in the logs when you do the restart?

Nothing notable:
1) denied connections, while restarting
2) authorized connections
3) auto analyze going into overdrive:
See below


Is there some process that runs shortly after the drop/create lfm cycle?

Not that I can see.


I was hoping more coffee would lead to enlightenment; it did not. It did 
lead me to do what I should have done at the start, which is look at the 
release notes for 15.x. You are on Postgres 15.3 and current is 15.7. On 
the path from .5 --> .7 is:


https://www.postgresql.org/docs/15/release-15-5.html#id-1.11.6.7.4

Fix race condition in database dropping that could lead to the 
autovacuum launcher getting stuck (Andres Freund, Will Mortensen, Jacob 
Speidel)


The race could lead to a statistics entry for the removed database 
remaining present, confusing the launcher's selection of which database 
to process.








Phil Horder
Database Mechanic


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: search_path and SET ROLE

2024-05-22 Thread Adrian Klaver

On 5/22/24 07:27, Ron Johnson wrote:

PG 9.6.24 (Soon, I swear!)

It seems that the search_path of the role that you SET ROLE to does not 
become the new search_path.


Am I missing something, or is that PG's behavior?

AS USER postgres


$ psql -h 10.143.170.52 -Xac "CREATE ROLE dbagrp SUPERUSER INHERIT NOLOGIN;"
CREATE ROLE dbagrp SUPERUSER INHERIT NOLOGIN;
CREATE ROLE

$ psql -h 10.143.170.52 -Xac "CREATE USER rjohnson IN GROUP dbagrp INHERIT;"
CREATE USER rjohnson IN GROUP dbagrp INHERIT;
CREATE ROLE

[postgres@FISPMONDB001 ~]$ psql -h 10.143.170.52 -Xac "CREATE USER 
\"11026270\" IN GROUP dbagrp INHERIT PASSWORD '${NewPass}' VALID UNTIL 
'2024-06-30 23:59:59';"
CREATE USER "11026270" IN GROUP dbagrp INHERIT PASSWORD 'linenoise' 
VALID UNTIL '2024-06-30 23:59:59';

CREATE ROLE

$ psql -h 10.143.170.52 -Xac "ALTER ROLE dbagrp set search_path = 
dbagrp, public, dba, cds, tms;"

ALTER ROLE dbagrp set search_path = dbagrp, public, dba, cds, tms;
ALTER ROLE

AS USER rjohnson


[rjohnson@fpslbxcdsdbppg1 ~]$ psql -dCDSLBXW
psql (9.6.24)
Type "help" for help.

CDSLBXW=> SET ROLE dbagrp;
SET
CDSLBXW=#
CDSLBXW=# SHOW SEARCH_PATH;
    search_path
-----------------
  "$user", public
(1 row)


Back to user postgres
=

$ psql -h 10.143.170.52 -Xac "ALTER ROLE rjohnson set search_path = 
dbagrp, public, dba, cds, tms;"

ALTER ROLE rjohnson set search_path = dbagrp, public, dba, cds, tms;
ALTER ROLE

Back to user rjohnson
=

[rjohnson@fpslbxcdsdbppg1 ~]$ psql -dCDSLBXW
psql (9.6.24)
Type "help" for help.

CDSLBXW=>
CDSLBXW=> SET ROLE dbagrp;
SET

CDSLBXW=# SHOW SEARCH_PATH;
           search_path
-------------------------------
  dbagrp, public, dba, cds, tms
(1 row)




https://www.postgresql.org/docs/current/sql-alterrole.html


Whenever the role subsequently starts a new session, the specified value 
becomes the session default, overriding whatever setting is present in 
postgresql.conf or has been received from the postgres command line. 
This only happens at login time; executing SET ROLE or SET SESSION 
AUTHORIZATION does not cause new configuration values to be set. 
Settings set for all databases are overridden by database-specific 
settings attached to a role. Settings for specific databases or specific 
roles override settings for all roles.
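
A minimal sketch of the practical consequence, using the role from the 
session above: after SET ROLE, apply the setting yourself (or reconnect as 
the role) if you want its per-role search_path:

SET ROLE dbagrp;
SET search_path = dbagrp, public, dba, cds, tms;  -- not picked up automatically by SET ROLE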


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Restore of a reference database kills the auto analyze processing.

2024-05-21 Thread Adrian Klaver

On 5/21/24 13:44, HORDER Philip wrote:

Classified as: {OPEN}

2024-05-15 03:31:31.290 GMT [4556]: [3-1]
db=lfm,user=superuser,app=[unknown],client=::1 LOG:  connection
authorized: user=superuser database=lfm application_name=pg_restore


That would be the lfm database being restored.
What does the log show after that as pertains to autovacuum?


Yep, pg_restore recreates the dropped lfm database.
And after that nothing.
The log just holds connection requests, and a checkpoint every hour.
That's it.
No "automatic vacuum", or "automatic analyze" anywhere.
And nothing any day since then, for a week.



Just for confirmation your settings are still?:

autovacuum_max_workers = 10
log_autovacuum_min_duration = 0

You said previously:

"The only way I can find of getting the analyzer back is to restart 
Postgres."


To be clear this means:

1) The lfm database is dropped/created.

2) There is a round of autovacuum immediately after the lfm is restored.

3) autovacuum then goes silent.

4) Before the next drop/create lfm you restart the Postgres server and 
autovacuum starts again.


What is in the logs when you do the restart?

Is there some process that runs shortly after the drop/create lfm cycle?



Phil Horder
Database Mechanic



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Restore of a reference database kills the auto analyze processing.

2024-05-21 Thread Adrian Klaver




On 5/21/24 9:29 AM, HORDER Philip wrote:

Classified as: {OPEN}


I am having a hard time figuring out how both of the above can be true.
Dropping and recreating the database would wipe out the statistics.


We have multiple databases on the one Postgres server.
The 'postgres' database contains our main application, with tables in schema 
'a'.
We have two other databases, one of which is 'lfm'.

In PgAdmin, there is Databases (3), then those database names.
As the superuser, I can connect to any of these databases.

When we drop database lfm, and recreate it using pg_restore, the stats 
collection dies, for all databases on the server: 'postgres', 'lfm', 'lfm2'
I haven't dropped database 'postgres'

So the existing stats on tables in the 'postgres' database remain, but no 
further stats are collected.


From int-postgres-log-21-05-2024.txt:

2024-05-15 03:31:31.290 GMT [4556]: [3-1] 
db=lfm,user=superuser,app=[unknown],client=::1 LOG:  connection 
authorized: user=superuser database=lfm application_name=pg_restore


That would be the lfm database being restored.

What does the log show after that as pertains to autovacuum?






Phil Horder
Database Mechanic

Thales
Land & Air Systems




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Restore of a reference database kills the auto analyze processing.

2024-05-21 Thread Adrian Klaver




On 5/21/24 9:29 AM, HORDER Philip wrote:

Classified as: {OPEN}


I am having a hard time figuring out how both of the above can be true.
Dropping and recreating the database would wipe out the statistics.


We have multiple databases on the one Postgres server.
The 'postgres' database contains our main application, with tables in schema 
'a'.
We have two other databases, one of which is 'lfm'.


My mistake, I was over concentrating on the log messages and did not pay 
attention to the psql prompt which was showing postgres as the database.




Phil Horder
Database Mechanic

Thales
Land & Air Systems




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Restore of a reference database kills the auto analyze processing.

2024-05-21 Thread Adrian Klaver

On 5/21/24 06:00, HORDER Philip wrote:

Classified as: {OPEN}


Assuming clean shutdowns the statistics will survive restarts. They would be 
wiped when you drop a database and start over, have an unclean shutdown or you 
use one of the reset functions...


Yes, stats are permanent, but are not being updated.
We don't use any of the pg_stat_reset functions.

-
I've left the system alone over the weekend.
Here's the timeline:

14th May:
Postgres working ok, 1767 log entries for "automatic analyze", mostly in 
database postgres.
03:30 Jenkins deployed an update, resulting in reload of lfm database.


This is where I am getting confused. In your original post you had:

To apply an update, we:
  stop the applications that use LFM,
  set the user (LFU) to NOLOGIN
  kill any left-over connections: select 
pg_terminate_backend(pg_stat_activity.pid) FROM pg_stat_activity WHERE 
pg_stat_activity.datname = 'lfm' and usename = 'lfu';

  drop the existing reference database using the dropDb utility.
  reload the new file using pg_restore and the postgres super user.
  set the user (LFU) to LOGIN

In other words DROP DATABASE then CREATE DATABASE and reload the schema 
objects and associated data.


Yet your int stats output.txt file has things like the following:

-[ RECORD 3 ]---+--
relid   | 923130055
schemaname  | a
relname | cr_pt_e_20240502_20240503
seq_scan| 1264
seq_tup_read| 8800722491
idx_scan| 4601405
idx_tup_fetch   | 4415621
n_tup_ins   | 3851400
n_tup_upd   | 15790
n_tup_del   | 0
n_tup_hot_upd   | 0
n_live_tup  | 7166325
n_dead_tup  | 0
n_mod_since_analyze | 0
n_ins_since_vacuum  | 0
last_vacuum |
last_autovacuum | 2024-05-03 09:03:44.810654+00
last_analyze| 2024-05-09 08:44:37.725674+00
last_autoanalyze| 2024-05-03 09:03:58.838664+00
vacuum_count| 0
autovacuum_count| 1
analyze_count   | 1
autoanalyze_count   | 1


I am having a hard time figuring out how both of the above can be true. 
Dropping and recreating the database would wipe out the statistics.






Summary: since the reload of lfm database on 15th May, the only "automatic 
analyze" events logged have been for the lfm database, at the point of reload.
No other stats analyze seems to have taken place, on any database on the server 
since that point, even partitions with over a million rows.


The above is confusing also. In your original post you only referred to 
the postgres and lfm databases.


What other databases are you referring to?

As to partitions are you referring to partitions of tables in the lfm 
database or something else?




Apart from that, Postgres appears to be working normally.

I'm sure that another restart of Postgres will restore the auto analyze, until 
the next pg_restore of lfm.
So what's going on?  How is it that we're breaking this important function that 
we shouldn't be able to have any effect on?

Thanks for looking,
Phil Horder
Database Mechanic

Thales
Land & Air Systems



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Left join syntax error

2024-05-18 Thread Adrian Klaver

On 5/18/24 08:04, Rich Shepard wrote:

On Sat, 18 May 2024, Adrian Klaver wrote:


... LEFT JOIN companies as c ON c.company_nbr = p.company_nbr;


Adrian,

Tried that:
bustrac-# LEFT JOIN companies as c ON c.company_nbr = p.company_nbr;
ERROR:  table name "c" specified more than once


Show the complete query.

Take the error message as correct, you are specifying 'companies as c' 
more then once.




Thanks,

Rich




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Left join syntax error

2024-05-18 Thread Adrian Klaver

On 5/18/24 08:01, Rich Shepard wrote:

On Sat, 18 May 2024, Ray O'Donnell wrote:

You need to include the alias for the table also - see "...from 
companies as c..." in Shammat's example.


Ray,

That didn't work:
bustrac-# FROM people as p, companies as c
bustrac-# LEFT JOIN companies as c ON c.company_nbr = p.company_nbr;
ERROR:  table name "c" specified more than once

so I tried only the alias on the join line:
bustrac-# LEFT JOIN c ON c.company_nbr = p.company_nbr;
ERROR:  relation "c" does not exist
LINE 3: LEFT JOIN c ON c.company_nbr = p.company_nbr;
   ^
and that didn't work either.


The query needs to be:

SELECT p.lname, p.fname, p.job_title, p.company_nbr, p.email, c.company_name
  FROM people as p
LEFT JOIN companies as c ON c.company_nbr = p.company_nbr;

Only reference companies as c once.



Thanks,

Rich




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: utf8 vs UTF-8

2024-05-18 Thread Adrian Klaver

On 5/18/24 07:48, Troels Arvin wrote:

Hello,

Tom Lane wrote:
 >>  test1  | loc_test | UTF8   | libc | en_US.UTF-8 | en_US.UTF-8
 >>  test3  | troels   | UTF8   | libc | en_US.utf8  | en_US.utf8
 >
 > On most if not all platforms, both those spellings of the locale names
 > will be taken as valid.  You might try running "locale -a" to get an
 > idea of which one is preferred according to your current libc
 > installation

"locale -a" on the Ubuntu system outputs this:

   C
   C.utf8
   en_US.utf8
   POSIX


If you expand that to locale -v -a you get:

locale: en_US.utf8  archive: /usr/lib/locale/locale-archive
-------------------------------------------------------------------------------
    title | English locale for the USA
   source | Free Software Foundation, Inc.
  address | https://www.gnu.org/software/libc/
    email | bug-glibc-loca...@gnu.org
 language | American English
territory | United States
 revision | 1.0
     date | 2000-06-24
  codeset | UTF-8



So at first, I thought en_US.utf8 would be the most correct locale 
identifier. However, when I look at Postgres' own databases, they have 
the slightly different locale string:


   psql --list | grep -E 'postgres|template'
   postgres  | postgres | UTF8 | libc | en_US.UTF-8 | en_US.UTF-8 | ...
   template0 | postgres | UTF8 | libc | en_US.UTF-8 | en_US.UTF-8 | ...
   template1 | postgres | UTF8 | libc | en_US.UTF-8 | en_US.UTF-8 | ...

Also, when I try to create a database with "en_US.utf8" as locale 
without specifying a template:


troels=# create database test4 locale 'en_US.utf8';
ERROR:  new collation (en_US.utf8) is incompatible with the collation of 
the template database (en_US.UTF-8)
HINT:  Use the same collation as in the template database, or use 
template0 as template.


I'm going to say that is Postgres being exact to a fault.
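
A minimal sketch of the workaround the HINT points at (database name taken 
from the example above):

CREATE DATABASE test4 LOCALE 'en_US.utf8' TEMPLATE template0;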



Given the locale of Postgres' own databases and Postgres' error message, 
I'm leaning to en_US.UTF-8 being the most correct locale to use. Because 
why would Postgres care about it, if utf8/UTF-8 doesn't matter?




but TBH, I doubt it's worth worrying about.


But couldn't there be an issue, if for example the client's locale and 
the server's locale aren't exactly the same? I'm thinking maybe the 
client library has to perform unneeded translation of the stream of data 
to/from the database?




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Left join syntax error

2024-05-18 Thread Adrian Klaver

On 5/18/24 07:46, Rich Shepard wrote:

On Sat, 18 May 2024, Shammat wrote:


Don't put the second table in the FROM part

SELECT p.lname, p.fname, p.job_title, p.company_nbr, p.email, 
c.company_name

FROM people as p
 LEFT JOIN companies as c ON c.company_nbr = p.company_nbr


Shammat,

I tried this with this result:

ERROR:  missing FROM-clause entry for table "c"
LINE 3: LEFT JOIN companies ON c.company_nbr = p.company_nbr;


... LEFT JOIN companies as c ON c.company_nbr = p.company_nbr;



Thanks for the suggestion.

Regards,

Rich




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Valid until

2024-05-18 Thread Adrian Klaver

On 5/18/24 03:09, Rama Krishnan wrote:
Reply to list also.
Ccing list


Hi Adrian,

I have modified the pg_hba entry from trust to md5 like below

```
local   all             all                                     md5
```

That would be the issue. trust ignores the password check.
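
For illustration, a minimal sketch of the difference between the two 
methods in pg_hba.conf (method column only; the rest of the line as in 
your file):

# trust: no password is requested at all, so VALID UNTIL never comes into play
#local   all             all                                     trust
# md5 (or scram-sha-256): the password is checked, including its VALID UNTIL timestamp
local    all             all                                     md5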


When I tried with the postgres user I am able to connect


Which is expected as postgres does not have a 'valid until' restriction.




[postgres@postgres16 data]$ psql -U postgres -d postgres
Password for user postgres:
psql (16.2)
Type "help" for help.



postgres=# \du
                               List of roles
  Role name  |                         Attributes
-------------+------------------------------------------------------------
  pgbackrest | Replication
  postgres   | Superuser, Create role, Create DB, Replication, Bypass RLS
  test       | Password valid until 2023-05-13 00:00:00+00
  user_name  | Password valid until 2024-05-13 00:00:00+00



But when I tried with the test or user_name user, even though I am passing 
the correct value, I am getting this error


Again as expected as the 'valid until' timestamp is in the past.




```
[postgres@postgres16 data]$ psql -U test -d postgres
Password for user test:
psql: error: connection to server on socket 
"/run/postgresql/.s.PGSQL.5432" failed: FATAL:  password authentication 
failed for user "test"


postgres=# \c  postgres user_name
Password for user user_name:
connection to server on socket "/run/postgresql/.s.PGSQL.5432" failed: 
FATAL:  password authentication failed for user "user_name"


```

Once I changed the VALID UNTIL expiration date

```

postgres=# alter user test VALID UNTIL '2024-05-19';
ALTER ROLE

postgres=> \du
                               List of roles
  Role name  |                         Attributes
-------------+------------------------------------------------------------
  pgbackrest | Replication
  postgres   | Superuser, Create role, Create DB, Replication, Bypass RLS
  test       | Password valid until 2024-05-19 00:00:00+00
  user_name  | Password valid until 2024-05-13 00:00:00+00
```

Finally it allows the test user to connect


Which is correct as the 'valid until' timestamp is in the future.



```

[postgres@postgres16 data]$ psql -d postgres -U test
Password for user test:
psql (16.2)

```

I believe this is the expected output of VALID UNTIL. Please correct me if 
I'm wrong.


The behavior is as referenced in the documentation:

https://www.postgresql.org/docs/current/sql-createrole.html

VALID UNTIL 'timestamp'

The VALID UNTIL clause sets a date and time after which the role's 
password is no longer valid. If this clause is omitted the password will 
be valid for all time.






Regards

A.Rama Krishnan




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Valid until

2024-05-18 Thread Adrian Klaver

On 5/18/24 02:37, Rama Krishnan wrote:

Reply to list also.
Ccing list.



Please find the details below


```
postgres=# \du
                               List of roles
  Role name  |                         Attributes
-------------+------------------------------------------------------------
  pgbackrest | Replication
  postgres   | Superuser, Create role, Create DB, Replication, Bypass RLS
  test       | Password valid until 2023-05-13 00:00:00+00
  user_name  | Password valid until 2024-05-13 00:00:00+00

postgres=# show timezone;
  TimeZone
----------
  UTC
(1 row)

postgres=# select version();
                                                  version
---------------------------------------------------------------------------------------------------------
  PostgreSQL 16.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-20), 64-bit

(1 row)

postgres=# \q
[postgres@postgres16 log]$ psql -U username -d postgres
psql: error: connection to server on socket 
"/run/postgresql/.s.PGSQL.5432" failed: FATAL:  role "username" does not 
exist

[postgres@postgres16 log]$ psql -U user_name -d postgres
psql (16.2)
Type "help" for help.
```

Regards

A.Rama Krishnan


On Tue, May 14, 2024 at 8:57 PM Adrian Klaver <adrian.kla...@aklaver.com> wrote:


On 5/14/24 00:57, Rama Krishnan wrote:
 > Hi team,
 >
 > I have a question about the "valid until" parameter in the
"create role"
 > command.
 >
 > I recently set a user's password validity to "2024-05-13", but
despite
 > this, the user is still able to connect to the database. Is this
a bug,
 > or is the "valid until" parameter just for identification purposes,
 > indicating when the password will expire?
 >
 > Could you please clarify this for me?

https://www.postgresql.org/docs/current/sql-createrole.html

"
VALID UNTIL 'timestamp'

      The VALID UNTIL clause sets a date and time after which the
role's
password is no longer valid. If this clause is omitted the password
will
be valid for all time.
"

When did the user log in?

What is the TimeZone setting in the database?

 >
 >
     > Thanks
 >
 > Ram

-- 
Adrian Klaver
adrian.kla...@aklaver.com



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Restore of a reference database kills the auto analyze processing.

2024-05-16 Thread Adrian Klaver

On 5/16/24 08:59, HORDER Philip wrote:

Classified as: {OPEN}

Adrian,


Still your contention was that autovacuum quit running after the initial 
restore and that is not the case


This Postgres server has been restarted a few times since 2nd May, most recently 
on Tuesday 14th, hence the more recent analyze status.


Assuming clean shutdowns the statistics will survive restarts. They 
would be wiped when you drop a database and start over, have an unclean 
shutdown or you use one of the reset functions from here:


https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-STATS-FUNCTIONS

28.2.25. Statistics Functions



We've had some problems with our data feeds on this integration system, but 
these are now running again.
I'm planning to leave it all alone until I'm back in the office on Tuesday, and 
run this query again for a few tables and send you an update.
I'm expecting no further stats analysis, (and the performance to be appalling).


From here:

https://www.postgresql.org/docs/current/runtime-config-logging.html

log_autovacuum_min_duration

In addition, when this parameter is set to any value other than -1, a 
message will be logged if an autovacuum action is skipped due to a 
conflicting lock or a concurrently dropped relation.






Thanks for your time.

Phil Horder
Database Mechanic

Thales
Land & Air Systems





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Restore of a reference database kills the auto analyze processing.

2024-05-16 Thread Adrian Klaver

On 5/16/24 07:38, HORDER Philip wrote:

Classified as: {OPEN}


Did you have chance to do below?
Using psql do



\x
select * from pg_stat_all_tables where relname = 'a.accp';


Sorry, missed that bit.
 From this output you can see that no stats have been collected since the last 
two overnight updates.


Still your contention was that autovacuum quit running after the initial 
restore and that is not the case. Assuming this is the instance you 
restored on May 2 2024 then approximate counts are:


autovacuum_count   1888/13 days = 145 autovacuums/day = 6/hour

autoanalyze_count  1715/13  = 132 autoanalyze/day = 5.5/hr




postgres=# select * from pg_stat_all_tables where relname='accp';
-[ RECORD 1 ]---+
relid   | 811486381
schemaname  | a
relname | accp
seq_scan| 1654767
seq_tup_read| 901811880
idx_scan| 146070383
idx_tup_fetch   | 305949969
n_tup_ins   | 2056948
n_tup_upd   | 0
n_tup_del   | 1662699
n_tup_hot_upd   | 0
n_live_tup  | 294
n_dead_tup  | 75162
n_mod_since_analyze | 0
n_ins_since_vacuum  | 75162
last_vacuum |
last_autovacuum | 2024-05-15 03:25:16.007386+00
last_analyze|
last_autoanalyze| 2024-05-15 03:25:16.008873+00
vacuum_count| 0
autovacuum_count| 1888
analyze_count   | 19836
autoanalyze_count   | 1715


Phil Horder
Database Mechanic

Thales
Land & Air Systems




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Restore of a reference database kills the auto analyze processing.

2024-05-16 Thread Adrian Klaver

On 5/16/24 03:08, HORDER Philip wrote:

Classified as: {OPEN}


--oids have not been supported with pg_dump since v11


You're absolutely correct, this command came from my notes, which are obviously 
out of date.

We're running Postgres 15 pg_dump, and I've updated my notes.
I'm double-checking the command options with the guy who creates these files, 
but he's on holiday.  I expect we just dropped the OID option.


FYI, --blobs is only needed when dumping a specific table or schema, otherwise 
it happens by default.


Thanks, I didn’t know that.


Did you have chance to do below?

> From yesterday, current stats for table a.accp, from pg_STAT_all_tables:
>"811486381""airscape""accp"16458538988177871456553503047055581967135016364880294000"2024-05-14 
14:51:37.158892+00""2024-05-09 08:27:45.328468+00""2024-05-14 13:15:31.999198+00"01815170101653

I can't parse this out well enough to really say anything. Though the 
timestamps seem to indicate recent activity.


Using psql do

\x
select * from pg_stat_all_tables where relname = 'a.accp';

to get an extended output that is easier to read.



Phil Horder
Database Mechanic

Thales
Land & Air Systems




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Restore of a reference database kills the auto analyze processing.

2024-05-15 Thread Adrian Klaver

On 5/15/24 01:08, HORDER Philip wrote:

Classified as: {OPEN}

Backups of this db are created with:

pg_dump --file=fsm.dmp  -Fc --blobs --oids --dbname=lfm --host=localhost 
--port= --username=superuser


From your original post:

"Running Postgres 15.3 ..."

--oids have not been supported with pg_dump since v11

When I try it:

pg_dump -V
pg_dump (PostgreSQL) 15.7 (Ubuntu 15.7-1.pgdg22.04+1)

pg_dump  -d test -U postgres --oids -f test.sql
/usr/lib/postgresql/15/bin/pg_dump: unrecognized option '--oids'


So what version of pg_dump are you using?

Or you not working on a v15 instance of Postgres?

FYI, --blobs is only needed when dumping a specific table or schema, 
otherwise it happens by default.
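
For reference, a sketch of what the dump command might look like with the 
obsolete option dropped (and --blobs omitted, since it is the default for 
a whole-database dump); host/port/user left as in the original:

pg_dump --file=fsm.dmp -Fc --dbname=lfm --host=localhost --port= --username=superuser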




Restore is run with:

dropdb --port= --maintenance-db=postgres --username=superuser --if-exists 
lfm
pg_restore -Fc --create --dbname=postgres --port=  --username=superuser
fsm.dmp

-


2) From earlier post: '...  only analyzes tables in the new db at the point of 
reload, then shuts off again.' Provide that sequence of events from the 
Postgres log.


Log file extract is attached, with object names obfuscated.


Still working my way through that.



 From yesterday, current stats for table a.accp, from pg_STAT_all_tables:
"811486381""airscape""accp"16458538988177871456553503047055581967135016364880294000"2024-05-14 
14:51:37.158892+00""2024-05-09 08:27:45.328468+00""2024-05-14 13:15:31.999198+00"01815170101653


I can't parse this out well enough to really say anything. Though the 
timestamps seem to indicate recent activity.


Using psql do

\x
select * from pg_stat_all_tables where relname = 'a.accp';

to get an extended output that is easier to read.




This table has a low row count, but high content turnover.
It usually gets auto-analyzed every minute.

For today, this hasn't been auto analysed since the update at 3am.



4) The autovacuum settings you have in effect:


vacuum_cost_limit = 2000
log_autovacuum_min_duration = 0
autovacuum_max_workers = 10

all other vacuum settings are defaults.



Phil Horder
Database Mechanic

Thales
Land & Air Systems




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Valid until

2024-05-14 Thread Adrian Klaver

On 5/14/24 00:57, Rama Krishnan wrote:

Hi team,

I have a question about the "valid until" parameter in the "create role" 
command.


I recently set a user's password validity to "2024-05-13", but despite 
this, the user is still able to connect to the database. Is this a bug, 
or is the "valid until" parameter just for identification purposes, 
indicating when the password will expire?


Could you please clarify this for me?


https://www.postgresql.org/docs/current/sql-createrole.html

"
VALID UNTIL 'timestamp'

The VALID UNTIL clause sets a date and time after which the role's 
password is no longer valid. If this clause is omitted the password will 
be valid for all time.

"

When did the user log in?

What is the TimeZone setting in the database?
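
A minimal sketch for checking both (the role name is just a placeholder):

SELECT rolname, rolvaliduntil FROM pg_roles WHERE rolname = 'some_user';
SHOW timezone;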




Thanks

Ram


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: UTC is not a time zone?

2024-05-13 Thread Adrian Klaver




On 5/13/24 11:18 AM, Christophe Pettus wrote:




On May 13, 2024, at 11:17, Tom Lane  wrote:
What's causing that I can't say.  It doesn't look like we log the
errno anywhere when failing to read a zone file :-(


File descriptor exhaustion?  (Of course, that would mean something somewhere is 
leaking them, which is another problem.)


If I am understanding this correctly:

"The parent process would long since have cached the zone data in its
memory, but this error is in a parallel worker process, which'd have
to read the file for itself during startup.
"

Then this:

"As an experiment, I'm just pounding the server with a single connection 
doing nothing but SET TIMEZONEs repeatedly.  So far, no break, but it is 
*very* intermittent."


May not induce the error unless there are parallel workers involved.

--
Adrian Klaver
adrian.kla...@aklaver.com




Re: UTC is not a time zone?

2024-05-13 Thread Adrian Klaver




On 5/13/24 11:02 AM, Christophe Pettus wrote:




On May 13, 2024, at 10:58, Adrian Klaver  wrote:

You sure the timezone file did not get changed under the connection?


Yes (at least, nothing happened on the system that would indicate that).  The 
system wasn't touched during the execution (and, as noted, it worked after as 
well as before).


Hmm.

What does pg_config --configure show for '--with-system-tzdata' ?





Why is SET TIMEZONE TO 'UTC' being constantly called?


ORM noise.


--
Adrian Klaver
adrian.kla...@aklaver.com




Re: UTC is not a time zone?

2024-05-13 Thread Adrian Klaver




On 5/13/24 10:50 AM, Christophe Pettus wrote:




On May 13, 2024, at 10:48, Adrian Klaver  wrote:
Yes:

https://www.postgresql.org/message-id/5DF49366-10D1-42A4-99BF-F9A7DC3AB0F4%40mailbox.org

Answer:

https://www.postgresql.org/message-id/1273542.1712326418%40sss.pgh.pa.us


Thanks!  Similar, but I don't think it's that.  This was a sudden change in a 
long-running connection that had issued a ton of ` SET TIMEZONE TO 'UTC'; ` 
before the failure.


You sure the timezone file did not get changed under the connection?

Why is SET TIMEZONE TO 'UTC' being constantly called?

--
Adrian Klaver
adrian.kla...@aklaver.com




Re: UTC is not a time zone?

2024-05-13 Thread Adrian Klaver




On 5/13/24 10:37 AM, Christophe Pettus wrote:

PostgreSQL 16.3 on MacOS Sonoma.  A long-running process (which held a 
connection open the entire time) failed with:

2024-05-13 09:12:44.719 PDT,"cyan","cyan",35926,"[local]",664214f9.8c56,3,"SELECT",2024-05-13 06:26:17 PDT,3/60,0,ERROR,22023,"invalid value for parameter 
""TimeZone"": ""UTC""","while setting parameter ""TimeZone"" to ""UTC""
parallel worker","  select count(1), count(1) filter (where visited > 0) from framework_seenchoice  
",,,"","client backend",,0

It's not (easily) repeatable, and the system was not touched while the process 
was running (no installing new binaries, etc.).  Does this look familiar to 
anyone?



Yes:

https://www.postgresql.org/message-id/5DF49366-10D1-42A4-99BF-F9A7DC3AB0F4%40mailbox.org

Answer:

https://www.postgresql.org/message-id/1273542.1712326418%40sss.pgh.pa.us

--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Unexpected data when subscribing to logical replication slot

2024-05-09 Thread Adrian Klaver

On 5/9/24 00:32, Daniel McKenzie wrote:

We've had this running in live now for years without a hiccup so we are 
surprised to learn that we have this massive race condition and it just 
so happens that the hardware is fast enough to process the transaction 
before the .NET application can react to replication slot changes.


On a broad scale I'm going to say that over 'for years' there has been an 
increase in load on the Postgres server as well as the I/O system of the 
machine it is running on. What you are seeing now is the canary in the 
mine giving you the heads up that more trouble lies ahead as the 
hardware and software reach their load limits.


On a finer scale my guess is that the following is happening when 
synchronous_commit = on:


1) Postgres session #1 does data change.

2) This is replicated out and picked up by wal2json, which sees the new 
data.


3) The Postgres server waits for the confirmation that the WAL record 
has been written out to disk. Upon confirmation it commits on the 
server. This is the part that I am not sure of in relation to wal2json.


4) Postgres session #2 queries the database for the record. In the case 
where 3) has not completed it sees the old values as the data change in 
session #1 has not committed and therefore the new values are not seen 
by other sessions.




Daniel McKenzie
Software Developer


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Unexpected data when subscribing to logical replication slot

2024-05-09 Thread Adrian Klaver

On 5/9/24 00:32, Daniel McKenzie wrote:

Asynchronous commit introduces the risk of data loss. There is a short
time window between the report of transaction completion to the client
and the time that the transaction is truly committed.



To get anywhere with this issue you will need to provide the information 
Tomas Vondra requested upstream:


"
Where/how does the enrichment query run? How does the whole process look 
like? I guess an application is receiving decoded changes as JSON, and 
then querying the database?

"

and

"
Would be good to have some sort of reproducer - ideally a script that
sets up an instance + replication, and demonstrates the issue. Or at
least a sufficiently detailed steps to reproduce it without having to
guess what exactly you did.
"


The documentation speaks about synchronous_commit changing how 
transactions change behaviour for the client. So in this case, my psql 
terminal is the client, and I would expect a faster commit (from its 
perspective) and then a period of risk (as a process usually done as 
part of the commit is now being done in the background) but it's not 
clear how that affects a replication slot subscriber.


What we're struggling to understand is: why are we seeing any updates in 
the replication slot before they have been "truly committed"?


From:

https://www.postgresql.org/docs/current/wal-async-commit.html

"As described in the previous section, transaction commit is normally 
synchronous: the server waits for the transaction's WAL records to be 
flushed to permanent storage before returning a success indication to 
the client. The client is therefore guaranteed that a transaction 
reported to be committed will be preserved, even in the event of a 
server crash immediately after. However, for short transactions this 
delay is a major component of the total transaction time. Selecting 
asynchronous commit mode means that the server returns success as soon 
as the transaction is logically completed, before the WAL records it 
generated have actually made their way to disk. This can provide a 
significant boost in throughput for small transactions."


It is about the state of the WAL record, in synchronous_commit=on the 
commit will not happen on the server and be seen by the client until the 
WAL record has been recorded as saved to disk. For 
synchronous_commit=off the commit happens as soon as COMMIT is reached 
in the transaction and the WAL record save happens after that.
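
For reference, synchronous_commit does not have to be changed cluster-wide; 
a minimal sketch of setting it per session or per transaction:

SET synchronous_commit = off;            -- this session only

BEGIN;
SET LOCAL synchronous_commit = off;      -- this transaction only
-- ... data changes ...
COMMIT;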


I don't use wal2json so I don't know how it deals with the above.



There appears to be a state of limbo between updating data and that data 
being available to query (and our subscriber is picking up changes in 
this period of time) but I can't pin down any documentation which 
describes it.


The answer awaits the full description of the process requested by 
Tomas Vondra.


Best guess, the fact that synchronous_commit=off 'cures' it implies that 
in synchronous_commit=on mode you are picking up data on the receiving 
and sending end at different points in '... the server waits for the 
transaction's WAL records to be flushed to permanent storage before 
returning a success indication to the client.'




We've had this running in live now for years without a hiccup so we are 
surprised to learn that we have this massive race condition and it just 


I would not say "...perhaps 1 in 50 times" is massive.

so happens that the hardware is fast enough to process the transaction 
before the .NET application can react to replication slot changes.


Daniel McKenzie
Software Developer


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Unexpected data when subscribing to logical replication slot

2024-05-08 Thread Adrian Klaver

On 5/8/24 08:24, Daniel McKenzie wrote:
It's running both (in docker containers) and also quite a few more 
docker containers running various .NET applications.


I think what you found is that the r7a.medium instance is not capable 
enough to do all that it is asked without introducing lag under load. 
Answering the questions posed by Tomas Vondra would help get to the 
actual cause of the lag.


In meantime my suspicion is this part:

"For example, when I use a psql terminal to update a user's last name 
from "Jones" to "Smith" then I would expect the enrichment query to find 
"Smith" but it will sometimes still find "Jones". It finds the old data 
perhaps 1 in 50 times."


If this is being run against the Postgres server my guess is that 
synchronous_commit=on is causing the commit on the server to wait for 
the WAL records to be flushed to disk and this is not happening in a 
timely manner in the '... 1 in 50 times' you mention. In that case you 
see the old values not the new committed values. This seems to be 
confirmed when you set synchronous_commit=off and don't see old values.

For completeness per:

https://www.postgresql.org/docs/current/wal-async-commit.html

"However, for short transactions this delay is a major component of the 
total transaction time. Selecting asynchronous commit mode means that 
the server returns success as soon as the transaction is logically 
completed, before the WAL records it generated have actually made their 
way to disk. This can provide a significant boost in throughput for 
small transactions.


Asynchronous commit introduces the risk of data loss. There is a short 
time window between the report of transaction completion to the client 
and the time that the transaction is truly committed (that is, it is 
guaranteed not to be lost if the server crashes).  ...

"



Daniel McKenzie
Software Developer


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Unexpected data when subscribing to logical replication slot

2024-05-08 Thread Adrian Klaver





We have found two things that appear to resolve the problem -

  * Using a more powerful EC2 instance. We can reproduce the issue with
a r7a.medium instance but not with a r7a.large EC2 instance.
  * Changing the Postgres synchronous_commit parameter from "on" to
"off". We cannot reproduce the issue with synchronous_commit set to
"off".


What is running on the EC2 instance?:

1) The Postgres server.

2) The replication receiver.

3) Both.




We need help to understand this unexpected behaviour.

We are using Postgres 14.10.

Thanks,
Daniel McKenzie


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Question regarding how databases support atomicity

2024-05-07 Thread Adrian Klaver




On 5/7/24 9:48 AM, Siddharth Jain wrote:
Thanks All for the kind responses. I understand how MVCC takes care of 
atomicity for updates to rows. I was developing a project where, let's say,
data for each table is stored in its own folder together with metadata 
(we are not talking postgres now). So if I have two tables A and B I 
have a folder structure like:

A
\_ metadata.json
B
\_ metadata.json
Now if I want to rename a table, I need to move the folder and also 
update metadata accordingly. These are two separate operations but need 
to be done atomically - all or none. In this case it is possible that we 
succeed in renaming the folder but fail to update metadata for whatever 
reason. Then if we try to undo the folder rename we get another failure 
for whatever reason. How do we deal with such scenarios? Are there no such 
scenarios in postgres?


The only thing I can think of is creating a function in one of the 
untrusted languages, plpython3u or plperlu, to do the renaming. Then, in 
say the plpython3u case, wrap the actions in a try/except block. On a 
failure, take the appropriate undo action.
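
A very rough sketch of what that could look like, purely illustrative: it assumes plpython3u is installed and uses the folder/metadata.json layout described above; the base path and key names are invented:

CREATE OR REPLACE FUNCTION rename_table_dir(old_name text, new_name text)
RETURNS void AS $$
import os, json

base = '/data/tables'                      # assumed root of the per-table folders
old_dir = os.path.join(base, old_name)
new_dir = os.path.join(base, new_name)

os.rename(old_dir, new_dir)                # step 1: move the folder
try:
    meta_path = os.path.join(new_dir, 'metadata.json')
    with open(meta_path) as f:
        meta = json.load(f)
    meta['name'] = new_name                # step 2: update the metadata
    with open(meta_path, 'w') as f:
        json.dump(meta, f)
except Exception:
    os.rename(new_dir, old_dir)            # undo step 1 if step 2 fails
    raise
$$ LANGUAGE plpython3u;

Of course the undo itself can still fail, which is the scenario the original question worries about; at some point an external reconciliation step is needed.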





On Fri, May 3, 2024 at 8:29 PM Tom Lane <mailto:t...@sss.pgh.pa.us>> wrote:


"David G. Johnston" mailto:david.g.johns...@gmail.com>> writes:
 > On Friday, May 3, 2024, Siddharth Jain mailto:siddh...@gmail.com>> wrote:
 >> The way I understand this is that if there is a failure
in-between, we
 >>> start undoing and reverting the previous operations one by one.

 > Not in PostgreSQL.  All work performed is considered provisional
until a
 > commit succeeds.  At which point all provisional work, which had been
 > tagged with the same transaction identifier, becomes reality to
the rest of
 > the system, by virtue of marking the transaction live.

Right.  We don't use UNDO; instead, we use multiple versions of
database rows (MVCC).  A transaction doesn't modify the contents
of existing rows, but just marks them as provisionally outdated, and
then inserts new row versions that are marked provisionally inserted.
Other transactions ignore the outdate markings and the uncommitted new
rows, until the transaction commits, at which time the new versions
become live and the old ones become dead.  If the transaction never
does commit -- either through ROLLBACK or a crash -- then the old row
versions remain live and the new ones are dead.  In either case, we
don't have a consistency or correctness problem, but we do have dead
rows that must eventually get vacuumed away to prevent indefinite
storage bloat.  That can be done by background housekeeping processes
though (a/k/a autovacuum).
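
A small demonstration of the above (table name invented; note the pg_stat counters are updated asynchronously, so the dead-tuple count may take a moment to appear):

CREATE TABLE mvcc_demo (id int PRIMARY KEY, val text);
INSERT INTO mvcc_demo VALUES (1, 'old');

BEGIN;
UPDATE mvcc_demo SET val = 'new' WHERE id = 1;   -- inserts a new, provisional row version
ROLLBACK;                                        -- the new version becomes dead

SELECT val FROM mvcc_demo WHERE id = 1;          -- still 'old'
SELECT n_dead_tup FROM pg_stat_user_tables
WHERE relname = 'mvcc_demo';                     -- typically 1, until vacuumed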

I believe Oracle, for one, actually does use UNDO.  I don't know
what they do about failure-to-UNDO.

                         regards, tom lane



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Restore of a reference database kills the auto analyze processing.

2024-05-07 Thread Adrian Klaver

On 5/7/24 08:24, Adrian Klaver wrote:

On 5/7/24 02:38, HORDER Philip wrote:

Thanks for your time Adrian






1) What is the exact pg_restore command you are using?

2) From earlier post: '...  only analyzes tables in the new db at the 
point of reload, then shuts off again.' Provide that sequence of events 
from the Postgres log.


3) Also statistics from

https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ALL-TABLES-VIEW

for that table after the reload.



4) The autovacuum settings you have in effect.
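
For items 3 and 4, something along these lines would do (the table name is a placeholder):

-- item 3: per-table statistics after the reload
SELECT relname, n_live_tup, n_dead_tup, n_mod_since_analyze,
       last_analyze, last_autoanalyze, autoanalyze_count
FROM pg_stat_all_tables
WHERE relname = 'your_big_table';

-- item 4: the autovacuum-related settings currently in effect
SELECT name, setting, source
FROM pg_settings
WHERE name LIKE 'autovacuum%' OR name = 'track_counts';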


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Restore of a reference database kills the auto analyze processing.

2024-05-07 Thread Adrian Klaver

On 5/7/24 02:38, HORDER Philip wrote:

Thanks for your time Adrian



Is there enough data processing?


Yes, one table is receiving upwards of 20 million rows daily.
We noticed the problem when fetch performance on this table degraded after 
updates.


Autovacuum has thresholds for turning on, are you sure those thresholds are 
just not being met?


Yes we're sure.  Our biggest table is set for a fixed number of rows rather 
than a percentage; this gets an auto analyse about every 15 minutes.

After an update this just stops, and there are no analyse entries in the log 
file.  None at all, for any table.


1) What is the exact pg_restore command you are using?

2) From earlier post: '...  only analyzes tables in the new db at the 
point of reload, then shuts off again.' Provide that sequence of events 
from the Postgres log.


3) Also statistics from

https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ALL-TABLES-VIEW

for that table after the reload.



When we restart Postgres the auto analyse restarts and catches up with the 
backlog.


Phil Horder
Database Mechanic

Thales
Land & Air Systems



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: problem loading shared lib pg_tde.so

2024-05-06 Thread Adrian Klaver

On 5/6/24 07:42, Adrian Klaver wrote:

On 5/6/24 04:05, Matthias Apitz wrote:




I see three different versions of OpenSSL:

OPENSSL_1_1_1d  -- From error message
OpenSSL 1.1.1l-fips    -- SuSE 15 version
OpenSSL 1.1.1t    -- Your built version?

Are you sure you are pointing at the same version in all cases?


Should have added what does the below return?:

ldd /usr/local/sisis-pap/pgsql-16.2/lib/pg_tde.so



matthias






--
Adrian Klaver
adrian.kla...@aklaver.com





Re: problem loading shared lib pg_tde.so

2024-05-06 Thread Adrian Klaver

On 5/6/24 04:05, Matthias Apitz wrote:

I have a problem while loading the pg_tde.so shared lib.

contrib/pg_tde was built with:

cd postgresql-16.2/contrib/pg_tde || exit
gmake clean
export LDFLAGS="-L/usr/local/sisis-pap/lib -L/usr/lib64"
export CFLAGS="-m64 -I/usr/local/sisis-pap/include"
export CPPFLAGS="-m64 -I/usr/local/sisis-pap/include"

./configure --prefix=/usr/local/sisis-pap/pgsql-16.2 \
 --libdir=/usr/local/sisis-pap/pgsql-16.2/lib
 --with-libcurl=/usr/local/sisis-pap/

gmake
gmake install

but the shared lib /usr/local/sisis-pap/pgsql-16.2/lib/pg_tde.so
can't be loaded on startup of the server:

2024-05-06 11:18:45.967 CEST [15368] FATAL:  could not load library 
"/usr/local/sisis-pap/pgsql-16.2/lib/pg_tde.so": /usr/lib64/libssh.so.4: 
undefined symbol: EVP_KDF_CTX_new_id, version OPENSSL_1_1_1d
2024-05-06 11:18:45.967 CEST [15368] LOG:  database system is shut down

This is the OpenSSL version of SuSE Linux Enterprise 15 SP5:

# openssl version
OpenSSL 1.1.1l-fips  24 Aug 2021 SUSE release 150500.17.25.1

This is what we have compiled and PostgreSQL should use:

# export LD_LIBRARY_PATH=/usr/local/sisis-pap/lib
# /usr/local/sisis-pap/bin/openssl version
OpenSSL 1.1.1t  7 Feb 2023


I see three different versions of OpenSSL:

OPENSSL_1_1_1d  -- From error message
OpenSSL 1.1.1l-fips -- SuSE 15 version
OpenSSL 1.1.1t  -- Your built version?

Are you sure you are pointing at the same version in all cases?
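
For example (paths taken from this thread, adjust as needed), comparing what each piece actually resolves to at run time should show whether they agree:

ldd /usr/local/sisis-pap/pgsql-16.2/lib/pg_tde.so | grep -E 'libssl|libcrypto'
ldd /usr/lib64/libssh.so.4 | grep -E 'libssl|libcrypto'
/usr/local/sisis-pap/bin/openssl version
openssl version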



When I disable 'pg_tde' in data/postgresql.auto.conf the server
starts fine;

vim /data/postgresql162/data/postgresql.auto.conf
# disabled shared_preload_libraries = 'pg_tde'

# /etc/init.d/postgres162 start
starts fine

and the postgres proc is using our libssl.so.1.1

# lsof -p 17254 | egrep 'libssl'
postgres 17254 postgres  mem   REG  254,0   697248  1080241 
/usr/local/sisis-pap/lib/libssl.so.1.1

# strings /usr/local/sisis-pap/lib/libssl.so.1.1 | grep EVP_KDF
(nix)

# strings /usr/lib64/libssh.so.4 | grep EVP_KDF
EVP_KDF_CTX_new_id
EVP_KDF_ctrl
EVP_KDF_CTX_free
EVP_KDF_derive

I have a completely different OpenSSL 3.0.x environment: all OpenSSL
consumers use /usr/local/sisis-pap.sp01/lib/libssl.so.3, also
PostgreSQL and pg_tde have been compiled against this; and this
runs fine with 'pg_tde'.

What does the above error mean?

Thanks


matthias




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: \dt shows table but \d says the table doesn't exist ?

2024-05-03 Thread Adrian Klaver

On 5/3/24 21:06, David G. Johnston wrote:
On Friday, May 3, 2024, Tom Lane <mailto:t...@sss.pgh.pa.us>> wrote:



By and large, I'd expect people using mixed-case table names to get
accustomed pretty quickly to the fact that they have to double-quote
those names in SQL.  I don't see why it's a surprise that that is also
true in \d commands.


Every day the number of people increases who get mixed-case names in 
their DB because their client language preserves case and doesn’t 
require quoting.


In a sense they do by making quoting the default, which people find out 
when they stray from the language/ORM/framework and get pointed at:


https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS



And it isn’t like any sane person is using case to distinguish names in 
their DB and so would be at risk of seeing multiple unwanted matches for 
any given pattern.


Have you met people?



I don’t think name pattern matching during object lookup is a good tool 
to wield to try and convince people that using camel case is a bad idea 
and they should use snake case instead.  If they don’t write enough raw 
SQL to be annoyed by their choice more power to them, making \d more 
accessible for them is a win and snake case people won’t notice or care.


David J.


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: How to Build Postgres in a Portable / Relocatable fashion?

2024-05-03 Thread Adrian Klaver

On 5/3/24 17:35, Bruce Momjian wrote:

On Fri, May  3, 2024 at 05:29:34PM -0700, Adrian Klaver wrote:

 From here:

https://www.enterprisedb.com/

#1 IN POSTGRES
THE MOST ADMIRED, DESIRED, & USED DATABASE.

is not helpful.

Though I would say the bigger issue is here:

https://www.postgresql.org/download/

Where you have

Packages and Installers

and then further down

3rd party distributions

It tends to imply that the 'Packages and Installers' are not third party,
when they are to some degree or another.


I think it is that way because we don't have any guarantees or regular
communication with "3rd party distributions".



I would say it is not clear enough that Core is responsible for the 
source releases; anything above that is some other group's 
responsibility. This becomes apparent when you have to tell people that 
fixing packaging issues requires reaching out to said groups through 
means that are not entirely clear.


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: How to Build Postgres in a Portable / Relocatable fashion?

2024-05-03 Thread Adrian Klaver

On 5/3/24 17:07, Bruce Momjian wrote:

On Fri, May  3, 2024 at 11:27:12PM +, AJ ONeal wrote:

Back in the good old days there was official 1st-party support for Linux:

This
was/is a third party site.


Oh. I thought that EDB was *the* Postgres company like MySQL AB was *the* MySQL 
company. My mistake.


How did you come to that conclusion?  Is there something we could do to
avoid this assumption?




From here:

https://www.enterprisedb.com/

#1 IN POSTGRES
THE MOST ADMIRED, DESIRED, & USED DATABASE.

is not helpful.

Though I would say the bigger issue is here:

https://www.postgresql.org/download/

Where you have

Packages and Installers

and then further down

3rd party distributions

It tends to imply that the 'Packages and Installers' are not third 
party, when they are to some degree or another.



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: How to Build Postgres in a Portable / Relocatable fashion?

2024-05-03 Thread Adrian Klaver

On 5/3/24 15:57, AJ ONeal wrote:

Back in the good old days there was official 1st-party support for Linux:


Still there.

The below is for the EDB installer, which at this point is only for 
MacOS and Windows, as Unixen platforms have their own packaging. This 
was/is a third party site.



https://www.enterprisedb.com/downloads/postgres-postgresql-downloads 
<https://www.enterprisedb.com/downloads/postgres-postgresql-downloads>


Those binaries could be tar xvf'd and used without any fuss or muss, 
from any location, on any distro.


I'm the core maintainer of https://webinstall.dev/ 
<https://webinstall.dev/> and I'm thinking to break our "official builds 
only" rule because, well, I haven't been able to find any way to get 
up-to-date builds from an official channel - and apt is always 
half-a-decade out-of-date (part of the reason for Webi, but Webi is far 
worse in this case, stuck at v10).


I don't see that. Using PGDG repo on Ubuntu 22.04:

apt list --installed | grep postgres

postgresql-14/jammy-pgdg,now 14.11-1.pgdg22.04+1 amd64 [installed]
postgresql-15-pgtap/jammy-pgdg,now 1.3.3-1.pgdg22.04+1 all [installed]
postgresql-15/jammy-pgdg,now 15.6-1.pgdg22.04+1 amd64 [installed]
postgresql-16-pgtap/jammy-pgdg,now 1.3.3-1.pgdg22.04+1 all [installed]
postgresql-16-unit/jammy-pgdg,now 7.8-1.pgdg22.04+1 amd64 [installed]
postgresql-16/jammy-pgdg,now 16.2-1.pgdg22.04+1 amd64 [installed,automatic]
postgresql-client-14/jammy-pgdg,now 14.11-1.pgdg22.04+1 amd64 
[installed,automatic]
postgresql-client-15/jammy-pgdg,now 15.6-1.pgdg22.04+1 amd64 
[installed,automatic]
postgresql-client-16/jammy-pgdg,now 16.2-1.pgdg22.04+1 amd64 
[installed,automatic]
postgresql-client-common/jammy-pgdg,now 259.pgdg22.04+1 all 
[installed,automatic]

postgresql-common/jammy-pgdg,now 259.pgdg22.04+1 all [installed,automatic]
postgresql-plpython3-14/jammy-pgdg,now 14.11-1.pgdg22.04+1 amd64 [installed]
postgresql-plpython3-15/jammy-pgdg,now 15.6-1.pgdg22.04+1 amd64 [installed]
postgresql-plpython3-16/jammy-pgdg,now 16.2-1.pgdg22.04+1 amd64 [installed]
postgresql-server-dev-16/jammy-pgdg,now 16.2-1.pgdg22.04+1 amd64 [installed]
postgresql/jammy-pgdg,now 16+259.pgdg22.04+1 all [installed]



Could I get some help on how to do that?
Are the old build processes documented somewhere? Or are there some 
scripts in a far corner of the Internet that could still do that?


As to Apt build scripts:

https://wiki.postgresql.org/wiki/Apt/RepoDocs



Or what options might I need to pass to ./configure to get it to build 
with relative locations?
I'm not a C developer, and I'm not familiar with C build tools beyond 
./configure; make; sudo make install.


I'd really appreciate some direction on this. Thanks.

AJ ONeal


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: \dt shows table but \d says the table doesn't exist ?

2024-05-03 Thread Adrian Klaver

On 5/3/24 14:06, Magnus Hagander wrote:



On Fri, May 3, 2024 at 10:58 PM David Gauthier <mailto:dfgpostg...@gmail.com>> wrote:


psql (15.3, server 14.5) on linux

Someone else's DB which I've been asked to look at.

\dt gives many tables, here are just 3...

  public | some_idIds                                       | table
| cron_user
  public | WarningIds                                       | table
| cron_user
  public | cpf_inv_driverIds                                | table
| cron_user

but \d public.some_idIds gives..

Did not find any relation named "public.some_idIds".



Looks like you might need a \d "some_idIds" (include the quotes) since 
it has uppercase characters?


This:

"Did not find any relation named "public.some_idIds"."

to me indicates it did look for the properly cased name.
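
For completeness, if it does turn out to be a case/quoting issue, the double quotes go around just the identifier; the schema qualifier stays outside unless it is mixed-case too:

\d "some_idIds"
\d public."some_idIds"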



--
  Magnus Hagander
  Me: https://www.hagander.net/ <http://www.hagander.net/>
  Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: \dt shows table but \d says the table doesn't exist ?

2024-05-03 Thread Adrian Klaver

On 5/3/24 13:58, David Gauthier wrote:

psql (15.3, server 14.5) on linux

Someone else's DB which I've been asked to look at.

\dt gives many tables, here are just 3...

  public | some_idIds                                       | table | 
cron_user
  public | WarningIds                                       | table | 
cron_user
  public | cpf_inv_driverIds                                | table | 
cron_user


but \d public.some_idIds gives..

Did not find any relation named "public.some_idIds".

Soo... what am I missing ?


Try:

select relname, length(relname) from pg_class where relname ilike 
'%some_idIds%';


Best guess there are hidden characters.


owner is "cron_user".  \dt shows cron_user is the owner of the table.




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Restore of a reference database kills the auto analyze processing.

2024-05-02 Thread Adrian Klaver

On 5/2/24 08:52, HORDER Philip wrote:

Running Postgres 15.3 with PostGIS 3.3

On Windows 10 (yes, I know)

It’s a single node db with no replication, topping out at about 200GB.

We have a schema (A) in the default 'postgres' maintenance database, 
which our software services connect to, with one set of users (SU)


The above is probably not a good idea. The 'postgres' database is 
generally taken to be a throw-away database for establishing an initial 
connection. Many utilities/tools use it for that purpose; having your 
data in it exposes that data.




This works, some of the time.

If I repeat the update process, somewhere around run #4 the auto 
analyzer stops working, and only analyzes tables in the new db at the 
point of reload, then shuts off again.


All vacuum and analyze operations on the 'postgres' database just stop, 
even though there is still data processing into it.


Is there enough data processing?

Autovacuum has thresholds for turning on, are you sure those thresholds 
are just not being met?
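
One rough way to check that, using the default formula threshold + scale_factor * reltuples (per-table storage parameter overrides are not reflected here):

SELECT s.relname,
       s.n_mod_since_analyze,
       current_setting('autovacuum_analyze_threshold')::int
       + current_setting('autovacuum_analyze_scale_factor')::float8 * c.reltuples
         AS approx_analyze_threshold
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
ORDER BY s.n_mod_since_analyze DESC
LIMIT 10;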




With log_autovacuum_min_duration = 0, we are logging all vacuum & 
analyze operations, so we can see when the entries shut off in the 
Postgres log files, e.g.


2024-05-02 14:52:01.597 GMT [6896]: [23-1] db=,user=,app=,client= LOG:  
automatic analyze of table "lfm.pg_catalog.pg_trigger"


Except the above shows it working.

What is the evidence it is not?



The only way I can find of getting the analyzer back is to restart Postgres.


Did you wait to see if activity after the pg_restore crossed the 
autovacuum thresholds?




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Restore of a reference database kills the auto analyze processing.

2024-05-02 Thread Adrian Klaver




On 5/2/24 8:52 AM, HORDER Philip wrote:

Running Postgres 15.3 with PostGIS 3.3

On Windows 10 (yes, I know)

It’s a single node db with no replication, topping out at about 200GB.

We have a schema (A) in the default 'postgres' maintenance database, 
which our software services connect to, with one set of users (SU)


We have another database, let’s call it LFM, which contains reference 
data for some COTS software.  I don't know what's in it, we just get 
given updates for it in pg_backup binary files, about 2MB each.


Do you mean pg_basebackup, pg_dump or something else?


--
Adrian Klaver
adrian.kla...@aklaver.com




Re: How to interpret 'depends on' errors in pg_restore?

2024-05-02 Thread Adrian Klaver

On 5/2/24 02:20, Fire Emerald wrote:
I didn't used pg_dump/restore until today and finally found my mistake 
which lead to the "problem" described below.


The output "depends on" comes from the -l (l as Lima) flag, what i 
wanted was the -1 (number one) flag, which stands for single transaction 
in pg_restore. As -l does not execute anything, nothing was logged in 
the postgres server log and none error was shown anywhere.


-l does indeed execute something per:

https://www.postgresql.org/docs/current/app-pgrestore.html

"
-l
--list

List the table of contents of the archive. The output of this 
operation can be used as input to the -L option. Note that if filtering 
switches such as -n or -t are used with -l, they will restrict the items 
listed.

"

As example:

pg_restore -l redmine41_14_032124.out


;
; Archive created at 2024-03-21 01:00:01 PDT
; dbname: redmine
; TOC Entries: 455
; Compression: -1
; Dump Version: 1.14-0
; Format: CUSTOM
; Integer: 4 bytes
; Offset: 8 bytes
; Dumped from database version: 14.11 (Ubuntu 14.11-1.pgdg22.04+1)
; Dumped by pg_dump version: 14.11 (Ubuntu 14.11-1.pgdg22.04+1)
;
;
; Selected TOC Entries:
;
209; 1259 17070 TABLE public ar_internal_metadata redmine
210; 1259 17075 TABLE public attachments redmine
211; 1259 17088 SEQUENCE public attachments_id_seq redmine
4069; 0 0 SEQUENCE OWNED BY public attachments_id_seq redmine
...


Generally you want to redirect that to a file with -f 
with the goal of using it with:

"
-L list-file
--use-list=list-file

Restore only those archive elements that are listed in list-file, 
and restore them in the order they appear in the file. Note that if 
filtering switches such as -n or -t are used with -L, they will further 
restrict the items restored.


list-file is normally created by editing the output of a previous 
-l operation. Lines can be moved or removed, and can also be commented 
out by placing a semicolon (;) at the start of the line. See below for 
examples.

"

You instead redirected the output to the target database and that led to 
your errors.
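
For reference, a sketch of the two different workflows (file and database names are placeholders):

pg_restore -l file.dump -f file.list          # write the TOC to a file
# edit file.list, commenting out unwanted entries with a leading ';'
pg_restore -L file.list -d target file.dump   # restore only the listed items

pg_restore -1 -d target file.dump             # what was intended: single transaction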





Both chars looked so identical in my editors/shells that I thought I had 
used -1, when in fact I was using -l.


It's always the tiny obvious thing, which we do not see.

Best regards,
Chris

Am 28. März 2024 16:57:04 schrieb Fire Emerald :


Am 28. März 2024 15:00:06 schrieb Tom Lane :


Fire Emerald  writes:
Then I did a pg_restore -d target --verbose -Fc file.dump and saw in the 
output this:



5145 0 730750 TABLE subpartitions backends_y2024w03 userA
;        depends on: 237
 and so on ...


That is not an error, it's just verbose display of one of the items
in the dump.


Well, I know it's not an error, but it's everything I got. There was 
no error shown. The command completed, but without anything imported.



Nothing was restored.


You would need to show us the actual errors.  (Suggestion:
leave off --verbose, it's just clutter.)  A guess though is
that the import failed because of foreign key constraints.
--data-only mode is not good at ordering the table loads to
ensure that FK constraints are satisfied on-the-fly.

  regards, tom lane


As I said, the same import but with INSERT INTOs worked without any 
issues. So no, there are no FK constraints failing.


*But* the target and source table had partitioned tables attached, 
using ATTACH PARTITION.


The schema was like:
db1 schema1 public table1 (links to the partitions listed below)
db1 schema1 subpartitions backends_y2024w03
db1 schema1 subpartitions backends_y2024w04
db1 schema1 subpartitions backends_y2024w05

The partitioning must be the problem somehow.




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Need help to make space on my database

2024-04-29 Thread Adrian Klaver

On 4/29/24 08:51, Cocam' server wrote:

 > Did the above work for each table?
Yes, except for the biggest table

 > Have you done something like?:
 >
 > select relname, n_dead_tup from pg_stat_all_tables where relname =
 > '';

I hadn't thought of that, but it seems that some tables have dead tuples



You have something against providing actual numbers?

The point is there is really nothing to be gained by doing VACUUM FULL 
if the dead tuples are some small percentage of the tables.


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Need help to make space on my database

2024-04-29 Thread Adrian Klaver

On 4/29/24 08:04, Cocam' server wrote:

When replying use Reply All to include the mailing list
Ccing list


 > How much current free space do you have available on the disk?
as we speak, I only have 6 GB available on the machine running the server

 > Did you VACUUM FULL a table at a time or all of them at once?
I tried to do a VACUUM FULL. I also tried it on the biggest tables (200 
MB and larger) but not on all of them.


Did the above work for each table?

Have you done something like?:

select relname, n_dead_tup from pg_stat_all_tables where relname = 
'';


to see if there any dead tuples to clean out.

Or if you use the contrib extension pgstattuple:

https://www.postgresql.org/docs/current/pgstattuple.html

then:

SELECT * FROM pgstattuple('');

This returns something like:

-[ RECORD 1 ]--+
table_len  | 3940352
tuple_count| 4310
tuple_len  | 3755414
tuple_percent  | 95.31
dead_tuple_count   | 0
dead_tuple_len | 0
dead_tuple_percent | 0
free_space | 124060
free_percent   | 3.15




The two biggest are these:
  state_groups_state | 5475 MB
  event_json | 2328 MB

(I'd particularly like to make room on these two tables, which take up 
the most space)


By the way, excuse me if I make a few mistakes (especially when 
replying); this is the first time I've used Postgres community support 
directly.



Le lun. 29 avr. 2024 à 16:37, Adrian Klaver <mailto:adrian.kla...@aklaver.com>> a écrit :


On 4/29/24 07:33, Cocam' server wrote:

Please reply to list also
Ccing list

 > No, the aim is also to reallocate free space to the system for
the other
 > tasks it performs.(That's why I said I'd like it returned to the OS)

You led with:

"I need help to make space on my database".

How much current free space do you have available on the disk?

Did you VACUUM FULL a table at a time or all of them at once?

What are the individual tables sizes?

 >
     > Le lun. 29 avr. 2024 à 16:19, Adrian Klaver
mailto:adrian.kla...@aklaver.com>
 > <mailto:adrian.kla...@aklaver.com
<mailto:adrian.kla...@aklaver.com>>> a écrit :
 >
 >     On 4/29/24 06:45, Cocam' server wrote:
 >      > Hello.
 >      >
 >      > I need help to make space on my database. I have tables
that are
 >     several
 >      > GB in size. I used to use the VACUUM FULL VERBOSE command; but
 >     now, this
 >      > command is too greedy in free space to be used and I'm looking
 >     for a way
 >      > to make free space (given back to the OS)
 >      >
 >      > Thanks in advance to everyone who responds
 >
 >     Per
 >
 > https://www.postgresql.org/docs/current/sql-vacuum.html
<https://www.postgresql.org/docs/current/sql-vacuum.html>
 >     <https://www.postgresql.org/docs/current/sql-vacuum.html
<https://www.postgresql.org/docs/current/sql-vacuum.html>>
 >
 >     "VACUUM reclaims storage occupied by dead tuples. In normal
PostgreSQL
 >     operation, tuples that are deleted or obsoleted by an update
are not
 >     physically removed from their table; they remain present
until a VACUUM
 >     is done. Therefore it's necessary to do VACUUM periodically,
especially
 >     on frequently-updated tables.
 >
 >     <...>
 >
 >     Plain VACUUM (without FULL) simply reclaims space and makes it
 >     available
 >     for re-use. This form of the command can operate in parallel with
 >     normal
 >     reading and writing of the table, as an exclusive lock is not
obtained.
 >     However, extra space is not returned to the operating system
(in most
 >     cases); it's just kept available for re-use within the same
    table.
 >     "
 >
 >     So a regular VACUUM should work if all you want to do is give the
 >     database the ability to recycle the vacuumed tuple space.
 >
 >     --
 >     Adrian Klaver
 > adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>
<mailto:adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>>
 >

-- 
Adrian Klaver

adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Need help to make space on my database

2024-04-29 Thread Adrian Klaver

On 4/29/24 07:33, Cocam' server wrote:

Please reply to list also
Ccing list

No, the aim is also to reallocate free space to the system for the other 
tasks it performs. (That's why I said I'd like it returned to the OS)


You led with:

"I need help to make space on my database".

How much current free space do you have available on the disk?

Did you VACUUM FULL a table at a time or all of them at once?

What are the individual tables sizes?



Le lun. 29 avr. 2024 à 16:19, Adrian Klaver <mailto:adrian.kla...@aklaver.com>> a écrit :


On 4/29/24 06:45, Cocam' server wrote:
 > Hello.
 >
 > I need help to make space on my database. I have tables that are
several
 > GB in size. I used to use the VACUUM FULL VERBOSE command; but
now, this
 > command is too greedy in free space to be used and I'm looking
for a way
 > to make free space (given back to the OS)
 >
 > Thanks in advance to everyone who responds

Per

https://www.postgresql.org/docs/current/sql-vacuum.html
<https://www.postgresql.org/docs/current/sql-vacuum.html>

"VACUUM reclaims storage occupied by dead tuples. In normal PostgreSQL
operation, tuples that are deleted or obsoleted by an update are not
physically removed from their table; they remain present until a VACUUM
is done. Therefore it's necessary to do VACUUM periodically, especially
on frequently-updated tables.

<...>

Plain VACUUM (without FULL) simply reclaims space and makes it
available
for re-use. This form of the command can operate in parallel with
normal
reading and writing of the table, as an exclusive lock is not obtained.
However, extra space is not returned to the operating system (in most
cases); it's just kept available for re-use within the same table.
"

So a regular VACUUM should work if all you want to do is give the
database the ability to recycle the vacuumed tuple space.

-- 
Adrian Klaver

adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Need help to make space on my database

2024-04-29 Thread Adrian Klaver

On 4/29/24 06:45, Cocam' server wrote:

Hello.

I need help to make space on my database. I have tables that are several 
GB in size. I used to use the VACUUM FULL VERBOSE command; but now, this 
command is too greedy in free space to be used and I'm looking for a way 
to make free space (given back to the OS)


Thanks in advance to everyone who responds


Per

https://www.postgresql.org/docs/current/sql-vacuum.html

"VACUUM reclaims storage occupied by dead tuples. In normal PostgreSQL 
operation, tuples that are deleted or obsoleted by an update are not 
physically removed from their table; they remain present until a VACUUM 
is done. Therefore it's necessary to do VACUUM periodically, especially 
on frequently-updated tables.


<...>

Plain VACUUM (without FULL) simply reclaims space and makes it available 
for re-use. This form of the command can operate in parallel with normal 
reading and writing of the table, as an exclusive lock is not obtained. 
However, extra space is not returned to the operating system (in most 
cases); it's just kept available for re-use within the same table.

"

So a regular VACUUM should work if all you want to do is give the 
database the ability to recycle the vacuumed tuple space.
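
As a sketch, using the two table names mentioned elsewhere in this thread, you could check the dead-tuple counts and sizes and then run a plain VACUUM:

SELECT relname,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
       n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname IN ('state_groups_state', 'event_json');

VACUUM (VERBOSE) state_groups_state;
VACUUM (VERBOSE) event_json;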


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Query Discrepancy in Postgres HLL Test

2024-04-28 Thread Adrian Klaver

On 4/28/24 06:01, Ayush Vatsa wrote:

Hi PostgreSQL Community,
I'm currently delving into Postgres HLL (HyperLogLog) functionality and 
have encountered an unexpected behavior while executing queries from the 
"cumulative_add_sparse_edge.sql 
<https://github.com/citusdata/postgresql-hll/blob/master/sql/cumulative_add_sparse_edge.sql#L28-L36>" regress test. This particular test data file <https://github.com/citusdata/postgresql-hll/blob/master/sql/data/cumulative_add_sparse_edge.csv#L515-L516> involves three columns, with the last column representing an HLL (HyperLogLog) value derived from the previous HLL value and the current raw value.


Upon manual inspection of the query responsible for deriving the last 
row's HLL value, I noticed a discrepancy. When executing the query:

"""
-- '\x148B481002' is second last rows hll value
SELECT hll_add('\x148B481002.', hll_hashval(2561));
"""
instead of obtaining the expected value (''\x148B481002''), I 
received a different output which is ('\x138b48000200410061008100a1 
').


My initial assumption is that this could potentially be attributed to a 
precision error. However, I'm reaching out to seek clarity on why this 
disparity is occurring and to explore potential strategies for 
mitigating it (as I want the behaviour to be consistent with the regress 
test file).


I would say your best option is to file an issue here:

https://github.com/citusdata/postgresql-hll/issues



Regards
Ayush Vatsa


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: What linux version to install ?

2024-04-25 Thread Adrian Klaver

On 4/25/24 00:46, Kashif Zeeshan wrote:



I worked with both CentOS/RHEL 7/8.


What does the above mean relative to the original question?

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Backup_Long Running

2024-04-24 Thread Adrian Klaver

On 4/24/24 00:03, jaya kumar wrote:

Hi Team,

The production database backup is running for very long hours. Is there 
any option to reduce the backup time? Kindly advise.


Hardware specifications?

Network specifications?

The actual pg_basebackup command used?

Server(s) user load?



DB size: 793 GB

We are taking pg_basebackup backup.

Backup File size=613G

Backup Running Duration: 8 Hours


--
Thanks & Regards,
Jayakumar.S
+91-9840864439.


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: issue with reading hostname

2024-04-22 Thread Adrian Klaver

On 4/22/24 14:54, Atul Kumar wrote:
I mean, once I change the hostname, how will the socket read the new 
hostname? Does it require a postgres service restart?


The host name of the machine?





On Tue, Apr 23, 2024 at 3:19 AM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:


On 4/22/24 14:37, Atul Kumar wrote:
 > Can we edit the socket to change the hostname in it ?

On Ubuntu 22.04 install, given:

srwxrwxrwx  1 postgres postgres    0 Apr 22 14:01 .s.PGSQL.5432=
-rw---  1 postgres postgres   68 Apr 22 14:01 .s.PGSQL.5432.lock

The contents of .s.PGSQL.5432.lock(the file that indicates a Postgres
instance has a lock on the socket) are:

862
/var/lib/postgresql/15/main
1713795311
5432
/var/run/postgresql

There is no hostname to be changed as you are working with a local
socket.

 >
 > Regards.
 >
 > On Tue, Apr 23, 2024 at 2:41 AM Ron Johnson
mailto:ronljohnso...@gmail.com>
 > <mailto:ronljohnso...@gmail.com
<mailto:ronljohnso...@gmail.com>>> wrote:
 >
 >     On Mon, Apr 22, 2024 at 4:14 PM Atul Kumar
mailto:akumar14...@gmail.com>
 >     <mailto:akumar14...@gmail.com
<mailto:akumar14...@gmail.com>>> wrote:
 >
 >         Hi,
 >
 >         I have postgresql  version 15 running on centos7.
 >
 >         I have below query that reads hostname from /tmp directory:
 >
 >         psql postgres -A -t -p 5432 -h /tmp/ -c 'SELECT
 >         pg_is_in_recovery();'
 >
 >
 >     If you installed from the PGDG repository (possibly also the
CENTOS
 >     repos, but I'm not sure), then the domain socket also lives in :
 >     /var/run/postgresql
 >
 >     * I find that more expressive than /tmp.
 >     * No need to specify the host when using sockets.
 >     * Using a socket name makes parameterizing the hostname easier in
 >     scripts.
 >
 >

-- 
Adrian Klaver

adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: CLUSTER vs. VACUUM FULL

2024-04-22 Thread Adrian Klaver

On 4/22/24 14:35, Ron Johnson wrote:


 >
 > On Stack Exchange, I've got a question on how to determine when
to run
 > CLUSTER.  It ties in strongly with this thread..
 >

And the link is?


It should have been the initial question of this thread and it explains 
what you are really after.





Sorry.  Got distracted by the answer.

https://dba.stackexchange.com/questions/338870/when-to-rerun-cluster 
<https://dba.stackexchange.com/questions/338870/when-to-rerun-cluster>




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: issue with reading hostname

2024-04-22 Thread Adrian Klaver

On 4/22/24 14:37, Atul Kumar wrote:

Can we edit the socket to change the hostname in it ?


On Ubuntu 22.04 install, given:

srwxrwxrwx  1 postgres postgres0 Apr 22 14:01 .s.PGSQL.5432=
-rw---  1 postgres postgres   68 Apr 22 14:01 .s.PGSQL.5432.lock

The contents of .s.PGSQL.5432.lock(the file that indicates a Postgres 
instance has a lock on the socket) are:


862
/var/lib/postgresql/15/main
1713795311
5432
/var/run/postgresql

There is no hostname to be changed as you are working with a local socket.



Regards.

On Tue, Apr 23, 2024 at 2:41 AM Ron Johnson <mailto:ronljohnso...@gmail.com>> wrote:


On Mon, Apr 22, 2024 at 4:14 PM Atul Kumar mailto:akumar14...@gmail.com>> wrote:

Hi,

I have postgresql  version 15 running on centos7.

I have below query that reads hostname from /tmp directory:

psql postgres -A -t -p 5432 -h /tmp/ -c 'SELECT
pg_is_in_recovery();'


If you installed from the PGDG repository (possibly also the CENTOS
repos, but I'm not sure), then the domain socket also lives in :
/var/run/postgresql

* I find that more expressive than /tmp.
* No need to specify the host when using sockets.
* Using a socket name makes parameterizing the hostname easier in
scripts.




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: CLUSTER vs. VACUUM FULL

2024-04-22 Thread Adrian Klaver

On 4/22/24 13:59, Ron Johnson wrote:
On Mon, Apr 22, 2024 at 4:21 PM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:

[snip]

Which gets us back to your comment upstream:

"What the VACUUM docs "should" do, it seems, is suggest CLUSTER on the
PK, if the PK is a sequence (whether that be an actual sequence, or a
timestamp or something else that grows monotonically)."

This is a case specific to you and this particular circumstance, not a
general rule for VACUUM FULL. If for no other reason, then it might make
more sense for the application that the CLUSTER be done on some other
index than the PK.


On Stack Exchange, I've got a question on how to determine when to run 
CLUSTER.  It ties in strongly with this thread..




And the link is?

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: issue with reading hostname

2024-04-22 Thread Adrian Klaver

On 4/22/24 13:13, Atul Kumar wrote:

Hi,

I have postgresql  version 15 running on centos7.

I have below query that reads hostname from /tmp directory:

psql postgres -A -t -p 5432 -h /tmp/ -c 'SELECT pg_is_in_recovery();'


so below are my questions:

1. Is the psql client reading the socket file that resides in the /tmp 
directory to fetch the hostname ?


2. I saw the socket file in /tmp and it is empty. Then how is the psql 
client still reading the socket file successfully for hostname ?



this is what my socket looks like (the size is 0 as the file is empty):

srwxrwxrwx. 1 postgres postgres      0 Apr 22 12:47 .s.PGSQL.5432


Please help me clarify these doubts.


https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS

"host

Name of host to connect to. If a host name looks like an absolute 
path name, it specifies Unix-domain communication rather than TCP/IP 
communication; the value is the name of the directory in which the 
socket file is stored. (On Unix, an absolute path name begins with a 
slash. On Windows, paths starting with drive letters are also 
recognized.) If the host name starts with @, it is taken as a 
Unix-domain socket in the abstract namespace (currently supported on 
Linux and Windows). The default behavior when host is not specified, or 
is empty, is to connect to a Unix-domain socket in /tmp (or whatever 
socket directory was specified when PostgreSQL was built). On Windows, 
the default is to connect to localhost.


A comma-separated list of host names is also accepted, in which 
case each host name in the list is tried in order; an empty item in the 
list selects the default behavior as explained above. See Section 
34.1.1.3 for details.

"

The simplistic explanation is that the socket is the "host".
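
In other words, both of the following point psql at a socket directory rather than at a network host (the second path is where distro packages usually put the socket; adjust to your install):

psql postgres -A -t -p 5432 -h /tmp -c 'SELECT pg_is_in_recovery();'
psql postgres -A -t -p 5432 -h /var/run/postgresql -c 'SELECT pg_is_in_recovery();'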





Regards.








--
Adrian Klaver
adrian.kla...@aklaver.com





Re: CLUSTER vs. VACUUM FULL

2024-04-22 Thread Adrian Klaver

On 4/22/24 12:51, Ron Johnson wrote:
On Mon, Apr 22, 2024 at 3:14 PM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:








1) If they are already in enough of a PK order that the CLUSTER time vs
VACUUM FULL time would not be material as there is not much or any
sorting to do then what does the CLUSTER gain you? 



Not much.  Now they're just "slightly more ordered" instead of "slightly 
less ordered" for little if any extra effort.


2) What evidence is there that the records were still in PK order just
because you deleted based on CREATED_ON? I understand the correlation
between CREATED_ON and the PK, just not sure why that would necessarily
translate to an on-disk order by PK?


1. Records are appended to tables in INSERT order, and INSERT order is 
highly correlated to synthetic PK, by the nature of sequences.


Not something I would count on, see:

https://www.postgresql.org/docs/current/sql-createsequence.html

Notes

for how that may not always be the case.

Also any UPDATE or DELETE is going to change that. There is no guarantee 
of order for the data in the table. If there were, you would not need to 
run CLUSTER.


2. My original email showed that CLUSTER took just as long as VACUUM 
FULL.  That means not many records had to be sorted, because... the 
on-disk order was strongly correlated to PK and CREATED_ON.

Will that happen *every time* in *every circumstance* in *every 
database*?  No, and I never said it would.  But it does in *my* database 
in *this* application.




Which gets us back to your comment upstream:

"What the VACUUM docs "should" do, it seems, is suggest CLUSTER on the 
PK, if the PK is a sequence (whether that be an actual sequence, or a 
timestamp or something else that grows monotonically)."


This is a case specific to you and this particular circumstance, not a 
general rule for VACUUM FULL. If for no other reason, then it might make 
more sense for the application that the CLUSTER be done on some other 
index than the PK.




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: altering a column to to make it generated

2024-04-22 Thread Adrian Klaver

On 4/22/24 12:42, Celia McInnis wrote:
Can I alter a table column to now make it generated? I tried this 
unsuccessfully:


create temp table tmp1 as select 15::numeric(6,1) as 
distance,'24:30'::interval,0::numeric(7,3) as avgspd;
alter table tmp1 alter column avgspd type numeric(7,3) generated always 
as ((3600.*distance)/EXTRACT (EPOCH FROM rdrtime::interval)) STORED;

ERROR:  syntax error at or near "generated"
LINE 1: ... table tmp1 alter column avgspd type numeric(7,3) generated ...

I think that at least I have the right bracketing this time! :-) 


See David Johnston's comment. I keep on wanting to believe that ALTER 
TABLE supports GENERATED expressions, when it only supports GENERATED 
IDENTITY:)
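
A possible workaround, sketched against the example above, is to drop the plain column and re-add it as a generated one. Note this assumes the real table has a rdrtime interval column, as in the original expression (in the temp-table example that column ended up being named "interval"):

ALTER TABLE tmp1 DROP COLUMN avgspd;
ALTER TABLE tmp1
  ADD COLUMN avgspd numeric(7,3)
  GENERATED ALWAYS AS ((3600. * distance) / EXTRACT(EPOCH FROM rdrtime::interval)) STORED;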


  







--
Adrian Klaver
adrian.kla...@aklaver.com





Re: CLUSTER vs. VACUUM FULL

2024-04-22 Thread Adrian Klaver




On 4/22/24 11:45 AM, Ron Johnson wrote:
On Mon, Apr 22, 2024 at 12:29 PM David G. Johnston 
mailto:david.g.johns...@gmail.com>> wrote:




On Mon, Apr 22, 2024, 08:37 Ron Johnson mailto:ronljohnso...@gmail.com>> wrote:

On Mon, Apr 22, 2024 at 10:25 AM Tom Lane mailto:t...@sss.pgh.pa.us>> wrote:

Marcos Pegoraro mailto:mar...@f10.com.br>> writes:
 > But wouldn't it be good that VACUUM FULL uses that index
defined by
 > Cluster, if it exists ?

No ... what would be the difference then?

What the VACUUM docs "should" do, it seems, is suggest CLUSTER
on the PK, if the PK is a sequence (whether that be an actual
sequence, or a timestamp or something else that grows
monotonically).

That's because the data is already roughly in PK order.


If things are bad enough to require a vacuum full that doesn't seem
like a good assumption.


Sure it does.

For example, I just deleted the oldest half of the records in 30 
tables.  Tables whose CREATED_ON timestamp value strongly correlates to 
the synthetic PK sequence values.


Thus, the remaining records were still mostly in PK order.  CLUSTERs on 
the PK values would have taken just about as much time as the VACUUM 
FULL statements which I did run.


1) If they are already in enough of a PK order that the CLUSTER time vs 
VACUUM FULL time would not be material, as there is not much or any 
sorting to do, then what does the CLUSTER gain you? Unless this table 
then became read only, whatever small gain arose from the CLUSTER would 
fade away as UPDATEs and DELETEs were done.


2) What evidence is there that the records were still in PK order just 
because you deleted based on CREATED_ON? I understand the correlation 
between CREATED_ON and the PK, just not sure why that would necessarily 
translate to an on-disk order by PK?


--
Adrian Klaver
adrian.kla...@aklaver.com




Re: adding a generated column to a table?

2024-04-22 Thread Adrian Klaver

On 4/22/24 09:05, Celia McInnis wrote:

If I have a table containing a date field, say:
create temporary table tmp1 as select now()::date as evtdate;
SELECT 1

select DATE_PART('year', evtdate)::integer as year from tmp1;
  year
--
  2024
(1 row)

Is there some way of doing something like the following?:

alter table tmp1 add column year integer generated always as 
DATE_PART('year', evtdate)::integer STORED;

ERROR:  syntax error at or near "DATE_PART"
LINE 1: ... tmp1 add column year integer generated always as DATE_PART(...



https://www.postgresql.org/docs/current/sql-createtable.html

GENERATED ALWAYS AS ( generation_expr ) STORED

So:


generated always as (DATE_PART('year', evtdate)::integer) STORED
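
I.e. the full statement would be something like:

ALTER TABLE tmp1
  ADD COLUMN year integer
  GENERATED ALWAYS AS (DATE_PART('year', evtdate)::integer) STORED;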

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: CLUSTER vs. VACUUM FULL

2024-04-22 Thread Adrian Klaver

On 4/22/24 08:37, Ron Johnson wrote:
On Mon, Apr 22, 2024 at 10:25 AM Tom Lane <mailto:t...@sss.pgh.pa.us>> wrote:


Marcos Pegoraro mailto:mar...@f10.com.br>> writes:
 > But wouldn't it be good that VACUUM FULL uses that index defined by
 > Cluster, if it exists ?

No ... what would be the difference then?

What the VACUUM docs "should" do, it seems, is suggest CLUSTER on the 
PK, if the PK is a sequence (whether that be an actual sequence, or a 
timestamp or something else that grows monotonically).


Why?

That would, per David Rowley's comments, impose a sort cost on top of 
the cost of hitting every heap page and rewriting it. You end up with a 
sorted table, granted, until such time as you start making changes to it. 
If you are to the point of running VACUUM FULL, that indicates to me the 
table has seen a heavy load of changes that you want to clean out. Given 
the temporary nature of the effects of a CLUSTER under a change load, I 
don't see why it would be the way to go to clean up a changing table.




That's because the data is already roughly in PK order.



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: error in trigger creation

2024-04-21 Thread Adrian Klaver

On 4/21/24 14:21, Tom Lane wrote:

Adrian Klaver  writes:

On 4/21/24 11:20, yudhi s wrote:

So in this case I was wondering if "event trigger" can cause any
additional threat and thus no such privilege like "create
trigger" exists in postgres, and so it should be treated cautiously?



An event trigger runs as a superuser and executes a function that in
turn can do many things, you do the math on the threat level.


As a trivial example: an event trigger could prevent the legitimate
superuser(s) from doing anything at all in that database, just by
blocking all their commands.  This might not even require malicious
intent, merely faulty coding --- but the opportunity for malicious
intent is staggeringly large.
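
A minimal sketch of what such a blocking trigger could look like (names invented; EXECUTE FUNCTION is the PG 11+ spelling, older releases use EXECUTE PROCEDURE). Once in place, every command that fires ddl_command_start fails, which can include the DROP EVENT TRIGGER needed to get rid of it, hence the single-user-mode escape hatch quoted below:

CREATE FUNCTION refuse_all_ddl() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    RAISE EXCEPTION 'DDL command % is blocked by event trigger', tg_tag;
END;
$$;

CREATE EVENT TRIGGER block_ddl
    ON ddl_command_start
    EXECUTE FUNCTION refuse_all_ddl();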


As an FYI to above:

https://www.postgresql.org/docs/current/sql-createeventtrigger.html

"Event triggers are disabled in single-user mode (see postgres). If an 
erroneous event trigger disables the database so much that you can't 
even drop the trigger, restart in single-user mode and you'll be able to 
do that."





    regards, tom lane


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: error in trigger creation

2024-04-21 Thread Adrian Klaver

On 4/21/24 11:20, yudhi s wrote:


On Sun, Apr 21, 2024 at 8:13 PM Tom Lane <mailto:t...@sss.pgh.pa.us>> wrote:






So do you mean we should not create the event trigger using 
"security definer", but rather have the super user do this each time we 
have to create the event trigger?


Actually, I am not very much aware of the security part, but is it 
fine to give the super user privilege to the application user (say 
app_user) from which scripts/procedures normally get executed by the 
application, but which nobody (no individual person) can log in with?


Additionally in other databases, triggers are driven by some 
specific privileges (say for example in oracle "create trigger" 
privilege). And it doesn't need any super user and we were having many 


Which Postgres has

https://www.postgresql.org/docs/current/ddl-priv.html

TRIGGER

Allows creation of a trigger on a table, view, etc.


but you are talking about event triggers

https://www.postgresql.org/docs/current/sql-createeventtrigger.html

where

"Only superusers can create event triggers."

To paraphrase Henry Ford, you can have any user for an event trigger as 
long as the user is a superuser.




applications in which the application user (which was used for app to 
app login) was having these privileges, similar to "create table" 
privileges which come by default to the schema that owns the objects, 
etc. So in this case I was wondering if "event trigger" can cause any 
additional threat and thus no such privilege like "create 
trigger" exists in postgres, and so it should be treated cautiously?


An event trigger runs as a superuser and executes a function that in 
turn can do many things, you do the math on the threat level.



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Logging statement having any threat?

2024-04-21 Thread Adrian Klaver

On 4/21/24 02:35, Lok P wrote:
On Sat, Apr 20, 2024 at 10:02 PM Adrian Klaver 
mailto:adrian.kla...@aklaver.com>> wrote:



Have you tried?:


https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT
 
<https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT>

"
log_statement (enum)

    <...>

The default is none. Only superusers and users with the appropriate SET
privilege can change this setting.
"

Or

https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-SET 
<https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-SET>

set_config ( setting_name text, new_value text, is_local boolean ) →
text


 >
 > Now when we reach out to the infrastructure team , they are
saying these
 > variables(pg_cluster_log_statement,pg_instance_log_statement) were

Where are those variables coming from? I can not find them in RDS or
Terraform docs.


  Thank You Adrian.

Actually I was trying to understand if the auto_explain can only work 
and help us see the slow sql statements in the log, only after we set 
the "log_statement" parameter to non default values (like all, mod, ddl)?


And what is the exact threat with logging these queries, and I 


log_statement = 'mod'

create role pwd_test with password 'test';
CREATE ROLE

tail -f /var/log/postgresql/postgresql-16-main.log

<...>
2024-04-21 09:04:17.746 PDT [9664] postgres@test LOG:  statement: create 
role pwd_test with password 'test';


think I got the point; as you mentioned, having access to the database 
itself lets someone see the object details. However, do you agree 
that in the case of RDS, logs are available through different mediums like 
CloudWatch, the Datadog agent etc., so that may pose additional threats 


Aah, the joys of managed services where you have to check even more 
layers when building out your security.  Logging itself is not the 
issue, who has access to the logs is. The more access points the more 
difficult that gets. Dealing with this is going to require a system wide 
review by all parties and coming up with an agreed upon access policy 
that balances security with the need to monitor what is happening in the 
database. Otherwise troubleshooting issues will be a long drawn out 
process which in itself could end up being a security issue.



because maybe some person doesn't have access to the database directly 
but still has permission to see the logs, so appropriate access 
control needs to be put in place?


And additionally I was trying to execute "SELECT 
set_config('log_statement', 'all', true);" but it says "permission 
denied to set parameter "log_statement"". So it might need a higher 
privileged user to run it.
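
On PostgreSQL 15 this can be delegated without handing out superuser: a sufficiently privileged role can grant the SET privilege on the specific parameter (the role name here is hypothetical, and RDS's managed-service restrictions still apply):

GRANT SET ON PARAMETER log_statement TO app_user;
-- app_user can then enable it for its own session only:
SET log_statement = 'all';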


To answer your question on the variable those we have on the 
terraform module, the terraform module is customized by the database 
infra team so that might be why we are seeing those there which may not 
be exactly the same as its showing in RDS docs for postgres.


https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/USER_LogAccess.Concepts.PostgreSQL.html
 
<https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/USER_LogAccess.Concepts.PostgreSQL.html>



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Logging statement having any threat?

2024-04-20 Thread Adrian Klaver

On 4/20/24 07:02, Lok P wrote:


Now when we reach out to the infrastructure team, they are saying these 
variables (pg_cluster_log_statement, pg_instance_log_statement) were 
removed due to a potential security threat. So I want to understand from 
experts here how this is really a security threat, and if there is any 
option to get this logging enabled (which will help us debug performance 
issues) while at the same time addressing the threat too?


I should have added to the previous post: if you have access to the 
database, the security wall has already been breached.




Regards
Lok


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Logging statement having any threat?

2024-04-20 Thread Adrian Klaver

On 4/20/24 07:02, Lok P wrote:

Hello All,
It's postgres version 15.4 and it's RDS, in which our dev team gets the 
infrastructure code from another third party team which provides us base 
infrastructure code to build a postgres database, in which we will be 
able to change DB parameter values etc., whatever is mentioned in the 
file with possible values. But surprisingly we don't see log_statement 
there. Below was our requirement:


For debugging and evaluating performance we have pg_stat_statements, but 
it contains aggregated information about all the query executions. In 
case we just want to debug point-in-time issues where a selected few 
queries were performing badly (maybe because of a plan change), we were 
planning to have the auto_explain extension added and set its 
log_min_duration to ~5 seconds, so that all the queries going above that 
time period (5 seconds) will be logged and provide detailed information 
on the exact point of the bottleneck. But we see the log_statement 
parameter has been removed from the base infrastructure script/terraform 
script given by the database team here, so that means we will get it as 
default which is "NONE", which means no statement (SELECT/DML/DDL etc.) 
can be logged.


Have you tried?:

https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT

"
log_statement (enum)

  <...>

The default is none. Only superusers and users with the appropriate SET 
privilege can change this setting.

"

Or

https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-SET

set_config ( setting_name text, new_value text, is_local boolean ) → text




Now when we reach out to the infrastructure team, they are saying these 
variables (pg_cluster_log_statement, pg_instance_log_statement) were 


Where are those variables coming from? I can not find them in RDS or 
Terraform docs.


removed due to a potential security threat. So I want to understand from 
experts here how this is really a security threat, and if there is any 
option to get this logging enabled (which will help us debug performance 
issues) while at the same time addressing the threat too?


Regards
Lok


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: [help] Error in database import

2024-04-19 Thread Adrian Klaver

On 4/19/24 20:58, Tu Ho wrote:

Hi,

I currently have a project where I need to combine 2 large databases. I 
was, however, not able to import an Excel file (.csv) into the 


There are Excel files(.xls/.xlsx) and there are *.csv files.

database. The error was ERROR: syntax error at or near "OIDS" LINE 1: 
...ing Site" , "International Name(s)" ) FROM STDIN OIDS DELI...

^"

I have no idea how to fix this because I used the "upload file" 
option. What should I do?


Use a recent version of whatever client you are using as:

https://www.postgresql.org/docs/current/sql-copy.html

"If 1, OIDs are included in the data; if 0, not. Oid system columns are 
not supported in PostgreSQL anymore, but the format still contains the 
indicator."


Your client is using old syntax as the last version of Postgres that 
allowed  FROM STDIN OIDS was:


https://www.postgresql.org/docs/11/sql-copy.html

"where option can be one of:

FORMAT format_name
OIDS [ boolean ]
"

And Postgres 11 is ~6 months past EOL.
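
With a current client/server the generated command would look more like 
this (sketch only; the table name is hypothetical and the option list is 
an assumption, not what your tool actually emits):

COPY my_table ("Breeding Site", "International Name(s)")
    FROM STDIN WITH (FORMAT csv, DELIMITER ',');

i.e. no OIDS keyword at all.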

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Can you refresh a materialized view from a materialized view on another server?

2024-04-19 Thread Adrian Klaver

On 4/18/24 19:49, Michael Nolan wrote:

My production server has a materialized view that is refreshed from a
mysql_fdw several times a day.

What I'd like to be able to do is refresh the materialized view on the
testbed server from the one on the production server so that they are
the same.  (Refreshing it from the MySQL server will result in one
that has records that have been added or updated rather than an exact
copy of the one on the production server.)

Use postgres_fdw  to connect the  test bed to the production server?
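
A rough sketch of that idea (untested; server name, credentials and 
column list are hypothetical placeholders):

CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER prod_srv FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'prod.example.com', dbname 'proddb');

CREATE USER MAPPING FOR CURRENT_USER SERVER prod_srv
    OPTIONS (user 'readonly', password 'secret');

-- foreign table pointing at the production materialized view
CREATE FOREIGN TABLE prod_mv (id int, val text)
    SERVER prod_srv OPTIONS (schema_name 'public', table_name 'my_mv');

-- testbed materialized view defined over the foreign table
CREATE MATERIALIZED VIEW my_mv AS SELECT * FROM prod_mv;

-- refreshed whenever you want to resync with production
REFRESH MATERIALIZED VIEW my_mv;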



Mike Nolan
htf...@gmail.com




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Not able to grant access on pg_signal_backend on azure flexible server

2024-04-19 Thread Adrian Klaver

On 4/18/24 22:48, Saksham Joshi wrote:

Hi,
We have created an Azure PostgreSQL flexible server, added an AD admin as 
a user, and created our database using this admin user. However, when 
running the command 'GRANT pg_signal_backend TO adminUser' we are getting 
an error that says 'permission denied to grant role "pg_signal_backend"'. 
This is strange since the admin user is in fact the owner of the said 
database and we don't have any other user that has the said privileges.



https://www.postgresql.org/docs/current/predefined-roles.html

"PostgreSQL provides a set of predefined roles that provide access to 
certain, commonly needed, privileged capabilities and information. 
Administrators (including roles that have the CREATEROLE privilege) can 
GRANT these roles to users and/or other roles in their environment, 
providing those users with access to the specified capabilities and 
information."





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Question on trigger

2024-04-16 Thread Adrian Klaver

On 4/16/24 12:39, veem v wrote:



On Sat, 13 Apr 2024 at 21:44, Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:


On 4/13/24 00:03, veem v wrote:
 > Thank you Adrian.
 >
 > So it seems the heavy DML tables will see an impact if having
triggers
 > (mainly for each row trigger) created on them.
 >
 > And also the bulk DML/array based insert (which inserts multiple
rows in
 > one short or one batch) , in those cases it seems the trigger
will not
 > make that happen as it will force it to make it happen row by
row, as
 > the trigger is row based. Will test anyway though.

You said you have triggers in the Oracle database and I assumed they
worked and were not a show-stopping issue there. What makes you think
that would be different in Postgres?

What type of triggers were there in Oracle, per row, per statement
or a
mix?


Actually we have row level triggers  in oracle which are running for 
smaller volume DML and are making the direct path inserts to happen in 
conventional row by row insert, in presence of trigger. So was wondering 


Not sure what the above means, you will need to provide a more detailed 
description. Though any DML you are doing on a table that has any sort of 
constraint, index, trigger, foreign key, default values, etc. is going to 
have more overhead than on an unencumbered table. FYI, some of the 
preceding are system triggers, for example foreign keys.


if it is postgres we will be encountering a similar issue and batch 
inserts may be converted back to row by row automatically. And here we 
are going to process higher volume DMLs in the postgresql database.




Hard to say with the information provided. The easiest way to find out is 
to create a test setup and run the code. Though I guess, as I have not 
actually tried this, you could have a per-row trigger and a per-statement 
trigger for the same action, then disable the per-row and enable the 
per-statement trigger for batch operations. Once the batch operation is 
done, reverse the process. Again, something to test and verify.
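
Something along these lines (untested; table and trigger names are 
hypothetical):

-- before the batch load
ALTER TABLE tab DISABLE TRIGGER tab_aud_row_trg;
ALTER TABLE tab ENABLE TRIGGER tab_aud_stmt_trg;

-- ... run the bulk INSERT/UPDATE ...

-- afterwards, back to normal row-level auditing
ALTER TABLE tab ENABLE TRIGGER tab_aud_row_trg;
ALTER TABLE tab DISABLE TRIGGER tab_aud_stmt_trg;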



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: constant crashing

2024-04-14 Thread Adrian Klaver

On 4/14/24 14:50, jack wrote:

Reply to list also
Ccing list


Hello,
I am not sure what "locale" means.


Go to the settings App for whatever version of Windows you are on and 
search for locale.



The Windows app is an inhouse application which uses Actian-Zen SQL.
The data is exported to simple ASCII in a tab delimited format similar to CSV.


And you know it is ASCII for a fact?


Those files are then imported into the PostgreSQL table using COPY.
Importing the data is not an issue.
I am able to load all the data without any problems, even into 1 table which 
ends up with about 1.2 billion records.
But when I try to update the data in that table I get many errors, essentially 
crashes.


Repeating what has been asked and answered is not really going anywhere.


There may be some control characters (garbage) in the data but that should not 
crash postgresql, especially if it can import the data without issues.


Unless it does. That is the point of the questions, getting to what is 
actually causing the issue. Until the problem can be boiled down to a 
reproducible test case there really is not much hope of anything more 
than the 'yes you have a problem' answer. And there is a difference 
between dumping data into a table and then doing an UPDATE where the 
data strings are manipulated by functions.



Anyway, I hope I answered your questions.
Thanks for your help.



On Sunday, April 14th, 2024 at 4:28 PM, Adrian Klaver 
 wrote:




On 4/14/24 13:18, jack wrote:


The CSV files are being produced by another system, a Windows app on a
Windows machine. I then copy them to a USB key and copy them onto the
ubuntu machine. The data is then imported via the COPY command.



The app?

The locale in use on the Windows machine?

The locale in use in the database?


COPY master (field01,field02..fieldX) FROM '/data/file.text' DELIMITER E'\t'
The fields are tab delimited.

But importing the data works. I can get all the data into a single table
without any problems. The issue is only when I start to update the
single table. And that is why I started using smaller temporary tables
for each CSV file, to do the updates in the smaller tables before I move
them all to a single large table.



The import is just dumping the data in, my suspicion is the problem is
related to using string functions on the data.


After all the data is loaded and updated, I run php programs on the
large table to generate reports. All of which works well EXCEPT for
performing the updates on the data. And I do not want to use perl or any
outside tool. I want it all one in SQL because I am required to document
all my steps so that someone else can take over, so everything needs to
be as simple as possible.



--
Adrian Klaver
adrian.kla...@aklaver.com


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: constant crashing

2024-04-14 Thread Adrian Klaver

On 4/14/24 13:18, jack wrote:
The CSV files are being produced by another system, a Windows app on a 
Windows machine. I then copy them to a USB key and copy them onto the 
ubuntu machine. The data is then imported via the COPY command.


The app?

The locale in use on the Windows machine?

The locale in use in the database?



COPY master (field01,field02..fieldX) FROM '/data/file.text' DELIMITER E'\t'
The fields are tab delimited.

But importing the data works. I can get all the data into a single table 
without any problems. The issue is only when I start to update the 
single table. And that is why I started using smaller temporary tables 
for each CSV file, to do the updates in the smaller tables before I move 
them all to a single large table.


The import is just dumping the data in, my suspicion is the problem is 
related to using string functions on the data.




After all the data is loaded and updated, I run php programs on the 
large table to generate reports. All of which works well EXCEPT for 
performing the updates on the data. And I do not want to use perl or any 
outside tool. I want it all one in SQL because I am required to document 
all my steps so that someone else can take over, so everything needs to 
be as simple as possible.




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: constant crashing

2024-04-14 Thread Adrian Klaver

On 4/14/24 12:22, jack wrote:

Here is an excerpt of /var/log/postgresql/postgresql-16-main.log


Where and how are the CSV files being produced?

What is the database locale?

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: constant crashing

2024-04-14 Thread Adrian Klaver

On 4/14/24 09:20, jack wrote:

The full error reads:
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
error: connection to server was lost


Look at the OS system log.



PostgreSQL 16.2

I also believe it is a resource issue which can be rectified with a 
setting, but which setting?

If you were updating 100 million records what settings would you adjust?

Here are the updates I am performing on the 100 million records:
UPDATE table SET category_modified = UPPER(category);
UPDATE table SET category_modified = 
REGEXP_REPLACE(REPLACE(REPLACE(category_modified, ,'-'), '`', '-'), 
'\s{2,}', ' ', 'g') WHERE  AND LENGTH(category_modified)>1 AND 
POSITION('--' IN category_modified)>0;

UPDATE table SET category_modified = REPLACE(category_modified,' ','-');
UPDATE table SET category_modified = CASE WHEN category_modified IS NOT 
NULL THEN regexp_replace(category_modified, '[^a-zA-Z]$', '') ELSE NULL END;
UPDATE table SET category_modified = regexp_replace(category_modified, 
'-{2,}', '-', 'g');
UPDATE table SET category_modified = SUBSTRING(category_modified FROM 1 
FOR LENGTH(category_modified) - 1) WHERE LENGTH(category_modified)>1 AND 
category_modified LIKE '%-';


Is the above all being done in one script/transaction?

Again what are the table definitions for the tables being copied into 
and/or modified?









--
Adrian Klaver
adrian.kla...@aklaver.com





Re: constant crashing

2024-04-14 Thread Adrian Klaver

On 4/14/24 07:24, jack wrote:

Hello,
I am trying to load about 1.4 billion records into a postgresql table.
The data is in about 100 individual CSV files which I load individually 
into the table using the COPY command.
After the data is loaded I perform updates to the data, and this is when 
I get errors. Some updates work, some crash.
I am using psql and an SQL script file (update.sql) to perform the 
updating (UPDATE table SET field=UPPER(field), etc.).


The errors are all "Server closed the connection unexpectedly"




I have been working on this for just over 1 year now, documenting every 
step, and I am still unable to get this to work without it crashing 
somewhere along the way.

I am beginning to wonder if postgreSQL is bi-polar.


More information needed, to start:

1) Complete example of the code in the script.

2) Table definition(s) of those being copied into.

3) Sample of the data being copied.

4) The error message(s) generated.

5) Database locale



Any help would be greatly appreciated.
Thank you


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Question on trigger

2024-04-13 Thread Adrian Klaver

On 4/13/24 00:03, veem v wrote:

Thank you Adrian.

So it seems the heavy DML tables will see an impact if having triggers 
(mainly for each row trigger) created on them.


And also for the bulk DML/array based insert (which inserts multiple rows 
in one shot or one batch), in those cases it seems the trigger will not 
let that happen, as it will force it to happen row by row, since the 
trigger is row based. Will test anyway though.


You said you have triggers in the Oracle database and I assumed they 
worked and were not a show-stopping issue there. What makes you think 
that would be different in Postgres?


What type of triggers were there in Oracle, per row, per statement or a 
mix?





On Thu, 11 Apr 2024 at 22:00, Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:


On 4/11/24 07:31, veem v wrote:
 > Hi, We used to use Oracle database in which we had audit
 > triggers(something as below) mandated for all tables by the control
 > team. Now we are going to use the postgresql 15.4 database for
one of
 > our applications. So,wanted to understand if there exists any
downside
 > of such audit trigger setup for all the tables? Will it impact
the bulk
 > data insert/update/delete OR slowdown of any of the DML operations
 > significantly (and thus will not be advisable to use for all
tables but
 > selected ones)?

Triggers are overhead in Postgres as they were in Oracle. If they
didn't cause an issue in Oracle I would suspect that would also be the
case in Postgres. To confirm you would need to create a test setup and
run some common operations and see what the overhead is.

Some potential performance improvements:

https://www.postgresql.org/docs/current/sql-createtrigger.html

"...a trigger that is marked FOR EACH STATEMENT only executes once for
any given operation, regardless of how many rows it modifies (in
particular, an operation that modifies zero rows will still result in
the execution of any applicable FOR EACH STATEMENT triggers)."

<...>

"The REFERENCING option enables collection of transition relations,
which are row sets that include all of the rows inserted, deleted, or
modified by the current SQL statement. This feature lets the trigger
see
a global view of what the statement did, not just one row at a time.
This option is only allowed for an AFTER trigger that is not a
constraint trigger; also, if the trigger is an UPDATE trigger, it must
not specify a column_name list. OLD TABLE may only be specified once,
and only for a trigger that can fire on UPDATE or DELETE; it creates a
transition relation containing the before-images of all rows updated or
deleted by the statement. Similarly, NEW TABLE may only be specified
once, and only for a trigger that can fire on UPDATE or INSERT; it
creates a transition relation containing the after-images of all rows
updated or inserted by the statement."


As example:

https://www.postgresql.org/docs/current/plpgsql-trigger.html

Example 43.7. Auditing with Transition Tables

 >
 > CREATE OR REPLACE TRIGGER TAB_AUD_TRG
 >    BEFORE DELETE OR INSERT OR UPDATE
 >    ON tab
 >    FOR EACH ROW
 > BEGIN
 >        IF inserting THEN
 >          :NEW.create_timestamp := systimestamp;
 >          :NEW.create_userid  :=
sys_context('USERENV','SESSION_USER');
 >          :NEW.update_timestamp := systimestamp;
 >          :NEW.update_userid := sys_context('USERENV','SESSION_USER');
 >        ELSIF updating THEN
 >          IF  updating('create_userid') OR
updating('create_timestamp') THEN
 >              :new.create_userid   := :old.create_userid;
 >              :new.create_timestamp  := :old.create_timestamp;
 >          END IF;
 >          :NEW.update_timestamp := systimestamp;
 >          :NEW.update_userid := sys_context('USERENV','SESSION_USER');
 >        END IF;
 >    END;
 > /
 >
     > Regards
 > Veem

-- 
Adrian Klaver

adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Question on trigger

2024-04-11 Thread Adrian Klaver

On 4/11/24 07:31, veem v wrote:
Hi, we used to use an Oracle database in which we had audit 
triggers (something as below) mandated for all tables by the control 
team. Now we are going to use the postgresql 15.4 database for one of 
our applications. So, we wanted to understand if there exists any 
downside to such an audit trigger setup for all the tables? Will it 
impact bulk data insert/update/delete OR slow down any of the DML 
operations significantly (and thus not be advisable to use for all 
tables, only selected ones)?


Triggers are overhead in Postgres as they were in Oracle. If they 
didn't cause an issue in Oracle I would suspect that would also be the 
case in Postgres. To confirm you would need to create a test setup and 
run some common operations and see what the overhead is.


Some potential performance improvements:

https://www.postgresql.org/docs/current/sql-createtrigger.html

"...a trigger that is marked FOR EACH STATEMENT only executes once for 
any given operation, regardless of how many rows it modifies (in 
particular, an operation that modifies zero rows will still result in 
the execution of any applicable FOR EACH STATEMENT triggers)."


<...>

"The REFERENCING option enables collection of transition relations, 
which are row sets that include all of the rows inserted, deleted, or 
modified by the current SQL statement. This feature lets the trigger see 
a global view of what the statement did, not just one row at a time. 
This option is only allowed for an AFTER trigger that is not a 
constraint trigger; also, if the trigger is an UPDATE trigger, it must 
not specify a column_name list. OLD TABLE may only be specified once, 
and only for a trigger that can fire on UPDATE or DELETE; it creates a 
transition relation containing the before-images of all rows updated or 
deleted by the statement. Similarly, NEW TABLE may only be specified 
once, and only for a trigger that can fire on UPDATE or INSERT; it 
creates a transition relation containing the after-images of all rows 
updated or inserted by the statement."



As example:

https://www.postgresql.org/docs/current/plpgsql-trigger.html

Example 43.7. Auditing with Transition Tables
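
A minimal sketch in that direction (untested; the audit table, column 
names and trigger names are hypothetical, not your actual schema):

CREATE OR REPLACE FUNCTION tab_audit_stmt() RETURNS trigger
LANGUAGE plpgsql AS
$$
BEGIN
    -- log every inserted row in one pass, once per statement
    INSERT INTO tab_audit (id, create_userid, create_timestamp)
    SELECT n.id, session_user, now()
    FROM new_rows AS n;
    RETURN NULL;
END;
$$;

CREATE TRIGGER tab_aud_stmt_trg
    AFTER INSERT ON tab
    REFERENCING NEW TABLE AS new_rows
    FOR EACH STATEMENT
    EXECUTE FUNCTION tab_audit_stmt();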



CREATE OR REPLACE TRIGGER TAB_AUD_TRG
   BEFORE DELETE OR INSERT OR UPDATE
   ON tab
   FOR EACH ROW
BEGIN
       IF inserting THEN
         :NEW.create_timestamp := systimestamp;
         :NEW.create_userid  := sys_context('USERENV','SESSION_USER');
         :NEW.update_timestamp := systimestamp;
         :NEW.update_userid := sys_context('USERENV','SESSION_USER');
       ELSIF updating THEN
         IF  updating('create_userid') OR updating('create_timestamp') THEN
             :new.create_userid   := :old.create_userid;
             :new.create_timestamp  := :old.create_timestamp;
         END IF;
         :NEW.update_timestamp := systimestamp;
         :NEW.update_userid := sys_context('USERENV','SESSION_USER');
       END IF;
   END;
/

Regards
Veem


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Failure of postgres_fdw because of TimeZone setting

2024-04-10 Thread Adrian Klaver




On 4/10/24 1:31 PM, Tom Lane wrote:

Adrian Klaver  writes:

On 4/10/24 12:38, Adnan Dautovic wrote:

By the way, the row count of pg_timezone_names is 385, but I do
not know how that compares to a more standard installation.



On my instance of Postgres 16.2, 1196.


You're probably using a build with --with-system-tzdata pointing
at a system tzdata tree that includes leap-second-aware zones.
These tend to have duplicative entries like "America/New_York"
and "posix/America/New_York".  (There's also a subtree like
"right/America/New_York", but we reject those because we don't
do leap seconds.)  The real number of distinct zones in a
standard tzdata file set these days is a shade under 600.


It's the PGDG package running on Ubuntu 22.04.



    regards, tom lane


--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Failure of postgres_fdw because of TimeZone setting

2024-04-10 Thread Adrian Klaver

On 4/10/24 12:38, Adnan Dautovic wrote:

Hi,

On 05. Apr 2024, at 16:13, Tom Lane  wrote:

Adnan Dautovic  writes:



By the way, the row count of pg_timezone_names is 385, but I do
not know how that compares to a more standard installation.


On my instance of Postgres 16.2, 1196.



Kind regards,

Adnan Dautovic




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Some advice need after a 20 year gap after Ingres/GUIs

2024-04-10 Thread Adrian Klaver

On 4/10/24 03:11, John Bateson wrote:

Good morning,

The end of my working life was taken up with developing and supporting a 
Scientific Research establishment near Warrington in the UK. I had a 
small team of programmers who did an excellent job for me and with me.


The software was Ingres and the main program, supporting user 
administration on a Synchrotron, was built under OpenROAD, with other 
developments around the on-site stores and finance using ABF, since the 
stores workers found this much faster than a GUI! There was also some 
web development using .Net practices.


This was a quite complex system in the end with nearly 200 tables.

We ended up using Ingres Replicator with the intention of running it 
from two sites 180 miles apart – at the time it was a pile of 
*notquitegoodenough*! This was early this century so Actian may have 
made some improvements since then…


So much for the background to establish that I am not a complete newbie, 
just out of the loop for a while.


*Using Postgres and PGAdmin -4*.

So, 20 years later I am developing (unpaid) a new project for some 
historic railways in the UK and linking these to the development and 
design of a range of kits for those interested in model railways. This 
is getting towards 20 tables so far.


What I really need is a recommendation for the current best practice 
for an easy GUI that will allow me to press a button without opening up 
PGAdmin or a black screen, i.e. the traditional .EXE file I can put on a 
button on the screen on a Windows 11 based system.


With Postgres or other similar client-server database there will be a 
need to set up the database server somewhere first, before you get to 
the GUI interface part.


Are you planning on a single instance of Postgres that folks log in to 
from multiple locations?


Or do want a stand alone setup that each user has on their machine?



While my programming history goes back to MDBS-4 and beyond I know I 
have some catching up to do and while not a complete newbie, need 
something I can work on quickly and intuitively and inexpensively!


So, recommendation and possibly some consensus would be very much 
appreciated.


And apologies if I have taken up too much of your time or have placed 
this in the wrong forum.


John



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Issue with date/timezone conversion function

2024-04-09 Thread Adrian Klaver

On 4/9/24 11:24, Lok P wrote:


On Tue, Apr 9, 2024 at 10:33 PM Tom Lane <mailto:t...@sss.pgh.pa.us>> wrote:




'EST' is going to rotate to UTC-5, but that's probably not what
you want in the summer.  I'd suggest AT TIME ZONE 'America/New_York'
or the like.  See

https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-TIMEZONES 

                         regards, tom lane



  Thank you so much. You are correct. The AT TIME ZONE 
'America/New_York' is giving correct EST time conversion.


But regarding why it looks to be shifting, i.e. the same time window 
appears to hold a different count of transactions even though the base 
table is not getting updated/inserted/deleted for its historical 
create_timestamps, I suspect the conversion below.

The task is to count transactions in each ~15-minute window and publish 
them in ordered fashion, i.e. something as below, but the way it's been 
written seems wrong. It's an existing script. It first gets the date 
component truncated to the hour and then adds the time component to make 
the ~15-minute interval. Can it be written in some simpler way?


9-apr-2024 14:00     12340
9-apr-2024 14:15     12312
9-apr-2024 14:30     12323
9-apr-2024 14:45     12304

DATE_TRUNC('hour', create_timestamp AT TIME ZONE 'America/New_York') +
(EXTRACT(MINUTE FROM create_timestamp AT TIME ZONE 'America/New_York') / 15 * 15) * INTERVAL '15 minute'


Something like?:

create table dt_bin_test(id integer, tz_fld timestamptz);

insert into dt_bin_test values(1, '2024-04-01 8:15'), (2, '2024-04-01 
9:01'), (3, '2024-04-01 9:16'), (4, '2024-04-01 9:45'), (5, '2024-04-01 
8:15'), (6, '2024-04-01 9:01');


select count(tz_fld), date_bin('15 minutes', tz_fld, '2024-01-01') as 
bin from dt_bin_test group by date_bin('15 minutes', tz_fld, '2024-01-01');


 count |  bin
---+
 2 | 2024-04-01 09:00:00-07
 2 | 2024-04-01 08:15:00-07
 1 | 2024-04-01 09:15:00-07
 1 | 2024-04-01 09:45:00-07
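
Binning directly in the target zone could then look roughly like this 
(untested sketch, reusing the table/column names from your description):

select count(*),
       date_bin('15 minutes', create_timestamp,
                timestamptz '2024-01-01 00:00 America/New_York')
           at time zone 'America/New_York' as bin_est
from transaction_tab
group by 2
order by 2;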



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Issue with date/timezone conversion function

2024-04-09 Thread Adrian Klaver




On 4/9/24 9:16 AM, Lok P wrote:


On Tue, Apr 9, 2024 at 9:26 PM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:


On 4/9/24 08:43, Lok P wrote:
 > Hi All,
 > It's version 15.4 of postgresql database. Every "date/time" data
type
 > attribute gets stored in the database in UTC timezone only. One
of the
 > support persons local timezone is "asia/kolkata" and  that support
 > person needs to fetch the count of transactions from a table-
 > transaction_tab and share it with another person/customer who is
in the
 > EST timezone, so basically the transaction has to be shown or
displayed
 > the EST timezone.

What is the datatype for the create_timestamp?

What does SHOW timezone; return on the server?



Thank you for the quick response.

The column data type for "create_timestamp" is "timestamptz'.
Show timezone from the support users client machine UI showing  
"Asia/Calcutta".
Not having access to run "Show timezone" on the server currently, I will 
try to get it.
output from pg_setting showing setting as "Asia/Calcutta', reset_val as 
"Asia/Calcutta", boot_val as "GMT"


In the pg_settings query what are the source, sourcefile, sourceline 
fields set to?




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Issue with date/timezone conversion function

2024-04-09 Thread Adrian Klaver

On 4/9/24 08:43, Lok P wrote:

Hi All,
It's version 15.4 of postgresql database. Every "date/time" data type 
attribute gets stored in the database in UTC timezone only. One of the 
support persons local timezone is "asia/kolkata" and  that support 
person needs to fetch the count of transactions from a table- 
transaction_tab and share it with another person/customer who is in the 
EST timezone, so basically the transaction has to be shown or displayed 
the EST timezone.


What is the datatype for the create_timestamp?

What does SHOW timezone; return on the server?



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: [Code: 0, SQL State: 0A000] when "typing" from pg_catalog

2024-04-09 Thread Adrian Klaver

On 4/9/24 08:12, Thiemo Kellner wrote:

Thanks for taking this up.

Am 09.04.2024 um 17:09 schrieb Adrian Klaver:

On 4/9/24 07:59, Thiemo Kellner wrote:
[Code: 0, SQL State: 0A000] ERROR: References to other databases are 
not implemented: pg_catalog.pg_roles.rolname

    Position: 298 [Script position: 334 - 361]

PG_CATALOG.PG_ROLES.ROLNAME := 'snowrunner_reader';

Is being seen as DB_NAME.TABLE_NAME.COLUMN_NAME.


Yes, obviously, but why? With the information_schema view all is fine. 
And, I suppose, with all other objects in other schemas of the same 
database too.


Because you did not do?:

PG_CATALOG.PG_ROLES.ROLNAME%type
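
In other words, the declaration would read something like (untested):

    C_ROLE_NAME constant PG_CATALOG.PG_ROLES.ROLNAME%type :=
        'snowrunner_reader';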
--
Adrian Klaver
adrian.kla...@aklaver.com





Re: [Code: 0, SQL State: 0A000] when "typing" from pg_catalog

2024-04-09 Thread Adrian Klaver

On 4/9/24 07:59, Thiemo Kellner wrote:

Hi

I have the following function code. When trying to install, it gives me

[Code: 0, SQL State: 0A000]  FEHLER: Verweise auf andere Datenbanken 
sind nicht implementiert: pg_catalog.pg_roles.rolname

   Position: 298  [Script position: 334 - 361]


[Code: 0, SQL State: 0A000] ERROR: References to other databases are not 
implemented: pg_catalog.pg_roles.rolname

   Position: 298 [Script position: 334 - 361]


To the best of my knowledge, pg_catalog is a schema not a database, like 
information_schema. Am I missing something? And why is it not allowed to 
type from the catalogue?


I presume, this example is rather academic due to the name type.


PG_CATALOG.PG_ROLES.ROLNAME := 'snowrunner_reader';

Is being seen as DB_NAME.TABLE_NAME.COLUMN_NAME.





Kind regards

Thiemo


create or replace function GRANT_SELECTS()
returns void
language plpgsql
as
$body$
     declare
     C_SCHEMA_NAME   constant 
INFORMATION_SCHEMA.SCHEMATA.SCHEMA_NAME%type :=

   'snowrunner';
--    C_ROLE_NAME constant    name :=
     C_ROLE_NAME constant    PG_CATALOG.PG_ROLES.ROLNAME :=
   'snowrunner_reader';
     V_SQL_STATEMENT text;
     begin
     -- Check the existance of the schema
     perform 1
     from INFORMATION_SCHEMA.SCHEMATA
     where SCHEMA_NAME = C_SCHEMA_NAME;
     if not found then
     raise exception 'Schema "%s" does not exist!', C_SCHEMA_NAME;
     end if;

     -- Check the existance of the role
     perform 1
     from PG_CATALOG.PG_ROLES
     where ROLNAME = C_ROLE_NAME;
     if not found then
     raise exception 'Role "%s" does not exist!', C_ROLE_NAME;
     end if;

     -- Issue grants
     V_SQL_STATEMENT := format('grant select on all tables in schema 
%i to %i', C_SCHEMA_NAME, C_ROLE_NAME);

     raise info '%', V_SQL_STATEMENT;
     execute V_SQL_STATEMENT;
     V_SQL_STATEMENT := format('grant select on all views in schema 
%i to %i', C_SCHEMA_NAME, C_ROLE_NAME);

     raise info '%', V_SQL_STATEMENT;
     execute V_SQL_STATEMENT;
     V_SQL_STATEMENT := format('grant select on all materialized 
views in schema %i to %i', C_SCHEMA_NAME, C_ROLE_NAME);

     raise info '%', V_SQL_STATEMENT;
     execute V_SQL_STATEMENT;
     commit;

     return;
     end;
$body$;




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Regarding: Replication of TRUNCATE commands is not working

2024-04-09 Thread Adrian Klaver

On 4/9/24 05:07, Arvind Raghuwanshi wrote:

Hi Laurenz,
Thanks for the response
Question: What PostgreSQL version are you using?  The feature was 
introduced in v11.

Answer: I am using the 16.0 Postgresql version.
db1=> SELECT version();
                                                version
-
  PostgreSQL 16.0 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 
7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit

(1 row)

Question: How exactly is the publication defined?  Perhaps TRUNCATE is 
excluded.
I am not using the subscribe/publication model, but I have created a 


To be clear you have not done CREATE PUBLICATION on the source machine, 
correct?


What is the rest of your configuration per:

https://www.postgresql.org/docs/current/logical-replication-config.html
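
For reference, when a publication is involved, truncate has to be among 
its published actions, e.g. (sketch only, publication name hypothetical):

CREATE PUBLICATION mypub FOR ALL TABLES
    WITH (publish = 'insert, update, delete, truncate');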

Also what exactly is being TRUNCATEd?


replication slot on the source database and then i am calling below 
procedure to get the details from replication slot:
select data from pg_logical_slot_peek_changes('db1_slot', NULL, NULL, 
'pretty-print', '1');


In the case of TRUNCATE, the above procedure does not show any records; 
however, it does show insert, update and delete events.


As you mentioned TRUNCATE is excluded, is there any way to exclude 
TRUNCATE ?


Thanks
Arvind

On Tue, Apr 9, 2024 at 4:08 PM Laurenz Albe <mailto:laurenz.a...@cybertec.at>> wrote:


On Tue, 2024-04-09 at 14:50 +0530, Arvind Raghuwanshi wrote:
 > I have tried to run the TRUNCATE command  and found out that it's
not getting replicated using logical replication for pgsql.
 > I have also checked the schema change using pg_dump command but
the schema change also not getting detected for TRUNCATE command.
 >
 > However on pgsql logical replication doc page[1] , it's mentioned
that Replication of TRUNCATE commands is supported.
 >
 > [1] PostgreSQL: Documentation: 16: 31.6. Restrictions
 >
 > Any idea how we can solve this?

What PostgreSQL version are you using?  The feature was introduced
in v11.
How exactly is the publication defined?  Perhaps TRUNCATE is excluded.

Yours,
Laurenz Albe



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: PEM install error

2024-04-08 Thread Adrian Klaver

On 4/8/24 07:24, mark bradley wrote:
While installing PostgreSQL I am getting this error message during the 
PEM server portion.  Do I need the PEM server?  If so, what is the solution?




PEM Server is an EDB product:

https://www.enterprisedb.com/docs/pem/latest/managing_pem_server/

You should reach out to their tech support.



Thanks!



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Moving delta data faster

2024-04-06 Thread Adrian Klaver

On 4/6/24 13:04, yudhi s wrote:


On Sat, Apr 6, 2024 at 10:25 PM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:



Your original problem description was:

"Then subsequently these rows will be inserted/updated based on the
delta number of rows that got inserted/updated in the source database.
In some cases these changed data can flow multiple times per day to the
downstream i.e. postgres database and in other cases once daily."

If the above is not a hard rule, then yes up to some point just
replacing the data in mass would be the simplest/fastest method. You
could cut a step out by doing something like TRUNCATE target_tab and
then COPY target_tab FROM 'source.csv' bypassing the INSERT INTO
source_tab.

Yes, actually I didn't realize that truncate table is transactional/online 
here in postgres. In other databases like Oracle it means downtime for the 
read queries on the target table, as the data vanishes from the target 
table post truncate (until the data load happens) and those statements 
auto commit. Thanks Veem for sharing that option.


I also think that truncate will be faster if the changes/delta is 
large, but if it's a handful of rows, like <5% of the rows in the table, 
then Upsert/Merge will perform better. And the downside of the 
truncate option is that it requires exporting all the data from the 
source to the S3 file, which may take longer than bringing just 
the delta records. Correct me if I'm wrong.


Since you still have not specified how the data is stored in S3 and how 
you propose to move them into Postgres I can't really answer.




However I am still not able to understand why the upsert is less 
performant than merge, could you throw some light on this please?




I have no idea how this works in the code, but my suspicion is it is due
to the following:

https://www.postgresql.org/docs/current/sql-insert.html#SQL-ON-CONFLICT

"The optional ON CONFLICT clause specifies an alternative action to 
raising a unique violation or exclusion constraint violation error. For 
each individual row proposed for insertion, either the insertion 
proceeds, or, if an arbiter constraint or index specified by 
conflict_target is violated, the alternative conflict_action is taken. 
ON CONFLICT DO NOTHING simply avoids inserting a row as its alternative 
action. ON CONFLICT DO UPDATE updates the existing row that conflicts 
with the row proposed for insertion as its alternative action."


vs this:

"First, the MERGE command performs a join from data_source to 
target_table_name producing zero or more candidate change rows. For each 
candidate change row, the status of MATCHED or NOT MATCHED is set just 
once, after which WHEN clauses are evaluated in the order specified. For 
each candidate change row, the first clause to evaluate as true is 
executed. No more than one WHEN clause is executed for any candidate 
change row."


Where ON CONFLICT attempts the INSERT then on failure does the UPDATE 
for the ON CONFLICT DO UPDATE case. MERGE on the other hand evaluates 
based on the join condition (ON tbl1.fld = tbl2.fld) and then, based on 
MATCHED/NOT MATCHED, takes the appropriate action for the first matching 
WHEN clause. In other words it goes directly to the appropriate action.
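
Roughly, the two forms look like this (sketch only; table and column 
names are hypothetical):

-- upsert form
INSERT INTO target_tab AS t (id, val)
SELECT id, val FROM stage_tab
ON CONFLICT (id) DO UPDATE SET val = EXCLUDED.val;

-- MERGE form (PostgreSQL 15+)
MERGE INTO target_tab t
USING stage_tab s ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET val = s.val
WHEN NOT MATCHED THEN
    INSERT (id, val) VALUES (s.id, s.val);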


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Moving delta data faster

2024-04-06 Thread Adrian Klaver

On 4/6/24 08:47, yudhi s wrote:

Thank you Adrian, Greg and Veem.

I tried writing a small routine to see how the performance differs in 
these four approaches i.e. Upsert VS traditional update+insert VS Merge 
vs Truncate+load.


Initially I was thinking Upsert would perform the same as Merge as the 
logic looks similar, but it seems it's the worst performing among all; 
not sure why, yet to find the reason. Truncate+load seems to be the best 
performing among all. Hope I am doing it correctly. Please correct me if 
I'm wrong.


Your original problem description was:

"Then subsequently these rows will be inserted/updated based on the 
delta number of rows that got inserted/updated in the source database. 
In some cases these changed data can flow multiple times per day to the 
downstream i.e. postgres database and in other cases once daily."


If the above is not a hard rule, then yes up to some point just 
replacing the data in mass would be the simplest/fastest method. You 
could cut a step out by doing something like TRUNCATE target_tab and 
then COPY target_tab FROM 'source.csv' bypassing the INSERT INTO source_tab.
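
From psql that could be as simple as (sketch; file and table names are 
hypothetical, and the CSV options are an assumption):

BEGIN;
TRUNCATE target_tab;
\copy target_tab FROM 'source.csv' WITH (FORMAT csv)
COMMIT;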




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Failure of postgres_fdw because of TimeZone setting

2024-04-05 Thread Adrian Klaver

On 4/5/24 02:39, Adnan Dautovic wrote:

Dear Adrian,

Adrian Klaver  wrote:
Define 'read-only', especially as it applies to the privileges on the 
public schema.


I am not quite sure which information you are looking for
exactly. According to this [1], I ran the following query:

WITH "names"("name") AS (
   SELECT n.nspname AS "name"
     FROM pg_catalog.pg_namespace n
   WHERE n.nspname !~ '^pg_'
     AND n.nspname <> 'information_schema'
) SELECT "name",
   pg_catalog.has_schema_privilege(current_user, "name", 'CREATE') AS
"create",
   pg_catalog.has_schema_privilege(current_user, "name", 'USAGE') AS 
"usage"

     FROM "names";

And received the following result:

"name"    "create"    "usage"
"public"    true    true


Looks alright. The below is the issue.




Per Tom Lane's comments on timezone, log into the remote server and do:

SHOW timezone;

Europe/Berlin


SET timezone = 'etc/UTC';

ERROR: invalid value for parameter "TimeZone": "etc/UTC"
SQL state: 22023


SET timezone = 'UTC';

ERROR: invalid value for parameter "TimeZone": "UTC"
SQL state: 22023

However, this led me to [2] and I find the output very
interesting:

SELECT * FROM pg_timezone_names ORDER BY name;


The below is cut down from the actual output as there should be at least:

Europe/Berlin  CEST  02:00:00 t

present also?




"name"    "abbrev"    "utc_offset"    "is_dst"
"Turkey"    "+03"    "03:00:00"    false
"UCT"    "UCT"    "00:00:00"    false


Hmm I get:

UCT   UTC  00:00:00  f

could be version difference though.


"Universal"    "UTC"    "00:00:00"    false
"W-SU"    "MSK"    "03:00:00"    false



And then attempting

SET timezone = 'Universal';


SET
Query returned successfully in 100 msec.


Any ideas on how to proceed?


1) For the long term, contact whoever is in charge of the remote server 
and ask them what they have done with the timezones, why, and whether 
they can fix it.


2) In short term per the link from your first post and with no guarantees:

https://github.com/postgres/postgres/blob/936e3fa3787a51397280c1081587586e83c20399/contrib/postgres_fdw/connection.c#L677

In the source code change

do_sql_command(conn, "SET timezone = 'UTC'");

to

do_sql_command(conn, "SET timezone = 'Universal'");

As from the link: "Set remote timezone; this is basically just cosmetic"

Then recompile the extension.



Kind regards,

Adnan Dautovic


[1]: https://stackoverflow.com/a/36095257
[2]: https://stackoverflow.com/a/32009497



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Moving delta data faster

2024-04-04 Thread Adrian Klaver

On 4/4/24 13:42, yudhi s wrote:


On Thu, Apr 4, 2024 at 9:04 PM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:


On 4/3/24 22:24, yudhi s wrote:
 >
 > On Thu, Apr 4, 2024 at 10:16 AM Adrian Klaver wrote:

S3 is not a database. You will need to be more specific about '... then
from the S3 it will be picked and gets merged to the target postgres
database.'


The data from S3 will be dumped into the stage table and then the 
upsert/merge from that table to the actual table.


The S3 --> staging table would be helped by having the data as CSV and 
then using COPY. The staging --> final table step could be done as 
either ON CONFLICT or MERGE, you would need to test in your situation to 
verify which works better.


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Moving delta data faster

2024-04-04 Thread Adrian Klaver

On 4/3/24 22:24, yudhi s wrote:


On Thu, Apr 4, 2024 at 10:16 AM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:


On 4/3/24 20:54, yudhi s wrote:
 > On Thu, Apr 4, 2024 at 2:41 AM Adrian Klaver wrote:
 >
 >    > Thank you Adrian.
 >
 > And one thing i forgot to mention this target postgresql database
would
 > be on AWS RDS whereas the source Oracle databases is on premise.
I think
 > we don't have the FDW extension currently in place but we can get
that.
 > I am just not able to understand clearly  though, but do you mean
export
 > the data from source using CSV and do truncate and import on
target. And
 > as these data will be moved through the network won't that cause
slowness?
 >
 > The source database here is Oracle database. Correct me if wrong, it
 > looks like foreign data wrapper is like a DB link. Or do you mean
 > writing a query on the target database (which can be UPSERT or
MERGE)
 > but will be joining the table from the source database through the
 > DBlink/DDW? But my question was whether we should use UPSERT or
MERGE
 > for comparing and loading the delta records to the target postgresql
 > database. Want to understand which is more performant , as I see
in the
 > past Merge having performance issues in the past, but not very sure
 > about that.

My motivation was to get some basic information about your setup and
what you are trying to achieve.

If I understand correctly you have:

1) An Oracle database with tables that you want to copy the complete
data from to a Postgres database. For this sort of thing
COPY (https://www.postgresql.org/docs/current/sql-copy.html) on the
Postgres end using CSV data generated from the source is probably the
quickest bulk load method.

2) After the initial load you want to do follow up INSERT/UPDATEs based
on a delta of the source tables relative to the initial load. This is
still a bit of mystery to me. How are determining the delta: a) On the
source end entirely or b) Target relative to source? Also what is the
anticipated size of the delta per transfer?

Additional information needed:

1) Network distance between source and target?

2) Network capacity?

3) Expected load on both source and target servers from other
operations?


Thank you. Actually I was trying to understand how to cater for the delta 
load after the one-time load is done. The delta change in records is 
planned to be found based on the primary keys on the tables. If it finds 
the key it will update the record; if it does not find the key it will 
insert the row.


Basically the select query from the source database will fetch the data 
for a certain time interval (based on the latest update timestamp, or the 
create timestamp if available, or else a full dump) and put it on 
S3, and then from S3 it will be picked up and merged into the target 
postgres database. As upsert and merge both look similar, I was 
wondering what we should use here for loading the delta records?


S3 is not a database. You will need to be more specific about '... then 
from the S3 it will be picked and gets merged to the target postgres 
database.'


--
Adrian Klaver
adrian.kla...@aklaver.com




