Re: Need help to make space on my database

2024-04-29 Thread Adrian Klaver

On 4/29/24 08:51, Cocam' server wrote:

 > Did the above work for each table?
Yes, except for the biggest table

 > Have you done something like?:
 >
 > select relname, n_dead_tup from pg_stat_all_tables where relname =
 > '';

I hadn't thought of that, but it seems that some tables have dead tuples



You have something against providing actual numbers?

The point is that there is really nothing to be gained by doing VACUUM FULL 
if the dead tuples are only a small percentage of the table.
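
A quick sketch for checking that percentage from the statistics views 
(the counters are estimates, so treat the numbers as rough):

-- Rough dead-tuple percentage per table, largest dead-tuple counts first.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 2) AS dead_pct
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;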


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Need help to make space on my database

2024-04-29 Thread Adrian Klaver

On 4/29/24 08:04, Cocam' server wrote:

When replying use Reply All to include the mailing list
Ccing list


 > How much current free space do you have available on the disk?
as we speak, I only have 6 GB available on the machine running the server

 > Did you VACUUM FULL a table at a time or all of them at once?
I tried to run a VACUUM FULL. I also tried it on the biggest tables (200 
MB and up) but not on all of them.


Did the above work for each table?

Have you done something like?:

select relname, n_dead_tup from pg_stat_all_tables where relname = 
'';


to see if there are any dead tuples to clean out.

Or if you use the contrib extension pgstattuple:

https://www.postgresql.org/docs/current/pgstattuple.html

then:

SELECT * FROM pgstattuple('');

This returns something like:

-[ RECORD 1 ]--+
table_len  | 3940352
tuple_count| 4310
tuple_len  | 3755414
tuple_percent  | 95.31
dead_tuple_count   | 0
dead_tuple_len | 0
dead_tuple_percent | 0
free_space | 124060
free_percent   | 3.15




The two biggest are these:
  state_groups_state | 5475 MB
  event_json | 2328 MB

(I'd particularly like to make room on these two tables, which take up 
the most space)


By the way, excuse me if I make a few mistakes (especially when 
replying); this is the first time I've used Postgres community support 
directly.



On Mon, Apr 29, 2024 at 16:37, Adrian Klaver <adrian.kla...@aklaver.com> wrote:


On 4/29/24 07:33, Cocam' server wrote:

Please reply to list also
Ccing list

 > No, the aim is also to reallocate free space to the system for the other
 > tasks it performs. (That's why I said I'd like it returned to the OS)

You led with:

"I need help to make space on my database".

How much current free space do you have available on the disk?

Did you VACUUM FULL a table at a time or all of them at once?

What are the individual table sizes?

 >
 > On Mon, Apr 29, 2024 at 16:19, Adrian Klaver <adrian.kla...@aklaver.com> wrote:
 >
 >     On 4/29/24 06:45, Cocam' server wrote:
 >      > Hello.
 >      >
 >      > I need help to make space on my database. I have tables
that are
 >     several
 >      > GB in size. I used to use the VACUUM FULL VERBOSE command; but
 >     now, this
 >      > command is too greedy in free space to be used and I'm looking
 >     for a way
 >      > to make free space (given back to the OS)
 >      >
 >      > Thanks in advance to everyone who responds
 >
 >     Per
 >
 > https://www.postgresql.org/docs/current/sql-vacuum.html
 >
 >     "VACUUM reclaims storage occupied by dead tuples. In normal
PostgreSQL
 >     operation, tuples that are deleted or obsoleted by an update
are not
 >     physically removed from their table; they remain present
until a VACUUM
 >     is done. Therefore it's necessary to do VACUUM periodically,
especially
 >     on frequently-updated tables.
 >
 >     <...>
 >
 >     Plain VACUUM (without FULL) simply reclaims space and makes it
 >     available
 >     for re-use. This form of the command can operate in parallel with
 >     normal
 >     reading and writing of the table, as an exclusive lock is not
obtained.
 >     However, extra space is not returned to the operating system
(in most
 >     cases); it's just kept available for re-use within the same
    table.
 >     "
 >
 >     So a regular VACUUM should work if all you want to do is give the
 >     database the ability to recycle the vacuumed tuple space.
 >
 >     --
 >     Adrian Klaver
 > adrian.kla...@aklaver.com
 >

-- 
Adrian Klaver

adrian.kla...@aklaver.com



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Need help to make space on my database

2024-04-29 Thread Adrian Klaver

On 4/29/24 07:33, Cocam' server wrote:

Please reply to list also
Ccing list

No, the aim is also to reallocate free space to the system for the other 
tasks it performs. (That's why I said I'd like it returned to the OS)


You led with:

"I need help to make space on my database".

How much current free space do you have available on the disk?

Did you VACUUM FULL a table at a time or all of them at once?

What are the individual table sizes?
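
(As a sketch, one way to get that listing, with sizes including indexes 
and TOAST:)

SELECT relname,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;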



On Mon, Apr 29, 2024 at 16:19, Adrian Klaver <adrian.kla...@aklaver.com> wrote:


On 4/29/24 06:45, Cocam' server wrote:
 > Hello.
 >
 > I need help to make space on my database. I have tables that are
several
 > GB in size. I used to use the VACUUM FULL VERBOSE command; but
now, this
 > command is too greedy in free space to be used and I'm looking
for a way
 > to make free space (given back to the OS)
 >
 > Thanks in advance to everyone who responds

Per

https://www.postgresql.org/docs/current/sql-vacuum.html

"VACUUM reclaims storage occupied by dead tuples. In normal PostgreSQL
operation, tuples that are deleted or obsoleted by an update are not
physically removed from their table; they remain present until a VACUUM
is done. Therefore it's necessary to do VACUUM periodically, especially
on frequently-updated tables.

<...>

Plain VACUUM (without FULL) simply reclaims space and makes it
available
for re-use. This form of the command can operate in parallel with
normal
reading and writing of the table, as an exclusive lock is not obtained.
However, extra space is not returned to the operating system (in most
cases); it's just kept available for re-use within the same table.
"

So a regular VACUUM should work if all you want to do is give the
database the ability to recycle the vacuumed tuple space.

-- 
Adrian Klaver

adrian.kla...@aklaver.com



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Need help to make space on my database

2024-04-29 Thread Adrian Klaver

On 4/29/24 06:45, Cocam' server wrote:

Hello.

I need help to make space on my database. I have tables that are several 
GB in size. I used to use the VACUUM FULL VERBOSE command; but now, this 
command is too greedy in free space to be used and I'm looking for a way 
to make free space (given back to the OS)


Thanks in advance to everyone who responds


Per

https://www.postgresql.org/docs/current/sql-vacuum.html

"VACUUM reclaims storage occupied by dead tuples. In normal PostgreSQL 
operation, tuples that are deleted or obsoleted by an update are not 
physically removed from their table; they remain present until a VACUUM 
is done. Therefore it's necessary to do VACUUM periodically, especially 
on frequently-updated tables.


<...>

Plain VACUUM (without FULL) simply reclaims space and makes it available 
for re-use. This form of the command can operate in parallel with normal 
reading and writing of the table, as an exclusive lock is not obtained. 
However, extra space is not returned to the operating system (in most 
cases); it's just kept available for re-use within the same table.

"

So a regular VACUUM should work if all you want to do is give the 
database the ability to recycle the vacuumed tuple space.
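
A minimal sketch of that, using one of the table names mentioned later in 
the thread (plain VACUUM needs no significant extra disk space; the 
commented-out VACUUM FULL does, roughly the size of the table, and takes 
an exclusive lock):

-- Reclaims dead-tuple space for reuse within the table; does not shrink
-- the file on disk in most cases.
VACUUM (VERBOSE) state_groups_state;

-- Rewrites the table and returns space to the OS, at the cost of an
-- exclusive lock and temporary extra disk usage:
-- VACUUM FULL state_groups_state;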


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Query Discrepancy in Postgres HLL Test

2024-04-28 Thread Adrian Klaver

On 4/28/24 06:01, Ayush Vatsa wrote:

Hi PostgreSQL Community,
I'm currently delving into Postgres HLL (HyperLogLog) functionality and 
have encountered an unexpected behavior while executing queries from the 
"cumulative_add_sparse_edge.sql" regress test 
<https://github.com/citusdata/postgresql-hll/blob/master/sql/cumulative_add_sparse_edge.sql#L28-L36>. 
This particular test data file 
<https://github.com/citusdata/postgresql-hll/blob/master/sql/data/cumulative_add_sparse_edge.csv#L515-L516> 
involves three columns, with the last column representing an HLL 
(HyperLogLog) value derived from the previous HLL value and the current 
raw value.


Upon manual inspection of the query responsible for deriving the last 
row's HLL value, I noticed a discrepancy. When executing the query:

"""
-- '\x148B481002' is the second-to-last row's hll value
SELECT hll_add('\x148B481002.', hll_hashval(2561));
"""
instead of obtaining the expected value ('\x148B481002'), I 
received a different output, which is ('\x138b48000200410061008100a1 
').


My initial assumption is that this could potentially be attributed to a 
precision error. However, I'm reaching out to seek clarity on why this 
disparity is occurring and to explore potential strategies for 
mitigating it (as I want the behaviour to be consistent with the regress 
test file).


I would say your best option is to file an issue here:

https://github.com/citusdata/postgresql-hll/issues



Regards
Ayush Vatsa


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: What linux version to install ?

2024-04-25 Thread Adrian Klaver

On 4/25/24 00:46, Kashif Zeeshan wrote:



I worked with both CentOS/RHEL 7/8.


What does the above mean relative to the original question?

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Backup_Long Running

2024-04-24 Thread Adrian Klaver

On 4/24/24 00:03, jaya kumar wrote:

Hi Team,

The production database backup is running for very long hours. Is there 
any option to reduce the backup time? Kindly advise.


Hardware specifications?

Network specifications?

The actual pg_basebackup command used?

Server(s) user load?



DB size: 793 GB

We are taking pg_basebackup backup.

Backup File size=613G

Backup Running Duration: 8 Hours


--
Thanks & Regards,
Jayakumar.S
+91-9840864439.


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: issue with reading hostname

2024-04-22 Thread Adrian Klaver

On 4/22/24 14:54, Atul Kumar wrote:
I mean, once I change the hostname, how will the socket read the new 
hostname? Does it require a Postgres service restart?


The host name of the machine?





On Tue, Apr 23, 2024 at 3:19 AM Adrian Klaver <adrian.kla...@aklaver.com> wrote:


On 4/22/24 14:37, Atul Kumar wrote:
 > Can we edit the socket to change the hostname in it?

On Ubuntu 22.04 install, given:

srwxrwxrwx  1 postgres postgres    0 Apr 22 14:01 .s.PGSQL.5432=
-rw-------  1 postgres postgres   68 Apr 22 14:01 .s.PGSQL.5432.lock

The contents of .s.PGSQL.5432.lock (the file that indicates a Postgres
instance has a lock on the socket) are:

862
/var/lib/postgresql/15/main
1713795311
5432
/var/run/postgresql

There is no hostname to be changed as you are working with a local
socket.

 >
 > Regards.
 >
 > On Tue, Apr 23, 2024 at 2:41 AM Ron Johnson <ronljohnso...@gmail.com> wrote:
 >
 >     On Mon, Apr 22, 2024 at 4:14 PM Atul Kumar <akumar14...@gmail.com> wrote:
 >
 >         Hi,
 >
 >         I have postgresql  version 15 running on centos7.
 >
 >         I have below query that reads hostname from /tmp directory:
 >
 >         psql postgres -A -t -p 5432 -h /tmp/ -c 'SELECT
 >         pg_is_in_recovery();'
 >
 >
 >     If you installed from the PGDG repository (possibly also the
CENTOS
 >     repos, but I'm not sure), then the domain socket also lives in :
 >     /var/run/postgresql
 >
 >     * I find that more expressive than /tmp.
 >     * No need to specify the host when using sockets.
 >     * Using a socket name makes parameterizing the hostname easier in
 >     scripts.
 >
 >

-- 
Adrian Klaver

adrian.kla...@aklaver.com



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: CLUSTER vs. VACUUM FULL

2024-04-22 Thread Adrian Klaver

On 4/22/24 14:35, Ron Johnson wrote:


 >
 > On Stack Exchange, I've got a question on how to determine when
to run
 > CLUSTER.  It ties in strongly with this thread..
 >

And the link is?


It should have been the initial question of this thread and it explains 
what you are really after.





Sorry.  Got distracted by the answer.

https://dba.stackexchange.com/questions/338870/when-to-rerun-cluster




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: issue with reading hostname

2024-04-22 Thread Adrian Klaver

On 4/22/24 14:37, Atul Kumar wrote:

Can we edit the socket to change the hostname in it?


On Ubuntu 22.04 install, given:

srwxrwxrwx  1 postgres postgres    0 Apr 22 14:01 .s.PGSQL.5432=
-rw-------  1 postgres postgres   68 Apr 22 14:01 .s.PGSQL.5432.lock

The contents of .s.PGSQL.5432.lock (the file that indicates a Postgres 
instance has a lock on the socket) are:


862
/var/lib/postgresql/15/main
1713795311
5432
/var/run/postgresql

There is no hostname to be changed as you are working with a local socket.



Regards.

On Tue, Apr 23, 2024 at 2:41 AM Ron Johnson <ronljohnso...@gmail.com> wrote:


On Mon, Apr 22, 2024 at 4:14 PM Atul Kumar <akumar14...@gmail.com> wrote:

Hi,

I have postgresql  version 15 running on centos7.

I have below query that reads hostname from /tmp directory:

psql postgres -A -t -p 5432 -h /tmp/ -c 'SELECT
pg_is_in_recovery();'


If you installed from the PGDG repository (possibly also the CENTOS
repos, but I'm not sure), then the domain socket also lives in :
/var/run/postgresql

* I find that more expressive than /tmp.
* No need to specify the host when using sockets.
* Using a socket name makes parameterizing the hostname easier in
scripts.




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: CLUSTER vs. VACUUM FULL

2024-04-22 Thread Adrian Klaver

On 4/22/24 13:59, Ron Johnson wrote:
On Mon, Apr 22, 2024 at 4:21 PM Adrian Klaver <adrian.kla...@aklaver.com> wrote:

[snip]

Which gets us back to your comment upstream:

"What the VACUUM docs "should" do, it seems, is suggest CLUSTER on the
PK, if the PK is a sequence (whether that be an actual sequence, or a
timestamp or something else that grows monotonically)."

This is a case specific to you and this particular circumstance, not a
general rule for VACUUM FULL, if for no other reason than that it might
make more sense for the application that the CLUSTER be done on some
other index than the PK.


On Stack Exchange, I've got a question on how to determine when to run 
CLUSTER. It ties in strongly with this thread.




And the link is?

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: issue with reading hostname

2024-04-22 Thread Adrian Klaver

On 4/22/24 13:13, Atul Kumar wrote:

Hi,

I have PostgreSQL version 15 running on CentOS 7.

I have the below query that reads the hostname from the /tmp directory:

psql postgres -A -t -p 5432 -h /tmp/ -c 'SELECT pg_is_in_recovery();'


so below are my questions:

1. Is the psql client reading the socket file that resides in the /tmp 
directory to fetch the hostname?


2. I saw the socket file in /tmp and it is empty. Then how is the psql 
client still reading the socket file successfully for the hostname?



This is what my socket looks like (the size is 0 as the file is empty):

srwxrwxrwx. 1 postgres postgres      0 Apr 22 12:47 .s.PGSQL.5432


Please help me clarify these doubts.


https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS

"host

Name of host to connect to. If a host name looks like an absolute 
path name, it specifies Unix-domain communication rather than TCP/IP 
communication; the value is the name of the directory in which the 
socket file is stored. (On Unix, an absolute path name begins with a 
slash. On Windows, paths starting with drive letters are also 
recognized.) If the host name starts with @, it is taken as a 
Unix-domain socket in the abstract namespace (currently supported on 
Linux and Windows). The default behavior when host is not specified, or 
is empty, is to connect to a Unix-domain socket in /tmp (or whatever 
socket directory was specified when PostgreSQL was built). On Windows, 
the default is to connect to localhost.


A comma-separated list of host names is also accepted, in which 
case each host name in the list is tried in order; an empty item in the 
list selects the default behavior as explained above. See Section 
34.1.1.3 for details.

"

The simplistic explanation is that the socket is the "host".
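
A small sketch of how to confirm which socket directory and port the 
server is actually using, from any connected session:

SHOW unix_socket_directories;  -- where the .s.PGSQL.<port> file is created
SHOW port;                     -- the <port> part of the socket file name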





Regards.








--
Adrian Klaver
adrian.kla...@aklaver.com





Re: CLUSTER vs. VACUUM FULL

2024-04-22 Thread Adrian Klaver

On 4/22/24 12:51, Ron Johnson wrote:
On Mon, Apr 22, 2024 at 3:14 PM Adrian Klaver <adrian.kla...@aklaver.com> wrote:








1) If they are already in enough of a PK order that the CLUSTER time vs
VACUUM FULL time would not be material as there is not much or any
sorting to do then what does the CLUSTER gain you? 



Not much.  Now they're just "slightly more ordered" instead of "slightly 
less ordered" for little if any extra effort.


2) What evidence is there that the records were still in PK order just
because you deleted based on CREATED_ON? I understand the correlation
between CREATED_ON and the PK, just not sure why that would necessarily
translate to an on-disk order by PK?


1. Records are appended to tables in INSERT order, and INSERT order is 
highly correlated to synthetic PK, by the nature of sequences.


Not something I would count on, see:

https://www.postgresql.org/docs/current/sql-createsequence.html

Notes

for how that may not always be the case.

Also, any UPDATE or DELETE is going to change that. There is no guarantee 
of order for the data in the table. If there were, you would not need to 
run CLUSTER.
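
As a sketch, the planner's correlation statistic gives a rough idea of 
how well the physical row order still tracks a column (table and column 
names are placeholders):

-- 1.0 means the heap is effectively ordered by this column,
-- values near 0 mean no physical ordering.
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename = 'some_table'
  AND attname = 'id';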


2. My original email showed that CLUSTER took just as long as VACUUM 
FULL.  That means not many records had to be sorted, because... the 
on-disk order was strongly correlated to PK and CREATED_ON.
Will that happen *every time* in *every circumstance* in *every 
database*?  No, and I never said it would.  But it does in *my* database 
in *this* application.




Which gets us back to your comment upstream:

"What the VACUUM docs "should" do, it seems, is suggest CLUSTER on the 
PK, if the PK is a sequence (whether that be an actual sequence, or a 
timestamp or something else that grows monotonically)."


This is a case specific to you and this particular circumstance, not a 
general rule for VACUUM FULL, if for no other reason than that it might 
make more sense for the application that the CLUSTER be done on some 
other index than the PK.




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: altering a column to to make it generated

2024-04-22 Thread Adrian Klaver

On 4/22/24 12:42, Celia McInnis wrote:
Can I alter a table column to now make it generated? I tried this 
unsuccessfully:


create temp table tmp1 as select 15::numeric(6,1) as 
distance,'24:30'::interval,0::numeric(7,3) as avgspd;
alter table tmp1 alter column avgspd type numeric(7,3) generated always 
as ((3600.*distance)/EXTRACT (EPOCH FROM rdrtime::interval)) STORED;

ERROR:  syntax error at or near "generated"
LINE 1: ... table tmp1 alter column avgspd type numeric(7,3) generated ...

I think that at least I have the right bracketing this time! :-) 


See David Johnston's comment. I keep on wanting to believe that ALTER 
TABLE supports GENERATED expressions, when it only supports GENERATED 
IDENTITY. :)
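
A sketch of the usual workaround, dropping the plain column and adding a 
generated one in its place. This assumes the interval column is actually 
named rdrtime, which the example table above does not do; adjust names to 
the real table:

ALTER TABLE tmp1 DROP COLUMN avgspd;
ALTER TABLE tmp1 ADD COLUMN avgspd numeric(7,3)
    GENERATED ALWAYS AS ((3600. * distance) / EXTRACT(EPOCH FROM rdrtime)) STORED;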


  







--
Adrian Klaver
adrian.kla...@aklaver.com





Re: CLUSTER vs. VACUUM FULL

2024-04-22 Thread Adrian Klaver




On 4/22/24 11:45 AM, Ron Johnson wrote:
On Mon, Apr 22, 2024 at 12:29 PM David G. Johnston <david.g.johns...@gmail.com> wrote:




On Mon, Apr 22, 2024, 08:37 Ron Johnson <ronljohnso...@gmail.com> wrote:

On Mon, Apr 22, 2024 at 10:25 AM Tom Lane <t...@sss.pgh.pa.us> wrote:

Marcos Pegoraro <mar...@f10.com.br> writes:
 > But wouldn't it be good that VACUUM FULL uses that index
defined by
 > Cluster, if it exists ?

No ... what would be the difference then?

What the VACUUM docs "should" do, it seems, is suggest CLUSTER
on the PK, if the PK is a sequence (whether that be an actual
sequence, or a timestamp or something else that grows
monotonically).

That's because the data is already roughly in PK order.


If things are bad enough to require a vacuum full that doesn't seem
like a good assumption.


Sure it does.

For example, I just deleted the oldest half of the records in 30 
tables.  Tables whose CREATED_ON timestamp value strongly correlates to 
the synthetic PK sequence values.


Thus, the remaining records were still mostly in PK order.  CLUSTERs on 
the PK values would have taken just about as much time as the VACUUM 
FULL statements which I did run.


1) If they are already in enough of a PK order that the CLUSTER time vs 
VACUUM FULL time would not be material, as there is not much or any 
sorting to do, then what does the CLUSTER gain you? Unless this table 
then became read-only, whatever small gain arose from the CLUSTER would 
fade away as UPDATEs and DELETEs were done.


2) What evidence is there that the records were still in PK order just 
because you deleted based on CREATED_ON? I understand the correlation 
between CREATED_ON and the PK, just not sure why that would necessarily 
translate to an on-disk order by PK?


--
Adrian Klaver
adrian.kla...@aklaver.com




Re: adding a generated column to a table?

2024-04-22 Thread Adrian Klaver

On 4/22/24 09:05, Celia McInnis wrote:

If I have a table containing a date field, say:
create temporary table tmp1 as select now()::date as evtdate;
SELECT 1

select DATE_PART('year', evtdate)::integer as year from tmp1;
  year
--
  2024
(1 row)

Is there some way of doing something like the following?:

alter table tmp1 add column year integer generated always as 
DATE_PART('year', evtdate)::integer STORED;

ERROR:  syntax error at or near "DATE_PART"
LINE 1: ... tmp1 add column year integer generated always as DATE_PART(...



https://www.postgresql.org/docs/current/sql-createtable.html

GENERATED ALWAYS AS ( generation_expr ) STORED

So:


generated always as (DATE_PART('year', evtdate)::integer) STORED
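
Put together, a sketch of the full statement:

ALTER TABLE tmp1
    ADD COLUMN year integer
    GENERATED ALWAYS AS (DATE_PART('year', evtdate)::integer) STORED;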

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: CLUSTER vs. VACUUM FULL

2024-04-22 Thread Adrian Klaver

On 4/22/24 08:37, Ron Johnson wrote:
On Mon, Apr 22, 2024 at 10:25 AM Tom Lane <t...@sss.pgh.pa.us> wrote:


Marcos Pegoraro <mar...@f10.com.br> writes:
 > But wouldn't it be good that VACUUM FULL uses that index defined by
 > Cluster, if it exists ?

No ... what would be the difference then?

What the VACUUM docs "should" do, it seems, is suggest CLUSTER on the 
PK, if the PK is a sequence (whether that be an actual sequence, or a 
timestamp or something else that grows monotonically).


Why?

That would, per David Rowley's comments, impose a sort cost on top of 
the cost of hitting every heap page and rewriting it. You end up with a 
sorted table, granted, until such time as you start making changes to it. 
If you are to the point of running VACUUM FULL that indicates to me the 
table has seen a heavy load of changes that you want to clean out. Given 
the temporary nature of the effects of a CLUSTER under a change load I 
don't see why it would be the way to go to clean up a changing table.




That's because the data is already roughly in PK order.



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: error in trigger creation

2024-04-21 Thread Adrian Klaver

On 4/21/24 14:21, Tom Lane wrote:

Adrian Klaver  writes:

On 4/21/24 11:20, yudhi s wrote:

So in this case i was wondering if "event trigger" can cause any
additional threat and thus there is no such privilege like "create
trigger" exist in postgres and so it should be treated cautiously?



An event trigger runs as a superuser and executes a function that in
turn can do many things, you do the math on the threat level.


As a trivial example: an event trigger could prevent the legitimate
superuser(s) from doing anything at all in that database, just by
blocking all their commands.  This might not even require malicious
intent, merely faulty coding --- but the opportunity for malicious
intent is staggeringly large.


As an FYI to above:

https://www.postgresql.org/docs/current/sql-createeventtrigger.html

"Event triggers are disabled in single-user mode (see postgres). If an 
erroneous event trigger disables the database so much that you can't 
even drop the trigger, restart in single-user mode and you'll be able to 
do that."
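
For reference, a minimal sketch of what an event trigger looks like 
(names are illustrative, not from this thread):

-- Function that fires at the end of every DDL command and logs its tag.
CREATE FUNCTION log_ddl() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    RAISE NOTICE 'DDL command executed: %', tg_tag;
END;
$$;

-- Only a superuser can create this; once created it fires for everyone.
CREATE EVENT TRIGGER log_ddl_trigger
    ON ddl_command_end
    EXECUTE FUNCTION log_ddl();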





    regards, tom lane


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: error in trigger creation

2024-04-21 Thread Adrian Klaver

On 4/21/24 11:20, yudhi s wrote:


On Sun, Apr 21, 2024 at 8:13 PM Tom Lane <t...@sss.pgh.pa.us> wrote:






So do you mean we should not create the event trigger using "security 
definer", but rather have the superuser do this each time we have to 
create an event trigger?


Actually, I am not very much aware of the security part, but is it fine 
to give the superuser privilege to the application user (say app_user) 
from which scripts/procedures normally get executed by the application, 
but which no individual person can log in with?


Additionally, in other databases triggers are driven by some 
specific privileges (say, for example, the "create trigger" privilege in 
Oracle). It doesn't need any superuser, and we were having many 


Which Postgres has

https://www.postgresql.org/docs/current/ddl-priv.html

TRIGGER

Allows creation of a trigger on a table, view, etc.


but you are talking about event triggers

https://www.postgresql.org/docs/current/sql-createeventtrigger.html

where

"Only superusers can create event triggers."

To paraphrase Henry Ford, you can have any user for an event trigger as 
long as the user is a superuser.
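
For ordinary (non-event) triggers, the grant is per table; a sketch with 
placeholder names:

-- Lets app_user create triggers on this one table; no superuser needed.
GRANT TRIGGER ON TABLE some_table TO app_user;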




applications in which the application user (which was used for app-to-app 
login) was having these privileges, similar to the "create table" 
privilege which comes by default to the schema that owns the objects, 
etc. So in this case I was wondering if an "event trigger" can cause any 
additional threat, and is that why there is no such privilege like 
"create trigger" in Postgres, and so it should be treated cautiously?


An event trigger runs as a superuser and executes a function that in 
turn can do many things; you do the math on the threat level.



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Logging statement having any threat?

2024-04-21 Thread Adrian Klaver

On 4/21/24 02:35, Lok P wrote:
On Sat, Apr 20, 2024 at 10:02 PM Adrian Klaver <adrian.kla...@aklaver.com> wrote:



Have you tried?:


https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT

"
log_statement (enum)

    <...>

The default is none. Only superusers and users with the appropriate SET
privilege can change this setting.
"

Or

https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-SET

set_config ( setting_name text, new_value text, is_local boolean ) →
text


 >
 > Now when we reach out to the infrastructure team , they are
saying these
 > variables(pg_cluster_log_statement,pg_instance_log_statement) were

Where are those variables coming from? I can not find them in RDS or
Terraform docs.


Thank you, Adrian.

Actually, I was trying to understand whether auto_explain can only work 
and help us see the slow SQL statements in the log after we set 
the "log_statement" parameter to non-default values (like all, mod, ddl)?


And what is the exact threat with logging these queries? And I 


log_statement = 'mod'

create role pwd_test with password 'test';
CREATE ROLE

tail -f /var/log/postgresql/postgresql-16-main.log

<...>
2024-04-21 09:04:17.746 PDT [9664] postgres@test LOG:  statement: create 
role pwd_test with password 'test';


think I got the point as you mentioned: having access to the database 
itself lets someone see the object details. However, do you agree that in 
the case of RDS the logs are available through different mediums like 
CloudWatch, the Datadog agent, etc., so that may pose additional threats, 

Aah, the joys of managed services where you have to check even more 
layers when building out your security.  Logging itself is not the 
issue; who has access to the logs is. The more access points, the more 
difficult that gets. Dealing with this is going to require a system-wide 
review by all parties and coming up with an agreed-upon access policy 
that balances security with the need to monitor what is happening in the 
database. Otherwise troubleshooting issues will be a long drawn-out 
process which in itself could end up being a security issue.



because maybe some person doesn't have access to the database directly 
but still has permission to see the logs, so appropriate access 
controls need to be put in place?


Additionally, I was trying to execute "SELECT set_config('log_statement', 
'all', true);" but it says 'permission denied to set parameter 
"log_statement"', so it might need a higher-privileged user to run it.


To answer your question on the variables: those are the ones we have in 
the Terraform module. The Terraform module is customized by the database 
infra team, so that might be why we are seeing them there, and they may 
not be exactly the same as what is shown in the RDS docs for Postgres.


https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/USER_LogAccess.Concepts.PostgreSQL.html



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Logging statement having any threat?

2024-04-20 Thread Adrian Klaver

On 4/20/24 07:02, Lok P wrote:


Now when we reach out to the infrastructure team, they are saying these 
variables (pg_cluster_log_statement, pg_instance_log_statement) were 
removed due to a potential security threat. So I want to understand from 
the experts here how this is really a security threat, and if there is 
any option to get this logging enabled (which will help us debug 
performance issues) while at the same time addressing the threat?


I should have added to the previous post: if you have access to the database 
the security wall has already been breached.




Regards
Lok


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Logging statement having any threat?

2024-04-20 Thread Adrian Klaver

On 4/20/24 07:02, Lok P wrote:

Hello All,
It's Postgres version 15.4 and it's RDS, in which our dev team gets the 
infrastructure code from another third-party team which provides us base 
infrastructure code to build a Postgres database, in which we will be 
able to change DB parameter values, etc., whatever is mentioned in the 
file with possible values. But surprisingly we don't see log_statement 
there. Below was our requirement.


For debugging and evaluating performance we have pg_stat_statements, but 
it contains aggregated information about all the query executions. In 
case we just want to debug point-in-time issues where a selected few 
queries were performing badly (maybe because of a plan change), we were 
planning to have the auto_explain extension added and set 
log_min_duration to ~5 seconds, so that all the queries going above that 
time period (5 seconds) will be logged and provide detailed information 
on the exact point of the bottleneck. But we see the log_statement 
parameter has been removed from the base infrastructure script/Terraform 
script given by the database team here, so that means we will get it as 
the default, which is "none", which means no statements (SELECT/DML/DDL 
etc.) can be logged.


Have you tried?:

https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT

"
log_statement (enum)

  <...>

The default is none. Only superusers and users with the appropriate SET 
privilege can change this setting.

"

Or

https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-SET

set_config ( setting_name text, new_value text, is_local boolean ) → text
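
On Postgres 15 that "appropriate SET privilege" can be granted 
explicitly; a sketch with a placeholder role name (whether the managed 
admin role on RDS is allowed to issue this is something to verify there):

-- Run as a superuser, or a role allowed to grant parameter privileges:
GRANT SET ON PARAMETER log_statement TO app_user;

-- app_user can then change it for its own session:
SET log_statement = 'mod';
-- or equivalently:
SELECT set_config('log_statement', 'mod', false);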




Now when we reach out to the infrastructure team, they are saying these 
variables (pg_cluster_log_statement, pg_instance_log_statement) were 


Where are those variables coming from? I can not find them in RDS or 
Terraform docs.


removed due to a potential security threat. So I want to understand from 
the experts here how this is really a security threat, and if there is any 
option to get this logging enabled (which will help us debug performance 
issues) while at the same time addressing the threat?


Regards
Lok


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: [help] Error in database import

2024-04-19 Thread Adrian Klaver

On 4/19/24 20:58, Tu Ho wrote:

Hi,

I am currently working on a project where I need to combine 2 large 
databases. I was not, however, able to import an Excel file (.csv) into the 

There are Excel files (.xls/.xlsx) and there are *.csv files.

database. The error was ERROR: syntax error at or near "OIDS" LINE 1: 
...ing Site" , "International Name(s)" ) FROM STDIN OIDS DELI...

^"

I have no idea how to fix this because I used the "upload file" 
option. What should I do?


Use a recent version of whatever client you are using as:

https://www.postgresql.org/docs/current/sql-copy.html

"If 1, OIDs are included in the data; if 0, not. Oid system columns are 
not supported in PostgreSQL anymore, but the format still contains the 
indicator."


Your client is using old syntax, as the last version of Postgres that 
allowed FROM STDIN OIDS was:


https://www.postgresql.org/docs/11/sql-copy.html

"where option can be one of:

FORMAT format_name
OIDS [ boolean ]
"

And Postgres 11 is ~6 months past EOL.
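
A sketch of the current COPY form, with no OIDS clause and placeholder 
table, column and file names:

COPY my_table (col1, col2)
FROM '/path/to/file.csv'
WITH (FORMAT csv, HEADER true);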

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Can you refresh a materialized view from a materialized view on another server?

2024-04-19 Thread Adrian Klaver

On 4/18/24 19:49, Michael Nolan wrote:

My production server has a materialized view that is refreshed from a
mysql_fdw several times a day.

What I'd like to be able to do is refresh the materialized view on the
testbed server from the one on the production server so that they are
the same.  (Refreshing it from the MySQL server will result in one
that has records that have been added or updated rather than an exact
copy of the one on the production server.)

Use postgres_fdw to connect the test bed to the production server?
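
A sketch of that idea on the testbed server, with placeholder server, 
column, and credential values:

CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER prod_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'prod.example.com', dbname 'proddb', port '5432');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER prod_server
    OPTIONS (user 'reporting', password 'secret');

-- Expose the production materialized view as a foreign table...
CREATE FOREIGN TABLE prod_matview (id integer, payload text)
    SERVER prod_server
    OPTIONS (schema_name 'public', table_name 'my_matview');

-- ...and build/refresh the local copy from it.
CREATE MATERIALIZED VIEW my_matview AS SELECT * FROM prod_matview;
REFRESH MATERIALIZED VIEW my_matview;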



Mike Nolan
htf...@gmail.com




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Not able to grant access on pg_signal_backend on azure flexible server

2024-04-19 Thread Adrian Klaver

On 4/18/24 22:48, Saksham Joshi wrote:

Hi,
We have created an Azure PostgreSQL flexible server, added an AD admin as 
a user, and created our database using this admin user. However, when 
running this command: 'GRANT pg_signal_backend TO adminUser' we are 
getting an error that says 'permission denied to grant role 
"pg_signal_backend"'. While this is strange, the admin user is in fact 
the owner of the said database and we don't have any other user that has 
the said privileges.



https://www.postgresql.org/docs/current/predefined-roles.html

"PostgreSQL provides a set of predefined roles that provide access to 
certain, commonly needed, privileged capabilities and information. 
Administrators (including roles that have the CREATEROLE privilege) can 
GRANT these roles to users and/or other roles in their environment, 
providing those users with access to the specified capabilities and 
information."
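
The grant itself is plain SQL; the sketch below assumes it is run by a 
role that actually administers pg_signal_backend (a superuser, or a role 
granted it WITH ADMIN OPTION); owning the database is not enough by itself:

-- Must be issued by a role that can administer pg_signal_backend;
-- database ownership alone does not confer that.
GRANT pg_signal_backend TO admin_user;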





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Question on trigger

2024-04-16 Thread Adrian Klaver

On 4/16/24 12:39, veem v wrote:



On Sat, 13 Apr 2024 at 21:44, Adrian Klaver <adrian.kla...@aklaver.com> wrote:


On 4/13/24 00:03, veem v wrote:
 > Thank you Adrian.
 >
 > So it seems the heavy DML tables will see an impact if having
triggers
 > (mainly for each row trigger) created on them.
 >
 > And also the bulk DML/array based insert (which inserts multiple
rows in
 > one short or one batch) , in those cases it seems the trigger
will not
 > make that happen as it will force it to make it happen row by
row, as
 > the trigger is row based. Will test anyway though.

You said you have triggers in the Oracle database and I assumed they
worked and were not a show-stopping issue there. What makes you think
that would be different in Postgres?

What type of triggers were there in Oracle: per row, per statement, or a
mix?


Actually we have row-level triggers in Oracle which are running for 
smaller-volume DML and are making the direct-path inserts happen as 
conventional row-by-row inserts in the presence of the trigger. So I was 
wondering 


Not sure what the above means, you will need to provide a more detailed 
description. Though any DML you are doing on a table that has any sort of 
constraint, index, trigger, foreign key, default values, etc. is going to 
have more overhead than into an unencumbered table. FYI, some of the 
preceding are system triggers, for example foreign keys.


if in Postgres we will be encountering a similar issue and batch inserts 
may be converted back to row-by-row automatically. And here we are going 
to process higher-volume DML in the PostgreSQL database.




Hard to say with the information provided. The easiest way to find out is 
to create a test setup and run the code. Though I guess, as I have not 
actually tried this, you could have a per-row trigger and a per-statement 
trigger for the same action and disable the per-row and enable the 
per-statement trigger for batch operations. Then once the batch operation 
is done, reverse the process. Again, something to test to verify.
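
A sketch of that disable/enable approach (trigger and table names are 
placeholders):

-- Before the batch load: turn off the per-row audit trigger.
ALTER TABLE tab DISABLE TRIGGER tab_aud_row_trg;
-- (the per-statement trigger stays enabled and fires once per statement)

-- ... run the batch INSERT/UPDATE here ...

-- After the batch load: restore row-level auditing.
ALTER TABLE tab ENABLE TRIGGER tab_aud_row_trg;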



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: constant crashing

2024-04-14 Thread Adrian Klaver

On 4/14/24 14:50, jack wrote:

Reply to list also
Ccing list


Hello,
I am not sure what "locale" means.


Go to the Settings app for whatever version of Windows you are on and 
search for locale.



The Windows app is an in-house application which uses Actian Zen SQL.
The data is exported to simple ASCII in a tab-delimited format similar to CSV.


And you know it is ASCII for a fact?


Those files are then imported into the PostgreSQL table using COPY.
Importing the data is not an issue.
I am able to load all the data without any problems, even into 1 table which 
ends up with about 1.2 billion records.
But when I try to update the data in that table I get many errors, essentially 
crashes.


Repeating what has been asked and answered is not really going anywhere.


There may be some control characters (garbage) in the data but that should not 
crash postgresql, especially if it can import the data without issues.


Unless it does. That is the point of the questions, getting to what is 
actually causing the issue. Until the problem can be boiled down to a 
reproducible test case there really is not much hope of anything more 
than the 'yes you have a problem' answer. And there is a difference 
between dumping data into a table and then doing an UPDATE where the 
data strings are manipulated by functions.



Anyway, I hope I answered your questions.
Thanks for your help.



On Sunday, April 14th, 2024 at 4:28 PM, Adrian Klaver wrote:




On 4/14/24 13:18, jack wrote:


The CSV files are being produced by another system, a WIndows app on a
Windows machine. I then copy them to a USB key and copy them onto the
ubuntu machine. The data is then imported via the COPY command.



The app?

The locale in use on the Windows machine?

The locale in use in the database?


COPY master (field01,field02..fieldX) FROM '/data/file.text' DELIMITER E'\t'
The fields are tab delimited.

But importing the data works. I can get all the data into a single table
without any problems. The issue is only when I start to update the
single table. And that is why I started using smaller temporary tables
for each CSV file, to do the updates in the smaller tables before I move
them all to a single large table.



The import is just dumping the data in, my suspicion is the problem is
related to using string functions on the data.


After all the data is loaded and updated, I run php programs on the
large table to generate reports. All of which works well EXCEPT for
performing the updates on the data. And I do not want to use perl or any
outside tool. I want it all one in SQL because I am required to document
all my steps so that someone else can take over, so everything needs to
be as simple as possible.



--
Adrian Klaver
adrian.kla...@aklaver.com


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: constant crashing

2024-04-14 Thread Adrian Klaver

On 4/14/24 13:18, jack wrote:
The CSV files are being produced by another system, a Windows app on a 
Windows machine. I then copy them to a USB key and copy them onto the 
Ubuntu machine. The data is then imported via the COPY command.


The app?

The locale in use on the Windows machine?

The locale in use in the database?



COPY master (field01,field02..fieldX) FROM '/data/file.text' DELIMITER E'\t'
The fields are tab delimited.

But importing the data works. I can get all the data into a single table 
without any problems. The issue is only when I start to update the 
single table. And that is why I started using smaller temporary tables 
for each CSV file, to do the updates in the smaller tables before I move 
them all to a single large table.


The import is just dumping the data in; my suspicion is the problem is 
related to using string functions on the data.




After all the data is loaded and updated, I run PHP programs on the 
large table to generate reports. All of which works well EXCEPT for 
performing the updates on the data. And I do not want to use Perl or any 
outside tool. I want it all done in SQL because I am required to document 
all my steps so that someone else can take over, so everything needs to 
be as simple as possible.




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: constant crashing

2024-04-14 Thread Adrian Klaver

On 4/14/24 12:22, jack wrote:

Here is an excerpt of /var/log/postgresql/postgresql-16-main.log


Where and how are the CSV files being produced?

What is the database locale?

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: constant crashing

2024-04-14 Thread Adrian Klaver

On 4/14/24 09:20, jack wrote:

The full error reads:
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
error: connection to server was lost


Look at the OS system log.



PostgreSQL 16.2

I also believe it is a resource issue which can be rectified with a 
setting, but which setting?

If you were updating 100 million records what settings would you adjust?

Here are the updates I am performing on the 100 million records:
UPDATE table SET category_modified = UPPER(category);
UPDATE table SET category_modified = 
REGEXP_REPLACE(REPLACE(REPLACE(category_modified, ,'-'), '`', '-'), 
'\s{2,}', ' ', 'g') WHERE  AND LENGTH(category_modified)>1 AND 
POSITION('--' IN category_modified)>0;

UPDATE table SET category_modified = REPLACE(category_modified,' ','-');
UPDATE table SET category_modified = CASE WHEN category_modified IS NOT 
NULL THEN regexp_replace(category_modified, '[^a-zA-Z]$', '') ELSE NULL END;
UPDATE table SET category_modified = regexp_replace(category_modified, 
'-{2,}', '-', 'g');
UPDATE table SET category_modified = SUBSTRING(category_modified FROM 1 
FOR LENGTH(category_modified) - 1) WHERE LENGTH(category_modified)>1 AND 
category_modified LIKE '%-';


Is the above all being done in one script/transaction?

Again what are the table definitions for the tables being copied into 
and/or modified?









--
Adrian Klaver
adrian.kla...@aklaver.com





Re: constant crashing

2024-04-14 Thread Adrian Klaver

On 4/14/24 07:24, jack wrote:

Hello,
I am trying to load about 1.4 billion records into a postgresql table.
The data is in about 100 individual CSV files which I load individually 
into the table using the COPY command.
After the data is loaded I perform updates to the data, and this is when 
I get errors. Some updates work, some crash.
I am using psql and an SQL script file (update.sql) to perform the 
updating (UPDATE table SET field=UPPER(field), etc.).


The errors are all "Server closed the connection unexpectedly"




I have been working on this for just over 1 year now, documenting every 
step, and I am still unable to get this to work without it crashing 
somewhere along the way.

I am beginning to wonder if PostgreSQL is bipolar.


More information needed, to start:

1) Complete example of the code in the script.

2) Table definition(s) of those being copied into.

3) Sample of the data being copied.

4) The error message(s) generated.

5) Database locale



Any help would be greatly appreciated.
Thank you


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Question on trigger

2024-04-13 Thread Adrian Klaver

On 4/13/24 00:03, veem v wrote:

Thank you Adrian.

So it seems the heavy DML tables will see an impact if they have triggers 
(mainly for-each-row triggers) created on them.


And also for the bulk DML/array-based insert (which inserts multiple rows 
in one shot or one batch), in those cases it seems the trigger will not 
let that happen, as it will force it to happen row by row, since the 
trigger is row based. Will test anyway though.


You said you have triggers in the Oracle database and I assumed they 
worked and were not a show-stopping issue there. What makes you think 
that would be different in Postgres?


What type of triggers were there in Oracle: per row, per statement, or a 
mix?





On Thu, 11 Apr 2024 at 22:00, Adrian Klaver <adrian.kla...@aklaver.com> wrote:


On 4/11/24 07:31, veem v wrote:
 > Hi, We used to use Oracle database in which we had audit
 > triggers(something as below) mandated for all tables by the control
 > team. Now we are going to use the postgresql 15.4 database for
one of
 > our applications. So,wanted to understand if there exists any
downside
 > of such audit trigger setup for all the tables? Will it impact
the bulk
 > data insert/update/delete OR slowdown of any of the DML operations
 > significantly (and thus will not be advisable to use for all
tables but
 > selected ones)?

Triggers are overhead in Postgres as they were in Oracle. If they
didn't cause an issue in Oracle I would suspect that would also be the
case in Postgres. To confirm you would need to create a test setup and
run some common operations and see what the overhead is.

Some potential performance improvements:

https://www.postgresql.org/docs/current/sql-createtrigger.html

"...a trigger that is marked FOR EACH STATEMENT only executes once for
any given operation, regardless of how many rows it modifies (in
particular, an operation that modifies zero rows will still result in
the execution of any applicable FOR EACH STATEMENT triggers)."

<...>

"The REFERENCING option enables collection of transition relations,
which are row sets that include all of the rows inserted, deleted, or
modified by the current SQL statement. This feature lets the trigger
see
a global view of what the statement did, not just one row at a time.
This option is only allowed for an AFTER trigger that is not a
constraint trigger; also, if the trigger is an UPDATE trigger, it must
not specify a column_name list. OLD TABLE may only be specified once,
and only for a trigger that can fire on UPDATE or DELETE; it creates a
transition relation containing the before-images of all rows updated or
deleted by the statement. Similarly, NEW TABLE may only be specified
once, and only for a trigger that can fire on UPDATE or INSERT; it
creates a transition relation containing the after-images of all rows
updated or inserted by the statement."


As example:

https://www.postgresql.org/docs/current/plpgsql-trigger.html

Example 43.7. Auditing with Transition Tables

 >
 > CREATE OR REPLACE TRIGGER TAB_AUD_TRG
 >    BEFORE DELETE OR INSERT OR UPDATE
 >    ON tab
 >    FOR EACH ROW
 > BEGIN
 >        IF inserting THEN
 >          :NEW.create_timestamp := systimestamp;
 >          :NEW.create_userid  :=
sys_context('USERENV','SESSION_USER');
 >          :NEW.update_timestamp := systimestamp;
 >          :NEW.update_userid := sys_context('USERENV','SESSION_USER');
 >        ELSIF updating THEN
 >          IF  updating('create_userid') OR
updating('create_timestamp') THEN
 >              :new.create_userid   := :old.create_userid;
 >              :new.create_timestamp  := :old.create_timestamp;
 >          END IF;
 >          :NEW.update_timestamp := systimestamp;
 >          :NEW.update_userid := sys_context('USERENV','SESSION_USER');
 >        END IF;
 >    END;
 > /
 >
     > Regards
 > Veem

-- 
Adrian Klaver

adrian.kla...@aklaver.com



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Question on trigger

2024-04-11 Thread Adrian Klaver

On 4/11/24 07:31, veem v wrote:
Hi, We used to use an Oracle database in which we had audit 
triggers (something as below) mandated for all tables by the control 
team. Now we are going to use the PostgreSQL 15.4 database for one of 
our applications. So, wanted to understand if there exists any downside 
of such an audit trigger setup for all the tables? Will it impact bulk 
data insert/update/delete OR slow down any of the DML operations 
significantly (and thus not be advisable to use for all tables but only 
selected ones)?


Triggers are overhead in Postgres as they were in Oracle. If they 
didn't cause an issue in Oracle I would suspect that would also be the 
case in Postgres. To confirm you would need to create a test setup and 
run some common operations and see what the overhead is.


Some potential performance improvements:

https://www.postgresql.org/docs/current/sql-createtrigger.html

"...a trigger that is marked FOR EACH STATEMENT only executes once for 
any given operation, regardless of how many rows it modifies (in 
particular, an operation that modifies zero rows will still result in 
the execution of any applicable FOR EACH STATEMENT triggers)."


<...>

"The REFERENCING option enables collection of transition relations, 
which are row sets that include all of the rows inserted, deleted, or 
modified by the current SQL statement. This feature lets the trigger see 
a global view of what the statement did, not just one row at a time. 
This option is only allowed for an AFTER trigger that is not a 
constraint trigger; also, if the trigger is an UPDATE trigger, it must 
not specify a column_name list. OLD TABLE may only be specified once, 
and only for a trigger that can fire on UPDATE or DELETE; it creates a 
transition relation containing the before-images of all rows updated or 
deleted by the statement. Similarly, NEW TABLE may only be specified 
once, and only for a trigger that can fire on UPDATE or INSERT; it 
creates a transition relation containing the after-images of all rows 
updated or inserted by the statement."



As example:

https://www.postgresql.org/docs/current/plpgsql-trigger.html

Example 43.7. Auditing with Transition Tables
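
A minimal sketch along those lines, separate from the Oracle trigger 
quoted below: one statement-level trigger that audits all inserted rows 
at once via a transition table (the audit table and the id column are 
assumptions for illustration):

CREATE TABLE tab_audit (tab_id integer, inserted_at timestamptz, inserted_by text);

CREATE FUNCTION tab_audit_fn() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- new_rows holds every row inserted by the triggering statement.
    INSERT INTO tab_audit
    SELECT n.id, now(), current_user FROM new_rows n;
    RETURN NULL;
END;
$$;

CREATE TRIGGER tab_aud_stmt_trg
    AFTER INSERT ON tab
    REFERENCING NEW TABLE AS new_rows
    FOR EACH STATEMENT
    EXECUTE FUNCTION tab_audit_fn();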



CREATE OR REPLACE TRIGGER TAB_AUD_TRG
   BEFORE DELETE OR INSERT OR UPDATE
   ON tab
   FOR EACH ROW
BEGIN
       IF inserting THEN
         :NEW.create_timestamp := systimestamp;
         :NEW.create_userid  := sys_context('USERENV','SESSION_USER');
         :NEW.update_timestamp := systimestamp;
         :NEW.update_userid := sys_context('USERENV','SESSION_USER');
       ELSIF updating THEN
         IF  updating('create_userid') OR updating('create_timestamp') THEN
             :new.create_userid   := :old.create_userid;
             :new.create_timestamp  := :old.create_timestamp;
         END IF;
         :NEW.update_timestamp := systimestamp;
         :NEW.update_userid := sys_context('USERENV','SESSION_USER');
       END IF;
   END;
/

Regards
Veem


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Failure of postgres_fdw because of TimeZone setting

2024-04-10 Thread Adrian Klaver




On 4/10/24 1:31 PM, Tom Lane wrote:

Adrian Klaver  writes:

On 4/10/24 12:38, Adnan Dautovic wrote:

By the way, the row count of pg_timezone_names is 385, but I do
not know how that compares to a more standard installation.



On my instance of Postgres 16.2, 1196.


You're probably using a build with --with-system-tzdata pointing
at a system tzdata tree that includes leap-second-aware zones.
These tend to have duplicative entries like "America/New_York"
and "posix/America/New_York".  (There's also a subtree like
"right/America/New_York", but we reject those because we don't
do leap seconds.)  The real number of distinct zones in a
standard tzdata file set these days is a shade under 600.


It's the PGDG package running on Ubuntu 22.04.



    regards, tom lane


--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Failure of postgres_fdw because of TimeZone setting

2024-04-10 Thread Adrian Klaver

On 4/10/24 12:38, Adnan Dautovic wrote:

Hi,

On 05. Apr 2024, at 16:13, Tom Lane  wrote:

Adnan Dautovic  writes:



By the way, the row count of pg_timezone_names is 385, but I do
not know how that compares to a more standard installation.


On my instance of Postgres 16.2, 1196.



Kind regards,

Adnan Dautovic




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Some advice need after a 20 year gap after Ingres/GUIs

2024-04-10 Thread Adrian Klaver

On 4/10/24 03:11, John Bateson wrote:

Good morning,

The end of my working life was taken up with developing and supporting a 
Scientific Research establishment near Warrington in the UK. I had a 
small team of programmers who did an excellent job for me and with me.


The software was Ingres and the main program supporting user 
administration on a Synchrotron was built under OpenROAD, with other 
developments around the on-site stores and finance using ABF, since the 
stores workers found this much faster than a GUI! There was also some 
web development using .NET practices.


This was a quite complex system in the end with nearly 200 tables.

We ended up using Ingres Replicator with the intention of running it 
from two sites 180 miles apart – at the time it was a pile of 
*notquitegoodenough*! This was early this century so Actian may have 
made some improvements since then…


So much for the background to establish that I am not a complete newbie, 
just out of the loop for a while.


*Using Postgres and PGAdmin -4*.

So, 20 years later I am developing (unpaid) a new project for some 
historic railways in the UK and linking these to the development and 
design of a range of kits for those interested in model railways. This 
is getting towards 20 tables so far.


What I really need is a recommendation for the current best practice 
for an easy GUI that will allow me to press a button without opening up 
PGAdmin or a black screen, i.e. the traditional .EXE file I can put on a 
button on the screen on a Windows 11 based system.


With Postgres or another similar client-server database there will be a 
need to set up the database server somewhere first, before you get to 
the GUI interface part.


Are you planning on a single instance of Postgres that folks log in to 
from multiple locations?


Or do want a stand alone setup that each user has on their machine?



While my programming history goes back to MDBS-4 and beyond, I know I 
have some catching up to do and, while not a complete newbie, need 
something I can work on quickly, intuitively and inexpensively!


So, recommendation and possibly some consensus would be very much 
appreciated.


And apologies if I have taken up too much of your time or have placed 
this in the wrong forum.


John



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Issue with date/timezone conversion function

2024-04-09 Thread Adrian Klaver

On 4/9/24 11:24, Lok P wrote:


On Tue, Apr 9, 2024 at 10:33 PM Tom Lane <mailto:t...@sss.pgh.pa.us>> wrote:




'EST' is going to rotate to UTC-5, but that's probably not what
you want in the summer.  I'd suggest AT TIME ZONE 'America/New_York'
or the like.  See

https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-TIMEZONES 
<https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-TIMEZONES>

                         regards, tom lane



  Thank you so much. You are correct. The AT TIME ZONE 
'America/New_York' is giving correct EST time conversion.


But regarding why it looks to be shifting, i.e. the same time window appears to hold a different count of transactions even though the base table is not getting updated/inserted/deleted for its 
historical create_timestamps, I suspect the conversion part below.


The task is to count transactions in each ~15-minute window and publish them in an 
ordered fashion, i.e. something like below, but the way it's been written 
seems wrong. It's an existing script. It first gets the timestamp truncated 
to the hour and then adds a minutes component to it to make 
~15-minute intervals. Can it be written in some simpler way?


9-apr-2024 14:00     12340
9-apr-2024 14:15     12312
9-apr-2024 14:30     12323
9-apr-2024 14:45     12304

DATE_TRUNC('hour', create_timestamp AT TIME ZONE 'America/New_York') +
(EXTRACT(MINUTE FROM create_timestamp AT TIME ZONE 'America/New_York') / 15 * 15) * INTERVAL '15 minute'


Something like?:

create table dt_bin_test(id integer, tz_fld timestamptz);

insert into dt_bin_test values(1, '2024-04-01 8:15'), (2, '2024-04-01 
9:01'), (3, '2024-04-01 9:16'), (4, '2024-04-01 9:45'), (5, '2024-04-01 
8:15'), (6, '2024-04-01 9:01');


select count(tz_fld), date_bin('15 minutes', tz_fld, '2024-01-01') as 
bin from dt_bin_test group by date_bin('15 minutes', tz_fld, '2024-01-01');


 count |          bin
-------+------------------------
     2 | 2024-04-01 09:00:00-07
     2 | 2024-04-01 08:15:00-07
     1 | 2024-04-01 09:15:00-07
     1 | 2024-04-01 09:45:00-07
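
Applied to the case above, it might look something like this (a sketch only, assuming 
the transaction_tab/create_timestamp names mentioned earlier in the thread):

select date_bin('15 minutes', create_timestamp, timestamptz '2024-01-01')
         at time zone 'America/New_York' as bucket_est,
       count(*)
from transaction_tab
group by 1
order by 1;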



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Issue with date/timezone conversion function

2024-04-09 Thread Adrian Klaver




On 4/9/24 9:16 AM, Lok P wrote:


On Tue, Apr 9, 2024 at 9:26 PM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:


On 4/9/24 08:43, Lok P wrote:
 > Hi All,
 > It's version 15.4 of postgresql database. Every "date/time" data
type
 > attribute gets stored in the database in UTC timezone only. One
of the
 > support persons local timezone is "asia/kolkata" and  that support
 > person needs to fetch the count of transactions from a table-
 > transaction_tab and share it with another person/customer who is
in the
 > EST timezone, so basically the transaction has to be shown or
displayed
 > the EST timezone.

What is the datatype for the create_timestamp?

What does SHOW timezone; return on the server?



Thank you for the quick response.

The column data type for "create_timestamp" is "timestamptz'.
Show timezone from the support users client machine UI showing  
"Asia/Calcutta".
Not having access to run "Show timezone" on the server currently, I will 
try to get it.
output from pg_setting showing setting as "Asia/Calcutta', reset_val as 
"Asia/Calcutta", boot_val as "GMT"


In the pg_settings query what are the source, sourcefile, sourceline 
fields set to?




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Issue with date/timezone conversion function

2024-04-09 Thread Adrian Klaver

On 4/9/24 08:43, Lok P wrote:

Hi All,
It's version 15.4 of postgresql database. Every "date/time" data type 
attribute gets stored in the database in UTC timezone only. One of the 
support persons local timezone is "asia/kolkata" and  that support 
person needs to fetch the count of transactions from a table- 
transaction_tab and share it with another person/customer who is in the 
EST timezone, so basically the transaction has to be shown or displayed 
the EST timezone.


What is the datatype for the create_timestamp?

What does SHOW timezone; return on the server?



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: [Code: 0, SQL State: 0A000] when "typing" from pg_catalog

2024-04-09 Thread Adrian Klaver

On 4/9/24 08:12, Thiemo Kellner wrote:

Thanks for taking this up.

Am 09.04.2024 um 17:09 schrieb Adrian Klaver:

On 4/9/24 07:59, Thiemo Kellner wrote:
[Code: 0, SQL State: 0A000] ERROR: References to other databases are 
not implemented: pg_catalog.pg_roles.rolname

    Position: 298 [Script position: 334 - 361]

PG_CATALOG.PG_ROLES.ROLNAME := 'snowrunner_reader';

Is being seen as DB_NAME.TABLE_NAME.COLUMN_NAME.


Yes, obviously, but why? With the information_schema view all is fine. 
And, I suppose, with all other objects in other schemas of the same 
database too.


Because you did not do?:

PG_CATALOG.PG_ROLES.ROLNAME%type
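
That is, a sketch of the declaration with %type appended, which turns it into a 
column type reference rather than a db.table.column reference:

     C_ROLE_NAME constant PG_CATALOG.PG_ROLES.ROLNAME%type := 'snowrunner_reader';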
--
Adrian Klaver
adrian.kla...@aklaver.com





Re: [Code: 0, SQL State: 0A000] when "typing" from pg_catalog

2024-04-09 Thread Adrian Klaver

On 4/9/24 07:59, Thiemo Kellner wrote:

Hi

I have the following function code. When trying to install, it gives me

[Code: 0, SQL State: 0A000]  FEHLER: Verweise auf andere Datenbanken 
sind nicht implementiert: pg_catalog.pg_roles.rolname

   Position: 298  [Script position: 334 - 361]


[Code: 0, SQL State: 0A000] ERROR: References to other databases are not 
implemented: pg_catalog.pg_roles.rolname

   Position: 298 [Script position: 334 - 361]


To the best of my knowledge, pg_catalog is a schema not a database, like 
information_schema. Am I missing something? And why is it not allowed to 
type from the catalogue?


I presume, this example is rather academic due to the name type.


PG_CATALOG.PG_ROLES.ROLNAME := 'snowrunner_reader';

Is being seen as DB_NAME.TABLE_NAME.COLUMN_NAME.





Kind regards

Thiemo


create or replace function GRANT_SELECTS()
returns void
language plpgsql
as
$body$
     declare
     C_SCHEMA_NAME   constant 
INFORMATION_SCHEMA.SCHEMATA.SCHEMA_NAME%type :=

   'snowrunner';
--    C_ROLE_NAME constant    name :=
     C_ROLE_NAME constant    PG_CATALOG.PG_ROLES.ROLNAME :=
   'snowrunner_reader';
     V_SQL_STATEMENT text;
     begin
     -- Check the existance of the schema
     perform 1
     from INFORMATION_SCHEMA.SCHEMATA
     where SCHEMA_NAME = C_SCHEMA_NAME;
     if not found then
     raise exception 'Schema "%s" does not exist!', C_SCHEMA_NAME;
     end if;

     -- Check the existance of the role
     perform 1
     from PG_CATALOG.PG_ROLES
     where ROLNAME = C_ROLE_NAME;
     if not found then
     raise exception 'Role "%s" does not exist!', C_ROLE_NAME;
     end if;

     -- Issue grants
     V_SQL_STATEMENT := format('grant select on all tables in schema 
%i to %i', C_SCHEMA_NAME, C_ROLE_NAME);

     raise info '%', V_SQL_STATEMENT;
     execute V_SQL_STATEMENT;
     V_SQL_STATEMENT := format('grant select on all views in schema 
%i to %i', C_SCHEMA_NAME, C_ROLE_NAME);

     raise info '%', V_SQL_STATEMENT;
     execute V_SQL_STATEMENT;
     V_SQL_STATEMENT := format('grant select on all materialized 
views in schema %i to %i', C_SCHEMA_NAME, C_ROLE_NAME);

     raise info '%', V_SQL_STATEMENT;
     execute V_SQL_STATEMENT;
     commit;

     return;
     end;
$body$;




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Regarding: Replication of TRUNCATE commands is not working

2024-04-09 Thread Adrian Klaver

On 4/9/24 05:07, Arvind Raghuwanshi wrote:

Hi Laurenz,
Thanks for the response
Question: What PostgreSQL version are you using?  The feature was 
introduced in v11.

Answer: I am using the 16.0 Postgresql version.
db1=> SELECT version();
                                              version
-----------------------------------------------------------------------------------------------------
 PostgreSQL 16.0 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
(1 row)

Question: How exactly is the publication defined?  Perhaps TRUNCATE is 
excluded.
I am not using the subscribe/publication model, but I have created a 


To be clear you have not done CREATE PUBLICATION on the source machine, 
correct?


What is the rest of your configuration per:

https://www.postgresql.org/docs/current/logical-replication-config.html

Also, what exactly is being TRUNCATEd?


replication slot on the source database and then i am calling below 
procedure to get the details from replication slot:
select data from pg_logical_slot_peek_changes('db1_slot', NULL, NULL, 
'pretty-print', '1');


In the case of TRUNCATE, the above procedure does not show any 
records. However, this procedure does show insert, update and delete events.


As you mentioned TRUNCATE is excluded, is there any way to exclude 
TRUNCATE ?


Thanks
Arvind

On Tue, Apr 9, 2024 at 4:08 PM Laurenz Albe <mailto:laurenz.a...@cybertec.at>> wrote:


On Tue, 2024-04-09 at 14:50 +0530, Arvind Raghuwanshi wrote:
 > I have tried to run the TRUNCATE command  and found out that it's
not getting replicated using logical replication for pgsql.
 > I have also checked the schema change using pg_dump command but
the schema change also not getting detected for TRUNCATE command.
 >
 > However on pgsql logical replication doc page[1] , it's mentioned
that Replication of TRUNCATE commands is supported.
 >
 > [1] PostgreSQL: Documentation: 16: 31.6. Restrictions
 >
 > Any idea how we can solve this?

What PostgreSQL version are you using?  The feature was introduced
in v11.
How exactly is the publication defined?  Perhaps TRUNCATE is excluded.

Yours,
Laurenz Albe



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: PEM install error

2024-04-08 Thread Adrian Klaver

On 4/8/24 07:24, mark bradley wrote:
While installing PostgreSQL I am getting this error message during the 
PEM server portion.  Do I need the PEM server?  If so, what is the solution?




PEM Server is an EDB product:

https://www.enterprisedb.com/docs/pem/latest/managing_pem_server/

You should reach out to their tech support.



Thanks!



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Moving delta data faster

2024-04-06 Thread Adrian Klaver

On 4/6/24 13:04, yudhi s wrote:


On Sat, Apr 6, 2024 at 10:25 PM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:



Your original problem description was:

"Then subsequently these rows will be inserted/updated based on the
delta number of rows that got inserted/updated in the source database.
In some cases these changed data can flow multiple times per day to the
downstream i.e. postgres database and in other cases once daily."

If the above is not a hard rule, then yes up to some point just
replacing the data in mass would be the simplest/fastest method. You
could cut a step out by doing something like TRUNCATE target_tab and
then COPY target_tab FROM 'source.csv' bypassing the INSERT INTO
source_tab.

Yes, actually I didn't realize that TRUNCATE TABLE is transactional/online 
here in Postgres. In other databases like Oracle it means downtime for the 
read queries on the target table, as the data vanishes from the 
target table after the truncate (until the data load happens), and those 
truncates auto-commit. Thanks Veem for sharing that option.


  I also think that truncate will be faster if the changes/delta are 
large, but if it's a handful of rows, like <5% of the rows in the table, then 
upsert/merge will perform better. Also, the downside of the 
truncate option is that it requires exporting all the data from the 
source to the S3 file, which may take longer compared to bringing just 
the delta records. Correct me if I'm wrong.


Since you still have not specified how the data is stored in S3 and how 
you propose to move them into Postgres I can't really answer.




However I am still not able to understand why the upsert is less 
performant than merge, could you throw some light on this please?




I have no idea how this works in the code, but my suspicion is it is due
to the following:

https://www.postgresql.org/docs/current/sql-insert.html#SQL-ON-CONFLICT

"The optional ON CONFLICT clause specifies an alternative action to 
raising a unique violation or exclusion constraint violation error. For 
each individual row proposed for insertion, either the insertion 
proceeds, or, if an arbiter constraint or index specified by 
conflict_target is violated, the alternative conflict_action is taken. 
ON CONFLICT DO NOTHING simply avoids inserting a row as its alternative 
action. ON CONFLICT DO UPDATE updates the existing row that conflicts 
with the row proposed for insertion as its alternative action."


vs this:

"First, the MERGE command performs a join from data_source to 
target_table_name producing zero or more candidate change rows. For each 
candidate change row, the status of MATCHED or NOT MATCHED is set just 
once, after which WHEN clauses are evaluated in the order specified. For 
each candidate change row, the first clause to evaluate as true is 
executed. No more than one WHEN clause is executed for any candidate 
change row."


Where ON CONFLICT attempts the INSERT and then on failure does the UPDATE 
for the ON CONFLICT DO UPDATE case. MERGE, on the other hand, evaluates based on 
the join condition (ON tbl1.fld = tbl2.fld) and then, based on MATCHED/NOT 
MATCHED, takes the appropriate action for the first WHEN clause that matches. In other 
words it goes directly to the appropriate action.
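
As a side-by-side sketch, using made-up tables target_tab(id primary key, val) and 
stage_tab with the same columns rather than anything from your schema:

-- upsert: every staged row attempts an INSERT first
INSERT INTO target_tab (id, val)
SELECT id, val FROM stage_tab
ON CONFLICT (id) DO UPDATE SET val = EXCLUDED.val;

-- merge: rows are routed by the join condition
MERGE INTO target_tab t
USING stage_tab s ON t.id = s.id
WHEN MATCHED THEN
  UPDATE SET val = s.val
WHEN NOT MATCHED THEN
  INSERT (id, val) VALUES (s.id, s.val);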


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Moving delta data faster

2024-04-06 Thread Adrian Klaver

On 4/6/24 08:47, yudhi s wrote:

Thank you Adrian, Greg and Veem.

I tried writing a small routine to see how the performance differs in 
these four approaches i.e. Upsert VS traditional update+insert VS Merge 
vs Truncate+load.


Initially I was thinking Upsert will perform the same as Merge as the 
logic looks similar but it seems it's the worst performing among all, 
not sure why , yet to know the reason though. Truncate+ load seems to be 
the best performing among all. Hope i am doing it correctly. Please 
correct me if I'm wrong.


Your original problem description was:

"Then subsequently these rows will be inserted/updated based on the 
delta number of rows that got inserted/updated in the source database. 
In some cases these changed data can flow multiple times per day to the 
downstream i.e. postgres database and in other cases once daily."


If the above is not a hard rule, then yes up to some point just 
replacing the data in mass would be the simplest/fastest method. You 
could cut a step out by doing something like TRUNCATE target_tab and 
then COPY target_tab FROM 'source.csv' bypassing the INSERT INTO source_tab.
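
A rough sketch of that shortcut, assuming the extract is available on the server as 
source.csv (the path is made up); note TRUNCATE holds an ACCESS EXCLUSIVE lock until COMMIT:

BEGIN;
TRUNCATE target_tab;
COPY target_tab FROM '/path/to/source.csv' WITH (FORMAT csv, HEADER true);
COMMIT;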




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Failure of postgres_fdw because of TimeZone setting

2024-04-05 Thread Adrian Klaver

On 4/5/24 02:39, Adnan Dautovic wrote:

Dear Adrian,

Adrian Klaver  wrote:
Define 'read-only', especially as it applies to the privileges on the 
public schema.


I am not quite sure which information you are looking for
exactly. According to this [1], I ran the following query:

WITH "names"("name") AS (
   SELECT n.nspname AS "name"
     FROM pg_catalog.pg_namespace n
   WHERE n.nspname !~ '^pg_'
     AND n.nspname <> 'information_schema'
) SELECT "name",
   pg_catalog.has_schema_privilege(current_user, "name", 'CREATE') AS
"create",
   pg_catalog.has_schema_privilege(current_user, "name", 'USAGE') AS 
"usage"

     FROM "names";

And recieved the following result:

"name"    "create"    "usage"
"public"    true    true


Looks alright. The below is the issue.




Per Tom Lane's comments on timezone, log into the remote server and do:

SHOW timezone;

Europe/Berlin


SET timezone = 'etc/UTC';

ERROR: invalid value for parameter "TimeZone": "etc/UTC"
SQL state: 22023


SET timezone = 'UTC';

ERROR: invalid value for parameter "TimeZone": "UTC"
SQL state: 22023

However, this lead me to [2] and I find the output very
interesting:

SELECT * FROM pg_timezone_names ORDER BY name;


The below is cut down from the actual output as there should be at least:

Europe/Berlin  CEST  02:00:00 t

present also?




"name"    "abbrev"    "utc_offset"    "is_dst"
"Turkey"    "+03"    "03:00:00"    false
"UCT"    "UCT"    "00:00:00"    false


Hmm I get:

UCT   UTC  00:00:00  f

could be version difference though.


"Universal"    "UTC"    "00:00:00"    false
"W-SU"    "MSK"    "03:00:00"    false



And then attempting

SET timezone = 'Universal';


SET
Query returned successfully in 100 msec.


Any ideas on how to proceed?


1) For the long term, contact whoever is in charge of the remote server 
and ask them what they have done with the time zones, why, and whether they can 
fix it.


2) In short term per the link from your first post and with no guarantees:

https://github.com/postgres/postgres/blob/936e3fa3787a51397280c1081587586e83c20399/contrib/postgres_fdw/connection.c#L677

In the source code change

do_sql_command(conn, "SET timezone = 'UTC'");

to

do_sql_command(conn, "SET timezone = 'Universal'");

As from the link: "Set remote timezone; this is basically just cosmetic"

Then recompile the extension.



Kind regards,

Adnan Dautovic


[1]: https://stackoverflow.com/a/36095257
[2]: https://stackoverflow.com/a/32009497



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Moving delta data faster

2024-04-04 Thread Adrian Klaver

On 4/4/24 13:42, yudhi s wrote:


On Thu, Apr 4, 2024 at 9:04 PM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:


On 4/3/24 22:24, yudhi s wrote:
 >
 > On Thu, Apr 4, 2024 at 10:16 AM Adrian Klaver
mailto:adrian.kla...@aklaver.com>
 > <mailto:adrian.kla...@aklaver.com
<mailto:adrian.kla...@aklaver.com>>> wrote:

S3 is not a database. You will need to be more specific about '... then
from the S3 it will be picked and gets merged to the target postgres
database.'


The data from S3 will be dumped into the stage table and then the 
upsert/merge from that table to the actual table.


The S3 --> staging table would be helped by having the data as CSV and 
then using COPY. The staging --> final table step could be done as 
either ON CONFLICT or MERGE, you would need to test in your situation to 
verify which works better.
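
For example, something along these lines, where the file name and staging table are 
placeholders and the extract has already been pulled down from S3:

COPY stage_tab FROM '/path/to/delta.csv' WITH (FORMAT csv, HEADER true);

after which the staging --> final step is the INSERT ... ON CONFLICT or MERGE form.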


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Moving delta data faster

2024-04-04 Thread Adrian Klaver

On 4/3/24 22:24, yudhi s wrote:


On Thu, Apr 4, 2024 at 10:16 AM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:


On 4/3/24 20:54, yudhi s wrote:
 > On Thu, Apr 4, 2024 at 2:41 AM Adrian Klaver
mailto:adrian.kla...@aklaver.com>
 > <mailto:adrian.kla...@aklaver.com
<mailto:adrian.kla...@aklaver.com>>> wrote:
 >
 >    > Thank you Adrian.
 >
 > And one thing i forgot to mention this target postgresql database
would
 > be on AWS RDS whereas the source Oracle databases is on premise.
I think
 > we don't have the FDW extension currently in place but we can get
that.
 > I am just not able to understand clearly  though, but do you mean
export
 > the data from source using CSV and do truncate and import on
target. And
 > as these data will be moved through the network won't that cause
slowness?
 >
 > The source database here is Oracle database. Correct me if wrong, it
 > looks like foreign data wrapper is like a DB link. Or do you mean
 > writing a query on the target database (which can be UPSERT or
MERGE)
 > but will be joining the table from the source database through the
 > DBlink/DDW? But my question was whether we should use UPSERT or
MERGE
 > for comparing and loading the delta records to the target postgresql
 > database. Want to understand which is more performant , as I see
in the
 > past Merge having performance issues in the past, but not very sure
 > about that.

My motivation was to get some basic information about your setup and
what you are trying to achieve.

If I understand correctly you have:

1) An Oracle database with tables that you want to copy the complete
data from to a Postgres database. For this sort of thing
COPY(https://www.postgresql.org/docs/current/sql-copy.html
<https://www.postgresql.org/docs/current/sql-copy.html>) on the
Postgres end using CSV data generated from the source is probably the
quickest bulk load method.

2) After the initial load you want to do follow up INSERT/UPDATEs based
on a delta of the source tables relative to the initial load. This is
still a bit of a mystery to me. How are you determining the delta: a) On the
source end entirely or b) Target relative to source? Also what is the
anticipated size of the delta per transfer?

Additional information needed:

1) Network distance between source and target?

2) Network capacity?

3) Expected load on both source and target servers from other
operations?


Thank you. Actually I was trying to understand how to handle the delta 
load after the one-time load is done. The delta change in records is 
planned to be found based on the primary keys on the tables. If it finds 
the key it will update the records; if it does not find the key it will 
insert the rows.


Basically the select query from the source database will fetch the data 
for a certain time interval (based on the latest update timestamp or 
create timestamp if they are available, or else a full dump) and put it on 
S3, and then from S3 it will be picked up and merged into the target 
Postgres database. As upsert and merge both look similar, I was 
wondering what we should use here for loading the delta records?


S3 is not a database. You will need to be more specific about '... then 
from the S3 it will be picked and gets merged to the target postgres 
database.'


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Failure of postgres_fdw because of TimeZone setting

2024-04-04 Thread Adrian Klaver

On 4/3/24 22:23, Adnan Dautovic wrote:

Hi everyone,


I have some trouble using postgres_fdw in order to display some data from a 
Postgres database I do not control in a Postgres database that I do control. I 
filled out the form from the wiki below and would appreciate any tips.


* A description of what you are trying to achieve and what results you expect.:
I am trying to import the public schema of a Postgres instance I do not control (I will call it 
"remote"), but have read-only access to, into a Postgres instance I fully control (I will 
call it "local"), using the foreign data wrapper postgres_fdw.


Define 'read-only', especially as it applies to the privileges on the 
public schema.


Per Tom Lane's comments on timezone, log into the remote server and do:

SHOW timezone;

SET timezone = 'etc/UTC';

SET timezone = 'UTC';



Does anyone have an idea for me?

Kind regards,


Adnan Dautovic





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Moving delta data faster

2024-04-03 Thread Adrian Klaver

On 4/3/24 20:54, yudhi s wrote:
On Thu, Apr 4, 2024 at 2:41 AM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:


On 4/3/24 13:38, yudhi s wrote:
 >   Hi All,
 >   It's postgresql database version 15.4. We have a requirement in
which
 > we will be initially moving full table data for 3-4 tables, from
source
 > database to target(i.e. postgres) . Maximum number of rows will be
 > ~10million rows in those tables. Then subsequently these rows
will be
 > inserted/updated based on the delta number of rows that got
 > inserted/updated in the source database. In some cases these changed
 > data can flow multiple times per day to the downstream i.e. postgres
 > database and in other cases once daily.

What is the source database?

Can it be reached with a FDW?:

https://wiki.postgresql.org/wiki/Foreign_data_wrappers
<https://wiki.postgresql.org/wiki/Foreign_data_wrappers>

Can the delta on the source be output as CSV?



Thank you Adrian.

And one thing i forgot to mention this target postgresql database would 
be on AWS RDS whereas the source Oracle databases is on premise. I think 
we don't have the FDW extension currently in place but we can get that. 
I am just not able to understand clearly  though, but do you mean export 
the data from source using CSV and do truncate and import on target. And 
as these data will be moved through the network won't that cause slowness?


The source database here is Oracle database. Correct me if wrong, it 
looks like foreign data wrapper is like a DB link. Or do you mean 
writing a query on the target database (which can be UPSERT or MERGE) 
but will be joining the table from the source database through the 
DBlink/DDW? But my question was whether we should use UPSERT or MERGE 
for comparing and loading the delta records to the target postgresql 
database. Want to understand which is more performant , as I see in the 
past Merge having performance issues in the past, but not very sure 
about that.


My motivation was to get some basic information about your setup and 
what you are trying to achieve.


If I understand correctly you have:

1) An Oracle database with tables that you want to copy the complete 
data from to a Postgres database. For this sort of thing 
COPY(https://www.postgresql.org/docs/current/sql-copy.html) on the 
Postgres end using CSV data generated from the source is probably the 
quickest bulk load method.


2) After the initial load you want to do follow up INSERT/UPDATEs based 
on a delta of the source tables relative to the initial load. This is 
still a bit of a mystery to me. How are you determining the delta: a) On the 
source end entirely or b) Target relative to source? Also what is the 
anticipated size of the delta per transfer?


Additional information needed:

1) Network distance between source and target?

2) Network capacity?

3) Expected load on both source and target servers from other operations?

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Moving delta data faster

2024-04-03 Thread Adrian Klaver

On 4/3/24 13:38, yudhi s wrote:

  Hi All,
  It's postgresql database version 15.4. We have a requirement in which 
we will be initially moving full table data for 3-4 tables, from source 
database to target(i.e. postgres) . Maximum number of rows will be 
~10million rows in those tables. Then subsequently these rows will be 
inserted/updated based on the delta number of rows that got 
inserted/updated in the source database. In some cases these changed 
data can flow multiple times per day to the downstream i.e. postgres 
database and in other cases once daily.


What is the source database?

Can it be reached with a FDW?:

https://wiki.postgresql.org/wiki/Foreign_data_wrappers

Can the delta on the source be output as CSV?



  Want to understand , if we should use upsert(insert on conflict) or 
merge statements or anything else in such a scenario so as to persist 
those delta records faster in the target database, while making the 
system online to the users?


Regards
Yudhi


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: What linux version to install ?

2024-04-02 Thread Adrian Klaver

On 4/2/24 16:53, David Gauthier wrote:

This is what we are running...
Red Hat Enterprise Linux Server release 7.9 (Maipo)

In our pandora distrib, I see PG v15.3.  I was wondering if there is 
something even better. Can't seem to pin this down using 
https://www.postgresql.org/download/linux/redhat/ 
<https://www.postgresql.org/download/linux/redhat/>.


1) https://yum.postgresql.org/news/rhel7-postgresql-rpms-end-of-life/

"EOL announcement for RHEL 7

PostgreSQL RPM repo stopped adding new packages to the RHEL 7 repo as of 
Aug 2023, including PostgreSQL 16.


We will maintain older major releases until each major release is EOLed 
by PostgreSQL project. Please visit here for latest release dates for 
each major release.


If you have any questions, please either email to 
pgsql-pkg-...@lists.postgresql.org, or create a ticket at our redmine.

"

2) What is 'pandora distrib'?

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Getting wrong datetime in database using insert into table query.

2024-04-02 Thread Adrian Klaver

On 4/2/24 01:58, Saksham Joshi wrote:

OS: Windows 10
Psycopg version: 2.9.9
Python version: 3.11.6
PostgreSQL version: 11
pip version : 24.0
1: what you did
We are using the 'psycopg2-binary' library to connect to my PostgreSQL 
hosted on Azure. We have created a table named 'apilog' to store our API 
logs using an 'Insert Into table' query. We have specifically added two 
columns named create_date and update_date with the 'timestamp with time 
zone' property enabled. I only update create_date for each log locally 
using Python, and I expected the update_date column to automatically update 
the datetime when the transaction is committed at the end in Python.

2: what you expected to happen
I expected to see the update_date column returning datetime values which are 
similar to the time the transaction is committed in Python; however, 
the value seems to be returning a datetime which is closer to 
the time the DB connection is established.

3: what happened instead
The datetime value in update_date is coming earlier than the create_date 
value of even the very first log, which is creating a discrepancy and 
making it difficult to track the exact time logs are committed into the 
database.


For example:
This query INSERT INTO api_log(log_detail,create_date)
VALUES('example log 1', datetime.datetime.utcnow'),('example log 2', 
datetime.datetime.utcnow')
Should ideally return update_date which is older than 'example log 2' 
create_date but it is returning a datetime which is even earlier than 
'example log 1' create_date.



Read:

https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT
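
The relevant distinction there, as a minimal illustration not tied to your schema: 
now()/CURRENT_TIMESTAMP is fixed at the start of the transaction, while 
clock_timestamp() keeps moving, so a column default based on the former reflects when 
the transaction began, not when the row was committed.

BEGIN;
SELECT now() AS txn_start, clock_timestamp() AS wall_clock;
SELECT pg_sleep(2);
SELECT now() AS still_txn_start, clock_timestamp() AS wall_clock_has_moved;
COMMIT;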

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Query on Postgres SQL transaction

2024-03-30 Thread Adrian Klaver

On 3/30/24 03:14, Bandi, Venkataramana - Dell Team wrote:

Hi,

Clarifying the problem statement again: multiple requests are getting to our 
application server and using the Hibernate framework to persist the data into 
the Postgres SQL DB, but for one of the requests the data is not persisting a few 
times, even though it is a similar request to the other requests and there are no 
differences between these requests in terms of OS, network, etc.


You have fallen into the trap of working the problem (repeating the 
problem definition) instead of working the solution, that is, taking the steps to 
solve it. Start with acknowledging that this "...there are no 
differences b/w these requests ..." is not the case, otherwise we would 
not be having this discussion. This is going to involve working through 
the process from the front end to the database.


Start with:

How do you know which data is not persisting?

Did the INSERT entry you show below persist?




We have enabled below properties in postgresql.conf file and verified but 
didn't get any findings about the transaction and below log statements are 
writing in our data store logs.

log_statement = 'all'
logging_collector = on
log_min_messages = debug5
log_min_error_statement = debug5

2024-02-19 15:21:54.850 +08 [1876] LOG:  execute S_48: insert into xxx 
(f_schedule_name,f_id,f_totaldataredtn,f_invalidationtime,f_statuscode,f_module,f_app_type,f_dbbackuptype,f_is_compressed,f_proxy,f_size,f_sizeprotected,f_groupjobid,f_status,f_bytesmodifiednotsent,f_sizetransferredboffset,f_bytesmodifiedsent,f_errcode,f_jobid2,f_media_server,f_starttime,f_storageid,f_pool,f_queuestart,f_sizescannedboffset,f_errorcodesummary,f_ncopies,f_sizeprotectedboffset,f_snap_target_platform,f_backup_servername,f_nfiles,f_expiry,f_owner,f_policy_id,f_parentjobid,f_sub_name,f_completion_status,f_endtime,f_filesscanned,f_idle_wait,f_storage_unit,f_group_id,f_backup_set,f_ntries,f_job_name,f_level,f_agent_name,f_failed_copies,f_restarted_job,f_success_copies,f_domain_id,f_snap_target,f_jobid,f_request_id,f_pluginname,f_sizetransferred,f_is_snap,f_node_id,f_workflow_id,f_action_name,f_agent_id,f_instancename,f_session,f_totalobjdedup,f_changedbytes,f_sizeboffset,f_dedupredtn,f_statuscodesummary,f_workflow_jobid,f_snap_policy,f_size_copies,f_sizescanned,f_sub_id,f_archive_flag,f_nfilesnot,f_media_wait,f_snap_creation,f_effective_path)
 values 
($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50,$51,$52,$53,$54,$55,$56,$57,$58,$59,$60,$61,$62,$63,$64,$65,$66,$67,$68,$69,$70,$71,$72,$73,$74,$75,$76,$77,$78)
2024-02-19 15:21:54.851 +08 [10928] DEBUG:  bind  to 
2024-02-19 15:21:54.852 +08 [10928] DEBUG:  CommitTransaction(1) name: unnamed; 
blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0

Could you let us know any other way to trace out these kind of DB transactions?


Regards,
Venkat


Internal Use - Confidential
-Original Message-----
From: Adrian Klaver 
Sent: Wednesday, March 27, 2024 9:32 PM
To: Bandi, Venkataramana - Dell Team ; Greg Sabino 
Mullane 
Cc: pgsql-general@lists.postgresql.org; Kishore, Nanda - Dell Team 
; Alampalli, Kishore 

Subject: Re: Query on Postgres SQL transaction


[EXTERNAL EMAIL]

On 3/27/24 04:29, Bandi, Venkataramana - Dell Team wrote:

Hi,

As I already mentioned, for this specific node the data does persist, but 
sometimes (randomly) it does not.


How do you know which data is not persisting?



As you mentioned our application doesn't have any restrictions on OS level and 
on network etc.

different OS or OS version, different encoding, different location on the 
network, different data it is working, etc.


I don't understand what the above is saying. Do you mean there are differences 
in these attributes between the nodes or no differences?

Also please do not top post, use either bottom or inline posting per:

https://urldefense.com/v3/__https://en.wikipedia.org/wiki/Posting_style__;!!LpKI!glyl28rP3t6cLe3s9tF3X5_4YU28-qPJsOaPHUjX9F01s4DCAXmZVedjlyKpvXlv-TgG-800u0Drq4lWC2f8CObdQX3ijBWevQ$
 [en[.]wikipedia[.]org]



Regards,
Venkat




--
Adrian Klaver
adrian.kla...@aklaver.com



--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Grants and privileges issue

2024-03-28 Thread Adrian Klaver




On 3/28/24 2:10 PM, sud wrote:

Hi, It's postgres 15.4.

We want to give required privilege to certain users or roles and ensure 
to not to provide any elevated privilege. I have below questions,


I would suggest spending some time here:

https://www.postgresql.org/docs/current/ddl-priv.html

It should answer many of your questions.
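
As a minimal illustration of question 1 below (the role name is made up): USAGE on a 
schema only lets a role look up objects inside that schema, while SELECT on a table 
lets it read that table's rows; both are needed to query schema1.tab1.

grant usage on schema schema1 to report_role;   -- lets report_role resolve objects in schema1
grant select on schema1.tab1 to report_role;    -- lets report_role read rows from that table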



1)I am seeing in many places, we have "usage on schema" privilege given. 
along with "grant select on  to " for the objects of 
the schema (something as below). So I wanted to understand, what exact 
privilege "grant usage on schema  to " will provide 
which the "select on" privilege won't?


grant usage on schema  to ;
grant select on  schema1.tab1 to ;

2)Additionally , when we are trying to give select privilege on "cron" 
and "partman" schema to a role (something as below) , so that anybody 
logging through that role would be able to see/fetch the data from the 
tables inside cron and partman schema. its giving output '/no privileges 
were granted for cron/partman/part_config/' message. And during 
accessing that object from the cron/partman schema through that role, it 
errors out with an access denied message. So I wanted to understand the 
cause of this and how we should fix it , such that anybody logging in 
through that role can see/fetch the data from the cron and partman 
schema tables.


grant select on cron.job to ;
grant select on cron.job_run_details to ;
grant select on partman.part_config to ;

Regards
Sud


--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Table level restore in postgres

2024-03-28 Thread Adrian Klaver

On 3/28/24 11:27, arun chirappurath wrote:

Dear all,

I am a newbie in the Postgres world.

Suppose I have accidentally deleted a table or deleted a few rows; is it 
safe to drop this table and restore just this table from a custom backup 
to the same database?


1) You can, though depending on when you took the backup it might not be 
up to date.


2) Do you have replication(logical or binary) set up?

3) Do you know what was deleted?



Or should I create a new database and restore it there and then migrate 
the data?


That is overkill for a single table.



What is the general methodology used?



One way: create a new table that has the same structure as the one you 
want to restore, do a data-only dump from the backup, rename the table 
name in the dump output to the new table name, restore the data to 
the new table, verify the data, and then transfer all or part of it to 
the existing table.
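
A sketch of that flow, with made-up names (orders as the damaged table, orders_restore 
as the scratch copy, id as its primary key) and assuming a custom-format backup file:

-- data-only dump of the one table from the backup (shell step, indicative only):
--   pg_restore --data-only -t orders -f orders_data.sql backup.dump
CREATE TABLE orders_restore (LIKE public.orders INCLUDING ALL);
-- edit orders_data.sql so it targets orders_restore, load it with psql, verify,
-- then copy back only the rows that are missing:
INSERT INTO public.orders
SELECT r.*
FROM orders_restore r
WHERE NOT EXISTS (SELECT 1 FROM public.orders o WHERE o.id = r.id);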





I tried it in a smaller database and it worked in the same database. However, 
DBeaver was throwing a warning saying the database may get corrupted?


1) DBeaver thinks a lot of things are wrong that are not, I would use
the tools that ship with Postgres; psql, pg_dump, pg_restore, etc.

2) If you want to stay with DBeaver post the actual complete error 
message here.




Thanks,
Arun




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: How to interpret 'depends on' errors in pg_restore?

2024-03-28 Thread Adrian Klaver

On 3/28/24 08:57, Fire Emerald wrote:

Am 28. März 2024 15:00:06 schrieb Tom Lane :


Fire Emerald  writes:

Then i did a pg_restore -d target --verbose -Fc file.dump and saw in the
output this:



5145 0 730750 TABLE subpartitions backends_y2024w03 userA
;        depends on: 237
 and so on ...


That is not an error, it's just verbose display of one of the items
in the dump.


Well, I know it's not an error, but it's everything I got. There was no 
error shown. The command completed, but without anything imported.


Look in the Postgres log to see if there is more information.


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Inquiry on Participating in the Korean Translation Project for PostgreSQL Documentation

2024-03-28 Thread Adrian Klaver

On 3/28/24 07:25, Daniel Gustafsson wrote:

On 28 Mar 2024, at 15:22, Adrian Klaver  wrote:

On 3/28/24 04:56, 김명준 wrote:

Hello,
I am deeply fascinated by the powerful features and flexibility of PostgreSQL 
and wish to share it with more Korean speakers. I am interested in contributing 
to the Korean translation of the PostgreSQL official documentation and would 
like guidance on how to begin this process.
1. 1. I am curious to know if there is an ongoing project for
translating PostgreSQL documentation into Korean or if a new project
needs to be initiated.
2. 2. I would like to inquire about any qualifications required to
participate in the translation work, or how one can get involved.
3. 3. I am interested in understanding if there are any style guides or
glossaries that need to be considered during the translation process
and how to access them.
4. 4. I wonder if there is a platform for collaborating or
communicating with other individuals interested in participating in
the translation project.
I aim to assist more developers and users in understanding and accessing 
PostgreSQL by translating its documentation into Korean. I would appreciate any 
information on the details of this project and how to participate.



Translations being done:

https://babel.postgresql.org/

Translator mailing list:

https://www.postgresql.org/list/pgsql-translators/

Translator Wiki:

https://wiki.postgresql.org/wiki/NLS


Do note the above references are for translating the strings in the program,
and not the documentation which is asked about here.  That being said, those
working on the Korean translations of strings are likely to be a good set of
people to start discussing with as they may have thought about it as well.


Yeah, my mistake.

It might also be useful to ask on:

https://www.postgresql.org/list/pgsql-docs/



--
Daniel Gustafsson



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Inquiry on Participating in the Korean Translation Project for PostgreSQL Documentation

2024-03-28 Thread Adrian Klaver

On 3/28/24 04:56, 김명준 wrote:

Hello,

I am deeply fascinated by the powerful features and flexibility of 
PostgreSQL and wish to share it with more Korean speakers. I am 
interested in contributing to the Korean translation of the PostgreSQL 
official documentation and would like guidance on how to begin this process.


 1. 1. I am curious to know if there is an ongoing project for
translating PostgreSQL documentation into Korean or if a new project
needs to be initiated.
 2. 2. I would like to inquire about any qualifications required to
participate in the translation work, or how one can get involved.
 3. 3. I am interested in understanding if there are any style guides or
glossaries that need to be considered during the translation process
and how to access them.
 4. 4. I wonder if there is a platform for collaborating or
communicating with other individuals interested in participating in
the translation project.

I aim to assist more developers and users in understanding and accessing 
PostgreSQL by translating its documentation into Korean. I would 
appreciate any information on the details of this project and how to 
participate.



Translations being done:

https://babel.postgresql.org/

Translator mailing list:

https://www.postgresql.org/list/pgsql-translators/

Translator Wiki:

https://wiki.postgresql.org/wiki/NLS



Thank you.



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: After 10 -> 15 upgrade getting "cannot commit while a portal is pinned" on one python function

2024-03-27 Thread Adrian Klaver

On 3/27/24 18:00, Jeff Ross wrote:


On 3/27/24 17:35, Rob Sargent wrote:



On 3/27/24 17:05, Jeff Ross wrote:


On 3/27/24 15:44, Tom Lane wrote:


Perhaps "pinned" in the error message means "open"?
No, it means "pinned" ... but I see that plpython pins the portal
underlying any PLyCursor object it creates.  Most of our PLs do
that too, to prevent a portal from disappearing under them (e.g.
if you were to try to close the portal directly from SQL rather
than via whatever mechanism the PL wants you to use).


I added a cursor.close() as the last line called in that function and it
works again.

It looks to me like PLy_cursor_close does pretty much exactly the same
cleanup as PLy_cursor_dealloc, including unpinning and closing the
underlying portal.  I'm far from a Python expert, but I suspect that
the docs you quote intend to say "cursors are disposed of when Python
garbage-collects them", and that the reason your code is failing is
that there's still a reference to the PLyCursor somewhere after the
plpython function exits, perhaps in a Python global variable.

regards, tom lane



Thank you for your reply, as always, Tom!

Debugging at this level might well be over my paygrade ;-)

I just happy that the function works again, and that I was able to 
share a solution to this apparently rare error with the community.


Jeff

My read of Tom's reply suggests you still have work to do to find the 
other "reference" holding on to your cursor.


Yes, my read was the same.

There are exactly 3 references to that cursor now that I added the 
close() at the end.


Here are the first 2 (cursor renamed from the code I posted):

     plpy_cursor = plpy.cursor(schemas_query)
     while True:
     schema_rows = plpy_cursor.fetch(100)


If the above is the complete while loop, how do you expect it to break out 
of the loop?


Or did you do per Postgres docs?:

https://www.postgresql.org/docs/current/plpython-database.html

cursor = plpy.cursor("select num from largetable")
while True:
rows = cursor.fetch(batch_size)
if not rows:
break
for row in rows:
if row['num'] % 2:
odd += 1




The last is:

     plpy_cursor.close()

I don't know how to proceed further.



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: After 10 -> 15 upgrade getting "cannot commit while a portal is pinned" on one python function

2024-03-27 Thread Adrian Klaver

On 3/27/24 16:35, Rob Sargent wrote:



On 3/27/24 17:05, Jeff Ross wrote:


On 3/27/24 15:44, Tom Lane wrote:


Perhaps "pinned" in the error message means "open"?
No, it means "pinned" ... but I see that plpython pins the portal
underlying any PLyCursor object it creates.  Most of our PLs do
that too, to prevent a portal from disappearing under them (e.g.
if you were to try to close the portal directly from SQL rather
than via whatever mechanism the PL wants you to use).


I added a cursor.close() as the last line called in that function and it
works again.

It looks to me like PLy_cursor_close does pretty much exactly the same
cleanup as PLy_cursor_dealloc, including unpinning and closing the
underlying portal.  I'm far from a Python expert, but I suspect that
the docs you quote intend to say "cursors are disposed of when Python
garbage-collects them", and that the reason your code is failing is
that there's still a reference to the PLyCursor somewhere after the
plpython function exits, perhaps in a Python global variable.

regards, tom lane



Thank you for your reply, as always, Tom!

Debugging at this level might well be over my paygrade ;-)

I just happy that the function works again, and that I was able to 
share a solution to this apparently rare error with the community.


Jeff

My read of Tom's reply suggests you still have work to do to find the 
other "reference" holding on to your cursor.


I would start with:

def logging(comment):
global database
<...>

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Query on Postgres SQL transaction

2024-03-27 Thread Adrian Klaver

On 3/27/24 04:29, Bandi, Venkataramana - Dell Team wrote:

Hi,

As I already mentioned, for this specific node the data does persist, but 
sometimes (randomly) it does not.


How do you know which data is not persisting?



As you mentioned our application doesn't have any restrictions on OS level and 
on network etc.

different OS or OS version, different encoding, different location on the 
network, different data it is working, etc.


I don't understand what the above is saying. Do you mean there are 
differences in these attributes between the nodes or no differences?


Also please do not top post, use either bottom or inline posting per:

https://en.wikipedia.org/wiki/Posting_style



Regards,
Venkat




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Query on Postgres SQL transaction

2024-03-25 Thread Adrian Klaver

On 3/25/24 00:18, Bandi, Venkataramana - Dell Team wrote:

Hi,

Please find my inline comments for your questions.


Regards,
Venkat


Internal Use - Confidential
-Original Message-
From: Adrian Klaver 
Sent: Tuesday, March 19, 2024 9:33 PM
To: Bandi, Venkataramana - Dell Team ; Greg Sabino 
Mullane 
Cc: pgsql-general@lists.postgresql.org; Kishore, Nanda - Dell Team 
; Alampalli, Kishore 

Subject: Re: Query on Postgres SQL transaction


[EXTERNAL EMAIL]

On 3/19/24 02:18, Bandi, Venkataramana - Dell Team wrote:

Hi Greg,

We are using hibernate framework to persist the data into Postgres SQL
DB and data is persisting and committing for all the clients but one
of the client data is not inserted into DB.


What is different about that client?
Ans: In our application, data comes from different nodes (systems) and is persisted 
into the Postgres SQL DB, but for one of the nodes the data sometimes does not 
persist (and sometimes it does). We have to trace the transaction to find out why 
the data sometimes does not persist.


That we knew already. What I was after was whether that particular 
node(system) is different in some important way from the others. For 
instance different OS or OS version, different encoding, different 
location on the network, different data it is working, etc.


Define what you have done to trace the path of the transaction.


Are all the clients passing data through the same instance of the framework?
Ans: Since it is a monolithic architecture application, it is running on the same 
instance.
Are you sure that the client is pointed at the correct database?
Ans: Yes, it's pointed at the correct database, and with the same database connection 
data is persisting for other nodes.
Is the log entry below from that client?
Ans: Yes




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Empty materialized view

2024-03-24 Thread Adrian Klaver

On 3/24/24 14:27, Thiemo Kellner wrote:

Feeling quite dumb now. But then, there is no data visible in 
the install session either.


insert data into TASK_DEPENDENCY⠒V
INSERT 0 34


The above says the data was inserted.


But not into the MV but into TASK_DEPENDENCY⠒V.


Where and when was the count query run?






Excerpt of the according protocol:

## tenth level ##
Set materialised view QUERY_PER_TASK⠒MV up
psql:views/QUERY_PER_TASK⠒MV.pg_sql:3: NOTICE:  materialized view 
»query_per_task⠒mv« does not exist, skipping

DROP MATERIALIZED VIEW
SELECT 0
REFRESH MATERIALIZED VIEW
COMMENT
COMMIT


At the point above you have not yet entered the data into the tables the MV depends on, 
so SELECT 0 is reasonable.



# insert data #
## first level ##
insert data into CENTRICITY
INSERT 0 2
COMMIT
insert data into DIRECTION
INSERT 0 8
COMMIT
insert data into GOOD_CLASS
INSERT 0 15
COMMIT
insert data into NODE_TYPE
INSERT 0 3
COMMIT
insert data into REGION
INSERT 0 15
COMMIT
insert data into TASK_TYPE
INSERT 0 5
COMMIT
## second level ##
insert data into AREA
INSERT 0 16
COMMIT
insert data into DISTANCE⠒V
INSERT 0 3
COMMIT
insert data into GOOD⠒V
INSERT 0 164
COMMIT
insert data into MAP⠒V
INSERT 0 41
COMMIT
## third level ##
insert data into DIRECT_NEIGHBOUR
INSERT 0 8
INSERT 0 16
COMMIT
### Scandinavia ###
insert data into NODE⠒V
INSERT 0 112
COMMIT
insert data into PRODUCTION⠒V
INSERT 0 11
COMMIT
insert data into TASK⠒V
INSERT 0 56
COMMIT
## forth level ##
Scandinavia
insert data into DROP_OFF⠒V
INSERT 0 91
COMMIT
insert data into PICK_UP⠒V
INSERT 0 73
COMMIT
insert data into TASK_DEPENDENCY⠒V
INSERT 0 34
COMMIT
  count
---
     66
(1 row)

  count
---
  0
(1 row)


The 0 count above represents the below correct? :

select count(*) from SNOWRUNNER.QUERY_PER_TASK⠒MV;

If so, again that is reasonable as I don't see anywhere you refresh 
QUERY_PER_TASK⠒MV after the underlying tables have data entered. At this 
point it is still at the state you left it at here:


## tenth level ##
Set materialised view QUERY_PER_TASK⠒MV up
psql:views/QUERY_PER_TASK⠒MV.pg_sql:3: NOTICE:  materialized view 
»query_per_task⠒mv« does not exist, skipping

DROP MATERIALIZED VIEW
SELECT 0
REFRESH MATERIALIZED VIEW
COMMENT
COMMIT



COMMIT
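
In other words, once the underlying tables have their data, something like the 
following (using the names from your protocol) is needed before the MV will show rows:

REFRESH MATERIALIZED VIEW SNOWRUNNER.QUERY_PER_TASK⠒MV;
SELECT count(*) FROM SNOWRUNNER.QUERY_PER_TASK⠒MV;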





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Empty materialized view

2024-03-24 Thread Adrian Klaver

On 3/24/24 13:58, Thiemo Kellner wrote:

Am 24.03.2024 um 21:50 schrieb Adrian Klaver:

On 3/24/24 13:36, Thiemo Kellner wrote:
It does depending on the order of viewing. Namely if you viewed the 
'old' empty MV in the outside session before you dropped/created the 
'new' MV and committed the changes.


Something like the viewing session is in a transaction before the 
(re-)creation of the mv?


The viewing session is on autocommit. (Its sole purpose is to query stuff 
and not to have to explicitly terminate transactions due to syntax errors 
and so on.)


Autocommit will only affect actions in that session, it will not make 
the other sessions actions visible. That depends on the other sessions 
committing actions.


See:

https://www.postgresql.org/docs/current/transaction-iso.html




Feeling quite dumb now. But then, there is no data visible in the 
install session either.


insert data into TASK_DEPENDENCY⠒V
INSERT 0 34


The above says the data was inserted.


COMMIT
  count
---
  0



Where and when was the count query run?


(1 row)


Thanks for taking care.


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Empty materialized view

2024-03-24 Thread Adrian Klaver

On 3/24/24 13:36, Thiemo Kellner wrote:

Am 24.03.2024 um 21:30 schrieb Adrian Klaver:

On 3/24/24 13:11, Thiemo Kellner wrote:
Confirmed in the same session that created it or in a different session?


Different session, not knowing what that mattered


It does depending on the order of viewing. Namely if you viewed the 
'old' empty MV in the outside session before you dropped/created the 
'new' MV and committed the changes.




Excerpt of the installation protocol:
…
## tenth level ##
Set materialised view QUERY_PER_TASK⠒MV up
psql:views/QUERY_PER_TASK⠒MV.pg_sql:3: NOTICE:  materialized view 
»query_per_task⠒mv« does not exist, skipping

DROP MATERIALIZED VIEW
SELECT 0
REFRESH MATERIALIZED VIEW
COMMENT
COMMIT
# insert data #
…





select count(*) from SNOWRUNNER.TASK_DEPENDENCY⠒V;


That is not the view you showed in your attached SQL in your previous 
post nor what is mentioned above. Also if I am following your naming 
scheme it is a regular view not a materialized view.








--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Empty materialized view

2024-03-24 Thread Adrian Klaver

On 3/24/24 13:11, Thiemo Kellner wrote:



Am 24.03.2024 um 20:56 schrieb Erik Wienhold:

Maybe you executed REFRESH in a transaction but did not commit it?


While I can see the point for the refresh (but there actually is a 
commit), I cannot hold it valid for a create with data when the mv 
actually is created (confirmed by being empty).


Confirmed in the same session that created it or in a different session?




I can't find any materialized view in your archive.


Oh sh*. That is the file, I forgot to commit. Please find it attached now.


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Empty materialized view

2024-03-24 Thread Adrian Klaver

On 3/24/24 11:12, Thiemo Kellner wrote:

Hi

I have created a materialized view with "with data". And I refreshed it 
with "with data". The query of the mv returns records when executed 
outside the mv. I would appreciate help with respect to what I miss that 
my mv is empty. You might want to have a look at the code attached.


That does not contain the statements mentioned above. Provide a simple 
test case as code inline to your reply.




Kind regards

Thiemo


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Is this a buggy behavior?

2024-03-24 Thread Adrian Klaver

On 3/24/24 08:28, Thiemo Kellner wrote:


Am 24.03.2024 um 16:17 schrieb Tom Lane:


To do that, we'd have to remember that you'd said NULL, which we
don't: the word is just discarded as a noise clause.  Considering
that this usage of NULL isn't even permitted by the SQL standard,
that seems like a bit too much work.


If I understood correctly, only the NOT NULL expression gets remembered, 
but the NULL gets discarded. No, I do not quite get it. Somehow, it has 
to be decided whether to create a "check constraint" or not, but this 
information is not available any more when creating the primary key? Not 
even in some kind of intermediary catalogue?


"Considering that this usage of NULL isn't even permitted by the SQL 
standard" is in my opinion a strange argument. To me, it is similar as 
to say, well a column has a not null constraint and that must be enough, 
we do not check whether the data complies when inserting or updating. 
Sure, my example has lots more side effect than silently do the right 
thing.


That is sort of the point: the OP's example was for a CREATE TABLE and 
hence had no data. The OP also wanted a PK, and per:


https://www.postgresql.org/docs/current/sql-createtable.html

"PRIMARY KEY enforces the same data constraints as a combination of 
UNIQUE and NOT NULL. "


they got a compound PK with the specified constraints.

If they had been doing an ALTER TABLE to add a PK over the columns after 
null values were added, the result would be different:


CREATE TABLE test1
(
c1 varchar(36)   NULL ,
c2 varchar(36)  NOT NULL
) ;

insert into test1 values (null, 'test');

alter table test1 add constraint test_pk PRIMARY KEY(c1,c2);
ERROR:  column "c1" of relation "test1" contains null values



Please do not get me wrong. I can totally understand that something 
needs too much work to implement. I am just puzzled.





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Timing out A Blocker Based on Time or Count of Waiters

2024-03-22 Thread Adrian Klaver

On 3/22/24 12:41, Fred Habash wrote:
Lock tree: All PID's waiting on a lock held by/blocked by single blocker 
PID. Similar to what you see in the output of this script: 
https://github.com/dataegret/pg-utils/blob/master/sql/locktree.sql 
<https://github.com/dataegret/pg-utils/blob/master/sql/locktree.sql> . 
It uses the dot connotation to draw a tree.


Waiters: The PID (first column) returned by this query, for example

SELECT
     activity.pid,
     activity.usename,
     activity.query,
     blocking.pid AS blocking_id,
     blocking.query AS blocking_query
FROM pg_stat_activity AS activity
JOIN pg_stat_activity AS blocking ON blocking.pid = 
ANY(pg_blocking_pids(activity.pid));


DDL example: An 'alter table ... alter column ...' would cause all DML 
and SELECT statements to wait/block.


Hope this answers your question. Thanks for your interest.


Yes.

To me the issue is,

"Facing an issue where sometimes humans login to a database and run DDL 
statements causing a long locking tree of over 1000 waiters."


where the problem is people running disruptive statements without regard 
to planning or what else is happening on the database. I am not sure 
that dropping a statement just based on a count is progress. If the DDL 
is important then it needs to be run at some point and you are 
conceivably back at the same blocking issue. This then leads to two 
possibilities: either the DDL is not important and shouldn't be run, or it 
is and some thought and timing need to be applied before it is run.




On Fri, Mar 22, 2024 at 12:32 PM Adrian Klaver 
mailto:adrian.kla...@aklaver.com>> wrote:


On 3/22/24 09:25, Fred Habash wrote:
 > Facing an issue where sometimes humans login to a database and
run DDL
 > statements causing a long locking tree of over 1000 waiters. As a

The above needs more explanation:

1) Define locking tree.

2) Define waiters.

3) Provide examples of the DDL.


 > workaround, we asked developers to always start their DDL sessions
 > with 'SET lock_timeout = 'Xs'.
 >
 > I reviewed the native lock timeout parameter in Postgres and
found 7.
 > None seem to be related to blocker timeouts directly.
 >
 > idle_in_transaction_session_timeout
 > idle_session_timeout
 > lock_timeout: How long a session waits for a lock
 > statement_timeout
 > authentication_timeout
 > deadlock_timeout
 > log_lock_waits
 >
 > Instead, I put together a quick procedure that counts waiter
sessions
 > for a given blocker and terminates it if waiter count exceeds a
threshold.
 >
 > Is there not a native way to ...
 > 1. Automatically time out a blocker
 > 2. A metric that shows how many waiters for a blocker?
 >
 > Thanks
 > --
     >
 > 
 > Thank you
 >
 >

-- 
Adrian Klaver

adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>



--


Thank you




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Timing out A Blocker Based on Time or Count of Waiters

2024-03-22 Thread Adrian Klaver

On 3/22/24 09:25, Fred Habash wrote:
Facing an issue where sometimes humans login to a database and run DDL 
statements causing a long locking tree of over 1000 waiters. As a 


The above needs more explanation:

1) Define locking tree.

2) Define waiters.

3) Provide examples of the DDL.


workaround, we asked developers to always start their DDL sessions 
with 'SET lock_timeout = 'Xs'.


I reviewed the native lock timeout parameter in Postgres and found 7. 
None seem to be related to blocker timeouts directly.


idle_in_transaction_session_timeout
idle_session_timeout
lock_timeout: How long a session waits for a lock
statement_timeout
authentication_timeout
deadlock_timeout
log_lock_waits

Instead, I put together a quick procedure that counts waiter sessions 
for a given blocker and terminates it if waiter count exceeds a threshold.
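
(Roughly, such a check might look like the sketch below; the 1000-waiter threshold and the names are assumptions, not the poster's actual procedure.)

DO $$
DECLARE
    r record;
BEGIN
    FOR r IN
        SELECT blocker.pid AS blocker_pid, count(*) AS waiters
        FROM pg_stat_activity AS waiting
        JOIN LATERAL unnest(pg_blocking_pids(waiting.pid)) AS blocker(pid) ON true
        GROUP BY blocker.pid
        HAVING count(*) > 1000
    LOOP
        -- terminate any backend blocking more sessions than the threshold
        PERFORM pg_terminate_backend(r.blocker_pid);
    END LOOP;
END;
$$;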


Is there not a native way to ...
1. Automatically time out a blocker
2. A metric that shows how many waiters for a blocker?

Thanks
--


Thank you




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: could not open file "global/pg_filenode.map": Operation not permitted

2024-03-22 Thread Adrian Klaver

On 3/22/24 07:01, Nick Renders wrote:

On 13 Mar 2024, at 12:35, Stephen Frost wrote:



We now have a second machine with this issue: it is an Intel Mac mini running 
macOS Sonoma (14.4) and PostgreSQL 16.2.
This one only has a single Data directory, so there are no multiple instances 
running.

I installed Postgres yesterday and restored a copy from our live database in 
the Data directory. The Postgres process started up without problems, but after 
40 minutes it started throwing the same errors in the log:

2024-03-21 11:49:27.410 CET [1655] FATAL:  could not open file 
"global/pg_filenode.map": Operation not permitted
2024-03-21 11:49:46.955 CET [1760] FATAL:  could not open file 
"global/pg_filenode.map": Operation not permitted
2024-03-21 11:50:07.398 CET [965] LOG:  could not open file 
"postmaster.pid": Operation not permitted; continuing anyway

I stopped and started the process, and it continued working again until around 
21:20, when the issue popped up again. I wasn't doing anything on the machine 
at that time, so I have no idea what might have triggered it.


Have you looked at the OS system logs?



Is there perhaps some feature that I can enable that logs which processes use 
these 2 files?

Thanks,

Nick Renders




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: After 10 -> 15 upgrade getting "cannot commit while a portal is pinned" on one python function

2024-03-20 Thread Adrian Klaver

On 3/20/24 15:52, Jeff Ross wrote:

On 3/20/24 16:25, Adrian Klaver wrote:


On 3/20/24 15:18, Jeff Ross wrote:

Greetings!

I built a trigger fired process that copies an "order" from our 
production database to our dev database.  An order, in this case, is 
an initial row from a table and all of the rows in all of the tables 
in that database/schema that are needed to satisfy all of the foreign 
key constraints for the original insert. Through a web page, one of 
our folks can select a schema and an order id to copy.  That 
information is then inserted into a table.  A trigger attached to 
that table takes care of copying the necessary rows using a function 
that uses both plpython3u and psycopg2. I can supply the source code 
if that will help.


I think that will help, especially the interaction between psycopg2 
and plpython3u.



As requested:

https://openvistas.net/copy_orders_to_dev.html


1) I have not gone through this thoroughly enough to figure out what is 
going on.


2) Things I have noticed, may not be relevant.

a) from psycopg2 import sql
   Never used.

b) #prod_database_connection.set_session(autocommit=True)
   #dev_database_connection.set_session(autocommit=True)

   Why are they commented out?

c) prod_database_connection_string = "host='pgbouncer' dbname='%s' 
application_name = '%s'"


dev_database_connection_string = "host='pgbouncer' dbname='%s' 
application_name = '%s'"


What version of PgBouncer?

How is it setup?

d) Why predefine all those cursors()?

e) Why is database global?



Jeff





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: After 10 -> 15 upgrade getting "cannot commit while a portal is pinned" on one python function

2024-03-20 Thread Adrian Klaver

On 3/20/24 15:52, Jeff Ross wrote:

On 3/20/24 16:25, Adrian Klaver wrote:


On 3/20/24 15:18, Jeff Ross wrote:

Greetings!

I built a trigger fired process that copies an "order" from our 
production database to our dev database.  An order, in this case, is 
an initial row from a table and all of the rows in all of the tables 
in that database/schema that are needed to satisfy all of the foreign 
key constraints for the original insert. Through a web page, one of 
our folks can select a schema and an order id to copy.  That 
information is then inserted into a table.  A trigger attached to 
that table takes care of copying the necessary rows using a function 
that uses both plpython3u and psycopg2. I can supply the source code 
if that will help.


I think that will help, especially the interaction between psycopg2 
and plpython3u.



As requested:

https://openvistas.net/copy_orders_to_dev.html


Haven't had a chance to go through this yet. I'm going to say though 
that Tom Lane is looking for a shorter generic case that anyone could 
run on their system.




Jeff





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: After 10 -> 15 upgrade getting "cannot commit while a portal is pinned" on one python function

2024-03-20 Thread Adrian Klaver

On 3/20/24 15:18, Jeff Ross wrote:

Greetings!

I built a trigger fired process that copies an "order" from our 
production database to our dev database.  An order, in this case, is an 
initial row from a table and all of the rows in all of the tables in 
that database/schema that are needed to satisfy all of the foreign key 
constraints for the original insert.  Through a web page, one of our 
folks can select a schema and an order id to copy.  That information is 
then inserted into a table.  A trigger attached to that table takes care 
of copying the necessary rows using a function that uses both plpython3u 
and psycopg2.  I can supply the source code if that will help.


I think that will help, especially the interaction between psycopg2 and 
plpython3u.



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Dropping a temporary view?

2024-03-20 Thread Adrian Klaver

On 3/20/24 13:00, Celia McInnis wrote:



On Wed, Mar 20, 2024 at 2:15 PM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:


__


On 3/20/24 10:54 AM, Celia McInnis wrote:

Comments below more to sort out the process in my head than anything
else.

Hi Adrian

The only behaviour changed for the debugging was to make the view
non-temporary, so that I could verify in psql that the content of
the view was what I wanted it to be. Debugging CGI software can be
quite difficult, so it's always good to have debugging hooks as a
part of the software - I know that I always have a DEBUG flag
which, if on, prints out all kinds of stuff into a debug file, and
I just had my software set a different name for DEBUG mode's
non-temporary view than I was using for the temporary view, as
advised by Christophe Pettus.


This indicates you are working in different sessions and therefore
creating a regular view to see the same data in all sessions.

Previously this regular view was named the same as the temporary
view you create in the production database.

Now you name that regular view a unique name not to conflict with
the temporary view name(s).


No, unfortunately I didn't do an explain on the slow query - and
it's too late now since the views are removed. However, I never
had a delay when waiting for the view to be created in my web
software, so, I'll just proceed being more careful and hope that
the delay seen was due to some big mess I created.


In your original post you say the delay occurred on a SELECT not a
CREATE VIEW after:

Correct. But the initial CREATE VIEW was done  as a SELECT from the 
database, so if the create view was quick, I thought that the select 
from the view would be equally quick. Is this a faulty assumption?



https://www.postgresql.org/docs/current/sql-createview.html

"CREATE VIEW defines a view of a query. The view is not physically 
materialized. Instead, the query is run every time the view is 
referenced in a query."


In addition the 'canned' query is running against tables(excepting the 
VALUES case) which in turn maybe getting queries(SELECT, INSERT, UPDATE, 
DELETE) from other sources. This means that each SELECT from a view 
could be seeing an entirely different state.


The above is in reference to a regular(temporary or not) view not a:

https://www.postgresql.org/docs/current/sql-creatematerializedview.html

"CREATE MATERIALIZED VIEW defines a materialized view of a query. The 
query is executed and used to populate the view at the time the command 
is issued (unless WITH NO DATA is used) and may be refreshed later using 
REFRESH MATERIALIZED VIEW."





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Dropping a temporary view?

2024-03-20 Thread Adrian Klaver


On 3/20/24 10:54 AM, Celia McInnis wrote:

Comments below more to sort out the process in my head than anything else.

Hi Adrian

The only behaviour changed for the debugging was to make the view 
non-temporary, so that I could verify in psql that the content of the 
view was what I wanted it to be. Debugging CGI software can be quite 
difficult, so it's always good to have debugging hooks as a part of 
the software - I know that I always have a DEBUG flag which, if on, 
prints out all kinds of stuff into a debug file, and I just had my 
software set a different name for DEBUG mode's non-temporary view than 
I was using for the temporary view, as advised by Christophe Pettus.


This indicates you are working in different sessions and therefore 
creating a regular view to see the same data in all sessions.


Previously this regular view was named the same as the temporary view 
you create in the production database.


Now you name that regular view a unique name not to conflict with the 
temporary view name(s).


No, unfortunately I didn't do an explain on the slow query - and it's 
too late now since the views are removed. However, I never had a delay 
when waiting for the view to be created in my web software, so, I'll 
just proceed being more careful and hope that the delay seen was due 
to some big mess I created.


In your original post you say the delay occurred on a SELECT not a 
CREATE VIEW after:


"DROP VIEW tempview;
DROP VIEW

postgresql did that successfully, but when I then did

select * from tempview:

"

Where the select would have been on the regular view named tempview.
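
(For reference, schema-qualifying the names removes that ambiguity, since a session's temporary view lives in its pg_temp schema and is found first for unqualified names; a sketch:)

DROP VIEW IF EXISTS pg_temp.tempview;   -- drops only the session's temporary view
SELECT * FROM public.tempview;          -- reads the regular view, unambiguously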




Thanks,
Celia McInnis

On Wed, Mar 20, 2024 at 1:01 PM Adrian Klaver 
 wrote:


On 3/20/24 09:51, Celia McInnis wrote:
> The view is being used in some web query software that multiple
people
> will be accessing and the contents of the view depend on what
the person
> is querying, so I think that temporary views or tables are a
good idea.
> I change to non-temporary views or tables (in a test version of the
> software which is not web-crawl-able) when I'm trying to debug
things,
> and I guess I have to be careful to clean those up when I switch
back to
> the temporary tables/views.

Why change behavior for the tests? Seems that sort of negates the
value
of the testing.

Have you run EXPLAIN ANALYZE on the problem query?


>
>
>
> On Wed, Mar 20, 2024 at 11:46 AM Adrian Klaver
> mailto:adrian.kla...@aklaver.com>>
wrote:
>
>     On 3/20/24 08:39, Celia McInnis wrote:
>      > Ok, thanks - so I guess that means that if there is both a
>     temporary and
>      > a non temporary view called "tempvie",
>      >
>      > DROP VIEW tempview;
>      >
>      > will remove the 1st tempview found, which with my path is the
>     temporary
>      > one. Is there some reason why it then took 7 minutes to
select
>     from the
>      > non-temporary view tempview after I dropped the temporary
view
>     tempview?
>      >
>      > I have sometimes had some very long query times when
running query
>      > software, and maybe they are resulting from my switching
between
>      > temporary and non-temporary views of the same name while
>     debugging. If
>      > so, is there something I should be doing to clean up any
temporary
>      > messes I am creating?
>
>     What is the purpose of the temp view over the the regular
view process?
>
>     How do they differ in data?
>
>     Is all the above happening in one session?
>
>     Have you run EXPLAIN ANALYZE on the select from the regular
view?
>
>      >
    >      > Thanks,
>      > Celia McInnis
>      >
>
>
>     --
>     Adrian Klaver
> adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>
>

-- 
Adrian Klaver

adrian.kla...@aklaver.com


--
Adrian Klaver
adrian.kla...@aklaver.com


Re: Dropping a temporary view?

2024-03-20 Thread Adrian Klaver

On 3/20/24 09:51, Celia McInnis wrote:
The view is being used in some web query software that multiple people 
will be accessing and the contents of the view depend on what the person 
is querying, so I think that temporary views or tables are a good idea. 
I change to non-temporary views or tables (in a test version of the 
software which is not web-crawl-able) when I'm trying to debug things, 
and I guess I have to be careful to clean those up when I switch back to 
the temporary tables/views.


Why change behavior for the tests? Seems that sort of negates the value 
of the testing.


Have you run EXPLAIN ANALYZE on the problem query?






On Wed, Mar 20, 2024 at 11:46 AM Adrian Klaver 
mailto:adrian.kla...@aklaver.com>> wrote:


On 3/20/24 08:39, Celia McInnis wrote:
 > Ok, thanks - so I guess that means that if there is both a
temporary and
 > a non temporary view called "tempvie",
 >
 > DROP VIEW tempview;
 >
 > will remove the 1st tempview found, which with my path is the
temporary
 > one. Is there some reason why it then took 7 minutes to select
from the
 > non-temporary view tempview after I dropped the temporary view
tempview?
 >
 > I have sometimes had some very long query times when running query
 > software, and maybe they are resulting from my switching between
 > temporary and non-temporary views of the same name while
debugging. If
 > so, is there something I should be doing to clean up any temporary
 > messes I am creating?

What is the purpose of the temp view over the regular view process?

How do they differ in data?

Is all the above happening in one session?

Have you run EXPLAIN ANALYZE on the select from the regular view?

 >
     > Thanks,
 > Celia McInnis
 >


-- 
Adrian Klaver

    adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Dropping a temporary view?

2024-03-20 Thread Adrian Klaver

On 3/20/24 08:39, Celia McInnis wrote:
Ok, thanks - so I guess that means that if there is both a temporary and 
a non-temporary view called "tempview",


DROP VIEW tempview;

will remove the 1st tempview found, which with my path is the temporary 
one. Is there some reason why it then took 7 minutes to select from the 
non-temporary view tempview after I dropped the temporary view tempview?


I have sometimes had some very long query times when running query 
software, and maybe they are resulting from my switching between 
temporary and non-temporary views of the same name while debugging. If 
so, is there something I should be doing to clean up any temporary 
messes I am creating?


What is the purpose of the temp view over the regular view process?

How do they differ in data?

Is all the above happening in one session?

Have you run EXPLAIN ANALYZE on the select from the regular view?



Thanks,
Celia McInnis




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: SSL error on install of PEM during Posgres install

2024-03-20 Thread Adrian Klaver

On 3/20/24 06:26, mark bradley wrote:
I am getting the following error message during install of PEM while 
installing Postgres.



How can I fix this?



What is your connection string?

In postgresql.conf what is ssl set to per below?

https://www.postgresql.org/docs/current/runtime-config-connection.html#RUNTIME-CONFIG-CONNECTION-SSL

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Query on Postgres SQL transaction

2024-03-19 Thread Adrian Klaver

On 3/19/24 02:18, Bandi, Venkataramana - Dell Team wrote:

Hi Greg,

We are using the Hibernate framework to persist the data into the Postgres SQL 
DB and data is persisting and committing for all the clients, but one 
client's data is not inserted into the DB.


What is different about that client?

Are all the clients passing data through the same instance of the framework?

Are you sure that the client is pointed at the correct database?

Is the log entry below from that client?



We are not getting any error/exception in this case. Could you please let us 
know how we can trace this scenario at the transaction level, i.e. whether 
the transaction is committing or not?


We have enabled below properties in postgresql.conf file and verified 
but didn't get any findings about the transaction and below log 
statements are writing in our data store logs.


log_statement = 'all'

logging_collector = on

log_min_messages = debug5

log_min_error_statement = debug5

2024-02-19 15:21:54.850 +08 [1876] LOG:  execute S_48: insert into 
xxx 
(f_schedule_name,f_id,f_totaldataredtn,f_invalidationtime,f_statuscode,f_module,f_app_type,f_dbbackuptype,f_is_compressed,f_proxy,f_size,f_sizeprotected,f_groupjobid,f_status,f_bytesmodifiednotsent,f_sizetransferredboffset,f_bytesmodifiedsent,f_errcode,f_jobid2,f_media_server,f_starttime,f_storageid,f_pool,f_queuestart,f_sizescannedboffset,f_errorcodesummary,f_ncopies,f_sizeprotectedboffset,f_snap_target_platform,f_backup_servername,f_nfiles,f_expiry,f_owner,f_policy_id,f_parentjobid,f_sub_name,f_completion_status,f_endtime,f_filesscanned,f_idle_wait,f_storage_unit,f_group_id,f_backup_set,f_ntries,f_job_name,f_level,f_agent_name,f_failed_copies,f_restarted_job,f_success_copies,f_domain_id,f_snap_target,f_jobid,f_request_id,f_pluginname,f_sizetransferred,f_is_snap,f_node_id,f_workflow_id,f_action_name,f_agent_id,f_instancename,f_session,f_totalobjdedup,f_changedbytes,f_sizeboffset,f_dedupredtn,f_statuscodesummary,f_workflow_jobid,f_snap_policy,f_size_copies,f_sizescanned,f_sub_id,f_archive_flag,f_nfilesnot,f_media_wait,f_snap_creation,f_effective_path) values ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50,$51,$52,$53,$54,$55,$56,$57,$58,$59,$60,$61,$62,$63,$64,$65,$66,$67,$68,$69,$70,$71,$72,$73,$74,$75,$76,$77,$78)


2024-02-19 15:21:54.851 +08 [10928] DEBUG:  bind  to 

2024-02-19 15:21:54.852 +08 [10928] DEBUG:  CommitTransaction(1) name: 
unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0


Regards,

Venkat

*
*

*

Internal Use - Confidential

From:*Greg Sabino Mullane 
*Sent:* Saturday, March 16, 2024 12:07 AM
*To:* Bandi, Venkataramana - Dell Team 
*Cc:* pgsql-general@lists.postgresql.org; Kishore, Nanda - Dell Team 
; Alampalli, Kishore 


*Subject:* Re: Query on Postgres SQL transaction

[EXTERNAL EMAIL]

That's a very vague question, but you can trace exactly what is 
happening by issuing


SET log_statement = 'all';

Ideally at the session level by your application, but can also set it at 
the database and user level. If all else fails, set it globally (i.e. 
postgresql.conf). Turn it off again as soon as possible, it will make 
your logs extremely verbose. But you can track exactly what your 
application is doing.
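
(For reference, the database- and user-level forms mentioned above look like this; the names are placeholders:)

ALTER DATABASE mydb SET log_statement = 'all';
ALTER ROLE app_user SET log_statement = 'all';

-- and to turn it back off:
ALTER DATABASE mydb RESET log_statement;
ALTER ROLE app_user RESET log_statement;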


Cheers,

Greg



--
Adrian Klaver
adrian.kla...@aklaver.com



Re: operator is only a shell - Error

2024-03-19 Thread Adrian Klaver

On 3/19/24 00:02, Rajesh S wrote:
Sorry Adrian, my bad. I'd mentioned "deposit_sub_no='1'" by mistake, 
actually it was "deposit_sub_no=cast(1 as varchar)". This was throwing 


1) Maybe you could explain the logic of taking a number and casting it 
to a string to compare it to a number?


2) select 1::varchar = 1::varchar;
 ?column?
--
 t

So:

deposit_sub_no::varchar = 1::varchar


error "SQL Error [42883]: ERROR: operator does not exist: numeric = 
character varying  Hint: No operator matches the given name and argument 
types. You might need to add explicit type casts.  Position: 19".  Then 
realized that "numeric=text" works but "numeric=varchar" does not.  I 
could resolve the problem by creating the following function and 
operator by commenting "CUMMUTATOR" and "NEGATOR".  Also the error 
"operator is only a shell" also vanished.  I'm just sharing the script 
for your reference.  Thank you very much for your valuable support.


CREATE OR REPLACE FUNCTION public.num_eq_varchar(
     numeric,
     varchar)
     RETURNS boolean
AS 'select case when $2 ~ ''^[0-9\.]+$'' then $1 operator(pg_catalog.=) 
cast($2 as numeric) else $1::varchar = $2 end;'

LANGUAGE SQL IMMUTABLE;

-- Operator: =;

-- DROP OPERATOR IF EXISTS public.= (numeric , varchar);

CREATE OPERATOR public.= (
     FUNCTION = num_eq_varchar,
     LEFTARG = numeric,
     RIGHTARG = varchar,
-- COMMUTATOR = =,
-- NEGATOR = <>,
     RESTRICT = eqsel,
     JOIN = eqjoinsel,
     HASHES, MERGES
);


Thanks,

Rajesh S




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: operator is only a shell - Error

2024-03-18 Thread Adrian Klaver

On 3/18/24 00:05, Rajesh S wrote:
Thank you for your response.  Actually, I was trying to address the 
following query.


select LIEN_AC_NO from deposit_lien where deposit_no='0002114029832' and 
deposit_sub_no='1' and unlien_dt is null and unlien_remarks is null;


In the above query "deposit_sub_no" column is "numeric" type and passing 
'1' (as varchar).  To address this I'd created the function and operator 
as I'd mentioned in the earlier mail.  Even the following query throws 
error after creating the function and operator.


select * from deposit_lien where deposit_no='0002114029832';

ERROR: operator is only a shell: character varying = numeric LINE 1: 
select * from deposit_lien where deposit_no='0002114029832' ^ SQL state: 
42883 Character: 44


In the above query "deposit_no" column is having "varchar" data type.  
But before creating the function and operator it was working fine.  
Tried dropping the same, even though the same error.  How to proceed now?


Not clear to me what the problem is you are trying to solve?

On a stock Postgres install:

select 1::numeric = '1';
 ?column?
--
 t

select '0002114029832'::varchar = '0002114029832';
 ?column?
--
 t





Thanks,

Rajesh S

On 15-03-2024 19:10, Greg Sabino Mullane wrote:

On Fri, Mar 15, 2024 at 6:26 AM Rajesh S  wrote:

I wanted to implement a new "=" (equal) operator with LEFTARG as
numeric and RIGHTARG as varchar.  But after creating the function
and operator, psql shows the error "operator is only a shell:
character varying = numeric

Your operator has numeric on the left and varchar on the right. But 
your query is doing numeric on the RIGHT. Probably want to make a 
matching one to cover both cases.


Cheers,
Greg



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: PostGres ODBC too slow

2024-03-15 Thread Adrian Klaver


On 3/15/24 10:58 AM, vedant patel wrote:

Thanks Adrian,


I have been through the articles but they didn't help. Many of them 
mention this type of issue but not the solution.



That is why you should ask over on the ODBC list, that is where the 
folks who develop and use the driver hang out.


When you do you will need to provide more information, at a minimum:

1) Postgres version.

2) ODBC version.

3) Example code.

4) The benchmark results.



And my main concern is: if the Python library is able to perform the 
operation fast, why is ODBC so slow? And I have to use ODBC for my 
application, which is in PowerBuilder.


Thanks,

On Fri, 15 Mar, 2024, 11:09 pm Adrian Klaver, 
 wrote:



On 3/15/24 3:40 AM, vedant patel wrote:

Hello There,

Recently, we've encountered some performance issues with the ODBC
driver provided by Postgres. Upon investigation, we noticed that
switching from the UNICODE to ANSI driver resulted in performance
improvements for some queries, albeit at the expense of others.

To delve deeper into this matter, I conducted tests using a
Python script with the psycopg2 library, and the results were
significantly better. However, to address this issue
comprehensively, I've explored alternative ODBC drivers available
in the market. While some minor improvements were observed in a
few queries with a different driver, the overall performance
remains a concern.

Given your extensive experience in this area, I would greatly
appreciate your insights and recommendations on which ODBC driver
would be most suitable for our use case. Any advice or
suggestions you could offer would be immensely helpful in
resolving this performance issue.



This will probably get a better answer quicker over at the ODBC list:


https://www.postgresql.org/list/pgsql-odbc/




Let me know in case of any questions or concerns.

Thanks,


-- 
Adrian Klaver

adrian.kla...@aklaver.com


--
Adrian Klaver
adrian.kla...@aklaver.com


Re: Query on Postgres SQL transaction

2024-03-15 Thread Adrian Klaver


On 3/14/24 11:04 PM, Bandi, Venkataramana - Dell Team wrote:


Hi Team,

We are using JPA entities to persists the records into Postgres SQL DB 
and its working for all the nodes but one of the node data is not 
persisting and it’s not giving any DB related errors/exception. We 
just want to trace out this scenario on transaction level whether 
transaction is committing or not.




"... its working for all the nodes but one of the node data is not 
persisting ..." contradict.


Generally when a record does not throw an error, but is not persisted it 
means the transaction was not committed and the changes did not survive 
the session they where done in. Without the actual code it hard to say 
anything more.


Could you please let us know how we can trace out this scenario and 
let us know how transaction logs enabled in Postgres SQL DB.


Regards,

Venkat


Internal Use - Confidential


--
Adrian Klaver
adrian.kla...@aklaver.com


Re: PostGres ODBC too slow

2024-03-15 Thread Adrian Klaver


On 3/15/24 3:40 AM, vedant patel wrote:

Hello There,

Recently, we've encountered some performance issues with the ODBC 
driver provided by Postgres. Upon investigation, we noticed that 
switching from the UNICODE to ANSI driver resulted in performance 
improvements for some queries, albeit at the expense of others.


To delve deeper into this matter, I conducted tests using a Python 
script with the psycopg2 library, and the results were significantly 
better. However, to address this issue comprehensively, I've explored 
alternative ODBC drivers available in the market. While some minor 
improvements were observed in a few queries with a different driver, 
the overall performance remains a concern.


Given your extensive experience in this area, I would greatly 
appreciate your insights and recommendations on which ODBC driver 
would be most suitable for our use case. Any advice or suggestions you 
could offer would be immensely helpful in resolving this performance 
issue.



This will probably get a better answer quicker over at the ODBC list:


https://www.postgresql.org/list/pgsql-odbc/




Let me know in case of any questions or concerns.

Thanks,


--
Adrian Klaver
adrian.kla...@aklaver.com


Re: select results on pg_class incomplete

2024-03-15 Thread Adrian Klaver

On 3/15/24 03:42, Thiemo Kellner wrote:
You solve a problem that no one has. Data belonging together may still 
be divided into schemas in a database. Thus, the metadata is also 
reported and archived individually per database.


I am not sure we are talking about the same problem, but I would be 
surprised to be the only one having experienced filling disks. Maybe I 
am just that old already that disk space has become so cheap the 
problem does not exist any longer.


With respect to metadata and databases: The point is not that I cannot 
see the tables in another schema (I believe, did not check yet), but in 
other databases. While this actually does not matter much, I still hold 


That is backwards, schemas are namespaces within a database you can see 
their contents from the local(database) system catalogs.


it true that a disk getting filled up does not care in which database or 
schema an explosively growing table resides. So, if I have a disk getting 
filled up, I would like to easily get information on the problematic 
structures in one go. With PostgreSQL this does not seem to be possible 
out of the box. I can now query each database separately, or I can 
create auxiliary structures like dblink and views to accommodate a 
"single" query solution. My two cents.





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: select results on pg_class incomplete

2024-03-14 Thread Adrian Klaver

On 3/14/24 09:41, Thiemo Kellner wrote:

Hi

I am trying to access PostgreSQL metadata, possibly in a vain attempt 
to get size data.


I use DbVis with a connection as shown in https://ibb.co/2SDzhXt . I try 
to get information on a regular table "umsaetze". When doing the DbVis 
object I can see them - https://ibb.co/WxMnY2c . If I execute following 
SQL query in DbVis's SQL Commander, the result set is empty - 
https://ibb.co/GngdWLH .


select *
   from PG_CLASS
  where RELNAME = 'umsaetze';

I noticed that the sessions producing the different results are not the 
same - https://ibb.co/wdKcCFc , but seem to connect to different 
databases. The "missing" table is indeed in the budget database.


The connection user is, apart from being member of pg_monitor vanilla - 
https://ibb.co/DGs6sQz and https://ibb.co/8xzHrvP .


It seems that, with respect to custom databases, pg_class only lists 
what is in the database one connects to - https://ibb.co/dbbJVbJ.


As listed on the tin:

https://www.postgresql.org/docs/current/catalogs-overview.html

"Most system catalogs are copied from the template database during 
database creation and are thereafter database-specific. A few catalogs 
are physically shared across all databases in a cluster; these are noted 
in the descriptions of the individual catalogs."


pg_class is not one of the global tables.
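
(For the size question, one sketch of a per-database query; it has to be run while connected to each database in turn, or wrapped in dblink/postgres_fdw calls, since pg_class is per-database:)

SELECT relname,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind = 'r'
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 20;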



template1=> select count(*)
   from PG_CLASS
  where RELNAME = 'umsaetze';
  count
---
  0
(1 row)

template1=> \q

C:\Users\thiemo\AppData\Roaming\MOBAXT~1\home>psql -h hp-slimline-260 -p 
5436 -U monitor budget

psql (11.2, server 16.1 (Debian 16.1-1.pgdg110+1))
WARNING: psql major version 11, server major version 16.
  Some psql features might not work.
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, 
bits: 256, compression: off)

Type "help" for help.

budget=> select count(*)
   from PG_CLASS
  where RELNAME = 'umsaetze';
  count
---
  2
(1 row)

budget=> \q


Is there a possibility to make the user monitor see all the objects of 
the cluster? Background is that I was hoping to create a query to spit 
out the size of tables in the cluster.


Kind regards

Thiemo




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Fwd: Receipt for PostgreSQL US Invoice #1840

2024-03-12 Thread Adrian Klaver

On 3/12/24 19:14, Christophe Pettus wrote:

Oops?





Begin forwarded message:

*From: *Sadie Bella <mailto:bellasadie@gmail.com>>

*Subject: **Fwd: Receipt for PostgreSQL US Invoice #1840*
*Date: *March 12, 2024 at 19:13:40 PDT
*To: *Christophe mailto:x...@dvvent.com>>


-- Forwarded message -
From: mailto:treasu...@postgresql.us>>
Date: Tue, Mar 12, 2024, 7:07 PM
Subject: Receipt for PostgreSQL US Invoice #1840
To: mailto:bellasadie@gmail.com>>


Hello!

We have received your payment for invoice #1840:
PostgreSQL US membership for bellasadie@gmail.com 
<mailto:bellasadie@gmail.com>


You will find your receipt for this payment in the attached file.

Thank you!

PostgreSQL US





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: could not open file "global/pg_filenode.map": Operation not permitted

2024-03-12 Thread Adrian Klaver

On 3/12/24 02:57, Nick Renders wrote:

On 11 Mar 2024, at 16:04, Adrian Klaver wrote:


On 3/11/24 03:11, Nick Renders wrote:

Thank you for your reply Laurenz.
I don't think it is related to any third party security software. We have 
several other machines with a similar setup, but this is the only server that 
has this issue.

The one thing different about this machine however, is that it runs 2 instances 
of Postgres:
- cluster A on port 165
- cluster B on port 164
Cluster A is actually a backup from another Postgres server that is restored on 
a daily basis via Barman. This means that we login remotely from the Barman 
server over SSH, stop cluster A's service (port 165), clear the Data folder, 
restore the latest back into the Data folder, and start up the service again.
Cluster B's Data and service (port 164) remain untouched during all this time. This is 
the cluster that experiences the intermittent "operation not permitted" issue.

Over the past 2 weeks, I have suspended our restore script and the issue did 
not occur.
I have just performed another restore on cluster A and now cluster B is 
throwing errors in the log again.


Since it seems to be the trigger, what are the contents of the restore script?



Any idea why this is happening? It does not occur with every restore, but it 
seems to be related anyway.

Thanks,

Nick Renders




--
Adrian Klaver
adrian.kla...@aklaver.com





...how are A and B connected?


The 2 clusters are not connected. They run on the same macOS 14 machine with a 
single Postgres installation ( /Library/PostgreSQL/16/ ) and their respective 
Data folders are located on the same volume ( 
/Volumes/Postgres_Data/PostgreSQL/16/data and 
/Volumes/Postgres_Data/PostgreSQL/16-DML/data ). Beside that, they run 
independently on 2 different ports, specified in the postgresql.conf.



...run them under different users on the system.


Are you referring to the "postgres" user / role? Does that also mean setting up 
2 postgres installation directories?



...what are the contents of the restore script?


## stop cluster A
ssh postgres@10.0.0.1 '/Library/PostgreSQL/16/bin/pg_ctl -D 
/Volumes/Postgres_Data/PostgreSQL/16/data stop'

## save config files (ARC_postgresql_16.conf is included in postgresql.conf and 
contains cluster-specific information like the port number)
ssh postgres@10.0.0.1 'cd /Volumes/Postgres_Data/PostgreSQL/16/data && cp 
ARC_postgresql_16.conf ../ARC_postgresql_16.conf'
ssh postgres@10.0.0.1 'cd /Volumes/Postgres_Data/PostgreSQL/16/data && cp 
pg_hba.conf ../pg_hba.conf'

## clear data directory
ssh postgres@10.0.0.1 'rm -r /Volumes/Postgres_Data/PostgreSQL/16/data/*'

## transfer recovery (this will copy the backup "20240312T040106" and any 
lingering WAL files into the Data folder)
barman recover --remote-ssh-command 'ssh postgres@10.0.0.1' pg 20240312T040106 
/Volumes/Postgres_Data/PostgreSQL/16/data

## restore config files
ssh postgres@10.0.0.1 'cd /Volumes/Postgres_Data/PostgreSQL/16/data && cd .. && 
mv ARC_postgresql_16.conf /Volumes/Postgres_Data/PostgreSQL/16/data/ARC_postgresql_16.conf'
ssh postgres@10.0.0.1 'cd /Volumes/Postgres_Data/PostgreSQL/16/data && cd .. && 
mv pg_hba.conf /Volumes/Postgres_Data/PostgreSQL/16/data/pg_hba.conf'

## start cluster A
ssh postgres@10.0.0.1 '/Library/PostgreSQL/16/bin/pg_ctl -D 
/Volumes/Postgres_Data/PostgreSQL/16/data start > /dev/null'


This script runs on a daily basis at 4:30 AM. It did so this morning and there 
was no issue with cluster B. So even though the issue is most likely related to 
the script, it does not cause it every time.


I'm not seeing anything obvious, caveat I'm on my first cup of coffee.

From your first post:

2024-02-26 10:29:41.580 CET [63962] FATAL:  could not open file 
"global/pg_filenode.map": Operation not permitted
2024-02-26 10:30:11.147 CET [90610] LOG:  could not open file 
"postmaster.pid": Operation not permitted; continuing anyway


For now the only suggestion I have is note the presence, ownership and 
privileges of the above files in the present working setup. Then when it 
fails do the same and see if there is a difference. My hunch is that it is in 
this step:


barman recover --remote-ssh-command 'ssh postgres@10.0.0.1' pg 
20240312T040106 /Volumes/Postgres_Data/PostgreSQL/16/data


If not the step itself then in the process that creates 20240312T040106.
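
(One way to capture that state for comparison, using the cluster B paths mentioned earlier; on macOS, ls -le@ also prints ACLs and extended attributes:)

# run while things are working, and again right after a failure
ls -le@ /Volumes/Postgres_Data/PostgreSQL/16-DML/data/postmaster.pid \
        /Volumes/Postgres_Data/PostgreSQL/16-DML/data/global/pg_filenode.map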




Best regards,

Nick Renders





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: merge with view

2024-03-11 Thread Adrian Klaver

On 3/11/24 13:25, Lorusso Domenico wrote:

Hello guys,
Merge doesn't work on a view; anyone know why?
I mean, merge could be performed in many ways, but essentially it is:

  * join sets
  * if matched update or delete
  * if not matched insert

it doesn't seem relevant whether the set is a table or a view.

Moreover, "insert + on conflict" also doesn't work with a view (in my case 
it is a view on a single table with an INSTEAD OF trigger)


Reply with:

Postgres version

View definition

Example Merge query.

Definition of doesn't work, with actual complete error message.



There is a way to workaround to this issue?

--
Domenico L.

per stupire mezz'ora basta un libro di storia,
io cercai di imparare la Treccani a memoria... [F.d.A.]


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: could not open file "global/pg_filenode.map": Operation not permitted

2024-03-11 Thread Adrian Klaver

On 3/11/24 03:11, Nick Renders wrote:

Thank you for your reply Laurenz.
I don't think it is related to any third party security software. We have 
several other machines with a similar setup, but this is the only server that 
has this issue.

The one thing different about this machine however, is that it runs 2 instances 
of Postgres:
- cluster A on port 165
- cluster B on port 164
Cluster A is actually a backup from another Postgres server that is restored on 
a daily basis via Barman. This means that we login remotely from the Barman 
server over SSH, stop cluster A's service (port 165), clear the Data folder, 
restore the latest back into the Data folder, and start up the service again.
Cluster B's Data and service (port 164) remain untouched during all this time. This is 
the cluster that experiences the intermittent "operation not permitted" issue.

Over the past 2 weeks, I have suspended our restore script and the issue did 
not occur.
I have just performed another restore on cluster A and now cluster B is 
throwing errors in the log again.


Since it seems to be the trigger, what are the contents of the restore 
script?




Any idea why this is happening? It does not occur with every restore, but it 
seems to be related anyway.

Thanks,

Nick Renders




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Question related to partitioning with pg_partman

2024-03-10 Thread Adrian Klaver

On 3/10/24 11:34, sud wrote:


On Sun, Mar 10, 2024 at 11:31 PM Adrian Klaver 
mailto:adrian.kla...@aklaver.com>> wrote:


1) The partition will be across one day(24 hours) it is just the times
may confuse people. Per you example 2024-03-07 00:00:00+00  is the same
time as 2024-03-06 19:00:00-05 for EST. The issue is that the +00 and
-05 maybe ignored. Also it depends on the clients being consistent in
using timestamptz.

2) You still have not answered what the datetime range(not date range)
is that will be queried. If you have the partitions Midnight to
Midnight
UTC and the clients are querying Midnight to Midnight local time the
query will not match the partitions.


  My apologies if I was not able to clearly put the details. Actually, the query 
will always happen on a day basis, i.e. they can query from one day to 15 
days of transactions. But as you rightly pointed out, the partitions can only 
span from midnight to midnight in one timezone, and thus users who 
query the data from another time zone will mostly scan two partitions 
(even if they just query one day's transaction data in their own 
timezone). And I don't see an easy solution for this which will help 
all users across all time zones to scan only a single partition in the 
database when they query data for a single transaction date.


And thus my question was: is it necessary to have the creation of 
partitions happen in the UTC time zone only? And then whatever 
transaction data is inserted by the users from their respective time zones will 
be stored in the database as is and will be queried based on the user 
timezone (it may span multiple partitions though for a single 
user transaction date).



This is going to depend on many things.

1) Partitions are not free they have overhead, which is fine if the 
cost(overhead) is less then the benefits. For details on that see:


https://www.postgresql.org/docs/current/ddl-partitioning.html

and partition parts of

https://www.postgresql.org/docs/current/sql-createtable.html

As part of this there is the consideration of whether daily partitions 
are really what you want?


2) What you hope to get out of the partitioning?

a) If it is confining queries to the partition boundaries then you have 
already stated that is not going to happen.


b) If it is for data pruning purposes, then you have something to 
consider on both ends. Creating/dropping partitions with Midnight to 
Midnight UTC means you will need to consider whether they cover the 
range of datetimes that your users are interested in. In other words 
creating a partition ahead that covers local times that resolve to a UTC 
time in the 'future'. On the back end not dropping a partition until it 
has gone out of scope for everybody.


To answer 1 & 2 you are probably going to need to create a test setup 
and verify how the expected queries are actually going to work with your 
partition scheme.
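
(A minimal starting point for such a test setup, with made-up names and Midnight-to-Midnight UTC boundaries, might be:)

SET timezone = 'UTC';

CREATE TABLE test_timestamp (
    transaction_date timestamptz NOT NULL,
    payload          text
) PARTITION BY RANGE (transaction_date);

-- one day's partition, bounded Midnight to Midnight UTC
CREATE TABLE test_timestamp_p2024_03_07
    PARTITION OF test_timestamp
    FOR VALUES FROM ('2024-03-07 00:00:00+00') TO ('2024-03-08 00:00:00+00');

-- check which partitions a "local day" query touches, e.g. for EST users
EXPLAIN SELECT * FROM test_timestamp
WHERE transaction_date >= '2024-03-07 00:00:00-05'
  AND transaction_date <  '2024-03-08 00:00:00-05';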




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Question related to partitioning with pg_partman

2024-03-10 Thread Adrian Klaver

On 3/10/24 10:51, sud wrote:


On Sun, Mar 10, 2024 at 10:32 PM Adrian Klaver 
mailto:adrian.kla...@aklaver.com>> wrote:


On 3/10/24 05:12, sud wrote:
 >
 > In my example in the first post, I see, if someone connected to a
RDS
 > Postgres database and run the create partition command using
pg_partman
 > by setting the timezone as "UTC", the 7th march partition looks
to be
 > spanned from "7th march midnight" to "8th march midnight", when
queried
 > the partition_experession from the data dictionary view. Which is
correct.
 >
 > And same information if someone querying by setting the timezone
as EST
 > is showing spanning from "6th march 7PM" to "7th March 7PM". And
this
 > can cause sometimes the partition may shift to other days all
together.
 > Similar differences happen if creating the partitions using EST
timezone
 > initially and then querying the data dictionary from UTC timezone.

The above is at odds with your example below which has the correct
values:

2024-03-07 00:00:00+00 = 2024-03-06 19:00:00-05

 >
 > So my question was, if in these types of scenarios, we should
follow a
 > standard approach of setting the timezone as UTC in such a type of
 > global user use case, while the system can persist data from
multiple
 > users sitting across different time zones? So that the
boundary(start
 > and end time) of each of the range partitions will be set as
consistent
 > in one timezone across all the partitioned tables?

You need to first determine what your time frames are going to be?

1) Midnight to Midnight in UTC will be consistent when viewed in
UTC. It
will not be when viewed in other time zone +/- the offset from UTC.

2) Or Midnight to Midnight in the users time zone, in which case the
UTC
values will differ.

You have to decide which of the above is your goal. The bottom line is
by definition the local wall clock time will not equal UTC, GMT
excepted. This comes down to what the purpose of the partitions are? In
other words how do you want to organize the data?

 >
 > And even while inserting the data , should we set the timezone to
first
 > UTC and do the data load ?


 >
 > *** Partition created by pg_partman by setting timezone as UTC
 > ***
 >
 > *UTC*
 > *Partition_name                         Partition_expression*
 > test_timestamp_p2024_03_07 FOR VALUES FROM ('2024-03-07
00:00:00+00') TO
 > ('2024-03-08 00:00:00+00')
 >
 > when queried the partition_expression using EST ..
 >
 > *EST*
 > *Partition_name                         Partition_expression*
 > test_timestamp_p2024_03_07 FOR VALUES FROM ('2024-03-06
19:00:00-05') TO
 > ('2024-03-07 19:00:00-05')
 >
 >
 > *** Partition created by pg_partman by setting timezone as EST
 > ***
 >
 > *EST*
 > *Partition_name                         Partition_expression*
 > test_timestamp2_p2024_03_07 FOR VALUES FROM ('2024-03-07
00:00:00-05')
 > TO ('2024-03-08 00:00:00-05')
 >
 > when queried the partition_expression using UTC ..
 >
 > *UTC*
 > *Partition_name                         Partition_expression*
 > test_timestamp2_p2024_03_07 FOR VALUES FROM ('2024-03-07
05:00:00+00')
 > TO ('2024-03-08 05:00:00+00')
 >
 > ***
 >
 > Also i see both the "setting" and "reset_val" is showing as local
 > timezone only. If we set the timezone to a different value than the
 > local timezone then it gets updated on the "setting".


Our requirement is to have the transaction table partitioned by 
range daily on the transaction_date column (i.e. one midnight to the next 
midnight's transaction data in one partition). The transaction date column 
will be of timestamptz data type. And this application/database might be 
consuming data from users across multiple time zones in future. These 
tables will be queried based on a date range (minimum being ~1 
transaction day) and will also be purged one day-partition at a time.


So for the above I understand it might not be possible to keep the users' 
data restricted to one day's partition in the table considering the users 
will perform transactions across multiple timezones, but we are thinking 
of restricting the database to the UTC timezone irrespective of the users. 
And thus while creating the table partitions, we need to ensure the 
UTC timezone is set, such that the upper and lower boundaries for the 
daily range partitions remain consistent for all. Correct me if my 
understanding

Re: Question related to partitioning with pg_partman

2024-03-10 Thread Adrian Klaver

On 3/10/24 05:12, sud wrote:






'2024-03-07 00:00:00+00' and '2024-03-06 19:00:00-05' are the same time
as is '2024-03-07 00:00:00-05' and '2024-03-07 05:00:00+00'.

Still I would think for sanity sake you would want to stick with UTC.



Thank you so much Adrian.

In my example in the first post, I see, if someone connected to a RDS 
Postgres database and run the create partition command using pg_partman 
by setting the timezone as "UTC", the 7th march partition looks to be 
spanned from "7th march midnight" to "8th march midnight", when queried 
the partition_experession from the data dictionary view. Which is correct.


And same information if someone querying by setting the timezone as EST 
is showing spanning from "6th march 7PM" to "7th March 7PM". And this 
can cause sometimes the partition may shift to other days all together. 
Similar differences happen if creating the partitions using EST timezone 
initially and then querying the data dictionary from UTC timezone.


The above is at odds with your example below which has the correct values:

2024-03-07 00:00:00+00 = 2024-03-06 19:00:00-05
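
(Which is easy to confirm in SQL:)

SELECT '2024-03-07 00:00:00+00'::timestamptz = '2024-03-06 19:00:00-05'::timestamptz;  -- t
SELECT '2024-03-07 00:00:00-05'::timestamptz = '2024-03-07 05:00:00+00'::timestamptz;  -- t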



So my question was, if in these types of scenarios, we should follow a 
standard approach of setting the timezone as UTC in such a type of 
global user use case, while the system can persist data from multiple 
users sitting across different time zones? So that the boundary(start 
and end time) of each of the range partitions will be set as consistent 
in one timezone across all the partitioned tables?


You need to first determine what your time frames are going to be?

1) Midnight to Midnight in UTC will be consistent when viewed in UTC. It 
will not be when viewed in other time zone +/- the offset from UTC.


2) Or Midnight to Midnight in the users time zone, in which case the UTC 
values will differ.


You have to decide which of the above is your goal. The bottom line is 
by definition the local wall clock time will not equal UTC, GMT 
excepted. This comes down to what the purpose of the partitions are? In 
other words how do you want to organize the data?




And even while inserting the data , should we set the timezone to first 
UTC and do the data load ?






*** Partition created by pg_partman by setting timezone as UTC 
***


*UTC*
*Partition_name                         Partition_expression*
test_timestamp_p2024_03_07 FOR VALUES FROM ('2024-03-07 00:00:00+00') TO 
('2024-03-08 00:00:00+00')


when queried the partition_expression using EST ..

*EST*
*Partition_name                         Partition_expression*
test_timestamp_p2024_03_07 FOR VALUES FROM ('2024-03-06 19:00:00-05') TO 
('2024-03-07 19:00:00-05')



*** Partition created by pg_partman by setting timezone as EST 
***


*EST*
*Partition_name                         Partition_expression*
test_timestamp2_p2024_03_07 FOR VALUES FROM ('2024-03-07 00:00:00-05') 
TO ('2024-03-08 00:00:00-05')


when queried the partition_expression using UTC ..

*UTC*
*Partition_name                         Partition_expression*
test_timestamp2_p2024_03_07 FOR VALUES FROM ('2024-03-07 05:00:00+00') 
TO ('2024-03-08 05:00:00+00')


***

Also i see both the "setting" and "reset_val" is showing as local 
timezone only. If we set the timezone to a different value than the 
local timezone then it gets updated on the "setting".


Regards
Sud


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: creating a subset DB efficiently ?

2024-03-09 Thread Adrian Klaver

On 3/8/24 08:22, David Gauthier wrote:

Here's the situation

- The DB contains data for several projects.
- The tables of the DB contain data for all projects (data is not 
partitioned on project name or anything like that)
- The "project" identifier (table column) exists in a few "parent" 
tables with many child... grandchild,... tables under them connected 
with foreign keys defined with "on delete cascade".  So if a record in 
one of the parent table records is deleted, all of its underlying, 
dependent records get deleted too.


How many "... child... grandchild,... tables" ?

Do these tables constitute all the tables in the database?



Related question...
The "delete from par_tbl_a where project <> 'a' " is taking forever.  I 
fear it's because it's trying to journal everything in case I want to 
rollback.  But this is just in the archive DB and I don't mind taking 
the risk if I can speed this up outside of a transaction.  How can I run 
a delete command like this without the rollback recovery overhead ?


I am assuming that at the point you do  "delete from par_tbl_a where 
project <> 'a' " project a is no longer receiving data and its records 
are static. Further assuming there is a PK that you could order by, then 
it would seem the way to go would be to delete in batches as determined 
by the PK.
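
(A rough sketch of one such batch, assuming an integer PK named id; repeat the statement, committing each run, until it reports DELETE 0:)

DELETE FROM par_tbl_a
WHERE id IN (
    SELECT id
    FROM par_tbl_a
    WHERE project <> 'a'
    ORDER BY id
    LIMIT 10000
);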



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Help diagnosing replication (copy) error

2024-03-09 Thread Adrian Klaver

On 3/8/24 22:26, Steve Baldwin wrote:



On Sat, Mar 9, 2024 at 11:06 AM Jeff Ross <mailto:jr...@openvistas.net>> wrote:



RDS is a black box--who knows what's really going on there?  It
would be
interesting to see what the response is after you open a support case.
I hope you'll be able to share that with the list.

This is very mysterious. I logged the case, and before it had been 
picked up by an analyst, the issue somehow resolved itself without me 
doing anything.


Is the case still active?
Can you get information from them about what they saw?

I am not big believer in coincidences, that you reported a problem and 
then the problem disappeared.




I now have 418M+ rows in the table that it got stuck on.

:shrug:

Thanks Adrian and Jeff for responding.

Steve

Jeff








--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Insert with Jsonb column hangs

2024-03-09 Thread Adrian Klaver

On 3/9/24 08:00, kuldeep singh wrote:
Copy may not work in our scenario since we need to join data from 
multiple tables & then convert it to json using row_to_json. This 
json data eventually needs to be stored in a target table.


Per:

https://www.postgresql.org/docs/current/sql-copy.html

"
COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }

<...>

query

A SELECT, VALUES, INSERT, UPDATE, or DELETE command whose results 
are to be copied. Note that parentheses are required around the query.


For INSERT, UPDATE and DELETE queries a RETURNING clause must be 
provided, and the target relation must not have a conditional rule, nor 
an ALSO rule, nor an INSTEAD rule that expands to multiple statements.

"



Will it be better if we break the process into batches of like 10,000 
rows & insert the data in its individual transactions? Or any other 
better solution available ?


On Sat, Mar 9, 2024 at 9:01 PM hector vass <mailto:hector.v...@gmail.com>> wrote:




On Sat, Mar 9, 2024 at 3:02 PM kuldeep singh
mailto:kuldeeparor...@gmail.com>> wrote:

Hi,

We are inserting data close to 1M record & having a single Jsonb
column but query is getting stuck.

We are using insert into select * .. , so all the operations are
within the DB.

If we are running select query individually then it is returning
the data in 40 sec for all rows but with insert it is getting stuck.

PG Version - 15.

What could be the problem here ?

Regards,
KD


insert 1M rows especially JSON that can be large, variable in size
and stored as blobs and indexed is not perhaps the correct way to do
this
insert performance will also depend on your tuning.  Supporting
transactions, users or bulk processing are 3x sides of a compromise.
you should perhaps consider that insert is for inserting a few rows
into live tables ... you might be better using copy or \copy,
pg_dump if you are just trying to replicate a large table



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Help diagnosing replication (copy) error

2024-03-08 Thread Adrian Klaver

On 3/8/24 14:04, Steve Baldwin wrote:



On Sat, Mar 9, 2024 at 8:56 AM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:



What are the rest of the values in pg_replication_slots?

b2bcreditonline=> select * from pg_replication_slots;
                  slot_name                   |  plugin  | slot_type | datoid |    database     | temporary | active | active_pid | xmin | catalog_xmin |  restart_lsn  | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
----------------------------------------------+----------+-----------+--------+-----------------+-----------+--------+------------+------+--------------+---------------+---------------------+------------+---------------+-----------
 b2bcreditonline_prod_e_master                | pgoutput | logical   |  16404 | b2bcreditonline | f         | t      |      13700 |      |    456150233 | 2E2F/AC07B760 | 2E2F/AEEEA478       | reserved   |               | f
 b2bcreditonline_prod_sandbox_d_master        | pgoutput | logical   |  16404 | b2bcreditonline | f         | t      |       9232 |      |    456150233 | 2E2F/AC07B760 | 2E2F/AEEEA560       | reserved   |               | f
 b2bcreditonline_prod_e_master_only           | pgoutput | logical   |  16404 | b2bcreditonline | f         | t      |      13710 |      |    456150233 | 2E2F/AC07B760 | 2E2F/AEEEA560       | reserved   |               | f
 pg_378075177_sync_60067_7343845372910323059  | pgoutput | logical   |  16404 | b2bcreditonline | f         | f      |            |      |    455719618 | 2E2A/1C0972E0 | 2E2A/1C097318       | extended   |               | f
 b2bcreditonline_prod_e_shard                 | pgoutput | logical   |  16404 | b2bcreditonline | f         | t      |      13718 |      |    456150233 | 2E2F/AC07B760 | 2E2F/AEEEA560       | reserved   |               | f
(5 rows)



For future reference, the above would be easier to follow if you did \x 
before the select * from pg_replication_slots;
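
That is, in psql:

\x
select * from pg_replication_slots;

which switches to expanded (one field per line) output.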


Someone with more experience will need to comment further but to me:

restart_lsn confirmed_flush_lsn
2E2A/1C0972E0   2E2A/1C097318

does not look like enough data was transferred before the slot stopped 
transmitting.
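
If it helps, the gap between those two positions can be computed 
directly (a generic query, nothing specific to your slots):

SELECT slot_name,
       pg_wal_lsn_diff(confirmed_flush_lsn, restart_lsn) AS bytes_between
FROM pg_replication_slots;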


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Help diagnosing replication (copy) error

2024-03-08 Thread Adrian Klaver

On 3/8/24 14:04, Steve Baldwin wrote:



On Sat, Mar 9, 2024 at 8:56 AM Adrian Klaver <mailto:adrian.kla...@aklaver.com>> wrote:



What are the rest of the values in pg_replication_slots?

b2bcreditonline=> select * from pg_replication_slots;
                  slot_name                   |  plugin  | slot_type | datoid |    database     | temporary | active | active_pid | xmin | catalog_xmin |  restart_lsn  | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
----------------------------------------------+----------+-----------+--------+-----------------+-----------+--------+------------+------+--------------+---------------+---------------------+------------+---------------+-----------
 b2bcreditonline_prod_e_master                | pgoutput | logical   |  16404 | b2bcreditonline | f         | t      |      13700 |      |    456150233 | 2E2F/AC07B760 | 2E2F/AEEEA478       | reserved   |               | f
 b2bcreditonline_prod_sandbox_d_master        | pgoutput | logical   |  16404 | b2bcreditonline | f         | t      |       9232 |      |    456150233 | 2E2F/AC07B760 | 2E2F/AEEEA560       | reserved   |               | f
 b2bcreditonline_prod_e_master_only           | pgoutput | logical   |  16404 | b2bcreditonline | f         | t      |      13710 |      |    456150233 | 2E2F/AC07B760 | 2E2F/AEEEA560       | reserved   |               | f
 pg_378075177_sync_60067_7343845372910323059  | pgoutput | logical   |  16404 | b2bcreditonline | f         | f      |            |      |    455719618 | 2E2A/1C0972E0 | 2E2A/1C097318       | extended   |               | f
 b2bcreditonline_prod_e_shard                 | pgoutput | logical   |  16404 | b2bcreditonline | f         | t      |      13718 |      |    456150233 | 2E2F/AC07B760 | 2E2F/AEEEA560       | reserved   |               | f
(5 rows)

Is there data in the subscriber side table?

No, there is not, although when I did a 'select count(*) from {table}' it 
took several minutes to return 0 rows.


What are the publisher and subscriber configurations?

Not sure which settings, but here's a few.



I should have been clearer.

What are the CREATE PUBLICATION and CREATE SUBSCRIPTION statements?
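
That is, the full statements used to set this up, something of the form 
(the names and connection string below are placeholders, not your actual 
setup):

CREATE PUBLICATION my_pub FOR TABLE my_schema.my_table;

CREATE SUBSCRIPTION my_sub
    CONNECTION 'host=publisher.example.com port=5432 dbname=mydb user=repl_user'
    PUBLICATION my_pub;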

--
Adrian Klaver
adrian.kla...@aklaver.com




