Re: Undetected Deadlock

2022-02-09 Thread Simon Riggs
On Wed, 9 Feb 2022 at 23:50, Michael Harris  wrote:
>
> On Mon, 7 Feb 2022 at 09:57, Tom Lane  wrote:
> > Do you want to try this and see if it actually adds any robustness with 
> > your buggy code?
>
> Sorry for the delayed response, & thanks for the patch.
>
> I wasn't able to test with our actual application because it could
> take days for it to actually trigger the problem, so I tested it with
> a simulation, which you can find here:
>
> https://github.com/harmic/pg_almloss

Thanks for writing a test case.

Could you specify the licence of that as The PostgreSQL Licence, to
allow it to be used as a permanent test case?
We can add other misbehaviors as well, as needed.

> With that simulation I could attach gdb to the backend and see that
> signal_pending & signal_due_at were being reset in the expected way,
> even when a missed interrupt was triggered.
>
> I'm convinced your patch improves robustness under the scenario we saw.

Cool, thanks


--
Simon Riggs    http://www.EnterpriseDB.com/




Re: Can we go beyond the standard to make Postgres radically better?

2022-02-09 Thread David G. Johnston
On Wed, Feb 9, 2022 at 10:15 PM Guyren Howe  wrote:

> There are huge developer benefits available to focusing more on making a
> great relational programming environment, well outside the SQL standard.
>

Sure

>
> Examples of small things Postgres could have:
>
> - SELECT * - b.a_id from a natural join b
>   - let me describe a select list by removing fields from a relation. In
>     the example, I get all fields in the join of a and b other than the
>     shared key, which I only get once.
>   - note how this simplifies maintaining views wrt changes in tables
>

Natural join sucks and "using" already does this.  Not seeing how this
meaningfully impacts views in their current form.

I have asked for this in the past though...but it is so contrary to the
fundamental design of SQL, and provides such limited benefit, that its
omission seems like a net positive (if you factor in both novelty and
development).
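
For reference, a quick sketch of what USING already gives you (hypothetical
tables a and b; the shared key comes back as a single column):

create table a (a_id int primary key, height int, name text);
create table b (a_id int references a, email text);

-- a_id appears once, followed by the remaining columns of a and then of b
select * from a join b using (a_id);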

>
> - Let me put the FROM clause first
>   - if I can write FROM a join b SELECT a.height, a.name, b.email then an
>     editor can give me autocomplete when I'm writing the select clause.
>

select * from ..., then go back and change * to whatever columns you end
up wanting and your editor can, in theory, do what you want.  Dance with
the person you brought.

>
> - Hierarchical schemas
>
> - First-class functions.
>   - A global-only namespace is dumb. Schemas are only slightly less dumb.
>     The obvious way to store and call functions is as values of fields.
>     Let me organize them how I choose
>   - Also let me pass around functions as values, let me make new ones and
>     so on. Postgres is the best relational programming environment already
>     because it has a decent type system and more general features. Let's
>     expand and also simplify that.
>   - Also nested function definitions, so top-level functions can be built
>     out of local auxiliary functions.
>

See past and current discussions regarding modules.  Improvement here is
IMO the most worthy of attention (among this list, not globally).

>
> - Other languages
>   - Tutorial D, Datalog, Quell, let's open this puppy up!
>

Go talk to them.  Adding more of these to the core is undesirable.

>
> - SQL is a terrible, no good, very bad language
>

I'd rather live with SQL as it is and develop static analyzers and other
related tools to make using it less problematic; in ways that we cannot do
in the server without breaking existing code.

>
> - A portable, low-level API
>   - An alternative to SQLite that provides CRUD operations on a Postgres
>     database.
>

Not really understanding what features you are including in this one.
Doesn't feel like something that belongs in core though.  I'd rather
efforts be made to make extension writing and installing easier (including
the related "module" feature you note above).

> I adore Postgres, but this is despite rather than because of SQL. By all
> means, let’s have a close-to-standard subset of features, but let’s
> fearlessly go beyond them when it would be obviously much better.
>

No one is saying we shouldn't or cannot do just that.  Though frankly there
is enough WIP in our commitfest to keep the majority of volunteers
sufficiently busy on stuff that has two important attributes this email
lacks - someone championing the feature and a functioning patch.

David J.


Can we go beyond the standard to make Postgres radically better?

2022-02-09 Thread Guyren Howe
Postgres has since the outset gone beyond the SQL standard in many ways:
types, inheritance, programmability, generality are all well beyond what SQL
used to mandate and still well beyond the current standard.

There are huge developer benefits available to focusing more on making a great 
relational programming environment, well outside the SQL standard.

Examples of small things Postgres could have:

- SELECT * - b.a_id from a natural join b
  - let me describe a select list by removing fields from a relation. In the
    example, I get all fields in the join of a and b other than the shared
    key, which I only get once.
  - note how this simplifies maintaining views wrt changes in tables
- Let me put the FROM clause first
  - if I can write FROM a join b SELECT a.height, a.name, b.email then an
    editor can give me autocomplete when I'm writing the select clause.
- Hierarchical schemas

Examples of larger things Postgres might have:

- First-class functions.
  - A global-only namespace is dumb. Schemas are only slightly less dumb.
    The obvious way to store and call functions is as values of fields. Let
    me organize them how I choose
  - Also let me pass around functions as values, let me make new ones and so
    on. Postgres is the best relational programming environment already
    because it has a decent type system and more general features. Let's
    expand and also simplify that.
  - Also nested function definitions, so top-level functions can be built
    out of local auxiliary functions.
- Other languages
  - Tutorial D, Datalog, Quell, let's open this puppy up!
- SQL is a terrible, no good, very bad language
- A portable, low-level API
  - An alternative to SQLite that provides CRUD operations on a Postgres
    database.

I adore Postgres, but this is despite rather than because of SQL. By all means, 
let’s have a close-to-standard subset of features, but let’s fearlessly go 
beyond them when it would be obviously much better.


Re: Compile 14.1 in EL5.8

2022-02-09 Thread Thomas Munro
On Thu, Feb 10, 2022 at 2:23 PM Tom Lane  wrote:
> Thomas Munro  writes:
> > ... I wondered about also removing the leftover comment
> > "We assume that any system that has Linux epoll() also has Linux
> > signalfd()" which was my attempt to explain that there wasn't a
> > separate configure check for signalfd.h, but I guess the sentence is
> > still true in a more general sense, so we can just leave it there.
>
> Oh, I didn't notice that comment, or I probably would have tweaked it.
> Perhaps along the line of "there are too few systems that have epoll
> and not signalfd to justify maintaining a separate code path"?

WFM, though I remain a little unclear on whether our support policy is
stochastic or declarative :-D




Re: Compile 14.1 in EL5.8

2022-02-09 Thread Tom Lane
Thomas Munro  writes:
> ... I wondered about also removing the leftover comment
> "We assume that any system that has Linux epoll() also has Linux
> signalfd()" which was my attempt to explain that there wasn't a
> separate configure check for signalfd.h, but I guess the sentence is
> still true in a more general sense, so we can just leave it there.

Oh, I didn't notice that comment, or I probably would have tweaked it.
Perhaps along the line of "there are too few systems that have epoll
and not signalfd to justify maintaining a separate code path"?

regards, tom lane




Re: Compile 14.1 in EL5.8

2022-02-09 Thread Thomas Munro
On Thu, Feb 10, 2022 at 3:38 AM Tom Lane  wrote:
> Gabriela Serventi  writes:
> > Hi! I'm trying to compile release 14.1 in a very old Linux Server (Red Hat
> > Enterprise Linux Server release 5.8). I can run configure successfully, but
> > when I try to run make, I get the following error:
> > latch.c:85:26: error: sys/signalfd.h: No such file or directory
>
> It looks like since 6a2a70a02, latch.c effectively uses HAVE_SYS_EPOLL_H
> to decide whether it can include <sys/signalfd.h>, which seems not too
> well thought out.  A proper fix will require another configure check,
> but you could hack it by adding -DWAIT_USE_POLL to CPPFLAGS in
> src/Makefile.global after running configure.

I mean, I did think about it, but I thought it wasn't worth the
configure cycles to detect EOL'd kernels manufactured 2005-2007, since
we've said before that we don't support RHEL5 or kernels that old[1].
My primary goal was not to have to write the support for the
epoll/self-pipe combination with zero coverage and zero users.
Apparently I was off by one on the users, and since it's so easy to
fall back to poll() I agree that we might as well!  BTW after seeing
your commit today I wondered about also removing the leftover comment
"We assume that any system that has Linux epoll() also has Linux
signalfd()" which was my attempt to explain that there wasn't a
separate configure check for signalfd.h, but I guess the sentence is
still true in a more general sense, so we can just leave it there.

[1] 
https://www.postgresql.org/message-id/flat/CA%2BhUKGKL_%3DaO%3Dr30N%3Ds9VoDgTqHpRSzePRbA9dkYO7snc7HsxA%40mail.gmail.com




Re: Performance issue questions

2022-02-09 Thread Bruce Momjian
On Thu, Feb 10, 2022 at 05:11:07AM +0530, Rama Krishnan wrote:
> Hi all, 
> 
> It was a famous interview question
> 
> If the table contains multiple indexes, what kind of performance issue or
> impact will there be?

Sure:

https://momjian.us/main/blogs/pgblog/2012.html#July_9_2012
https://momjian.us/main/presentations/performance.html#optimizer
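
A minimal sketch of the classic write-side cost (hypothetical table; every
extra index is one more structure that each INSERT or UPDATE must maintain):

create table idx_demo (a int, b int, c int);
\timing on
insert into idx_demo select i, i, i from generate_series(1, 1000000) i;

create index on idx_demo (a);
create index on idx_demo (b);
create index on idx_demo (c);
truncate idx_demo;
-- same load, but each row now also goes into three btrees;
-- compare the two timings
insert into idx_demo select i, i, i from generate_series(1, 1000000) i;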

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Undetected Deadlock

2022-02-09 Thread Tom Lane
Michael Harris  writes:
> On Mon, 7 Feb 2022 at 09:57, Tom Lane  wrote:
>> Do you want to try this and see if it actually adds any robustness with your 
>> buggy code?

> Sorry for the delayed response, & thanks for the patch.

> I wasn't able to test with our actual application because it could
> take days for it to actually trigger the problem, so I tested it with
> a simulation, which you can find here:

> https://github.com/harmic/pg_almloss

> With that simulation I could attach gdb to the backend and see that
> signal_pending & signal_due_at were being reset in the expected way,
> even when a missed interrupt was triggered.

> I'm convinced your patch improves robustness under the scenario we saw.

Great, thanks for testing!

regards, tom lane




Re: Undetected Deadlock

2022-02-09 Thread Michael Harris
On Mon, 7 Feb 2022 at 09:57, Tom Lane  wrote:
> Do you want to try this and see if it actually adds any robustness with your 
> buggy code?

Sorry for the delayed response, & thanks for the patch.

I wasn't able to test with our actual application because it could
take days for it to actually trigger the problem, so I tested it with
a simulation, which you can find here:

https://github.com/harmic/pg_almloss

With that simulation I could attach gdb to the backend and see that
signal_pending & signal_due_at were being reset in the expected way,
even when a missed interrupt was triggered.

I'm convinced your patch improves robustness under the scenario we saw.

Thanks again!

Cheers
Mike




Performance issue questions

2022-02-09 Thread Rama Krishnan
Hi all,

It was a famous interview question

If the table contains multiple indexes, what kind of performance issue or
impact will there be?


Re: DELETING then INSERTING record with same PK in the same TRANSACTION

2022-02-09 Thread David G. Johnston
On Wed, Feb 9, 2022 at 2:24 PM Thomas Kellerer  wrote:

> David G. Johnston schrieb am 09.02.2022 um 21:47:
> > You cannot defer uniqueness checks to transaction commit so either it
> > is going to fail on the insert or it will not fail at all.
>
> You can defer unique constraints, but not primary key constraints.
>
>
Actually we are both wrong...

"This controls whether the constraint can be deferred. A constraint that is
not deferrable will be checked immediately after every command. Checking of
constraints that are deferrable can be postponed until the end of the
transaction (using the SET CONSTRAINTS command). NOT DEFERRABLE is the
default. Currently, only UNIQUE, PRIMARY KEY, EXCLUDE, and REFERENCES
(foreign key) constraints accept this clause."

https://www.postgresql.org/docs/current/sql-createtable.html

In any case I would not expect that primary key constraints would behave
differently than any other unique constraint.  The null-ness check for a PK
cannot be deferred, however.

"NOT NULL and CHECK constraints are not deferrable. Note that deferrable
constraints cannot be used as conflict arbitrators in an INSERT statement
that includes an ON CONFLICT DO UPDATE clause."
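
A minimal sketch of that with a hypothetical table: PRIMARY KEY does accept
DEFERRABLE, and the uniqueness check then waits until commit.

create table pk_demo (id int primary key deferrable initially deferred);

begin;
insert into pk_demo values (1), (1);  -- accepted for now; the check is deferred
commit;  -- fails here: duplicate key value violates unique constraint "pk_demo_pkey"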

David J.


Re: DELETING then INSERTING record with same PK in the same TRANSACTION

2022-02-09 Thread Thomas Kellerer

David G. Johnston schrieb am 09.02.2022 um 21:47:

You cannot defer uniqueness checks to transaction commit so either it
is going to fail on the insert or it will not fail at all.


You can defer unique constraints, but not primary key constraints.

create table t
(
  id integer
);

alter table t
   add constraint unique_id
   unique (id)
   deferrable initially deferred;
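
With that in place, a duplicate is only rejected when the check actually
runs, at COMMIT or when forced with SET CONSTRAINTS. A quick sketch:

begin;
insert into t values (1), (1);        -- no error yet; unique_id is deferred
set constraints unique_id immediate;  -- the check runs here and raises the error
rollback;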





Re: DELETING then INSERTING record with same PK in the same TRANSACTION

2022-02-09 Thread Bryn Llewellyn
> david.g.johns...@gmail.com wrote:
> 
> The convention on these lists is to inline or bottom post (and to trim the 
> reply to just the pertinent parts).

Just for completeness, I expected this test to run without error. (I tried it 
in PG Version 14.1).

create table t(k int primary key, v text not null);
insert into t(k, v) values (1, 'one'), (2, 'two');
select k, v from t order by k;

start transaction;
delete from t where k = 1;
insert into t(k, v) values(1, 'new one');
commit;

select k, v from t order by k;

Indeed it did run without error. And I saw the results that I expected.

Re: DELETING then INSERTING record with same PK in the same TRANSACTION

2022-02-09 Thread David G. Johnston
The convention on these lists is to inline or bottom post (and to trim the
reply to just the pertinent parts).

On Wed, Feb 9, 2022 at 12:33 PM Brent Wood  wrote:

> Because both statements are in the transaction, the delete is not fully
> actioned until the commit. So it still exists in the table when you try to
> insert the record with the duplicate key.
>

A transaction makes your actions invisible (more or less) to other
concurrent sessions in the system.  But so far as your own session is
concerned subsequent commands get to see the changes made during previous
commands.
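
A sketch of that rule, borrowing the table t(k int primary key, v text) from
Bryn's script elsewhere in this thread:

begin;
delete from t where k = 1;
select count(*) from t where k = 1;         -- 0: the session sees its own delete
insert into t(k, v) values (1, 'new one');  -- hence no duplicate-key error
commit;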

> Check if the error is generated during the transaction or at the commit
> stage, run it without the commit, rollback instead to check this.
>

You cannot defer uniqueness checks to transaction commit so either it is
going to fail on the insert or it will not fail at all.

> I don't see how you can do this within a transaction, someone else might?
>

That should be a good indicator that you are missing something, because the
presence or absence of a transaction should not be impacting this at all.
This is much more likely operator error rather than a system bug, and so
reproducing the error is the only reasonable first step - since the
pseudo-code that was provided and the error simply do not make sense.

David J.


Re: DELETING then INSERTING record with same PK in the same TRANSACTION

2022-02-09 Thread Adrian Klaver

On 2/9/22 10:34, Brent Wood wrote:

> My take on this...
>
> Because both statements are in the transaction, the delete is not fully
> actioned until the commit. So it still exists in the table when you try
> to insert the record with the duplicate key.


No:

\d animals
                      Table "public.animals"
 Column |          Type          | Collation | Nullable | Default
--------+------------------------+-----------+----------+---------
 id     | integer                |           | not null |
 cond   | character varying(200) |           | not null |
 animal | character varying(200) |           | not null |
Indexes:
    "animals_pkey" PRIMARY KEY, btree (id)


delete from animals where id = 1;
DELETE 1

insert into animals values (1, 'great', 'opossum');
INSERT 0 1



> Check if the error is generated during the transaction or at the commit
> stage, run it without the commit, rollback instead to check this.
>
> I don't see how you can do this within a transaction, someone else might?



> Brent Wood
>
> Principal Technician, Fisheries
> NIWA
> DDI:  +64 (4) 3860529




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: DELETING then INSERTING record with same PK in the same TRANSACTION

2022-02-09 Thread Andrew Hardy
Really appreciate the input thank you.

I shall try to provide a full script. I guess if the cascades are relevant
I'm going to have to also provide all of the indirect table creations and
all of the earlier inserts to those various tables.

At the very least I'll try to provide a fuller error message and a minimum
structure that will cause it to occur.

Andrew


On Wed, 9 Feb 2022, 19:26 David G. Johnston, 
wrote:

> On Wed, Feb 9, 2022 at 11:12 AM Andrew Hardy 
> wrote:
>
>> Do I need some particular kind of settings on my transaction to be able
>> to delete and insert afresh in the same transaction?
>>
>
> No.  I cannot reproduce your claim with a trivial example on stock 13.5.
>
> You will need to be less vague and at least provide an exact reproducer
> script.
>
> In case it is relevant - the first delete will lead to cascaded deletes on
>> children.
>>
>
> This may indeed be relevant.  Again, you need to provide an exact
> reproducer, not expect others to reverse-engineer one for you.
>
> David J.
>
>


Re: DELETING then INSERTING record with same PK in the same TRANSACTION

2022-02-09 Thread Brent Wood
My take on this...

Because both statements are in the transaction, the delete is not fully 
actioned until the commit. So it still exists in the table when you try to 
insert the record with the duplicate key.

Check if the error is generated during the transaction or at the commit stage, 
run it without the commit, rollback instead to check this.

I don't see how you can do this within a transaction, someone else might?



Brent Wood

Principal Technician, Fisheries
NIWA
DDI:  +64 (4) 3860529


From: Andrew Hardy 
Sent: Thursday, February 10, 2022 07:11
To: pgsql-general@lists.postgresql.org 
Subject: DELETING then INSERTING record with same PK in the same TRANSACTION

Hi,

When I:

Begin a transaction
DELETE from <table> where id (PK) = <id>
INSERT INTO <table> VALUES (<values>)
...
...
...
COMMIT

I get

insert into "itinerary" ("dk", "locator") values ($1, $2) - duplicate key value 
violates unique constraint "itinerary_pkey"

Do I need some particular kind of settings on my transaction to be able to 
delete and insert afresh in the same transaction?

In case it is relevant - the first delete will lead to cascaded deletes on 
children.

Alternatively I wonder if I change the PK column value to
"<id>-FORDELETION" will I be free to insert under the same original PK
value in the same transaction, then delete the FORDELETE item just before
committing or will I hit the same issue?

Thanks,

Andrew




Re: DELETING then INSERTING record with same PK in the same TRANSACTION

2022-02-09 Thread David G. Johnston
On Wed, Feb 9, 2022 at 11:12 AM Andrew Hardy 
wrote:

> Do I need some particular kind of settings on my transaction to be able to
> delete and insert afresh in the same transaction?
>

No.  I cannot reproduce your claim with a trivial example on stock 13.5.

You will need to be less vague and at least provide an exact reproducer
script.

In case it is relevant - the first delete will lead to cascaded deletes on
> children.
>

This may indeed be relevant.  Again, you need to provide an exact
reproducer, not expect others to reverse-engineer one for you.

David J.


Re: DELETING then INSERTING record with same PK in the same TRANSACTION

2022-02-09 Thread Adrian Klaver

On 2/9/22 10:11 AM, Andrew Hardy wrote:

> Hi,
>
> When I:
>
> Begin a transaction
> DELETE from <table> where id (PK) = <id>
> INSERT INTO <table> VALUES (<values>)
> ...
> ...
> ...
> COMMIT
>
> I get
>
> insert into "itinerary" ("dk", "locator") values ($1, $2) - duplicate
> key value violates unique constraint "itinerary_pkey"


<table> = itinerary?

What fields is unique constraint "itinerary_pkey" actually pointing at?
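
One way to answer that is to query the catalogs directly, e.g.:

select conname, pg_get_constraintdef(oid)
from pg_constraint
where conrelid = 'itinerary'::regclass
  and contype = 'p';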

There should be more to the error message. Can you provide the complete 
message?




> Do I need some particular kind of settings on my transaction to be able
> to delete and insert afresh in the same transaction?
>
> In case it is relevant - the first delete will lead to cascaded deletes
> on children.
>
> Alternatively I wonder if I change the PK column value to
> "<id>-FORDELETION" will I be free to insert under the same
> original PK value in the same transaction, then delete the FORDELETE
> item just before committing or will I hit the same issue?
>
> Thanks,
>
> Andrew





--
Adrian Klaver
adrian.kla...@aklaver.com




DELETING then INSERTING record with same PK in the same TRANSACTION

2022-02-09 Thread Andrew Hardy
Hi,

When I:

Begin a transaction
DELETE from <table> where id (PK) = <id>
INSERT INTO <table> VALUES (<values>)
...
...
...
COMMIT

I get

insert into "itinerary" ("dk", "locator") values ($1, $2) - duplicate key
value violates unique constraint "itinerary_pkey"

Do I need some particular kind of settings on my transaction to be able to
delete and insert afresh in the same transaction?

In case it is relevant - the first delete will lead to cascaded deletes on
children.

Alternatively I wonder if I change the PK column value to
"<id>-FORDELETION" will I be free to insert under the same original
PK value in the same transaction, then delete the FORDELETE item just
before committing or will I hit the same issue?

Thanks,

Andrew


Re: Network Card Not Listening at Startup

2022-02-09 Thread Tom Lane
Ludwig Isaac Lim  writes:
> How do I make sure that the NIC will be listening every time I
> start/restart the server?

You need to tell systemd not to start the postgresql service
until the network is up.

In basic cases, you can edit the postgresql.service file,
or maybe better create an add-on config file something like this:

$ cat /etc/systemd/system/postgresql.service.d/delay_start.conf 
[Unit]
After=network-online.target
Wants=network-online.target

(The add-on file name doesn't matter, the directory name does.)

I'm not certain that this will wait for custom NICs if you are
doing something nonstandard to configure those, but for me
it works for alternate IPs configured in
/etc/sysconfig/network-scripts/ifcfg-whatever.

regards, tom lane




Network Card Not Listening at Startup

2022-02-09 Thread Ludwig Isaac Lim
Hello:

I'm running postgresql 12.9 on Amazon Linux in AWS. 

My listen_addresses in postgresql.conf is configured like below:
listen_addresses = '172.17.1.57,localhost' 


However, postgresql does not listen on 172.17.1.57. I can reproduce the
phenomenon by simply rebooting the server and then running netstat -tulpn

[root@ip-172-17-1-206 data]#  netstat -tulpn | grep 5432
tcp        0      0 127.0.0.1:5432          0.0.0.0:*               LISTEN      2605/postgres



And there's nothing abnormal in the logs:
2022-02-09 15:59:36.906 UTC [2663] LOG:  database system was shut down at 2022-02-09 15:59:20 UTC
2022-02-09 15:59:36.930 UTC [2605] LOG:  database system is ready to accept connections



I had to run:
systemctl restart postgresql


For 172.17.1.57 to listen

[root@ip-172-17-1-206 data]#  netstat -tulpn | grep 5432
tcp        0      0 127.0.0.1:5432          0.0.0.0:*               LISTEN      4289/postgres
tcp        0      0 172.17.1.57:5432        0.0.0.0:*               LISTEN      4289/postgres


Note that 172.17.1.57 is not the primary network interface. I created the 
network interface and attached it to the EC2 instance.

My systemd config file for postgresql (/etc/systemd/system/postgresql.service)
was copied from the documentation section "18.3. Starting the Database Server".

Extra information
--
postgres=# select version();
                                                 version
--------------------------------------------------------------------------------
 PostgreSQL 12.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (R
(1 row)


How do I make sure that the NIC will be listening every time I start/restart
the server?

Thanks in advance,
Ludwig Lim




Re: Why are data files getting modified?

2022-02-09 Thread Laurenz Albe
On Thu, 2022-02-10 at 00:29 +0900, Abhishek Bhola wrote:
> > > I see there is an AUTOVACUUM job started on all the partitions that have 
> > > been
> > > read (on which select has run). I am assuming the reason why Postgres is
> > > autovacuuming these large partitions is this:
> > > 
> > > db=# SELECT c.relname as table_name, c.relkind as type, 
> > > age(c.relfrozenxid) as age,
> > > c.relfrozenxid FROM pg_class AS c WHERE age(c.relfrozenxid) <> 2147483647 
> > > ORDER BY 3 DESC LIMIT 20;
> > >         table_name         | type |    age    | relfrozenxid
> > > ---------------------------+------+-----------+--------------
> > >  hist_omx_orderaudit201506 | r    | 201463508 |       319891
> > >  hist_omx_orderaudit201509 | r    | 201418497 |       364902
> > >  hist_omx_orderaudit201205 | r    | 201142042 |       641357
> > >
> > > autovacuum_freeze_max_age
> > > ---------------------------
> > >  200000000
> > > (1 row)
>
> I see, but then why is the autovacuum getting triggered after the select 
> command?

Autovacuum is triggered because "relfrozenxid" is more than 200000000
transactions in the past.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Why are data files getting modified?

2022-02-09 Thread Abhishek Bhola
I see, but then why is the autovacuum getting triggered after the select
command?

On Thu, Feb 10, 2022, 12:14 AM Laurenz Albe 
wrote:

> On Wed, 2022-02-09 at 17:29 +0900, Abhishek Bhola wrote:
> > I have a table with monthly partitions, starting from 2011.
> > Each partition has about 2-3 billion rows and is about 40TB of data in total.
> > I am running a select query to \copy the data into csv files for each month.
> >
> > I see there is an AUTOVACUUM job started on all the partitions that have been
> > read (on which select has run). I am assuming the reason why Postgres is
> > autovacuuming these large partitions is this:
> >
> > db=# SELECT c.relname as table_name, c.relkind as type, age(c.relfrozenxid) as age,
> > c.relfrozenxid FROM pg_class AS c WHERE age(c.relfrozenxid) <> 2147483647 ORDER BY 3 DESC LIMIT 20;
> >         table_name         | type |    age    | relfrozenxid
> > ---------------------------+------+-----------+--------------
> >  hist_omx_orderaudit201506 | r    | 201463508 |       319891
> >  hist_omx_orderaudit201509 | r    | 201418497 |       364902
> >  hist_omx_orderaudit201205 | r    | 201142042 |       641357
> >
> > autovacuum_freeze_max_age
> > ---------------------------
> >  200000000
> > (1 row)
> >
> > Am I wrong in my understanding why this autovacuum is getting triggered?
> >
> > However, my main question is why do the data files in the tablespace,
> > corresponding to these partition tables (that have been read or selected
> > recently), or the ones autovacuumed recently, have a recent timestamp?
> >
> > There was no insert or update on these partitions, so why do I see an
> > updated timestamp? Does autovacuum change something on the datafiles?
>
> Yes, autovacuum will freeze old rows in the table, which modifies the data
> files.
>
> To keep this to a minimum, you can run VACUUM (FREEZE) on a partition as soon
> as you know that its data won't get modified any more.
> Still, I wouldn't be sure that the file modification timestamp won't change
> any more after that.  But that should be pretty irrelevant; at any rate, no
> substantial data modifications will occur after that.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com



Re: Why are data files getting modified?

2022-02-09 Thread Laurenz Albe
On Wed, 2022-02-09 at 17:29 +0900, Abhishek Bhola wrote:
> I have a table with monthly partitions, starting from 2011.
> Each partition has about 2-3 billion rows and is about 40TB of data in total.
> I am running a select query to \copy the data into csv files for each month.
> 
> I see there is an AUTOVACUUM job started on all the partitions that have been
> read (on which select has run). I am assuming the reason why Postgres is
> autovacuuming these large partitions is this:
> 
> db=# SELECT c.relname as table_name, c.relkind as type, age(c.relfrozenxid) 
> as age,
> c.relfrozenxid FROM pg_class AS c WHERE age(c.relfrozenxid) <> 2147483647 
> ORDER BY 3 DESC LIMIT 20;
>         table_name         | type |    age    | relfrozenxid
> ---------------------------+------+-----------+--------------
>  hist_omx_orderaudit201506 | r    | 201463508 |       319891
>  hist_omx_orderaudit201509 | r    | 201418497 |       364902
>  hist_omx_orderaudit201205 | r    | 201142042 |       641357
>
> autovacuum_freeze_max_age
> ---------------------------
>  200000000
> (1 row)
> 
> Am I wrong in my understanding why this autovacuum is getting triggered?
> 
> However, my main question is why do the data files in the tablespace,
> corresponding to these partition tables (that have been read or selected
> recently), or the ones autovacuumed recently, have a recent timestamp?
> 
> There was no insert or update on these partitions, so why do I see an updated 
> timestamp?
> Does autovacuum change something on the datafiles?

Yes, autovacuum will freeze old rows in the table, which modifies the data 
files.

To keep this to a minimum, you can run VACUUM (FREEZE) on a partition as soon
as you know that its data won't get modified any more.
Still, I wouldn't be sure that the file modification timestamp won't change
any more after that.  But that should be pretty irrelevant; at any rate, no
substantial data modifications will occur after that.
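
For one of the partitions named in this thread, that would look something
like this, and the age() query from earlier can confirm the effect:

vacuum (freeze, verbose) hist_omx_orderaudit201506;

-- afterwards the partition's transaction ID age should be close to zero
select relname, age(relfrozenxid)
from pg_class
where relname = 'hist_omx_orderaudit201506';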

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Compile 14.1 in EL5.8

2022-02-09 Thread Tom Lane
Gabriela Serventi  writes:
> Hi! I'm trying to compile release 14.1 in a very old Linux Server (Red Hat
> Enterprise Linux Server release 5.8). I can run configure successfully, but
> when I try to run make, I get the following error:
> latch.c:85:26: error: sys/signalfd.h: No such file or directory

It looks like since 6a2a70a02, latch.c effectively uses HAVE_SYS_EPOLL_H
to decide whether it can include <sys/signalfd.h>, which seems not too
well thought out.  A proper fix will require another configure check,
but you could hack it by adding -DWAIT_USE_POLL to CPPFLAGS in
src/Makefile.global after running configure.

FWIW, EL5 is *ancient*.  You really oughta consider an update.

regards, tom lane




Re: Locks on FK Tables From Partitioning

2022-02-09 Thread Simon Riggs
On Tue, 8 Feb 2022 at 23:13, Tom Lane  wrote:
>
> Aaron Sipser  writes:
> > I am fairly confused about the locks generated by some partitioning code I
> > am using. The setup is that we have a partitioned table (call it P), and it
> > has a foreign key constraint to another table F. I'm trying to figure out
> > why when I add partitions to P, it seems that a lock is also taken on F.
>
> Addition of a foreign key constraint implies adding triggers to both sides
> of the FK.  Adding a new partition also adds a child foreign key
> constraint, which I'm pretty sure requires its own trigger.  So the F
> table is going to need whatever lock strength is involved in CREATE
> TRIGGER.  I don't recall offhand what we use, but it would at least need
> to block operations that might fire such a trigger.

Agreed that table F needs a trigger, but only for the first partition.

The trigger exists just once from the referenced table to the
referencing partitioned table, so we don't need to add that once per
partition.

ATAddForeignKeyConstraint() takes a ShareRowExclusiveLock to add the
needed triggers, but just assumes that it is needed for all cases.

ISTM that we could take a temporary/not-for-whole-transaction lock to
inspect whether the required trigger is already there and if so, avoid
taking the lock on the referenced table each time.
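
A rough sketch of a way to observe this, using hypothetical tables (the lock
should be visible in pg_locks while the transaction adding the partition is
still open):

create table f (id int primary key);
create table p (id int, f_id int references f (id)) partition by range (id);

begin;
create table p1 partition of p for values from (0) to (100);
-- still inside the transaction: f should show up held in
-- the ShareRowExclusiveLock discussed above
select c.relname, l.mode
from pg_locks l
join pg_class c on c.oid = l.relation
where c.relname = 'f';
commit;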

-- 
Simon Riggs    http://www.EnterpriseDB.com/




Compile 14.1 in EL5.8

2022-02-09 Thread Gabriela Serventi
Hi! I'm trying to compile release 14.1 in a very old Linux Server (Red Hat
Enterprise Linux Server release 5.8). I can run configure successfully, but
when I try to run make, I get the following error:

latch.c:85:26: error: sys/signalfd.h: No such file or directory

Does anybody know if it is possible?
Thanks


Why are data files getting modified?

2022-02-09 Thread Abhishek Bhola
I have a table with monthly partitions, starting from 2011.
Each partition has about 2-3 billion rows and is about 40TB of data in
total.
I am running a select query to \copy the data into csv files for each month.

I see there is an AUTOVACUUM job started on all the partitions that have
been read (on which select has run). I am assuming the reason why Postgres
is autovacuuming these large partitions is this:

"This implies that if a table is not otherwise vacuumed, autovacuum will be
invoked on it approximately once every autovacuum_freeze_max_age minus
vacuum_freeze_min_age transactions. For tables that are regularly vacuumed
for space reclamation purposes, this is of little importance. However, for
static tables (including tables that receive inserts, but no updates or
deletes), there is no need to vacuum for space reclamation, so it can be
useful to try to maximize the interval between forced autovacuums on very
large static tables. Obviously one can do this either by increasing
autovacuum_freeze_max_age or decreasing vacuum_freeze_min_age."

https://www.postgresql.org/docs/13/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND

Please see these partition tables are static, i.e., they received data many
years ago, with no updates ever.

db=# SELECT c.relname as table_name, c.relkind as type, age(c.relfrozenxid)
as age, c.relfrozenxid FROM pg_class AS c WHERE age(c.relfrozenxid) <>
2147483647 ORDER BY 3 DESC LIMIT 20;
        table_name         | type |    age    | relfrozenxid
---------------------------+------+-----------+--------------
 hist_omx_orderaudit201506 | r    | 201463508 |       319891
 hist_omx_orderaudit201509 | r    | 201418497 |       364902
 hist_omx_orderaudit201205 | r    | 201142042 |       641357

autovacuum_freeze_max_age
---------------------------
 200000000
(1 row)

Am I wrong in my understanding why this autovacuum is getting triggered?
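
One way to double-check which partitions autovacuum has visited is the
cumulative statistics view, e.g.:

select relname, last_autovacuum, autovacuum_count
from pg_stat_all_tables
where relname like 'hist_omx_orderaudit%'
order by last_autovacuum desc nulls last;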

However, my main question is why do the data files in the tablespace,
corresponding to these partition tables (that have been read or selected
recently), or the ones autovacuumed recently, have a recent timestamp?

There was no insert or update on these partitions, so why do I see an
updated timestamp? Does autovacuum change something on the datafiles?

Thanks
