Re: "PANIC: could not open critical system index 2662" - twice

2023-05-06 Thread Tom Lane
Thomas Munro  writes:
> Did you previously run this same workload on versions < 15 and never
> see any problem?  15 gained a new feature CREATE DATABASE ...
> STRATEGY=WAL_LOG, which is also the default.  I wonder if there is a
> bug somewhere near that, though I have no specific idea.

Per the release notes I was just writing ...



 
  Fix potential corruption of the template (source) database after
  CREATE DATABASE with the STRATEGY WAL_LOG option
  (Nathan Bossart, Ryo Matsumura)

  Improper buffer handling created a risk that any later modification
  of the template's pg_class catalog would be lost.


The comment about only pg_class being affected is my interpretation
of what the commit message said, but I might have misunderstood.

regards, tom lane




Re: "PANIC: could not open critical system index 2662" - twice

2023-05-06 Thread Thomas Munro
On Sun, May 7, 2023 at 10:23 AM Jeffrey Walton  wrote:
> This may be related... I seem to recall the GNUlib folks talking about
> a cp bug on sparse files. It looks like it may be fixed in coreutils
> release 9.2 (2023-03-20):
> https://github.com/coreutils/coreutils/blob/master/NEWS#L233
>
> If I recall correctly, it had something to do with the way
> copy_file_range worked. (Or maybe, it did not work as expected).
>
> According to the GNUlib docs
> (https://www.gnu.org/software/gnulib/manual/html_node/copy_005ffile_005frange.html):
>
> This function has many problems on Linux
> kernel versions before 5.3

That's quite interesting, thanks (we've been talking about making
direct use of copy_file_range() in a few threads, I'll definitely be
looking into that history), but we don't currently use
copy_file_range() or any coreutils stuff in the relevant code paths
here -- this data is copied by plain old pread() and pwrite().
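
To illustrate what that means, here is a minimal, hypothetical sketch (not
the actual PostgreSQL code) of that style of copy: fixed-size blocks read
and written at explicit offsets, with no copy_file_range() and no
sparse-file handling involved:

/*
 * Hypothetical sketch only (not the PostgreSQL source): copy a relation
 * file block by block with plain pread()/pwrite(), one 8 kB page at a
 * time.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define BLCKSZ 8192             /* PostgreSQL's default page size */

int
main(int argc, char **argv)
{
    char    buf[BLCKSZ];
    off_t   off = 0;
    ssize_t nread;

    if (argc != 3)
    {
        fprintf(stderr, "usage: %s SRC DST\n", argv[0]);
        return 1;
    }

    int srcfd = open(argv[1], O_RDONLY);
    int dstfd = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (srcfd < 0 || dstfd < 0)
    {
        perror("open");
        return 1;
    }

    /* Read and write one page at a time at explicit offsets. */
    while ((nread = pread(srcfd, buf, sizeof(buf), off)) > 0)
    {
        if (pwrite(dstfd, buf, (size_t) nread, off) != nread)
        {
            perror("pwrite");
            return 1;
        }
        off += nread;
    }
    if (nread < 0)
        perror("pread");

    close(srcfd);
    close(dstfd);
    return nread < 0 ? 1 : 0;
}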




Re: "PANIC: could not open critical system index 2662" - twice

2023-05-06 Thread Jeffrey Walton
On Sat, May 6, 2023 at 6:35 AM Thomas Munro  wrote:
>
> On Sat, May 6, 2023 at 9:58 PM Evgeny Morozov
>  wrote:
> > Right - I should have realised that! base/1414389/2662 is indeed all
> > nulls, 32KB of them. I included the file anyway in
> > https://objective.realityexists.net/temp/pgstuff2.zip
>
> OK so it's not just page 0, you have 32KB or 4 pages of all zeroes.
> That's the expected length of that relation when copied from the
> initial template, and consistent with the pg_waldump output (it uses
> FPIs to copy blocks 0-3).  We can't see the block contents but we know
> that block 2 definitely is not all zeroes at that point because there
> are various modifications to it, which not only write non-zeroes but
> must surely have required a sane page 0.
>
> So it does indeed look like something unknown has replaced 32KB of
> data with 32KB of zeroes underneath us.

This may be related... I seem to recall the GNUlib folks talking about
a cp bug on sparse files. It looks like it may be fixed in coreutils
release 9.2 (2023-03-20):
https://github.com/coreutils/coreutils/blob/master/NEWS#L233

If I recall correctly, it had something to do with the way
copy_file_range worked. (Or maybe, it did not work as expected).

According to the GNUlib docs
(https://www.gnu.org/software/gnulib/manual/html_node/copy_005ffile_005frange.html):

This function has many problems on Linux
kernel versions before 5.3
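
For reference, a bare-bones, hypothetical example of how copy_file_range()
is typically driven (shown only to illustrate the interface; it assumes
Linux >= 4.5 and glibc >= 2.27, and is presumably subject to the pre-5.3
problems mentioned above):

/*
 * Illustrative sketch only: copy a whole file with copy_file_range().
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    if (argc != 3)
    {
        fprintf(stderr, "usage: %s SRC DST\n", argv[0]);
        return 1;
    }

    int src = open(argv[1], O_RDONLY);
    struct stat st;
    if (src < 0 || fstat(src, &st) != 0)
    {
        perror(argv[1]);
        return 1;
    }
    int dst = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (dst < 0)
    {
        perror(argv[2]);
        return 1;
    }

    off_t remaining = st.st_size;
    while (remaining > 0)
    {
        /* NULL offsets: the kernel advances both file positions itself,
         * and may copy fewer bytes than asked for, hence the loop. */
        ssize_t n = copy_file_range(src, NULL, dst, NULL,
                                    (size_t) remaining, 0);
        if (n <= 0)
        {
            perror("copy_file_range");
            return 1;
        }
        remaining -= n;
    }

    close(src);
    close(dst);
    return 0;
}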

> Are there more non-empty
> files that are all-zeroes?  Something like this might find them:
>
> for F in base/1414389/*
> do
>   if [ -s $F ] && ! xxd -p $F | grep -qEv '^(00)*$' > /dev/null
>   then
> echo $F
>   fi
> done




Re: Check that numeric is zero

2023-05-06 Thread Andrew Gierth
> "Gabriel" == Gabriel Furstenheim Milerud  writes:

 Gabriel> Hi,

 Gabriel> I'm writing a Postgres native extension and I would like to
 Gabriel> check that a numeric is zero.

 Gabriel> My problem is that all exported methods like numeric_eq or
 Gabriel> numeric_sign require me to have a numeric to start with, and
 Gabriel> const_zero is not exported in numeric.c.

Currently the easiest and most portable way to get a numeric constant is
to call int4_numeric or int8_numeric via DirectFunctionCall; if you
don't care about versions older than pg14 there's also int64_to_numeric
which can be called directly from C.

Datum zero_num = DirectFunctionCall1(int4_numeric, Int32GetDatum(0));

(remember that this will be allocated in the current memory context; if
you want to keep a copy long-term, you'd want to datumCopy it somewhere
else.)
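
Putting that together, a hypothetical extension function (the C and SQL
names below are placeholders) could look like this:

/*
 * Hypothetical extension function: return true if a NUMERIC argument is
 * zero, built from int4_numeric + numeric_eq via DirectFunctionCall as
 * described above.
 */
#include "postgres.h"
#include "fmgr.h"
#include "utils/fmgrprotos.h"   /* declares int4_numeric and numeric_eq */

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(numeric_is_zero);

Datum
numeric_is_zero(PG_FUNCTION_ARGS)
{
    Datum   num = PG_GETARG_DATUM(0);
    Datum   zero = DirectFunctionCall1(int4_numeric, Int32GetDatum(0));

    /* numeric_eq compares values, so 0, 0.0 and 0.000 all count as zero */
    PG_RETURN_BOOL(DatumGetBool(DirectFunctionCall2(numeric_eq, num, zero)));
}

On the SQL side the declaration would be something like
CREATE FUNCTION numeric_is_zero(numeric) RETURNS boolean
AS 'MODULE_PATHNAME', 'numeric_is_zero' LANGUAGE C STRICT IMMUTABLE;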

-- 
Andrew (irc:RhodiumToad)




Re: "PANIC: could not open critical system index 2662" - twice

2023-05-06 Thread Thomas Munro
On Sun, May 7, 2023 at 12:29 AM Evgeny Morozov
 wrote:
> On 6/05/2023 12:34 pm, Thomas Munro wrote:
> > So it does indeed look like something unknown has replaced 32KB of
> > data with 32KB of zeroes underneath us.  Are there more non-empty
> > files that are all-zeroes?  Something like this might find them:
> >
> > for F in base/1414389/*
> > do
> >   if [ -s $F ] && ! xxd -p $F | grep -qEv '^(00)*$' > /dev/null
> >   then
> > echo $F
> >   fi
> > done
>
> Yes, a total of 309 files are all-zeroes (and 52 files are not).
>
> I also checked the other DB that reports the same "unexpected zero page
> at block 0" error, "test_behavior_638186280406544656" (OID 1414967) -
> similar story there. I uploaded the lists of zeroed and non-zeroed files
> and the ls -la output for both as
> https://objective.realityexists.net/temp/pgstuff3.zip
>
> I then searched recursively such all-zeroes files in $PGDATA/base and
> did not find any outside of those two directories (base/1414389 and
> base/1414967). None in $PGDATA/global, either.

So "diff -u zeroed-files-1414967.txt zeroed-files-1414389.txt" shows
that they have the same broken stuff in the range cloned from the
template database by CREATE DATABASE STRATEGY=WAL_LOG, and it looks
like it's *all* the cloned catalogs, and then they have some
non-matching relfilenodes > 140, presumably stuff you created
directly in the new database (I'm not sure if I can say for sure that
those files are broken, without knowing what they are).

Did you previously run this same workload on versions < 15 and never
see any problem?  15 gained a new feature CREATE DATABASE ...
STRATEGY=WAL_LOG, which is also the default.  I wonder if there is a
bug somewhere near that, though I have no specific idea.  If you
explicitly added STRATEGY=FILE_COPY to your CREATE DATABASE commands,
you'll get the traditional behaviour.  It seems like you have some
kind of high frequency testing workload that creates and tests
databases all day long, and just occasionally detects this corruption.
Would you like to try requesting FILE_COPY for a while and see if it
eventually happens like that too?
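
For example, a hypothetical sketch of how a test harness might issue that
via libpq (the database and template names are placeholders; connection
settings come from the usual PG* environment variables):

/*
 * Hypothetical sketch: create a throwaway test database with an explicit
 * FILE_COPY strategy.  Build with: cc test_create.c -lpq (plus -I/-L
 * paths from pg_config as needed).
 */
#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn *conn = PQconnectdb("");     /* use PG* environment defaults */

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    PGresult *res = PQexec(conn,
        "CREATE DATABASE test_behavior_filecopy "
        "TEMPLATE test_template STRATEGY = FILE_COPY");

    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "CREATE DATABASE failed: %s", PQerrorMessage(conn));

    PQclear(res);
    PQfinish(conn);
    return 0;
}

(From psql it is simply a matter of appending STRATEGY = FILE_COPY to the
existing CREATE DATABASE command.)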

My spidey sense is leaning away from filesystem bugs.  We've found
plenty of filesystem bugs on these mailing lists over the years and of
course it's not impossible, but I dunno... it seems quite suspicious
that all the system catalogs have apparently been wiped during or
moments after the creation of a new database that's running new
PostgreSQL 15 code...




Re: Death postgres

2023-05-06 Thread Adrian Klaver

On 5/6/23 10:13, Marc Millas wrote:

>> When you restarted the server were there any warnings shown?
>
> Sadly, I cannot. Will be done next Tuesday.

Cannot do what:

1) Get to the log to see if there are warnings?

2) Restart the server?

Your original post said the server crashed.

If that was the case how can you do any of the below without restarting it?

>> Using psql can you \d ?
>
> Yes, and no problem checking pg_statistic, ...

>> Can you select from any other table in the database?
>
> Yes








--
Adrian Klaver
adrian.kla...@aklaver.com





Check that numeric is zero

2023-05-06 Thread Gabriel Furstenheim Milerud
Hi,
I'm writing a Postgres native extension and I would like to check that a
numeric is zero.

My problem is that all exported methods like numeric_eq or numeric_sign
require me to have a numeric to start with, and const_zero is not exported
in numeric.c.

Any idea how to check it?

Thanks
Gabriel Fürstenheim


Re: Death postgres

2023-05-06 Thread Marc Millas
On Sat, May 6, 2023 at 6:11 PM Adrian Klaver  wrote:

> On 5/6/23 05:25, Marc Millas wrote:
> >
> >
> > On Sat, May 6, 2023 at 6:18 AM Adrian Klaver  wrote:
> >
> > On 5/5/23 18:14, Marc Millas wrote:
> >  > Hi,
> >  >
> >  > postgres 14.2 on Linux redhat
> >  >
> >  > temp_file_limit set around 210 GB.
> >  >
> >  > a select request with 2 left join have crashed the server (oom
> > killer)
> >  > after the postgres disk occupation did grow from 15TB to 16 TB.
> >
> > The result of EXPLAIN  would be helpful.
> > Sure!
> >
> > But one of the tables looks "inaccessible" since then, i.e. even
> > EXPLAIN SELECT * FROM the_table doesn't return and has to be killed
> > by Ctrl-C.
>
> When you restarted the server were there any warnings shown?
>
Sadly, I cannot. Will be done next Tuesday.

>
> Using psql can you \d ?
>
Yes, and no problem checking pg_statistic, ...

>
> Can you select from any other table in the database?
>
Yes

>
>
>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Death postgres

2023-05-06 Thread Adrian Klaver

On 5/6/23 05:25, Marc Millas wrote:



On Sat, May 6, 2023 at 6:18 AM Adrian Klaver  wrote:


On 5/5/23 18:14, Marc Millas wrote:
 > Hi,
 >
 > postgres 14.2 on Linux redhat
 >
 > temp_file_limit set around 210 GB.
 >
 > a select request with 2 left join have crashed the server (oom
killer)
 > after the postgres disk occupation did grow from 15TB to 16 TB.

The result of EXPLAIN  would be helpful.
Sure!

But one of the tables looks "inaccessible" since then, i.e. even EXPLAIN
SELECT * FROM the_table doesn't return and has to be killed by Ctrl-C.


When you restarted the server were there any warnings shown?

Using psql can you \d ?

Can you select from any other table in the database?





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Death postgres

2023-05-06 Thread Thomas Guyot

On 2023-05-05 21:14, Marc Millas wrote:

Hi,

postgres 14.2 on Linux redhat

temp_file_limit set around 210 GB.

a select request with 2 left join have crashed the server (oom killer) 
after the postgres disk occupation did grow from 15TB to 16 TB.


What are the cases where postgres may grow without caring about 
temp_file_limit ?


thanks,



Some OSes, like RHEL 9 (IIRC), now default to tmpfs for /tmp. If your temp
files are written to a tmpfs, that may very well trigger the OOM killer,
because the temp files eat up all RAM.


Check the filesystem type of your temp file's location.
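
As a hypothetical illustration (Linux-only, using statfs(); the path
argument is whatever location you want to check), something like this
answers that question programmatically, though plain "df -T <path>" tells
you the same thing:

/*
 * Hypothetical helper: report whether a given path (e.g. /tmp, or
 * wherever your temp files land) sits on tmpfs, in which case temp files
 * consume RAM rather than disk.
 */
#include <stdio.h>
#include <sys/vfs.h>
#include <linux/magic.h>

int
main(int argc, char **argv)
{
    struct statfs fs;

    if (argc != 2)
    {
        fprintf(stderr, "usage: %s PATH\n", argv[0]);
        return 1;
    }
    if (statfs(argv[1], &fs) != 0)
    {
        perror("statfs");
        return 1;
    }
    printf("%s is %son tmpfs\n", argv[1],
           fs.f_type == TMPFS_MAGIC ? "" : "not ");
    return 0;
}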

--
Thomas




Re: Death postgres

2023-05-06 Thread Ron

On 5/6/23 08:52, Marc Millas wrote:


On Sat, May 6, 2023 at 3:15 PM Ron  wrote:



[snip]


If your question is about temp_file_limit, don't distract us with OOM
issues.

My question is how Postgres can use disk space without respecting
temp_file_limit. The OOM info is just a hint about the context since, as
said, one SELECT generated both things.


It's a distraction to lead with "OOM killed my process".  Evidence of this
is that all the respondents have talked about is memory, not disk space.


--
Born in Arizona, moved to Babylonia.

Re: Death postgres

2023-05-06 Thread Marc Millas
On Sat, May 6, 2023 at 3:15 PM Ron  wrote:

> On 5/6/23 07:19, Marc Millas wrote:
>
>
>
> On Sat, May 6, 2023 at 9:46 AM Peter J. Holzer  wrote:
>
>> On 2023-05-06 03:14:20 +0200, Marc Millas wrote:
>> > postgres 14.2 on Linux redhat
>> >
>> > temp_file_limit set around 210 GB.
>> >
>> > a select request with 2 left join have crashed the server (oom killer)
>> after
>> > the postgres disk occupation did grow from 15TB to 16 TB.
>>
>
> "15TB" and "16TB" are pretty low-resolution.  For example, 15.4TB rounds
> *down* to 15TB, while 15.6TB rounds *up* to 16TB, while they are in fact
> only 200GB apart.
>
> Heck, even 15.4TB and 15.6TB are low-resolution.  temp_file_limit may
> actually be working.
>
>
>> temp_file_limit limits the space a process may use on disk while the OOM
>> killer gets activated when the system runs out of RAM. So these seem to
>> be unrelated.
>>
>> hp
>>
> Its clear that oom killer is triggered by RAM and temp_file is a disk
> thing...
> But the sudden growth of disk space usage and RAM did happen exactly at
> the very same time, with only one user connected, and only one query
> running...
>
>
> If your question is about temp_file_limit, don't distract us with OOM
> issues.
>
My question is how Postgres can use disk space without respecting
temp_file_limit. The OOM info is just a hint about the context since, as
said, one SELECT generated both things.

>
> --
> Born in Arizona, moved to Babylonia.
>


Re: Death postgres

2023-05-06 Thread Marc Millas
On Sat, May 6, 2023 at 3:15 PM Ron  wrote:

> On 5/6/23 07:19, Marc Millas wrote:
>
>
>
> On Sat, May 6, 2023 at 9:46 AM Peter J. Holzer  wrote:
>
>> On 2023-05-06 03:14:20 +0200, Marc Millas wrote:
>> > postgres 14.2 on Linux redhat
>> >
>> > temp_file_limit set around 210 GB.
>> >
>> > a select request with 2 left join have crashed the server (oom killer)
>> after
>> > the postgres disk occupation did grow from 15TB to 16 TB.
>>
>
> "15TB" and "16TB" are pretty low-resolution.  For example, 15.4TB rounds
> *down* to 15TB, while 15.6TB rounds *up* to 16TB, while they are in fact
> only 200GB apart.
>
> Heck, even 15.4TB and 15.6TB are low-resolution.  temp_file_limit may
> actually be working.
>

It was... 15.2 TB, and became 16.3 TB...

>
>
>> temp_file_limit limits the space a process may use on disk while the OOM
>> killer gets activated when the system runs out of RAM. So these seem to
>> be unrelated.
>>
>> hp
>>
> Its clear that oom killer is triggered by RAM and temp_file is a disk
> thing...
> But the sudden growth of disk space usage and RAM did happen exactly at
> the very same time, with only one user connected, and only one query
> running...
>
>
> If your question is about temp_file_limit, don't distract us with OOM
> issues.
>
> --
> Born in Arizona, moved to Babylonia.
>


Re: Death postgres

2023-05-06 Thread Ron

On 5/6/23 07:19, Marc Millas wrote:



On Sat, May 6, 2023 at 9:46 AM Peter J. Holzer  wrote:

On 2023-05-06 03:14:20 +0200, Marc Millas wrote:
> postgres 14.2 on Linux redhat
>
> temp_file_limit set around 210 GB.
>
> a select request with 2 left join have crashed the server (oom
killer) after
> the postgres disk occupation did grow from 15TB to 16 TB.



"15TB" and "16TB" are pretty low-resolution.  For example, 15.4TB rounds 
*down* to 15TB, while 15.6TB rounds *up* to 16TB, while they are in fact 
only 200GB apart.


Heck, even 15.4TB and 15.6TB are low-resolution.  temp_file_limit may 
actually be working.




temp_file_limit limits the space a process may use on disk while the OOM
killer gets activated when the system runs out of RAM. So these seem to
be unrelated.

        hp

Its clear that oom killer is triggered by RAM and temp_file is a disk 
thing...
But the sudden growth of disk space usage and RAM did happen exactly at 
the very same time, with only one user connected, and only one query 
running...


If your question is about temp_file_limit, don't distract us with OOM issues.

--
Born in Arizona, moved to Babylonia.

Re: "PANIC: could not open critical system index 2662" - twice

2023-05-06 Thread Evgeny Morozov
On 6/05/2023 12:34 pm, Thomas Munro wrote:
> So it does indeed look like something unknown has replaced 32KB of
> data with 32KB of zeroes underneath us.  Are there more non-empty
> files that are all-zeroes?  Something like this might find them:
>
> for F in base/1414389/*
> do
>   if [ -s $F ] && ! xxd -p $F | grep -qEv '^(00)*$' > /dev/null
>   then
> echo $F
>   fi
> done

Yes, a total of 309 files are all-zeroes (and 52 files are not).

I also checked the other DB that reports the same "unexpected zero page
at block 0" error, "test_behavior_638186280406544656" (OID 1414967) -
similar story there. I uploaded the lists of zeroed and non-zeroed files
and the ls -la output for both as
https://objective.realityexists.net/temp/pgstuff3.zip

I then searched recursively for such all-zeroes files in $PGDATA/base and
did not find any outside of those two directories (base/1414389 and
base/1414967). None in $PGDATA/global, either.






Re: Death postgres

2023-05-06 Thread Marc Millas
On Sat, May 6, 2023 at 6:18 AM Adrian Klaver  wrote:

> On 5/5/23 18:14, Marc Millas wrote:
> > Hi,
> >
> > postgres 14.2 on Linux redhat
> >
> > temp_file_limit set around 210 GB.
> >
> > a select request with 2 left join have crashed the server (oom killer)
> > after the postgres disk occupation did grow from 15TB to 16 TB.
>
> The result of EXPLAIN  would be helpful.
Sure!

But one of the tables looks "inaccessible" since then, i.e. even EXPLAIN
SELECT * FROM the_table doesn't return and has to be killed by Ctrl-C.

> >
> > What are the cases where postgres may grow without caring about
> > temp_file_limit ?
> >
> > thanks,
> >
> >
> >
> >
> > Marc MILLAS
> > Senior Architect
> > +33607850334
> > www.mokadb.com 
> >
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Death postgres

2023-05-06 Thread Marc Millas
On Sat, May 6, 2023 at 9:46 AM Peter J. Holzer  wrote:

> On 2023-05-06 03:14:20 +0200, Marc Millas wrote:
> > postgres 14.2 on Linux redhat
> >
> > temp_file_limit set around 210 GB.
> >
> > a select request with 2 left join have crashed the server (oom killer)
> after
> > the postgres disk occupation did grow from 15TB to 16 TB.
>
> temp_file_limit limits the space a process may use on disk while the OOM
> killer gets activated when the system runs out of RAM. So these seem to
> be unrelated.
>
> hp
>
It's clear that the OOM killer is triggered by RAM and temp_file_limit is a
disk thing...
But the sudden growth of disk space usage and RAM happened at exactly the
same time, with only one user connected and only one query running...

>
> --
>_  | Peter J. Holzer| Story must make more sense than reality.
> |_|_) ||
> | |   | h...@hjp.at |-- Charles Stross, "Creative writing
> __/   | http://www.hjp.at/ |   challenge!"
>


Re: "PANIC: could not open critical system index 2662" - twice

2023-05-06 Thread Thomas Munro
On Sat, May 6, 2023 at 9:58 PM Evgeny Morozov
 wrote:
> Right - I should have realised that! base/1414389/2662 is indeed all
> nulls, 32KB of them. I included the file anyway in
> https://objective.realityexists.net/temp/pgstuff2.zip

OK so it's not just page 0, you have 32KB or 4 pages of all zeroes.
That's the expected length of that relation when copied from the
initial template, and consistent with the pg_waldump output (it uses
FPIs to copy blocks 0-3).  We can't see the block contents but we know
that block 2 definitely is not all zeroes at that point because there
are various modifications to it, which not only write non-zeroes but
must surely have required a sane page 0.

So it does indeed look like something unknown has replaced 32KB of
data with 32KB of zeroes underneath us.  Are there more non-empty
files that are all-zeroes?  Something like this might find them:

for F in base/1414389/*
do
  if [ -s $F ] && ! xxd -p $F | grep -qEv '^(00)*$' > /dev/null
  then
echo $F
  fi
done




Re: "PANIC: could not open critical system index 2662" - twice

2023-05-06 Thread Evgeny Morozov
On 6/05/2023 1:06 am, Thomas Munro wrote:
> Next can you share the file base/1414389/2662? ("5" was from the wrong
> database.) 

Right - I should have realised that! base/1414389/2662 is indeed all
nulls, 32KB of them. I included the file anyway in
https://objective.realityexists.net/temp/pgstuff2.zip


> Try something like:
> pg_waldump -R 1663/1414389/2662 -F main 00010001
> 00010007
>
> ... but change that to the range of files you have in your pg_wal.

The PG server had only 2 files left from today, and pg_waldumping them
gave this error:

pg_waldump -R 1663/1414389/2662 -F main 00010065005E
00010065005F
pg_waldump: error: error in WAL record at 65/5F629838: invalid record
length at 65/5F62A1E0: wanted 24, got 0

Not sure if that's something to worry about or not!

Then I realised we're actually archiving our WAL files with pgBackRest,
retrieved the WAL files for the time the DB was created and used (~12:39
UTC on 2023-05-02) and re-ran pg_waldump on those.

pg_waldump -R 1663/1414389/2662 -F main 0001005B
0001005B000F

rmgr: XLOG    len (rec/tot): 51/   108, tx: 242382, lsn:
5B/0222BC68, prev 5B/0222BC38, desc: FPI , blkref #0: rel
1663/1414389/2662 blk 0 FPW
rmgr: XLOG    len (rec/tot): 51/  3224, tx: 242382, lsn:
5B/0222BCD8, prev 5B/0222BC68, desc: FPI , blkref #0: rel
1663/1414389/2662 blk 1 FPW
rmgr: XLOG    len (rec/tot): 51/  1544, tx: 242382, lsn:
5B/0222C988, prev 5B/0222BCD8, desc: FPI , blkref #0: rel
1663/1414389/2662 blk 2 FPW
rmgr: XLOG    len (rec/tot): 49/   121, tx: 242382, lsn:
5B/0222CF90, prev 5B/0222C988, desc: FPI , blkref #0: rel
1663/1414389/2662 blk 3 FPW
rmgr: Btree   len (rec/tot): 64/    64, tx: 242384, lsn:
5B/02321870, prev 5B/023217A0, desc: INSERT_LEAF off 132, blkref #0: rel
1663/1414389/2662 blk 2
rmgr: Btree   len (rec/tot): 64/    64, tx: 242384, lsn:
5B/02322640, prev 5B/02322570, desc: INSERT_LEAF off 133, blkref #0: rel
1663/1414389/2662 blk 2
... (many more entries like that)
pg_waldump: error: error in WAL record at 5B/F38: missing contrecord
at 5B/F70

The full output is also in
https://objective.realityexists.net/temp/pgstuff2.zip






Re: Death postgres

2023-05-06 Thread Peter J. Holzer
On 2023-05-06 03:14:20 +0200, Marc Millas wrote:
> postgres 14.2 on Linux redhat
> 
> temp_file_limit set around 210 GB.
> 
> a select request with 2 left join have crashed the server (oom killer) after
> the postgres disk occupation did grow from 15TB to 16 TB.

temp_file_limit limits the space a process may use on disk while the OOM
killer gets activated when the system runs out of RAM. So these seem to
be unrelated.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"

