subject:"Add 64\-bit XIDs into PostgreSQL 15"

Re: Add 64-bit XIDs into PostgreSQL 15

2024-01-18 Thread Shubham Khanna

On Wed, Dec 13, 2023 at 5:56 PM Maxim Orlov  wrote:
>
> Hi!
>
> Just to keep this thread up to date, here's a new version after recent 
> changes in SLRU.
> I'm also change order of the patches in the set, to make adding initdb MOX 
> options after the
> "core 64 xid" patch, since MOX patch is unlikely to be committed and now for 
> test purpose only.

I tried to apply the patch but it is failing at the Head. It is giving
the following error:
Hunk #1 succeeded at 601 (offset 5 lines).
patching file src/backend/replication/slot.c
patching file src/backend/replication/walreceiver.c
patching file src/backend/replication/walsender.c
Hunk #1 succeeded at 2434 (offset 160 lines).
patching file src/backend/storage/ipc/procarray.c
Hunk #1 succeeded at 1115 with fuzz 2.
Hunk #3 succeeded at 1286 with fuzz 2.
Hunk #7 FAILED at 4341.
Hunk #8 FAILED at 4899.
Hunk #9 FAILED at 4959.
3 out of 10 hunks FAILED -- saving rejects to file
src/backend/storage/ipc/procarray.c.rej
patching file src/backend/storage/ipc/standby.c
Hunk #1 FAILED at 1043.
Hunk #2 FAILED at 1370.
2 out of 2 hunks FAILED -- saving rejects to file
src/backend/storage/ipc/standby.c.rej
Please send the Re-base version of the patch.

Thanks and Regards,
Shubham Khanna.

Re: Add 64-bit XIDs into PostgreSQL 15

2023-12-15 Thread wenhui qiu

Hi Pavel Borisov
Many thanks

Best whish

Pavel Borisov  于2023年12月15日周五 17:13写道：

> Hi, Wenhui!
>
> On Fri, 15 Dec 2023 at 05:52, wenhui qiu  wrote:
>
>> Hi Maxim Orlov
>> Good news,xid64 has achieved a successful first phase,I tried to
>> change the path status (https://commitfest.postgresql.org/43/3594/) ,But
>> it seems incorrect
>>
>> Maxim Orlov  于2023年12月13日周三 20:26写道：
>>
>>> Hi!
>>>
>>> Just to keep this thread up to date, here's a new version after recent
>>> changes in SLRU.
>>> I'm also change order of the patches in the set, to make adding initdb
>>> MOX options after the
>>> "core 64 xid" patch, since MOX patch is unlikely to be committed and now
>>> for test purpose only.
>>>
>>
> If the patch is RwF the CF entry is finished and can't be enabled, rather
> the patch needs to be submitted in a new entry, which I have just done.
> https://commitfest.postgresql.org/46/4703/
>
> Please feel free to submit your review.
>
> Kind regards,
> Pavel Borisov,
> Supabase
>

Re: Add 64-bit XIDs into PostgreSQL 15

2023-12-15 Thread Pavel Borisov

Hi, Wenhui!

On Fri, 15 Dec 2023 at 05:52, wenhui qiu  wrote:

> Hi Maxim Orlov
> Good news,xid64 has achieved a successful first phase,I tried to
> change the path status (https://commitfest.postgresql.org/43/3594/) ,But
> it seems incorrect
>
> Maxim Orlov  于2023年12月13日周三 20:26写道：
>
>> Hi!
>>
>> Just to keep this thread up to date, here's a new version after recent
>> changes in SLRU.
>> I'm also change order of the patches in the set, to make adding initdb
>> MOX options after the
>> "core 64 xid" patch, since MOX patch is unlikely to be committed and now
>> for test purpose only.
>>
>
If the patch is RwF the CF entry is finished and can't be enabled, rather
the patch needs to be submitted in a new entry, which I have just done.
https://commitfest.postgresql.org/46/4703/

Please feel free to submit your review.

Kind regards,
Pavel Borisov,
Supabase

Re: Add 64-bit XIDs into PostgreSQL 15

2023-12-14 Thread wenhui qiu

Hi Maxim Orlov
Good news,xid64 has achieved a successful first phase,I tried to change
the path status (https://commitfest.postgresql.org/43/3594/) ,But it seems
incorrect

Maxim Orlov  于2023年12月13日周三 20:26写道：

> Hi!
>
> Just to keep this thread up to date, here's a new version after recent
> changes in SLRU.
> I'm also change order of the patches in the set, to make adding initdb MOX
> options after the
> "core 64 xid" patch, since MOX patch is unlikely to be committed and now
> for test purpose only.
>
> --
> Best regards,
> Maxim Orlov.
>

回复: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-12-04 Thread Thomas wen

Hi orlo...@gmail.com
  That's good news, I think we can continue discuss for  
(https://commitfest.postgresql.org/43/3594/)




Best regards

发件人: John Naylor 
发送时间: 2023年12月4日 18:22
收件人: Pavel Borisov 
抄送: Alexander Korotkov ; Tom Lane ; 
Alexander Lakhin ; Maxim Orlov ; 
Aleksander Alekseev ; Postgres hackers 
; Heikki Linnakangas ; 
Japin Li ; Andres Freund ; Michael 
Paquier ; Peter Eisentraut 

主题: Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into 
PostgreSQL 15)

On Mon, Dec 4, 2023 at 3:12 PM Pavel Borisov  wrote:
> I think this is complete and could be closed.

Done.

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-12-04 Thread John Naylor

On Mon, Dec 4, 2023 at 3:12 PM Pavel Borisov  wrote:
> I think this is complete and could be closed.

Done.

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-12-04 Thread Pavel Borisov

On Mon, 4 Dec 2023 at 10:34, John Naylor  wrote:

> On Thu, Nov 30, 2023 at 4:37 PM Alexander Korotkov 
> wrote:
> >
> > On Thu, Nov 30, 2023 at 10:29 AM Pavel Borisov 
> wrote:
> > > Agree. The fix is attached.
> >
> > What an oversight.
> > Thank you, pushed!
>
> With that, is there any more work pending, or can we close the CF entry?
>
I think this is complete and could be closed.

Regards,
Pavel

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-12-03 Thread John Naylor

On Thu, Nov 30, 2023 at 4:37 PM Alexander Korotkov  wrote:
>
> On Thu, Nov 30, 2023 at 10:29 AM Pavel Borisov  wrote:
> > Agree. The fix is attached.
>
> What an oversight.
> Thank you, pushed!

With that, is there any more work pending, or can we close the CF entry?

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-30 Thread Alexander Korotkov

On Thu, Nov 30, 2023 at 10:29 AM Pavel Borisov  wrote:
> Agree. The fix is attached.

What an oversight.
Thank you, pushed!

--
Regards,
Alexander Korotkov

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-30 Thread Pavel Borisov

On Thu, 30 Nov 2023 at 08:03, Tom Lane  wrote:

> Alexander Lakhin  writes:
> > And a warning:
> > $ CC=gcc-12 CFLAGS="-Wall -Wextra -Wno-unused-parameter
> -Wno-sign-compare -Wno-clobbered
> > -Wno-missing-field-initializers" ./configure -q && make -s
> > slru.c:63:1: warning: ‘inline’ is not at beginning of declaration
> [-Wold-style-declaration]
> > 63 | static int  inline
> >| ^~
>
> > Maybe it's worth fixing before committing...
>
> This should have been fixed before commit, because there are now a
> dozen buildfarm animals complaining about it, as well as who-knows-
> how-many developers' compilers.
>
>  calliphoridae | 2023-11-30 02:48:59 |
> /home/bf/bf-build/calliphoridae/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
> warning: 'inline' is not at beginning of declaration
> [-Wold-style-declaration]
>  canebrake | 2023-11-29 14:22:10 |
> /home/bf/bf-build/canebrake/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
> warning: 'inline' is not at beginning of declaration
> [-Wold-style-declaration]
>  culicidae | 2023-11-30 02:49:06 |
> /home/bf/bf-build/culicidae/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
> warning: 'inline' is not at beginning of declaration
> [-Wold-style-declaration]
>  desmoxytes| 2023-11-30 03:11:15 |
> /home/bf/bf-build/desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
> warning: 'inline' is not at beginning of declaration
> [-Wold-style-declaration]
>  flaviventris  | 2023-11-30 02:53:19 |
> /home/bf/bf-build/flaviventris/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
> warning: 'inline' is not at beginning of declaration
> [-Wold-style-declaration]
>  francolin | 2023-11-30 02:26:08 |
> ../pgsql/src/backend/access/transam/slru.c:63:1: warning: 'inline' is not
> at beginning of declaration [-Wold-style-declaration]
>  grassquit | 2023-11-30 02:58:36 |
> /home/bf/bf-build/grassquit/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
> warning: 'inline' is not at beginning of declaration
> [-Wold-style-declaration]
>  komodoensis   | 2023-11-30 03:07:52 |
> /home/bf/bf-build/komodoensis/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
> warning: 'inline' is not at beginning of declaration
> [-Wold-style-declaration]
>  phycodurus| 2023-11-29 14:29:02 |
> /home/bf/bf-build/phycodurus/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
> warning: 'inline' is not at beginning of declaration
> [-Wold-style-declaration]
>  piculet   | 2023-11-30 02:32:57 |
> ../pgsql/src/backend/access/transam/slru.c:63:1: warning: 'inline' is not
> at beginning of declaration [-Wold-style-declaration]
>  pogona| 2023-11-29 14:22:31 |
> /home/bf/bf-build/pogona/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
> warning: 'inline' is not at beginning of declaration
> [-Wold-style-declaration]
>  rorqual   | 2023-11-30 02:32:41 |
> ../pgsql/src/backend/access/transam/slru.c:63:1: warning: 'inline' is not
> at beginning of declaration [-Wold-style-declaration]
>  serinus   | 2023-11-30 02:47:05 |
> /home/bf/bf-build/serinus/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
> warning: 'inline' is not at beginning of declaration
> [-Wold-style-declaration]
>  skink | 2023-11-29 14:23:05 |
> /home/bf/bf-build/skink-master/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
> warning: 'inline' is not at beginning of declaration
> [-Wold-style-declaration]
>  taipan| 2023-11-30 03:03:52 |
> /home/bf/bf-build/taipan/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
> warning: 'inline' is not at beginning of declaration
> [-Wold-style-declaration]
>  tamandua  | 2023-11-30 02:49:50 |
> /home/bf/bf-build/tamandua/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
> warning: 'inline' is not at beginning of declaration
> [-Wold-style-declaration]
>
> regards, tom lane
>
Agree. The fix is attached.

Regards,
Pavel


0001-Fix-warning-due-non-standard-inline-declaration-in-4.patch
Description: Binary data

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-29 Thread Tom Lane

Alexander Lakhin  writes:
> And a warning:
> $ CC=gcc-12 CFLAGS="-Wall -Wextra -Wno-unused-parameter -Wno-sign-compare 
> -Wno-clobbered 
> -Wno-missing-field-initializers" ./configure -q && make -s
> slru.c:63:1: warning: ‘inline’ is not at beginning of declaration 
> [-Wold-style-declaration]
>     63 | static int  inline
>    | ^~

> Maybe it's worth fixing before committing...

This should have been fixed before commit, because there are now a
dozen buildfarm animals complaining about it, as well as who-knows-
how-many developers' compilers.

 calliphoridae | 2023-11-30 02:48:59 | 
/home/bf/bf-build/calliphoridae/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
 warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
 canebrake | 2023-11-29 14:22:10 | 
/home/bf/bf-build/canebrake/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
 warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
 culicidae | 2023-11-30 02:49:06 | 
/home/bf/bf-build/culicidae/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
 warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
 desmoxytes| 2023-11-30 03:11:15 | 
/home/bf/bf-build/desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
 warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
 flaviventris  | 2023-11-30 02:53:19 | 
/home/bf/bf-build/flaviventris/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
 warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
 francolin | 2023-11-30 02:26:08 | 
../pgsql/src/backend/access/transam/slru.c:63:1: warning: 'inline' is not at 
beginning of declaration [-Wold-style-declaration]
 grassquit | 2023-11-30 02:58:36 | 
/home/bf/bf-build/grassquit/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
 warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
 komodoensis   | 2023-11-30 03:07:52 | 
/home/bf/bf-build/komodoensis/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
 warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
 phycodurus| 2023-11-29 14:29:02 | 
/home/bf/bf-build/phycodurus/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
 warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
 piculet   | 2023-11-30 02:32:57 | 
../pgsql/src/backend/access/transam/slru.c:63:1: warning: 'inline' is not at 
beginning of declaration [-Wold-style-declaration]
 pogona| 2023-11-29 14:22:31 | 
/home/bf/bf-build/pogona/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
 warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
 rorqual   | 2023-11-30 02:32:41 | 
../pgsql/src/backend/access/transam/slru.c:63:1: warning: 'inline' is not at 
beginning of declaration [-Wold-style-declaration]
 serinus   | 2023-11-30 02:47:05 | 
/home/bf/bf-build/serinus/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
 warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
 skink | 2023-11-29 14:23:05 | 
/home/bf/bf-build/skink-master/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
 warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
 taipan| 2023-11-30 03:03:52 | 
/home/bf/bf-build/taipan/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
 warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
 tamandua  | 2023-11-30 02:49:50 | 
/home/bf/bf-build/tamandua/HEAD/pgsql.build/../pgsql/src/backend/access/transam/slru.c:63:1:
 warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]

regards, tom lane

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-28 Thread Alexander Korotkov

Андрей, привет!

Текущее положение у меня такое.

.
*pg_stats and range statisticsTracking statements entry timestamp in
pg_stat_statements*

Уже закоммичены.

*XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL
15)*
Если всё будет и замечаний не возникнет ок, завтра утром закоммичу.

*Add semi-join pushdown to postgres_fdw*
Переработал патч, сделал обработку условий более аккурантно. Хочу попросить
ещё 2 часа на финальное ревью и коммит.

*May be BUG. Periodic burst growth of the checkpoint_req counter on
replica.*
Сегодня вечером планирую доделать и выложить review.

--
Regards,
Alexander Korotkov

--
Regards,
Alexander Korotkov

On Mon, Nov 27, 2023 at 9:00 AM Alexander Lakhin 
wrote:

> Hello Alexander,
>
> 27.11.2023 02:43, Alexander Korotkov wrote:
>
> > v61 looks good to me.  I'm going to push it as long as there are no
> objections.
> >
>
> I've looked at the patch set and found a typo:
> occured
>
> And a warning:
> $ CC=gcc-12 CFLAGS="-Wall -Wextra -Wno-unused-parameter -Wno-sign-compare
> -Wno-clobbered
> -Wno-missing-field-initializers" ./configure -q && make -s
> slru.c:63:1: warning: ‘inline’ is not at beginning of declaration
> [-Wold-style-declaration]
> 63 | static int  inline
>| ^~
>
> Maybe it's worth fixing before committing...
>
> Best regards,
> Alexander
>

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-26 Thread Alexander Lakhin


Hello Alexander,

27.11.2023 02:43, Alexander Korotkov wrote:


v61 looks good to me.  I'm going to push it as long as there are no objections.



I've looked at the patch set and found a typo:
occured

And a warning:
$ CC=gcc-12 CFLAGS="-Wall -Wextra -Wno-unused-parameter -Wno-sign-compare -Wno-clobbered 
-Wno-missing-field-initializers" ./configure -q && make -s

slru.c:63:1: warning: ‘inline’ is not at beginning of declaration 
[-Wold-style-declaration]
   63 | static int  inline
  | ^~

Maybe it's worth fixing before committing...

Best regards,
Alexander

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-26 Thread Alexander Korotkov

Alexander, Maxim,

Thank you for revisions.

On Thu, Nov 9, 2023 at 6:22 PM Maxim Orlov  wrote:
> Aleksander Alekseev,
>
> > Maxim,
> > I see both of us accounted for Alexanders feedback and submitted v59.
> > Your newer version seems to have issues on cfbot, so resubmitting the
> > previous patchset that passes the tests. Please feel free to add
> > changes.
>
> For unknown reasons, I do not receive any of your emails from after 
> 2023-11-07 11:57:12 (Message-ID: 
> CAJ7c6TN1hKqNPGrNcq96SUyD=z61rakgxf8iqq36qr90oud...@mail.gmail.com).
> Even after resend.
>
> Anyway, PFA patch set of version 61.  I've made some minor changes in the 
> 0001 and add 004 in order to test actual 64-bit SLRU pages.
>
> As for CF bot had failed on my v59 patch set, this is because of the bug.  
> It's manifested because of added 64-bit pages tests.
> The problem was in segno calculation, since we convert it from file name 
> using strtol call.  But strtol return long,
> which is 4 byte long in x86.
>
> -   segno = (int) strtol(clde->d_name, NULL, 16);
> +   segno = strtoi64(clde->d_name, NULL, 16);

v61 looks good to me.  I'm going to push it as long as there are no objections.

--
Regards,
Alexander Korotkov

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-09 Thread Maxim Orlov

Aleksander Alekseev,

> Maxim,
> I see both of us accounted for Alexanders feedback and submitted v59.
> Your newer version seems to have issues on cfbot, so resubmitting the
> previous patchset that passes the tests. Please feel free to add
> changes.

For unknown reasons, I do not receive any of your emails from after
2023-11-07 11:57:12 (Message-ID: CAJ7c6TN1hKqNPGrNcq96SUyD=
z61rakgxf8iqq36qr90oud...@mail.gmail.com).
Even after resend.

Anyway, PFA patch set of version 61.  I've made some minor changes in the
0001 and add 004 in order to test actual 64-bit SLRU pages.

As for CF bot had failed on my v59 patch set, this is because of the bug.
It's manifested because of added 64-bit pages tests.
The problem was in segno calculation, since we convert it from file name
using strtol call.  But strtol return long,
which is 4 byte long in x86.

-   segno = (int) strtol(clde->d_name, NULL, 16);
+   segno = strtoi64(clde->d_name, NULL, 16);

-- 
Best regards,
Maxim Orlov.


v61-0003-Make-use-FullTransactionId-in-2PC-filenames.patch
Description: Binary data


v61-0004-Add-SLRU-tests-for-64-bit-page-case.patch
Description: Binary data


v61-0002-Use-larger-segment-file-names-for-pg_notify.patch
Description: Binary data


v61-0001-Index-SLRUs-by-64-bit-integers-rather-than-by-32.patch
Description: Binary data

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-08 Thread Aleksander Alekseev

Maxim,

I see both of us accounted for Alexanders feedback and submitted v59.
Your newer version seems to have issues on cfbot, so resubmitting the
previous patchset that passes the tests. Please feel free to add
changes.

> See SlruCorrectSegmentFilenameLength():
>
> ```
> if (ctl->long_segment_names)
> return (len == 15); /* see SlruFileName() */
> else
> /*
>  * Commit 638cf09e76d allowed 5-character lengths. Later commit
>  * 73c986adde5 allowed 6-character length.
>  *
>  * XXX should we still consider such names to be valid?
>  */
> return (len == 4 || len == 5 || len == 6);
> ```
>
> Should we just drop it or check that segno is <= 0xFF?

I also choose to change this Assert and to add a corresponding comment:

else
{
-   Assert(segno >= 0 && segno <= 0x);
+   /*
+* Despite the fact that %04X format string is used up to 24 bit
+* integers are allowed. See SlruCorrectSegmentFilenameLength()
+*/
+   Assert(segno >= 0 && segno <= 0xFF);
return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir,
(unsigned int) segno);
}


--
Best regards,
Aleksander Alekseev


v60-0001-Index-SLRUs-by-64-bit-integers-rather-than-by-32.patch
Description: Binary data


v60-0003-Make-use-FullTransactionId-in-2PC-filenames.patch
Description: Binary data


v60-0002-Use-larger-segment-file-names-for-pg_notify.patch
Description: Binary data

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-07 Thread Maxim Orlov

On Mon, 6 Nov 2023 at 16:07, Alexander Korotkov 
wrote:

> Hi!
>
> On Wed, Jul 5, 2023 at 4:46 PM Aleksander Alekseev <
> aleksan...@timescale.com> wrote:
> > PFE the corrected patchset v58.
>
> I'd like to revive this thread.
>
Hi! Great news!

> BTW, there is a typo in a word "exceeed".
>
Fixed.

>
> +static int inline
> +SlruFileName(SlruCtl ctl, char *path, int64 segno)
> +{
> ...
> +}
>
> I think it worth adding asserts here to verify there is no overflow making
> us mapping different segments into the same files.
>
Agree, assertion added.

> +   return occupied == max_notify_queue_pages;
>
> I'm not sure if the current code could actually allow to occupy more than
> max_notify_queue_pages.  Probably not even in extreme cases.  But I still
> think it will more safe and easier to read to write "occupied >=
> max_notify_queue"_pages here.
>
Fixed.

> diff --git a/src/test/modules/test_slru/test_slru.c
> b/src/test/modules/test_slru/test_slru.c
>
> The actual 64-bitness of SLRU pages isn't much exercised in our automated
> tests.  It would be too exhausting to make pg_notify actually use higher
> than 2**32 page numbers.  Thus, I think test/modules/test_slru is a good
> place to give high page numbers a good test.
>
PFA, I've add test for a 64-bit SLRU pages.

By the way, there is another one useful thing we may do here.  For now
pg_commit_ts functionality is rather strange: if it was enabled, then
disabled and then enabled again all the data from before will be
discarded.  Meanwhile, users expected to have their commit timestamps for
all transactions, which were "logged" when this feature was enabled.  It's
weird.

AFICS, the only reason for this behaviour is becouse of transaction
wraparound. It may occur while the feature is disabled end it is safe to
simply remove all the data from previous period.  If we switch to
FullTransactionId in commit_ts we can overcome this limitation.  But I'm
not sure if it worth to try to fix this in current patchset, since it is
already non trivial.

-- 
Best regards,
Maxim Orlov.

v59-0003-Make-use-FullTransactionId-in-2PC-filenames.patch
Description: Binary data

v59-0004-Add-SLRU-tests-for-64-bit-page-case.patch
Description: Binary data

v59-0001-Index-SLRUs-by-64-bit-integers-rather-than-by-32.patch
Description: Binary data

v59-0002-Use-larger-segment-file-names-for-pg_notify.patch
Description: Binary data

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-07 Thread Aleksander Alekseev

Hi again,

> PFA the corrected patchset v59.

On second thought, I believe this Assert is incorrect:

```
+else
+{
+Assert(segno >= 0 && segno <= 0x);
+return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir,
+(unsigned int) segno);
+}
```

See SlruCorrectSegmentFilenameLength():

```
if (ctl->long_segment_names)
return (len == 15); /* see SlruFileName() */
else
/*
 * Commit 638cf09e76d allowed 5-character lengths. Later commit
 * 73c986adde5 allowed 6-character length.
 *
 * XXX should we still consider such names to be valid?
 */
return (len == 4 || len == 5 || len == 6);
```

Should we just drop it or check that segno is <= 0xFF?

-- 
Best regards,
Aleksander Alekseev

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-07 Thread Aleksander Alekseev

Hi Alexander,

> -#define TransactionIdToCTsPage(xid) \
> -   ((xid) / (TransactionId) COMMIT_TS_XACTS_PER_PAGE)
> +
> +/*
> + * Although we return an int64 the actual value can't currently exceeed 
> 2**32.
> + */
> +static inline int64
> +TransactionIdToCTsPage(TransactionId xid)
> +{
> +   return xid / (int64) COMMIT_TS_XACTS_PER_PAGE;
> +}
>
> Is there any reason we transform macro into a function?  If not, I propose to 
> leave this as a macro.  BTW, there is a typo in a word "exceeed".

I kept the inline function, as we agreed above.

Typo fixed.

> +static int inline
> +SlruFileName(SlruCtl ctl, char *path, int64 segno)
> +{
> +   if (ctl->long_segment_names)
> +   /*
> +* We could use 16 characters here but the disadvantage would be that
> +* the SLRU segments will be hard to distinguish from WAL segments.
> +*
> +* For this reason we use 15 characters. It is enough but also means
> +* that in the future we can't decrease SLRU_PAGES_PER_SEGMENT easily.
> +*/
> +   return snprintf(path, MAXPGPATH, "%s/%015llX", ctl->Dir,
> +   (long long) segno);
> +   else
> +   return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir,
> +   (unsigned int) segno);
> +}
>
> I think it worth adding asserts here to verify there is no overflow making us 
> mapping different segments into the same files.

Added. I noticed a off-by-one error in the code snippet proposed
above, so my code differs a bit.

> +   return occupied == max_notify_queue_pages;
>
> I'm not sure if the current code could actually allow to occupy more than 
> max_notify_queue_pages.  Probably not even in extreme cases.  But I still 
> think it will more safe and easier to read to write "occupied >= 
> max_notify_queue"_pages here.

Fixed.

> diff --git a/src/test/modules/test_slru/test_slru.c 
> b/src/test/modules/test_slru/test_slru.c
>
> The actual 64-bitness of SLRU pages isn't much exercised in our automated 
> tests.  It would be too exhausting to make pg_notify actually use higher than 
> 2**32 page numbers.  Thus, I think test/modules/test_slru is a good place to 
> give high page numbers a good test.

Fixed. I choose not to change any numbers in the test in order to
check any corner cases, etc. The code patches for long_segment_names =
true and long_segment_names = false are almost the same thus it will
not improve code coverage. Using the current numbers will allow to
easily switch back to long_segment_names = false in the test if
necessary.

PFA the corrected patchset v59.

-- 
Best regards,
Aleksander Alekseev


v59-0003-Make-use-FullTransactionId-in-2PC-filenames.patch
Description: Binary data


v59-0002-Use-larger-segment-file-names-for-pg_notify.patch
Description: Binary data


v59-0001-Index-SLRUs-by-64-bit-integers-rather-than-by-32.patch
Description: Binary data

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-06 Thread Alexander Korotkov

On Mon, Nov 6, 2023 at 4:38 PM Aleksander Alekseev 
wrote:
> > > PFE the corrected patchset v58.
> >
> > I'd like to revive this thread.
>
> Many thanks for your comments and suggestions.
>
> > I think it worth adding asserts here to verify there is no overflow
making us mapping different segments into the same files.
>
> Sorry, I didn't understand this one. Maybe you could provide the exact
code?

I actually meant this.

static int  inline
SlruFileName(SlruCtl ctl, char *path, int64 segno)
{
if (ctl->long_segment_names)
{
/*
 * We could use 16 characters here but the disadvantage would be
that
 * the SLRU segments will be hard to distinguish from WAL segments.
 *
 * For this reason we use 15 characters. It is enough but also means
 * that in the future we can't decrease SLRU_PAGES_PER_SEGMENT
easily.
 */
Assert(segno >= 0 && segno <= 0x1000);
return snprintf(path, MAXPGPATH, "%s/%015llX", ctl->Dir,
(long long) segno);
}
else
{
Assert(segno >= 0 && segno <= 0x1);
return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir,
(unsigned int) segno);
}
}

As I now get, snprintf() wouldn't just truncate the high signs, instead it
will use more characters.  But I think assertions are useful anyway.

--
Regards,
Alexander Korotkov

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-06 Thread Aleksander Alekseev

Alexander,

> > PFE the corrected patchset v58.
>
> I'd like to revive this thread.

Many thanks for your comments and suggestions.

> I think it worth adding asserts here to verify there is no overflow making us 
> mapping different segments into the same files.

Sorry, I didn't understand this one. Maybe you could provide the exact code?

-- 
Best regards,
Aleksander Alekseev

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-06 Thread Aleksander Alekseev

Hi,

> > > If I remember right, the compiler will make equivalent code from
> > > inline functions and macros, and functions has an additional benefit:
> > > the compiler will report type mismatch if any. That was the only
> > > reason.
> >
> > Then it's OK to leave it as an inline function.

+1

> > BTW, I'm a bit puzzled on who should be the first author now?
> Thanks! It's long, I agree, but the activity around this was huge and
> involved many people, the patch itself already has more than
> half-hundred iterations to date. I think at least people mentioned in
> the commit message of v58 are fair to have author status.
>
> As for the first, I'm not sure, it's hard for me to evaluate what is
> more important, the initial prototype, or the final improvement
> iterations. I don't think the existing order in a commit message has
> some meaning. Maybe it's worth throwing a dice.

Personally I was not aware that the order of the authors in a commit
message carries any information. To my knowledge it doesn't not. I
believe this patchset is a team effort, so I suggest keeping the
commit message as is or shuffling the authors randomly. I believe we
should do our best to reflect the input of people who previously
contributed to the effort, if anyone are aware of them, and add them
to the commit message if they are not there yet. Pretty sure Git will
forgive us if the list ends up being long. Hopefully so will people
who we mistakenly forget.

-- 
Best regards,
Aleksander Alekseev

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-06 Thread Pavel Borisov

On Mon, 6 Nov 2023 at 18:01, Alexander Korotkov  wrote:
>
> On Mon, Nov 6, 2023 at 3:42 PM Pavel Borisov  wrote:
> >
> > On Mon, 6 Nov 2023 at 17:07, Alexander Korotkov  
> > wrote:
> > > On Wed, Jul 5, 2023 at 4:46 PM Aleksander Alekseev 
> > >  wrote:
> > > > PFE the corrected patchset v58.
> > >
> > > I'd like to revive this thread.
> > >
> > > This patchset is extracted from a larger patchset implementing 64-bit 
> > > xids.  It converts page numbers in SLRUs into 64 bits.  The most SLRUs 
> > > save the same file naming scheme, thus their on-disk representation 
> > > remains the same.  However, the patch 0002 converts pg_notify to actually 
> > > use a new naming scheme.  Therefore pg_notify can benefit from 
> > > simplification and getting rid of wraparound.
> > >
> > > -#define TransactionIdToCTsPage(xid) \
> > > -   ((xid) / (TransactionId) COMMIT_TS_XACTS_PER_PAGE)
> > > +
> > > +/*
> > > + * Although we return an int64 the actual value can't currently exceeed 
> > > 2**32.
> > > + */
> > > +static inline int64
> > > +TransactionIdToCTsPage(TransactionId xid)
> > > +{
> > > +   return xid / (int64) COMMIT_TS_XACTS_PER_PAGE;
> > > +}
> > >
> > > Is there any reason we transform macro into a function?  If not, I 
> > > propose to leave this as a macro.  BTW, there is a typo in a word 
> > > "exceeed".
> > If I remember right, the compiler will make equivalent code from
> > inline functions and macros, and functions has an additional benefit:
> > the compiler will report type mismatch if any. That was the only
> > reason.
>
> Then it's OK to leave it as an inline function.
>
> > Also, I looked at v58-0001 and don't quite agree with mentioning the
> > authors of the original 64-xid patch, from which the current patch is
> > derived, as just "privious input" persons.
>
> +1, for converting all "previous input" persons as additional authors.
> That would be a pretty long list of authors though.
> BTW, I'm a bit puzzled on who should be the first author now?
Thanks! It's long, I agree, but the activity around this was huge and
involved many people, the patch itself already has more than
half-hundred iterations to date. I think at least people mentioned in
the commit message of v58 are fair to have author status.

As for the first, I'm not sure, it's hard for me to evaluate what is
more important, the initial prototype, or the final improvement
iterations. I don't think the existing order in a commit message has
some meaning. Maybe it's worth throwing a dice.

Regards, Pavel

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-06 Thread Alexander Korotkov

On Mon, Nov 6, 2023 at 3:42 PM Pavel Borisov  wrote:
>
> On Mon, 6 Nov 2023 at 17:07, Alexander Korotkov  wrote:
> > On Wed, Jul 5, 2023 at 4:46 PM Aleksander Alekseev 
> >  wrote:
> > > PFE the corrected patchset v58.
> >
> > I'd like to revive this thread.
> >
> > This patchset is extracted from a larger patchset implementing 64-bit xids. 
> >  It converts page numbers in SLRUs into 64 bits.  The most SLRUs save the 
> > same file naming scheme, thus their on-disk representation remains the 
> > same.  However, the patch 0002 converts pg_notify to actually use a new 
> > naming scheme.  Therefore pg_notify can benefit from simplification and 
> > getting rid of wraparound.
> >
> > -#define TransactionIdToCTsPage(xid) \
> > -   ((xid) / (TransactionId) COMMIT_TS_XACTS_PER_PAGE)
> > +
> > +/*
> > + * Although we return an int64 the actual value can't currently exceeed 
> > 2**32.
> > + */
> > +static inline int64
> > +TransactionIdToCTsPage(TransactionId xid)
> > +{
> > +   return xid / (int64) COMMIT_TS_XACTS_PER_PAGE;
> > +}
> >
> > Is there any reason we transform macro into a function?  If not, I propose 
> > to leave this as a macro.  BTW, there is a typo in a word "exceeed".
> If I remember right, the compiler will make equivalent code from
> inline functions and macros, and functions has an additional benefit:
> the compiler will report type mismatch if any. That was the only
> reason.

Then it's OK to leave it as an inline function.

> Also, I looked at v58-0001 and don't quite agree with mentioning the
> authors of the original 64-xid patch, from which the current patch is
> derived, as just "privious input" persons.

+1, for converting all "previous input" persons as additional authors.
That would be a pretty long list of authors though.

BTW, I'm a bit puzzled on who should be the first author now?

--
Regards,
Alexander Korotkov

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-06 Thread Pavel Borisov

On Mon, 6 Nov 2023 at 17:07, Alexander Korotkov  wrote:
>
> Hi!
>
> On Wed, Jul 5, 2023 at 4:46 PM Aleksander Alekseev  
> wrote:
> > PFE the corrected patchset v58.
>
> I'd like to revive this thread.
>
> This patchset is extracted from a larger patchset implementing 64-bit xids.  
> It converts page numbers in SLRUs into 64 bits.  The most SLRUs save the same 
> file naming scheme, thus their on-disk representation remains the same.  
> However, the patch 0002 converts pg_notify to actually use a new naming 
> scheme.  Therefore pg_notify can benefit from simplification and getting rid 
> of wraparound.
>
> -#define TransactionIdToCTsPage(xid) \
> -   ((xid) / (TransactionId) COMMIT_TS_XACTS_PER_PAGE)
> +
> +/*
> + * Although we return an int64 the actual value can't currently exceeed 
> 2**32.
> + */
> +static inline int64
> +TransactionIdToCTsPage(TransactionId xid)
> +{
> +   return xid / (int64) COMMIT_TS_XACTS_PER_PAGE;
> +}
>
> Is there any reason we transform macro into a function?  If not, I propose to 
> leave this as a macro.  BTW, there is a typo in a word "exceeed".
If I remember right, the compiler will make equivalent code from
inline functions and macros, and functions has an additional benefit:
the compiler will report type mismatch if any. That was the only
reason.

Also, I looked at v58-0001 and don't quite agree with mentioning the
authors of the original 64-xid patch, from which the current patch is
derived, as just "privious input" persons.

Regards, Pavel Borisov

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-06 Thread Alexander Korotkov

Hi!

On Wed, Jul 5, 2023 at 4:46 PM Aleksander Alekseev 
wrote:
> PFE the corrected patchset v58.

I'd like to revive this thread.

This patchset is extracted from a larger patchset implementing 64-bit
xids.  It converts page numbers in SLRUs into 64 bits.  The most SLRUs save
the same file naming scheme, thus their on-disk representation remains the
same.  However, the patch 0002 converts pg_notify to actually use a new
naming scheme.  Therefore pg_notify can benefit from simplification and
getting rid of wraparound.

-#define TransactionIdToCTsPage(xid) \
-   ((xid) / (TransactionId) COMMIT_TS_XACTS_PER_PAGE)
+
+/*
+ * Although we return an int64 the actual value can't currently exceeed
2**32.
+ */
+static inline int64
+TransactionIdToCTsPage(TransactionId xid)
+{
+   return xid / (int64) COMMIT_TS_XACTS_PER_PAGE;
+}

Is there any reason we transform macro into a function?  If not, I propose
to leave this as a macro.  BTW, there is a typo in a word "exceeed".

+static int inline
+SlruFileName(SlruCtl ctl, char *path, int64 segno)
+{
+   if (ctl->long_segment_names)
+   /*
+* We could use 16 characters here but the disadvantage would be
that
+* the SLRU segments will be hard to distinguish from WAL segments.
+*
+* For this reason we use 15 characters. It is enough but also means
+* that in the future we can't decrease SLRU_PAGES_PER_SEGMENT
easily.
+*/
+   return snprintf(path, MAXPGPATH, "%s/%015llX", ctl->Dir,
+   (long long) segno);
+   else
+   return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir,
+   (unsigned int) segno);
+}

I think it worth adding asserts here to verify there is no overflow making
us mapping different segments into the same files.

+   return occupied == max_notify_queue_pages;

I'm not sure if the current code could actually allow to occupy more than
max_notify_queue_pages.  Probably not even in extreme cases.  But I still
think it will more safe and easier to read to write "occupied >=
max_notify_queue"_pages here.

diff --git a/src/test/modules/test_slru/test_slru.c
b/src/test/modules/test_slru/test_slru.c

The actual 64-bitness of SLRU pages isn't much exercised in our automated
tests.  It would be too exhausting to make pg_notify actually use higher
than 2**32 page numbers.  Thus, I think test/modules/test_slru is a good
place to give high page numbers a good test.

--
Regards,
Alexander Korotkov

回复: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-10-03 Thread Thomas wen

> > This patch makes the filenames always 12 characters wide (for SLRUs that
> > opt-in to the new naming). That's actually not enough for the full range
> > that a 64 bit page number can represent. Should we make it 16 characters
> > now, to avoid repeating the same mistake we made earlier? Or make it
> > more configurable, on a per-SLRU basis. One reason I don't want to just
> > make it 16 characters is that WAL segment filenames are also 16 hex
> > characters, which could cause confusion.
>
> Good catch. I propose the following solution:
>
> ```
>  SlruFileName(SlruCtl ctl, char *path, int64 segno)
>  {
>  if (ctl->long_segment_names)
> -return snprintf(path, MAXPGPATH, "%s/%012llX", ctl->Dir,
> +/*
> + * We could use 16 characters here but the disadvantage would be that
> + * the SLRU segments will be hard to distinguish from WAL segments.
> + *
> + * For this reason we use 15 characters. It is enough but also means
> + * that in the future we can't decrease SLRU_PAGES_PER_SEGMENT 
> easily.
> + */
> +return snprintf(path, MAXPGPATH, "%s/%015llX", ctl->Dir,
>  (long long) segno);
>  else
>  return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir,
> ```
>
> PFE the corrected patchset v58.
Good idea


发件人: Aleksander Alekseev 
发送时间: 2023年9月4日 22:41
收件人: Postgres hackers 
抄送: Heikki Linnakangas ; Maxim Orlov ; 
Jacob Champion ; Japin Li ; 
Andres Freund ; Michael Paquier ; 
Pavel Borisov ; Peter Eisentraut 
; Alexander Korotkov 
主题: Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into 
PostgreSQL 15)

Hi,

> > Reviewing this now. I think it's almost ready to be committed.
> >
> > There's another big effort going on to move SLRUs to the regular buffer
> > cache (https://commitfest.postgresql.org/43/3514/). I wonder how moving
> > to 64 bit page numbers will affect that. BlockNumber is still 32 bits,
> > after all.
>
> Somehow I didn't pay too much attention to this effort, thanks. I will
> familiarize myself with the patch. Intuitively I don't think that the
> patchse should block each other.
>
> > This patch makes the filenames always 12 characters wide (for SLRUs that
> > opt-in to the new naming). That's actually not enough for the full range
> > that a 64 bit page number can represent. Should we make it 16 characters
> > now, to avoid repeating the same mistake we made earlier? Or make it
> > more configurable, on a per-SLRU basis. One reason I don't want to just
> > make it 16 characters is that WAL segment filenames are also 16 hex
> > characters, which could cause confusion.
>
> Good catch. I propose the following solution:
>
> ```
>  SlruFileName(SlruCtl ctl, char *path, int64 segno)
>  {
>  if (ctl->long_segment_names)
> -return snprintf(path, MAXPGPATH, "%s/%012llX", ctl->Dir,
> +/*
> + * We could use 16 characters here but the disadvantage would be that
> + * the SLRU segments will be hard to distinguish from WAL segments.
> + *
> + * For this reason we use 15 characters. It is enough but also means
> + * that in the future we can't decrease SLRU_PAGES_PER_SEGMENT 
> easily.
> + */
> +return snprintf(path, MAXPGPATH, "%s/%015llX", ctl->Dir,
>  (long long) segno);
>  else
>  return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir,
> ```
>
> PFE the corrected patchset v58.

While triaging the patches for the September CF [1] a consensus was
reached that the patchset needs another round of review. Also I
removed myself from the list of reviewers in order to make it clear
that a review from somebody else would be appreciated.

[1]: https://postgr.es/m/0737f444-59bb-ac1d-2753-873c40da0840%40eisentraut.org
--
Best regards,
Aleksander Alekseev

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-09-04 Thread Aleksander Alekseev

Hi,

> > Reviewing this now. I think it's almost ready to be committed.
> >
> > There's another big effort going on to move SLRUs to the regular buffer
> > cache (https://commitfest.postgresql.org/43/3514/). I wonder how moving
> > to 64 bit page numbers will affect that. BlockNumber is still 32 bits,
> > after all.
>
> Somehow I didn't pay too much attention to this effort, thanks. I will
> familiarize myself with the patch. Intuitively I don't think that the
> patchse should block each other.
>
> > This patch makes the filenames always 12 characters wide (for SLRUs that
> > opt-in to the new naming). That's actually not enough for the full range
> > that a 64 bit page number can represent. Should we make it 16 characters
> > now, to avoid repeating the same mistake we made earlier? Or make it
> > more configurable, on a per-SLRU basis. One reason I don't want to just
> > make it 16 characters is that WAL segment filenames are also 16 hex
> > characters, which could cause confusion.
>
> Good catch. I propose the following solution:
>
> ```
>  SlruFileName(SlruCtl ctl, char *path, int64 segno)
>  {
>  if (ctl->long_segment_names)
> -return snprintf(path, MAXPGPATH, "%s/%012llX", ctl->Dir,
> +/*
> + * We could use 16 characters here but the disadvantage would be that
> + * the SLRU segments will be hard to distinguish from WAL segments.
> + *
> + * For this reason we use 15 characters. It is enough but also means
> + * that in the future we can't decrease SLRU_PAGES_PER_SEGMENT 
> easily.
> + */
> +return snprintf(path, MAXPGPATH, "%s/%015llX", ctl->Dir,
>  (long long) segno);
>  else
>  return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir,
> ```
>
> PFE the corrected patchset v58.

While triaging the patches for the September CF [1] a consensus was
reached that the patchset needs another round of review. Also I
removed myself from the list of reviewers in order to make it clear
that a review from somebody else would be appreciated.

[1]: https://postgr.es/m/0737f444-59bb-ac1d-2753-873c40da0840%40eisentraut.org
-- 
Best regards,
Aleksander Alekseev

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-07-05 Thread Aleksander Alekseev

Hi,

> Reviewing this now. I think it's almost ready to be committed.
>
> There's another big effort going on to move SLRUs to the regular buffer
> cache (https://commitfest.postgresql.org/43/3514/). I wonder how moving
> to 64 bit page numbers will affect that. BlockNumber is still 32 bits,
> after all.

Somehow I didn't pay too much attention to this effort, thanks. I will
familiarize myself with the patch. Intuitively I don't think that the
patchse should block each other.

> This patch makes the filenames always 12 characters wide (for SLRUs that
> opt-in to the new naming). That's actually not enough for the full range
> that a 64 bit page number can represent. Should we make it 16 characters
> now, to avoid repeating the same mistake we made earlier? Or make it
> more configurable, on a per-SLRU basis. One reason I don't want to just
> make it 16 characters is that WAL segment filenames are also 16 hex
> characters, which could cause confusion.

Good catch. I propose the following solution:

```
 SlruFileName(SlruCtl ctl, char *path, int64 segno)
 {
 if (ctl->long_segment_names)
-return snprintf(path, MAXPGPATH, "%s/%012llX", ctl->Dir,
+/*
+ * We could use 16 characters here but the disadvantage would be that
+ * the SLRU segments will be hard to distinguish from WAL segments.
+ *
+ * For this reason we use 15 characters. It is enough but also means
+ * that in the future we can't decrease SLRU_PAGES_PER_SEGMENT easily.
+ */
+return snprintf(path, MAXPGPATH, "%s/%015llX", ctl->Dir,
 (long long) segno);
 else
 return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir,
```

PFE the corrected patchset v58.

--
Best regards,
Aleksander Alekseev


v58-0003-Make-use-FullTransactionId-in-2PC-filenames.patch
Description: Binary data


v58-0001-Index-SLRUs-by-64-bit-integers-rather-than-by-32.patch
Description: Binary data


v58-0002-Use-larger-segment-file-names-for-pg_notify.patch
Description: Binary data

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-07-05 Thread Heikki Linnakangas


On 09/03/2023 17:21, Aleksander Alekseev wrote:

v57-0001-Index-SLRUs-by-64-bit-integers-rather-than-by-32.patch


Reviewing this now. I think it's almost ready to be committed.

There's another big effort going on to move SLRUs to the regular buffer 
cache (https://commitfest.postgresql.org/43/3514/). I wonder how moving 
to 64 bit page numbers will affect that. BlockNumber is still 32 bits, 
after all.



+/*
+ * An internal function used by SlruScanDirectory().
+ *
+ * Returns true if a file with a name of a given length may be a correct
+ * SLRU segment.
+ */
+static inline bool
+SlruCorrectSegmentFilenameLength(SlruCtl ctl, size_t len)
+{
+   if (ctl->long_segment_names)
+   return (len == 12);
+   else
+
+   /*
+* XXX Should we still consider 5 and 6 to be a correct length? 
It
+* looks like it was previously allowed but now SlruFileName() 
can't
+* return such a name.
+*/
+   return (len == 4 || len == 5 || len == 6);
+}


Huh, I didn't realize that 5 and 6 character SLRU segment names are 
possible. But indeed they are. Commit 638cf09e76d allowed 5-character 
lengths:



commit 638cf09e76d70dd83d8123e7079be6c0532102d2
Author: Alvaro Herrera 
Date:   Thu Jan 2 18:17:29 2014 -0300

Handle 5-char filenames in SlruScanDirectory

Original users of slru.c were all producing 4-digit filenames, so that

was all that that code was prepared to handle.  Changes to multixact.c
in the course of commit 0ac5ad5134f made pg_multixact/members create
5-digit filenames once a certain threshold was reached, which
SlruScanDirectory wasn't prepared to deal with; in particular,
5-digit-name files were not removed during truncation.  Change that
routine to make it aware of those files, and have it process them just
like any others.

Right now, some pg_multixact/members directories will contain a mixture

of 4-char and 5-char filenames.  A future commit is expected fix things
so that each slru.c user declares the correct maximum width for the
files it produces, to avoid such unsightly mixtures.

Noticed while investigating bug #8673 reported by Serge Negodyuck.




And later commit 73c986adde5, which introduced commit_ts, allowed 6 
characters. With 8192 block size, pg_commit_ts segments are indeed 5 
chars wide, and with 512 block size, 6 chars are needed.


This patch makes the filenames always 12 characters wide (for SLRUs that 
opt-in to the new naming). That's actually not enough for the full range 
that a 64 bit page number can represent. Should we make it 16 characters 
now, to avoid repeating the same mistake we made earlier? Or make it 
more configurable, on a per-SLRU basis. One reason I don't want to just 
make it 16 characters is that WAL segment filenames are also 16 hex 
characters, which could cause confusion.


--
Heikki Linnakangas
Neon (https://neon.tech)

Re: Add 64-bit XIDs into PostgreSQL 15

2023-07-04 Thread Aleksander Alekseev

Hi,

> This patch hasn't applied in quite some time, and the thread has moved to
> discussing higher lever items rather than the suggested patch, so I'm closing
> this as Returned with Feedback.  Please feel free to resubmit when there is
> renewed interest and a concensus on how/what to proceed with.

Yes, this thread awaits several other patches to be merged [1] in
order to continue, so it makes sense to mark it as RwF for the time
being. Thanks!

[1]: https://commitfest.postgresql.org/43/3489/

-- 
Best regards,
Aleksander Alekseev

Re: Add 64-bit XIDs into PostgreSQL 15

2023-07-04 Thread Daniel Gustafsson

This patch hasn't applied in quite some time, and the thread has moved to
discussing higher lever items rather than the suggested patch, so I'm closing
this as Returned with Feedback.  Please feel free to resubmit when there is
renewed interest and a concensus on how/what to proceed with.

--
Daniel Gustafsson

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-03-29 Thread Aleksander Alekseev

Hi,

> As suggested before by Heikki Linnakangas, I've added a patch for making 2PC 
> transaction state 64-bit.
> At first, my intention was to rebuild all twophase interface to use 
> FullTransactionId. But doing this in a proper
> manner would lead to switching from TransactionId to FullTransactionId in 
> PGPROC and patch become too
> big to handle here.
>
> So I decided to keep it simple for now and use wrap logic trick and calc 
> FullTransactionId on current epoch,
> since the span of active xids cannot exceed one epoch at any given time.
>
> Patches 1 and 2 are the same as above.

Thanks! 0003 LGTM. I'll mark the CF entry as RfC.

> I propose that we try to finish 1 and 2 for v16. And maybe 6. I think
> that's doable. It doesn't have any great user-visible benefits yet, but
> we need to start somewhere.

+1.

However there are only a few days left if we are going to do this
within March CF...

-- 
Best regards,
Aleksander Alekseev

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-03-20 Thread Maxim Orlov

Hi!

As suggested before by Heikki Linnakangas, I've added a patch for making
2PC transaction state 64-bit.
At first, my intention was to rebuild all twophase interface to use
FullTransactionId. But doing this in a proper
manner would lead to switching from TransactionId to FullTransactionId in
PGPROC and patch become too
big to handle here.

So I decided to keep it simple for now and use wrap logic trick and calc
FullTransactionId on current epoch,
since the span of active xids cannot exceed one epoch at any given time.


Patches 1 and 2 are the same as above.

-- 
Best regards,
Maxim Orlov.


v57-0002-Use-larger-segment-file-names-for-pg_notify.patch
Description: Binary data


v57-0003-Make-use-FullTransactionId-in-2PC-filenames.patch
Description: Binary data


v57-0001-Index-SLRUs-by-64-bit-integers-rather-than-by-32.patch
Description: Binary data

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-03-09 Thread Aleksander Alekseev

Hi,

> 1. Use internal 64 bit page numbers in SLRUs without changing segments naming.
> 2. Use the larger segment file names in async.c, to lift the current 8
> GB limit on the max number of pending notifications.
> [...]
>
> Where our main focus for PG16 is going to be 1 and 2, and we may try
> to deliver 6 and 7 too if time will permit.
>
> Maxim and I agreed (offlist) that I'm going to submit 1 and 2. The
> patch 1 is ready, please see the attachment. I'm currently working on
> 2 and going to submit it in a bit. It seems to be relatively
> straightforward but I don't want to rush things and want to make sure
> I didn't miss something.

PFA the patch v57 which now includes both 1 and 2.

-- 
Best regards,
Aleksander Alekseev


v57-0001-Index-SLRUs-by-64-bit-integers-rather-than-by-32.patch
Description: Binary data


v57-0002-Use-larger-segment-file-names-for-pg_notify.patch
Description: Binary data

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-03-07 Thread Maxim Orlov

On Tue, 7 Mar 2023 at 15:38, Heikki Linnakangas  wrote:

>
> That is true for pg_multixact/offsets. We will indeed need to add an
> epoch and introduce the concept of FullMultiXactIds for that. However,
> we can change pg_multixact/members independently of that. We can extend
> MultiXactOffset from 32 to 64 bits, and eliminate pg_multixact/members
> wraparound, while keeping multi-xids 32 bits wide.
>
> Yes, you're totally correct. If it will be committable that way, I'm all
for that.

-- 
Best regards,
Maxim Orlov.

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-03-07 Thread Aleksander Alekseev

Hi,

> Here's how I see the development path for this [...]

> So, in my view, the plan should be [...]
> Thoughts?

The plan looks great! I would also explicitly include this:

> As we start to refactor these things, I also think it would be good to
> have more explicit tracking of the valid range of SLRU pages in each
> SLRU. Take pg_subtrans for example: it's not very clear what pages have
> been initialized, especially during different stages of startup. It
> would be good to have clear start and end page numbers, and throw an
> error if you try to look up anything outside those bounds. Same for all
> other SLRUs.

So the plan becomes:

1. Use internal 64 bit page numbers in SLRUs without changing segments naming.
2. Use the larger segment file names in async.c, to lift the current 8
GB limit on the max number of pending notifications.
3. Extend pg_xact to 64-bits.
4. Extend pg_subtrans to 64-bits.
5. Extend pg_multixact so that pg_multixact/members is addressed by
64-bit offsets.
6. Extend pg_twophase.c, to use FullTransactionIds. (a bonus thing)
7. More explicit tracking of the valid range of SLRU pages in each SLRU

Where our main focus for PG16 is going to be 1 and 2, and we may try
to deliver 6 and 7 too if time will permit.

Maxim and I agreed (offlist) that I'm going to submit 1 and 2. The
patch 1 is ready, please see the attachment. I'm currently working on
2 and going to submit it in a bit. It seems to be relatively
straightforward but I don't want to rush things and want to make sure
I didn't miss something.

> I wonder if we should actually add an artificial limit, as a GUC.

Yes, I think we need some sort of limit. Using a GUC would be the most
straightforward approach. Alternatively we could derive the limit from
the existing GUCs, similarly to how we derive the default value of
wal_buffers from shared_buffers [1]. However, off the top of my head
we only have max_wal_size and it doesn't strike me as a good candidate
for deriving something for NOTIFY / LISTEN.

So I'm going to add max_notify_segments GUC with the default value of
65536 as it is now. If the new GUC will receive a push back from the
community we can always use a hard-coded value instead, or no limit at
all.

[1]: 
https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-WAL-BUFFERS

-- 
Best regards,
Aleksander Alekseev


v56-0001-Index-SLRUs-by-64-bit-integers-rather-than-by-32.patch
Description: Binary data

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-03-07 Thread Heikki Linnakangas


On 07/03/2023 13:38, Maxim Orlov wrote:
As for making pg_multixact 64 bit, I spend the last couple of days to 
make proper pg_upgrade for pg_multixact's and for pg_xact's
with wraparound and I've understood, that it is not a simple task 
compare to pg_xact's. The problem is, we do not have epoch for
multixacts, so we do not have ability to "overcome" wraparound. The 
solution may be adding some kind of epoch for multixacts or
make them 64 bit in "main" 64-xid patch, but in perspective of this 
thread, in my view, this should be last in line here.


That is true for pg_multixact/offsets. We will indeed need to add an 
epoch and introduce the concept of FullMultiXactIds for that. However, 
we can change pg_multixact/members independently of that. We can extend 
MultiXactOffset from 32 to 64 bits, and eliminate pg_multixact/members 
wraparound, while keeping multi-xids 32 bits wide.


- Heikki

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-03-07 Thread Maxim Orlov

On Tue, 17 Jan 2023 at 16:33, Aleksander Alekseev 
wrote:

> Hi hackers,
>
> Maxim, perhaps you could share with us what your reasoning was here?
>
> I'm really sorry for late response, but better late than never. Yes, we
can not access shared memory without lock.
In this particular case, we use XidGenLock. That is why we use lock
argument to take it is it was not taken previously.
Actually, we may place assertion in this insist.

As for xid compare: we do not compare xids here, we are checking for
wraparound, so, AFAICS, this code is correct.



On Mon, 6 Mar 2023 at 22:48, Heikki Linnakangas  wrote:

>
> 1. Use 64 bit page numbers in SLRUs (this patch)
>
> 2. Use the larger segment file names in async.c, to lift the current 8
> GB limit on the max number of pending notifications.
>
> 3. Extend pg_multixact so that pg_multixact/members is addressed by
> 64-bit offsets.
>
> 4. Extend pg_subtrans to 64-bits.
>
> 5. Extend pg_xact to 64-bits.
>
>
> 6. (a bonus thing that I noticed while thinking of pg_xact.) Extend
> pg_twophase.c, to use FullTransactionIds.
>
> Currently, the twophase state files in pg_twophase are named according
> to the 32 bit Xid of the transaction. Let's switch to FullTransactionId
> there.
>
> ...
>
> I propose that we try to finish 1 and 2 for v16. And maybe 6. I think
> that's doable. It doesn't have any great user-visible benefits yet, but
> we need to start somewhere.
>
> - Heikki
>
> Yes, this is a great idea! My only concern here is that we're going in
circles here. You see, patch 1 is what was proposed
in the beginning of this thread. Anyway, I will be happy if we are being
able to push this topic forward.

As for making pg_multixact 64 bit, I spend the last couple of days to make
proper pg_upgrade for pg_multixact's and for pg_xact's
with wraparound and I've understood, that it is not a simple task compare
to pg_xact's. The problem is, we do not have epoch for
multixacts, so we do not have ability to "overcome" wraparound. The
solution may be adding some kind of epoch for multixacts or
make them 64 bit in "main" 64-xid patch, but in perspective of this thread,
in my view, this should be last in line here.

In pg_xact we do not have such a problem, we do have epoch for transacions,
so conversion should be pretty obvious:
 -> 
0001 -> 0001
...
0FFE -> 0FFE
0FFF -> 0FFF
 -> 0001
0001 -> 00010001

So, in my view, the plan should be:
1. Use internal 64 bit page numbers in SLRUs without changing segments
naming.
2. Use the larger segment file names in async.c, to lift the current 8 GB
limit on the max number of pending notifications.
3. Extend pg_xact to 64-bits.
4. Extend pg_subtrans to 64-bits.
5. Extend pg_multixact so that pg_multixact/members is addressed by 64-bit
offsets.
6. Extend pg_twophase.c, to use FullTransactionIds. (a bonus thing)

Thoughts?

-- 
Best regards,
Maxim Orlov.

On Mon, 6 Mar 2023 at 22:48, Heikki Linnakangas  wrote:

> On 01/03/2023 12:21, Aleksander Alekseev wrote:
> > Hi,
> >
> >> I'm surprised that these patches extend the page numbering to 64 bits,
> >> but never actually uses the high bits. The XID "epoch" is not used, and
> >> pg_xact still wraps around and the segment names are still reused. I
> >> thought we could stop doing that.
> >
> > To clarify, the idea is to let CLOG grow indefinitely and simply store
> > FullTransactionId -> TransactionStatus (two bits). Correct?
>
> Correct.
>
> > I didn't investigate this in much detail but it may affect quite some
> > amount of code since TransactionIdDidCommit() and
> > TransactionIdDidCommit() currently both deal with TransactionId, not
> > FullTransactionId. IMO, this would be a nice change however, assuming
> > we are ready for it.
>
> Yep, it's a lot of code churn..
>
> > In the previous version of the patch there was an attempt to derive
> > FullTransactionId from TransactionId but it was wrong for the reasons
> > named above in the thread. Thus is was removed and the patch
> > simplified.
>
> Yeah, it's tricky to get it right. Clearly we need to do it at some
> point though.
>
> All in all, this is a big effort. I spent some more time reviewing this
> in the last few days, and thought a lot about what the path forward here
> could be. And I haven't looked at the actual 64-bit XIDs patch set yet,
> just this patch to use 64-bit addressing in SLRUs.
>
> This patch is the first step, but we have a bit of a chicken and egg
> problem, because this patch on its own isn't very interesting, but on
> the other hand, we need it to work on the follow up items. Here's how I
> see the development path for this (and again, this is just for the
> 64-bit SLRUs work, not the bigger 64-bit-XIDs-in-heapam effort):
>
> 1. Use 64 bit page numbers in SLRUs (this patch)
>
> I would like to make one change here: I'd prefer to keep the old 4-digit
> segment names, until we actually start to use the wider address space.
> Let's allow each SLRU to specify how many

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-03-06 Thread Heikki Linnakangas


On 01/03/2023 12:21, Aleksander Alekseev wrote:

Hi,


I'm surprised that these patches extend the page numbering to 64 bits,
but never actually uses the high bits. The XID "epoch" is not used, and
pg_xact still wraps around and the segment names are still reused. I
thought we could stop doing that.


To clarify, the idea is to let CLOG grow indefinitely and simply store
FullTransactionId -> TransactionStatus (two bits). Correct?


Correct.


I didn't investigate this in much detail but it may affect quite some
amount of code since TransactionIdDidCommit() and
TransactionIdDidCommit() currently both deal with TransactionId, not
FullTransactionId. IMO, this would be a nice change however, assuming
we are ready for it.


Yep, it's a lot of code churn..


In the previous version of the patch there was an attempt to derive
FullTransactionId from TransactionId but it was wrong for the reasons
named above in the thread. Thus is was removed and the patch
simplified.


Yeah, it's tricky to get it right. Clearly we need to do it at some 
point though.


All in all, this is a big effort. I spent some more time reviewing this 
in the last few days, and thought a lot about what the path forward here 
could be. And I haven't looked at the actual 64-bit XIDs patch set yet, 
just this patch to use 64-bit addressing in SLRUs.


This patch is the first step, but we have a bit of a chicken and egg 
problem, because this patch on its own isn't very interesting, but on 
the other hand, we need it to work on the follow up items. Here's how I 
see the development path for this (and again, this is just for the 
64-bit SLRUs work, not the bigger 64-bit-XIDs-in-heapam effort):


1. Use 64 bit page numbers in SLRUs (this patch)

I would like to make one change here: I'd prefer to keep the old 4-digit 
segment names, until we actually start to use the wider address space. 
Let's allow each SLRU to specify how many digits to use in the 
filenames, so that we convert one SLRU at a time.


If we do that, and don't change any of the existing SLRUs to actually 
use the wider space of page and segment numbers yet, this patch becomes 
just refactoring with no on-disk format changes. No pg_upgrade needed.


The next patches will start to make use of the wider address space, one 
SLRU at a time.


2. Use the larger segment file names in async.c, to lift the current 8 
GB limit on the max number of pending notifications.


No one actually minds the limit, it's quite generous as it is. But there 
is some code and complexity in async.c to avoid the wraparound that 
could be made simpler if we used longer SLRU segment names and avoided 
the wraparound altogether.


I wonder if we should actually add an artificial limit, as a GUC. If 
there are gigabytes of notifications queued up, something's probably 
wrong with the system, and you're not going to be happy if we just 
remove the limit so it can grow to terabytes until you run out of disk 
space.


3. Extend pg_multixact so that pg_multixact/members is addressed by 
64-bit offsets.


Currently, multi-XIDs can wrap around, requiring anti-wraparound 
freezing, but independently of that, the pg_multixact/members SLRU can 
also wrap around. We track both, and trigger anti-wraparound if either 
SLRU is about to wrap around. If we redefine MultiXactOffset as a 64-bit 
integer, we can avoid the pg_multixact/members wraparound altogether. A 
downside is that pg_multixact/offsets will take twice as much space, but 
I think that's a good tradeoff. Or perhaps we can play tricks like store 
a single 64-bit offset on each pg_multixact/offsets page, and a 32-bit 
offset from that for each XID, to avoid making it so much larger.


This would reduce the need to do anti-wraparound VACUUMs on systems that 
use multixacts heavily. Needs pg_upgrade support.


4. Extend pg_subtrans to 64-bits.

This isn't all that interesting because the active region of pg_subtrans 
cannot be wider than 32 bits anyway, because you'll still reach the 
general 32-bit XID wraparound. But it might be less confusing in some 
places.


I actually started to write a patch to do this, to see how complicated 
it is. It quickly proliferates into expanding other XIDs to 64-bits, 
like TransactionXmin, frozenXid calculation in vacuum.c, known-assigned 
XID tracking in procarray.c. etc. It's going to be necessary to convert 
32-bit XIDs to FullTransactionIds at some boundaries, and I'm not sure 
where exactly that should happen. It's easier to do the conversions 
close to subtrans.c, but then I'm not sure how much it gets us in terms 
of reducing confusion. It's easy to get confused with the epochs during 
conversions, as you noted. On the other hand, if we change much more of 
the backend to use FullTransactionIds, the patch becomes much more invasive.


Nice thing with pg_subtrans, though, is that it doesn't require 
pg_upgrade support.


5. Extend pg_xact to 64-bits.

Similar to pg_subtrans, really, but needs pg_upgrade support.

6. (a

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-03-01 Thread Aleksander Alekseev

Hi,

> I'm surprised that these patches extend the page numbering to 64 bits,
> but never actually uses the high bits. The XID "epoch" is not used, and
> pg_xact still wraps around and the segment names are still reused. I
> thought we could stop doing that.

To clarify, the idea is to let CLOG grow indefinitely and simply store
FullTransactionId -> TransactionStatus (two bits). Correct?

I didn't investigate this in much detail but it may affect quite some
amount of code since TransactionIdDidCommit() and
TransactionIdDidCommit() currently both deal with TransactionId, not
FullTransactionId. IMO, this would be a nice change however, assuming
we are ready for it.

In the previous version of the patch there was an attempt to derive
FullTransactionId from TransactionId but it was wrong for the reasons
named above in the thread. Thus is was removed and the patch
simplified.

-- 
Best regards,
Aleksander Alekseev

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-02-28 Thread Heikki Linnakangas


On 28/02/2023 16:17, Maxim Orlov wrote:
Either I do not understand something, or the files from pg_commit_ts 
directory are not copied.


Huh, yeah you're right, pg_upgrade doesn't copy pg_commit_ts.

- Heikki

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-02-28 Thread Maxim Orlov

On Mon, 27 Feb 2023 at 12:10, Heikki Linnakangas  wrote:

> (CC list trimmed, gmail wouldn't let me send otherwise)
>
> That sounds right for pg_serial, pg_notify, and pg_subtrans. But not for
> pg_commit_ts and the pg_multixacts.
>
> This needs tests for pg_upgrading those SLRUs, after 0, 1 and N
> wraparounds.
>
> Yep, that's my fault. I've forgotten about pg_multixacts. But for the
pg_commit_ts it's a bit complicated story.
The thing is, if we do upgrade, old files from pg_commit_ts not copied into
a new server.

For example, I've checked one more time on the current master branch:
1). initdb
2). add "track_commit_timestamp = on" into postgresql.conf
3). pgbench
4). $ ls  pg_commit_ts/
  0005  000A  000F  0014  0019  001E  0023...
...009A  009F  00A4  00A9  00AE  00B3  00B8  00BD  00C2
5). do pg_upgrade
6). $ ls  pg_commit_ts/
  00C2

Either I do not understand something, or the files from pg_commit_ts
directory are not copied.

-- 
Best regards,
Maxim Orlov.

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-02-27 Thread Heikki Linnakangas

(CC list trimmed, gmail wouldn't let me send otherwise)

On 22/02/2023 16:29, Maxim Orlov wrote:
On Tue, 21 Feb 2023 at 16:59, Aleksander Alekseev
mailto:aleksan...@timescale.com>> wrote:

One thing that still bothers me is that during the upgrade we only
migrate the CLOG segments (i.e. pg_xact / pg_clog) and completely
ignore all the rest of SLRUs:

* pg_commit_ts
* pg_multixact/offsets
* pg_multixact/members
* pg_subtrans
* pg_notify
* pg_serial

Hi! We do ignore these values, since in order to pg_upgrade the server
it must be properly stopped and no transactions can outlast this moment.

That sounds right for pg_serial, pg_notify, and pg_subtrans. But not for
pg_commit_ts and the pg_multixacts.

This needs tests for pg_upgrading those SLRUs, after 0, 1 and N wraparounds.

I'm surprised that these patches extend the page numbering to 64 bits,
but never actually uses the high bits. The XID "epoch" is not used, and
pg_xact still wraps around and the segment names are still reused. I
thought we could stop doing that. Certainly if we start supporting
64-bit XIDs properly, that will need to change and we will pg_upgrade
will need to rename the segments again.

The previous versions of these patches did that, but I think you changed
tact in response to Robert's suggestion at [1]:

Lest we miss the forest for the trees, there is an aspect of this
patch that I find to be an extremely good idea and think we should try
to get committed even if the rest of the patch set ends up in the
rubbish bin. Specifically, there are a couple of patches in here that
have to do with making SLRUs indexed by 64-bit integers rather than by
32-bit integers. We've had repeated bugs in the area of handling SLRU
wraparound in the past, some of which have caused data loss. Just by
chance, I ran across a situation just yesterday where an SLRU wrapped
around on disk for reasons that I don't really understand yet and
chaos ensued. Switching to an indexing system for SLRUs that does not
ever wrap around would probably enable us to get rid of a whole bunch
of crufty code, and would also likely improve the general reliability
of the system in situations where wraparound is threatened. It seems
like a really, really good idea.

These new versions of this patch don't achieve the goal of avoiding
wraparound. I think the previous versions that did that was the right
approach.

[1]
https://www.postgresql.org/message-id/CA%2BTgmoZFmTGjgkmjgkcm2-vQq3_TzcoMKmVimvQLx9oJLbye0Q%40mail.gmail.com

- Heikki

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-02-22 Thread Maxim Orlov

On Tue, 21 Feb 2023 at 16:59, Aleksander Alekseev 
wrote:

> Hi,
>
> One thing that still bothers me is that during the upgrade we only
> migrate the CLOG segments (i.e. pg_xact / pg_clog) and completely
> ignore all the rest of SLRUs:
>
> * pg_commit_ts
> * pg_multixact/offsets
> * pg_multixact/members
> * pg_subtrans
> * pg_notify
> * pg_serial

Hi! We do ignore these values, since in order to pg_upgrade the server it
must be properly stopped and no transactions can outlast this moment.

-- 
Best regards,
Maxim Orlov.

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-02-21 Thread Aleksander Alekseev

Hi,

> The convert_pg_xact_segments() function is still obviously
> overengineered. As I understand, all it has to do is simply renaming
> pg_xact/ to pg_xact/. Unfortunately I used up all the
> mana for today and have to take a long rest in order to continue.

PFA the corrected patch v55.

One thing that still bothers me is that during the upgrade we only
migrate the CLOG segments (i.e. pg_xact / pg_clog) and completely
ignore all the rest of SLRUs:

* pg_commit_ts
* pg_multixact/offsets
* pg_multixact/members
* pg_subtrans
* pg_notify
* pg_serial

My knowledge in this area is somewhat limited and I can't tell whether
this is OK. I will investigate but also I could use some feedback from
the reviewers.

-- 
Best regards,
Aleksander Alekseev


v55-0001-Index-SLRUs-by-64-bit-integers-rather-than-by-32.patch
Description: Binary data

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-02-20 Thread Aleksander Alekseev

Hi,

> OK, here is the patchset v53 where I mostly modified the commit
> messages. It is explicitly said that 0001 modifies the WAL records and
> why we decided to do it in this patch. Additionally any mention of
> 64-bit XIDs is removed since it is not guaranteed that the rest of the
> patches are going to be accepted. 64-bit SLRU page numbering is a
> valuable change per se.
>
> Changing the status of the CF entry to RfC apparently was a bit
> premature. It looks like the patchset can use a few more rounds of
> review.
>
> In 0002:
>
> [...]
>
> Maxim, perhaps you could share with us what your reasoning was here?

I played with the patch a bit and managed to figure out what you tried
to accomplish. Unfortunately generally you can't derive a
FullTransactionId from a TransactionId, and you can't access
ShmemVariableCache fields without taking a lock unless during the
startup when there are no concurrent processes.

I don't think this patch should do anything but change the SLRU
indexing from 32-bit to 64-bit one. Trying to address the wraparounds
would be nice but I'm afraid we are not quite there yet.

Also I found strage little changes that seemed to be unrelated to the
patch. I believe they ended up here by accident (used to be a part of
64-bit XIDs patchset) and removed them.

PFA the cleaned up version of the patch. I noticed that splitting it
into parts doesn't help much with the review or testing, nor seems it
likely that the patches are going to be merged separately one by one.
For these reasons I merged everything into a single patch.

The convert_pg_xact_segments() function is still obviously
overengineered. As I understand, all it has to do is simply renaming
pg_xact/ to pg_xact/. Unfortunately I used up all the
mana for today and have to take a long rest in order to continue.

-- 
Best regards,
Aleksander Alekseev


v54-0001-Index-SLRUs-by-64-bit-integers-rather-than-by-32.patch
Description: Binary data

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-01-17 Thread Aleksander Alekseev

Hi hackers,

> Yeah, pg_upgrade will briefly start and stop the old server to make
> sure all the WAL is replayed, and won't transfer any of the files
> over. AFAIK, major-version WAL changes are fine; it was the previous
> claim that we could do it in a minor version that I was unsure about.

OK, here is the patchset v53 where I mostly modified the commit
messages. It is explicitly said that 0001 modifies the WAL records and
why we decided to do it in this patch. Additionally any mention of
64-bit XIDs is removed since it is not guaranteed that the rest of the
patches are going to be accepted. 64-bit SLRU page numbering is a
valuable change per se.

Changing the status of the CF entry to RfC apparently was a bit
premature. It looks like the patchset can use a few more rounds of
review.

In 0002:

```
-#define TransactionIdToCTsPage(xid) \
-((xid) / (TransactionId) COMMIT_TS_XACTS_PER_PAGE)
+static inline int64
+TransactionIdToCTsPageInternal(TransactionId xid, bool lock)
+{
+FullTransactionIdfxid,
+nextXid;
+uint32epoch;
+
+if (lock)
+LWLockAcquire(XidGenLock, LW_SHARED);
+
+/* make a local copy */
+nextXid = ShmemVariableCache->nextXid;
+
+if (lock)
+LWLockRelease(XidGenLock);
+
+epoch = EpochFromFullTransactionId(nextXid);
+if (xid > XidFromFullTransactionId(nextXid))
+--epoch;
+
+fxid = FullTransactionIdFromEpochAndXid(epoch, xid);
+
+return fxid.value / (uint64) COMMIT_TS_XACTS_PER_PAGE;
+}
```

I'm pretty confident that shared memory can't be accessed like this,
without taking a lock. Although it may work on x64 generally we can
get garbage, unless nextXid is accessed atomically and has a
corresponding atomic type. On top of that I'm pretty sure
TransactionIds can't be compared with the regular comparison
operators. All in all, so far I don't understand why this piece of
code should be so complicated.

The same applies to:

```
-#define TransactionIdToPage(xid) ((xid) / (TransactionId)
SUBTRANS_XACTS_PER_PAGE)
+static inline int64
+TransactionIdToPageInternal(TransactionId xid, bool lock)
```

... in subtrans.c

Maxim, perhaps you could share with us what your reasoning was here?

-- 
Best regards,
Aleksander Alekseev


v53-0002-Utilize-64-bit-SLRU-page-numbers-in-SLRU-callers.patch
Description: Binary data


v53-0001-Use-64-bit-numbering-of-SLRU-pages.patch
Description: Binary data


v53-0003-pg_upgrade-from-32-bit-to-64-bit-SLRU-page-numbe.patch
Description: Binary data

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-01-11 Thread Jacob Champion

On Wed, Jan 11, 2023 at 1:48 AM Aleksander Alekseev
 wrote:
> After reading [1] carefully it looks like we shouldn't worry about
> this. The upgrade procedure explicitly requires to run `pg_ctl stop`
> during step 8 of the upgrade procedure, i.e. not in the immediate mode
> [2].

Yeah, pg_upgrade will briefly start and stop the old server to make
sure all the WAL is replayed, and won't transfer any of the files
over. AFAIK, major-version WAL changes are fine; it was the previous
claim that we could do it in a minor version that I was unsure about.

Thanks!
--Jacob

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-01-11 Thread Aleksander Alekseev

Hi Maxim,

> Secondly, shouldn't we introduce a new WAL record type in order to
> make the code backward compatible with previous PG versions? I'm not
> 100% sure how the upgrade procedure works in all the details. If it
> requires the DBMS to be gracefully shut down before the upgrade then
> we are probably fine here.

After reading [1] carefully it looks like we shouldn't worry about
this. The upgrade procedure explicitly requires to run `pg_ctl stop`
during step 8 of the upgrade procedure, i.e. not in the immediate mode
[2]. It also has explicit instructions regarding the replicas. From
what I can tell there is no way they will see WAL records they
wouldn't understand.

[1]: https://www.postgresql.org/docs/current/pgupgrade.html
[2]: https://www.postgresql.org/docs/current/app-pg-ctl.html

-- 
Best regards,
Aleksander Alekseev

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-01-11 Thread Aleksander Alekseev

Hi Maxim,

> Here is a new patch set.
> I've added comments and make use GetClogDirName call in copy_subdir_files.

Jacob Champion pointed out (offlist, cc:'ed) that we may be wrong on this one:

> 0001 patch changes the SLRU internals without affecting the callers.

> 0001 - should make SLRU internally 64–bit, no effects from "outside"

... and apparently Jacob is right.

Besides other things 0001 modifies CLOG_ZEROPAGE and CLOG_TRUNCATE WAL
records - it makes changes to WriteZeroPageXlogRec() and
WriteTruncateXlogRec() and corresponding changes to clog_desc() and
clog_redo().

Firstly, it means that the patch doesn't change what it claims to
change. I think these changes should be part of 0002.

Secondly, shouldn't we introduce a new WAL record type in order to
make the code backward compatible with previous PG versions? I'm not
100% sure how the upgrade procedure works in all the details. If it
requires the DBMS to be gracefully shut down before the upgrade then
we are probably fine here.

-- 
Best regards,
Aleksander Alekseev

Re: Add 64-bit XIDs into PostgreSQL 15

2023-01-10 Thread Aleksander Alekseev

Hi Maxim,

> Anyway. Let's discuss on-disk page format, shall we?

Here are my two cents.

> AFAICS, we have a following options:
> [...]
> 2. Put special in every page where base for XIDs are stored. This is what we 
> have done in the current patch set.

The approach of using special space IMO is fine. I'm still a bit
sceptical about the need to introduce a new entity "64-bit base XID"
while we already have 32-bit XID epochs that will do the job. I
suspect that having fewer entities helps to reason about the code and
that it is important in the long run, but maybe it's just me. In any
case, I don't have a strong opinion here.

Additionally, I think we should be clear about the long term goals. As
Peter G. pointed out above:

> I think that we'll be able to get rid of freezing in a few years time.

IMO eventually getting rid of freezing and "magic" XIDs will simplify
the maintenance of the project and also make the behaviour of the
system much more predictable. The user will have to worry only about
the disk space reclamation.

If we have a consensus that this is the final goal then we should
definitely be moving toward 64-bit XIDs and perhaps even include a
corresponding PoC to the patchset. If we want to keep freezing
indefinitely then, as Chris Travers argued, 64-bit XIDs don't bring
that much value and maybe the community should be focusing on
something else.

-- 
Best regards,
Aleksander Alekseev

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-01-09 Thread Maxim Orlov

Hi!

Here is a new patch set.
I've added comments and make use GetClogDirName call in copy_subdir_files.

-- 
Best regards,
Maxim Orlov.


v52-0003-Make-pg_upgrade-from-32-bit-to-64-bit-SLRU.patch
Description: Binary data


v52-0002-Use-64-bit-pages-representation-in-SLRU-callers.patch
Description: Binary data


v52-0001-Use-internal-64-bit-numbering-of-SLRU-pages.patch
Description: Binary data

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-01-09 Thread Maxim Orlov

On Fri, 6 Jan 2023 at 09:51, Japin Li  wrote:

>
> For v51-0003. We can use GetClogDirName instead of GET_MAJOR_VERSION in
> copy_subdir_files().
>

Of course! Tanks! I'll address this in the next iteration, v52.

-- 
Best regards,
Maxim Orlov.

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-01-09 Thread Aleksander Alekseev

Hi Andrey,

> Hi! I think that 64-bit xids are a very important feature and I want
> to help advance it. That's why I want to try to understand a patch
> better.

Thanks for your interest to the patchset!

> Do I get it right that the proposed v51 patchset only changes the SLRU
> filenames and type of pageno representation? Is SLRU wraparound still
> exactly there after 0x byte?

OK, let me give some background then. I suspect you already know this,
but this can be useful for other reviewers. Additionally we have two
rather large threads on our hands and it's easy to lose track of
things.

SLRU is basically a general-purpose LRU implementation with ReadPage()
/ WritePage() interface with the only exception that instead of
something like Page* object it operates slot numbers (array indexes).
SLRU is used as an underlying container for several internal
PostgreSQL structures, most importantly CLOG. Despite the name CLOG is
not a log (journal) but rather a large bit array. For every
transaction it stores two bits that reflect the status of the
transaction (more detail in clog.c / clog.h).

Currently SLRU operates 32-bit page numbers. What we currently agreed
on [1] and what we are trying to achieve in this thread is to make
SLRU pages 64-bit. The rest of the 64-bit XIDs is discussed in another
thread [2].

As Robert Haas put it:

> Specifically, there are a couple of patches in here that
> have to do with making SLRUs indexed by 64-bit integers rather than by
> 32-bit integers. We've had repeated bugs in the area of handling SLRU
> wraparound in the past, some of which have caused data loss. Just by
> chance, I ran across a situation just yesterday where an SLRU wrapped
> around on disk for reasons that I don't really understand yet and
> chaos ensued. Switching to an indexing system for SLRUs that does not
> ever wrap around would probably enable us to get rid of a whole bunch
> of crufty code, and would also likely improve the general reliability
> of the system in situations where wraparound is threatened. It seems
> like a really, really good idea.

So our goal here is to eliminate wrap-around for SLRU. It means that
if I save something to the page 0x12345678 it will stay there
forever. Other parts of the system however have to form proper 64-bit
page numbers in order to make it work. If they don't the wrap-around
is possible for these particular subsystems (but not SLRU per se).

> Also, I do not understand what is the reason for splitting 1st and 2nd
> steps. Certainly, there must be some logic behind it, but I just can't
> grasp it...

0001 patch changes the SLRU internals without affecting the callers.
It also preserves the short SLRU filenames which means nothing changes
for an outside observer. All it changes is PostgreSQL binary. It can
be merged any time and even backported to the previous versions if we
want to.

The 0002 patch makes changes to the callers and also enlarges SLRU
filenames. For sure we could do everything at once, but it would
complicate testing and more importantly code review. Personally I
believe Maxim did a great job here. Both patches were easy to read and
understand (relatively, of course).

> And the purpose of the 3rd step with pg_upgrade changes is a complete
> mystery for me.

0001 and 0002 will work fine for new PostgreSQL instances. But if you
have an instance that already has on-disk state we have to move the
SLRU segments accordingly. This is what 0003 does.

That's the theory at least. Personally I still have to meditate a bit
more on the code in order to get a good understanding of it,
especially the parts that deal with transaction epochs because this is
something I have limited experience with. Also I wouldn't exclude the
possibility of bugs. Particularly this part of 0003:

```
+oldseg.segno = pageno / SLRU_PAGES_PER_SEGMENT;
+oldseg.pageno = pageno % SLRU_PAGES_PER_SEGMENT;
+
+newseg.segno = pageno / SLRU_PAGES_PER_SEGMENT;
+newseg.pageno = pageno % SLRU_PAGES_PER_SEGMENT;
```

looks suspicious to me.

I agree that adding a couple of additional comments could be
appropriate, especially when it comes to epochs.

[1]: 
https://www.postgresql.org/message-id/CA%2BTgmoZFmTGjgkmjgkcm2-vQq3_TzcoMKmVimvQLx9oJLbye0Q%40mail.gmail.com
[2]: 
https://www.postgresql.org/message-id/CACG%3DezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe%3DpyyjVWA%40mail.gmail.com
-- 
Best regards,
Aleksander Alekseev

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-01-09 Thread Maxim Orlov

>
> Do I get it right that the proposed v51 patchset only changes the SLRU
> filenames and type of pageno representation? Is SLRU wraparound still
> exactly there after 0x byte?
>
After applying the whole patch set, SLRU will become 64–bit without a
wraparound. Thus, no wraparound
should be there.

0001 - should make SLRU internally 64–bit, no effects from "outside"
0002 - should make SLRU callers 64–bit, SLRU segment files naming are
changed
0003 - make upgrade from previous versions feasible


> Also, I do not understand what is the reason for splitting 1st and 2nd
> steps. Certainly, there must be some logic behind it, but I just can't
> grasp it...
>
As we did discuss somewhere in the beginning of the discussion, we try to
make every commit as independent as possible.
Thus, it is much easier to review and commit. I see no problem to meld
these commits into one, if consensus will be reached.


> And the purpose of the 3rd step with pg_upgrade changes is a complete
> mystery for me. Please excuse my incompetence in the topic, but maybe
> some commit message or comments would help. What kind of xact segments
> conversion we do? Why is it only necessary for xacts, but not other
> SLRUs?
>
The purpose of the third patch is to make upgrade feasible. Since we've
change pg_xact files naming,
Postgres could not read status of "old" transactions from "old" pg_xact
files. So, we have to convert those files.
The major problem here is that we must handle possible segment wraparound
(in "old" cluster).  The whole idea
for an upgrade is to read SLRU pages for pg_xact one by one and write it in
a "new" filename.

Maybe, It's just a little bit complicated, since the algorithm is intended
to deal with different SLRU pages per segment
in "new" and "old" clusters. But, on the other hand, it is already created
in original patch set of 64–bit XIDs and will be useful
in the future. AFAICS, arguably, any variant of 64–bit XIDs should lead to
increase of an amount of SLRU pages per segment.

And as for other SLRUs, they cannot survive pg_upgrade mostly by the fact,
that cluster must be stopped upon upgrade.
Thus, no conversion needed.

-- 
Best regards,
Maxim Orlov.

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-01-05 Thread Japin Li



On Mon, 19 Dec 2022 at 22:40, Maxim Orlov  wrote:
> Hi!
>
> As a result of discussion in the thread [0], Robert Haas proposed to focus
> on making SLRU 64 bit, as a first step towards 64 bit XIDs.
> Here is the patch set.
>
> In overall, code of this patch set is based on the existing code from [0]
> and may be simplified, due to the fact, that SLRU_PAGES_PER_SEGMENT is not
> meant to be changed now.
> But I decided to leave it that way. At least for now.
>
> As always, reviews and opinions are very welcome.
>

For v51-0003. We can use GetClogDirName instead of GET_MAJOR_VERSION in
copy_subdir_files().

diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 1c49c63444..3934978b97 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -857,10 +857,7 @@ copy_xact_xlog_xid(void)
pfree(new_path);
}
else
-   copy_subdir_files(GET_MAJOR_VERSION(old_cluster.major_version) 
<= 906 ?
- "pg_clog" : "pg_xact",
- 
GET_MAJOR_VERSION(new_cluster.major_version) <= 906 ?
- "pg_clog" : "pg_xact");
+   copy_subdir_files(GetClogDirName(old_cluster), 
GetClogDirName(new_cluster));
 
prep_status("Setting oldest XID for new cluster");
exec_prog(UTILITY_LOG_FILE, NULL, true, true,

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-01-05 Thread Andrey Borodin

On Mon, Dec 19, 2022 at 6:41 AM Maxim Orlov  wrote:
>
> As always, reviews and opinions are very welcome.

Hi! I think that 64-bit xids are a very important feature and I want
to help advance it. That's why I want to try to understand a patch
better.
Do I get it right that the proposed v51 patchset only changes the SLRU
filenames and type of pageno representation? Is SLRU wraparound still
exactly there after 0x byte?

The thing is we had some nasty bugs because SLRU wraparound is tricky.
And I think it would be beneficial if we could get to continuous SLRU
space. But the patch seems to avoid addressing this problem.

Also, I do not understand what is the reason for splitting 1st and 2nd
steps. Certainly, there must be some logic behind it, but I just can't
grasp it...
And the purpose of the 3rd step with pg_upgrade changes is a complete
mystery for me. Please excuse my incompetence in the topic, but maybe
some commit message or comments would help. What kind of xact segments
conversion we do? Why is it only necessary for xacts, but not other
SLRUs?

Thank you for working on this important project!

Best regards, Andrey Borodin.

Re: Add 64-bit XIDs into PostgreSQL 15

2022-12-30 Thread adherent postgres

Hi Maxim Orlov:


>AFAICS, we have a following options:
>1. Making "true" 64�Cbit XIDs. I.e. making every tuple have 64�Cbit xmin and 
>>xmax fields.
>2. Put special in every page where base for XIDs are stored. This is what we 
>>have done in the current patch set.
>3. Put base for XIDs in a fork.
>4. Make explicit 64�Cbit XIDs for concrete relations. I.e. CREATE TABLE foo 
>>WITH (xid8) of smth.

I think the first solution will not be agreed by the core committers, they will 
consider that the change is too big and will affect the stability of 
PostgreSQL,I   think the second solution is actually quite good, and you've 
been working on it now,and there are successful cases (opengauss is implemented 
in this way,In order to save space and be compatible with older versions, 
opengauss design is to store the xmin/xmax of the head of the tuple in two 
parts, the xmin/xmax of the head of the tuple is the number of uint32; the 
header of the page stores the 64-bit xid_base, which is the xid_base of the 
current page.),I think it's best to stick to this solution now.
Opengauss  tuple structure:
[cid:3fae289c-7f88-46be-a775-2d93b1a9c41e]
Best wish





发件人: Maxim Orlov 
发送时间: 2022年12月28日 18:14
收件人: Pavel Borisov 
抄送: Robert Haas ; Chris Travers ; 
Bruce Momjian ; Aleksander Alekseev 
; pgsql-hackers@lists.postgresql.org 
; Chris Travers ; 
Peter Geoghegan ; Fedor Sigaev ; Alexander 
Korotkov ; Konstantin Knizhnik ; 
Nikita Glukhov ; Yura Sokolov 
; Simon Riggs 
主题: Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

I want to make a quick summary here.

1. An overall consensus has been reached: we shall focus on committing SLRU 
changes first.
2. I've created an appropriate patch set here [0].
3. How [0] is waiting for a review. As always, all opinions will be welcome.
4. While discussing error/warning messages and some other stuff, this thread 
was marked as "Waiting on Author".
5. I do rebase this patch set once in a week, but do not post it here, since 
there is no need in it. See (1).
6. For now, I don't understand what changes I have to make here. So, does 
"Waiting on Author" is appropriate status here?

Anyway. Let's discuss on-disk page format, shall we?

AFAICS, we have a following options:
1. Making "true" 64�Cbit XIDs. I.e. making every tuple have 64�Cbit xmin and 
xmax fields.
2. Put special in every page where base for XIDs are stored. This is what we 
have done in the current patch set.
3. Put base for XIDs in a fork.
4. Make explicit 64�Cbit XIDs for concrete relations. I.e. CREATE TABLE foo 
WITH (xid8) of smth.

There were opinions that the proposed solution (2) is not the optimal. It would 
be great to hear your concerns and thoughts.

[0] 
https://www.postgresql.org/message-id/CACG%3Dezav34TL%2BfGXD5vJ48%3DQbQBL9BiwkOTWduu9yRqie-h%2BDg%40mail.gmail.com

--
Best regards,
Maxim Orlov.

Re: Add 64-bit XIDs into PostgreSQL 15

2022-12-28 Thread Maxim Orlov

Hi!

I want to make a quick summary here.

1. An overall consensus has been reached: we shall focus on committing SLRU
changes first.
2. I've created an appropriate patch set here [0].
3. How [0] is waiting for a review. As always, all opinions will be welcome.
4. While discussing error/warning messages and some other stuff, this
thread was marked as "Waiting on Author".
5. I do rebase this patch set once in a week, but do not post it here,
since there is no need in it. See (1).
6. For now, I don't understand what changes I have to make here. So,
does "Waiting
on Author" is appropriate status here?

Anyway. Let's discuss on-disk page format, shall we?

AFAICS, we have a following options:
1. Making "true" 64–bit XIDs. I.e. making every tuple have 64–bit xmin and
xmax fields.
2. Put special in every page where base for XIDs are stored. This is what
we have done in the current patch set.
3. Put base for XIDs in a fork.
4. Make explicit 64–bit XIDs for concrete relations. I.e. CREATE TABLE foo
WITH (xid8) of smth.

There were opinions that the proposed solution (2) is not the optimal. It
would be great to hear your concerns and thoughts.

[0]
https://www.postgresql.org/message-id/CACG%3Dezav34TL%2BfGXD5vJ48%3DQbQBL9BiwkOTWduu9yRqie-h%2BDg%40mail.gmail.com

-- 
Best regards,
Maxim Orlov.

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2022-12-19 Thread Maxim Orlov

Hi!

As a result of discussion in the thread [0], Robert Haas proposed to focus
on making SLRU 64 bit, as a first step towards 64 bit XIDs.
Here is the patch set.

In overall, code of this patch set is based on the existing code from [0]
and may be simplified, due to the fact, that SLRU_PAGES_PER_SEGMENT is not
meant to be changed now.
But I decided to leave it that way. At least for now.

As always, reviews and opinions are very welcome.

Should we change status for this thread to "need review"?

[0]
https://www.postgresql.org/message-id/flat/CA%2BTgmoZFmTGjgkmjgkcm2-vQq3_TzcoMKmVimvQLx9oJLbye0Q%40mail.gmail.com#03a4ab82569bb7b112db4a2f352d96b9

-- 
Best regards,
Maxim Orlov.


v51-0003-Make-pg_upgrade-from-32-bit-to-64-bit-SLRU.patch
Description: Binary data


v51-0001-Use-internal-64-bit-numbering-of-SLRU-pages.patch
Description: Binary data


v51-0002-Use-64-bit-pages-representation-in-SLRU-callers.patch
Description: Binary data

Re: Add 64-bit XIDs into PostgreSQL 15

2022-12-09 Thread Maxim Orlov

On Fri, 9 Dec 2022 at 16:54, adherent postgres <
adherent_postg...@hotmail.com> wrote:

> Hi Aleksander Alekseev
>  I think the xids 32bit transformation project has been dragged on for too
> long. Huawei's openGauss referenced this patch to implement xids 64bit, and
> Postgrespro also implemented xids 64bit, which is enough to prove that
> their worries are redundant.I think postgresql has no reason not to
> implement xid 64 bit. What about your opinion?
>

Yeah, I totally agree, the time has come. With a high transaction load,
Postgres become more and more difficult to maintain.
The problem is in the overall complexity of the patch set. We need more
reviewers.

Since committing such a big patch is not viable once at a time, from the
start of the work we did split it into several logical parts.
The evolution concept is more preferable in this case. As far as I can see,
there is overall consensus to commit SLRU related
changes first.

-- 
Best regards,
Maxim Orlov.

回复: Add 64-bit XIDs into PostgreSQL 15

2022-12-09 Thread adherent postgres

Hi Pavel Borisov
Now the disk performance has been improved many times, and the capacity has 
also been increased many times,The wal log already supports lz4 and zstd 
compression, I think each XLOG record size will increase at least by 4 bytes 
which is not a big problem.What about your opinion?

Best whish

发件人: Pavel Borisov 
发送时间: 2022年12月9日 22:13
收件人: adherent postgres 
抄送: Aleksander Alekseev ; 
pgsql-hackers@lists.postgresql.org ; Chris 
Travers ; Chris Travers ; Bruce 
Momjian 
主题: Re: Add 64-bit XIDs into PostgreSQL 15

Hi, Adherent!

On Fri, 9 Dec 2022 at 17:54, adherent postgres
 wrote:
>
> Hi Aleksander Alekseev
>  I think the xids 32bit transformation project has been dragged on for too 
> long. Huawei's openGauss referenced this patch to implement xids 64bit, and 
> Postgrespro also implemented xids 64bit, which is enough to prove that their 
> worries are redundant.I think postgresql has no reason not to implement xid 
> 64 bit. What about your opinion?

I agree it's high time to commit 64xids into PostgreSQL.

If you can do your review of the whole proposed patchset or only the
part that is likely to be committed earlier [1] it would help a lot!
I'd recommend beginning with the last version of the patch in thread
[1]. First, it is easier. Also, this review is going to be useful
sooner and will help a committer on January commitfest a lot.
[1]: 
https://www.postgresql.org/message-id/CAFiTN-uudj2PY8GsUzFtLYFpBoq_rKegW3On_8ZHdxB1mVv3-A%40mail.gmail.com

Regards,
Pavel Borisov,
Supabase

Re: Add 64-bit XIDs into PostgreSQL 15

2022-12-09 Thread Pavel Borisov

Hi, Adherent!

On Fri, 9 Dec 2022 at 17:54, adherent postgres
 wrote:
>
> Hi Aleksander Alekseev
>  I think the xids 32bit transformation project has been dragged on for too 
> long. Huawei's openGauss referenced this patch to implement xids 64bit, and 
> Postgrespro also implemented xids 64bit, which is enough to prove that their 
> worries are redundant.I think postgresql has no reason not to implement xid 
> 64 bit. What about your opinion?

I agree it's high time to commit 64xids into PostgreSQL.

If you can do your review of the whole proposed patchset or only the
part that is likely to be committed earlier [1] it would help a lot!
I'd recommend beginning with the last version of the patch in thread
[1]. First, it is easier. Also, this review is going to be useful
sooner and will help a committer on January commitfest a lot.
[1]: 
https://www.postgresql.org/message-id/CAFiTN-uudj2PY8GsUzFtLYFpBoq_rKegW3On_8ZHdxB1mVv3-A%40mail.gmail.com

Regards,
Pavel Borisov,
Supabase

Re: Add 64-bit XIDs into PostgreSQL 15

2022-12-09 Thread Maxim Orlov

> So I'd vote for an evolutionary approach and give my +1 for
> undertaking efforts to first committing [1] to 16.
>
> [1]:
> https://www.postgresql.org/message-id/CAFiTN-uudj2PY8GsUzFtLYFpBoq_rKegW3On_8ZHdxB1mVv3-A%40mail.gmail.com
>
> Kind regards,
> Pavel Borisov,
> Supabase.
>

+1 Totally support the idea. Let's focus on committing SLRU changes.

-- 
Best regards,
Maxim Orlov.

Re: Add 64-bit XIDs into PostgreSQL 15

2022-12-09 Thread adherent postgres

Hi Aleksander Alekseev
 I think the xids 32bit transformation project has been dragged on for too 
long. Huawei's openGauss referenced this patch to implement xids 64bit, and 
Postgrespro also implemented xids 64bit, which is enough to prove that their 
worries are redundant.I think postgresql has no reason not to implement xid 64 
bit. What about your opinion?

Best whish

发件人: Aleksander Alekseev 
发送时间: 2022年12月9日 20:49
收件人: pgsql-hackers@lists.postgresql.org 
抄送: adherent postgres ; Chris Travers 
; Chris Travers ; Bruce Momjian 

主题: Re: Add 64-bit XIDs into PostgreSQL 15

Hi adherent,

>  Robertmhaas said that the project Zheap is 
> dead（https://twitter.com/andy_pavlo/status/1590703943176589312）, which means 
> that we cannot use Zheap to deal with the issue of xid wraparound and dead 
> tuples in tables. The dead tuple issue is not a big deal because I can still 
> use pg_repack to handle, although pg_repack will cause wal log to increase 
> dramatically and may take one or two days to handle a large table. During 
> this time the database can be accessed by external users, but the xid 
> wraparound will cause PostgreSQL to be down, which is a disaster for DBAs. 
> Maybe you are not a DBA, or your are from a small country, Database system 
> tps is very low, so xid32 is  enough for your database system ,  Oracle's scn 
> was also 32bits, however, Oracle realized the issue and changed scn to 64 
> bits. The transaction id in mysql is 48 bits. MySQL didn't fix the 
> transaction id wraparound problem because they think that 48 bits is enough 
> for the transaction id. This project has been running for almost 1 year and 
> now it is coming to an end. I strongly disagree with your idea of stopping 
> this patch, and I suspect you are a saboteur. I strongly disagree with your 
> viewpoint, as it is not a fundamental way to solve the xid wraparound 
> problem. The PostgreSQL community urgently needs developers who solve 
> problems like this, not bury one' head in the sand

This is not uncommon for people on the mailing list to have
disagreements. This is part of the process, we all are looking for
consensus. It's true that different people have different use cases in
mind and different backgrounds as well. It doesn't mean these use
cases are wrong and/or the experience is irrelevant and/or the
received feedback should be just discarded.

Although I also expressed my disagreement with Chris before, let's not
assume any bad intent and especially sabotage as you put it. (Unless
you have a strong proof of this of course which I doubt you have.) We
want all kinds of feedback to be welcomed here. I'm sure our goal here
is mutual, to make PostgreSQL even better than it is now. The only
problem is that the definition of "better" varies sometimes.

I see you believe that 64-bit XIDs are going to be useful. That's
great! Tell us more about your case and how the patch is going to help
with it. Also, maybe you could test your load with the applied
patchset and tell us whether it makes things better or worse?
Personally I would love hearing this from you.

--
Best regards,
Aleksander Alekseev

Re: Add 64-bit XIDs into PostgreSQL 15

2022-12-09 Thread Pavel Borisov

Hi, Robert!
> Lest we miss the forest for the trees, there is an aspect of this
> patch that I find to be an extremely good idea and think we should try
> to get committed even if the rest of the patch set ends up in the
> rubbish bin. Specifically, there are a couple of patches in here that
> have to do with making SLRUs indexed by 64-bit integers rather than by
> 32-bit integers. We've had repeated bugs in the area of handling SLRU
> wraparound in the past, some of which have caused data loss. Just by
> chance, I ran across a situation just yesterday where an SLRU wrapped
> around on disk for reasons that I don't really understand yet and
> chaos ensued. Switching to an indexing system for SLRUs that does not
> ever wrap around would probably enable us to get rid of a whole bunch
> of crufty code, and would also likely improve the general reliability
> of the system in situations where wraparound is threatened. It seems
> like a really, really good idea.

I totally support the idea that the part related to SLRU is worth
committing whether it is being the first step to 64xid or separately.
This subset is discussed in a separate thread [1]. It seems that we
need more time to reach a consensus on the implementation of a whole
big thing. Just this discussion is a complicated thing and reveals
many different aspects concurrently in one thread.

So I'd vote for an evolutionary approach and give my +1 for
undertaking efforts to first committing [1] to 16.

[1]: 
https://www.postgresql.org/message-id/CAFiTN-uudj2PY8GsUzFtLYFpBoq_rKegW3On_8ZHdxB1mVv3-A%40mail.gmail.com

Kind regards,
Pavel Borisov,
Supabase.

Re: Add 64-bit XIDs into PostgreSQL 15

2022-12-09 Thread Aleksander Alekseev

Hi adherent,

>  Robertmhaas said that the project Zheap is 
> dead（https://twitter.com/andy_pavlo/status/1590703943176589312）, which means 
> that we cannot use Zheap to deal with the issue of xid wraparound and dead 
> tuples in tables. The dead tuple issue is not a big deal because I can still 
> use pg_repack to handle, although pg_repack will cause wal log to increase 
> dramatically and may take one or two days to handle a large table. During 
> this time the database can be accessed by external users, but the xid 
> wraparound will cause PostgreSQL to be down, which is a disaster for DBAs. 
> Maybe you are not a DBA, or your are from a small country, Database system 
> tps is very low, so xid32 is  enough for your database system ,  Oracle's scn 
> was also 32bits, however, Oracle realized the issue and changed scn to 64 
> bits. The transaction id in mysql is 48 bits. MySQL didn't fix the 
> transaction id wraparound problem because they think that 48 bits is enough 
> for the transaction id. This project has been running for almost 1 year and 
> now it is coming to an end. I strongly disagree with your idea of stopping 
> this patch, and I suspect you are a saboteur. I strongly disagree with your 
> viewpoint, as it is not a fundamental way to solve the xid wraparound 
> problem. The PostgreSQL community urgently needs developers who solve 
> problems like this, not bury one' head in the sand

This is not uncommon for people on the mailing list to have
disagreements. This is part of the process, we all are looking for
consensus. It's true that different people have different use cases in
mind and different backgrounds as well. It doesn't mean these use
cases are wrong and/or the experience is irrelevant and/or the
received feedback should be just discarded.

Although I also expressed my disagreement with Chris before, let's not
assume any bad intent and especially sabotage as you put it. (Unless
you have a strong proof of this of course which I doubt you have.) We
want all kinds of feedback to be welcomed here. I'm sure our goal here
is mutual, to make PostgreSQL even better than it is now. The only
problem is that the definition of "better" varies sometimes.

I see you believe that 64-bit XIDs are going to be useful. That's
great! Tell us more about your case and how the patch is going to help
with it. Also, maybe you could test your load with the applied
patchset and tell us whether it makes things better or worse?
Personally I would love hearing this from you.

-- 
Best regards,
Aleksander Alekseev

回复: Add 64-bit XIDs into PostgreSQL 15

2022-12-09 Thread adherent postgres

Hi Chris Travers
 Robertmhaas said that the project Zheap is 
dead（https://twitter.com/andy_pavlo/status/1590703943176589312）, which means 
that we cannot use Zheap to deal with the issue of xid wraparound and dead 
tuples in tables. The dead tuple issue is not a big deal because I can still 
use pg_repack to handle, although pg_repack will cause wal log to increase 
dramatically and may take one or two days to handle a large table. During this 
time the database can be accessed by external users, but the xid wraparound 
will cause PostgreSQL to be down, which is a disaster for DBAs. Maybe you are 
not a DBA, or your are from a small country, Database system tps is very low, 
so xid32 is  enough for your database system ,  Oracle's scn was also 32bits, 
however, Oracle realized the issue and changed scn to 64 bits. The transaction 
id in mysql is 48 bits. MySQL didn't fix the transaction id wraparound problem 
because they think that 48 bits is enough for the transaction id. This project 
has been running for almost 1 year and now it is coming to an end. I strongly 
disagree with your idea of stopping this patch, and I suspect you are a 
saboteur. I strongly disagree with your viewpoint, as it is not a fundamental 
way to solve the xid wraparound problem. The PostgreSQL community urgently 
needs developers who solve problems like this, not bury one' head in the sand

Best whish

发件人: Peter Geoghegan 
发送时间: 2022年12月1日 0:35
收件人: Robert Haas 
抄送: Chris Travers ; Bruce Momjian ; 
Aleksander Alekseev ; 
pgsql-hackers@lists.postgresql.org ; Chris 
Travers ; Fedor Sigaev ; Alexander 
Korotkov ; Konstantin Knizhnik ; 
Nikita Glukhov ; Yura Sokolov 
; Maxim Orlov ; Pavel Borisov 
; Simon Riggs 
主题: Re: Add 64-bit XIDs into PostgreSQL 15

On Wed, Nov 30, 2022 at 8:13 AM Robert Haas  wrote:
> I haven't checked the patches to see whether they look correct, and
> I'm concerned in particular about upgrade scenarios. But if there's a
> way we can get that part committed, I think it would be a clear win.

+1

--
Peter Geoghegan

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2022-12-07 Thread Peter Geoghegan

On Wed, Dec 7, 2022 at 9:50 AM Andres Freund  wrote:
> performing post-bootstrap initialization ... 
> ../src/backend/access/transam/slru.c:1520:9: runtime error: load of 
> misaligned address 0x7fff6362db8c for type 'int64', which requires 8 byte 
> alignment
> 0x7fff6362db8c: note: pointer points here
>   01 00 00 00 00 00 00 00  d0 02 00 00 00 00 00 00  d0 02 00 00 00 00 00 00  
> 01 00 00 00 00 00 00 00

I bet that this alignment issue can be fixed by using PGAlignedBlock
instead of a raw char buffer for a page. (I'm guessing, I haven't
directly checked.)

-- 
Peter Geoghegan

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2022-12-07 Thread Andres Freund

Hi,

On 2022-12-07 11:40:08 +0300, Aleksander Alekseev wrote:
> Hi Michael,
>
> > The CF bot is showing some failures here.  You may want to
> > double-check.
>
> Thanks! PFA v48.

This causes a lot of failures with ubsan:

https://cirrus-ci.com/task/6035600772431872

performing post-bootstrap initialization ... 
../src/backend/access/transam/slru.c:1520:9: runtime error: load of misaligned 
address 0x7fff6362db8c for type 'int64', which requires 8 byte alignment
0x7fff6362db8c: note: pointer points here
  01 00 00 00 00 00 00 00  d0 02 00 00 00 00 00 00  d0 02 00 00 00 00 00 00  01 
00 00 00 00 00 00 00
  ^
==18947==Using libbacktrace symbolizer.
#0 0x564d7c45cc6b in SlruScanDirCbReportPresence 
../src/backend/access/transam/slru.c:1520
#1 0x564d7c45cd88 in SlruScanDirectory 
../src/backend/access/transam/slru.c:1595
#2 0x564d7c44872c in TruncateCLOG ../src/backend/access/transam/clog.c:889
#3 0x564d7c62ecd7 in vac_truncate_clog ../src/backend/commands/vacuum.c:1779
#4 0x564d7c6320a8 in vac_update_datfrozenxid 
../src/backend/commands/vacuum.c:1642
#5 0x564d7c632a78 in vacuum ../src/backend/commands/vacuum.c:537
#6 0x564d7c63347d in ExecVacuum ../src/backend/commands/vacuum.c:273
#7 0x564d7ca4afea in standard_ProcessUtility 
../src/backend/tcop/utility.c:866
#8 0x564d7ca4b723 in ProcessUtility ../src/backend/tcop/utility.c:530
#9 0x564d7ca46e81 in PortalRunUtility ../src/backend/tcop/pquery.c:1158
#10 0x564d7ca4755d in PortalRunMulti ../src/backend/tcop/pquery.c:1315
#11 0x564d7ca47c02 in PortalRun ../src/backend/tcop/pquery.c:791
#12 0x564d7ca40ecb in exec_simple_query ../src/backend/tcop/postgres.c:1238
#13 0x564d7ca43c01 in PostgresMain ../src/backend/tcop/postgres.c:4551
#14 0x564d7ca441a4 in PostgresSingleUserMain 
../src/backend/tcop/postgres.c:4028
#15 0x564d7c74d883 in main ../src/backend/main/main.c:197
#16 0x7fde7793dd09 in __libc_start_main 
(/lib/x86_64-linux-gnu/libc.so.6+0x23d09)
#17 0x564d7c2d30c9 in _start 
(/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres+0x8530c9)

Greetings,

Andres Freund

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2022-12-01 Thread Michael Paquier

On Mon, Nov 21, 2022 at 12:21:09PM +0300, Aleksander Alekseev wrote:
> After merging 1489b1ce [1] the patchset needed a rebase. PFA v47.
> 
> [1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=1489b1ce

The CF bot is showing some failures here.  You may want to
double-check.
--
Michael


signature.asc
Description: PGP signature

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-30 Thread Peter Geoghegan

On Wed, Nov 30, 2022 at 8:13 AM Robert Haas  wrote:
> I haven't checked the patches to see whether they look correct, and
> I'm concerned in particular about upgrade scenarios. But if there's a
> way we can get that part committed, I think it would be a clear win.

+1

-- 
Peter Geoghegan

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-30 Thread Robert Haas

On Tue, Nov 29, 2022 at 9:35 PM Chris Travers  wrote:
> My proposal would be to make the threshold configurable and start warning on 
> every transaction after that.  There are a couple reasons to do that.
>
> The first is that noisy warnings are extremely easy to see.  You get them in 
> cron emails, from psql, in the db logs etc.  Having them every million makes 
> them harder to catch.
>
> The point here is not to ensure there are no problems, but to make sure that 
> an existing layer in the current swiss cheese model of safety doesn't go 
> away.  Will it stop all problems?  No.  But the current warning strategy is 
> effective, given how many times we hear of cases of people having to take 
> drastic action to avoid impending xid wraparound.
>
> If someone has an insert only database and maye doesn't want to ever freeze, 
> they can set the threshold to -1 or something.  I would suggest keeping the 
> default as at 2 billion to be in line with existing limitations and 
> practices.  People can then adjust as they see fit.
>
> Warning text might be something like "XID Lag Threshold Exceeded.  Is 
> autovacuum clearing space and keeping up?"

None of this seems unreasonable to me. If we want to allow more
configurability, we could also let you choose the threshold and the
frequency of warnings (every N transactions).

But, I think we might be getting down a little bit in the weeds. It's
not clear that everybody's on board with the proposed page format
changes. I'm not completely opposed, but I'm also not wild about the
approach. It's probably not a good idea to spend all of our energy
debating the details of how to reform xidWrapLimit without having some
consensus on those points. It is, in a word, bikeshedding: on-disk
page format changes are hard, but everyone understands warning
messages.

Lest we miss the forest for the trees, there is an aspect of this
patch that I find to be an extremely good idea and think we should try
to get committed even if the rest of the patch set ends up in the
rubbish bin. Specifically, there are a couple of patches in here that
have to do with making SLRUs indexed by 64-bit integers rather than by
32-bit integers. We've had repeated bugs in the area of handling SLRU
wraparound in the past, some of which have caused data loss. Just by
chance, I ran across a situation just yesterday where an SLRU wrapped
around on disk for reasons that I don't really understand yet and
chaos ensued. Switching to an indexing system for SLRUs that does not
ever wrap around would probably enable us to get rid of a whole bunch
of crufty code, and would also likely improve the general reliability
of the system in situations where wraparound is threatened. It seems
like a really, really good idea.

I haven't checked the patches to see whether they look correct, and
I'm concerned in particular about upgrade scenarios. But if there's a
way we can get that part committed, I think it would be a clear win.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-29 Thread Chris Travers

On Tue, Nov 29, 2022 at 5:57 PM Bruce Momjian  wrote:

> On Tue, Nov 29, 2022 at 11:41:07AM -0500, Robert Haas wrote:
> > My argument is that removing xidStopLimit is totally fine, because it
> > only serves to stop the database. What to do about xidWarnLimit is a
> > slightly more complex question. Certainly it can't be left untouched,
> > because warning that we're about to shut down the database for lack of
> > allocatable XIDs is not sensible if there is no such lack and we
> > aren't going to shut it down. But I'm also not sure if the model is
> > right. Doing nothing for a long time and then warning in every
> > transaction when some threshold is crossed is an extreme behavior
> > change. Right now that's somewhat justified because we're about to hit
> > a brick wall at full speed, but if we remove the brick wall and
> > replace it with a gentle pelting with rotten eggs, it's unclear that a
> > similarly strenuous reaction is the right model. But that's also not
> > to say that we should do nothing at all.
>
> Yeah, we would probably need to warn on every 1 million transactions or
> something.
>
>
My proposal would be to make the threshold configurable and start warning
on every transaction after that.  There are a couple reasons to do that.

The first is that noisy warnings are extremely easy to see.  You get them
in cron emails, from psql, in the db logs etc.  Having them every million
makes them harder to catch.

The point here is not to ensure there are no problems, but to make sure
that an existing layer in the current swiss cheese model of safety doesn't
go away.  Will it stop all problems?  No.  But the current warning strategy
is effective, given how many times we hear of cases of people having to
take drastic action to avoid impending xid wraparound.

If someone has an insert only database and maye doesn't want to ever
freeze, they can set the threshold to -1 or something.  I would suggest
keeping the default as at 2 billion to be in line with existing limitations
and practices.  People can then adjust as they see fit.

Warning text might be something like "XID Lag Threshold Exceeded.  Is
autovacuum clearing space and keeping up?"

> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
> Embrace your flaws.  They make you human, rather than perfect,
> which you will never be.
>

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-29 Thread Greg Stark

On Mon, 28 Nov 2022 at 16:53, Peter Geoghegan  wrote:
>

> Imagine if we actually had 64-bit XIDs -- let's assume for a moment
> that it's a done deal. This raises a somewhat awkward question: do you
> just let the system get further and further behind on freezing,
> forever? We can all agree that 2 billion XIDs is very likely the wrong
> time to start refusing new XIDs -- because it just isn't designed with
> debt in mind. But what's the right time, if any? How much debt is too
> much?

My first thought was... why not? Just let the system get further and
further behind on freezing. Where's the harm?

Picture an insert-only database that is receiving data very quickly
never having data deleted or modified. vacuum takes several days to
complete and the system wraps 32-bit xid several times a day.

The DBA asks you why are they even bothering running vacuum? They have
plenty of storage for clog, latency on selects is not a pain point,
not compared to running multi-day vacuums that impact insert times

That isn't far off the scenario where I've seen wraparound being a
pain btw. Anti-wraparound vacuum took about 2 days and was kicking off
pretty much as soon as the previous one finished. For a table that was
mostly read-only.

Of course to make the judgement the DBA needs to have good ways to
measure the space usage of clog, and the overhead caused by clog
lookups that could be avoided. Then they can judge for themselves how
much freezing is appropriate.

-- 
greg

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-29 Thread Bruce Momjian

On Tue, Nov 29, 2022 at 11:41:07AM -0500, Robert Haas wrote:
> My argument is that removing xidStopLimit is totally fine, because it
> only serves to stop the database. What to do about xidWarnLimit is a
> slightly more complex question. Certainly it can't be left untouched,
> because warning that we're about to shut down the database for lack of
> allocatable XIDs is not sensible if there is no such lack and we
> aren't going to shut it down. But I'm also not sure if the model is
> right. Doing nothing for a long time and then warning in every
> transaction when some threshold is crossed is an extreme behavior
> change. Right now that's somewhat justified because we're about to hit
> a brick wall at full speed, but if we remove the brick wall and
> replace it with a gentle pelting with rotten eggs, it's unclear that a
> similarly strenuous reaction is the right model. But that's also not
> to say that we should do nothing at all.

Yeah, we would probably need to warn on every 1 million transactions or
something.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-29 Thread Robert Haas

On Tue, Nov 29, 2022 at 8:03 AM Chris Travers  wrote:
> To be clear, I never suggested shutting down the database.  What I have 
> suggested is that repurposing the current approaching-xid-wraparound warnings 
> to start complaining loudly when a threshold is exceeded would be helpful.   
> I think it makes sense to make that threshold configurable especially if we 
> eventually have people running bloat-free table structures.
>
> There are two fundamental problems here.  The first is that if, as you say, a 
> table is progressively bloating and we are getting further and further behind 
> on vacuuming and freezing, something is seriously wrong and we need to do 
> something about it.  In these cases, I my experience is that vacuuming and 
> related tools tend to suffer degraded performance, and determining how to 
> solve the problem takes quite a bit more time than a routine bloat issue 
> would.  So what I am arguing against is treating the problem just as a bloat 
> issue.  If you get there due to vacuum being slow, something else is wrong 
> and you are probably going to have to find and fix that as well in order to 
> catch up.  At least that's my experience.
>
> I don't object to the db continuing to run, allocate xids etc.  What I object 
> to is it doing so in silently where things are almost certainly going very 
> wrong.

OK. My feeling is that the set of things we can do to warn the user is
somewhat limited. I'm open to trying our best, but we need to have
reasonable expectations. Sophisticated users will be monitoring for
problems even if we do nothing to warn, and dumb ones won't look at
their logs. Any feature that proposes to warn must aim at the uses who
are smart enough to check the logs but dumb enough not to have any
more sophisticated monitoring. Such users certainly exist and are not
even uncommon, but they aren't the only kind by a long shot.

My argument is that removing xidStopLimit is totally fine, because it
only serves to stop the database. What to do about xidWarnLimit is a
slightly more complex question. Certainly it can't be left untouched,
because warning that we're about to shut down the database for lack of
allocatable XIDs is not sensible if there is no such lack and we
aren't going to shut it down. But I'm also not sure if the model is
right. Doing nothing for a long time and then warning in every
transaction when some threshold is crossed is an extreme behavior
change. Right now that's somewhat justified because we're about to hit
a brick wall at full speed, but if we remove the brick wall and
replace it with a gentle pelting with rotten eggs, it's unclear that a
similarly strenuous reaction is the right model. But that's also not
to say that we should do nothing at all.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-29 Thread Robert Haas

On Mon, Nov 28, 2022 at 4:53 PM Peter Geoghegan  wrote:
> Imagine if we actually had 64-bit XIDs -- let's assume for a moment
> that it's a done deal. This raises a somewhat awkward question: do you
> just let the system get further and further behind on freezing,
> forever? We can all agree that 2 billion XIDs is very likely the wrong
> time to start refusing new XIDs -- because it just isn't designed with
> debt in mind. But what's the right time, if any? How much debt is too
> much?

I simply don't see a reason to ever stop the server entirely. I don't
even agree with the idea of slowing down XID allocation, let alone
refusing it completely. When the range of allocated XIDs become too
large, several bad things happen. First, we become unable to allocate
new XIDs without corrupting the database. Second, pg_clog and other
SLRUs become uncomfortably large. There may be some other things too
that I'm not thinking about. But these things are not all equally bad.
If these were medical problems, being unable to allocate new XIDs
without data corruption would be a heart attack, and SLRUs getting
bigger on disk would be acne. You don't handle problems of such wildly
differing severity in the same way. When someone is having a heart
attack, an ambulance rushes them to the hospital, running red lights
as necessary. When someone has acne, you don't take them to the same
hospital in the same ambulance and drive it at a slower rate of speed.
You do something else entirely, and it's something that is in every
way much less dramatic. There's no such thing as an attack of acne
that's so bad that it requires an ambulance ride, but even a mild
heart attack should result in a fast trip to the ER. So here. The two
problems are so qualitatively different that the responses should also
be qualitatively different.

> Admittedly this argument works a lot better with the failsafe than it
> does with xidStopLimit. Both are removed by the patch.

I don't think the failsafe stuff should be removed, but it should
probably be modified in some way. Running out of XIDs is the only
valid reason for stopping the world, at least IMO, but it is
definitely NOT the only reason for vacuuming more aggressively.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-29 Thread Jan Wieck


On 11/29/22 09:46, Bruce Momjian wrote:

As far as I know, all our freeze values are focused on avoiding XID
wraparound.  If XID wraparound is no longer an issue, we might find that
our freeze limits can be much higher than they are now.



I'd be careful in that direction as the values together with maintenance 
work mem also keep a lid on excessive index cleanup rounds.



Regards, Jan

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-29 Thread Bruce Momjian

On Tue, Nov 29, 2022 at 02:35:20PM +0100, Chris Travers wrote:
> So I think the problem is that PostgreSQL is becoming more and more scalabile,
> hardware is becoming more capable, and certain use cases are continuing to
> scale up.  Over time, we tend to find ourselves approaching the end of the
> runway at ever higher velocities.  That's a problem that will get 
> significantly
> worse over time.
> 
> Of course, as I think we agree, the priorities should be (in order):
> 1.  Avoid trouble
> 2.  Recover from trouble early
> 3.  Provide more and better options for recovery.

Warn about trouble is another area we should focus on here.

> I think 64bit xids are a very good idea, but they really fit in this bottom
> tier.   Not being up against mathematical limits to the software when things
> are going bad is certainly a good thing.  But I am really worried about the
> attitude that this patch really avoids trouble because in many cases, I don;t
> think it does and therefore I believe we need to make sure we are not reducing
> visibility of underlying problems.

As far as I know, all our freeze values are focused on avoiding XID
wraparound.  If XID wraparound is no longer an issue, we might find that
our freeze limits can be much higher than they are now.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-29 Thread Chris Travers

On Mon, Nov 28, 2022 at 11:06 PM Peter Geoghegan  wrote:

> On Mon, Nov 28, 2022 at 1:52 PM Bruce Momjian  wrote:
> > I think the problem is that we still have bloat with 64-bit XIDs,
> > specifically pg_xact and pg_multixact files.  Yes, that bloat is less
> > serious, but it is still an issue worth reporting in the server logs,
> > though not serious enough to stop the server from write queries.
>
> That's definitely a big part of it.
>
> Again, I don't believe that the idea is fundamentally without merit.
> Just that it's not worth it, given that having more XID space is very
> much not something that I think fixes most of the problems. And given
> the real risk of serious bugs with something this invasive.

> I believe that it would be more useful to focus on just not getting
> into trouble in the first place, as well as on mitigating specific
> problems that lead to the system reaching xidStopLimit in practice. I
> don't think that there is any good reason to allow datfrozenxid to go
> past about a billion. When it does the interesting questions are
> questions about what went wrong, and how that specific failure can be
> mitigated in a fairly direct way.
>
> We've already used way to much "XID space runway", so why should using
> even more help? It might, I suppose, but it almost seems like a
> distraction to me, as somebody that wants to make things better for
> users in general. As long as the system continues to misbehave (in
> whatever way it happens to be misbehaving), why should any amount of
> XID space ever be enough?
>

So I think the problem is that PostgreSQL is becoming more and more
scalabile, hardware is becoming more capable, and certain use cases are
continuing to scale up.  Over time, we tend to find ourselves approaching
the end of the runway at ever higher velocities.  That's a problem that
will get significantly worse over time.

Of course, as I think we agree, the priorities should be (in order):
1.  Avoid trouble
2.  Recover from trouble early
3.  Provide more and better options for recovery.

I think 64bit xids are a very good idea, but they really fit in this bottom
tier.   Not being up against mathematical limits to the software when
things are going bad is certainly a good thing.  But I am really worried
about the attitude that this patch really avoids trouble because in many
cases, I don;t think it does and therefore I believe we need to make sure
we are not reducing visibility of underlying problems.

>
> I think that we'll be able to get rid of freezing in a few years time.
> But as long as we have freezing, we have these problems.
>
> --
> Peter Geoghegan
>

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-29 Thread Chris Travers

Hi;

I suppose I must not have been clear in what I am suggesting we do and
why.   I will try to answer specific points below and then restate what I
think the problem is, and what I think should be done about it.

On Mon, Nov 28, 2022 at 5:53 PM Robert Haas  wrote:

> On Sat, Nov 26, 2022 at 4:08 AM Chris Travers 
> wrote:
> > I didn't see any changes to pg_upgrade to make this change possible on
> upgrade.  Is that also outside of the scope of your patch set?  If so how
> is that continuity supposed to be ensured?
>
> The scheme is documented in their 0006 patch, in a README.XID file.
> I'm not entirely confident that it's the best design and have argued
> against it in the past, but it's not crazy.
>

Right.  Per previous discussion I thought there was some discussion of
allowing people to run with the existing behavior.I must have been
mistaken.  If that is off the table then pg_upgrade and runtime replication
checks don't matter.

>
> More generally, while I think there's plenty of stuff to be concerned
> about in this patch set and while I'm somewhat skeptical about the
> likelihood of its getting or staying committed, I can't really
> understand your concerns in particular. The thrust of your concern
> seems to be that if we allow people to get further behind, recovery
> will be more difficult. I'm not sure I see the problem. Suppose that
> we adopt this proposal and that it is bug-free. Now, consider a user
> who gets 8 billion XIDs behind. They probably have to vacuum pretty
> much every page in the database to do that, or least every page in the
> tables that haven't been vacuumed recently. But that would likely also
> be true if they were 800 million XIDs behind, as is possible today.
> The effort to catch up doesn't increase linearly with how far behind
> you are, and is always bounded by the DB size.
>

Right.  I agree with all of that.

>
> It is true that if the table is progressively bloating, it is likely
> to be more bloated by the time you are 8 billion XIDs behind than it
> was when you were 800 million XIDs behind. I don't see that as a very
> good reason not to adopt this patch, because you can bloat the table
> by an arbitrarily large amount while consuming only a small number of
> XiDs, even just 1 XID. Protecting against bloat is good, but shutting
> down the database when the XID age reaches a certain value is not a
> particularly effective way of doing that, so saying that we'll be
> hurting people by not shutting down the database at the point where we
> do so today doesn't ring true to me. I think that most people who get
> to the point of wraparound shutdown have workloads where bloat isn't a
> huge issue, because those who do start having problems with the bloat
> way before they run out of XIDs.
>

To be clear, I never suggested shutting down the database.  What I have
suggested is that repurposing the current approaching-xid-wraparound
warnings to start complaining loudly when a threshold is exceeded would be
helpful.   I think it makes sense to make that threshold configurable
especially if we eventually have people running bloat-free table structures.

There are two fundamental problems here.  The first is that if, as you say,
a table is progressively bloating and we are getting further and further
behind on vacuuming and freezing, something is seriously wrong and we need
to do something about it.  In these cases, I my experience is that
vacuuming and related tools tend to suffer degraded performance, and
determining how to solve the problem takes quite a bit more time than a
routine bloat issue would.  So what I am arguing against is treating the
problem just as a bloat issue.  If you get there due to vacuum being slow,
something else is wrong and you are probably going to have to find and fix
that as well in order to catch up.  At least that's my experience.

I don't object to the db continuing to run, allocate xids etc.  What I
object to is it doing so in silently where things are almost certainly
going very wrong.

>
> It would be entirely possible to add a parameter to the system that
> says "hey, you know we can keep running even if we're a shazillion
> XiDs behind, but instead shut down when we are behind by this number
> of XIDs." Then, if somebody wants to force an automatic shutdown at
> that point, they could, and I think that then the scenario you're
> worried about just can't happen any more . But isn't that a little bit
> silly? You could also just monitor how far behind you are and page the
> DBA when you get behind by more than a certain number of XIDs. Then,
> you wouldn't be risking a shutdown, and you'd still be able to stay on
> top of the XID ages of your tables.
>
> Philosophically, I disagree with the idea of shutting down the
> database completely in any situation in which a reasonable alternative
> exists. Losing read and write availability is really bad, and I don't
> think it's what users want. I think that most users want the database
> to

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-28 Thread Peter Geoghegan

On Mon, Nov 28, 2022 at 1:52 PM Peter Geoghegan  wrote:
> I'm not claiming to know how to build this "better xidStopLimit
> mechanism", by the way. I'm not seriously proposing it. Mostly I'm
> just saying that the question "where do you draw the line if not at 2
> billion XIDs?" is a very pertinent question. It is not a question that
> is made any less valid by the fact that we already know that 2 billion
> XIDs is pretty far from optimal in almost all cases. Having some limit
> based on something is likely to be more effective than having no limit
> based on nothing.
>
> Admittedly this argument works a lot better with the failsafe than it
> does with xidStopLimit. Both are removed by the patch.

Come to think of it, if you were going to do something like this it
would probably work by throttling XID allocations, with a gradual
ramp-up. It would rarely get to the point that the system refused to
allocate XIDs completely.

It's not fundamentally unreasonable to force the application to live
within its means by throttling. Feedback that slows down the rate of
writes is much more common in the LSM tree world, within systems like
MyRocks [1], where the risk of the system being destabilized by debt
is more pressing.

As I said, I don't think that this is a particularly promising way of
addressing problems with Postgres XID space exhaustion, since I
believe that the underlying issue isn't usually a simple lack of "XID
space slack capacity". But if you assume that I'm wrong here (if you
assume that we very often don't have the ability to freeze lazily
enough), then ISTM that throttling or feedback to stall new writes is
a very reasonable option. In fact, it's practically mandatory.

[1] 
https://docs.google.com/presentation/d/1WgP-SlKay5AnSoVDSvOIzmu7edMmtYhdywoa0oAR4JQ/edit#slide=id.g8839c9d71b_0_79
-- 
Peter Geoghegan

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-28 Thread Peter Geoghegan

On Mon, Nov 28, 2022 at 1:52 PM Bruce Momjian  wrote:
> I think the problem is that we still have bloat with 64-bit XIDs,
> specifically pg_xact and pg_multixact files.  Yes, that bloat is less
> serious, but it is still an issue worth reporting in the server logs,
> though not serious enough to stop the server from write queries.

That's definitely a big part of it.

Again, I don't believe that the idea is fundamentally without merit.
Just that it's not worth it, given that having more XID space is very
much not something that I think fixes most of the problems. And given
the real risk of serious bugs with something this invasive.

I believe that it would be more useful to focus on just not getting
into trouble in the first place, as well as on mitigating specific
problems that lead to the system reaching xidStopLimit in practice. I
don't think that there is any good reason to allow datfrozenxid to go
past about a billion. When it does the interesting questions are
questions about what went wrong, and how that specific failure can be
mitigated in a fairly direct way.

We've already used way to much "XID space runway", so why should using
even more help? It might, I suppose, but it almost seems like a
distraction to me, as somebody that wants to make things better for
users in general. As long as the system continues to misbehave (in
whatever way it happens to be misbehaving), why should any amount of
XID space ever be enough?

I think that we'll be able to get rid of freezing in a few years time.
But as long as we have freezing, we have these problems.

-- 
Peter Geoghegan

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-28 Thread Peter Geoghegan

On Mon, Nov 28, 2022 at 1:30 PM Robert Haas  wrote:
> What is the purpose of using 64-bit XIDs, if not to avoid having to
> stop the world when we run short of XIDs?

I agree that the xidStopLimit mechanism was designed with the specific
goal of preventing "true" physical XID wraparound that results in
wrong answers to queries. It was not written with the intention of
limiting the accumulation of freeze debt, which would have to use
units like unfrozen heap pages to make any sense. That is a historic
fact -- no question.

I think that it is nevertheless quite possible that just refusing to
allocate any more XIDs could make sense with 64-bit XIDs, where we
don't strictly have to to keep the system fully functional. That might
still be the lesser evil, in that other world. The cutoff itself would
depend on many workload details, I suppose.

Imagine if we actually had 64-bit XIDs -- let's assume for a moment
that it's a done deal. This raises a somewhat awkward question: do you
just let the system get further and further behind on freezing,
forever? We can all agree that 2 billion XIDs is very likely the wrong
time to start refusing new XIDs -- because it just isn't designed with
debt in mind. But what's the right time, if any? How much debt is too
much?

At the very least these seem like questions that deserve serious consideration.

> I'd say that if this patch, or any patch with broadly similar goals,
> fails to remove xidStopLimit, it might as well not exist.

Well, it could in principle be driven by lots of different kinds of
information, and make better decisions by actually targeting freeze
debt in some way.  An "enhanced version of xidStopLimit with 64-bit
XIDs" could kick in far far later than it currently would. Obviously
that has some value.

I'm not claiming to know how to build this "better xidStopLimit
mechanism", by the way. I'm not seriously proposing it. Mostly I'm
just saying that the question "where do you draw the line if not at 2
billion XIDs?" is a very pertinent question. It is not a question that
is made any less valid by the fact that we already know that 2 billion
XIDs is pretty far from optimal in almost all cases. Having some limit
based on something is likely to be more effective than having no limit
based on nothing.

Admittedly this argument works a lot better with the failsafe than it
does with xidStopLimit. Both are removed by the patch.

--
Peter Geoghegan

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-28 Thread Bruce Momjian

On Mon, Nov 28, 2022 at 04:30:22PM -0500, Robert Haas wrote:
> What is the purpose of using 64-bit XIDs, if not to avoid having to
> stop the world when we run short of XIDs?
> 
> I'd say that if this patch, or any patch with broadly similar goals,
> fails to remove xidStopLimit, it might as well not exist.
> 
> xidStopLimit is not a general defense against falling too far behind
> on freezing, and in general, there is no reason to think that we need
> such a defense. xidStopLimit is a defense against reusing the same XID
> and thus causing irreversible database corruption. When that
> possibility no longer exists, it has outlived its usefulness and we
> should be absolutely delighted to bid it adieu.
> 
> It seems like you and Chris are proposing the moral equivalent of
> paying off your home loan but still sending a check to the mortgage
> company every month just to be sure they don't get mad.

I think the problem is that we still have bloat with 64-bit XIDs,
specifically pg_xact and pg_multixact files.  Yes, that bloat is less
serious, but it is still an issue worth reporting in the server logs,
though not serious enough to stop the server from write queries.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-28 Thread Robert Haas

On Mon, Nov 28, 2022 at 4:09 PM Peter Geoghegan  wrote:
> Granted, the specifics of the current XidStopLimit mechanism are
> unlikely to directly carry over to 64-bit XIDs. XidStopLimit is
> structured in a way that doesn't actually consider freeze debt in
> units like unfrozen pages. Like Chris, I just don't see why the patch
> obviates the need for something like XidStopLimit, since the patch
> doesn't remove freezing. An improved XidStopLimit mechanism might even
> end up kicking in *before* the oldest relfrozenxid reached 2 billion
> XIDs, depending on the specifics.

What is the purpose of using 64-bit XIDs, if not to avoid having to
stop the world when we run short of XIDs?

I'd say that if this patch, or any patch with broadly similar goals,
fails to remove xidStopLimit, it might as well not exist.

xidStopLimit is not a general defense against falling too far behind
on freezing, and in general, there is no reason to think that we need
such a defense. xidStopLimit is a defense against reusing the same XID
and thus causing irreversible database corruption. When that
possibility no longer exists, it has outlived its usefulness and we
should be absolutely delighted to bid it adieu.

It seems like you and Chris are proposing the moral equivalent of
paying off your home loan but still sending a check to the mortgage
company every month just to be sure they don't get mad.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-28 Thread Peter Geoghegan

On Mon, Nov 28, 2022 at 8:53 AM Robert Haas  wrote:
> It is true that if the table is progressively bloating, it is likely
> to be more bloated by the time you are 8 billion XIDs behind than it
> was when you were 800 million XIDs behind. I don't see that as a very
> good reason not to adopt this patch, because you can bloat the table
> by an arbitrarily large amount while consuming only a small number of
> XiDs, even just 1 XID. Protecting against bloat is good, but shutting
> down the database when the XID age reaches a certain value is not a
> particularly effective way of doing that, so saying that we'll be
> hurting people by not shutting down the database at the point where we
> do so today doesn't ring true to me.

I can't speak for Chris, but I think that almost everybody will agree
on this much, without really having to think about it. It's easy to
see that having more XID space is, in general, strictly a good thing.
If there was a low risk way of getting that benefit, then I'd be in
favor of it.

Here's the problem that I see with this patch: I don't think that the
risks are commensurate with the benefits. I can imagine being in favor
of an even more invasive patch that (say) totally removes the concept
of freezing, but that would have to be a very different sort of
design.

> Philosophically, I disagree with the idea of shutting down the
> database completely in any situation in which a reasonable alternative
> exists. Losing read and write availability is really bad, and I don't
> think it's what users want.

At a certain point it may make more sense to activate XidStopLimit
protections (which will only prevent new XID allocations) instead of
getting further behind on freezing, even in a world where we're never
strictly obligated to activate XidStopLimit. It may in fact be the
lesser evil, even with 64-bit XIDs -- because we still have to freeze,
and the factors that drive when and how we freeze mostly aren't
changed.

Fundamentally, when we're falling behind on freezing, at a certain
point we can expect to keep falling behind -- unless some kind of
major shift happens. That's just how freezing works, with or without
64-bit XIDs/MXIDs. If VACUUM isn't keeping up with the allocation of
transactions, then the system is probably misconfigured in some way.
We should do our best to signal this as early and as frequently as
possible, and we should mitigate specific hazards (e.g. old temp
tables) if at all possible. We should activate the failsafe when
things really start to look dicey (which, incidentally, the patch just
removes). These mitigations may be very effective, but in the final
analysis they don't address the fundamental weakness in freezing.

Granted, the specifics of the current XidStopLimit mechanism are
unlikely to directly carry over to 64-bit XIDs. XidStopLimit is
structured in a way that doesn't actually consider freeze debt in
units like unfrozen pages. Like Chris, I just don't see why the patch
obviates the need for something like XidStopLimit, since the patch
doesn't remove freezing. An improved XidStopLimit mechanism might even
end up kicking in *before* the oldest relfrozenxid reached 2 billion
XIDs, depending on the specifics.

Removing the failsafe mechanism seems misguided to me for similar
reasons. I recently learned that Amazon RDS has set a *lower*
vacuum_failsafe_age default than the standard default (its default of
1.6 billion to only 1.2 billion on RDS). This decision predates my
joining AWS. It seems as if practical experience has shown that
allowing any table's age(relfrozenxid) to get too far past a billion
is not a good idea. At least it's not a good idea on modern Postgres
versions, that have the freeze map.

We really shouldn't have to rely on having billions of XIDs available
in the first place -- XID space isn't really a fungible commodity.
It's much more important to recognize that something (some specific
thing) just isn't working as designed, which in general could be
pretty far removed from freezing. For example, index corruption could
do it (at least without the failsafe). Some kind of autovacuum
starvation could do it. It's almost always more complicated than "not
enough available XID space".

-- 
Peter Geoghegan

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-28 Thread Robert Haas

On Sat, Nov 26, 2022 at 4:08 AM Chris Travers  wrote:
> I didn't see any changes to pg_upgrade to make this change possible on 
> upgrade.  Is that also outside of the scope of your patch set?  If so how is 
> that continuity supposed to be ensured?

The scheme is documented in their 0006 patch, in a README.XID file.
I'm not entirely confident that it's the best design and have argued
against it in the past, but it's not crazy.

More generally, while I think there's plenty of stuff to be concerned
about in this patch set and while I'm somewhat skeptical about the
likelihood of its getting or staying committed, I can't really
understand your concerns in particular. The thrust of your concern
seems to be that if we allow people to get further behind, recovery
will be more difficult. I'm not sure I see the problem. Suppose that
we adopt this proposal and that it is bug-free. Now, consider a user
who gets 8 billion XIDs behind. They probably have to vacuum pretty
much every page in the database to do that, or least every page in the
tables that haven't been vacuumed recently. But that would likely also
be true if they were 800 million XIDs behind, as is possible today.
The effort to catch up doesn't increase linearly with how far behind
you are, and is always bounded by the DB size.

It is true that if the table is progressively bloating, it is likely
to be more bloated by the time you are 8 billion XIDs behind than it
was when you were 800 million XIDs behind. I don't see that as a very
good reason not to adopt this patch, because you can bloat the table
by an arbitrarily large amount while consuming only a small number of
XiDs, even just 1 XID. Protecting against bloat is good, but shutting
down the database when the XID age reaches a certain value is not a
particularly effective way of doing that, so saying that we'll be
hurting people by not shutting down the database at the point where we
do so today doesn't ring true to me. I think that most people who get
to the point of wraparound shutdown have workloads where bloat isn't a
huge issue, because those who do start having problems with the bloat
way before they run out of XIDs.

It would be entirely possible to add a parameter to the system that
says "hey, you know we can keep running even if we're a shazillion
XiDs behind, but instead shut down when we are behind by this number
of XIDs." Then, if somebody wants to force an automatic shutdown at
that point, they could, and I think that then the scenario you're
worried about just can't happen any more . But isn't that a little bit
silly? You could also just monitor how far behind you are and page the
DBA when you get behind by more than a certain number of XIDs. Then,
you wouldn't be risking a shutdown, and you'd still be able to stay on
top of the XID ages of your tables.

Philosophically, I disagree with the idea of shutting down the
database completely in any situation in which a reasonable alternative
exists. Losing read and write availability is really bad, and I don't
think it's what users want. I think that most users want the database
to degrade gracefully when things do not go according to plan.
Ideally, they'd like everything to Just Work, but reasonable users
understand that sometimes there are going to be problems, and in my
experience, what makes them happy is when the database acts to contain
the scope of the problem so that it affects their workload as little
as possible, rather than acting to magnify the problem so that it
impacts their workload as much as possible. This patch, implementation
and design concerns to one side, does that.

I don't believe there's a single right answer to the question of what
to do about vacuum falling behind, and I think it's worth exploring
multiple avenues to improve the situation. You can have vacuum never
run on a table at all, say because all of the workers are busy
elsewhere, or because the table is locked until the heat death of the
universe. You can have vacuum run on a table but too slowly to do any
good, because of the vacuum cost delay mechanism. You can have vacuum
run and finish but do little good because of prepared transactions or
replication slots or long-running queries. It's reasonable to think
about what kinds of steps might help in those different scenarios, and
especially to think about what kind of steps might help in multiple
cases. We should do that. But, I don't think any of that means that we
can ignore the need for some kind of expansion of the XID space
forever. Computers are getting faster. It's already possible to burn
through the XID space in hours, and the number of hours is going to go
down over time and maybe eventually the right unit will be minutes, or
even seconds. Sometime before then, we need to do something to make
the runway bigger, or else just give up on PostgreSQL being a relevant
piece of software.

Perhaps the thing we need to do is not exactly this, but if not, it's
probably a sibling or cousin of this.

--

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-26 Thread Chris Travers

Hi;

Trying to discuss where we are talking past eachother.

On Fri, Nov 25, 2022 at 9:38 AM Aleksander Alekseev <
aleksan...@timescale.com> wrote:

> Hi hackers,
>
> > I'm wondering whether the safest way to handle this is by creating a
> > new TAM called "heap64", so that all storage changes happens there.
>
> > Many current users see stability as one of the greatest strengths of
> > Postgres, so while I very much support this move, I wonder if this
> > gives us a way to have both stability and innovation at the same time?
>
> That would be nice.
>
> However from what I see TransactionId is a type used globally in
> PostgresSQL. It is part of structures used by TAM interface, used in
> WAL records, etc. So we will have to learn these components to work
> with 64-bit XIDs anyway and then start thinking about cases like: when
> a user runs a transaction affecting two tables, a heap32 one and
> heap64 one and we will have to figure out which tuples are visible and
> which are not. This perhaps is doable but the maintenance burden for
> the project will be too high IMO.
>
> It seems to me that the best option we can offer for the users looking
> for stability is to use the latest PostgreSQL version with 32-bit
> XIDs. Assuming these users care that much about this particular design
> choice of course.
>

I didn't see any changes to pg_upgrade to make this change possible on
upgrade.  Is that also outside of the scope of your patch set?  If so how
is that continuity supposed to be ensured?

Also related to that, I think you would have to have a check on streaming
replication that both instances use the same xid format (that you don't
accidently upgrade this somehow), since this is set per db cluster, right?

>
> > The whole project seems to just ignore basic, pertinent questions.
> > Questions like: why are we falling behind like this in the first
> > place? And: If we don't catch up soon, why should we be able to catch
> > up later on? Falling behind on freezing is still a huge problem with
> > 64-bit XIDs.
>
> Is the example I provided above wrong?
>
> """
> Consider the case when you run a slow OLAP query that takes 12h to
> complete and 100K TPS of fast OLTP-type queries on the same system.
> The fast queries will consume all 32-bit XIDs in less than 12 hours,
> while the OLAP query started 12 hours ago didn't finish yet and thus
> its tuples can't be frozen.
> """
>
> If it is, please let me know. I would very much like to know if my
> understanding here is flawed.
>

So, you have described a scenario we cannot support today (because xids
would be exhausted within 5.5 hours at that transactional rate).
Additionally as PostgreSQL becomes more capable, this sort of scale will
increasingly be within reach and that is an important point in favor of
this effort.

This being said, there is another set of xid wraparound cases which today
is much larger in number that I think would be hurt if this patchset were
to be accepted into Postgres without mitigating measures which you consider
out of bounds -- the cases like Mailchimp, Adjust, and the like.  This is
why I keep stressing this, and I don't think waiving away concerns about
use cases outside of the one you are focusing on is helpful, particularly
from those of us who have faced xid wraparounds in these cases in the
past.  In these cases, database teams are usually faced with an operational
emergency while tools like vacuum, pg_repack, etc are severely degraded due
to getting so far behind on freezing.  The deeper the hole, the harder it
will be to dig out of.

Every large-scale high-throughput database I have ever worked on had
long-running query alerts precisely because of the impact on vacuum and the
downstream performance impacts.   I would love to get to a point where this
wasn't necessary and maybe in a few specific workloads we might be there
very soon.  The effort you are engaging in here is an important part of the
path to get there, but let's not forget the people who today are facing xid
wraparounds due to vacuum problems and what this sort of set of changes
will mean for them.

-- 
> Best regards,
> Aleksander Alekseev
>
>
>

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-25 Thread Aleksander Alekseev

Hi hackers,

> I'm wondering whether the safest way to handle this is by creating a
> new TAM called "heap64", so that all storage changes happens there.

> Many current users see stability as one of the greatest strengths of
> Postgres, so while I very much support this move, I wonder if this
> gives us a way to have both stability and innovation at the same time?

That would be nice.

However from what I see TransactionId is a type used globally in
PostgresSQL. It is part of structures used by TAM interface, used in
WAL records, etc. So we will have to learn these components to work
with 64-bit XIDs anyway and then start thinking about cases like: when
a user runs a transaction affecting two tables, a heap32 one and
heap64 one and we will have to figure out which tuples are visible and
which are not. This perhaps is doable but the maintenance burden for
the project will be too high IMO.

It seems to me that the best option we can offer for the users looking
for stability is to use the latest PostgreSQL version with 32-bit
XIDs. Assuming these users care that much about this particular design
choice of course.

> The whole project seems to just ignore basic, pertinent questions.
> Questions like: why are we falling behind like this in the first
> place? And: If we don't catch up soon, why should we be able to catch
> up later on? Falling behind on freezing is still a huge problem with
> 64-bit XIDs.

Is the example I provided above wrong?

"""
Consider the case when you run a slow OLAP query that takes 12h to
complete and 100K TPS of fast OLTP-type queries on the same system.
The fast queries will consume all 32-bit XIDs in less than 12 hours,
while the OLAP query started 12 hours ago didn't finish yet and thus
its tuples can't be frozen.
"""

If it is, please let me know. I would very much like to know if my
understanding here is flawed.

-- 
Best regards,
Aleksander Alekseev

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-24 Thread Peter Geoghegan

On Sun, Nov 20, 2022 at 11:58 PM Chris Travers  wrote:
> I can start by saying I think it would be helpful (if the other issues are 
> approached reasonably) to have 64-bit xids, but there is an important piece 
> of context in reventing xid wraparounds that seems missing from this patch 
> unless I missed something.
>
> XID wraparound is a symptom, not an underlying problem.  It usually occurs 
> when autovacuum or other vacuum strategies have unexpected stalls and 
> therefore fail to work as expected.  Shifting to 64-bit XIDs dramatically 
> changes the sorts of problems that these stalls are likely to pose to 
> operational teams.  -- you can find you are running out of storage rather 
> than facing an imminent database shutdown.  Worse, this patch delays the 
> problem until some (possibly far later!) time, when vacuum will take far 
> longer to finish, and options for resolving the problem are diminished.  As a 
> result I am concerned that merely changing xids from 32-bit to 64-bit will 
> lead to a smaller number of far more serious outages.

This is exactly what I think (except perhaps for the part about having
fewer outages overall). The more transaction ID space you need, the
more space you're likely to need in the near future.

We can all agree that having more runway is better than having less
runway, at least in some abstract sense, but that in itself doesn't
help the patch series very much. The first time the system-level
oldestXid (or database level datminfrozenxid) attains an age of 2
billion XIDs will usually *also* be the first time it attains an age
of (say) 300 million XIDs. Even 300 million is usually a huge amount
of XID space relative to (say) the number of XIDs used every 24 hours.
So I know exactly what you mean about just addressing a symptom.

The whole project seems to just ignore basic, pertinent questions.
Questions like: why are we falling behind like this in the first
place? And: If we don't catch up soon, why should we be able to catch
up later on? Falling behind on freezing is still a huge problem with
64-bit XIDs.

Part of the problem with the current design is that table age has
approximately zero relationship with the true cost of catching up on
freezing -- we are "using the wrong units", in a very real sense. In
general we may have to do zero freezing to advance a table's
relfrozenxid age by a billion XIDs, or we might have to write
terabytes of FREEZE_PAGE records to advance a similar looking table's
relfrozenxid by just one single XID (it could also swing wildly over
time for the same table). Which the system simply doesn't try to
reason about right now.

There are no settings for freezing that use physical units, and frame
the problem as a problem of being behind by this many unfrozen pages
(they are all based on XID age). And so the problem with letting table
age get into the billions isn't even that we'll never catch up -- we
actually might catch up very easily! The real problem is that we have
no way of knowing ahead of time (at least not right now). VACUUM
should be managing the debt, *and* the uncertainty about how much debt
we're really in. VACUUM needs to have a dynamic, probabilistic
understanding of what's really going on -- something much more
sophisticated than looking at table age in autovacuum.c.

One reason why you might want to advance relfrozenxid proactively is
to give the system a better general sense of the true relationship
between logical XID space and physical freezing for a given table and
workload -- it gives a clearer picture about the conditions in the
table. The relationship between SLRU space and physical heap pages and
the work of freezing is made somewhat clearer by a more proactive
approach to advancing relfrozenxid. That's one way that VACUUM can
lower the uncertainty I referred to.

-- 
Peter Geoghegan

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-24 Thread Simon Riggs

On Fri, 21 Oct 2022 at 17:09, Maxim Orlov  wrote:

> Reviews and opinions are very welcome!

I'm wondering whether the safest way to handle this is by creating a
new TAM called "heap64", so that all storage changes happens there.
(Obviously there are still many other changes in core, but they are
more easily fixed).

That would reduce the code churn around "heap", allowing us to keep it
stable while we move to the brave new world.

Many current users see stability as one of the greatest strengths of
Postgres, so while I very much support this move, I wonder if this
gives us a way to have both stability and innovation at the same time?

-- 
Simon Riggshttp://www.EnterpriseDB.com/

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-24 Thread Aleksander Alekseev

Hi Chris,

> XID wraparound doesn't happen to healthy databases
> If you disagree, I would like to hear why.

Consider the case when you run a slow OLAP query that takes 12h to
complete and 100K TPS of fast OLTP-type queries on the same system.
The fast queries will consume all 32-bit XIDs in less than 12 hours,
while the OLAP query started 12 hours ago didn't finish yet and thus
its tuples can't be frozen.

> I agree that 64-bit xids are a good idea.  I just don't think that existing 
> safety measures should be ignored or reverted.

Fair enough.

> The problem isn't just the lack of disk space, but the difficulty that stuck 
> autovacuum runs pose in resolving the issue.  Keep in mind that everything 
> you can do to reclaim disk space (vacuum full, cluster, pg_repack) will be 
> significantly slowed down by an extremely bloated table/index comparison.  
> The problem is that if you are running out of disk space, and your time to 
> recovery much longer than expected, then you have a major problem.  It's not 
> just one or the other, but the combination that poses the real risk here.
>
> Now that's fine if you want to run a bloatless table engine but to my 
> knowledge none of these are production-ready yet.  ZHeap seems mostly 
> stalled.  Oriole is still experimental.  But with the current PostgreSQL 
> table structure.
>
> A DBA can monitor disk space, but if the DBA is not also monitoring xid lag, 
> then by the time corrective action is taken it may be too late.

Good point but I still don't think this is related to the XID
wraparound problem.

-- 
Best regards,
Aleksander Alekseev

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-23 Thread Chris Travers

On Tue, Nov 22, 2022 at 10:01 AM Aleksander Alekseev <
aleksan...@timescale.com> wrote:

> Hi Chris,
>
> > Right now the way things work is:
> > 1.  Database starts throwing warnings that xid wraparound is approaching
> > 2.  Database-owning team initiates an emergency response, may take
> downtime or degradation of services as a result
> > 3.  People get frustrated with PostgreSQL because this is a reliability
> problem.
> >
> > What I am worried about is:
> > 1.  Database is running out of space
> > 2.  Database-owning team initiates an emergency response and takes more
> downtime to into a good spot
> > 3.  People get frustrated with PostgreSQL because this is a reliability
> problem.
> >
> > If that's the way we go, I don't think we've solved that much.  And as
> humans we also bias our judgments towards newsworthy events, so rarer, more
> severe problems are a larger perceived problem than the more routine, less
> severe problems.  So I think our image as a reliable database would suffer.
> >
> > An ideal resolution from my perspective would be:
> > 1.  Database starts throwing warnings that xid lag has reached severely
> abnormal levels
> > 2.  Database owning team initiates an effort to correct this, and does
> not take downtime or degradation of services as a result
> > 3.  People do not get frustrated because this is not a reliability
> problem anymore.
> >
> > Now, 64-big xids are necessary to get us there but they are not
> sufficient.  One needs to fix the way we handle this sort of problem.
> There is existing logic to warn if we are approaching xid wraparound.  This
> should be changed to check how many xids we have used rather than remaining
> and have a sensible default there (optionally configurable).
> >
> > I agree it is not vacuum's responsibility.  It is the responsibility of
> the current warnings we have to avoid more serious problems arising from
> this change.  These should just be adjusted rather than dropped.
>
> I disagree with the axiom that XID wraparound is merely a symptom and
> not a problem.
>

XID wraparound doesn't happen to healthy databases, nor does it happen to
databases actively monitoring this possibility.  The cases where it
happens, two circumstances are present:

1.  Autovacuum is stalled, and
2.  Monitoring is not checking for xid lag (which would be fixed by
autovacuum if it were running properly).

XID wraparound is downstream of those problems.   At least that is my
experience.  If you disagree, I would like to hear why.

Additionally those problems still will cause worse outages with this change
unless there are some mitigating measures in place.  If you don't like my
proposal, I would be open to other mitigating measures.  But I think there
need to be mitigating measures in a change like this.

>
> Using 32-bit XIDs was a reasonable design decision back when disk
> space was limited and disks were slow. The drawback of this approach
> is the need to do the wraparound but agaig back then it was a
> reasonable design choice. If XIDs were 64-bit from the beginning users
> could run one billion (1,000,000,000) TPS for 584 years without a
> wraparound. We wouldn't have it similarly as there is no wraparound
> for WAL segments. Now when disks are much faster and much cheaper
> 32-bit XIDs are almost certainly not a good design choice anymore.
> (Especially considering the fact that this particular patch mitigates
> the problem of increased disk consumption greatly.)
>

I agree that 64-bit xids are a good idea.  I just don't think that existing
safety measures should be ignored or reverted.

>
> Also I disagree with an argument that a DBA that doesn't monitor disk
> space would care much about some strange warnings in the logs. If a
> DBA doesn't monitor basic system metrics I'm afraid we can't help this
> person much.
>

The problem isn't just the lack of disk space, but the difficulty that
stuck autovacuum runs pose in resolving the issue.  Keep in mind that
everything you can do to reclaim disk space (vacuum full, cluster,
pg_repack) will be significantly slowed down by an extremely bloated
table/index comparison.  The problem is that if you are running out of disk
space, and your time to recovery much longer than expected, then you have a
major problem.  It's not just one or the other, but the combination that
poses the real risk here.

Now that's fine if you want to run a bloatless table engine but to my
knowledge none of these are production-ready yet.  ZHeap seems mostly
stalled.  Oriole is still experimental.  But with the current PostgreSQL
table structure.

A DBA can monitor disk space, but if the DBA is not also monitoring xid
lag, then by the time corrective action is taken it may be too late.

> I do agree that we could probably provide some additional help for the
> rest of the users when it comes to configuring VACUUM. This is indeed
> non-trivial. However I don't think this is in scope of this particular
> patchset. I suggest we keep the focus

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-22 Thread Dilip Kumar

On Tue, Nov 22, 2022 at 7:44 PM Aleksander Alekseev
 wrote:
>
> Hi hackers,
>
> [ Excluding my personal e-mail from cc:, not sure how it got there.
> Please don't cc: to afis...@gmail.com, I'm not using it for reading
> pgsql-hackers@. ]
>
> > I agree with Alexander, that notifications for DBA are a little bit
> > outside the scope of the activity in this thread unless we've just
> > dropped some existing notifications, considering they're not
> > significant anymore. If that was the point, please Chris mention what
> > existing notifications you want to return. I don't think it's a big
> > deal to have the patch with certain notifications inherited from
> > Master branch.
>
> To clarify a bit: currently we DO notify the user about the upcoming
> wraparound point [1]:
>
> """
> If for some reason autovacuum fails to clear old XIDs from a table,
> the system will begin to emit warning messages like this when the
> database's oldest XIDs reach forty million transactions from the
> wraparound point:
>
> WARNING:  database "mydb" must be vacuumed within 39985967 transactions
> HINT:  To avoid a database shutdown, execute a database-wide VACUUM in
> that database.
> """
>
> So I'm not sure how the notification Chris proposes should differ or
> why it is in scope of this patch. If the point was to make sure
> certain existing notifications will be preserved - sure, why not.

IMHO, after having 64-bit XID this WARNING doesn't really make sense.
Those warnings exist because those limits were problematic for 32-bit
xid but now it is not so I think we should not have such warnings.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-22 Thread Aleksander Alekseev

Hi hackers,

[ Excluding my personal e-mail from cc:, not sure how it got there.
Please don't cc: to afis...@gmail.com, I'm not using it for reading
pgsql-hackers@. ]

> I agree with Alexander, that notifications for DBA are a little bit
> outside the scope of the activity in this thread unless we've just
> dropped some existing notifications, considering they're not
> significant anymore. If that was the point, please Chris mention what
> existing notifications you want to return. I don't think it's a big
> deal to have the patch with certain notifications inherited from
> Master branch.

To clarify a bit: currently we DO notify the user about the upcoming
wraparound point [1]:

"""
If for some reason autovacuum fails to clear old XIDs from a table,
the system will begin to emit warning messages like this when the
database's oldest XIDs reach forty million transactions from the
wraparound point:

WARNING:  database "mydb" must be vacuumed within 39985967 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in
that database.
"""

So I'm not sure how the notification Chris proposes should differ or
why it is in scope of this patch. If the point was to make sure
certain existing notifications will be preserved - sure, why not.

[1]: 
https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND

-- 
Best regards,
Aleksander Alekseev

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-22 Thread Pavel Borisov

Hi, Alexander!

On Tue, 22 Nov 2022 at 13:01, Aleksander Alekseev
 wrote:
>
> Hi Chris,
>
> > Right now the way things work is:
> > 1.  Database starts throwing warnings that xid wraparound is approaching
> > 2.  Database-owning team initiates an emergency response, may take downtime 
> > or degradation of services as a result
> > 3.  People get frustrated with PostgreSQL because this is a reliability 
> > problem.
> >
> > What I am worried about is:
> > 1.  Database is running out of space
> > 2.  Database-owning team initiates an emergency response and takes more 
> > downtime to into a good spot
> > 3.  People get frustrated with PostgreSQL because this is a reliability 
> > problem.
> >
> > If that's the way we go, I don't think we've solved that much.  And as 
> > humans we also bias our judgments towards newsworthy events, so rarer, more 
> > severe problems are a larger perceived problem than the more routine, less 
> > severe problems.  So I think our image as a reliable database would suffer.
> >
> > An ideal resolution from my perspective would be:
> > 1.  Database starts throwing warnings that xid lag has reached severely 
> > abnormal levels
> > 2.  Database owning team initiates an effort to correct this, and does not 
> > take downtime or degradation of services as a result
> > 3.  People do not get frustrated because this is not a reliability problem 
> > anymore.
> >
> > Now, 64-big xids are necessary to get us there but they are not sufficient. 
> >  One needs to fix the way we handle this sort of problem.  There is 
> > existing logic to warn if we are approaching xid wraparound.  This should 
> > be changed to check how many xids we have used rather than remaining and 
> > have a sensible default there (optionally configurable).
> >
> > I agree it is not vacuum's responsibility.  It is the responsibility of the 
> > current warnings we have to avoid more serious problems arising from this 
> > change.  These should just be adjusted rather than dropped.
>
> I disagree with the axiom that XID wraparound is merely a symptom and
> not a problem.
>
> Using 32-bit XIDs was a reasonable design decision back when disk
> space was limited and disks were slow. The drawback of this approach
> is the need to do the wraparound but agaig back then it was a
> reasonable design choice. If XIDs were 64-bit from the beginning users
> could run one billion (1,000,000,000) TPS for 584 years without a
> wraparound. We wouldn't have it similarly as there is no wraparound
> for WAL segments. Now when disks are much faster and much cheaper
> 32-bit XIDs are almost certainly not a good design choice anymore.
> (Especially considering the fact that this particular patch mitigates
> the problem of increased disk consumption greatly.)
>
> Also I disagree with an argument that a DBA that doesn't monitor disk
> space would care much about some strange warnings in the logs. If a
> DBA doesn't monitor basic system metrics I'm afraid we can't help this
> person much.
>
> I do agree that we could probably provide some additional help for the
> rest of the users when it comes to configuring VACUUM. This is indeed
> non-trivial. However I don't think this is in scope of this particular
> patchset. I suggest we keep the focus in this discussion. If you have
> a concrete proposal please consider starting a new thread.
>
> This at least is my personal opinion. Let's give the rest of the
> community a chance to share their thoughts.

I agree with Alexander, that notifications for DBA are a little bit
outside the scope of the activity in this thread unless we've just
dropped some existing notifications, considering they're not
significant anymore. If that was the point, please Chris mention what
existing notifications you want to return. I don't think it's a big
deal to have the patch with certain notifications inherited from
Master branch.

Kind regards,
Pavel Borisov
Supabase.

Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-22 Thread Aleksander Alekseev

Hi Chris,

> Right now the way things work is:
> 1.  Database starts throwing warnings that xid wraparound is approaching
> 2.  Database-owning team initiates an emergency response, may take downtime 
> or degradation of services as a result
> 3.  People get frustrated with PostgreSQL because this is a reliability 
> problem.
>
> What I am worried about is:
> 1.  Database is running out of space
> 2.  Database-owning team initiates an emergency response and takes more 
> downtime to into a good spot
> 3.  People get frustrated with PostgreSQL because this is a reliability 
> problem.
>
> If that's the way we go, I don't think we've solved that much.  And as humans 
> we also bias our judgments towards newsworthy events, so rarer, more severe 
> problems are a larger perceived problem than the more routine, less severe 
> problems.  So I think our image as a reliable database would suffer.
>
> An ideal resolution from my perspective would be:
> 1.  Database starts throwing warnings that xid lag has reached severely 
> abnormal levels
> 2.  Database owning team initiates an effort to correct this, and does not 
> take downtime or degradation of services as a result
> 3.  People do not get frustrated because this is not a reliability problem 
> anymore.
>
> Now, 64-big xids are necessary to get us there but they are not sufficient.  
> One needs to fix the way we handle this sort of problem.  There is existing 
> logic to warn if we are approaching xid wraparound.  This should be changed 
> to check how many xids we have used rather than remaining and have a sensible 
> default there (optionally configurable).
>
> I agree it is not vacuum's responsibility.  It is the responsibility of the 
> current warnings we have to avoid more serious problems arising from this 
> change.  These should just be adjusted rather than dropped.

I disagree with the axiom that XID wraparound is merely a symptom and
not a problem.

Using 32-bit XIDs was a reasonable design decision back when disk
space was limited and disks were slow. The drawback of this approach
is the need to do the wraparound but agaig back then it was a
reasonable design choice. If XIDs were 64-bit from the beginning users
could run one billion (1,000,000,000) TPS for 584 years without a
wraparound. We wouldn't have it similarly as there is no wraparound
for WAL segments. Now when disks are much faster and much cheaper
32-bit XIDs are almost certainly not a good design choice anymore.
(Especially considering the fact that this particular patch mitigates
the problem of increased disk consumption greatly.)

Also I disagree with an argument that a DBA that doesn't monitor disk
space would care much about some strange warnings in the logs. If a
DBA doesn't monitor basic system metrics I'm afraid we can't help this
person much.

I do agree that we could probably provide some additional help for the
rest of the users when it comes to configuring VACUUM. This is indeed
non-trivial. However I don't think this is in scope of this particular
patchset. I suggest we keep the focus in this discussion. If you have
a concrete proposal please consider starting a new thread.

This at least is my personal opinion. Let's give the rest of the
community a chance to share their thoughts.

-- 
Best regards,
Aleksander Alekseev

1 2 3 >

1 - 100 of 223 matches

Mail list logo