pgsql: Rename io_direct to debug_io_direct.

2023-05-14 Thread Thomas Munro
Rename io_direct to debug_io_direct. Give the new GUC introduced by d4e71df6 a name that is clearly not intended for mainstream use quite yet. Future proposals would drop the prefix only after adding infrastructure to make it efficient. Having the switch in the tree sooner is good because it

Re: Large files for relations

2023-05-12 Thread Thomas Munro
On Sat, May 13, 2023 at 11:01 AM Thomas Munro wrote: > On Sat, May 13, 2023 at 4:41 AM MARK CALLAGHAN wrote: > > use XFS and O_DIRECT As for direct I/O, we're only just getting started on that. We currently can't produce more than one concurrent WAL write, and then for relation data

Re: Large files for relations

2023-05-12 Thread Thomas Munro
On Sat, May 13, 2023 at 4:41 AM MARK CALLAGHAN wrote: > Repeating what was mentioned on Twitter, because I had some experience with > the topic. With fewer files per table there will be more contention on the > per-inode mutex (which might now be the per-inode rwsem). I haven't read >

Re: smgrzeroextend clarification

2023-05-12 Thread Thomas Munro
On Sat, May 13, 2023 at 6:07 AM Greg Stark wrote: > On Thu, 11 May 2023 at 05:37, Peter Eisentraut > wrote: > > Maybe it was never meant that way and only works accidentally? Maybe > > hash indexes are broken? > > It's explicitly documented to be this way. And I think it has to work > this way

Re: Large files for relations

2023-05-11 Thread Thomas Munro
On Fri, May 12, 2023 at 8:16 AM Jim Mlodgenski wrote: > On Mon, May 1, 2023 at 9:29 PM Thomas Munro wrote: >> I am not aware of any modern/non-historic filesystem[2] that can't do >> large files with ease. Anyone know of anything to worry about on that >> front? >

Re: [PATCH] Add native windows on arm64 support

2023-05-10 Thread Thomas Munro
On Thu, May 11, 2023 at 11:34 AM Michael Paquier wrote: > On Wed, May 10, 2023 at 09:24:39AM -0400, Andrew Dunstan wrote: > > We will definitely want buildfarm support. I don't have such a machine and > > am not likely to have one any time soon. I do run drongo and fairywren on an > > EC2

Unlinking Parallel Hash Join inner batch files sooner

2023-05-09 Thread Thomas Munro
nk i22of32.p2.0 93662: unlink i5of32.p0.0 93662: unlink o20of32.p0.0 93662: unlink i24of32.p0.0 93662: unlink o18of32.p1.0 93662: unlink o17of32.p1.0 93662: unlink i13of32.p1.0 93662: unlink o30of32.p0.0 93662: unlink o5of32.p1.0 From 660ee4b9f7ba6c08cc8bc00b18bdbe6c83eb581b Mon Sep 17 00:00:00 2001 Fr

Re: DROP DATABASE is interruptible

2023-05-08 Thread Thomas Munro
On Tue, May 9, 2023 at 3:41 PM Thomas Munro wrote: > I tried out the patch you posted over at [1]. I forgot to add, +1, I think this is a good approach. (I'm still a little embarrassed at how long we spent trying to debug this in the other thread from the supplied clues, when you'd alre

Re: DROP DATABASE is interruptible

2023-05-08 Thread Thomas Munro
I tried out the patch you posted over at [1]. For those wanting an easy way to test it, or test the buggy behaviour in master without this patch, you can simply kill -STOP the checkpointer, so that DROP DATABASE hangs in RequestCheckpoint() (or you could SIGSTOP any other backend so it hangs in

Re: "PANIC: could not open critical system index 2662" - twice

2023-05-08 Thread Thomas Munro
On Tue, May 9, 2023 at 10:04 AM Tom Lane wrote: > Michael Paquier writes: > > One thing I was wondering about to improve the odds of the hits is to > > be more aggressive with the number of relations created at once, so as > > we are much more aggressive with the number of pages extended in > >

Re: "PANIC: could not open critical system index 2662" - twice

2023-05-07 Thread Thomas Munro
On Mon, May 8, 2023 at 2:24 PM Michael Paquier wrote: > I can reproduce the same backtrace here. That's just my usual laptop > with ext4, so this would be a Postgres bug. First, here are the four > things running in parallel so as I can get a failure in loading a > critical index when

Re: "PANIC: could not open critical system index 2662" - twice

2023-05-07 Thread Thomas Munro
On Mon, May 8, 2023 at 4:10 AM Evgeny Morozov wrote: > On 6/05/2023 11:13 pm, Thomas Munro wrote: > > Would you like to try requesting FILE_COPY for a while and see if it > > eventually happens like that too? > Sure, we can try that. Maybe you could do some one way and

Re: "PANIC: could not open critical system index 2662" - twice

2023-05-07 Thread Thomas Munro
On Sun, May 7, 2023 at 1:21 PM Tom Lane wrote: > Thomas Munro writes: > > Did you previously run this same workload on versions < 15 and never > > see any problem? 15 gained a new feature CREATE DATABASE ... > > STRATEGY=WAL_LOG, which is also the default. I wond

Re: "PANIC: could not open critical system index 2662" - twice

2023-05-06 Thread Thomas Munro
On Sun, May 7, 2023 at 10:23 AM Jeffrey Walton wrote: > This may be related... I seem to recall the GNUlib folks talking about > a cp bug on sparse files. It looks like it may be fixed in coreutils > release 9.2 (2023-03-20): > https://github.com/coreutils/coreutils/blob/master/NEWS#L233 > > If I

Re: "PANIC: could not open critical system index 2662" - twice

2023-05-06 Thread Thomas Munro
On Sun, May 7, 2023 at 12:29 AM Evgeny Morozov wrote: > On 6/05/2023 12:34 pm, Thomas Munro wrote: > > So it does indeed look like something unknown has replaced 32KB of > > data with 32KB of zeroes underneath us. Are there more non-empty > > files that are all-zeroes? Some

Re: "PANIC: could not open critical system index 2662" - twice

2023-05-06 Thread Thomas Munro
On Sat, May 6, 2023 at 9:58 PM Evgeny Morozov wrote: > Right - I should have realised that! base/1414389/2662 is indeed all > nulls, 32KB of them. I included the file anyway in > https://objective.realityexists.net/temp/pgstuff2.zip OK so it's not just page 0, you have 32KB or 4 pages of all

Re: "PANIC: could not open critical system index 2662" - twice

2023-05-05 Thread Thomas Munro
On Fri, May 5, 2023 at 7:50 PM Evgeny Morozov wrote: > The OID of the bad DB ('test_behavior_638186279733138190') is 1414389 and > I've uploaded base/1414389/pg_filenode.map and also base/5/2662 (in case > that's helpful) as https://objective.realityexists.net/temp/pgstuff1.zip Thanks. That

Re: "PANIC: could not open critical system index 2662" - twice

2023-05-04 Thread Thomas Munro
On Fri, May 5, 2023 at 11:15 AM Thomas Munro wrote: > What does select > pg_relation_filepath('pg_class_oid_index') show in the corrupted > database, base/5/2662 or something else? Oh, you can't get that far, but perhaps you could share the pg_filenode.map file? Or alternatively

Re: "PANIC: could not open critical system index 2662" - twice

2023-05-04 Thread Thomas Munro
On Fri, May 5, 2023 at 11:15 AM Thomas Munro wrote: > Now *that* is a piece of > logic that changed in PostgreSQL 15. It changed from sector-based > atomicity assumptions to a directory entry swizzling trick, in commit > d8cd0c6c95c0120168df93aae095df4e0682a08a. Hmm. I sp

Re: "PANIC: could not open critical system index 2662" - twice

2023-05-04 Thread Thomas Munro
On Fri, May 5, 2023 at 6:11 AM Evgeny Morozov wrote: > Meanwhile, what do I do with the existing server, though? Just try to > drop the problematic DBs again manually? That earlier link to a FreeBSD thread is surely about bleeding edge new ZFS stuff that was briefly broken then fixed, being

Re: Fsync IO issue

2023-05-04 Thread Thomas Munro
On Fri, May 5, 2023 at 8:37 AM ProfiVPS Support wrote: > I feel like ANYTHING would be better than this. Even risking losing _some_ > of the latest data in case of a server crash (if it crashes we lose data > anyways until restart, ofc we could have HA I know and we will when there'll > be a

Re: Direct I/O

2023-05-03 Thread Thomas Munro
On Wed, Apr 19, 2023 at 7:35 AM Greg Stark wrote: > On Mon, 17 Apr 2023 at 17:45, Thomas Munro wrote: > > (2) without a page cache, you really need to size your shared_buffers > > adequately and we can't do that automatically. > > Well I'm more optimistic... That may not

Re: Large files for relations

2023-05-02 Thread Thomas Munro
On Wed, May 3, 2023 at 5:21 PM Thomas Munro wrote: > rsync --link-dest I wonder if rsync will grow a mode that can use copy_file_range() to share blocks with a reference file (= previous backup). Something like --copy-range-dest. That'd work for large-file relations (assuming a file sys

Re: Large files for relations

2023-05-02 Thread Thomas Munro
On Tue, May 2, 2023 at 3:28 PM Pavel Stehule wrote: > I like this patch - it can save some system sources - I am not sure how much, > because bigger tables usually use partitioning usually. Yeah, if you only use partitions of < 1GB it won't make a difference. Larger partitions are not uncommon,

Re: Autogenerate some wait events code and documentation

2023-05-01 Thread Thomas Munro
> [patch] This is not a review of the perl/make/meson glue/details, but I just wanted to say thanks for working on this Bertrand & Michael, at a quick glance that .txt file looks like it's going to be a lot more fun to maintain!

Large files for relations

2023-05-01 Thread Thomas Munro
for a while [1] https://wiki.postgresql.org/wiki/AllComputers [2] https://en.wikipedia.org/wiki/Comparison_of_file_systems From b4b6f27af1d196f9d6b3b8d599121cf2900f Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Mon, 24 Apr 2023 18:04:43 +1200 Subject: [PATCH 01/11] Assert that pgoff_t is wide enough. O

Re: Direct I/O

2023-04-30 Thread Thomas Munro
On Mon, May 1, 2023 at 12:00 PM Tom Lane wrote: > Justin Pryzby writes: > > On Sun, Apr 30, 2023 at 06:35:30PM +1200, Thomas Munro wrote: > >> What about a > >> warning message about that at startup if it's on? > > > Such a warning wouldn't be particula

Re: Direct I/O

2023-04-30 Thread Thomas Munro
On Sun, Apr 30, 2023 at 6:35 PM Thomas Munro wrote: > On Sun, Apr 30, 2023 at 4:11 PM Noah Misch wrote: > > Speaking of the developer-only status, I find the io_direct name more > > enticing > > than force_parallel_mode, which PostgreSQL renamed due to overuse from > >

Re: Direct I/O

2023-04-30 Thread Thomas Munro
On Sun, Apr 30, 2023 at 4:11 PM Noah Misch wrote: > Speaking of the developer-only status, I find the io_direct name more enticing > than force_parallel_mode, which PostgreSQL renamed due to overuse from people > expecting non-developer benefits. Should this have a name starting with > debug_?

Re: could not extend file "base/5/3501" with FileFallocate(): Interrupted system call

2023-04-25 Thread Thomas Munro
On Tue, Apr 25, 2023 at 12:16 PM Andres Freund wrote: > On 2023-04-24 15:32:25 -0700, Andres Freund wrote: > > We obviously can add a retry loop to FileFallocate(), similar to what's > > already present e.g. in FileRead(). But I wonder if we shouldn't go a bit > > further, and do it for all the
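
The retry loop mentioned in the quoted message above can be sketched roughly as follows. This is only an illustration, not PostgreSQL's FileFallocate(): the helper name fallocate_retry is hypothetical, and the sketch assumes a POSIX system where posix_fallocate() reports failure through its return value rather than through errno.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not PostgreSQL code): call posix_fallocate() again
 * whenever it reports that it was interrupted by a signal.  Note that
 * posix_fallocate() returns an error number directly instead of setting
 * errno.
 */
static int
fallocate_retry(int fd, off_t offset, off_t len)
{
    int rc;

    assert(len > 0);            /* posix_fallocate() rejects len == 0 */

    do
    {
        rc = posix_fallocate(fd, offset, len);
    } while (rc == EINTR);

    return rc;                  /* 0 on success, otherwise an error number */
}
```

A caller would treat any non-zero result as a real failure, since EINTR has already been absorbed by the loop.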

pgsql: Remove bogus #include added by d4e71df6d75.

2023-04-25 Thread Thomas Munro
Remove bogus #include added by d4e71df6d75. The recently added inclusion of guc.h in smgr.h is not necessary and introduces more server-related stuff. Removing the directive helps avoid potential issues with including smgr.h in frontends. Author: Kyotaro Horiguchi Discussion:

Re: seemingly useless #include recently added

2023-04-24 Thread Thomas Munro
On Tue, Apr 25, 2023 at 3:12 PM Tom Lane wrote: > Kyotaro Horiguchi writes: > > While working on a patch, I noticed that a recent commit (d4e71df6d75) > > added an apparently unnecessary inclusion of guc.h in smgr.h. > > Yes, that seems quite awful, and I also wonder why it changed fd.h. > Adding

Bufferless buffered files

2023-04-22 Thread Thomas Munro
similar for SharedTuplestore's internal chunk buffer (for PHJ)). From 7bad3ebe32f2f319fd76c613c8bb04a7f50e3c4f Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Fri, 21 Apr 2023 15:58:02 +1200 Subject: [PATCH 1/2] Skip useless buffering in BufFile. Some callers do small reads and writes (tuples), and b

Re: New committers: Nathan Bossart, Amit Langote, Masahiko Sawada

2023-04-20 Thread Thomas Munro
On Fri, Apr 21, 2023 at 5:40 AM Tom Lane wrote: > The Core Team would like to extend our congratulations to > Nathan Bossart, Amit Langote, and Masahiko Sawada, who have > accepted invitations to become our newest Postgres committers. > > Please join me in wishing them much success and few

Re: check_strxfrm_bug()

2023-04-19 Thread Thomas Munro
On Wed, Apr 19, 2023 at 2:31 PM Jonathan S. Katz wrote: > To be clear, is the proposal to remove both "check_strxfrm_bug" and > "TRUST_STRXFRM"? > > Given a bunch of folks who have expertise in this area of code all agree > with removing the above as part of the collation cleanups targeted for >

pgsql: Remove obsolete defense against strxfrm() bugs.

2023-04-19 Thread Thomas Munro
Remove obsolete defense against strxfrm() bugs. Old versions of Solaris and illumos had buffer overrun bugs in their strxfrm() implementations. The bugs were fixed more than a decade ago and the relevant releases are long out of vendor support. It's time to remove the defense added by commit

Re: pg_collation.collversion for C.UTF-8

2023-04-18 Thread Thomas Munro
On Wed, Apr 19, 2023 at 1:30 PM Jeff Davis wrote: > On Wed, 2023-04-19 at 07:48 +1200, Thomas Munro wrote: > > Many OSes have a locale with this name. I don't know this history, > > who did it first etc, but now I am wondering if they all took the > > "obvious"

Re: check_strxfrm_bug()

2023-04-18 Thread Thomas Munro
On Tue, Apr 18, 2023 at 11:52 AM Michael Paquier wrote: > On Mon, Apr 17, 2023 at 03:40:14PM -0700, Peter Geoghegan wrote: > > +1 for getting rid of TRUST_STRXFRM. +1 The situation is not improving fast, and requires hard work to follow on each OS. Clearly, mainstreaming ICU is the way to go.

Re: pg_collation.collversion for C.UTF-8

2023-04-18 Thread Thomas Munro
On Wed, Apr 19, 2023 at 12:36 AM Daniel Verite wrote: > This seems to be based on the idea that C.* collations provide an > immutable sort like "C", but it appears that it's not the case. Hmm. It seems I added that exemption initially for FreeBSD only in ca051d8b101, and then merged the cases

Re: High QPS, random index writes and vacuum

2023-04-17 Thread Thomas Munro
On Tue, Apr 18, 2023 at 2:43 PM peter plachta wrote: > I was trying to understand whether there are any known workarounds for random > access + index vacuums. Are my vacuum times 'normal' ? Ah, it's not going to help on the old versions you mentioned, but for what it's worth: I remember

Re: Direct I/O

2023-04-17 Thread Thomas Munro
On Tue, Apr 18, 2023 at 4:06 AM Tom Lane wrote: > Robert Haas writes: > > On Sat, Apr 15, 2023 at 2:19 PM Tom Lane wrote: > >> I get the impression that we are going to need an actual runtime > >> test if we want to defend against this. Not entirely convinced > >> it's worth the trouble. Who,

Re: check_strxfrm_bug()

2023-04-16 Thread Thomas Munro
On Sun, Dec 18, 2022 at 10:27 AM Thomas Munro wrote: > With my garbage collector hat on, that made me wonder if there was > some more potential cleanup here: could we require locale_t yet? The > last straggler systems on our target OS list to add the POSIX locale_t > stuff were

Re: Where are we on supporting LLVM's opaque-pointer changes?

2023-04-15 Thread Thomas Munro
On Sat, Apr 15, 2023 at 2:31 AM Tom Lane wrote: > I know we've been letting this topic slide, but we are out of runway. > I propose adding this as a must-fix open item for PG 16. I had a patch that solved many of the problems[1], but it isn't all the way there and I got stuck. I am going to

Re: Direct I/O

2023-04-15 Thread Thomas Munro
On Sun, Apr 16, 2023 at 6:19 AM Tom Lane wrote: > So apparently, the fact that you even get a warning about the > alignment not being honored is something OpenBSD patched in > after-the-fact; it's not there in genuine vintage gcc. Ah, that is an interesting discovery, and indeed kills the

Re: Direct I/O

2023-04-14 Thread Thomas Munro
On Sat, Apr 15, 2023 at 7:50 AM Mikael Kjellström wrote: > want me to switch to clang instead? I vote +1, that's the system compiler in modern OpenBSD. https://www.cambus.net/the-state-of-toolchains-in-openbsd/ As for curculio, I don't understand the motivation for maintaining that machine.

Re: Direct I/O

2023-04-14 Thread Thomas Munro
On Sat, Apr 15, 2023 at 7:38 AM Tom Lane wrote: > Andres Freund writes: > > On 2023-04-14 15:21:18 -0400, Tom Lane wrote: > >> +1 for that, though. (Also, the fact that these animals aren't > >> actually failing suggests that 004_io_direct.pl needs expansion.) > > > It's skipped, due to lack of

Re: OOM in hash join

2023-04-14 Thread Thomas Munro
On Fri, Apr 14, 2023 at 11:43 PM Jehan-Guillaume de Rorthais wrote: > Would you be able to test the latest patchset posted [1] ? This does not fix > the work_mem overflow, but it helps to keep the number of batches > balanced and acceptable. Any feedback, comment or review would be useful. > >

Re: OOM in hash join

2023-04-14 Thread Thomas Munro
On Fri, Apr 14, 2023 at 10:59 PM Konstantin Knizhnik wrote: > Too small value of work_mem cause memory overflow in parallel hash join > because of too much number batches. Yeah. Not only in parallel hash join, but in any hash join (admittedly parallel hash join has higher per-batch overheads;

Build farm breakage over time

2023-04-13 Thread Thomas Munro
Just for fun, I broke time up into 15 minute intervals and counted how many machines were showing red on HEAD at each sample point (lateral join for last tick interpolation of data I collect from the BF), and plotted that over time. See attached. I excluded seawasp (it tells us about *future*

Re: Wrong results from Parallel Hash Full Join

2023-04-13 Thread Thomas Munro
On Thu, Apr 13, 2023 at 12:31 PM Melanie Plageman wrote: > On Wed, Apr 12, 2023 at 6:50 PM Thomas Munro wrote: > > I think "Discussion:" footers are supposed to use > > https://postgr.es/m/XXX shortened URLs. > > Hmm. Is the problem with mine that I inc

pgsql: Fix PHJ match bit initialization.

2023-04-13 Thread Thomas Munro
Fix PHJ match bit initialization. Hash join tuples reuse the HOT status bit to indicate match status during hash join execution. Correct reuse requires clearing the bit in all tuples. Serial hash join and parallel multi-batch hash join do so upon inserting the tuple into the hashtable. Single

Re: Backends stunk in wait event IPC/MessageQueueInternal

2023-04-13 Thread Thomas Munro
On Sun, Aug 28, 2022 at 11:03 AM Thomas Munro wrote: > On Sun, Jun 26, 2022 at 11:18 AM Thomas Munro wrote: > > On Tue, May 17, 2022 at 3:31 PM Thomas Munro wrote: > > > On Mon, May 16, 2022 at 3:45 PM Japin Li wrote: > > > > Maybe use the _

Re: Direct I/O

2023-04-12 Thread Thomas Munro
Thanks both for looking, and thanks for the explanation Ilmari. Pushed with your improvements. The 4 CI systems run the tests (Windows and Mac by special always-expected-to-work case, Linux and FreeBSD by successful pre-flight perl test of O_DIRECT), and I also tested three unusual systems, two

pgsql: Skip the 004_io_direct.pl test if a pre-flight check fails.

2023-04-12 Thread Thomas Munro
Skip the 004_io_direct.pl test if a pre-flight check fails. The test previously had a list of OSes that direct I/O was expected to work on. That worked well enough for the systems in our build farm, but didn't survive contact with the Debian build bots running on tmpfs via overlayfs. tmpfs does

Re: Clean up hba.c of code freeing regexps

2023-04-12 Thread Thomas Munro
On Thu, Apr 13, 2023 at 12:16 PM Michael Paquier wrote: > The logic in hba.c that scans all the HBA and ident lines to any > regexps can be simplified a lot. Most of this code is new in 16~, so > I think that it is worth cleaning up this stuff now rather than wait > for 17 to open for business.

Re: Wrong results from Parallel Hash Full Join

2023-04-12 Thread Thomas Munro
On Thu, Apr 13, 2023 at 9:48 AM Melanie Plageman wrote: > Attached patch includes the fix for ExecParallelHashTableInsert() as > well as a test. I toyed with adapting one of the existing parallel full > hash join tests to cover this case, however, I think Richard's repro is > much more clear.

Re: Parallel Full Hash Join

2023-04-12 Thread Thomas Munro
On Mon, Apr 10, 2023 at 11:33 AM Michael Paquier wrote: > On Sat, Apr 08, 2023 at 02:19:54PM -0400, Melanie Plageman wrote: > > Another worker attached to the batch barrier, saw that it was in > > PHJ_BATCH_SCAN, marked it done and detached. This is fine. > > BarrierArriveAndDetachExceptLast() is

pgsql: Remove overzealous assertion from PHJ.

2023-04-12 Thread Thomas Munro
Remove overzealous assertion from PHJ. We can't assert that we're the only process attached to a barrier after BarrierArriveAndDetachExceptLast(). Although that'll be true almost always, a late-starting parallel worker can attach very briefly (that is, immediately detach after checking the

Re: Direct I/O

2023-04-12 Thread Thomas Munro
On Wed, Apr 12, 2023 at 5:48 PM Thomas Munro wrote: > On Wed, Apr 12, 2023 at 3:04 PM Thomas Munro wrote: > > On Wed, Apr 12, 2023 at 2:56 PM Christoph Berg wrote: > > > I'm hitting a panic in t_004_io_direct. The build is running on > > > overlayfs on tmpfs/ext4 (upp

Re: Direct I/O

2023-04-11 Thread Thomas Munro
On Wed, Apr 12, 2023 at 3:04 PM Thomas Munro wrote: > On Wed, Apr 12, 2023 at 2:56 PM Christoph Berg wrote: > > I'm hitting a panic in t_004_io_direct. The build is running on > > overlayfs on tmpfs/ext4 (upper/lower) which is probably a weird > > combination but has wor

Re: Direct I/O

2023-04-11 Thread Thomas Munro
On Wed, Apr 12, 2023 at 2:56 PM Christoph Berg wrote: > I'm hitting a panic in t_004_io_direct. The build is running on > overlayfs on tmpfs/ext4 (upper/lower) which is probably a weird > combination but has worked well for building everything over the last > decade. On Debian unstable: > >

Re: v15b1: FailedAssertion("segment_map->header->magic == (DSA_SEGMENT_HEADER_MAGIC ^ area->control->handle ^ index)", File: "dsa.c", ..)

2023-04-11 Thread Thomas Munro
On Wed, Apr 12, 2023 at 11:37 AM Justin Pryzby wrote: > $ ls /dev/shm/ |grep 3696856876 || echo not found > not found Oh, of course it would have restarted after it crashed and unlinked that... So the remaining traces of that memory *might* be in the core file, depending (IIRC) on the core

Re: v15b1: FailedAssertion("segment_map->header->magic == (DSA_SEGMENT_HEADER_MAGIC ^ area->control->handle ^ index)", File: "dsa.c", ..)

2023-04-11 Thread Thomas Munro
On Wed, Apr 12, 2023 at 7:46 AM Justin Pryzby wrote: > Unfortunately: > (gdb) p area->control->handle > $3 = 0 > (gdb) p segment_map->header->magic > value has been optimized out > (gdb) p index > $4 = Hmm, well index I can find from parameters: > #2 0x00991470 in ExceptionalCondition

Re: cfbot is listing committed patches?

2023-04-11 Thread Thomas Munro
On Tue, Apr 11, 2023 at 6:16 PM Peter Smith wrote: > cfbot [1] is listing some already committed patches under the "Needs > Review" category. For example here are some of mine [1][2]. And > because they are already committed, the 'apply' fails, so they get > flagged by cfbot as needed rebase. >

Re: longfin missing gssapi_ext.h

2023-04-10 Thread Thomas Munro
On Tue, Apr 11, 2023 at 2:53 PM Thomas Munro wrote: > On Tue, Apr 11, 2023 at 2:31 AM Stephen Frost wrote: > > Have you tried running the tests in src/test/kerberos with elver? Or is > > it configured to run them? Would be awesome if it could be, or if > > there's issues w

Re: Direct I/O

2023-04-10 Thread Thomas Munro
On Tue, Apr 11, 2023 at 2:31 PM Thomas Munro wrote: > I tried to find out what POSIX says about this (But of course whatever it might say is of especially limited value when O_DIRECT is in the picture, being completely unstandardised. Really I guess all they meant was "if you *copy* s

Re: longfin missing gssapi_ext.h

2023-04-10 Thread Thomas Munro
On Tue, Apr 11, 2023 at 2:31 AM Stephen Frost wrote: > Have you tried running the tests in src/test/kerberos with elver? Or is > it configured to run them? Would be awesome if it could be, or if > there's issues with running the tests on FBSD w/ MIT Kerberos, I'd be > happy to try and help work

Re: Direct I/O

2023-04-10 Thread Thomas Munro
On Tue, Apr 11, 2023 at 2:15 PM Andres Freund wrote: > And the fix has been merged into > https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git/log/?h=for-next > > I think that means it'll have to wait for 6.4 development to open (in a few > weeks), and then will be merged into the

Re: Direct I/O

2023-04-10 Thread Thomas Munro
On Mon, Apr 10, 2023 at 7:27 PM Thomas Munro wrote: > Debian's 6.0.10-2 kernel (Debian 12 on a random laptop). Realising I hadn't updated for a bit, I did so and it still reproduces on: $ uname -a Linux x1 6.1.0-7-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.20-1 (2023-03-19) x86_64 GNU/Linux

Re: Direct I/O

2023-04-10 Thread Thomas Munro
On Mon, Apr 10, 2023 at 2:57 PM Andres Freund wrote: > Have you tried to write a reproducer for this that doesn't involve postgres? I tried a bit. I'll try harder soon. > ... What kernel version did you repro > this on Thomas? Debian's 6.0.10-2 kernel (Debian 12 on a random laptop). Here's

Re: Direct I/O

2023-04-09 Thread Thomas Munro
On Mon, Apr 10, 2023 at 8:43 AM Tom Lane wrote: > Boy, it's hard to look at that trace and not call it a filesystem bug. Agreed. > Given the apparent dependency on COW, I wonder if this has something > to do with getting confused about which copy is current? Yeah, I suppose it would require

Re: Direct I/O

2023-04-09 Thread Thomas Munro
On Sun, Apr 9, 2023 at 11:25 PM Andrew Dunstan wrote: > Didn't seem to make any difference. Thanks for testing. I think it's COW (and I think that implies also checksums?) that needs to be turned off, at least based on experiments here.

Re: Direct I/O

2023-04-09 Thread Thomas Munro
On Sun, Apr 9, 2023 at 4:52 PM Thomas Munro wrote: > Here, btrfs seems to be taking a different path that I can't quite > make out... I see no warning/error about a checksum failure like [1], > and we apparently managed to read something other than a mix of the > old and new page con

Re: Direct I/O

2023-04-08 Thread Thomas Munro
Indeed, I can't reproduce this with (our) checksums on. I also can't reproduce it with O_DIRECT off. I also can't reproduce it if I use "mkdir pgdata && chattr +C pgdata && initdb -D pgdata" to have a pgdata directory with copy-on-write and (their) checksums disabled. But it reproduces quite

Re: Direct I/O

2023-04-08 Thread Thomas Munro
On Sun, Apr 9, 2023 at 2:18 PM Andres Freund wrote: > On 2023-04-09 13:55:33 +1200, Thomas Munro wrote: > > I think that particular thing might relate to modifications of the > > user buffer while a write is in progress (breaking btrfs's internal > > checksums). I don't th

Re: Direct I/O

2023-04-08 Thread Thomas Munro
On Sun, Apr 9, 2023 at 11:05 AM Tom Lane wrote: > Googling finds a lot of suggestions that O_DIRECT doesn't play nice > with btrfs, for example > > https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg92824.html > > It's not clear to me how much of that lore is still current, > but it's

Re: Direct I/O

2023-04-08 Thread Thomas Munro
On Sun, Apr 9, 2023 at 10:17 AM Andrew Dunstan wrote: > I can run the test in isolation, and it's get an error reliably. Random idea: it looks like you have compression enabled. What if you turn it off in the directory where the test runs? Something like btrfs property set compression ...

Re: longfin missing gssapi_ext.h

2023-04-08 Thread Thomas Munro
On Sun, Apr 9, 2023 at 6:40 AM Tom Lane wrote: > The exact same thing applies to FreeBSD, except that their in-core > Heimdal is ancient (1.5.2). Also, they do have MIT Kerberos > available as a package [1]. I'd been misled by the lack of a hit > on "kerberos", but "krb5" finds it. Our code

Re: Direct I/O

2023-04-08 Thread Thomas Munro
On Sun, Apr 9, 2023 at 10:08 AM Andrew Dunstan wrote: > btrfs Aha!

Re: Direct I/O

2023-04-08 Thread Thomas Munro
On Sun, Apr 9, 2023 at 9:10 AM Tom Lane wrote: > 2023-04-08 16:50:03.177 EDT [2023-04-08 16:50:03 EDT 3257645:3] > 004_io_direct.pl LOG: statement: select count(*) from t1 > 2023-04-08 16:50:03.316 EDT [2023-04-08 16:50:03 EDT 3257646:1] ERROR: > invalid page in block 56 of relation

pgsql: Use higher wal_level for 004_io_direct.pl.

2023-04-08 Thread Thomas Munro
Use higher wal_level for 004_io_direct.pl. The new direct I/O test deliberately uses a very small shared_buffers to force some disk transfers without making the data set large and slow, but ran into a problem with wal_level = minimal: log_newpage_range() pins many buffers, leading to a few

Re: Direct I/O

2023-04-08 Thread Thomas Munro
On Sun, Apr 9, 2023 at 6:55 AM Andres Freund wrote: > Given the frequency of failures on this in the buildfarm, I propose using the > temporary workaround of using wal_level=replica. That avoids the use of the > over-eager log_newpage_range(). Will do.

pgsql: Update tsearch regex memory management.

2023-04-08 Thread Thomas Munro
Update tsearch regex memory management. Now that our regex engine uses palloc(), it's not necessary to set up a special memory context callback to free compiled regexes. The regex has no resources other than the memory that is already going to be freed in bulk. Reviewed-by: Tom Lane

pgsql: Update contrib/trgm_regexp's memory management.

2023-04-08 Thread Thomas Munro
Update contrib/trgm_regexp's memory management. While no code change was necessary for this code to keep working, we don't need to use PG_TRY()/PG_FINALLY() with explicit clean-up while working with regexes anymore. Reviewed-by: Tom Lane Discussion:

pgsql: Use MemoryContext API for regex memory management.

2023-04-08 Thread Thomas Munro
Use MemoryContext API for regex memory management. Previously, regex_t objects' memory was managed with malloc() and free() directly. Switch to palloc()-based memory management instead. Advantages: * memory used by cached regexes is now visible with MemoryContext observability tools *

pgsql: Redesign interrupt/cancel API for regex engine.

2023-04-08 Thread Thomas Munro
Redesign interrupt/cancel API for regex engine. Previously, a PostgreSQL-specific callback checked by the regex engine had a way to trigger a special error code REG_CANCEL if it detected that the next call to CHECK_FOR_INTERRUPTS() would certainly throw via ereport(). A later proposed bugfix

Re: Direct I/O

2023-04-08 Thread Thomas Munro
On Sat, Apr 8, 2023 at 4:59 PM Thomas Munro wrote: > On Sat, Apr 8, 2023 at 4:47 PM Thomas Munro wrote: > > After a bit more copy-editing on docs and comments and a round of > > automated indenting, I have now pushed this. I will now watch the > > build farm. I teste

Re: broken master branch

2023-04-08 Thread Thomas Munro
On Sat, Apr 8, 2023 at 8:04 PM Pavel Stehule wrote: > on fresh Fedora 38, I cannot to run regress tests Looks like the new LLVM 16. I'll try to look at this again next week. In the meantime you could try using 15.

Re: Direct I/O

2023-04-07 Thread Thomas Munro
On Sat, Apr 8, 2023 at 4:47 PM Thomas Munro wrote: > After a bit more copy-editing on docs and comments and a round of > automated indenting, I have now pushed this. I will now watch the > build farm. I tested on quite a few OSes that I have access to, but > this is obvious

Re: Direct I/O

2023-04-07 Thread Thomas Munro
I did some testing with non-default block sizes, and found a few minor things that needed adjustment. The short version is that I blocked some configurations that won't work or would break an assertion. After a bit more copy-editing on docs and comments and a round of automated indenting, I have

pgsql: Introduce PG_IO_ALIGN_SIZE and align all I/O buffers.

2023-04-07 Thread Thomas Munro
() and smgrextend() are correctly aligned, unless PG_O_DIRECT is 0 (= stack alignment tricks may be unavailable) or the block size has been set too small to allow arrays of buffers to be all aligned. Author: Thomas Munro Author: Andres Freund Reviewed-by: Justin Pryzby Discussion: https://postgr.es/m/ca

pgsql: Add io_direct setting (developer-only).

2023-04-07 Thread Thomas Munro
]sync and wal_level=minimal (which also requires max_wal_senders=0). Those are non-default and unlikely settings, and this behavior wasn't (correctly) documented. The same effect can be achieved with io_direct=wal. Author: Thomas Munro Author: Andres Freund Author: Bharath Rupireddy Reviewed

check_GUC_init(wal_writer_flush_after) fails with non-default block size

2023-04-07 Thread Thomas Munro
: 1519, PID: 84605 From 48d971e0b19f770991e334b8dc38422462b4485e Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 8 Apr 2023 13:12:48 +1200 Subject: [PATCH] Fix default wal_writer_flush_after value. Commit a73952b7956 requires default values in guc_table.c and C variable initializers to match.

Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?

2023-04-07 Thread Thomas Munro
en take a little bit longer on the recovery conflict patch itself (v6-0005) on the basis that it's bugfix work and not subject to the feature freeze. From a21a43bf5b1ba073abb3238968b9f8d13b1b318a Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Wed, 4 Jan 2023 14:15:40 +1300 Subject: [PATCH v6 1/

Re: Direct I/O

2023-04-07 Thread Thomas Munro
. [1] https://twitter.com/MengTangmu/status/994770040745615361 [2] http://kos.enix.org/pub/gingell8.pdf From c6e01d506762fb7c11a3fb31d56902fa53ea822b Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Tue, 13 Dec 2022 16:25:59 +1300 Subject: [PATCH v4 1/3] Introduce PG_IO_ALIGN_SIZE and align all I/O buff

Re: Using each rel as both outer and inner for JOIN_ANTI

2023-04-06 Thread Thomas Munro
On Thu, Apr 6, 2023 at 6:40 PM Richard Guo wrote: > Seems it wins if the parallel scan becomes part of a hash join in final > plan. I wonder if we have a way to know that in this early stage. I haven't tried but I'm not sure off the top of my head how to make a decision that early unless it's

Re: Using each rel as both outer and inner for JOIN_ANTI

2023-04-05 Thread Thomas Munro
On Thu, Apr 6, 2023 at 12:17 PM Thomas Munro wrote: > I tried the original example from the top of this thread and saw a > decent speedup from parallelism, but only if I set > min_parallel_table_scan_size=0, and otherwise it doesn't choose > Parallel Hash Right Anti Join. Same if I

Re: Using each rel as both outer and inner for JOIN_ANTI

2023-04-05 Thread Thomas Munro
On Thu, Apr 6, 2023 at 9:11 AM Tom Lane wrote: > Richard Guo writes: > > Thanks for reminding. Attached is the rebased patch, with no other > > changes. I think the patch is ready for commit. > > Pushed after a little further fooling with the comments. I also had > to rebase it over 11c2d6fdf

Re: How should we wait for recovery conflict resolution?

2023-04-05 Thread Thomas Munro
On Thu, Apr 6, 2023 at 7:46 AM Thomas Munro wrote: > Initially I was suspicious that there may be tricky races to deal with > around that wakeup logic, and the poll/sleep loop was due to an > inability to come up with something reliable. (Oops lost a sentence) ... but then I realised t

How should we wait for recovery conflict resolution?

2023-04-05 Thread Thomas Munro
277740 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Wed, 5 Apr 2023 17:21:18 +1200 Subject: [PATCH] WIP: latchify standby sleep diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c index 9f56b4e95c..7770877d9b 100644 --- a/src/backend/storage/ipc/standby.c +++ b/s
