Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
On Thu, Mar 28, 2024 at 2:02 PM Thomas Munro wrote: > ... In practice on a non-toy system, that's always going to be > io_combine_limit. ... And to be more explicit about that: you're right that we initialise max_pinned_buffers such that it's usually at least io_combine_l

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
On Mon, Mar 25, 2024 at 2:02 AM Thomas Munro wrote: > On Wed, Mar 20, 2024 at 4:04 AM Heikki Linnakangas wrote: > > > /* > > >* Skip the initial ramp-up phase if the caller says we're going to > > > be > > >* reading the

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
On Thu, Mar 28, 2024 at 10:52 AM Thomas Munro wrote: > I think 1 is good, as a rescan is even more likely to find the pages > in cache, and if that turns out to be wrong it'll very soon adjust. Hmm, no I take that back, it probably won't be due to the strategy/ring... I se

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
On Thu, Mar 28, 2024 at 9:43 AM Melanie Plageman wrote: > For sequential scan, I added a little reset function to the streaming > read API (read_stream_reset()) that just releases all the buffers. > Previously, it set finished to true before releasing the buffers (to > indicate it was done) and th

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
On Wed, Mar 27, 2024 at 1:40 AM Heikki Linnakangas wrote: > Is int16 enough though? It seems so, because: > > max_pinned_buffers = Max(max_ios * 4, buffer_io_size); > > and max_ios is constrained by the GUC's maximum MAX_IO_CONCURRENCY, and > buffer_io_size is constrained by MAX_BUFFER_IO_SIZ

Re: Large block sizes support in Linux

2024-03-25 Thread Thomas Munro
On Tue, Mar 26, 2024 at 3:34 AM Pankaj Raghav wrote: > One question: Does ZFS do something like FUA request to force the device > to clear the cache before it can update the node to point to the new page? > > If it doesn't do it, there is no guarantee from device to update the data > atomically un

Re: Streaming I/O, vectored I/O (WIP)

2024-03-24 Thread Thomas Munro
ublic functions _begin(), _next(), _end() to be next to each other after the static helper functions. Working on perf regression/tuning reports today, more soon... From edd3d078cf8d4b0c2f08df82295825f7107ec62b Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Mon, 26 Feb 2024 23:48:31 +1300

Re: Streaming I/O, vectored I/O (WIP)

2024-03-24 Thread Thomas Munro
On Wed, Mar 20, 2024 at 4:04 AM Heikki Linnakangas wrote: > On 12/03/2024 15:02, Thomas Munro wrote: > > src/backend/storage/aio/streaming_read.c > > src/include/storage/streaming_read.h > > Standard file header comments missing. Fixed. > It would be nice to have

Re: Large block sizes support in Linux

2024-03-22 Thread Thomas Munro
On Fri, Mar 22, 2024 at 10:56 PM Pankaj Raghav (Samsung) wrote: > My team and I have been working on adding Large block size (LBS) > support to XFS in Linux[1]. Once this feature lands upstream, we will be > able to create XFS with FS block size > page size of the system on Linux. > We also gave a

Re: Potential stack overflow in incremental base backup

2024-03-22 Thread Thomas Munro
On Fri, Mar 8, 2024 at 6:53 AM Robert Haas wrote: > But I think that's really only necessary if we're actually going to > get rid of the idea of segmented relations altogether, which I don't > think is happening at least for v17, and maybe not ever. Yeah, I consider the feedback on ext4's size li

Re: Cannot find a working 64-bit integer type on Illumos

2024-03-22 Thread Thomas Munro
On Sat, Mar 23, 2024 at 6:26 AM Tom Lane wrote: > conftest.c:139:5: error: no previous prototype for 'does_int64_work' > [-Werror=missing-prototypes] > 139 | int does_int64_work() > | ^~~ > cc1: all warnings being treated as errors > configure:17003: $? = 1 > configure: pr

Re: pg_upgrade --copy-file-range

2024-03-22 Thread Thomas Munro
Hmm, this discussion seems to assume that we only use copy_file_range() to copy/clone whole segment files, right? That's great and may even get most of the available benefit given typical databases with many segments of old data that never changes, but... I think copy_file_range() allows us to go

Re: Vectored I/O in bulk_write.c

2024-03-19 Thread Thomas Munro
On Sun, Mar 17, 2024 at 8:10 AM Andres Freund wrote: > I don't think zeroextend on the one hand and on the other hand a normal > write or extend are really the same operation. In the former case the content > is hard-coded in the latter it's caller provided. Sure, we can deal with that > by sp

Re: Built-in CTYPE provider

2024-03-18 Thread Thomas Munro
On Tue, Mar 19, 2024 at 11:55 AM Tom Lane wrote: > Jeff Davis writes: > > On Mon, 2024-03-18 at 18:04 -0400, Tom Lane wrote: > >> This is causing all CI jobs to fail the "compiler warnings" check. > > > I did run CI before checkin, and it passed: > > https://cirrus-ci.com/build/5382423490330624 >

Re: Confine vacuum skip logic to lazy_scan_skip

2024-03-16 Thread Thomas Munro
On Tue, Mar 12, 2024 at 10:03 AM Melanie Plageman wrote: > I've rebased the attached v10 over top of the changes to > lazy_scan_heap() Heikki just committed and over the v6 streaming read > patch set. I started testing them and see that you are right, we no > longer pin too many buffers. However,

Re: Streaming I/O, vectored I/O (WIP)

2024-03-15 Thread Thomas Munro
I am planning to push the bufmgr.c patch soon. At that point the new API won't have any direct callers yet, but the traditional ReadBuffer() family of functions will internally reach StartReadBuffers(nblocks=1) followed by WaitReadBuffers(), ZeroBuffer() or nothing as appropriate. Any more though

Re: Vectored I/O in bulk_write.c

2024-03-15 Thread Thomas Munro
I canvassed Andres off-list since smgrzeroextend() is his invention, and he wondered if it was a good idea to blur the distinction between the different zero-extension strategies like that. Good question. My take is that it's fine: mdzeroextend() already uses fallocate() only for nblocks > 8, bu

Re: Weird test mixup

2024-03-15 Thread Thomas Munro
On Sat, Mar 16, 2024 at 7:27 AM Tom Lane wrote: > Are there limits on the runtime of CI or cfbot jobs? Maybe > somebody should go check those systems. Those get killed at a higher level after 60 minutes (configurable but we didn't change it AFAIK): https://cirrus-ci.org/faq/#instance-timed-out

Re: broken JIT support on Fedora 40

2024-03-14 Thread Thomas Munro
For me it seems that the LLVMRunPasses() call, new in commit 76200e5ee469e4a9db5f9514b9d0c6a31b496bff Author: Thomas Munro Date: Wed Oct 18 22:15:54 2023 +1300 jit: Changes for LLVM 17. is reaching code that segfaults inside libLLVM, specifically in llvm::InlineFunction(llvm::CallBase

Re: broken JIT support on Fedora 40

2024-03-14 Thread Thomas Munro
em to be a > combination we have covered in the buildfarm. Yeah, 18.1 (note they switched to 1-based minor numbers, there was no 18.0) just came out a week or so ago. Despite testing their 18 branch just before their "RC1" tag, as recently as commit d282e88e50521a457fa1b36e55f43bac02a3

Re: Weird test mixup

2024-03-14 Thread Thomas Munro
On Fri, Mar 15, 2024 at 11:19 AM Tom Lane wrote: > Heikki Linnakangas writes: > > Somehow the 'gin-leave-leaf-split-incomplete' injection point was active > > in the 'intarray' test. That makes no sense. That injection point is > > only used by the test in src/test/modules/gin/. Perhaps that ran

Re: BitmapHeapScan streaming read user and prelim refactoring

2024-03-14 Thread Thomas Munro
On Fri, Mar 15, 2024 at 3:18 AM Tomas Vondra wrote: > So, IIUC this means (1) the patched code is more aggressive wrt > prefetching (because we prefetch more data overall, because master would > prefetch N pages and patched prefetches N ranges, each of which may be > multiple pages. And (2) it's n

Re: Recent 027_streaming_regress.pl hangs

2024-03-14 Thread Thomas Munro
On Fri, Mar 15, 2024 at 7:00 AM Alexander Lakhin wrote: > Could it be that the timeout (360 sec?) is just not enough for the test > under the current (changed due to switch to meson) conditions? Hmm, well it looks like he switched over to meson around 42 days ago 2024-02-01, looking at "calliphor

Re: Recent 027_streaming_regress.pl hangs

2024-03-13 Thread Thomas Munro
On Thu, Mar 14, 2024 at 3:27 PM Michael Paquier wrote: > Hmm. Perhaps 8af25652489? That looks like the closest thing in the > list that could have played with the way WAL is generated, hence > potentially impacting the records that are replayed. Yeah, I was wondering if its checkpoint delaying

Re: Recent 027_streaming_regress.pl hangs

2024-03-13 Thread Thomas Munro
On Wed, Mar 13, 2024 at 10:53 AM Thomas Munro wrote: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-02-23%2015%3A44%3A35 Assuming it is due to a commit in master, and given the failure frequency, I think it is very likely to be a change from this 3 day window of

ERROR: error triggered for injection point gin-leave-leaf-split-incomplete

2024-03-13 Thread Thomas Munro
Hi, I noticed 3 regression test failures like $SUBJECT in cfbot runs for unrelated patches that probably shouldn't affect GIN, so I guess this is probably a problem in master. All three happened on FreeBSD, but I doubt that's relevant, it's just that the FreeBSD CI task was randomly selected to a

Re: Volatile write caches on macOS and Windows, redux

2024-03-13 Thread Thomas Munro
Short sales pitch for these patches: * the default settings eat data on Macs and Windows * nobody understands what wal_sync_method=fsync_writethrough means anyway * it's a weird kludge that it affects not only WAL, let's clean that up

Re: Vectored I/O in bulk_write.c

2024-03-13 Thread Thomas Munro
eduction to do so by merging, like this. From 61b351b60d22060e5fc082645cdfc19188ac4841 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 9 Mar 2024 16:04:21 +1300 Subject: [PATCH v5 1/3] Use smgrwritev() for both overwriting and extending. Since mdwrite() and mdextend() were basically the same and both nee

Re: BitmapHeapScan streaming read user and prelim refactoring

2024-03-13 Thread Thomas Munro
On Sun, Mar 3, 2024 at 11:41 AM Tomas Vondra wrote: > On 3/2/24 23:28, Melanie Plageman wrote: > > On Sat, Mar 2, 2024 at 10:05 AM Tomas Vondra > > wrote: > >> With the current "master" code, eic=1 means we'll issue a prefetch for B > >> and then read+process A. And then issue prefetch for C and

Re: Vectored I/O in bulk_write.c

2024-03-13 Thread Thomas Munro
rnal details of the smgr implementation, it's part of the "contract" for the API. From 0a57274e29369e61712941e379c24f7db1dec068 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 9 Mar 2024 16:04:21 +1300 Subject: [PATCH v4 1/3] Merge smgrzeroextend() and smgrextend() with smgr

Re: Vectored I/O in bulk_write.c

2024-03-13 Thread Thomas Munro
On Wed, Mar 13, 2024 at 9:57 PM Heikki Linnakangas wrote: > Let's bite the bullet and merge the smgrwrite and smgrextend functions > at the smgr level too. I propose the following signature: > > #define SWF_SKIP_FSYNC 0x01 > #define SWF_EXTEND 0x02 > #define SWF_ZERO

Re: CI speed improvements for FreeBSD

2024-03-12 Thread Thomas Munro
On Wed, Mar 13, 2024 at 4:50 AM Maxim Orlov wrote: > I looked at the changes and I liked them. Here are my thoughts: Thanks for looking! Pushed.

Recent 027_streaming_regress.pl hangs

2024-03-12 Thread Thomas Munro
Hi, Several animals are timing out while waiting for catchup, sporadically. I don't know why. The oldest example I have found so far by clicking around is: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-02-23%2015%3A44%3A35 So perhaps something was committed ~3 weeks ago

Re: Vectored I/O in bulk_write.c

2024-03-12 Thread Thomas Munro
One more observation while I'm thinking about bulk_write.c... hmm, it writes the data out and asks the checkpointer to fsync it, but doesn't call smgrwriteback(). I assume that means that on Linux the physical writeback sometimes won't happen until the checkpointer eventually calls fsync() sequent

Re: Streaming I/O, vectored I/O (WIP)

2024-03-11 Thread Thomas Munro
On Tue, Mar 12, 2024 at 7:40 PM Thomas Munro wrote: > possible. So in the current patch you say "hey please read these 16 > blocks" and it returns saying "only read 1", you call again with 15 Oops, typo worth correcting: s/15/16/. Point being that the caller is inter

Re: Streaming I/O, vectored I/O (WIP)

2024-03-11 Thread Thomas Munro
On Tue, Mar 12, 2024 at 7:15 PM Dilip Kumar wrote: > I am planning to review this patch set, so started going through 0001, > I have a question related to how we are issuing smgrprefetch in > StartReadBuffers() Thanks! > + /* > + * In theory we should only do this if PrepareReadBuffers() had to

Re: [PROPOSAL] Skip test citext_utf8 on Windows

2024-03-11 Thread Thomas Munro
On Tue, Mar 12, 2024 at 2:56 PM Andrew Dunstan wrote: > On 2024-03-11 Mo 04:21, Oleg Tselebrovskiy wrote: > > Greetings, everyone! > > > > While running "installchecks" on databases with UTF-8 encoding the test > > citext_utf8 fails because of Turkish dotted I like this: > > > > SELECT 'i'::citex

Re: Confine vacuum skip logic to lazy_scan_skip

2024-03-10 Thread Thomas Munro
On Mon, Mar 11, 2024 at 5:31 AM Melanie Plageman wrote: > On Wed, Mar 6, 2024 at 6:47 PM Melanie Plageman > wrote: > > Performance results: > > > > The TL;DR of my performance results is that streaming read vacuum is > > faster. However there is an issue with the interaction of the streaming > >

Re: Vectored I/O in bulk_write.c

2024-03-10 Thread Thomas Munro
tings. We write out the index 128kB at a time, but the WAL 8kB at a time. From 793caf2db6c00314f5bd8f7146d4797508f2f627 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 9 Mar 2024 16:04:21 +1300 Subject: [PATCH v3 1/3] Provide vectored variant of smgrextend(). Since mdwrite() and mdextend(

Re: Failures in constraints regression test, "read only 0 of 8192 bytes"

2024-03-10 Thread Thomas Munro
On Mon, Mar 11, 2024 at 9:59 AM Thomas Munro wrote: > On Mon, Mar 11, 2024 at 9:30 AM Heikki Linnakangas wrote: > > Hmm, I'm not sure if we need even smgrreleaseall() here anymore. It's > > not required for correctness AFAICS. We don't do it in

Re: Failures in constraints regression test, "read only 0 of 8192 bytes"

2024-03-10 Thread Thomas Munro
On Mon, Mar 11, 2024 at 9:30 AM Heikki Linnakangas wrote: > Barring objections, I'll commit the attached. +1 I guess the comment for smgrreleaseall() could also be updated. It mentions only PROCSIGNAL_BARRIER_SMGRRELEASE, but I think sinval overflow (InvalidateSystemCaches()) should also be men

Re: Failures in constraints regression test, "read only 0 of 8192 bytes"

2024-03-09 Thread Thomas Munro
On Sun, Mar 10, 2024 at 6:48 PM Thomas Munro wrote: > I won't be surprised if the answer is: if you're holding a reference, > you have to get a pin (referring to bulk_write.c). Ahhh, on second thoughts, I take that back, I think the original theory still actually works just fine.

Re: Failures in constraints regression test, "read only 0 of 8192 bytes"

2024-03-09 Thread Thomas Munro
On Sun, Mar 10, 2024 at 5:02 PM Thomas Munro wrote: > Thanks, reproduced here (painfully slowly). Looking... I changed that ERROR to a PANIC and now I can see that _bt_metaversion() is failing to read a meta page (block 0), and the file is indeed of size 0 in my filesystem. Which is not c

Re: Failures in constraints regression test, "read only 0 of 8192 bytes"

2024-03-09 Thread Thomas Munro
On Sat, Mar 9, 2024 at 9:48 AM Tomas Vondra wrote: > I spent a bit of time investigating this today, but haven't made much > progress due to (a) my unfamiliarity with the smgr code in general and > the patch in particular, and (b) CLOBBER_CACHE_ALWAYS making it quite > time consuming to iterate an

Re: Vectored I/O in bulk_write.c

2024-03-09 Thread Thomas Munro
Slightly better version, adjusting a few obsoleted comments, adjusting error message to distinguish write/extend, fixing a thinko in smgr_cached_nblocks maintenance. From c786f979b0c38364775e32b9403b79303507d37b Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 9 Mar 2024 16:04:21 +1300

Vectored I/O in bulk_write.c

2024-03-09 Thread Thomas Munro
Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com From 4611cb121bbfa787ddbba4bc0e80ac6c732345d0 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 9 Mar 2024 16:04:21 +1300 Subject: [PATCH 1/2] Provide vectored variant of smgrextend(). Since mdwrite() and mdextend() were basical

Re: Failures in constraints regression test, "read only 0 of 8192 bytes"

2024-03-08 Thread Thomas Munro
On Sat, Mar 9, 2024 at 2:36 AM Tomas Vondra wrote: > On 3/8/24 13:21, Tomas Vondra wrote: > > My guess would be 8af25652489, as it's the only storage-related commit. > > > > I'm currently running tests to verify this. > > > > Yup, the breakage starts with this commit. I haven't looked into the > r

Re: Failures in constraints regression test, "read only 0 of 8192 bytes"

2024-03-08 Thread Thomas Munro
Happened again. I see this is OpenSUSE. Does that mean the file system is Btrfs?

Re: Combine headerscheck and cpluspluscheck scripts

2024-03-06 Thread Thomas Munro
+1

Re: Large files for relations

2024-03-06 Thread Thomas Munro
://www.postgresql.org/message-id/flat/CA%2BhUKG%2B2hZ0sBztPW4mkLfng0qfkNtAHFUfxOMLizJ0BPmi5%2Bg%40mail.gmail.com From 85678257fef94aa3ca3efb39ce55fb66df7c889e Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Fri, 26 May 2023 01:41:11 +1200 Subject: [PATCH v3] Allow relation segment size to be set by initdb. MIME

Potential stack overflow in incremental base backup

2024-03-05 Thread Thomas Munro
ever really uses --with-segsize (do they?), but if we make it an initdb option it will be more popular and this will become a problem. Hmm. From 1d183245e9676ef45ca6a93e7d442ee903a2a14c Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Wed, 6 Mar 2024 17:44:19 +1300 Subject: [PATCH] Fix potential

Re: CREATE DATABASE with filesystem cloning

2024-03-05 Thread Thomas Munro
On Wed, Mar 6, 2024 at 3:16 PM Thomas Munro wrote: > Here's a rebase. Now with a wait event and a paragraph of documentation. From 9d5a60e9a9cc4a4312de3081be99c254a8876e42 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 2 Sep 2023 22:21:49 +1200 Subject: [PATCH v4] CREATE

Re: CREATE DATABASE with filesystem cloning

2024-03-05 Thread Thomas Munro
On Wed, Oct 11, 2023 at 7:40 PM Peter Eisentraut wrote: > On 07.10.23 07:51, Thomas Munro wrote: > > Here is an experimental POC of fast/cheap database cloning. > > Here are some previous discussions of this: > > https://www.postgresql.org/message-id/flat/20131001223108.G

Re: pg_upgrade --copy-file-range

2024-03-05 Thread Thomas Munro
On Wed, Mar 6, 2024 at 2:43 AM Peter Eisentraut wrote: > As far as I can tell, the original pg_upgrade patch has been ready to > commit since October. Unless Thomas has any qualms that have not been > made explicit in this thread, I suggest we move ahead with that. pg_upgrade --copy-file-range p

Re: processes stuck in shutdown following OOM/recovery

2024-03-05 Thread Thomas Munro
On Sat, Dec 2, 2023 at 3:30 PM Thomas Munro wrote: > On Sat, Dec 2, 2023 at 2:18 PM Thomas Munro wrote: > > On Fri, Dec 1, 2023 at 6:13 PM Justin Pryzby wrote: > > > $ kill -9 2524495; sleep 0.05; pg_ctl -D ./pgdev.dat1 stop -m fast # > > > 2524495 is a child'

Failures in constraints regression test, "read only 0 of 8192 bytes"

2024-03-02 Thread Thomas Munro
These two animals seem to have got mixed up about the size of this relation in the same place: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=avocet&dt=2024-02-28%2017%3A34%3A30 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2024-03-01%2006%3A47%3A53 +++ /hom

Re: pread, pwrite, etc return ssize_t not int

2024-03-01 Thread Thomas Munro
On Sat, Mar 2, 2024 at 3:12 AM Peter Eisentraut wrote: > 0001-Return-ssize_t-in-fd.c-I-O-functions.patch > > This patch looks correct to me. Thanks, I'll push this one. > 0002-Fix-theoretical-overflow-in-Windows-pg_pread-pg_pwri.patch > > I have two comments on that: > > For the overflow of the

Re: Volatile write caches on macOS and Windows, redux

2024-03-01 Thread Thomas Munro
Rebased over 8d140c58. v2-0001-Make-wal_sync_method-fdatasync-the-default-on-all.patch Description: Binary data v2-0002-Remove-fsync_writethrough-add-fsync-full-macOS-on.patch Description: Binary data

Re: Relation bulk write facility

2024-02-27 Thread Thomas Munro
On Wed, Feb 28, 2024 at 9:24 AM Heikki Linnakangas wrote: > Here's a patch to fully remove AIX support. --- a/doc/src/sgml/installation.sgml +++ b/doc/src/sgml/installation.sgml @@ -3401,7 +3401,7 @@ export MANPATH PostgreSQL can be expected to work on current versions of these operat

Re: pread, pwrite, etc return ssize_t not int

2024-02-27 Thread Thomas Munro
Patches attached. PS Correction to my earlier statement about POSIX: the traditional K&R interfaces were indeed in the original POSIX.1 1988 but it was the 1990 edition (approximately coinciding with standard C) that adopted void, size_t, const and invented ssize_t. 0001-Return-ssize_t-in-fd.c-I

Re: Streaming I/O, vectored I/O (WIP)

2024-02-26 Thread Thomas Munro
On Wed, Feb 7, 2024 at 11:54 PM Nazir Bilal Yavuz wrote: > 0001-Provide-vectored-variant-of-ReadBuffer: > > - Do we need to pass the hit variable to ReadBuffer_common()? I think > it can be just declared in the ReadBuffer_common() now. Right, thanks! Done, in the version I'll post shortly. > 00

Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

2024-02-26 Thread Thomas Munro
[Sorry to those who received this message twice -- the first time got bounced by the list because of a defunct email address in the CC list.] Here is a rebase of Palak's v2 patch. I didn't change anything except for the required resource manager API change, a pgindent run, and removal of a stray

Re: Relation bulk write facility

2024-02-24 Thread Thomas Munro
On Sun, Feb 25, 2024 at 11:16 AM Thomas Munro wrote: > On Sun, Feb 25, 2024 at 11:06 AM Heikki Linnakangas wrote: > > Regarding the issue at hand, perhaps we should define PG_IO_ALIGN_SIZE as > > 16 on AIX, if that's the best the linker can do on that platform. > > You

Re: Relation bulk write facility

2024-02-24 Thread Thomas Munro
On Sun, Feb 25, 2024 at 11:06 AM Heikki Linnakangas wrote: > Regarding the issue at hand, perhaps we should define PG_IO_ALIGN_SIZE as 16 > on AIX, if that's the best the linker can do on that platform. You'll probably get either an error or silently fall back to buffered I/O, if direct I/O is e

Re: Relation bulk write facility

2024-02-24 Thread Thomas Munro
On Sun, Feb 25, 2024 at 9:12 AM Thomas Munro wrote: > On Sun, Feb 25, 2024 at 8:50 AM Noah Misch wrote: > > On GNU/Linux x64, gcc correctly records alignment=2**12 for the associated > > section (.rodata for bulk_write.o zero_buffer, .bss for pg_prewarm.o > > blockbuffer).

Re: Relation bulk write facility

2024-02-24 Thread Thomas Munro
On Sun, Feb 25, 2024 at 8:50 AM Noah Misch wrote: > On GNU/Linux x64, gcc correctly records alignment=2**12 for the associated > section (.rodata for bulk_write.o zero_buffer, .bss for pg_prewarm.o > blockbuffer). If I'm reading this right, neither AIX gcc nor xlc is marking > the section with su

Re: Relation bulk write facility

2024-02-24 Thread Thomas Munro
On Sun, Feb 25, 2024 at 6:24 AM Noah Misch wrote: > On Fri, Feb 23, 2024 at 04:27:34PM +0200, Heikki Linnakangas wrote: > > Committed this. Thanks everyone! > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2024-02-24%2015%3A13%3A14 > got: > TRAP: failed Assert("(uintptr_t)

Re: Unlinking Parallel Hash Join inner batch files sooner

2024-02-22 Thread Thomas Munro
On Thu, Feb 22, 2024 at 5:37 PM Andrei Lepikhov wrote: > On 22/2/2024 06:42, Thomas Munro wrote: > > extreme skew for one version of the problem, but even with zero/normal > > skewness and perfect estimation of the number of partitions, if you Sorry, I meant to write "but eve

Re: Unlinking Parallel Hash Join inner batch files sooner

2024-02-21 Thread Thomas Munro
On Wed, Feb 21, 2024 at 7:34 PM Andrei Lepikhov wrote: > I see in [1] that the reporter mentioned a delay between the error > message in parallel HashJoin and the return control back from PSQL. Your > patch might reduce this delay. > Also, I have the same complaint from users who processed gigabyt

Re: DSA_ALLOC_NO_OOM doesn't work

2024-02-21 Thread Thomas Munro
On Thu, Feb 22, 2024 at 10:30 AM Thomas Munro wrote: > collisions arbitrarily far apart (just decide how many bits to use). . o O ( Perhaps if you also allocated slots using a FIFO freelist, instead of the current linear search for the first free slot, you could maximise the time before a s

Re: DSA_ALLOC_NO_OOM doesn't work

2024-02-21 Thread Thomas Munro
On Thu, Feb 22, 2024 at 8:19 AM Heikki Linnakangas wrote: > - Separate dsm_handle, used by backend code to interact with the high > level interface in dsm.c, from dsm_impl_handle, which is used to > interact with the low-level functions in dsm_impl.c. This gets rid of > the convention in dsm.c of

Re: margay fails assertion in stats/dsa/dsm code

2024-02-21 Thread Thomas Munro
On Sat, Jul 2, 2022 at 11:10 AM Thomas Munro wrote: > On Sat, Jul 2, 2022 at 1:15 AM Robert Haas wrote: > > Changing the default on certain platforms to 'posix' or 'sysv' > > according to what works best on that platform seems reasonable to me. > >

Re: PGC_SIGHUP shared_buffers?

2024-02-16 Thread Thomas Munro
On Fri, Feb 16, 2024 at 5:29 PM Robert Haas wrote: > 3. Reserve lots of address space and then only use some of it. I hear > rumors that some forks of PG have implemented something like this. The > idea is that you convince the OS to give you a whole bunch of address > space, but you try to avoid

Re: make add_paths_to_append_rel aware of startup cost

2024-02-13 Thread Thomas Munro
On Thu, Oct 5, 2023 at 9:07 PM David Rowley wrote: > Thanks. Pushed. FYI somehow this plan from a8a968a8212e flipped in this run: === dumping /home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/regression.diffs === diff -U3 /home/bf/bf-build/mylodon/HEAD/pgsql/s

Re: Collation version tracking for macOS

2024-02-13 Thread Thomas Munro
On Tue, Feb 13, 2024 at 9:25 AM Jeff Davis wrote: > On Sun, 2024-02-11 at 22:04 +0530, Robert Haas wrote: > > "icu_multilib must be loaded via shared_preload_libraries. > > icu_multilib ignores any ICU library with a major version greater > > than > > that with which PostgreSQL was built." > > > >

Re: DSA_ALLOC_NO_OOM doesn't work

2024-02-13 Thread Thomas Munro
On Wed, Feb 14, 2024 at 3:23 AM Heikki Linnakangas wrote: > On 29/01/2024 14:06, Heikki Linnakangas wrote: > > If you call dsa_allocate_extended(DSA_ALLOC_NO_OOM), it will still > > ereport an error if you run out of space (originally reported at [0]). > > > > Attached patch adds code to test_dsa.

Re: [PATCH] Add native windows on arm64 support

2024-02-12 Thread Thomas Munro
On Sat, Feb 10, 2024 at 8:36 AM Andres Freund wrote: > Also, yikes, that's an ugly way of doing hardware detection. Jumping out of a > signal handler into normal code. Brrr. Maybe it's a little baroque but what's actually wrong with it? OpenSSL does something similar during initialisation as a fa

Re: gai_strerror() is not thread-safe on Windows

2024-02-11 Thread Thomas Munro
On Tue, Jan 16, 2024 at 8:52 AM Robert Haas wrote: > On Wed, Dec 6, 2023 at 8:45 PM Kyotaro Horiguchi > wrote: > > > So I think we should just hard-code the error messages in English and > > > move on. However, English is my language so perhaps I should abstain > > > and leave it to others to de

Re: glibc qsort() vulnerability

2024-02-07 Thread Thomas Munro
On Thu, Feb 8, 2024 at 3:38 PM Thomas Munro wrote: > Perhaps you could wrap it in a branch-free sign() function so you get > a narrow answer? > > https://stackoverflow.com/questions/14579920/fast-sign-of-integer-in-c Ah, strike that, it is much like the suggested (a > b) - (a <

Re: glibc qsort() vulnerability

2024-02-07 Thread Thomas Munro
On Thu, Feb 8, 2024 at 3:06 PM Andres Freund wrote: > On 2024-02-07 19:52:11 -0600, Nathan Bossart wrote: > > On Wed, Feb 07, 2024 at 04:42:07PM -0800, Andres Freund wrote: > > > On 2024-02-07 16:21:24 -0600, Nathan Bossart wrote: > > >> The assembly for that looks encouraging, but I still need to

Re: cfbot is failing all tests on FreeBSD/Meson builds

2024-02-07 Thread Thomas Munro
On Tue, Jan 30, 2024 at 5:06 PM Tom Lane wrote: > Thomas Munro writes: > > On Sat, Jan 13, 2024 at 1:51 PM Tom Lane wrote: > >> Time for a bug report to IO::Tty's authors, I guess. > > > Ahh, there is one: https://github.com/cpan-authors/IO-Tty/issues/38 >

Re: InstallXLogFileSegment() vs concurrent WAL flush

2024-02-02 Thread Thomas Munro
On Fri, Feb 2, 2024 at 12:56 PM Yugo NAGATA wrote: > On Fri, 2 Feb 2024 11:18:18 +0100 > Thomas Munro wrote: > > One simple way to address that would be to make XLogFileInitInternal() > > wait for InstallXLogFileSegment() to finish. It's a little > > Or, can we ma

InstallXLogFileSegment() vs concurrent WAL flush

2024-02-02 Thread Thomas Munro
Hi, New WAL space is created by renaming a file into place. Either a newly created file with a temporary name or, ideally, a recyclable old file with a name derived from an old LSN. I think there is a data loss window between rename() and fsync(parent_directory). A concurrent backend might open

Re: Extending SMgrRelation lifetimes

2024-01-31 Thread Thomas Munro
On Wed, Nov 29, 2023 at 1:42 PM Heikki Linnakangas wrote: > I spent some more time digging into this, experimenting with different > approaches. Came up with pretty significant changes; see below: Hi Heikki, I think this approach is good. As I wrote in the first email, I had briefly considered

Re: Guiding principle for dropping LLVM versions?

2024-01-25 Thread Thomas Munro
On Thu, Jan 25, 2024 at 4:44 PM Thomas Munro wrote: > ... A few build farm animals will > now fail in the configure step as discussed, and need some adjustment > (ie disable LLVM or upgrade to LLVM 10+ for the master branch). Owners pinged.

Re: Remove pthread_is_threaded_np() checks in postmaster

2024-01-24 Thread Thomas Munro
On Wed, Jan 24, 2024 at 1:39 PM Andres Freund wrote: > On 2024-01-23 17:26:19 -0600, Tristan Partin wrote: > > On Tue Jan 23, 2024 at 4:23 PM CST, Andres Freund wrote: > > > A fork() while threads are running is undefined behavior IIRC, and > > > undefined > > > behavior isn't limited to a single

Re: Guiding principle for dropping LLVM versions?

2024-01-24 Thread Thomas Munro
Thanks all for the discussion. Pushed. A few build farm animals will now fail in the configure step as discussed, and need some adjustment (ie disable LLVM or upgrade to LLVM 10+ for the master branch). Next year I think we should be able to do a much bigger cleanup, by moving to LLVM 14+.

Re: LLVM 18

2024-01-24 Thread Thomas Munro
On Wed, Jan 3, 2024 at 6:04 PM Thomas Munro wrote: > LLVM 16 provided a new function name[1], and LLVM 18 (not shipped yet) > has started complaining[2] about the old spelling. > > Here's a patch. And pushed. Just in case anyone else is confused by this, be aware that they

Re: Oom on temp (un-analyzed table caused by JIT) V16.1 [ NOT Fixed ]

2024-01-24 Thread Thomas Munro
On Thu, Jan 25, 2024 at 8:51 AM Kirk Wolak wrote: > getrusage(RUSAGE_SELF, &usage); > memory_usage_bytes = usage.ru_maxrss * 1024; FWIW log_statement_stats = on shows that in the logs. See ShowUsage() in postgres.c.
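For anyone who wants to try the quoted getrusage() approach outside the server, here is a minimal standalone sketch. The helper name peak_rss_bytes() is made up for illustration and is not PostgreSQL code. It assumes Linux semantics, where ru_maxrss is reported in kilobytes; macOS reports bytes, so the multiplication would be wrong there.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper, not part of PostgreSQL: return this process's peak
 * resident set size in bytes, or -1 if getrusage() fails.
 *
 * Assumption: ru_maxrss is in kilobytes, as on Linux. On macOS the field
 * is already in bytes, so the "* 1024" below would have to be dropped.
 */
static long
peak_rss_bytes(void)
{
    struct rusage usage;

    if (getrusage(RUSAGE_SELF, &usage) != 0)
        return -1;

    return usage.ru_maxrss * 1024L;
}
```

Calling peak_rss_bytes() from a small main() and printing the result with %ld is enough to try it; the value is a high-water mark, so it never decreases over the life of the process.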

Re: 039_end_of_wal: error in "xl_tot_len zero" test

2024-01-18 Thread Thomas Munro
On Fri, Jan 19, 2024 at 1:47 AM Anton Voloshin wrote: > I believe there is a small problem in the 039_end_of_wal.pl's > "xl_tot_len zero" test. It supposes that after immediate shutdown the > server, upon startup recovery, should always produce a message matching > "invalid record length at .*: wa

Re: cfbot is failing all tests on FreeBSD/Meson builds

2024-01-12 Thread Thomas Munro
On Sat, Jan 13, 2024 at 1:51 PM Tom Lane wrote: > Time for a bug report to IO::Tty's authors, I guess. Ahh, there is one: https://github.com/cpan-authors/IO-Tty/issues/38 In the meantime, will look into whether I can pin that package to 1.17 somewhere in the pipeline, hopefully later today...

Re: cfbot is failing all tests on FreeBSD/Meson builds

2024-01-12 Thread Thomas Munro
On Sat, Jan 13, 2024 at 9:32 AM Tom Lane wrote: > It looks like every recent cfbot run has failed in the > FreeBSD-13-Meson build, even if it worked in other ones. > The symptoms are failures in the TAP tests that try to > use interactive_psql: > > Can't call method "slave" on an undefined value a

Re: Streaming I/O, vectored I/O (WIP)

2024-01-11 Thread Thomas Munro
On Fri, Jan 12, 2024 at 3:31 AM Heikki Linnakangas wrote: > Ok. It feels surprising to have three steps. I understand that you need > two steps, one to start the I/O and another to wait for them to finish, > but why do you need separate Prepare and Start steps? What can you do in > between them? (

Re: Streaming I/O, vectored I/O (WIP)

2024-01-10 Thread Thomas Munro
On Thu, Jan 11, 2024 at 8:58 AM Heikki Linnakangas wrote: > On 10/01/2024 06:13, Thomas Munro wrote: > > Bikeshedding call: I am open to better suggestions for the names > > PrepareReadBuffer() and CompleteReadBuffers(), they seem a little > > grammatically clumsy. >

LLVM 18

2024-01-02 Thread Thomas Munro
ommit/5ac12951b4e9bbfcc5791282d0961ec2b65575e9 From bc2a07e0012aa58af2cf97a202d181f473a4d7bc Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Wed, 3 Jan 2024 17:45:30 +1300 Subject: [PATCH] Track LLVM 18 changes. https://github.com/llvm/llvm-project/commit/1b97645e56bf321b06d1353024339958b64fd242 https://github.com

Re: Windows sockets (select missing events?)

2023-12-30 Thread Thomas Munro
Hi Ranier, I doubt it really matters, unless perhaps it's upsetting some static analysis tool? It's bounded by FD_SETSIZE, a small number. FWIW, I would like to delete pgwin32_select() in PG17. Before PG16 (commit 7389aad6), the postmaster used it to receive incoming connections, but that was r

Re: cannot abort transaction 2737414167, it was already committed

2023-12-27 Thread Thomas Munro
On Thu, Dec 28, 2023 at 4:02 AM Justin Pryzby wrote: > My main question is why an IO error would cause the DB to abort, rather > than raising an ERROR. In CommitTransaction() there is a stretch of code beginning s->state = TRANS_COMMIT and ending s->state = TRANS_DEFAULT, from which we call out t

Re: pread, pwrite, etc return ssize_t not int

2023-12-25 Thread Thomas Munro
On Mon, Dec 25, 2023 at 7:09 AM Tom Lane wrote: > Coverity whinged this morning about the following bit in > the new pg_combinebackup code: > > 644 unsigned rb; > 645 > 646 /* Read the block from the correct source, except if > dry-run. */ > 647

Re: pg_upgrade --copy-file-range

2023-12-22 Thread Thomas Munro
On Sat, Dec 23, 2023 at 9:40 AM Peter Eisentraut wrote: > On 13.11.23 08:15, Peter Eisentraut wrote: > > On 08.10.23 07:15, Thomas Munro wrote: > >>> About your patch: > >>> > >>> I think you should have a "check" function called from >

Re: pg_serial bloat

2023-12-21 Thread Thomas Munro
On Fri, Dec 15, 2023 at 9:53 AM Thomas Munro wrote: > ... We've seen a system with ~30GB of files in there > (note: full/untruncated would be 2³² xids × sizeof(uint64_t) = > 32GB). It's not just a gradual disk space leak: according to disk > space monitoring, this syste
