Re: Streaming I/O, vectored I/O (WIP)

2024-04-23 Thread David Rowley
I've attached a patch with a few typo fixes and what looks like an incorrect type for max_ios. It's an int16 and I think it needs to be an int. Doing "max_ios = Min(max_ios, PG_INT16_MAX);" doesn't do anything when max_ios is int16. David diff --git a/src/backend/storage/aio/read_stream.c

Re: Streaming I/O, vectored I/O (WIP)

2024-04-07 Thread Nazir Bilal Yavuz
Hi, On Mon, 8 Apr 2024 at 00:01, Nazir Bilal Yavuz wrote: > > Hi, > > On Sun, 7 Apr 2024 at 20:33, Nazir Bilal Yavuz wrote: > > > > Hi, > > > > On Tue, 2 Apr 2024 at 11:40, Thomas Munro wrote: > > > > > > I had been planning to commit v14 this morning but got cold feet with > > > the BMR-based

Re: Streaming I/O, vectored I/O (WIP)

2024-04-07 Thread Nazir Bilal Yavuz
Hi, On Sun, 7 Apr 2024 at 20:33, Nazir Bilal Yavuz wrote: > > Hi, > > On Tue, 2 Apr 2024 at 11:40, Thomas Munro wrote: > > > > I had been planning to commit v14 this morning but got cold feet with > > the BMR-based interface. Heikki didn't like it much, and in the end, > > neither did I. I

Re: Streaming I/O, vectored I/O (WIP)

2024-04-07 Thread Melanie Plageman
On Sun, Apr 7, 2024 at 1:33 PM Nazir Bilal Yavuz wrote: > > Hi, > > On Tue, 2 Apr 2024 at 11:40, Thomas Munro wrote: > > > > I had been planning to commit v14 this morning but got cold feet with > > the BMR-based interface. Heikki didn't like it much, and in the end, > > neither did I. I have

Re: Streaming I/O, vectored I/O (WIP)

2024-04-07 Thread Nazir Bilal Yavuz
Hi, On Tue, 2 Apr 2024 at 11:40, Thomas Munro wrote: > > I had been planning to commit v14 this morning but got cold feet with > the BMR-based interface. Heikki didn't like it much, and in the end, > neither did I. I have now removed it, and it seems much better. No > other significant

Re: Streaming I/O, vectored I/O (WIP)

2024-04-03 Thread Melanie Plageman
On Tue, Apr 2, 2024 at 8:32 PM Thomas Munro wrote: > > Here are the remaining patches discussed in this thread. They give > tablespace-specific io_combine_limit, effective_io_readahead_window > (is this useful?), and up-to-1MB io_combine_limit (is this useful?). > I think the first two would

Re: Streaming I/O, vectored I/O (WIP)

2024-04-02 Thread Thomas Munro
On Tue, Apr 2, 2024 at 9:39 PM Thomas Munro wrote: > So this is the version I'm going to commit shortly, barring objections. And done, after fixing a small snafu with smgr-only reads coming from CreateAndCopyRelationData() (BM_PERMANENT would be incorrectly/unnecessarily set for unlogged

Re: Streaming I/O, vectored I/O (WIP)

2024-04-02 Thread Thomas Munro
I had been planning to commit v14 this morning but got cold feet with the BMR-based interface. Heikki didn't like it much, and in the end, neither did I. I have now removed it, and it seems much better. No other significant changes, just parameter types and inlining details. For example: *

Re: Streaming I/O, vectored I/O (WIP)

2024-03-29 Thread Heikki Linnakangas
On 29/03/2024 09:01, Thomas Munro wrote: On Fri, Mar 29, 2024 at 9:45 AM Heikki Linnakangas wrote: master (213c959a29):8.0 s streaming-api v13: 9.5 s Hmm, that's not great, and I think I know one factor that has confounded my investigation and the conflicting reports

Re: Streaming I/O, vectored I/O (WIP)

2024-03-29 Thread Thomas Munro
On Fri, Mar 29, 2024 at 9:45 AM Heikki Linnakangas wrote: > master (213c959a29):8.0 s > streaming-api v13: 9.5 s Hmm, that's not great, and I think I know one factor that has confounded my investigation and the conflicting reports I have received from a couple of people:

Re: Streaming I/O, vectored I/O (WIP)

2024-03-28 Thread Thomas Munro
On Fri, Mar 29, 2024 at 12:06 AM Thomas Munro wrote: > Small bug fix: the condition in the final test at the end of > read_stream_look_ahead() wasn't quite right. In general when looking > ahead, we don't need to start a read just because the pending read > would bring us up to stream->distance

Re: Streaming I/O, vectored I/O (WIP)

2024-03-28 Thread Thomas Munro
Small bug fix: the condition in the final test at the end of read_stream_look_ahead() wasn't quite right. In general when looking ahead, we don't need to start a read just because the pending read would bring us up to stream->distance if submitted now (we'd prefer to build it all the way up to

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
New version with some cosmetic/comment changes, and Melanie's read_stream_reset() function merged, as required by her sequential scan user patch. I tweaked it slightly: it might as well share code with read_stream_end(). I think setting distance = 1 is fine for now, and we might later want to

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
On Thu, Mar 28, 2024 at 2:02 PM Thomas Munro wrote: > ... In practice on a non-toy system, that's always going to be > io_combine_limit. ... And to be more explicit about that: you're right that we initialise max_pinned_buffers such that it's usually at least io_combine_limit, but then if you

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
On Mon, Mar 25, 2024 at 2:02 AM Thomas Munro wrote: > On Wed, Mar 20, 2024 at 4:04 AM Heikki Linnakangas wrote: > > > /* > > >* Skip the initial ramp-up phase if the caller says we're going to > > > be > > >* reading the whole relation. This way we start out doing > > >

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
On Thu, Mar 28, 2024 at 10:52 AM Thomas Munro wrote: > I think 1 is good, as a rescan is even more likely to find the pages > in cache, and if that turns out to be wrong it'll very soon adjust. Hmm, no I take that back, it probably won't be due to the strategy/ring... I see your point now...

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
On Thu, Mar 28, 2024 at 9:43 AM Melanie Plageman wrote: > For sequential scan, I added a little reset function to the streaming > read API (read_stream_reset()) that just releases all the buffers. > Previously, it set finished to true before releasing the buffers (to > indicate it was done) and

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Melanie Plageman
On Wed, Mar 27, 2024 at 10:11 AM Thomas Munro wrote: > > I got rid of "finished" (now represented by distance == 0, I was > removing branches and variables). I got rid of "started", which can > now be deduced (used for suppressing advice when you're calling > _next() because you need a block and

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
On Wed, Mar 27, 2024 at 1:40 AM Heikki Linnakangas wrote: > Is int16 enough though? It seems so, because: > > max_pinned_buffers = Max(max_ios * 4, buffer_io_size); > > and max_ios is constrained by the GUC's maximum MAX_IO_CONCURRENCY, and > buffer_io_size is constrained by

Re: Streaming I/O, vectored I/O (WIP)

2024-03-26 Thread Heikki Linnakangas
On 24/03/2024 15:02, Thomas Munro wrote: On Wed, Mar 20, 2024 at 4:04 AM Heikki Linnakangas wrote: Maybe 'pg_streaming_read_next_buffer' or just 'pg_streaming_read_next', for a shorter name. Hmm. The idea of 'buffer' appearing in a couple of names is that there are conceptually other kinds

Re: Streaming I/O, vectored I/O (WIP)

2024-03-24 Thread Thomas Munro
On Mon, Mar 25, 2024 at 6:30 AM Melanie Plageman wrote: > I haven't reviewed the whole patch, but as I was rebasing > bitmapheapscan streaming read user, I found callback_private confusing > because it seems like it is a private callback, not private data > belonging to the callback. Perhaps call

Re: Streaming I/O, vectored I/O (WIP)

2024-03-24 Thread Melanie Plageman
On Sun, Mar 24, 2024 at 9:02 AM Thomas Munro wrote: > > On Wed, Mar 20, 2024 at 4:04 AM Heikki Linnakangas wrote: > > On 12/03/2024 15:02, Thomas Munro wrote: > > > src/backend/storage/aio/streaming_read.c > > > src/include/storage/streaming_read.h > > > > Standard file header comments missing.

Re: Streaming I/O, vectored I/O (WIP)

2024-03-24 Thread Thomas Munro
On Wed, Mar 20, 2024 at 4:04 AM Heikki Linnakangas wrote: > On 12/03/2024 15:02, Thomas Munro wrote: > > src/backend/storage/aio/streaming_read.c > > src/include/storage/streaming_read.h > > Standard file header comments missing. Fixed. > It would be nice to have a comment at the top of

Re: Streaming I/O, vectored I/O (WIP)

2024-03-19 Thread Heikki Linnakangas
Some quick comments: On 12/03/2024 15:02, Thomas Munro wrote: src/backend/storage/aio/streaming_read.c src/include/storage/streaming_read.h Standard file header comments missing. It would be nice to have a comment at the top of streaming_read.c, explaining at a high level how the circular

Re: Streaming I/O, vectored I/O (WIP)

2024-03-18 Thread Nazir Bilal Yavuz
Hi, On Sat, 16 Mar 2024 at 02:53, Thomas Munro wrote: > > I am planning to push the bufmgr.c patch soon. At that point the new > API won't have any direct callers yet, but the traditional > ReadBuffer() family of functions will internally reach > StartReadBuffers(nblocks=1) followed by

Re: Streaming I/O, vectored I/O (WIP)

2024-03-15 Thread Thomas Munro
I am planning to push the bufmgr.c patch soon. At that point the new API won't have any direct callers yet, but the traditional ReadBuffer() family of functions will internally reach StartReadBuffers(nblocks=1) followed by WaitReadBuffers(), ZeroBuffer() or nothing as appropriate. Any more

Re: Streaming I/O, vectored I/O (WIP)

2024-03-12 Thread Dilip Kumar
On Tue, Mar 12, 2024 at 12:10 PM Thomas Munro wrote: > > I think you'd be right if StartReadBuffers() were capable of > processing a sequence consisting of a hit followed by misses, but > currently it always gives up after the first hit. That is, it always > processes some number of misses

Re: Streaming I/O, vectored I/O (WIP)

2024-03-12 Thread Thomas Munro
On Tue, Mar 12, 2024 at 7:40 PM Thomas Munro wrote: > possible. So in the current patch you say "hey please read these 16 > blocks" and it returns saying "only read 1", you call again with 15 Oops, typo worth correcting: s/15/16/. Point being that the caller is interested in more blocks after

Re: Streaming I/O, vectored I/O (WIP)

2024-03-12 Thread Thomas Munro
On Tue, Mar 12, 2024 at 7:15 PM Dilip Kumar wrote: > I am planning to review this patch set, so started going through 0001, > I have a question related to how we are issuing smgrprefetch in > StartReadBuffers() Thanks! > + /* > + * In theory we should only do this if PrepareReadBuffers() had to

Re: Streaming I/O, vectored I/O (WIP)

2024-03-12 Thread Dilip Kumar
On Sat, Mar 9, 2024 at 3:55 AM Thomas Munro wrote: > Hi Thomas, I am planning to review this patch set, so started going through 0001, I have a question related to how we are issuing smgrprefetch in StartReadBuffers() + if (operation->io_buffers_len > 0) + { + if (flags &

Re: Streaming I/O, vectored I/O (WIP)

2024-02-26 Thread Robert Haas
On Tue, Feb 27, 2024 at 9:25 AM Thomas Munro wrote: > Here's the 2 step version. The streaming_read.c API is unchanged, but > the bugmfr.c API now has only the following extra functions: > > bool StartReadBuffers(..., int *nblocks, ..., ReadBuffersOperation *op) >

Re: Streaming I/O, vectored I/O (WIP)

2024-02-26 Thread Thomas Munro
On Wed, Feb 7, 2024 at 11:54 PM Nazir Bilal Yavuz wrote: > 0001-Provide-vectored-variant-of-ReadBuffer: > > - Do we need to pass the hit variable to ReadBuffer_common()? I think > it can be just declared in the ReadBuffer_common() now. Right, thanks! Done, in the version I'll post shortly. >

Re: Streaming I/O, vectored I/O (WIP)

2024-02-07 Thread Nazir Bilal Yavuz
Hi, Thanks for working on this! On Wed, 10 Jan 2024 at 07:14, Thomas Munro wrote: > Thanks! I committed the patches up as far as smgr.c before the > holidays. The next thing to commit would be the bufmgr.c one for > vectored ReadBuffer(), v5-0001. Here's my response to your review of >

Re: Streaming I/O, vectored I/O (WIP)

2024-01-11 Thread Thomas Munro
On Fri, Jan 12, 2024 at 3:31 AM Heikki Linnakangas wrote: > Ok. It feels surprising to have three steps. I understand that you need > two steps, one to start the I/O and another to wait for them to finish, > but why do you need separate Prepare and Start steps? What can you do in > between them?

Re: Streaming I/O, vectored I/O (WIP)

2024-01-11 Thread Heikki Linnakangas
On 11/01/2024 05:19, Thomas Munro wrote: On Thu, Jan 11, 2024 at 8:58 AM Heikki Linnakangas wrote: On 10/01/2024 06:13, Thomas Munro wrote: Bikeshedding call: I am open to better suggestions for the names PrepareReadBuffer() and CompleteReadBuffers(), they seem a little grammatically clumsy.

Re: Streaming I/O, vectored I/O (WIP)

2024-01-10 Thread Thomas Munro
On Thu, Jan 11, 2024 at 8:58 AM Heikki Linnakangas wrote: > On 10/01/2024 06:13, Thomas Munro wrote: > > Bikeshedding call: I am open to better suggestions for the names > > PrepareReadBuffer() and CompleteReadBuffers(), they seem a little > > grammatically clumsy. > > How will these functions

Re: Streaming I/O, vectored I/O (WIP)

2024-01-10 Thread Heikki Linnakangas
On 10/01/2024 06:13, Thomas Munro wrote: Bikeshedding call: I am open to better suggestions for the names PrepareReadBuffer() and CompleteReadBuffers(), they seem a little grammatically clumsy. How will these functions work in the brave new async I/O world? I assume PrepareReadBuffer() will

Re: Streaming I/O, vectored I/O (WIP)

2023-12-31 Thread Melanie Plageman
I've written a new version of the vacuum streaming read user on top of the rebased patch set [1]. It differs substantially from Andres' and includes several refactoring patches that can apply on top of master. As such, I've proposed those in a separate thread [2]. I noticed mac and windows fail to

Re: Streaming I/O, vectored I/O (WIP)

2023-12-30 Thread Cédric Villemain
Le 11/12/2023 à 10:12, Thomas Munro a écrit : 3. smgrreadv/smgrwritev patch: * improved ENOSPC handling * improve description of EOF and ENOSPC handling * fixed the sizes reported in dtrace static probes * fixed some words in the docs about that * changed error messages to refer to

Re: Streaming I/O, vectored I/O (WIP)

2023-12-11 Thread Thomas Munro
On Mon, Dec 11, 2023 at 10:28 PM Heikki Linnakangas wrote: > On 11/12/2023 11:12, Thomas Munro wrote: > > 1. I eventually figured out how to generalise > > compute_remaining_iovec() (as I now call it) so that the existing > > pg_pwritev_with_retry() in file_utils.c could also use it, so that's >

Re: Streaming I/O, vectored I/O (WIP)

2023-12-11 Thread Heikki Linnakangas
On 11/12/2023 11:12, Thomas Munro wrote: 1. I eventually figured out how to generalise compute_remaining_iovec() (as I now call it) so that the existing pg_pwritev_with_retry() in file_utils.c could also use it, so that's now done in a patch of its own. In compute_remaining_iovec(): 'source'

Re: Streaming I/O, vectored I/O (WIP)

2023-12-11 Thread Thomas Munro
On Sat, Dec 9, 2023 at 10:23 PM Heikki Linnakangas wrote: > Ok, works for me. I finished up making a few more improvements: 1. I eventually figured out how to generalise compute_remaining_iovec() (as I now call it) so that the existing pg_pwritev_with_retry() in file_utils.c could also use it,

Re: Streaming I/O, vectored I/O (WIP)

2023-12-09 Thread Heikki Linnakangas
On 09/12/2023 02:41, Thomas Munro wrote: On Sat, Dec 9, 2023 at 7:25 AM Andres Freund wrote: On 2023-11-30 13:01:46 +1300, Thomas Munro wrote: On Thu, Nov 30, 2023 at 12:16 PM Heikki Linnakangas wrote: Maybe we should bite the bullet and always retry short writes in FileWriteV(). Is that

Re: Streaming I/O, vectored I/O (WIP)

2023-12-08 Thread Thomas Munro
On Sat, Dec 9, 2023 at 7:25 AM Andres Freund wrote: > On 2023-11-30 13:01:46 +1300, Thomas Munro wrote: > > On Thu, Nov 30, 2023 at 12:16 PM Heikki Linnakangas wrote: > > > Maybe we should bite the bullet and always retry short writes in > > > FileWriteV(). Is that what you meant by "handling

Re: Streaming I/O, vectored I/O (WIP)

2023-12-08 Thread Andres Freund
Hi, On 2023-11-30 13:01:46 +1300, Thomas Munro wrote: > On Thu, Nov 30, 2023 at 12:16 PM Heikki Linnakangas wrote: > > On 29/11/2023 21:39, Thomas Munro wrote: > > > One thing I wasn't 100% happy with was the treatment of ENOSPC. A few > > > callers believe that short writes set errno: they

Re: Streaming I/O, vectored I/O (WIP)

2023-11-29 Thread Thomas Munro
On Thu, Nov 30, 2023 at 12:16 PM Heikki Linnakangas wrote: > On 29/11/2023 21:39, Thomas Munro wrote: > > One thing I wasn't 100% happy with was the treatment of ENOSPC. A few > > callers believe that short writes set errno: they error out with a > > message including %m. We have historically

Re: Streaming I/O, vectored I/O (WIP)

2023-11-29 Thread Heikki Linnakangas
On 29/11/2023 21:39, Thomas Munro wrote: One thing I wasn't 100% happy with was the treatment of ENOSPC. A few callers believe that short writes set errno: they error out with a message including %m. We have historically set errno = ENOSPC inside FileWrite() if the write size was unexpectedly

Re: Streaming I/O, vectored I/O (WIP)

2023-11-29 Thread Thomas Munro
On Wed, Nov 29, 2023 at 1:44 AM Heikki Linnakangas wrote: > LGTM. I think this 0001 patch is ready for commit, independently of the > rest of the patches. Done. > In v2-0002-Provide-vectored-variants-of-FileRead-and-FileWri-1.patch, fd.h: > > > +/* Filename components */ > > +#define

Re: Streaming I/O, vectored I/O (WIP)

2023-11-28 Thread Melanie Plageman
On Wed, Nov 29, 2023 at 01:17:19AM +1300, Thomas Munro wrote: Thanks for posting a new version. I've included a review of 0004. > I've included just the pg_prewarm example user for now while we > discuss the basic infrastructure. The rest are rebased and in my > public Github branch

Re: Streaming I/O, vectored I/O (WIP)

2023-11-28 Thread Heikki Linnakangas
On 28/11/2023 14:17, Thomas Munro wrote: On Thu, Sep 28, 2023 at 7:33 AM Heikki Linnakangas wrote: in streaming_read.h: typedef bool (*PgStreamingReadBufferDetermineNextCB) (PgStreamingRead *pgsr, uintptr_t pgsr_private, void *per_io_private, BufferManagerRelation

Re: Streaming I/O, vectored I/O (WIP)

2023-11-28 Thread Heikki Linnakangas
On 28/11/2023 14:17, Thomas Munro wrote: On Thu, Sep 28, 2023 at 7:33 AM Heikki Linnakangas wrote: + /* Avoid a slightly more expensive kernel call if there is no benefit. */ + if (iovcnt == 1) + returnCode = pg_pread(vfdP->fd, +

Re: Streaming I/O, vectored I/O (WIP)

2023-09-27 Thread Heikki Linnakangas
On 31/08/2023 07:00, Thomas Munro wrote: Currently PostgreSQL reads (and writes) data files 8KB at a time. That's because we call ReadBuffer() one block at a time, with no opportunity for lower layers to do better than that. This thread is about a model where you say which block you'll want

Re: Streaming I/O, vectored I/O (WIP)

2023-09-27 Thread Thomas Munro
On Thu, Sep 28, 2023 at 9:13 AM Andres Freund wrote: > On 2023-09-27 21:33:15 +0300, Heikki Linnakangas wrote: > > Looking at the later patch that introduces the streaming read API, seems > > that it finishes all the reads, so I suppose we don't need an abort > > function. Does it all get cleaned

Re: Streaming I/O, vectored I/O (WIP)

2023-09-27 Thread Andres Freund
Hi, On 2023-09-27 21:33:15 +0300, Heikki Linnakangas wrote: > I'm a bit disappointed and surprised by > v1-0009-WIP-Use-streaming-reads-in-vacuum.patch though: > > 4 files changed, 244 insertions(+), 78 deletions(-) > > The current prefetching logic in vacuumlazy.c is pretty hairy, so I hoped