Re: Lowering the ever-growing heap->pd_lower

2022-04-08 Thread Peter Geoghegan
On Fri, Apr 8, 2022 at 1:29 PM Robert Haas wrote: > Hmm. I wonder if we could teach the system to figure out which of > those things is happening. In the case that I'm worried about, when > we're considering growing the line pointer array, either the line > pointers will be dead or the line

Re: Lowering the ever-growing heap->pd_lower

2022-04-08 Thread Peter Geoghegan
On Fri, Apr 8, 2022 at 2:06 PM Andres Freund wrote: > It's not hard to hit scenarios where pages are effectively unusable, because > they have close to 291 dead items, without autovacuum triggering (or > autovacuum just taking a while). I think that this is mostly a problem with HOT-updates, and

Re: Lowering the ever-growing heap->pd_lower

2022-04-08 Thread Peter Geoghegan
On Fri, Apr 8, 2022 at 2:18 PM Andres Freund wrote: > It's 4 bytes per line pointer, right? Yeah, it's 4 bytes in Postgres. Most other DB systems only need 2 bytes, which is implemented in exactly the way that you're imagining. -- Peter Geoghegan
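The sizes debated in these messages can be checked with a little arithmetic. The following is a minimal sketch, not PostgreSQL source: `SketchItemId`, `SKETCH_BLCKSZ`, `line_pointer_bytes` and `page_share_permille` are hypothetical names, the bitfield layout mirrors the 15/2/15-bit split of a heap line pointer, and an 8192-byte page is assumed.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative stand-in for a heap line pointer: a 15-bit offset, 2 bits of
 * state and a 15-bit length, packed into one 32-bit word. The name is
 * hypothetical; only the field widths follow PostgreSQL's layout.
 */
typedef struct SketchItemId
{
    unsigned    lp_off:15,      /* offset of the tuple within the page */
                lp_flags:2,     /* line pointer state */
                lp_len:15;      /* length of the tuple in bytes */
} SketchItemId;

#define SKETCH_BLCKSZ 8192      /* assumed page size in bytes */

/* Sanity check: the packed struct is 4 bytes on mainstream compilers. */
static void
check_sketch_layout(void)
{
    assert(sizeof(SketchItemId) == 4);
}

/* Bytes consumed by n line pointers that are each `width` bytes wide. */
static size_t
line_pointer_bytes(size_t n, size_t width)
{
    return n * width;
}

/* Share of one page taken by `bytes`, in tenths of a percent (rounded down). */
static unsigned
page_share_permille(size_t bytes)
{
    return (unsigned) (bytes * 1000 / SKETCH_BLCKSZ);
}
```

With 4-byte line pointers, 291 of them occupy 1164 bytes, roughly 14% of an 8192-byte page. The hypothetical 1000 line pointers would take 4000 bytes (close to half the page) at 4 bytes each, or 2000 bytes (about a quarter) at the 2 bytes assumed in the quoted message.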

Re: Lowering the ever-growing heap->pd_lower

2022-04-08 Thread Andres Freund
Hi, On 2022-04-08 15:04:37 -0400, Robert Haas wrote: > I meant wasting space in the page. I think that's a real concern. > Imagine you allowed 1000 line pointers per page. Each one consumes 2 > bytes. It's 4 bytes per line pointer, right? struct ItemIdData { unsigned int

Re: Lowering the ever-growing heap->pd_lower

2022-04-08 Thread Andres Freund
Hi, On 2022-04-08 09:17:40 -0400, Robert Haas wrote: > I agree that the value of 291 is pretty much accidental, but it also > seems fairly generous to me. The bigger you make it, the more space > you can waste. I must have missed (or failed to understand) previous > discussions about why raising

Re: Lowering the ever-growing heap->pd_lower

2022-04-08 Thread Robert Haas
On Fri, Apr 8, 2022 at 3:31 PM Peter Geoghegan wrote: > What if we miss the opportunity to systematically keep successor > versions of a given logical row on the same heap page over time, due > only to the current low MaxHeapLinePointersPerPage limit of 291? If we > had only been able to "absorb"

Re: Lowering the ever-growing heap->pd_lower

2022-04-08 Thread Peter Geoghegan
On Fri, Apr 8, 2022 at 12:04 PM Robert Haas wrote: > I meant wasting space in the page. I think that's a real concern. > Imagine you allowed 1000 line pointers per page. Each one consumes 2 > bytes. So now you could have ~25% of each page in the table storing > dead line pointers. That sounds

Re: Lowering the ever-growing heap->pd_lower

2022-04-08 Thread Robert Haas
On Fri, Apr 8, 2022 at 12:57 PM Peter Geoghegan wrote: > What do you mean about wasting space? Wasting space on the stack? I > can't imagine you meant wasting space on the page, since being able to > accommodate more items on each heap page seems like it would be > strictly better, barring any

Re: Lowering the ever-growing heap->pd_lower

2022-04-08 Thread Peter Geoghegan
On Fri, Apr 8, 2022 at 9:44 AM Peter Geoghegan wrote: > On Fri, Apr 8, 2022 at 4:38 AM Matthias van de Meent > wrote: > > Yeah, I think we should definitely support more line pointers on a > > heap page, but abusing MaxHeapTuplesPerPage for that is misleading: > > the current value is the

Re: Lowering the ever-growing heap->pd_lower

2022-04-08 Thread Peter Geoghegan
On Fri, Apr 8, 2022 at 6:17 AM Robert Haas wrote: > I agree that the value of 291 is pretty much accidental, but it also > seems fairly generous to me. The bigger you make it, the more space > you can waste. I must have missed (or failed to understand) previous > discussions about why raising it

Re: Lowering the ever-growing heap->pd_lower

2022-04-08 Thread Peter Geoghegan
On Fri, Apr 8, 2022 at 4:38 AM Matthias van de Meent wrote: > Yeah, I think we should definitely support more line pointers on a > heap page, but abusing MaxHeapTuplesPerPage for that is misleading: > the current value is the physical limit for heap tuples, as we have at > most 1 heap tuple per

Re: Lowering the ever-growing heap->pd_lower

2022-04-08 Thread Robert Haas
On Thu, Apr 7, 2022 at 7:01 PM Andres Freund wrote: > On 2022-04-04 19:24:22 -0700, Peter Geoghegan wrote: > > We should definitely increase MaxHeapTuplesPerPage before too long, > > for a variety of reasons that I have talked about in the past. Its > > current value is 291 on all mainstream

Re: Lowering the ever-growing heap->pd_lower

2022-04-08 Thread Matthias van de Meent
On Fri, 8 Apr 2022 at 01:01, Andres Freund wrote: > > Hi, > > On 2022-04-04 19:24:22 -0700, Peter Geoghegan wrote: > > We should definitely increase MaxHeapTuplesPerPage before too long, > > for a variety of reasons that I have talked about in the past. Its > > current value is 291 on all

Re: Lowering the ever-growing heap->pd_lower

2022-04-07 Thread Peter Geoghegan
On Thu, Apr 7, 2022 at 4:01 PM Andres Freund wrote: > I'm on-board with that - but I think we should rewrite a bunch of places that > use MaxHeapTuplesPerPage sized-arrays on the stack first. It's not great using > several KB of stack at the current value already (*), but if it >

Re: Lowering the ever-growing heap->pd_lower

2022-04-07 Thread Andres Freund
Hi, On 2022-04-04 19:24:22 -0700, Peter Geoghegan wrote: > We should definitely increase MaxHeapTuplesPerPage before too long, > for a variety of reasons that I have talked about in the past. Its > current value is 291 on all mainstream platforms, a value that's > derived from accidental historic

Re: Lowering the ever-growing heap->pd_lower

2022-04-07 Thread Peter Geoghegan
On Mon, Apr 4, 2022 at 7:24 PM Peter Geoghegan wrote: > I am sympathetic to the idea that giving the system a more accurate > picture of how much free space is available on each heap page is an > intrinsic good. This might help us in a few different areas. For > example, the FSM cares about

Re: Lowering the ever-growing heap->pd_lower

2022-04-04 Thread Peter Geoghegan
On Tue, Feb 15, 2022 at 10:48 AM Matthias van de Meent wrote: > # Truncating lp_array during pruning > === > > The following adversarial load grows the heap relation, but with the > patch the relation keeps its size. The point being that HOT updates > can temporarily

Re: Lowering the ever-growing heap->pd_lower

2022-03-14 Thread Peter Geoghegan
On Thu, Mar 10, 2022 at 5:49 AM Matthias van de Meent wrote: > I double-checked the changes, and to me it seems like that was the > only place in the code where PageGetMaxOffsetNumber was not handled > correctly. This was fixed in the latest patch (v8). > > Peter, would you have time to further

Re: Lowering the ever-growing heap->pd_lower

2022-03-10 Thread Matthias van de Meent
On Wed, 16 Feb 2022 at 21:14, Matthias van de Meent wrote: > > On Wed, 16 Feb 2022 at 20:54, Peter Geoghegan wrote: > > > > On Tue, Feb 15, 2022 at 10:48 AM Matthias van de Meent > > wrote: > > > Peter Geoghegan asked for good arguments for the two changes > > > implemented. Below are my

Re: Lowering the ever-growing heap->pd_lower

2022-02-16 Thread Matthias van de Meent
On Wed, 16 Feb 2022 at 20:54, Peter Geoghegan wrote: > > On Tue, Feb 15, 2022 at 10:48 AM Matthias van de Meent > wrote: > > Peter Geoghegan asked for good arguments for the two changes > > implemented. Below are my arguments detailed, with adversarial loads > > that show the problematic

Re: Lowering the ever-growing heap->pd_lower

2022-02-16 Thread Peter Geoghegan
On Tue, Feb 15, 2022 at 10:48 AM Matthias van de Meent wrote: > Peter Geoghegan asked for good arguments for the two changes > implemented. Below are my arguments detailed, with adversarial loads > that show the problematic behaviour of the line pointer array that is > fixed with the patch. Why

Re: Lowering the ever-growing heap->pd_lower

2022-02-15 Thread Matthias van de Meent
On Thu, 2 Dec 2021 at 11:17, Daniel Gustafsson wrote: > > This thread has stalled, and the request for benchmark/test has gone > unanswered > so I'm marking this patch Returned with Feedback. Please feel free to > resubmit > this patch if it is picked up again. Well then, here we go. It took

Re: Lowering the ever-growing heap->pd_lower

2021-12-02 Thread Daniel Gustafsson
This thread has stalled, and the request for benchmark/test has gone unanswered so I'm marking this patch Returned with Feedback. Please feel free to resubmit this patch if it is picked up again. -- Daniel Gustafsson https://vmware.com/

Re: Lowering the ever-growing heap->pd_lower

2021-08-05 Thread Peter Geoghegan
On Thu, Aug 5, 2021 at 6:28 AM Simon Riggs wrote: > Hmm, there is no information in WAL to describe the line pointers > being truncated by PageTruncateLinePointerArray(). We just truncate > every time we see a XLOG_HEAP2_VACUUM record and presume it does the > same thing as the original change. >

Re: Lowering the ever-growing heap->pd_lower

2021-08-05 Thread Simon Riggs
On Wed, 4 Aug 2021 at 15:39, Robert Haas wrote: > > On Tue, Aug 3, 2021 at 8:44 PM Peter Geoghegan wrote: > > This time it's quite different: we're truncating the line pointer > > array during pruning. Pruning often occurs opportunistically, during > > regular query processing. In fact I'd say

Re: Lowering the ever-growing heap->pd_lower

2021-08-04 Thread Peter Geoghegan
On Wed, Aug 4, 2021 at 7:39 AM Robert Haas wrote: > How would it hurt? > > It's easy to see the harm caused by not shortening the line pointer > array when it is possible to do so: we're using up space in the page > that could have been made free. It's not so obvious to me what the > downside of

Re: Lowering the ever-growing heap->pd_lower

2021-08-04 Thread Peter Geoghegan
On Wed, Aug 4, 2021 at 12:09 PM Simon Riggs wrote: > Truncating line pointers can make extra space on the page, so it could > be the difference between a HOT and a non-HOT update. My understanding > is that these just-in-time actions have a beneficial effect in other > circumstances, so we can do

Re: Lowering the ever-growing heap->pd_lower

2021-08-04 Thread Simon Riggs
On Wed, 4 Aug 2021 at 01:43, Peter Geoghegan wrote: > > On Mon, Aug 2, 2021 at 11:57 PM Simon Riggs > wrote: > > 1. Allow same thing as PageTruncateLinePointerArray() during HOT cleanup > > That is going to have a clear benefit for HOT workloads, which by > > their nature will use a lot of line

Re: Lowering the ever-growing heap->pd_lower

2021-08-04 Thread Robert Haas
On Tue, Aug 3, 2021 at 8:44 PM Peter Geoghegan wrote: > This time it's quite different: we're truncating the line pointer > array during pruning. Pruning often occurs opportunistically, during > regular query processing. In fact I'd say that it's far more common > than pruning by VACUUM. So the

Re: Lowering the ever-growing heap->pd_lower

2021-08-04 Thread Matthias van de Meent
On Wed, 4 Aug 2021 at 02:43, Peter Geoghegan wrote: > > On Mon, Aug 2, 2021 at 11:57 PM Simon Riggs > wrote: > > 2. Reduce number of line pointers to 0 in some cases. > > Matthias - I don't think you've made a full case for doing this, nor > > looked at the implications. > > The comment clearly

Re: Lowering the ever-growing heap->pd_lower

2021-08-04 Thread Matthias van de Meent
On Wed, 4 Aug 2021 at 03:51, Peter Geoghegan wrote: > > We generate an FPI the first time a page is modified after a > checkpoint. The FPI consists of the page *after* it has been modified. In that case, I misremembered when FPIs were written with relation to checkpoints. Thanks for reminding

Re: Lowering the ever-growing heap->pd_lower

2021-08-03 Thread Peter Geoghegan
On Tue, Aug 3, 2021 at 12:27 PM Matthias van de Meent wrote: > This change makes it easier and more worthwhile to implement a further > optimization for the checkpointer and/or buffer manager to determine > that 1.) this page is now empty, and that 2.) we can therefore write a > specialized WAL

Re: Lowering the ever-growing heap->pd_lower

2021-08-03 Thread Peter Geoghegan
On Mon, Aug 2, 2021 at 11:57 PM Simon Riggs wrote: > 1. Allow same thing as PageTruncateLinePointerArray() during HOT cleanup > That is going to have a clear benefit for HOT workloads, which by > their nature will use a lot of line pointers. Why do you say that? > Many applications are updated

Re: Lowering the ever-growing heap->pd_lower

2021-08-03 Thread Matthias van de Meent
On Tue, 3 Aug 2021 at 20:37, Simon Riggs wrote: > > On Tue, 3 Aug 2021 at 17:15, Matthias van de Meent > wrote: > > > and further future optimizations might include > > > > - Full-page WAL logging of empty pages produced in the checkpointer > > could potentially be optimized to only log 'it's an

Re: Lowering the ever-growing heap->pd_lower

2021-08-03 Thread Simon Riggs
On Tue, 3 Aug 2021 at 17:15, Matthias van de Meent wrote: > and further future optimizations might include > > - Full-page WAL logging of empty pages produced in the checkpointer > could potentially be optimized to only log 'it's an empty page' > instead of writing out the full 8kb page, which

Re: Lowering the ever-growing heap->pd_lower

2021-08-03 Thread Matthias van de Meent
On Tue, 3 Aug 2021 at 08:57, Simon Riggs wrote: > > On Tue, 18 May 2021 at 20:33, Peter Geoghegan wrote: > > > > On Tue, May 18, 2021 at 12:29 PM Matthias van de Meent > > wrote: > > > PFA the updated version of this patch. Apart from adding line pointer > > > truncation in

Re: Lowering the ever-growing heap->pd_lower

2021-08-03 Thread Simon Riggs
On Tue, 18 May 2021 at 20:33, Peter Geoghegan wrote: > > On Tue, May 18, 2021 at 12:29 PM Matthias van de Meent > wrote: > > PFA the updated version of this patch. Apart from adding line pointer > > truncation in PageRepairFragmentation (as in the earlier patches), I > > also altered

Re: Lowering the ever-growing heap->pd_lower

2021-05-18 Thread Peter Geoghegan
On Tue, May 18, 2021 at 12:29 PM Matthias van de Meent wrote: > PFA the updated version of this patch. Apart from adding line pointer > truncation in PageRepairFragmentation (as in the earlier patches), I > also altered PageTruncateLinePointerArray to clean up all trailing > line pointers, even

Re: Lowering the ever-growing heap->pd_lower

2021-05-18 Thread Matthias van de Meent
On Mon, 3 May 2021 at 16:39, Matthias van de Meent wrote: > I am planning on fixing this patch sometime > before the next commit fest so that we can truncate the LP array > during hot pruning as well, instead of only doing so in the 2nd VACUUM > pass. PFA the updated version of this patch. Apart

Re: Lowering the ever-growing heap->pd_lower

2021-05-03 Thread Matthias van de Meent
On Mon, 3 May 2021 at 16:26, John Naylor wrote: > > On Sat, Apr 3, 2021 at 10:07 PM Peter Geoghegan wrote: > > I would like to deal with this work within the scope of the project > > we're discussing over on the "New IndexAM API controlling index vacuum > > strategies" thread. The latest

Re: Lowering the ever-growing heap->pd_lower

2021-05-03 Thread John Naylor
On Sat, Apr 3, 2021 at 10:07 PM Peter Geoghegan wrote: > I would like to deal with this work within the scope of the project > we're discussing over on the "New IndexAM API controlling index vacuum > strategies" thread. The latest revision of that patch series includes > a modified version of

Re: Lowering the ever-growing heap->pd_lower

2021-04-03 Thread Peter Geoghegan
On Wed, Mar 31, 2021 at 2:49 AM Matthias van de Meent wrote: > I had implemented it locally, but was waiting for some more feedback > before posting that and got busy with other stuff since, it's now > attached. > > I've also played around with marking the free space on the page as > undefined

Re: Lowering the ever-growing heap->pd_lower

2021-03-31 Thread Matthias van de Meent
On Wed, 31 Mar 2021 at 05:35, Peter Geoghegan wrote: > > On Wed, Mar 10, 2021 at 6:01 AM Matthias van de Meent > wrote: > > > The case I was concerned about back when is that there are various bits of > > > code that may visit a page with a predetermined TID in mind to look at. > > > An index

Re: Lowering the ever-growing heap->pd_lower

2021-03-30 Thread Peter Geoghegan
On Wed, Mar 10, 2021 at 6:01 AM Matthias van de Meent wrote: > > The case I was concerned about back when is that there are various bits of > > code that may visit a page with a predetermined TID in mind to look at. > > An index lookup is an obvious example, and another one is chasing an > >

Re: Lowering the ever-growing heap->pd_lower

2021-03-10 Thread Matthias van de Meent
On Tue, 9 Mar 2021 at 22:35, Tom Lane wrote: > > Matthias van de Meent writes: > > The only two existing mechanisms that I could find (in the access/heap > > directory) that possibly could fail on shrunken line pointer arrays; > > being xlog recovery (I do not have enough knowledge on recovery

Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Peter Geoghegan
On Tue, Mar 9, 2021 at 1:54 PM Peter Geoghegan wrote: > It occurs to me that we should also mark the hole in the middle of the > page (which includes the would-be LP_UNUSED line pointers at the end > of the original line pointer array space) as undefined to Valgrind > within

Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Tom Lane
Peter Geoghegan writes: > It occurs to me that we should also mark the hole in the middle of the > page (which includes the would-be LP_UNUSED line pointers at the end > of the original line pointer array space) as undefined to Valgrind > within PageRepairFragmentation(). +1

Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Tom Lane
Mark Dilger writes: >> On Mar 9, 2021, at 1:35 PM, Tom Lane wrote: >> So, to accept a patch that shortens the line pointer array, what we need >> to do is verify that every such code path checks for an out-of-range >> offset before trying to fetch the target line pointer. > Much as Pavan asked

Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Peter Geoghegan
On Tue, Mar 9, 2021 at 1:36 PM Tom Lane wrote: > > Matthias van de Meent writes: > > The only two existing mechanisms that I could find (in the access/heap > > directory) that possibly could fail on shrunken line pointer arrays; > > being xlog recovery (I do not have enough knowledge on recovery

Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Mark Dilger
> On Mar 9, 2021, at 1:35 PM, Tom Lane wrote: > > So, to accept a patch that shortens the line pointer array, what we need > to do is verify that every such code path checks for an out-of-range > offset before trying to fetch the target line pointer. I believed > back in 2007 that there

Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Tom Lane
Matthias van de Meent writes: > The only two existing mechanisms that I could find (in the access/heap > directory) that possibly could fail on shrunken line pointer arrays; > being xlog recovery (I do not have enough knowledge on recovery to > determine if that may touch pages that have shrunken

Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Peter Geoghegan
On Tue, Mar 9, 2021 at 7:13 AM Matthias van de Meent wrote: > The shrinking of the line pointer array is already common practice in > indexes (in which all LP_UNUSED items are removed), but this specific > implementation cannot be used for heap pages due to ItemId > invalidation. One available

Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Matthias van de Meent
On Tue, 9 Mar 2021 at 17:21, Mark Dilger wrote: > > For a prior discussion on this topic: > > https://www.postgresql.org/message-id/2e78013d0709130606l56539755wb9dbe17225ffe90a%40mail.gmail.com Thanks for the reference! I note that that thread mentions the old-style VACUUM FULL as a reason as to

Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Mark Dilger
> On Mar 9, 2021, at 7:13 AM, Matthias van de Meent > wrote: > > Hi, > > The heap AMs' pages only grow their pd_linp array, and never shrink > when trailing entries are marked unused. This means that up to 14% of > free space (=291 unused line pointers) on a page could be unusable for >

Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Matthias van de Meent
Hi, The heap AMs' pages only grow their pd_linp array, and never shrink when trailing entries are marked unused. This means that up to 14% of free space (=291 unused line pointers) on a page could be unusable for data storage, which I think is a shame. With a patch in the works that allows the
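The idea this thread revolves around, dropping trailing unused line pointers so that pd_lower can move back down, can be sketched roughly as follows. This is a simplified illustration and not the patch itself: `truncate_trailing_unused`, `sketch_pd_lower` and the `SK_` constants are hypothetical names, the page is reduced to a plain array of state flags, and a 24-byte page header with 4-byte line pointers is assumed. Unlike the variants discussed in the thread, this sketch is willing to shrink the array all the way to zero entries.

```c
#include <stddef.h>

/* Line pointer states; the numeric values are assumed for illustration. */
enum
{
    SK_LP_UNUSED = 0,
    SK_LP_NORMAL = 1,
    SK_LP_REDIRECT = 2,
    SK_LP_DEAD = 3
};

#define SK_PAGE_HEADER_SIZE 24  /* assumed bytes before the line pointer array */
#define SK_ITEMID_SIZE       4  /* assumed bytes per line pointer */

/*
 * Return the number of line pointers left after dropping every unused entry
 * at the END of the array. Unused entries in the middle must stay, because
 * removing them would renumber the items behind them and invalidate any
 * reference that addresses an item by its offset number.
 */
static size_t
truncate_trailing_unused(const unsigned char *flags, size_t nline)
{
    while (nline > 0 && flags[nline - 1] == SK_LP_UNUSED)
        nline--;
    return nline;
}

/* Where the line pointer array would end for a given number of entries. */
static size_t
sketch_pd_lower(size_t nline)
{
    return SK_PAGE_HEADER_SIZE + nline * SK_ITEMID_SIZE;
}
```

For an array whose states are normal, unused, dead, unused, unused, only the last two entries can be dropped: the unused entry in second position has to remain because a live entry follows it.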