Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-28 Thread John Naylor
> The fix is easy enough -- set the child pointer to null upon deletion, but I'm somewhat astonished that the regression tests didn't hit this. I do still intend to replace this code with something faster, but before I do so the tests should probably exercise the deletion paths more. Since VACUUM

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-28 Thread John Naylor
While creating a benchmark for inserting into node128-inner, I found a bug. If a caller deletes from a node128, the slot index is set to invalid, but the child pointer is still valid. Do that a few times, and every child pointer is valid, even if no slot index points to it. When the next inserter

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-25 Thread John Naylor
On Thu, Nov 24, 2022 at 9:54 PM Masahiko Sawada wrote: > > [v11] There is one more thing that just now occurred to me: In expanding the use of size classes, that makes rebasing and reworking the shared memory piece more work than it should be. That's important because there are still some open

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-25 Thread John Naylor
On Thu, Nov 24, 2022 at 9:54 PM Masahiko Sawada wrote: > > So it seems that there are two candidates of rt_node structure: (1) > all nodes except for node256 are variable-size nodes and use pointer > tagging, and (2) node32 and node128 are variable-sized nodes and do > not use pointer tagging

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-22 Thread Andres Freund
On 2022-11-21 17:06:56 +0900, Masahiko Sawada wrote: > Sure. I've attached the v10 patches. 0004 is the pure refactoring > patch and 0005 patch introduces the pointer tagging. This failed on cfbot, with som many crashes that the VM ran out of disk for core dumps. During testing with 32bit, so

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-21 Thread John Naylor
On Mon, Nov 21, 2022 at 3:43 PM Masahiko Sawada wrote: > > On Mon, Nov 21, 2022 at 4:20 PM John Naylor > wrote: > > Assuming the smallest node is fixed size (i.e. fanout/capacity member not part of the common set, so only part of variable-sized nodes), 3 has a nice property: no wasted padding

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-21 Thread Masahiko Sawada
On Mon, Nov 21, 2022 at 4:20 PM John Naylor wrote: > > > On Fri, Nov 18, 2022 at 2:48 PM I wrote: > > One issue with this patch: The "fanout" member is a uint8, so it can't hold > > 256 for the largest node kind. That's not an issue in practice, since we > > never need to grow it, and we only

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-20 Thread John Naylor
On Fri, Nov 18, 2022 at 2:48 PM I wrote: > One issue with this patch: The "fanout" member is a uint8, so it can't hold 256 for the largest node kind. That's not an issue in practice, since we never need to grow it, and we only compare that value with the count in an Assert(), so I just set it to

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-20 Thread John Naylor
On Fri, Nov 18, 2022 at 8:20 PM Masahiko Sawada wrote: > > On Thu, Nov 17, 2022 at 12:24 AM Masahiko Sawada wrote: > > > > On Wed, Nov 16, 2022 at 4:39 PM John Naylor > > wrote: > > > That means my idea for the pointer struct might have some problems, at least as currently implemented. Maybe

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-18 Thread John Naylor
On Fri, Nov 18, 2022 at 8:20 PM Masahiko Sawada wrote: > > FYI I've not tested the patch you shared today but here are the > benchmark results I did with the v9 patch in my environment (I used > the second filter). I splitted 0004 patch into two patches: a patch > for pure refactoring patch to

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-18 Thread Masahiko Sawada
On Thu, Nov 17, 2022 at 12:24 AM Masahiko Sawada wrote: > > On Wed, Nov 16, 2022 at 4:39 PM John Naylor > wrote: > > > > > > On Wed, Nov 16, 2022 at 12:33 PM Masahiko Sawada > > wrote: > > > > > > On Wed, Nov 16, 2022 at 1:46 PM John Naylor > > > wrote: > > > > > > > > > > > > On Tue, Nov 15,

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-17 Thread John Naylor
On Wed, Sep 28, 2022 at 1:18 PM I wrote: > Along those lines, one thing I've been thinking about is the number of size classes. There is a tradeoff between memory efficiency and number of branches when searching/inserting. My current thinking is there is too much coupling between size class and

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-16 Thread Masahiko Sawada
On Wed, Nov 16, 2022 at 4:39 PM John Naylor wrote: > > > On Wed, Nov 16, 2022 at 12:33 PM Masahiko Sawada > wrote: > > > > On Wed, Nov 16, 2022 at 1:46 PM John Naylor > > wrote: > > > > > > > > > On Tue, Nov 15, 2022 at 11:59 AM Masahiko Sawada > > > wrote: > > > > Thanks! Please let me know

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-15 Thread John Naylor
On Wed, Nov 16, 2022 at 12:33 PM Masahiko Sawada wrote: > > On Wed, Nov 16, 2022 at 1:46 PM John Naylor > wrote: > > > > > > On Tue, Nov 15, 2022 at 11:59 AM Masahiko Sawada wrote: > > > Thanks! Please let me know if there is something I can help with. > > > > I didn't get very far because the

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-15 Thread Masahiko Sawada
On Wed, Nov 16, 2022 at 2:17 PM John Naylor wrote: > > > > On Wed, Nov 16, 2022 at 11:46 AM John Naylor > wrote: > > > > > > On Tue, Nov 15, 2022 at 11:59 AM Masahiko Sawada > > wrote: > > > Thanks! Please let me know if there is something I can help with. > > > > I didn't get very far

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-15 Thread Masahiko Sawada
On Wed, Nov 16, 2022 at 1:46 PM John Naylor wrote: > > > On Tue, Nov 15, 2022 at 11:59 AM Masahiko Sawada > wrote: > > Thanks! Please let me know if there is something I can help with. > > I didn't get very far because the tests fail on 0004 in rt_verify_node: > > TRAP: failed

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-15 Thread John Naylor
On Wed, Nov 16, 2022 at 11:46 AM John Naylor wrote: > > > On Tue, Nov 15, 2022 at 11:59 AM Masahiko Sawada wrote: > > Thanks! Please let me know if there is something I can help with. > > I didn't get very far because the tests fail on 0004 in rt_verify_node: > > TRAP: failed

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-15 Thread John Naylor
On Tue, Nov 15, 2022 at 11:59 AM Masahiko Sawada wrote: > Thanks! Please let me know if there is something I can help with. I didn't get very far because the tests fail on 0004 in rt_verify_node: TRAP: failed Assert("n4->chunks[i - 1] < n4->chunks[i]"), File: "../src/backend/lib/radixtree.c",

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-14 Thread Masahiko Sawada
On Mon, Nov 14, 2022 at 10:00 PM John Naylor wrote: > > On Mon, Nov 14, 2022 at 3:44 PM Masahiko Sawada wrote: > > > > 0004 patch is a new patch supporting a pointer tagging of the node > > kind. Also, it introduces rt_node_ptr we discussed so that internal > > functions use it rather than

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-14 Thread John Naylor
On Mon, Nov 14, 2022 at 3:44 PM Masahiko Sawada wrote: > > 0004 patch is a new patch supporting a pointer tagging of the node > kind. Also, it introduces rt_node_ptr we discussed so that internal > functions use it rather than having two arguments for encoded and > decoded pointers. With this

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-08 Thread Peter Geoghegan
On Fri, Nov 4, 2022 at 8:25 AM Masahiko Sawada wrote: > For parallel heap pruning, multiple workers will insert key-value > pairs to the radix tree concurrently. The simplest solution would be a > single lock to protect writes but the performance will not be good. > Another solution would be that

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-08 Thread Masahiko Sawada
On Sat, Nov 5, 2022 at 6:23 PM John Naylor wrote: > > On Fri, Nov 4, 2022 at 10:25 PM Masahiko Sawada wrote: > > > > For parallel heap pruning, multiple workers will insert key-value > > pairs to the radix tree concurrently. The simplest solution would be a > > single lock to protect writes but

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-05 Thread John Naylor
On Fri, Nov 4, 2022 at 10:25 PM Masahiko Sawada wrote: > > For parallel heap pruning, multiple workers will insert key-value > pairs to the radix tree concurrently. The simplest solution would be a > single lock to protect writes but the performance will not be good. > Another solution would be

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-04 Thread Masahiko Sawada
On Thu, Nov 3, 2022 at 1:59 PM John Naylor wrote: > > On Mon, Oct 31, 2022 at 12:47 PM Masahiko Sawada > wrote: > > > > I've attached v8 patches. 0001, 0002, and 0003 patches incorporated > > the comments I got so far. 0004 patch is a DSA support patch for PoC. > > Thanks for the new patchset.

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-02 Thread John Naylor
On Mon, Oct 31, 2022 at 12:47 PM Masahiko Sawada wrote: > > I've attached v8 patches. 0001, 0002, and 0003 patches incorporated > the comments I got so far. 0004 patch is a DSA support patch for PoC. Thanks for the new patchset. This is not a full review, but I have some comments: 0001 and 0002

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-10-26 Thread John Naylor
On Thu, Oct 27, 2022 at 9:11 AM Masahiko Sawada wrote: > > True. I'm going to start with 6 bytes and will consider reducing it to > 5 bytes. Okay, let's plan on 6 for now, so we have the worst-case sizes up front. As discussed, I will attempt the size class decoupling after v8 and see how it

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-10-26 Thread Masahiko Sawada
On Wed, Oct 26, 2022 at 8:06 PM John Naylor wrote: > > On Mon, Oct 24, 2022 at 12:54 PM Masahiko Sawada > wrote: > > > I've attached updated PoC patches for discussion and cfbot. From the > > previous version, I mainly changed the following things: > > Thank you for the comments! > > *

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-10-26 Thread John Naylor
On Mon, Oct 24, 2022 at 12:54 PM Masahiko Sawada wrote: > I've attached updated PoC patches for discussion and cfbot. From the > previous version, I mainly changed the following things: > > * Separate treatment of inner and leaf nodes Overall, this looks much better! > * Pack both the node

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-10-14 Thread Masahiko Sawada
Hi, On Mon, Oct 10, 2022 at 2:16 PM John Naylor wrote: > > The following is not quite a full review, but has plenty to think about. > There is too much to cover at once, and I have to start somewhere... > > My main concerns are that internal APIs: > > 1. are difficult to follow > 2. lead to

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-10-09 Thread John Naylor
On Mon, Oct 10, 2022 at 12:16 PM John Naylor wrote: > Thanks for that! Now I can show clear results on some aspects in a simple way. The attached patches (apply on top of v6) Forgot the patchset... -- John Naylor EDB: http://www.enterprisedb.com radix-v6-addendum-jcn1.tar.gz Description:

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-10-09 Thread John Naylor
The following is not quite a full review, but has plenty to think about. There is too much to cover at once, and I have to start somewhere... My main concerns are that internal APIs: 1. are difficult to follow 2. lead to poor branch prediction and too many function calls Some of the

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-10-07 Thread Masahiko Sawada
On Fri, Oct 7, 2022 at 2:29 PM John Naylor wrote: > > On Fri, Sep 16, 2022 at 1:01 PM Masahiko Sawada wrote: > > In addition to two patches, I've attached the third patch. It's not > > part of radix tree implementation but introduces a contrib module > > bench_radix_tree, a tool for radix tree

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-10-06 Thread John Naylor
On Fri, Sep 16, 2022 at 1:01 PM Masahiko Sawada wrote: > In addition to two patches, I've attached the third patch. It's not > part of radix tree implementation but introduces a contrib module > bench_radix_tree, a tool for radix tree performance benchmarking. It > measures loading and lookup

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-10-06 Thread John Naylor
On Thu, Oct 6, 2022 at 2:53 PM Masahiko Sawada wrote: > > On Wed, Oct 5, 2022 at 6:40 PM John Naylor wrote: > > > > This wasn't the focus of your current email, but while experimenting with v6 I had another thought about local allocation: If we use the default slab block size of 8192 bytes, then

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-10-06 Thread Masahiko Sawada
On Wed, Oct 5, 2022 at 6:40 PM John Naylor wrote: > > > On Wed, Oct 5, 2022 at 1:46 PM Masahiko Sawada wrote: > > > > On Wed, Sep 28, 2022 at 12:49 PM Masahiko Sawada > > wrote: > > > > > > On Fri, Sep 23, 2022 at 12:11 AM John Naylor > > > wrote: > > > Yeah, node31 and node256 are bloated.

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-10-05 Thread John Naylor
On Wed, Oct 5, 2022 at 1:46 PM Masahiko Sawada wrote: > > On Wed, Sep 28, 2022 at 12:49 PM Masahiko Sawada wrote: > > > > On Fri, Sep 23, 2022 at 12:11 AM John Naylor > > wrote: > > Yeah, node31 and node256 are bloated. We probably could use slab for > > node256 independently. It's worth

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-10-05 Thread Masahiko Sawada
On Wed, Sep 28, 2022 at 12:49 PM Masahiko Sawada wrote: > > On Fri, Sep 23, 2022 at 12:11 AM John Naylor > wrote: > > > > > > On Thu, Sep 22, 2022 at 11:46 AM John Naylor > > wrote: > > > One thing I want to try soon is storing fewer than 16/32 etc entries, so > > > that the whole node fits

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-10-02 Thread Masahiko Sawada
On Mon, Oct 3, 2022 at 2:04 AM Andres Freund wrote: > > Hi, > > On 2022-09-16 15:00:31 +0900, Masahiko Sawada wrote: > > I've updated the radix tree patch. It's now separated into two patches. > > cfbot notices a compiler warning: >

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-10-02 Thread Andres Freund
Hi, On 2022-09-16 15:00:31 +0900, Masahiko Sawada wrote: > I've updated the radix tree patch. It's now separated into two patches. cfbot notices a compiler warning: https://cirrus-ci.com/task/6247907681632256?logs=gcc_warning#L446 [11:03:05.343] radixtree.c: In function ‘rt_iterate_next’:

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-30 Thread Masahiko Sawada
On Wed, Sep 28, 2022 at 3:18 PM John Naylor wrote: > > On Wed, Sep 28, 2022 at 10:49 AM Masahiko Sawada > wrote: > > > BTW We need to consider not only aset/slab but also DSA since we > > allocate dead tuple TIDs on DSM in parallel vacuum cases. FYI DSA uses > > the following size classes: > >

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-28 Thread John Naylor
On Wed, Sep 28, 2022 at 1:18 PM John Naylor wrote: > [stuff about size classes] I kind of buried the lede here on one thing: If we only have 4 kinds regardless of the number of size classes, we can use 2 bits of the pointer for dispatch, which would only require 4-byte alignment. That should

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-28 Thread John Naylor
On Wed, Sep 28, 2022 at 10:49 AM Masahiko Sawada wrote: > BTW We need to consider not only aset/slab but also DSA since we > allocate dead tuple TIDs on DSM in parallel vacuum cases. FYI DSA uses > the following size classes: > > static const uint16 dsa_size_classes[] = { > [...] Thanks for

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-27 Thread Masahiko Sawada
On Fri, Sep 23, 2022 at 12:11 AM John Naylor wrote: > > > On Thu, Sep 22, 2022 at 11:46 AM John Naylor > wrote: > > One thing I want to try soon is storing fewer than 16/32 etc entries, so > > that the whole node fits comfortably inside a power-of-two allocation. That > > would allow us to

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-22 Thread John Naylor
On Thu, Sep 22, 2022 at 11:46 AM John Naylor wrote: > One thing I want to try soon is storing fewer than 16/32 etc entries, so that the whole node fits comfortably inside a power-of-two allocation. That would allow us to use aset without wasting space for the smaller nodes, which would be faster

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-22 Thread John Naylor
On Thu, Sep 22, 2022 at 7:52 PM John Naylor wrote: > > > On Thu, Sep 22, 2022 at 1:26 PM Masahiko Sawada wrote: > > Good point. While keeping the chunks in the small nodes in sorted > > order is useful for visiting all keys in sorted order, additional > > branches and memmove calls could be

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-22 Thread John Naylor
On Thu, Sep 22, 2022 at 1:26 PM Masahiko Sawada wrote: > > On Thu, Sep 22, 2022 at 1:46 PM John Naylor > wrote: > > While on the subject, I wonder how important it is to keep the chunks in the small nodes in sorted order. That adds branches and memmove calls, and is the whole reason for the

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-22 Thread Masahiko Sawada
On Thu, Sep 22, 2022 at 1:46 PM John Naylor wrote: > > > On Thu, Sep 22, 2022 at 1:01 AM Nathan Bossart > wrote: > > > > On Wed, Sep 21, 2022 at 01:17:21PM +0700, John Naylor wrote: > > > > > In short, this code needs to be lower level so that we still have full > > > control while being

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-21 Thread John Naylor
On Thu, Sep 22, 2022 at 1:01 AM Nathan Bossart wrote: > > On Wed, Sep 21, 2022 at 01:17:21PM +0700, John Naylor wrote: > > > In short, this code needs to be lower level so that we still have full > > control while being portable. I will work on this, and also the related > > code for node

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-21 Thread Nathan Bossart
On Wed, Sep 21, 2022 at 01:17:21PM +0700, John Naylor wrote: > In trying to wrap the SIMD code behind layers of abstraction, the latest > patch (and Nathan's cleanup) threw it away in almost all cases. To explain, > we need to talk about how vectorized code deals with the "tail" that is too >

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-21 Thread John Naylor
On Tue, Sep 20, 2022 at 3:19 PM Masahiko Sawada wrote: > > On Fri, Sep 16, 2022 at 4:54 PM John Naylor > wrote: > > Here again, I'd rather put this off and focus on getting the "large > > details" in good enough shape so we can got towards integrating with > > vacuum. > > Thank you for the

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-20 Thread Masahiko Sawada
On Fri, Sep 16, 2022 at 4:54 PM John Naylor wrote: > > On Fri, Sep 16, 2022 at 1:01 PM Masahiko Sawada wrote: > > > > On Mon, Aug 15, 2022 at 10:39 PM John Naylor > > wrote: > > > > > > bool, buth = and <=. Should be pretty close. Also, i believe if you > > > left this for last as a possible

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-17 Thread Nathan Bossart
On Fri, Sep 16, 2022 at 02:54:14PM +0700, John Naylor wrote: > Here again, I'd rather put this off and focus on getting the "large > details" in good enough shape so we can got towards integrating with > vacuum. I started a new thread for the SIMD patch [0] so that this thread can remain focused

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-16 Thread John Naylor
On Fri, Sep 16, 2022 at 1:01 PM Masahiko Sawada wrote: > > On Mon, Aug 15, 2022 at 10:39 PM John Naylor > wrote: > > > > bool, buth = and <=. Should be pretty close. Also, i believe if you > > left this for last as a possible refactoring, it might save some work. v6 demonstrates why this should

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-16 Thread Masahiko Sawada
On Mon, Aug 15, 2022 at 10:39 PM John Naylor wrote: > > On Mon, Aug 15, 2022 at 12:39 PM Masahiko Sawada > wrote: > > > > On Fri, Jul 22, 2022 at 10:43 AM Masahiko Sawada > > wrote: > > > > > > On Tue, Jul 19, 2022 at 1:30 PM John Naylor > > > wrote: > > > > > > > > > > > > > > > > On Tue,

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-08-15 Thread John Naylor
On Mon, Aug 15, 2022 at 12:39 PM Masahiko Sawada wrote: > > On Fri, Jul 22, 2022 at 10:43 AM Masahiko Sawada > wrote: > > > > On Tue, Jul 19, 2022 at 1:30 PM John Naylor > > wrote: > > > > > > > > > > > > On Tue, Jul 19, 2022 at 9:11 AM Masahiko Sawada > > > wrote: > > > > > > > I’d like to

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-08-14 Thread Masahiko Sawada
On Fri, Jul 22, 2022 at 10:43 AM Masahiko Sawada wrote: > > On Tue, Jul 19, 2022 at 1:30 PM John Naylor > wrote: > > > > > > > > On Tue, Jul 19, 2022 at 9:11 AM Masahiko Sawada > > wrote: > > > > > I’d like to keep the first version simple. We can improve it and add > > > more optimizations

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-21 Thread Masahiko Sawada
On Tue, Jul 19, 2022 at 1:30 PM John Naylor wrote: > > > > On Tue, Jul 19, 2022 at 9:11 AM Masahiko Sawada wrote: > > > I’d like to keep the first version simple. We can improve it and add > > more optimizations later. Using radix tree for vacuum TID storage > > would still be a big win

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-18 Thread John Naylor
On Tue, Jul 19, 2022 at 9:11 AM Masahiko Sawada wrote: > I’d like to keep the first version simple. We can improve it and add > more optimizations later. Using radix tree for vacuum TID storage > would still be a big win comparing to using a flat array, even without > all these optimizations. In

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-18 Thread Peter Geoghegan
On Mon, Jul 18, 2022 at 9:10 PM John Naylor wrote: > On Tue, Jul 19, 2022 at 9:24 AM Andres Freund wrote: > > FWIW, I think the best path forward would be to do something similar to the > > simplehash.h approach, so it can be customized to the specific user. > > I figured that would come up at

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-18 Thread John Naylor
On Tue, Jul 19, 2022 at 9:24 AM Andres Freund wrote: > FWIW, I think the best path forward would be to do something similar to the > simplehash.h approach, so it can be customized to the specific user. I figured that would come up at some point. It may be worth doing in the future, but I think

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-18 Thread Andres Freund
Hi, On 2022-07-08 11:09:44 +0900, Masahiko Sawada wrote: > I think that at this stage it's better to define the design first. For > example, key size and value size, and these sizes are fixed or can be > set the arbitary size? Given the use case of buffer mapping, we would > need a wider key to

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-18 Thread Masahiko Sawada
On Thu, Jul 14, 2022 at 1:17 PM John Naylor wrote: > > On Tue, Jul 12, 2022 at 8:16 AM Masahiko Sawada wrote: > > > > > I think that at this stage it's better to define the design first. For > > > > example, key size and value size, and these sizes are fixed or can be > > > > set the arbitary

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-13 Thread John Naylor
On Tue, Jul 12, 2022 at 8:16 AM Masahiko Sawada wrote: > > > I think that at this stage it's better to define the design first. For > > > example, key size and value size, and these sizes are fixed or can be > > > set the arbitary size? > > > > I don't think we need to start over. Andres'

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-11 Thread Masahiko Sawada
On Fri, Jul 8, 2022 at 3:43 PM John Naylor wrote: > > On Fri, Jul 8, 2022 at 9:10 AM Masahiko Sawada wrote: > > > I guess that the tree height is affected by where garbages are, right? > > For example, even if all garbage in the table is concentrated in > > 0.5GB, if they exist between 2^17 and

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-08 Thread John Naylor
On Fri, Jul 8, 2022 at 9:10 AM Masahiko Sawada wrote: > I guess that the tree height is affected by where garbages are, right? > For example, even if all garbage in the table is concentrated in > 0.5GB, if they exist between 2^17 and 2^18 block, we use the first > byte of blockhi. If the table

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-07 Thread Masahiko Sawada
On Tue, Jul 5, 2022 at 5:49 PM John Naylor wrote: > > On Mon, Jul 4, 2022 at 12:07 PM Masahiko Sawada wrote: > > > > Looking at the node stats, and then your benchmark code, I think key > > > construction is a major influence, maybe more than node type. The > > > key/value scheme tested now

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-06 Thread Masahiko Sawada
On Tue, Jul 5, 2022 at 5:09 PM Andres Freund wrote: > > Hi, > > On 2022-07-05 16:33:17 +0900, Masahiko Sawada wrote: > > On Tue, Jul 5, 2022 at 6:18 AM Andres Freund wrote: > > A datum value is convenient to represent both a pointer and a value so > > I used it to avoid defining node types for

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-05 Thread John Naylor
On Mon, Jul 4, 2022 at 12:07 PM Masahiko Sawada wrote: > > Looking at the node stats, and then your benchmark code, I think key > > construction is a major influence, maybe more than node type. The > > key/value scheme tested now makes sense: > > > > blockhi || blocklo || 9 bits of item offset >

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-05 Thread Andres Freund
Hi, On 2022-07-05 16:33:29 +0900, Masahiko Sawada wrote: > > One thing I was wondering about is trying to choose node types in > > roughly-power-of-two struct sizes. It's pretty easy to end up with > > significant > > fragmentation in the slabs right now when inserting as you go, because some >

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-05 Thread Andres Freund
Hi, On 2022-07-05 16:33:17 +0900, Masahiko Sawada wrote: > On Tue, Jul 5, 2022 at 6:18 AM Andres Freund wrote: > A datum value is convenient to represent both a pointer and a value so > I used it to avoid defining node types for inner and leaf nodes > separately. I'm not convinced that's a good

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-05 Thread Masahiko Sawada
On Tue, Jul 5, 2022 at 7:00 AM Andres Freund wrote: > > Hi, > > On 2022-06-28 15:24:11 +0900, Masahiko Sawada wrote: > > In both test cases, There is not much difference between using AVX2 > > and SSE2. The more mode types, the more time it takes for loading the > > data (see

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-05 Thread Masahiko Sawada
On Tue, Jul 5, 2022 at 6:18 AM Andres Freund wrote: > > Hi, > > On 2022-06-16 13:56:55 +0900, Masahiko Sawada wrote: > > diff --git a/src/backend/lib/radixtree.c b/src/backend/lib/radixtree.c > > new file mode 100644 > > index 00..bf87f932fd > > --- /dev/null > > +++

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-04 Thread Masahiko Sawada
On Mon, Jul 4, 2022 at 2:07 PM Masahiko Sawada wrote: > > On Tue, Jun 28, 2022 at 10:10 PM John Naylor > wrote: > > > > On Tue, Jun 28, 2022 at 1:24 PM Masahiko Sawada > > wrote: > > > > > > > I > > > > suspect other optimizations would be worth a lot more than using AVX2: > > > > - collapsing

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-04 Thread Andres Freund
Hi, On 2022-06-28 15:24:11 +0900, Masahiko Sawada wrote: > In both test cases, There is not much difference between using AVX2 > and SSE2. The more mode types, the more time it takes for loading the > data (see sse2_4_16_32_128_256). Yea, at some point the compiler starts using a jump table

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-04 Thread Andres Freund
Hi, On 2022-06-16 13:56:55 +0900, Masahiko Sawada wrote: > diff --git a/src/backend/lib/radixtree.c b/src/backend/lib/radixtree.c > new file mode 100644 > index 00..bf87f932fd > --- /dev/null > +++ b/src/backend/lib/radixtree.c > @@ -0,0 +1,1763 @@ >

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-04 Thread Andres Freund
Hi, I just noticed that I had a reply forgotten in drafts... On 2022-05-10 10:51:46 +0900, Masahiko Sawada wrote: > To move this project forward, I've implemented radix tree > implementation from scratch while studying Andres's implementation. It > supports insertion, search, and iteration but

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-03 Thread Masahiko Sawada
On Tue, Jun 28, 2022 at 10:10 PM John Naylor wrote: > > On Tue, Jun 28, 2022 at 1:24 PM Masahiko Sawada wrote: > > > > > I > > > suspect other optimizations would be worth a lot more than using AVX2: > > > - collapsing inner nodes > > > - taking care when constructing the key (more on this when

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-06-28 Thread John Naylor
On Tue, Jun 28, 2022 at 1:24 PM Masahiko Sawada wrote: > > > I > > suspect other optimizations would be worth a lot more than using AVX2: > > - collapsing inner nodes > > - taking care when constructing the key (more on this when we > > integrate with VACUUM) > > ...and a couple Andres mentioned:

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-06-28 Thread Masahiko Sawada
Hi, On Mon, Jun 27, 2022 at 8:12 PM John Naylor wrote: > > On Mon, Jun 20, 2022 at 7:57 AM Masahiko Sawada wrote: > > [v3 patch] > > Hi Masahiko, > > Since there are new files, and they are pretty large, I've attached > most specific review comments and questions as a diff rather than in > the

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-06-27 Thread Andres Freund
Hi, On 2022-06-28 11:17:42 +0700, John Naylor wrote: > On Mon, Jun 27, 2022 at 10:23 PM Hannu Krosing wrote: > > > > > Another thought: for non-x86 platforms, the SIMD nodes degenerate to > > > "simple loop", and looping over up to 32 elements is not great > > > (although possibly okay). We

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-06-27 Thread John Naylor
On Mon, Jun 27, 2022 at 10:23 PM Hannu Krosing wrote: > > > Another thought: for non-x86 platforms, the SIMD nodes degenerate to > > "simple loop", and looping over up to 32 elements is not great > > (although possibly okay). We could do binary search, but that has bad > > branch prediction. > >

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-06-27 Thread Andres Freund
Hi, On 2022-06-27 18:12:13 +0700, John Naylor wrote: > Another thought: for non-x86 platforms, the SIMD nodes degenerate to > "simple loop", and looping over up to 32 elements is not great > (although possibly okay). We could do binary search, but that has bad > branch prediction. I'd be quite

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-06-27 Thread Hannu Krosing
> Another thought: for non-x86 platforms, the SIMD nodes degenerate to > "simple loop", and looping over up to 32 elements is not great > (although possibly okay). We could do binary search, but that has bad > branch prediction. I am not sure that for relevant non-x86 platforms SIMD / vector

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-06-27 Thread John Naylor
On Mon, Jun 20, 2022 at 7:57 AM Masahiko Sawada wrote: [v3 patch] Hi Masahiko, Since there are new files, and they are pretty large, I've attached most specific review comments and questions as a diff rather than in the email body. This is not a full review, which will take more time -- this

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-06-19 Thread Masahiko Sawada
Hi, On Thu, Jun 16, 2022 at 4:30 PM John Naylor wrote: > > On Thu, Jun 16, 2022 at 11:57 AM Masahiko Sawada > wrote: > > I've attached an updated version patch that changes the configure > > script. I'm still studying how to support AVX2 on msvc build. Also, > > added more regression tests. >

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-06-16 Thread Andrew Dunstan
On 2022-06-16 Th 00:56, Masahiko Sawada wrote: > > I've attached an updated version patch that changes the configure > script. I'm still studying how to support AVX2 on msvc build. Also, > added more regression tests. I think you would need to add '/arch:AVX2' to the compiler flags in

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-06-16 Thread John Naylor
On Thu, Jun 16, 2022 at 11:57 AM Masahiko Sawada wrote: > I've attached an updated version patch that changes the configure > script. I'm still studying how to support AVX2 on msvc build. Also, > added more regression tests. Thanks for the update, I will take a closer look at the patch in the

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-06-15 Thread Masahiko Sawada
On Wed, May 25, 2022 at 11:48 AM Masahiko Sawada wrote: > > On Tue, May 10, 2022 at 6:58 PM John Naylor > wrote: > > > > On Tue, May 10, 2022 at 8:52 AM Masahiko Sawada > > wrote: > > > > > > Overall, radix tree implementations have good numbers. Once we got an > > > agreement on moving in

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-05-24 Thread Masahiko Sawada
On Tue, May 10, 2022 at 6:58 PM John Naylor wrote: > > On Tue, May 10, 2022 at 8:52 AM Masahiko Sawada wrote: > > > > Overall, radix tree implementations have good numbers. Once we got an > > agreement on moving in this direction, I'll start a new thread for > > that and move the implementation

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-05-10 Thread John Naylor
On Tue, May 10, 2022 at 8:52 AM Masahiko Sawada wrote: > > Overall, radix tree implementations have good numbers. Once we got an > agreement on moving in this direction, I'll start a new thread for > that and move the implementation further; there are many things to do > and discuss: deletion,

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-05-09 Thread Masahiko Sawada
Hi, On Sun, Feb 13, 2022 at 12:39 PM Andres Freund wrote: > > On 2022-02-13 12:36:13 +0900, Masahiko Sawada wrote: > > Actually, I'm working on simplifying and improving radix tree > > implementation for PG16 dev cycle. From the discussion so far I think > > it's better to have a data structure

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-02-12 Thread Andres Freund
On 2022-02-13 12:36:13 +0900, Masahiko Sawada wrote: > Actually, I'm working on simplifying and improving radix tree > implementation for PG16 dev cycle. From the discussion so far I think > it's better to have a data structure that can be used for > general-purpose and is also good for storing

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-02-12 Thread Masahiko Sawada
On Sun, Feb 13, 2022 at 11:02 AM Andres Freund wrote: > > Hi, > > On 2022-02-11 13:47:01 +0100, Matthias van de Meent wrote: > > Today I noticed the inefficiencies of our dead tuple storage once > > again, and started theorizing about a better storage method; which is > > when I remembered that

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-02-12 Thread Andres Freund
Hi, On 2022-02-11 13:47:01 +0100, Matthias van de Meent wrote: > Today I noticed the inefficiencies of our dead tuple storage once > again, and started theorizing about a better storage method; which is > when I remembered that this thread exists, and that this thread > already has amazing

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-02-11 Thread Matthias van de Meent
Hi, Today I noticed the inefficiencies of our dead tuple storage once again, and started theorizing about a better storage method; which is when I remembered that this thread exists, and that this thread already has amazing results. Are there any plans to get the results of this thread from PoC

Re: [PoC] Improve dead tuple storage for lazy vacuum

2021-07-30 Thread Robert Haas
On Fri, Jul 30, 2021 at 3:34 PM Andres Freund wrote: > The lower memory usage also often will result in a better cache > utilization - which is a crucial factor for index vacuuming when the > index order isn't correlated with the heap order. Cache misses really > are a crucial performance factor

Re: [PoC] Improve dead tuple storage for lazy vacuum

2021-07-30 Thread Andres Freund
Hi, On 2021-07-30 15:13:49 -0400, Robert Haas wrote: > On Thu, Jul 29, 2021 at 3:14 PM Andres Freund wrote: > > I think those advantages are far outstripped by the big disadvantage of > > needing to either size the array accurately from the start, or to > > reallocate the whole array. Our

Re: [PoC] Improve dead tuple storage for lazy vacuum

2021-07-30 Thread Robert Haas
On Thu, Jul 29, 2021 at 3:14 PM Andres Freund wrote: > I think those advantages are far outstripped by the big disadvantage of > needing to either size the array accurately from the start, or to > reallocate the whole array. Our current pre-allocation behaviour is > very wasteful for most

Re: [PoC] Improve dead tuple storage for lazy vacuum

2021-07-29 Thread Andres Freund
Hi, On 2021-07-29 13:15:53 -0400, Robert Haas wrote: > I don't know if this is better, but I do kind of like the fact that > the basic representation is just an array. It makes it really easy to > predict how much memory will be needed for a given number of dead > TIDs, and it's very DSM-friendly

Re: [PoC] Improve dead tuple storage for lazy vacuum

2021-07-29 Thread Yura Sokolov
Robert Haas писал 2021-07-29 20:15: On Thu, Jul 29, 2021 at 5:11 AM Masahiko Sawada wrote: Indeed. Given that the radix tree itself has other use cases, I have no concern about using radix tree for vacuum's dead tuples storage. It will be better to have one that can be generally used and has

<    1   2   3   4   5   >