On Fri, Sep 23, 2022 at 12:11 AM John Naylor
<[email protected]> wrote:
>
>
> On Thu, Sep 22, 2022 at 11:46 AM John Naylor <[email protected]>
> wrote:
> > One thing I want to try soon is storing fewer than 16/32 etc entries, so
> > that the whole node fits comfortably inside a power-of-two allocation. That
> > would allow us to use aset without wasting space for the smaller nodes,
> > which would be faster and possibly would solve the fragmentation problem
> > Andres referred to in
>
> > https://www.postgresql.org/message-id/20220704220038.at2ane5xkymzzssb%40awork3.anarazel.de
>
> While calculating node sizes that fit within a power-of-two size, I noticed
> the current base node is a bit wasteful, taking up 8 bytes. The node kind
> only has a small number of values, so it doesn't really make sense to use an
> enum here in the struct (in fact, Andres' prototype used a uint8 for
> node_kind). We could use a bitfield for the count and kind:
>
> uint16 -- kind and count bitfield
> uint8 shift;
> uint8 chunk;
>
> That's only 4 bytes. Plus, if the kind is ever encoded in a pointer tag, the
> bitfield can just go back to being count only.
Good point, agreed.
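Something like the following, I imagine (just a sketch with field names I
made up, not code from the patch; with five node kinds we would need 3
bits for the kind, leaving 13 bits for the count):

typedef struct radix_tree_node
{
    uint16      kind:3,         /* which of the five node kinds */
                count:13;       /* number of children */
    uint8       shift;          /* key bits consumed at this level */
    uint8       chunk;          /* key byte stored in the parent node */
} radix_tree_node;              /* 4 bytes total */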
>
> Here are the v6 node kinds:
>
> node4: 8 + 4 +(4) + 4*8 = 48 bytes
> node16: 8 + 16 + 16*8 = 152
> node32: 8 + 32 + 32*8 = 296
> node128: 8 + 256 + 128/8 + 128*8 = 1304
> node256: 8 + 256/8 + 256*8 = 2088
>
> And here are the possible ways we could optimize nodes for space using aset
> allocation. Parentheses are padding bytes. Even if my math has mistakes, the
> numbers shouldn't be too far off:
>
> node3: 4 + 3 +(1) + 3*8 = 32 bytes
> node6: 4 + 6 +(6) + 6*8 = 64
> node13: 4 + 13 +(7) + 13*8 = 128
> node28: 4 + 28 + 28*8 = 256
> node31: 4 + 256 + 32/8 + 31*8 = 512 (XXX not good)
> node94: 4 + 256 + 96/8 + 94*8 = 1024
> node220: 4 + 256 + 224/8 + 220*8 = 2048
> node256: = 4096
>
> The main disadvantage is that node256 would balloon in size.
Yeah, node31 and node256 are bloated. We could probably use slab for
node256 independently. It's worth benchmarking to see how that affects
performance and the tree size.
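If I'm reading the math right, the proposed node3 would look something
like this (again only a sketch on top of the 4-byte base node above, with
made-up names):

typedef struct radix_tree_node_3
{
    radix_tree_node base;       /* 4 bytes */
    uint8       chunks[3];      /* 3 bytes, then 1 byte of padding */
    void       *children[3];    /* 8-byte aligned, 24 bytes */
} radix_tree_node_3;            /* 32 bytes, fits a power-of-two chunk */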
BTW, we need to consider not only aset/slab but also DSA, since in the
parallel vacuum case we allocate the dead tuple TIDs in DSM. FYI, DSA
uses the following size classes:
static const uint16 dsa_size_classes[] = {
    sizeof(dsa_area_span), 0,       /* special size classes */
    8, 16, 24, 32, 40, 48, 56, 64,  /* 8 classes separated by 8 bytes */
    80, 96, 112, 128,               /* 4 classes separated by 16 bytes */
    160, 192, 224, 256,             /* 4 classes separated by 32 bytes */
    320, 384, 448, 512,             /* 4 classes separated by 64 bytes */
    640, 768, 896, 1024,            /* 4 classes separated by 128 bytes */
    1280, 1560, 1816, 2048,         /* 4 classes separated by ~256 bytes */
    2616, 3120, 3640, 4096,         /* 4 classes separated by ~512 bytes */
    5456, 6552, 7280, 8192          /* 4 classes separated by ~1024 bytes */
};
The current 2088-byte node256 will be classed as 2616 bytes, which is
still not good.
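Just to double-check the rounding, here is a quick standalone calculation
(my own throwaway code; it only reimplements the class lookup using the
values from dsa.c and doesn't call into DSA itself):

#include <stdio.h>
#include <stdint.h>

/* the fixed size classes from dsa.c, minus the two special entries */
static const uint16_t size_classes[] = {
    8, 16, 24, 32, 40, 48, 56, 64,
    80, 96, 112, 128,
    160, 192, 224, 256,
    320, 384, 448, 512,
    640, 768, 896, 1024,
    1280, 1560, 1816, 2048,
    2616, 3120, 3640, 4096,
    5456, 6552, 7280, 8192
};

/* smallest class that can hold the request; 0 means larger than any class */
static unsigned
class_for(size_t size)
{
    for (size_t i = 0; i < sizeof(size_classes) / sizeof(size_classes[0]); i++)
    {
        if (size <= size_classes[i])
            return size_classes[i];
    }
    return 0;
}

int
main(void)
{
    printf("current node256 (2088 bytes) -> %u\n", class_for(2088));
    printf("power-of-two node256 (4096 bytes) -> %u\n", class_for(4096));
    return 0;
}

which prints 2616 and 4096 respectively, i.e. roughly 25% overhead for
the current node256 but an exact fit for the 4096-byte variant.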
Anyway, I'll implement DSA support for the radix tree.
Regards,
--
Masahiko Sawada
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com