On Sat, Dec 21, 2024 at 2:17 AM Masahiko Sawada <sawada.m...@gmail.com> wrote: > > On Fri, Dec 20, 2024 at 2:27 AM John Naylor <johncnaylo...@gmail.com> wrote:
> > v3-0001 allocates the iter data in the caller's context. It's a small > > patch, but still a core behavior change so proposed for master-only. I > > believe your v1 is still the least invasive fix for PG17. > > I agree to use v1 for v17. Okay, did you want to commit that separately, or together with my 0001 on master? Either way, I've put a bit more effort into the commit message in v4 and adjusted the comment slightly. > > It's trivial after 0001-02: 0003 removes makes one test use slab as > > the passed context (only 32-bit systems would actually use it > > currently). > > These changes make sense to me. Here are a few comments: > > RT_RADIX_TREE has 'leaf_context' but it seems that we use it only for > local memory. Do we want to declare only in the !RT_SHMEM case? That's already the case, if I understand your statement correctly. > --- > /* > * Similar to TidStoreCreateLocal() but create a shared TidStore on a > * DSA area. The TID storage will live in the DSA area, and the memory > * context rt_context will have only meta data of the radix tree. > * > * The returned object is allocated in backend-local memory. > */ > > We need to update the comment about rt_context too since we no longer > use rt_context for shared tidstore. Fixed. BTW, it seems TidStore.context is unused? -- John Naylor Amazon Web Services
From 679cf5c305b3f0affb5c4a679cf18eb0674e8691 Mon Sep 17 00:00:00 2001 From: John Naylor <john.nay...@postgresql.org> Date: Fri, 20 Dec 2024 14:48:24 +0700 Subject: [PATCH v4 3/3] Always use the caller-provided context for radix tree leaves Previously, it may not have worked for a caller to pass a slab context, since it would have been also used for other things which may have incompatible size. In an attempt to helpfully avoid wasting space due to aset's power-of-two rounding, RT_CREATE would create an additional slab context if the values were fixed-length and larger than pointer size. The problem was, we have since added the bump context type, and the generation context was a possibility as well, so silently overriding the caller's choice is not friendly. Commit XXXXXXXXX arranged so that the caller-provided context is only used for leaves and nothing else, so it's safe for the caller to use slab here if they wish. As demonstration, use slab in one of the radix tree regression tests. Reviewed by Masahiko Sawada --- src/include/lib/radixtree.h | 14 -------------- src/test/modules/test_radixtree/test_radixtree.c | 5 +++-- 2 files changed, 3 insertions(+), 16 deletions(-) diff --git a/src/include/lib/radixtree.h b/src/include/lib/radixtree.h index 97aff227c5..fc7474e0c7 100644 --- a/src/include/lib/radixtree.h +++ b/src/include/lib/radixtree.h @@ -1849,21 +1849,7 @@ RT_CREATE(MemoryContext ctx) size_class.allocsize); } - /* By default we use the passed context for leaves. */ tree->leaf_context = ctx; - -#ifndef RT_VARLEN_VALUE_SIZE - - /* - * For leaves storing fixed-length values, we use a slab context to avoid - * the possibility of space wastage by power-of-2 rounding up. - */ - if (sizeof(RT_VALUE_TYPE) > sizeof(RT_PTR_ALLOC)) - tree->leaf_context = SlabContextCreate(ctx, - RT_STR(RT_PREFIX) "_radix_tree leaf context", - RT_SLAB_BLOCK_SIZE(sizeof(RT_VALUE_TYPE)), - sizeof(RT_VALUE_TYPE)); -#endif /* !RT_VARLEN_VALUE_SIZE */ #endif /* RT_SHMEM */ /* add root node now so that RT_SET can assume it exists */ diff --git a/src/test/modules/test_radixtree/test_radixtree.c b/src/test/modules/test_radixtree/test_radixtree.c index 8074b83695..5c54962edf 100644 --- a/src/test/modules/test_radixtree/test_radixtree.c +++ b/src/test/modules/test_radixtree/test_radixtree.c @@ -313,9 +313,10 @@ test_random(void) #else MemoryContext radixtree_ctx; - radixtree_ctx = AllocSetContextCreate(CurrentMemoryContext, + radixtree_ctx = SlabContextCreate(CurrentMemoryContext, "test_radix_tree", - ALLOCSET_SMALL_SIZES); + SLAB_DEFAULT_BLOCK_SIZE, + sizeof(TestValueType)); radixtree = rt_create(radixtree_ctx); #endif -- 2.47.1
From 46056af82a7516250b5cdb817579aeeb3952b8de Mon Sep 17 00:00:00 2001 From: John Naylor <john.nay...@postgresql.org> Date: Sat, 21 Dec 2024 10:55:31 +0700 Subject: [PATCH v4 1/3] Get rid of radix tree's separate iterator context Instead, just use the caller's memory context. This relieves backends from having to create this context when they attach to a tree in shared memory, and delete when they detach. The iterator state is freed when ending iteration, and even if a developer failed to follow existing examples of iteration, the state struct is small enough to pose low risk. --- src/include/lib/radixtree.h | 15 +++------------ 1 file changed, 3 insertions(+), 12 deletions(-) diff --git a/src/include/lib/radixtree.h b/src/include/lib/radixtree.h index 1301f3fee4..ea20365a23 100644 --- a/src/include/lib/radixtree.h +++ b/src/include/lib/radixtree.h @@ -719,7 +719,6 @@ struct RT_RADIX_TREE /* leaf_context is used only for single-value leaves */ MemoryContextData *leaf_context; #endif - MemoryContextData *iter_context; }; /* @@ -1836,14 +1835,6 @@ RT_CREATE(MemoryContext ctx) tree = (RT_RADIX_TREE *) palloc0(sizeof(RT_RADIX_TREE)); tree->context = ctx; - /* - * Separate context for iteration in case the tree context doesn't support - * pfree - */ - tree->iter_context = AllocSetContextCreate(ctx, - RT_STR(RT_PREFIX) "_radix_tree iter context", - ALLOCSET_SMALL_SIZES); - #ifdef RT_SHMEM tree->dsa = dsa; dp = dsa_allocate0(dsa, sizeof(RT_RADIX_TREE_CONTROL)); @@ -2075,7 +2066,8 @@ RT_FREE(RT_RADIX_TREE * tree) /***************** ITERATION *****************/ /* - * Create and return the iterator for the given radix tree. + * Create and return an iterator for the given radix tree + * in the caller's memory context. * * Taking a lock in shared mode during the iteration is the caller's * responsibility. @@ -2086,8 +2078,7 @@ RT_BEGIN_ITERATE(RT_RADIX_TREE * tree) RT_ITER *iter; RT_CHILD_PTR root; - iter = (RT_ITER *) MemoryContextAllocZero(tree->iter_context, - sizeof(RT_ITER)); + iter = (RT_ITER *) palloc0(sizeof(RT_ITER)); iter->tree = tree; Assert(RT_PTR_ALLOC_IS_VALID(tree->ctl->root)); -- 2.47.1
From 4ca82e1dd6a59f5536ed92319c1d59edf84ad12d Mon Sep 17 00:00:00 2001 From: John Naylor <john.nay...@postgresql.org> Date: Fri, 20 Dec 2024 13:04:18 +0700 Subject: [PATCH v4 2/3] Get rid of radix tree's general purpose memory context Previously, this was notionally used only for the entry point of the tree and, for local memory, the (vestigial) control object. For local memory, leaves and nodes use the same passed context, but it's notionally different. However, for shared memory, the creator allocated the entry point in this context, but attaching backends didn't have access to this, so just used the caller's context. For the sake of consistency, now we allocate every instance of an entry point (and control object, if not in DSA) in the caller's context. For local memory, this makes the "leaf context" contain the child contexts for nodes, so it's a bit of a misnomer, but we may someday want to make node contexts independent rather than children, so leave it this way for now to avoid code churn. The memory context parameter for RT_CREATE is now unused in the case of shared memory, so remove it and adjust callers to match. Reviewed by Masahiko Sawada --- src/backend/access/common/tidstore.c | 21 ++++----- src/include/lib/radixtree.h | 35 ++++++--------- .../modules/test_radixtree/test_radixtree.c | 45 ++++++++----------- 3 files changed, 41 insertions(+), 60 deletions(-) diff --git a/src/backend/access/common/tidstore.c b/src/backend/access/common/tidstore.c index a7179759d6..6192ffd7ba 100644 --- a/src/backend/access/common/tidstore.c +++ b/src/backend/access/common/tidstore.c @@ -116,7 +116,10 @@ struct TidStore /* MemoryContext where the TidStore is allocated */ MemoryContext context; - /* MemoryContext that the radix tree uses */ + /* + * MemoryContext for the radix tree when using local memory, NULL for + * shared memory + */ MemoryContext rt_context; /* Storage for TIDs. Use either one depending on TidStoreIsShared() */ @@ -201,8 +204,7 @@ TidStoreCreateLocal(size_t max_bytes, bool insert_only) /* * Similar to TidStoreCreateLocal() but create a shared TidStore on a - * DSA area. The TID storage will live in the DSA area, and the memory - * context rt_context will have only meta data of the radix tree. + * DSA area. * * The returned object is allocated in backend-local memory. */ @@ -217,10 +219,6 @@ TidStoreCreateShared(size_t max_bytes, int tranche_id) ts = palloc0(sizeof(TidStore)); ts->context = CurrentMemoryContext; - ts->rt_context = AllocSetContextCreate(CurrentMemoryContext, - "TID storage meta data", - ALLOCSET_SMALL_SIZES); - /* * Choose the initial and maximum DSA segment sizes to be no longer than * 1/8 of max_bytes. @@ -235,8 +233,7 @@ TidStoreCreateShared(size_t max_bytes, int tranche_id) dsa_init_size = dsa_max_size; area = dsa_create_ext(tranche_id, dsa_init_size, dsa_max_size); - ts->tree.shared = shared_ts_create(ts->rt_context, area, - tranche_id); + ts->tree.shared = shared_ts_create(area, tranche_id); ts->area = area; return ts; @@ -328,13 +325,13 @@ TidStoreDestroy(TidStore *ts) if (TidStoreIsShared(ts)) { shared_ts_free(ts->tree.shared); - dsa_detach(ts->area); } else + { local_ts_free(ts->tree.local); - - MemoryContextDelete(ts->rt_context); + MemoryContextDelete(ts->rt_context); + } pfree(ts); } diff --git a/src/include/lib/radixtree.h b/src/include/lib/radixtree.h index ea20365a23..97aff227c5 100644 --- a/src/include/lib/radixtree.h +++ b/src/include/lib/radixtree.h @@ -275,7 +275,7 @@ typedef dsa_pointer RT_HANDLE; #endif #ifdef RT_SHMEM -RT_SCOPE RT_RADIX_TREE *RT_CREATE(MemoryContext ctx, dsa_area *dsa, int tranche_id); +RT_SCOPE RT_RADIX_TREE *RT_CREATE(dsa_area *dsa, int tranche_id); RT_SCOPE RT_RADIX_TREE *RT_ATTACH(dsa_area *dsa, dsa_pointer dp); RT_SCOPE void RT_DETACH(RT_RADIX_TREE * tree); RT_SCOPE RT_HANDLE RT_GET_HANDLE(RT_RADIX_TREE * tree); @@ -706,8 +706,6 @@ typedef struct RT_RADIX_TREE_CONTROL /* Entry point for allocating and accessing the tree */ struct RT_RADIX_TREE { - MemoryContext context; - /* pointing to either local memory or DSA */ RT_RADIX_TREE_CONTROL *ctl; @@ -1809,31 +1807,25 @@ have_slot: /***************** SETUP / TEARDOWN *****************/ /* - * Create the radix tree in the given memory context and return it. + * Create the radix tree root in the caller's current memory context and return it. * - * All local memory required for a radix tree is allocated in the given - * memory context and its children. Note that RT_FREE() will delete all - * allocated space within the given memory context, so the dsa_area should - * be created in a different context. + * The tree's nodes and leaves are allocated in "ctx" and its children for + * local memory, or in "dsa" for shared memory. */ RT_SCOPE RT_RADIX_TREE * #ifdef RT_SHMEM -RT_CREATE(MemoryContext ctx, dsa_area *dsa, int tranche_id) +RT_CREATE(dsa_area *dsa, int tranche_id) #else RT_CREATE(MemoryContext ctx) #endif { RT_RADIX_TREE *tree; - MemoryContext old_ctx; RT_CHILD_PTR rootnode; #ifdef RT_SHMEM dsa_pointer dp; #endif - old_ctx = MemoryContextSwitchTo(ctx); - tree = (RT_RADIX_TREE *) palloc0(sizeof(RT_RADIX_TREE)); - tree->context = ctx; #ifdef RT_SHMEM tree->dsa = dsa; @@ -1858,7 +1850,7 @@ RT_CREATE(MemoryContext ctx) } /* By default we use the passed context for leaves. */ - tree->leaf_context = tree->context; + tree->leaf_context = ctx; #ifndef RT_VARLEN_VALUE_SIZE @@ -1880,8 +1872,6 @@ RT_CREATE(MemoryContext ctx) tree->ctl->start_shift = 0; tree->ctl->max_val = RT_SHIFT_GET_MAX_VAL(0); - MemoryContextSwitchTo(old_ctx); - return tree; } @@ -2054,13 +2044,16 @@ RT_FREE(RT_RADIX_TREE * tree) */ tree->ctl->magic = 0; dsa_free(tree->dsa, tree->ctl->handle); -#endif - +#else /* - * Free all space allocated within the tree's context and delete all child + * Free all space allocated within the leaf context and delete all child * contexts such as those used for nodes. */ - MemoryContextReset(tree->context); + MemoryContextReset(tree->leaf_context); + + pfree(tree->ctl); +#endif + pfree(tree); } /***************** ITERATION *****************/ @@ -2674,7 +2667,7 @@ RT_MEMORY_USAGE(RT_RADIX_TREE * tree) Assert(tree->ctl->magic == RT_RADIX_TREE_MAGIC); total = dsa_get_total_size(tree->dsa); #else - total = MemoryContextMemAllocated(tree->context, true); + total = MemoryContextMemAllocated(tree->leaf_context, true); #endif return total; diff --git a/src/test/modules/test_radixtree/test_radixtree.c b/src/test/modules/test_radixtree/test_radixtree.c index 3e5aa3720c..8074b83695 100644 --- a/src/test/modules/test_radixtree/test_radixtree.c +++ b/src/test/modules/test_radixtree/test_radixtree.c @@ -120,25 +120,22 @@ PG_FUNCTION_INFO_V1(test_radixtree); static void test_empty(void) { - MemoryContext radixtree_ctx; rt_radix_tree *radixtree; rt_iter *iter; uint64 key; #ifdef TEST_SHARED_RT int tranche_id = LWLockNewTrancheId(); dsa_area *dsa; -#endif - - radixtree_ctx = AllocSetContextCreate(CurrentMemoryContext, - "test_radix_tree", - ALLOCSET_SMALL_SIZES); -#ifdef TEST_SHARED_RT LWLockRegisterTranche(tranche_id, "test_radix_tree"); dsa = dsa_create(tranche_id); - - radixtree = rt_create(radixtree_ctx, dsa, tranche_id); + radixtree = rt_create(dsa, tranche_id); #else + MemoryContext radixtree_ctx; + + radixtree_ctx = AllocSetContextCreate(CurrentMemoryContext, + "test_radix_tree", + ALLOCSET_SMALL_SIZES); radixtree = rt_create(radixtree_ctx); #endif @@ -165,7 +162,6 @@ test_empty(void) static void test_basic(rt_node_class_test_elem *test_info, int shift, bool asc) { - MemoryContext radixtree_ctx; rt_radix_tree *radixtree; rt_iter *iter; uint64 *keys; @@ -173,18 +169,16 @@ test_basic(rt_node_class_test_elem *test_info, int shift, bool asc) #ifdef TEST_SHARED_RT int tranche_id = LWLockNewTrancheId(); dsa_area *dsa; -#endif - radixtree_ctx = AllocSetContextCreate(CurrentMemoryContext, - "test_radix_tree", - ALLOCSET_SMALL_SIZES); - -#ifdef TEST_SHARED_RT LWLockRegisterTranche(tranche_id, "test_radix_tree"); dsa = dsa_create(tranche_id); - - radixtree = rt_create(radixtree_ctx, dsa, tranche_id); + radixtree = rt_create(dsa, tranche_id); #else + MemoryContext radixtree_ctx; + + radixtree_ctx = AllocSetContextCreate(CurrentMemoryContext, + "test_radix_tree", + ALLOCSET_SMALL_SIZES); radixtree = rt_create(radixtree_ctx); #endif @@ -300,7 +294,6 @@ key_cmp(const void *a, const void *b) static void test_random(void) { - MemoryContext radixtree_ctx; rt_radix_tree *radixtree; rt_iter *iter; pg_prng_state state; @@ -313,18 +306,16 @@ test_random(void) #ifdef TEST_SHARED_RT int tranche_id = LWLockNewTrancheId(); dsa_area *dsa; -#endif - radixtree_ctx = AllocSetContextCreate(CurrentMemoryContext, - "test_radix_tree", - ALLOCSET_SMALL_SIZES); - -#ifdef TEST_SHARED_RT LWLockRegisterTranche(tranche_id, "test_radix_tree"); dsa = dsa_create(tranche_id); - - radixtree = rt_create(radixtree_ctx, dsa, tranche_id); + radixtree = rt_create(dsa, tranche_id); #else + MemoryContext radixtree_ctx; + + radixtree_ctx = AllocSetContextCreate(CurrentMemoryContext, + "test_radix_tree", + ALLOCSET_SMALL_SIZES); radixtree = rt_create(radixtree_ctx); #endif -- 2.47.1