On Wed, Jan 17, 2024 at 11:37 AM John Naylor <johncnaylo...@gmail.com> wrote: > > On Wed, Jan 17, 2024 at 8:39 AM Masahiko Sawada <sawada.m...@gmail.com> wrote: > > > > On Wed, Jan 17, 2024 at 9:20 AM John Naylor <johncnaylo...@gmail.com> wrote: > > > > > > On Tue, Jan 16, 2024 at 1:18 PM Masahiko Sawada <sawada.m...@gmail.com> > > > wrote: > > > > Just changing "items" to be the local tidstore struct could make the > > > > code tricky a bit, since max_bytes and num_items are on the shared > > > > memory while "items" is a local pointer to the shared tidstore. > > > > > > Thanks for trying it this way! I like the overall simplification but > > > this aspect is not great. > > > Hmm, I wonder if that's a side-effect of the "create" functions doing > > > their own allocations and returning a pointer. Would it be less tricky > > > if the structs were declared where we need them and passed to "init" > > > functions? > > > > Seems worth trying. The current RT_CREATE() API is also convenient as > > other data structure such as simplehash.h and dshash.c supports a > > similar > > I don't happen to know if these paths had to solve similar trickiness > with some values being local, and some shared. > > > > That may be a good idea for other reasons. It's awkward that the > > > create function is declared like this: > > > > > > #ifdef RT_SHMEM > > > RT_SCOPE RT_RADIX_TREE *RT_CREATE(MemoryContext ctx, Size max_bytes, > > > dsa_area *dsa, > > > int tranche_id); > > > #else > > > RT_SCOPE RT_RADIX_TREE *RT_CREATE(MemoryContext ctx, Size max_bytes); > > > #endif > > > > > > An init function wouldn't need these parameters: it could look at the > > > passed struct to know what to do. > > > > But the init function would initialize leaf_ctx etc,no? Initializing > > leaf_ctx needs max_bytes that is not stored in RT_RADIX_TREE. > > I was more referring to the parameters that were different above > depending on shared memory. My first thought was that the tricky part > is because of the allocation in local memory, but it's certainly > possible I've misunderstood the problem. > > > The same > > is true for dsa. I imagined that an init function would allocate a DSA > > memory for the control object. > > Yes: > > ... > // embedded in VacDeadItems > TidStore items; > }; > > // NULL DSA in local case, etc > dead_items->items.area = dead_items_dsa; > dead_items->items.tranche_id = FOO_ID; > > TidStoreInit(&dead_items->items, vac_work_mem); > > That's how I imagined it would work (leaving out some details). I > haven't tried it, so not sure how much it helps. Maybe it has other > problems, but I'm hoping it's just a matter of programming.
It seems we cannot make this work nicely. IIUC VacDeadItems is allocated in DSM and TidStore is embedded there. However, dead_items->items.area is a local pointer to dsa_area. So we cannot include dsa_area in neither TidStore nor RT_RADIX_TREE. Instead we would need to pass dsa_area to each interface by callers. > > If we can't make this work nicely, I'd be okay with keeping the tid > store control object. My biggest concern is unnecessary > double-locking. If we don't do any locking stuff in radix tree APIs and it's the user's responsibility at all, probably we don't need a lock for tidstore? That is, we expose lock functions as you mentioned and the user (like tidstore) acquires/releases the lock before/after accessing the radix tree and num_items. Currently (as of v52 patch) RT_FIND is doing so, but we would need to change RT_SET() and iteration functions as well. During trying this idea, I realized that there is a visibility problem in the radix tree template especially if we want to embed the radix tree in a struct. Considering a use case where we want to use a radix tree in an exposed struct, we would declare only interfaces in a .h file and define actual implementation in a .c file (FYI TupleHashTableData does a similar thing with simplehash.h). The .c file and .h file would be like: in .h file: #define RT_PREFIX local_rt #define RT_SCOPE extern #define RT_DECLARE #define RT_VALUE_TYPE BlocktableEntry #define RT_VARLEN_VALUE #include "lib/radixtree.h" typedef struct TidStore { : local_rt_radix_tree tree; /* embedded */ : } TidStore; in .c file: #define RT_PREFIX local_rt #define RT_SCOPE extern #define RT_DEFINE #define RT_VALUE_TYPE BlocktableEntry #define RT_VARLEN_VALUE #include "lib/radixtree.h" But it doesn't work as the compiler doesn't know the actual definition of local_rt_radix_tree. If the 'tree' is *local_rt_radix_tree, it works. The reason is that with RT_DECLARE but without RT_DEFINE, the radix tree template generates only forward declarations: #ifdef RT_DECLARE typedef struct RT_RADIX_TREE RT_RADIX_TREE; typedef struct RT_ITER RT_ITER; In order to make it work, we need to move the definitions required to expose RT_RADIX_TREE struct to RT_DECLARE part, which actually requires to move RT_NODE, RT_HANDLE, RT_NODE_PTR, RT_SIZE_CLASS_COUNT, and RT_RADIX_TREE_CONTROL etc. However RT_SIZE_CLASS_COUNT, used in RT_RADIX_TREE, could be bothersome. Since it refers to RT_SIZE_CLASS_INFO that further refers to many #defines and structs, we might end up moving many structs such as RT_NODE_4 etc to RT_DECLARE part as well. Or we can use a fixed number is stead of "lengthof(RT_SIZE_CLASS_INFO)". Apart from that, macros requried by both RT_DECLARE and RT_DEFINE such as RT_PAN and RT_MAX_LEVEL also needs to be moved to a common place where they are defined in both cases. Given these facts, I think that the current abstraction works nicely and it would make sense not to support embedding the radix tree. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com