On 15 Mar 2026, at 16:27, Eli Britstein wrote:
> On 04/03/2026 15:55, Eelco Chaudron wrote: >> External email: Use caution opening links or attachments >> >> >> On 9 Feb 2026, at 14:29, Eli Britstein wrote: >> >>> From: Gaetan Rivet<[email protected]> >>> >>> Add a reference-counted key-value map. >> Thanks Gaetan for the patch. I did a detailed review and found some issues >> and comments below. The test cases could benefit from incremental >> verification >> after each operation (like test-cmap.c), testing duplicate key behavior, and >> iterator modification safety. > Hi Eli, Thanks for replying to the review. Find my comments below. Cheers, Eelco > In the check there is an incremental verification per step. Could you please > elaborate specifically what additional checks do you have in mind? Looking at test-cmap.c, check_cmap() does map state verification after every operation. It iterates through the entire map, collects all values, and verifies that exactly the expected entries exist. Your run_check() only does per-key checks during insert and batch verification afterward, but never verifies the full map state incrementally. Other gaps: 1. refmap_for_each() is never called, no iteration testing at all. This means you can't verify the map contains exactly the expected N entries at any point. 2. No check_refmap() helper (see above) that verifies full map state (like check_cmap does). Add one that uses refmap_for_each() to collect all entries and verify they match the expected, then call it after each ref/unref operation. 3. Iterator modification safety cannot be tested without #1. Need to test calling refmap_ref()/unref() from within refmap_for_each() callback. 4. refmap_value_refcount_read(), refmap_ref_value(), and refmap_key_from_value() are never tested. >> Cheers, >> >> Eelco >> >>> Duplicates take a reference on the original entry within the map, >>> leaving it in place. To be able to execute an entry creation after >>> determining whether it is already present or not in the map, >>> store relevant initialization and de-initialization functions. >> initialization should be US initialisation :) > There are "initialization" all over OVS code and specifically in other maps, > so it's aligned. Do you want to have new code like this and align the rest > (not in this commit)? You are right, I was wrong the US spelling is with the z, not the s ;) >>> Signed-off-by: Gaetan Rivet<[email protected]> >>> Co-authored-by: Eli Britstein<[email protected]> >>> Signed-off-by: Eli Britstein<[email protected]> [...] >>> + >>> +struct refmap * >>> +refmap_create(const char *name, >> refmap_create() allocates the struct, but other maps (hmap/cmap/smap) use >> xxx_init() where users allocate. Should this follow the standard pattern? > > While other maps xxx_init(&map) only "resets" <map> to a well known init > state, refmap_create takes more arguments. > > 1. We can still call it "refmap_init" (with the same argument list), but it's > still not fully aligned with other xxx_init() functions. IMO it should stay > refmap_create. > > 2. We can move the allocation/free of the struct from refmap module to the > user, as in other maps. > > WDYT? I guess we can either keep the design as is (which is fine by me) and use refmap_create(), or move to option 2. But in that case, this should be renamed to refmap_init. Ilya, any preference from your side? >>> + size_t key_size, >>> + size_t value_size, >>> + refmap_value_init value_init, >>> + refmap_value_uninit value_uninit, >>> + refmap_value_format value_format) [...] >>> +static void >>> +refmap_destroy__(struct refmap *rfm, bool global_destroy) >>> +{ >>> + bool leaks_detected = false; >>> + >>> + if (rfm == NULL) { >>> + return; >>> + } >>> + >>> + VLOG_DBG("%s: destroying the map", rfm->name); >> Should we RL all debug messages? Guess it should be ok as destroying remaps >> should not happen at a high rate. > It's only for debug, low rate, and we don't want to miss any of such messages. ACK, keep it as is. >>> + >>> + ovs_mutex_lock(&rfm->map_lock); >>> + if (!cmap_is_empty(&rfm->map)) { >>> + struct refmap_node *node; >>> + >>> + VLOG_WARN("%s: %s called with elements remaining in the map", >>> + rfm->name, __func__); >>> + leaks_detected = true; >>> + CMAP_FOR_EACH (node, map_node, &rfm->map) { >>> + /* No need to remove the node from the CMAP, it will >>> + * be destroyed immediately. */ >> Do we need to call value_uninit? > Those nodes were leaks in the refmap, at this point they are destroyed as OVS > is stopping, it is not possible to call value_uninit without crashing. But is this not only true for 'global_destroy == true'. What about the genuine refmap_destroy() call? If this remains the case, we should clearly add this to the API documentation. >>> + ovsrcu_postpone_inline(free, node, rcu_node); >>> + } >>> + } >>> + cmap_destroy(&rfm->map); >>> + ovs_mutex_unlock(&rfm->map_lock); >>> + >>> + ovs_mutex_destroy(&rfm->map_lock); >>> + free(rfm->name); >>> + free(rfm); >>> + >>> + /* During the very last stage of execution of RCU callbacks, >>> + * the VLOG subsystem has been disabled. All logs are thus muted. >>> + * If leaks are detected, abort the process, even though we were >>> + * exiting due to a fatal signal. The SIGABRT generated will still >>> + * be visible. */ >> Double space between new sentences. > Ack. >>> + if (global_destroy && leaks_detected) { >>> + ovs_abort(-1, "Refmap values leak detected."); >>> + } >>> +} >>> + >>> +void >>> +refmap_destroy(struct refmap *rfm) >>> +{ >>> + if (rfm == NULL) { [...] >>>>>> + >>> +/* Called once on a newly created 'value', i.e. when the first >>> + * reference is taken. */ >>> +typedef int (*refmap_value_init)(void *value, void *arg); >>> + >>> +/* Called once on the last dereference to value. */ >>> +typedef void (*refmap_value_uninit)(void *value); >>> + >> Note that value_uninit() for an old entry can execute before value_init() >> completes for a new entry with the same key, potentially causing resource >> conflicts if the value manages resources tied to the key. Should this be >> fixed or documented? > both value_init/uninit operate under the refmap map-lock. how could this > happen? You're right that both run under map_lock, but the race happens because the refcount drop in refmap_unref() occurs OUTSIDE the lock, before acquiring map_lock. This creates a window: Thread A: ovs_refcount_unref() drops to 0 → [waits for map_lock] Thread B: [acquires map_lock] → value_init() → insert → release Thread A: [acquires map_lock] → value_uninit() So Thread B's value_init() can complete before Thread A's value_uninit() executes. >>> +/* Format a (key, value, arg) tuple in 's'. */ >> The refmap_value_format comment should clarify it's for debug logging >> and that the callback is optional (can be NULL). > Ack. >>> +typedef struct ds *(*refmap_value_format)(struct ds *s, void *key, >>> + void *value, void *arg); [...] _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
