Hello again, Here find the attached v2, that contains everything from v1 + expanded comment to (hopefully) fully explain what is going on.
Let me know, if this is it. Regards, Milos On Sat, Mar 28, 2026 at 5:57 PM Samuel Thibault <[email protected]> wrote: > > Milos Nikic, le sam. 28 mars 2026 17:01:50 -0700, a ecrit: > > I understand the above comment, and I fully agree that we shouldn't > > use two for loops for obvious reasons. > > > > However, the structure presented in my patch is not a regular nested > > loop. It uses a C macro idiom where the outer loop does not actually > > iterate; > > it acts solely as a scoped variable cache with a built-in termination > > trigger. > > Then the comment should be updated. > > Samuel > > > > If a user calls break; inside the iterator block: > > - It escapes the inner for loop. > > - Control immediately passes to the outer loop's increment > > statement, which executes: _ihash_ht_##val = NULL; > > - The outer loop checks its condition (_ihash_ht_##val != NULL), > > which evaluates to false. > > - The entire macro terminates completely. > > > > Because of this NULL assignment in the increment block, the > > double-loop behaves identically to a single loop regarding scope, > > break, and continue, while successfully solving the > > multiple-evaluation hazard. > > > > To verify this behavior in practice, I've attached a small, non-merge > > demonstration patch. If you apply this on top of my original patch, > > you will see that whenever you call sync in a shell, you get: > > > > We have 1 of executions. > > > > printed to the Mach console. > > If the break didn't work as expected, the exit(1) would trigger and > > the translator wouldn't even survive boot! > > Let me know what you think, or if anything else needs refinement. > > > > Thanks, > > Milos > > > > On Sat, Mar 28, 2026 at 3:44 PM Samuel Thibault <[email protected]> > > wrote: > > > > > > Hello, > > > > > > Milos Nikic, le ven. 27 mars 2026 22:04:22 -0700, a ecrit: > > > > In HURD_IHASH_ITERATE and HURD_IHASH_ITERATE_ITEMS, the hash table > > > > argument (ht) was evaluated multiple times during loop initialization > > > > and condition checking. If a caller passed an expression with side > > > > effects (e.g., a function call returning a table pointer), it would > > > > be executed repeatedly per loop, leading to performance degradation > > > > or unintended behavior. > > > > > > > > Fix this by wrapping the iterators in an outer `for` loop that > > > > evaluates and caches `(ht)` exactly once using `__typeof__`. > > > > > > Please read the comment above: > > > > > > We can not use two for-loops, because we want a break statement inside > > > the iterator block to terminate the operation. > > > > > > Samuel > > > > > > > Token > > > > pasting (##) is used to generate a unique cache variable name based > > > > on the caller's item identifier, safely allowing these macros to be > > > > nested without variable shadowing or naming collisions. > > > > --- > > > > libihash/ihash.h | 46 +++++++++++++++++++++++++++++----------------- > > > > 1 file changed, 29 insertions(+), 17 deletions(-) > > > > > > > > diff --git a/libihash/ihash.h b/libihash/ihash.h > > > > index 80679f14..5f73f3e1 100644 > > > > --- a/libihash/ihash.h > > > > +++ b/libihash/ihash.h > > > > @@ -322,17 +322,22 @@ hurd_ihash_value_t hurd_ihash_locp_find > > > > (hurd_ihash_t ht, > > > > pointer pointed to must not have an influence on the condition > > > > result, so the comma operator is used to make sure this > > > > subexpression is always true). */ > > > > -#define HURD_IHASH_ITERATE(ht, val) \ > > > > - for (hurd_ihash_value_t val, > > > > \ > > > > - *_hurd_ihash_valuep = (ht)->size ? &(ht)->items[0].value : 0; > > > > \ > > > > - (ht)->size \ > > > > - && (size_t) ((_hurd_ihash_item_t) _hurd_ihash_valuep \ > > > > - - &(ht)->items[0]) \ > > > > - < (ht)->size \ > > > > - && (val = *_hurd_ihash_valuep, 1); \ > > > > - _hurd_ihash_valuep = (hurd_ihash_value_t *) \ > > > > - (((_hurd_ihash_item_t) _hurd_ihash_valuep) + 1)) \ > > > > - if (val != _HURD_IHASH_EMPTY && val != _HURD_IHASH_DELETED) > > > > +#define HURD_IHASH_ITERATE(ht, val) > > > > \ > > > > + /* Cache 'ht' to prevent multiple evaluation of functions. */ > > > > \ > > > > + for (__typeof__(ht) _ihash_ht_##val = (ht); > > > > \ > > > > + _ihash_ht_##val != NULL; > > > > \ > > > > + _ihash_ht_##val = NULL) > > > > \ > > > > + for (hurd_ihash_value_t val, > > > > \ > > > > + *_ihash_p_##val = _ihash_ht_##val->size > > > > \ > > > > + ? &_ihash_ht_##val->items[0].value : 0; > > > > \ > > > > + _ihash_ht_##val->size > > > > \ > > > > + && (size_t) ((_hurd_ihash_item_t) _ihash_p_##val > > > > \ > > > > + - &_ihash_ht_##val->items[0]) > > > > \ > > > > + < _ihash_ht_##val->size > > > > \ > > > > + && (((val) = *_ihash_p_##val), 1); > > > > \ > > > > + _ihash_p_##val = (hurd_ihash_value_t *) > > > > \ > > > > + (((_hurd_ihash_item_t) _ihash_p_##val) + 1)) > > > > \ > > > > + if ((val) != _HURD_IHASH_EMPTY && (val) != _HURD_IHASH_DELETED) > > > > > > > > /* Iterate over all elements in the hash table making both the key and > > > > the value available. You use this macro with a block, for example > > > > @@ -344,12 +349,19 @@ hurd_ihash_value_t hurd_ihash_locp_find > > > > (hurd_ihash_t ht, > > > > The block will be run for every element in the hash table HT. The > > > > key and value of the current element is available as ITEM->key and > > > > ITEM->value. */ > > > > -#define HURD_IHASH_ITERATE_ITEMS(ht, item) > > > > \ > > > > - for (_hurd_ihash_item_t item = (ht)->size? &(ht)->items[0]: 0; \ > > > > - (ht)->size && item - &(ht)->items[0] < (ht)->size; > > > > \ > > > > - item++) > > > > \ > > > > - if (item->value != _HURD_IHASH_EMPTY && > > > > \ > > > > - item->value != _HURD_IHASH_DELETED) > > > > +#define HURD_IHASH_ITERATE_ITEMS(ht, item) > > > > \ > > > > + /* Cache 'ht' to prevent multiple evaluation of functions. */ > > > > \ > > > > + for (__typeof__(ht) _ihash_items_ht_##item = (ht); > > > > \ > > > > + _ihash_items_ht_##item != NULL; > > > > \ > > > > + _ihash_items_ht_##item = NULL) > > > > \ > > > > + for (_hurd_ihash_item_t item = _ihash_items_ht_##item->size > > > > \ > > > > + ? &_ihash_items_ht_##item->items[0] > > > > : 0; \ > > > > + _ihash_items_ht_##item->size > > > > \ > > > > + && (size_t) ((item) - &_ihash_items_ht_##item->items[0]) > > > > \ > > > > + < _ihash_items_ht_##item->size; > > > > \ > > > > + (item)++) > > > > \ > > > > + if ((item)->value != _HURD_IHASH_EMPTY && > > > > \ > > > > + (item)->value != _HURD_IHASH_DELETED) > > > > > > > > /* Remove the entry with the key KEY from the hash table HT. If such > > > > an entry was found and removed, 1 is returned, otherwise 0. */ > > > > -- > > > > 2.53.0 > > > > > > > > > > > > > > -- > > > Samuel > > > <s> T'as pas de portable ? > > > <m> J'ai un nokia, dans le bassin d'arcachon > > > From b7734504414225f447e79ad4d206bc9a473b1cc7 Mon Sep 17 00:00:00 2001 > > From: Milos Nikic <[email protected]> > > Date: Sat, 28 Mar 2026 16:44:43 -0700 > > Subject: [PATCH] Do not merge, lets see how break behaves. > > > > --- > > libdiskfs/node-cache.c | 18 ++++++++++++++++++ > > 1 file changed, 18 insertions(+) > > > > diff --git a/libdiskfs/node-cache.c b/libdiskfs/node-cache.c > > index 0157d55d..51a9c0bd 100644 > > --- a/libdiskfs/node-cache.c > > +++ b/libdiskfs/node-cache.c > > @@ -228,6 +228,24 @@ diskfs_node_iterate (error_t (*fun)(struct node *)) > > get called. */ > > refcounts_ref (&node->refcounts, NULL); > > } > > + > > + /* Small test for behavior of break inside HURD_IHASH_ITERATE */ > > + int loop_executions = 0; > > + HURD_IHASH_ITERATE (&nodecache, test_val) > > + { > > + loop_executions++; > > + /* Escapes the inner loop. Will it escape the "outer loop"? */ > > + break; > > + } > > + > > + > > + char buffer[100]; > > + snprintf(buffer, sizeof(buffer), "We have %i of executions.\n", > > loop_executions); > > + mach_print(buffer); > > + /* If break doesn't work this will be more than 1...stop everything */ > > + if (loop_executions > 1) > > + exit (1); > > + > > pthread_rwlock_unlock (&nodecache_lock); > > > > p = node_list; > > -- > > 2.53.0 > > > > > -- > Samuel > > No manual is ever necessary. > May I politely interject here: BULLSHIT. That's the biggest Apple lie of all! > (Discussion in comp.os.linux.misc on the intuitiveness of interfaces.)
From 23ee39904b8de0e53700b8ff8c587da7e769c6aa Mon Sep 17 00:00:00 2001 From: Milos Nikic <[email protected]> Date: Fri, 27 Mar 2026 22:02:43 -0700 Subject: [PATCH v2] libihash: Prevent multiple evaluation in iteration macros In HURD_IHASH_ITERATE and HURD_IHASH_ITERATE_ITEMS, the hash table argument (ht) was evaluated multiple times during loop initialization and condition checking. If a caller passed an expression with side effects (e.g., a function call returning a table pointer), it would be executed repeatedly per loop, leading to performance degradation or unintended behavior. Fix this by wrapping the iterators in a dummy outer `for` loop that evaluates and caches `(ht)` exactly once using `__typeof__`. Token pasting (##) is used to generate a unique cache variable name based on the caller's item identifier, safely allowing these macros to be nested without variable shadowing or naming collisions. --- libihash/ihash.h | 68 ++++++++++++++++++++++++++++++------------------ 1 file changed, 43 insertions(+), 25 deletions(-) diff --git a/libihash/ihash.h b/libihash/ihash.h index 80679f14..b1e0e362 100644 --- a/libihash/ihash.h +++ b/libihash/ihash.h @@ -309,30 +309,41 @@ hurd_ihash_value_t hurd_ihash_locp_find (hurd_ihash_t ht, variable. We can define variables inside the for-loop initializer (C99), but - we can only use one basic type to do that. We can not use two - for-loops, because we want a break statement inside the iterator - block to terminate the operation. So we must have both variables - of the same basic type, but we can make one (or both) of them a - pointer type. - - The pointer to the value can be used as the loop variable. This is - also the first element of the hash item, so we can cast the pointer + we can only use one basic type to do that. Traditionally, we could + not use two for-loops to get around this, because a break statement + inside the iterator block would only terminate the inner loop. + + However, to prevent multiple evaluations of the HT argument, we now + use an outer dummy for-loop to cache the table pointer. This outer + loop is designed to run exactly once. If a break statement is used + inside the iterator block, it terminates the inner loop, and control + passes to the outer loop's increment step. This sets the cache pointer + to NULL, instantly failing the outer loop condition and terminating + the entire macro. + + The pointer to the value can be used as the inner loop variable. This + is also the first element of the hash item, so we can cast the pointer freely between these two types. The pointer is only dereferenced after the loop condition is checked (but of course the value the pointer pointed to must not have an influence on the condition result, so the comma operator is used to make sure this subexpression is always true). */ -#define HURD_IHASH_ITERATE(ht, val) \ - for (hurd_ihash_value_t val, \ - *_hurd_ihash_valuep = (ht)->size ? &(ht)->items[0].value : 0; \ - (ht)->size \ - && (size_t) ((_hurd_ihash_item_t) _hurd_ihash_valuep \ - - &(ht)->items[0]) \ - < (ht)->size \ - && (val = *_hurd_ihash_valuep, 1); \ - _hurd_ihash_valuep = (hurd_ihash_value_t *) \ - (((_hurd_ihash_item_t) _hurd_ihash_valuep) + 1)) \ - if (val != _HURD_IHASH_EMPTY && val != _HURD_IHASH_DELETED) +#define HURD_IHASH_ITERATE(ht, val) \ + /* Cache 'ht' to prevent multiple evaluation of functions. */ \ + for (__typeof__(ht) _ihash_ht_##val = (ht); \ + _ihash_ht_##val != NULL; \ + _ihash_ht_##val = NULL) \ + for (hurd_ihash_value_t val, \ + *_ihash_p_##val = _ihash_ht_##val->size \ + ? &_ihash_ht_##val->items[0].value : 0; \ + _ihash_ht_##val->size \ + && (size_t) ((_hurd_ihash_item_t) _ihash_p_##val \ + - &_ihash_ht_##val->items[0]) \ + < _ihash_ht_##val->size \ + && (((val) = *_ihash_p_##val), 1); \ + _ihash_p_##val = (hurd_ihash_value_t *) \ + (((_hurd_ihash_item_t) _ihash_p_##val) + 1)) \ + if ((val) != _HURD_IHASH_EMPTY && (val) != _HURD_IHASH_DELETED) /* Iterate over all elements in the hash table making both the key and the value available. You use this macro with a block, for example @@ -344,12 +355,19 @@ hurd_ihash_value_t hurd_ihash_locp_find (hurd_ihash_t ht, The block will be run for every element in the hash table HT. The key and value of the current element is available as ITEM->key and ITEM->value. */ -#define HURD_IHASH_ITERATE_ITEMS(ht, item) \ - for (_hurd_ihash_item_t item = (ht)->size? &(ht)->items[0]: 0; \ - (ht)->size && item - &(ht)->items[0] < (ht)->size; \ - item++) \ - if (item->value != _HURD_IHASH_EMPTY && \ - item->value != _HURD_IHASH_DELETED) +#define HURD_IHASH_ITERATE_ITEMS(ht, item) \ + /* Cache 'ht' to prevent multiple evaluation of functions. */ \ + for (__typeof__(ht) _ihash_items_ht_##item = (ht); \ + _ihash_items_ht_##item != NULL; \ + _ihash_items_ht_##item = NULL) \ + for (_hurd_ihash_item_t item = _ihash_items_ht_##item->size \ + ? &_ihash_items_ht_##item->items[0] : 0; \ + _ihash_items_ht_##item->size \ + && (size_t) ((item) - &_ihash_items_ht_##item->items[0]) \ + < _ihash_items_ht_##item->size; \ + (item)++) \ + if ((item)->value != _HURD_IHASH_EMPTY && \ + (item)->value != _HURD_IHASH_DELETED) /* Remove the entry with the key KEY from the hash table HT. If such an entry was found and removed, 1 is returned, otherwise 0. */ -- 2.53.0
