Thanks for having a look at this. On Mon, 21 Jun 2021 at 05:02, Zhihong Yu <z...@yugabyte.com> wrote: > + * GH_ELEMENT_TYPE defines the data type that the hashtable stores. Each > + * instance of GH_ELEMENT_TYPE which is stored in the hash table is done so > + * inside a GH_SEGMENT. > > I think the second sentence can be written as (since done means stored, it is > redundant):
I've rewords this entire paragraph slightly. > Each instance of GH_ELEMENT_TYPE is stored in the hash table inside a > GH_SEGMENT. > > + * Macros for translating a bucket's index into the segment and another to > + * determine the item number within the segment. > + */ > +#define GH_INDEX_SEGMENT(i) (i) / GH_ITEMS_PER_SEGMENT > > into the segment -> into the segment number (in the code I see segidx but I > wonder if segment index may cause slight confusion). I've adjusted this comment > + GH_BITMAP_WORD used_items[GH_BITMAP_WORDS]; /* A 1-bit for each used item > + * in the items array */ > > 'A 1-bit' -> One bit (A and 1 mean the same) I think you might have misread this. We're storing a 1-bit for each used item rather than a 0-bit. If I remove the 'A' then it's not clear what the meaning of each bit's value is. > + uint32 first_free_segment; > > Since the segment may not be totally free, maybe name the field > first_segment_with_free_slot I don't really like that. It feels too long to me. > + * This is similar to GH_NEXT_ONEBIT but flips the bits before operating on > + * each GH_BITMAP_WORD. > > It seems the only difference from GH_NEXT_ONEBIT is in this line: > > + GH_BITMAP_WORD word = ~(words[wordnum] & mask); /* flip bits */ > > If a 4th parameter is added to signify whether the flipping should be done, > these two functions can be unified. I don't want to do that. I'd rather have them separate to ensure the compiler does not create any additional needless branching. Those functions are pretty hot. > + * next insert will store in this segment. If it's already pointing to an > + * earlier segment, then leave it be. > > The last sentence is confusing: first_free_segment cannot point to some > segment and earlier segment at the same time. > Maybe drop the last sentence. I've adjusted this comment to become: * Check if we need to lower the next_free_segment. We want new inserts * to fill the lower segments first, so only change the first_free_segment * if the removed entry was stored in an earlier segment. Thanks for having a look at this. I'll attach an updated patch soon. David