I think there are both issues with your benchmark that make the results less clear than you think {e.g. you do not do PGO with a fast hash function - that shrinks ratios 3x for me; some operation time ratios seem suspicious {why is `del` 4x, not 2x, slower than write? better prefetching? PRNGs do not actually block prefetching}; and you do not report the memory usage/overhead of the memory-optimized ditab vs. the spacious SparseSet} and also optimizations of your SparseSet that you missed {e.g. `clear` can be fixed-cost even for 20 MiEntry, as both you and your geeksforgeeks link mentioned}.
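To make that fixed-cost `clear` concrete, here is a minimal sketch of the classic Briggs & Torczon sparse set idea {names and API are mine for illustration, not the actual ditab/SparseSet code from the benchmark repo}: membership is `sparse[x] < n and dense[sparse[x]] == x`, so clearing only resets the counter and never touches either array, no matter how many entries there are.

```nim
type SparseSet = object
  dense: seq[int]    # dense[0 ..< n] holds the current members
  sparse: seq[int]   # sparse[x] = index of x within dense (may be stale)
  n: int             # current cardinality

proc initSparseSet(univ: int): SparseSet =
  # univ = (loose) bound on the key space; both arrays sized to it
  SparseSet(dense: newSeq[int](univ), sparse: newSeq[int](univ), n: 0)

proc contains(s: SparseSet, x: int): bool =
  # stale sparse[x] entries are harmless: the dense cross-check rejects them
  s.sparse[x] < s.n and s.dense[s.sparse[x]] == x

proc incl(s: var SparseSet, x: int) =
  if x notin s:
    s.sparse[x] = s.n
    s.dense[s.n] = x
    inc s.n

proc clear(s: var SparseSet) =
  s.n = 0   # O(1): no memset of 20 MiEntry-sized arrays needed
```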
Just as one data point, though, for your

```
initTable     20971520 del          4.91s 203.81m
initDITab     20971520 del 177.68%  2.76s 362.12m
initSparseSet 20971520 del 466.64%  1.05s 951.05m
```

with just that PGO/faster hash, in a few minutes I got an almost 3x better perf ratio:

```
initTable     20971520 del         932.34ms 1.07
initDITab     20971520 del  93.50% 997.21ms 1.00
initSparseSet 20971520 del 171.33% 544.18ms 1.84
```

Notice how, as I alluded to, `Table` is not _that_ much slower out of cache {and the space-optimized ditab is even slower, as alluded to, though I would love it if someone could speed up `sequint` :-)}. A 1.7x difference may still matter, or the memory savings may re-pay for it by making something fit in RAM/L3/L2/L1 that would not otherwise, etc. This is also just one pair of CPUs: my Skylake and whatever @jackhftang used, which may also matter.

I am not sure even the above numbers are meaningful, though. It seems best to just raise methodological issues over at your linked github repo, where interested readers can follow along if they care, since this is likely to get even deeper into the detail weeds {and has already uncovered at least one bug of mine. Oops! :-)}. I will do that later today/this weekend.

It should be uncontroversial, though, that any one set of benchmarks like this is unlikely to decide the question "in context", especially when it does not measure memory use. Maybe the use case wants 20,000 tables of 20,000 entries, or the key address space is so much easier to loosely bound than to tightly bound that the space optimization has even more power. {Direct indexing/the sparse array is particularly vulnerable to this.} This is perhaps covered by your "functionalities, etc.", but it relates to the general an-application-itself-is-the-best-benchmark property, which **_always_** bears reiteration.

So, I'm all for people doing their own data structs as needed - Nim makes that super easy. Hooray!
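In case it helps anyone reproduce the PGO part of the above, a rough sketch of the standard gcc PGO dance through the Nim compiler {`bench.nim` is a placeholder name, and I am not claiming these were my exact flags}:

```shell
# 1) build an instrumented binary
nim c -d:danger --cc:gcc --passC:-fprofile-generate --passL:-fprofile-generate -o:bench bench.nim
# 2) training run; gcc writes *.gcda profile data next to the objects
./bench
# 3) rebuild using the recorded profile
nim c -d:danger --cc:gcc --passC:-fprofile-use --passL:-fprofile-use -o:bench bench.nim
# 4) measure the PGO-optimized binary
./bench
```

The training workload should resemble what you measure, or the profile can mislead the optimizer.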
:-)

But I _also_ agree with @jackhftang's [earlier comment](https://forum.nim-lang.org/t/7236#45750) that it's best to do APIs/code structure first and then profile to see what needs work (a profile is usually better than microbenchmarks), rather than just assume `Table` will have a problem, as the linked YouTube video does in its motivation (before also naively overselling Bloom filters, but that is some whole other spiel). My later comments are mostly all just in support of that, since it sounded like @b3liever was [still figuring out what he even needed/wanted](https://forum.nim-lang.org/t/7236#45749). Cheers, all.