First, I should clarify my stance. I am not interested in developing SparseSet myself, but I am happy if, in the end, this leads to a better library. Also, the `SparseSet` here is copied from @b3liever: I failed to import his library via nimble from its URL, did not have time to fix that, and so I just copied the code with one modification, renaming `delete` to `del`.
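For readers unfamiliar with the data structure under discussion, here is a minimal sketch of the sparse-set technique: a packed `dense` array of present keys cross-referenced by a `sparse` index array, giving O(1) insert, membership, and swap-remove over a bounded integer key range. This is written in Python purely for illustration; it is not @b3liever's Nim code, and the method names `incl`/`excl` are placeholders, not the library's actual API.

```python
class SparseSet:
    """Illustrative sparse set over integer keys in [0, capacity)."""

    def __init__(self, capacity: int) -> None:
        self.sparse = [0] * capacity  # key -> position of that key in `dense`
        self.dense = []               # packed list of the keys currently present

    def __contains__(self, key: int) -> bool:
        # A key is present iff the two arrays cross-reference each other.
        i = self.sparse[key]
        return i < len(self.dense) and self.dense[i] == key

    def incl(self, key: int) -> None:
        if key not in self:
            self.sparse[key] = len(self.dense)
            self.dense.append(key)

    def excl(self, key: int) -> None:
        # Removal by swapping the last dense element into the freed slot
        # (this is the operation renamed from `delete` to `del` above).
        if key in self:
            i = self.sparse[key]
            last = self.dense.pop()
            if last != key:
                self.dense[i] = last
                self.sparse[last] = i


s = SparseSet(10)
s.incl(3)
s.incl(7)
assert 3 in s and 7 in s
s.excl(3)
assert 3 not in s and 7 in s  # 7 survives the swap-remove
```

Note the trade-off under debate: the `sparse` array is allocated for the whole key range up front, which is exactly the memory cost a hash table avoids.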
The test is meant to be quick, dirty, and incomplete (originally I only wanted to spend an hour on it, but it ran over anyway). Even so, the margins on common operations are already large enough that I am confident in rejecting claims like this:

> The one other advantage you might see for direct indexing is the same average vs worst case per-element time cost. That sounds a lot better than hash table's expected worst case per element ~log(table size). But for big tables of tiny objects you can fit many per cache line and that cache load dominates lookup time. So, worst case random access for the hash table can be more like 2x the time cost of the average (especially if you have Robin Hood re-org activated), not log(N) as much. What's more, in the non-compact case, hashing can achieve just 1 cache line fetch almost all the time, while direct indexing will usually take 2 cache loads. adix/althashes.hashRoMu1() is also so fast as to be almost free. So, depending upon scale/features it is easy to imagine linearly probed hashing being up to 2x faster than direct indexing, insignificantly more variable, and possibly much more memory efficient. I think this is a situation where naive "big O" analysis can give misleading expectations.

From my experience in debate, there is a technique called **focus shift**: if A is talking about topic X, I can talk about topic Y; when A follows onto topic Y, I talk about topic Z... In the end, I drain all of A's energy and the focus is lost. In the real world, a lot of politicians do similar things. Back to our topic: you said a lot about performance before, and after I tested the speed, you talked about memory usage. If I test memory usage, will you talk about hash selection? If I test hashes, will you talk about key access patterns, CPU architectures, multi-level caches, platforms, SMA, and so many other moving parts...
In the end, the focus is simply lost, everyone's energy is drained, and there is no meaningful outcome. You asked about the context; the context should be interpreted as the most typical situation: an out-of-the-box container used by ordinary users. You pointed out that I have methodological issues. I am open-minded enough to admit mistakes, and I know the tests are not rigorous, but I need to see more data and tests to be convinced. Among all these factors, I think speed and space are the important metrics, and I suspect that focusing on speed/space (or just speed) would lead to a more meaningful discussion.
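To make "focus on speed" concrete, the kind of narrow measurement I have in mind is a direct-indexing vs hash-lookup micro-benchmark over randomly shuffled keys. The sketch below uses Python's `list` and `dict` purely as stand-ins; its numbers say nothing about Nim's `Table` or adix, and the sizes `N` and `PROBES` are arbitrary choices, not from the original test.

```python
import random
import timeit

N = 1_000_000       # key space / table size (arbitrary)
PROBES = 100_000    # number of random lookups per run (arbitrary)

keys = list(range(N))
random.shuffle(keys)
probe = keys[:PROBES]

as_list = [0] * N                    # direct indexing: dense array
as_dict = {k: 0 for k in range(N)}   # hash table with the same contents

# Time random-access reads through both containers.
t_list = timeit.timeit(lambda: [as_list[k] for k in probe], number=5)
t_dict = timeit.timeit(lambda: [as_dict[k] for k in probe], number=5)

print(f"direct indexing: {t_list:.3f}s   hash lookup: {t_dict:.3f}s")
```

Of course, interpreter overhead dwarfs cache effects in Python, so this only demonstrates the shape of the experiment; the real comparison would have to be rerun in Nim against the actual containers.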