@inv2004 - if all you want to do is use this as an exercise to learn / teach this specific benchmark, you could be sure to pack the whole problem into L2 CPU cache by using the techniques of <https://github.com/c-blake/adix/blob/master/util/lfreq.nim> to **_pack a whole hash cell into just 8 bytes_** (4X better than the 32 bytes in my earlier, easier-way note): probably 6 bits for the length of a key, 1 bit for "overflow or not", and 20 bits for an offset into up to 1 MiB of data, leaving 64-27=37 bits for up to 37 topic tags. With ~15K posts you could probably run a hash table at 93% full in 128 KiB (16K cells * 8 bytes), spilling out of those 37 bits for posts using tags beyond the first 37 (which should / could be sorted roughly by expected popularity).
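
For concreteness, here is a minimal sketch of one such 8-byte cell layout. The field order and names are purely illustrative (my own, not what `lfreq.nim` actually uses):

```nim
type Cell = distinct uint64   # one hash cell packed into 8 bytes

proc pack(keyLen: int; overflow: bool; off: int; tags: uint64): Cell =
  ## keyLen in 0..63 (6 bits), overflow is 1 bit, off in 0..2^20-1 (20 bits);
  ## only the low 37 bits of `tags` are kept - the rest must spill elsewhere.
  Cell(uint64(keyLen) or (uint64(ord(overflow)) shl 6) or
       (uint64(off) shl 7) or ((tags and ((1'u64 shl 37) - 1)) shl 27))

proc keyLen(c: Cell): int    = int(uint64(c) and 63)
proc overflow(c: Cell): bool = ((uint64(c) shr 6) and 1) == 1
proc offset(c: Cell): int    = int((uint64(c) shr 7) and ((1'u64 shl 20) - 1))
proc tags(c: Cell): uint64   = uint64(c) shr 27

when isMainModule:
  let c = pack(11, false, 0xABCDE, 0b101'u64)
  doAssert c.keyLen == 11 and not c.overflow
  doAssert c.offset == 0xABCDE and c.tags == 0b101'u64
```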
**_To really run at 93%_** full, you would want to pay the re-organize-on-insert Robin Hood hashing cost (not yet a part of `adix/oats.nim`; a toy sketch of the idea is at the end of this post). Otherwise, more slack space in the table likely pays off. `adix/tests/wf.nim` also shows an example of doing a **_multi-core table build/merge_**. Packing matters more IF it can let workers use core-private memory.

There are many "exercises for the ambitious reader" here, which I suppose likely ultimately drove this whole problem choice. What is best **_all depends_** on various distributions and caveats, though, many of which may not even generalize from sample to sample (a battle database impls are often fighting, which is why "learned indices" are largely "cheating"). Anyway, it's at least slightly different from the typical Knuth-McIlroy example calculation @Zoom helped to optimize back in the day. And Nim is a fine language with which to implement a database (a test used by some to define a "systems PL"). But beginning programmers are likely, as Araq already mentioned, better off "just using" some DB impl unless they have good reason to do caffeine-fueled rage optimization. I am merely outlining some fruitful lines of thought for those near an espresso machine. ;-)
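
For the Robin Hood point above, here is a small, self-contained sketch of reorganize-on-insert over open addressing (toy names like `Slot`/`rhInsert`, not the `adix` API): the incoming entry evicts any resident that is closer to its home slot, which is what keeps probe lengths short even at ~93% load.

```nim
type Slot = object
  used: bool
  key, val: uint32

proc rhInsert(tab: var seq[Slot]; key, val: uint32) =
  # Table size is assumed to be a power of two; `key and mask` is the home slot
  # (a real table would hash the key first).
  let mask = tab.len - 1
  var (k, v) = (key, val)
  var i = int(k) and mask       # current probe slot
  var dist = 0                  # how far the carried entry is from its home slot
  while true:
    if not tab[i].used:
      tab[i] = Slot(used: true, key: k, val: v)
      return
    # How far is the current resident of slot i from *its* home slot?
    let residentDist = (i - (int(tab[i].key) and mask) + tab.len) and mask
    if residentDist < dist:     # resident is "richer": steal its slot, carry it onward
      swap(k, tab[i].key)
      swap(v, tab[i].val)
      dist = residentDist
    i = (i + 1) and mask
    inc dist

when isMainModule:
  var tab = newSeq[Slot](16)
  for x in [3'u32, 19, 35, 7, 23]:   # 3, 19, 35 all share home slot 3
    tab.rhInsert(x, x * 10)
  doAssert tab[3].used and tab[4].used and tab[5].used
```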
