GitHub user numinnex added a comment to the discussion: PR Proposal: 
Cache-Line-Friendly In-Memory Segment Index

Hi, we were thinking about this when we merged the offset and timestamp 
indexes into a single structure, and we found that this does not scale well 
with a huge partition count.

Our indexes aren't sparse: we store an entry per message rather than per N 
messages (per batch), so the memory overhead is pretty big.
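As a rough back-of-the-envelope sketch of that overhead (the entry field sizes 
below are assumptions for illustration, not iggy's actual index format):

```rust
// Hypothetical index entry: relative offset (u32), file position (u32),
// timestamp (u64). These sizes are an assumption, not iggy's real layout.
const ENTRY_SIZE: u64 = 4 + 4 + 8; // 16 bytes per entry

/// Dense index: one entry per message.
fn dense_index_bytes(messages: u64) -> u64 {
    messages * ENTRY_SIZE
}

/// Sparse index: one entry per `every_n` messages (e.g. per batch).
fn sparse_index_bytes(messages: u64, every_n: u64) -> u64 {
    (messages / every_n) * ENTRY_SIZE
}
```

So a segment holding a million messages needs ~16 MB of index with a dense 
layout versus ~160 KB when indexing every 100th message, and that multiplies 
across every partition.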

We are not concerned about cache misses: our indexes are cached in memory 
per segment, and since they are iterated through fairly frequently, the odds 
of a cache miss are fairly small (we haven't measured this yet, as we are not 
at this stage of optimization).

Also, you mentioned binary search, which we already use. AFAIK binary search 
isn't the most cache-friendly search algorithm unless you use something like 
an Eytzinger layout, which we don't want to do: it adds a lot of complexity 
and it doesn't work well with dynamically sized containers, since every time 
the collection grows it has to be reallocated and copied over, and 
constructing the Eytzinger-layout array is expensive.
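For illustration only (not something we plan to adopt), here is a minimal 
sketch of an Eytzinger (BFS-order) build plus the usual branch-light 
lower-bound search over it; the full rebuild in `eytzinger_build` is exactly 
the reconstruct-and-copy cost that makes it awkward for a growing index:

```rust
/// Build a 1-based Eytzinger layout from a sorted slice (slot 0 unused).
/// The whole array must be rebuilt whenever the sorted input grows.
fn eytzinger_build(sorted: &[u64]) -> Vec<u64> {
    fn fill(sorted: &[u64], out: &mut [u64], src: &mut usize, k: usize) {
        if k <= sorted.len() {
            fill(sorted, out, src, 2 * k);     // left subtree
            out[k] = sorted[*src];
            *src += 1;
            fill(sorted, out, src, 2 * k + 1); // right subtree
        }
    }
    let mut out = vec![0u64; sorted.len() + 1];
    let mut src = 0;
    fill(sorted, &mut out, &mut src, 1);
    out
}

/// Return the 1-based index of the smallest element >= target,
/// or 0 if every element is smaller than target.
fn eytzinger_lower_bound(tree: &[u64], target: u64) -> usize {
    let n = tree.len() - 1;
    let mut k = 1;
    while k <= n {
        // Descend left on >=, right on <; the branch is easy to predict
        // and prefetch because children are contiguous in memory.
        k = 2 * k + usize::from(tree[k] < target);
    }
    // Undo the final descents past the answer: drop trailing one bits.
    k >> (k.trailing_ones() + 1)
}
```

The search itself touches contiguous cache lines near the root, which is the 
cache-friendliness argument, but the build walks the whole sorted input and 
has to run again from scratch after every reallocation.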

GitHub link: 
https://github.com/apache/iggy/discussions/2381#discussioncomment-15015777
