rnewson commented on PR #5625:
URL: https://github.com/apache/couchdb/pull/5625#issuecomment-3202493282

   I wonder also if supplying fadvise hints as we read data might give a 
boost without adding an in-VM cache (e.g., `WILLNEED` for non-leaf btree nodes 
and `DONTNEED` for all leaf data: doc bodies, attachment chunks).
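   A minimal sketch of that hinting idea in Python terms (assuming a POSIX system; `read_with_hints` and the offset/length range lists are illustrative, not CouchDB code):

```python
import os

def read_with_hints(path, index_ranges, leaf_ranges):
    """Read a file while hinting the kernel page cache:
    WILLNEED for (hypothetical) non-leaf btree ranges we expect to revisit,
    DONTNEED for leaf data we expect to read only once."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Ask the kernel to prefetch and retain the index pages...
        for offset, length in index_ranges:
            os.posix_fadvise(fd, offset, length, os.POSIX_FADV_WILLNEED)
        chunks = []
        for offset, length in leaf_ranges:
            chunks.append(os.pread(fd, length, offset))
            # ...and drop leaf pages once consumed.
            os.posix_fadvise(fd, offset, length, os.POSIX_FADV_DONTNEED)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

The hints are advisory: worst case the kernel ignores them, so correctness is unaffected either way.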
   
   For `_all_docs` and `_changes` traversals (which under the covers walk 
the by_id and by_seq trees in their respective orders) I wonder why caching 
helps at all. We "should" already read the btree optimally there; we're not 
skipping around the btree, and there's no seek penalty these days. It might be 
a code wart we could fix. Or, perhaps, we could cache in-process more 
safely/easily: repeated calls to `pread_term` with the same offset by the same 
process holding an open `#db` record will yield the same result each time 
(until a reopen).
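   That invariant is what makes an in-process cache easy to reason about. A hypothetical sketch, in Python rather than Erlang, of a handle whose cache can never serve stale data because reopen invalidates everything (`FileHandle` and its fields are made-up names, not the `#db` record):

```python
import os

class FileHandle:
    """Hypothetical analogue of an open #db record over an append-only file:
    bytes at a given offset are immutable until the file is reopened."""
    def __init__(self, path):
        self.path = path
        self.generation = 0
        self._fd = os.open(path, os.O_RDONLY)
        self._cache = {}  # (offset, length) -> bytes, valid for this open only

    def reopen(self):
        # New generation: offsets may now refer to different data,
        # so every cached entry is dropped.
        os.close(self._fd)
        self._fd = os.open(self.path, os.O_RDONLY)
        self.generation += 1
        self._cache.clear()

    def pread_term(self, offset, length):
        # Safe memoization: same offset + same open handle -> same bytes.
        key = (offset, length)
        hit = self._cache.get(key)
        if hit is not None:
            return hit
        data = os.pread(self._fd, length, offset)
        self._cache[key] = data
        return data
```

Because the cache lives and dies with the handle, no cross-process invalidation protocol is needed.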
   
   I'd focus on benchmarking and more statistics first, so we know what we're 
comparing. We'd want to know how many times we read the same term from disk in 
the current codebase versus how often the cache avoids it.
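   Counting duplicate reads could be as simple as a per-offset counter (a sketch with made-up names; real instrumentation would presumably hang off the existing stats machinery):

```python
from collections import Counter

class ReadStats:
    """Track how many times each file offset is read from disk."""
    def __init__(self):
        self.reads = Counter()

    def record(self, offset):
        self.reads[offset] += 1

    def duplicate_reads(self):
        # Reads beyond the first at each offset: the work a cache could save.
        return sum(n - 1 for n in self.reads.values() if n > 1)
```

Comparing `duplicate_reads()` against total reads gives the upper bound on what any cache can win.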
   
   Finally, `ets_lru` already exists and has both size and time expiration. 
Why invent a new way?
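   For context, the semantics `ets_lru` already offers, eviction by entry count and by age, look roughly like this Python sketch (not the Erlang API; names are illustrative):

```python
import time
from collections import OrderedDict

class SizeAndTimeLRU:
    """Sketch of an LRU bounded by both max entries and max age,
    loosely mirroring what ets_lru already provides."""
    def __init__(self, max_entries, max_age_s):
        self.max_entries = max_entries
        self.max_age_s = max_age_s
        self._items = OrderedDict()  # key -> (inserted_at, value)

    def put(self, key, value):
        self._items.pop(key, None)
        self._items[key] = (time.monotonic(), value)
        while len(self._items) > self.max_entries:
            self._items.popitem(last=False)  # evict least recently used

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        item = self._items.get(key)
        if item is None:
            return None
        inserted_at, value = item
        if now - inserted_at > self.max_age_s:
            del self._items[key]       # expired by age
            return None
        self._items.move_to_end(key)   # refresh recency
        return value
```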

