IOContext, ReadAdvice, madvise

Michael Sokolov Thu, 07 Aug 2025 14:03:00 -0700

I want to raise an issue that has come up before: the choices we have made 
to apply madvise flags in an opinionated way.


In our environment, the choices Lucene is making are really detrimental to our 
indexing throughput. In the past we disabled this by subclassing 
MMapDirectory (a super-expert workaround). Somehow we missed the fact that 
changes in Lucene 10 made this workaround ineffective, and it took us a while 
to find the new recommended workaround, which is a system property setting. In 
an excess (perhaps) of caution, instead of using the sysprop we've opted to 
modify a Lucene fork to disable this in a more fundamental way (cauterizing 
PosixNativeAccess.madvise), hoping that this might insulate us against future 
changes in this area. But we don't want to have to engage in this kind of 
paranoid programming!

Lucene has made a choice that may be good for some environments or operating 
conditions, but not for others, and the difference can be pretty dramatic. I'm 
not sure how we came to decide that the current default is better than the old 
one. I'll also say I don't really understand why MADV_RANDOM is hurting us so 
much, but it does cause our merge operations to get much slower, fall behind, 
and pile up to the extent that low-resource environments (that used to work 
fine with MADV_NORMAL) are crumbling under the weight of pending merges.

Another thread is that the multiple layers of abstraction we have today 
(IOContext + ReadAdvice + DataAccessHint + FileDataHint + madvise) make it 
quite difficult to reason about which OS behavior applies to any given IO 
operation. I read the IOContext javadocs, but they only give general 
information and don't explain how hints are used to determine an actual MADV 
flag. In what circumstance should I use a hint vs an advice? The 
IndexInput.updateReadAdvice javadoc actually says "provide a hint" but accepts 
an advice.

So to summarize:

- Selfishly, I don't like the current default MADV setting Lucene has chosen, 
although I recognize it may work for some use cases. But I do wonder at some 
level if the OS's default shouldn't be a good default setting?
- I find the Lucene API in this area confusing and not well-documented. 
Understanding that the IO contexts are many and varied and could profitably be 
tuned differently, I wonder if we could have a centralized and first-class API 
(not a system property) that can be used to set a memory access profile of some 
sort?
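
To make the second point a little more concrete, here is a rough sketch of 
what a "memory access profile" could look like. Everything in it is 
hypothetical: AccessProfile, Usage and Advice are made-up names for 
illustration, not existing Lucene types, and the particular mapping in main() 
is arbitrary. The only idea it tries to show is a single object that answers 
"which advice applies to this kind of IO?", falling back to one default.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical sketch only; none of these types exist in Lucene. */
public class AccessProfileSketch {

  /** Stand-ins for the madvise flags discussed here; NOT Lucene's ReadAdvice. */
  enum Advice { NORMAL, RANDOM, SEQUENTIAL }

  /** Made-up coarse categories of IO; Lucene's real contexts and hints differ. */
  enum Usage { SEARCH, MERGE, FLUSH }

  /** One place to look up the advice for a given kind of IO. */
  static final class AccessProfile {
    private final Advice fallback;
    private final Map<Usage, Advice> overrides = new EnumMap<>(Usage.class);

    AccessProfile(Advice fallback) {
      this.fallback = fallback;
    }

    /** Registers an override for one usage and returns this profile for chaining. */
    AccessProfile with(Usage usage, Advice advice) {
      overrides.put(usage, advice);
      return this;
    }

    /** Returns the override for the usage if one was set, else the fallback. */
    Advice resolve(Usage usage) {
      return overrides.getOrDefault(usage, fallback);
    }
  }

  public static void main(String[] args) {
    // Arbitrary example values: OS default everywhere, RANDOM only for search.
    AccessProfile profile =
        new AccessProfile(Advice.NORMAL).with(Usage.SEARCH, Advice.RANDOM);
    for (Usage usage : Usage.values()) {
      System.out.println(usage + " -> " + profile.resolve(usage));
    }
  }
}
```

With something along these lines, an application could state its preference 
once in code rather than through a system property, and the question of which 
MADV flag a given operation ends up with would have one place to look.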

I think some evidence supporting the choices we have made today (why is the 
default MADV_RANDOM?) would be helpful as a starting point. Maybe there is a 
past thread I overlooked?
