The trade-off does not sound simple to me: this approach could lead to more segments overall, making search requests and updates slower and more I/O-intensive, since they would have to iterate over more segments. I'm not saying this is a bad idea, but it could have unexpected side-effects.
Do you actually have a high commit rate or a high reopen rate (DirectoryReader.open(IndexWriter))? Maybe reopening instead of committing (while still committing, but less frequently) would decrease the I/O load, since NRT segments may never need to be actually written to disk if they are merged away before the next commit happens and you give enough memory to the filesystem cache.

On Tue, Aug 1, 2017 at 10:59 AM, Tommaso Teofili <[email protected]> wrote:
> Hi all,
>
> Lately I have been looking a bit closer at merge policies, particularly
> at the tiered one, and I was wondering if we can mitigate the number of
> possibly avoidable merges in high commit rate scenarios, especially when a
> high percentage of the commits happen on the same docs.
> I've observed several evolutions of merges in such scenarios and it seemed
> to me the merge policy was too aggressive in merging, causing a large I/O
> overhead.
> I've then tried the same with a merge policy which tentatively looks
> at commit rates and skips merges if that rate is higher than a
> threshold, which seemed to give slightly better results in reducing the
> unneeded I/O caused by avoidable merges.
>
> I know this is a bit abstract, but I would like to know if anyone has any
> ideas or plans about mitigating the merge overhead in general and/or in
> similar cases.
>
> Regards,
> Tommaso
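For what it's worth, the rate-based skipping idea from the quoted message could be sketched roughly as below. This is a minimal, self-contained illustration with entirely hypothetical names (`CommitRateEstimator`, `onCommit`, `shouldSkipMerge`) — it is NOT Lucene's MergePolicy API, just the sliding-window rate check such a policy might consult before accepting a merge:

```java
import java.util.ArrayDeque;

/**
 * Hypothetical sketch: tracks commit timestamps in a sliding window and
 * reports whether the observed commit rate exceeds a threshold, in which
 * case a rate-aware merge policy might choose to skip merging for now.
 */
public class CommitRateEstimator {
    private final ArrayDeque<Long> commitTimesMillis = new ArrayDeque<>();
    private final long windowMillis;

    public CommitRateEstimator(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Record a commit at the given timestamp (milliseconds). */
    public void onCommit(long nowMillis) {
        commitTimesMillis.addLast(nowMillis);
        prune(nowMillis);
    }

    /** Drop commits that have fallen out of the sliding window. */
    private void prune(long nowMillis) {
        while (!commitTimesMillis.isEmpty()
                && nowMillis - commitTimesMillis.peekFirst() > windowMillis) {
            commitTimesMillis.removeFirst();
        }
    }

    /** Observed commits per second over the sliding window. */
    public double commitsPerSecond(long nowMillis) {
        prune(nowMillis);
        return commitTimesMillis.size() * 1000.0 / windowMillis;
    }

    /** True while the commit rate is above the threshold, i.e. "skip merges for now". */
    public boolean shouldSkipMerge(long nowMillis, double maxCommitsPerSecond) {
        return commitsPerSecond(nowMillis) > maxCommitsPerSecond;
    }
}
```

One caveat worth flagging: a real policy would also need a safety valve (e.g. a cap on segment count), otherwise a sustained high commit rate would postpone merges indefinitely and the segment count would grow without bound, which is exactly the side-effect mentioned above.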
