I stumbled across this chart: https://home.apache.org/~mikemccand/lucenebench/PKLookup.html It shows a pretty drastic drop in the PKLookup benchmark on 5/13. I looked at the commits between the previous run and that one and tried to git bisect the regression using the benchmark as the test, but that proved quite difficult because a breaking change involving MemoryCodec also required corresponding changes in the benchmark code.
In the end, I think removing MemoryCodec is what caused the drop in performance, based on this comment in the benchmark code:

'2011-06-26' Switched to MemoryCodec for the primary-key 'id' field so that lookups (either for PKLookup test or for deletions during reopen in the NRT test) are fast, with no IO. Also switched to NRTCachingDirectory for the NRT test, so that small new segments are written only in RAM.

I don't really understand the implications beyond the benchmarks, but it seems that an essential high-performance capability may have been lost. Is there an equivalent still available after MemoryCodec's removal that can be used for primary-key fields?

-Mike
