mikemccand commented on issue #15662: URL: https://github.com/apache/lucene/issues/15662#issuecomment-3862344094
Actually, I think @epotyom was able to run even the nightly benchmarks in his own dev area? Is there a doc to share on what you had to do?

The nightly benchmark ([`src/python/nightlyBench.py` in luceneutil](https://github.com/mikemccand/luceneutil/blob/main/src/python/nightlyBench.py)) is configured differently from a "normal" default dev area run, with different indices and different tasks: it builds a fixed search index with one thread, including vectors, facets, doc values, etc., and runs the nightly tasks against it. Maybe something specific to that nightly benchmark is necessary to repro elsewhere? Seems unlikely but possible.

Also, all of the nightly runs use the throughput GC (`ParallelGC`) ... my heart stopped briefly when I saw that, because of a weird bug that impacted Amazon's product search Lucene service: we first upgraded to JDK 25 and our indexers hit a GC tailspin disaster (100% heap in use, unproductive and slow GC) and never finished a full indexing pass. Here's a [dev list thread](https://lists.apache.org/thread/qrohqcvn9p82cxcpznj2l2htjg9fon3n) about it, and the [JDK bug](https://bugs.openjdk.org/browse/JDK-8375467). BUT for us (Amazon) it seemed to only happen when COH (Lilliput, Compact Object Headers) is enabled, and I just checked with `-XX:+PrintFlagsFinal` that the nightly benchy Java command-line still has COH disabled. So, net/net I don't think that JDK bug is what's ailing nightly benchy...

The current ad-hoc benchy (testing @uschindler's idea) is off and running, slowly building the fixed-segments search index (the index that it runs all of the nightly tasks against). It uses a single thread and `SerialMergeScheduler` to build this index, sigh, so that we get precisely the same index geometry (same docs in each segment, in the same order), to reduce at least that one source of noise from night to night.
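
For anyone who wants to repeat the COH check in their own dev area, here is a minimal sketch of one way to do it. It is not part of luceneutil. It assumes the COH flag is named `UseCompactObjectHeaders` and that `-XX:+PrintFlagsFinal` prints one `type name = value {kind} {origin}` row per flag on stdout. The helper names `parse_flags` and `jvm_flags` are made up for illustration.

```python
import re
import subprocess

# Assumption: PrintFlagsFinal rows look like
#   "     bool UseCompactObjectHeaders      = false      {product} {default}"
# Some JDK versions print ":=" instead of "=" for flags changed from default.
FLAG_ROW = re.compile(r"^\s*(\S+)\s+(\S+)\s+:?=\s*([^\s{]*)\s+\{")


def parse_flags(text):
    """Map flag name -> value string for every flag row in PrintFlagsFinal output."""
    flags = {}
    for line in text.splitlines():
        m = FLAG_ROW.match(line)
        if m:
            flags[m.group(2)] = m.group(3)
    return flags


def jvm_flags(java_args):
    """Run `java <java_args> -XX:+PrintFlagsFinal -version` and parse its stdout.

    Pass the same JVM options the benchmark uses so the reported values
    reflect that command line rather than the JVM defaults.
    """
    cmd = ["java", *java_args, "-XX:+PrintFlagsFinal", "-version"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_flags(out)
```

For example, `jvm_flags(["-XX:+UseParallelGC"]).get("UseCompactObjectHeaders")` should return the string `"false"` when COH is off, or `None` on a JDK that has no flag of that name.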
