GitHub user jonmeredith opened a pull request: https://github.com/apache/cassandra/pull/279

    CASSANDRA-14790 Fix flaky LongBufferPoolTest

    The LongBufferPoolTest previously required significantly more heap memory
    on machines with higher core counts. This PR
    - adds commands to the build system to allow running individual burn tests
      (in and outside JUnit)
    - fixes some race conditions that occur when the test is running under
      heavy memory pressure
    - changes the calculation for how much memory the ring-of-threads should
      allocate to be roughly double the pool size under test.

    It now completes with much less memory and should run fine on a builder.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jonmeredith/cassandra CASSANDRA-14790-3.0

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/cassandra/pull/279.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #279

----
commit 0f55a39849ff4fa3b0f2b4f0d61678e08fa03a46
Author: Jon Meredith <jmeredithco@...>
Date:   2018-10-04T23:08:52Z

    Add build targets for running burn tests.

commit 00abab9ccbfd0d6a154c31a69c0780f142df5121
Author: Jon Meredith <jmeredithco@...>
Date:   2018-10-04T23:10:02Z

    Harden LongBufferPoolTest against java.lang.OutOfMemoryError

    Catch all possible errors on the worker threads, otherwise if threads
    exit due to java.lang.OutOfMemoryError, the test either reports a lack
    of progress or the chunk recycling check fails.

    Add an explicit exit to the main on errors as the REQUEST-SCHEDULER
    thread is not marked as a daemon and prevents the test from exiting.

    Added some extra logging to clear up how large the buffer pool size is
    under test, as loading the default config prints
    `Global buffer pool is enabled, when pool is exahusted (max is 512 mb) it will allocate on heap`
    and that is not what gets tested.

commit db755d706faa363a4973b371d68994db65b8be83
Author: Jon Meredith <jmeredithco@...>
Date:   2018-10-07T19:47:49Z

    Harden LongBufferPoolTest against flaky Chunk recycle assertion

    The test as written does not guarantee that all Chunks will be recycled,
    just that it is likely that once during the 10s check cycle, a worker
    thread sets the target memory size to zero (a one-in-16k chance).

    Instead, make the thread free at least once per cycle and only do the
    recycle test when all threads are known to have released all memory as
    they might exit.

    Also, check the return code from the burn producer and consumer threads
    in case they throw an assertion error.

    The burn producer/consumer threads currently double-free each buffer,
    which occasionally triggers a double-release assertion check.

commit b2b509a4ed35a27cc7a90f47cc8f47d2683c9ea1
Author: Jon Meredith <jmeredithco@...>
Date:   2018-10-04T23:29:54Z

    Change target allocation size for LongBufferPoolSize.

    The prior method for calculating the amount of memory to try and
    allocate for the tests increased when it ran with more threads. The
    pool under test is a fixed size pool so the effect was to add pressure
    on heap allocations instead. On a 6-core i7 the test failed with
    -Xmx500m.

    This change restricts the total allocation size to a little over double
    the pool size regardless of the number of threads present.

----
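
For readers who want the idea behind the last commit in concrete terms, here is a minimal sketch of sizing the per-thread allocation target so that the total stays near double the pool size no matter how many threads run. It is not the code from the patch: the class name, the method name, and the round-up rule are assumptions made for illustration only.

```java
// Hypothetical sketch, not taken from LongBufferPoolTest.
final class AllocationTargetSketch {

    /**
     * Splits a fixed total budget (double the pool size) across the threads.
     * Rounding up means the total is never less than double the pool size
     * and exceeds it by fewer than threadCount bytes.
     */
    static long perThreadTarget(long poolSizeBytes, int threadCount) {
        if (poolSizeBytes <= 0 || threadCount <= 0) {
            throw new IllegalArgumentException("pool size and thread count must be positive");
        }
        long totalTarget = 2 * poolSizeBytes;
        return (totalTarget + threadCount - 1) / threadCount; // ceiling division
    }

    public static void main(String[] args) {
        long poolSizeBytes = 16L << 20; // assumed 16 MiB pool, for illustration
        for (int threads : new int[] {2, 6, 32}) {
            long perThread = perThreadTarget(poolSizeBytes, threads);
            System.out.printf("threads=%d perThread=%d total=%d%n",
                              threads, perThread, perThread * threads);
        }
    }
}
```

With this shape, adding cores shrinks each thread's share instead of growing the total, so heap pressure no longer scales with the machine the test runs on.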