Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/21076 )
Change subject: KUDU-3557 Simple way to change maximum thread cache size of tcmalloc. ...................................................................... Patch Set 7: (6 comments) http://gerrit.cloudera.org:8080/#/c/21076/7/src/kudu/integration-tests/memory_gc-itest.cc File src/kudu/integration-tests/memory_gc-itest.cc: http://gerrit.cloudera.org:8080/#/c/21076/7/src/kudu/integration-tests/memory_gc-itest.cc@135 PS7, Line 135: TEST_F(MemoryGcITest, TestLockContentionInVariousThreadCacheSize) { Thank you for adding this test. It seems the lock contention metric is quite sensitive to various scheduler anomalies and the overall load of a test node at least with the current settings for the test. The 'spinlock_contention_time' measures the lock contention in Kudu itself, so it doesn't track the lock contention in tcmalloc library. /home/jenkins-slave/workspace/kudu-master/3/src/kudu/integration-tests/memory_gc-itest.cc:240 Expected: (contention_1) >= (contention_2), actual: 8537501 vs 8662836 The failure is for the RELEASE build. Maybe, it's worth trying more disparate settings for --tcmalloc_max_total_thread_cache_bytes, having something like 1MB, 8MB, and 64MB for tablet servers at index 0, 1, and 2 correspondingly. Also, I don't think this test requires writing so much rows show the lock contention -- maybe consider reducing that. If that doesn't help to stabilize the test to be rock-solid, then I'd suggest removing the assertion on the contention numbers in the end of the test, and just output them for the information when the test is run manually. At least, this test verifies that the setting for the --tcmalloc_max_total_thread_cache_bytes flag is effective and shows decrease in lock contention in a majority of its runs. http://gerrit.cloudera.org:8080/#/c/21076/7/src/kudu/integration-tests/memory_gc-itest.cc@146 PS7, Line 146: Set the second max_total_thread_cache_bytes of the second tablet server to 32MB. nit: Set --max_total_thread_cache_bytes to 32MB for the second tablet server. http://gerrit.cloudera.org:8080/#/c/21076/7/src/kudu/integration-tests/memory_gc-itest.cc@152 PS7, Line 152: Set the third max_total_thread_cache_bytes of the second tablet server to 128MB. nit: Set --max_total_thread_cache_bytes to 128MB for the third tablet server. http://gerrit.cloudera.org:8080/#/c/21076/7/src/kudu/integration-tests/memory_gc-itest.cc@184 PS7, Line 184: Write some data for scan later. nit: Write some data to be scanned later on. http://gerrit.cloudera.org:8080/#/c/21076/7/src/kudu/integration-tests/memory_gc-itest.cc@200 PS7, Line 200: Start scan, then more memory will be allocated by tcmalloc. nit: Additional memory is allocated during scan operations below. http://gerrit.cloudera.org:8080/#/c/21076/7/src/kudu/integration-tests/memory_gc-itest.cc@207 PS7, Line 207: Sleep a long time to ensure memory consumed more. nit: Run the scan workload for a long time to let it allocate/deallocate a lot of memory. -- To view, visit http://gerrit.cloudera.org:8080/21076 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3cb8c6ed6a8f24c63258ae65f8b841fe74b75308 Gerrit-Change-Number: 21076 Gerrit-PatchSet: 7 Gerrit-Owner: Song Jiacheng <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Wed, 28 Feb 2024 19:08:22 +0000 Gerrit-HasComments: Yes
