Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/13290 )
Change subject: Allow running backend tests sharded and in parallel ...................................................................... Patch Set 2: (2 comments) I think this is a good approach. In the unified binary world, you end up with two sharding approaches. One is that you tell CMake the number of shards and it splits up the tests appropriately. The other is that you do multiple ADD_BE_TEST calls with different gtest_filter values and set them to RUN_SERIAL false. The CMake route is more automatic and probably produces more uniform shards. The other route produces a more coherent set of tests in a single shard (and produces an executable script that runs a single shard), but requires more manual effort. CMake has some functionality to auto-detect google tests by running -list_tests on the gtest executable. I'm not sure how this maps to parallelism. Something to explore in the future now that the toolchain has a modern CMake. http://gerrit.cloudera.org:8080/#/c/13290/2/be/CMakeLists.txt File be/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/13290/2/be/CMakeLists.txt@674 PS2, Line 674: "GTEST_OUTPUT=xml:$ENV{IMPALA_BE_TEST_LOGS_DIR}/${TARGET} > Ack One thing that I found with the unified binary is that gtest_filter causes gtest to output JUnitXML that contains the tests that didn't run with status="notrun". I suspect sharding does something similar. If it does, then we might need to pass the JUnitXML through bin/junitxml_prune_notrun.py. Some consumers of JUnitXML like Jenkins don't understand status="notrun" and this can throw off the list and count of tests run at the Jenkins level. A good test would be to run this on Jenkins and compare the number of tests seen through JUnitXML before and after (though I think we changed that slightly with the expr-test change). One option is to modify bin/gen-backend-test-script.py to be able to generate scripts that represent the shards. It would take in the actual executable (rather than assuming the unified test executable) and then extra args to specify the shard number and total shards. Not sure if it is worth the complication, but it gets you the "notrun" fix for free. I'm not sure any developer would run one of the shard scripts directly rather than using gtest_filter or ctest. http://gerrit.cloudera.org:8080/#/c/13290/2/be/CMakeLists.txt@675 PS2, Line 675: RUN_SERIAL > right. That's a difference from the way that one of the old "shard and para I think this is the right thing to do. We want parallelism to extend across test boundaries. -- To view, visit http://gerrit.cloudera.org:8080/13290 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0d44475e3f5a45c806f00d5003eeadf59683b009 Gerrit-Change-Number: 13290 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Comment-Date: Fri, 17 May 2019 20:17:46 +0000 Gerrit-HasComments: Yes
