[Impala-ASF-CR] (PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode
Tim Armstrong has abandoned this change. ( http://gerrit.cloudera.org:8080/7467 )

Change subject: (PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode

Abandoned

Will abandon for now. I linked the CR from the JIRA so it can be found.

--
To view, visit http://gerrit.cloudera.org:8080/7467
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I865db25b07728f3886133316ded5122c60490967
Gerrit-Change-Number: 7467
Gerrit-PatchSet: 2
Gerrit-Owner: Henry Robinson
Gerrit-Reviewer: Henry Robinson
Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] (PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode
Henry Robinson has posted comments on this change.

Change subject: (PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode

Patch Set 2:

Thanks for the review, btw - I'm working on some ergonomic improvements before resubmitting, which is why this is taking a little while to resurface.
[Impala-ASF-CR] (PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode
Tim Armstrong has posted comments on this change.

Change subject: (PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode

Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7467/2//COMMIT_MSG
Commit Message:

Line 9: Googletest supports sharded execution, where multiple independent
IIRC ctest also supports running multiple tests in parallel. How would the sharding fit with that? Would we shard big tests and run smaller tests in parallel?

Line 30: Each shard process logs to its own directory 'foo_test_shard_N' under
How would I know which output file to look at?

http://gerrit.cloudera.org:8080/#/c/7467/2/bin/run-backend-tests.sh
File bin/run-backend-tests.sh:

Line 40: # mode. This includes all sharded tests, but also the non-sharded versions of any test
The approach seems reasonable unless there's some built-in way to do this.

Line 72: set +e
What's this for?

http://gerrit.cloudera.org:8080/#/c/7467/2/bin/run-sharded-test.sh
File bin/run-sharded-test.sh:

PS2, Line 35: NUM_SHARDS
We should probably document this along with the input args.
[Impala-ASF-CR] (PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode
Henry Robinson has posted comments on this change.

Change subject: (PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode

Patch Set 2:

Only in the sense that I was hoping for some feedback on the approach (e.g. is it a good idea, running N test processes at the same time, logging to different places - will it be more painful to find an error message after the fact)?
[Impala-ASF-CR] (PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode
Tim Armstrong has posted comments on this change.

Change subject: (PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode

Patch Set 2:

Is this still a preview only?
[Impala-ASF-CR] (PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode
Henry Robinson has posted comments on this change.

Change subject: (PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode

Patch Set 2:

Any takers for a review? This patch reduces the be-test end-to-end time from 13.5 minutes to 9.5 minutes on my machine (and there's much more parallelism to be had if we break up some of the larger test cases).
[Impala-ASF-CR] (PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode
Henry Robinson has uploaded a new patch set (#2).

Change subject: (PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode

(PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode

Googletest supports sharded execution, where multiple independent processes can run a subset of a test suite in parallel on the same machine or different machines. This patch adds preliminary support for doing so for some of our backend tests, where the parallelism speedup is worth the extra overhead.

How we do this: for every backend test 'foo', add a new test target in CMake, called 'foo_sharded'. That target executes a wrapper script, bin/run-sharded-test.sh, which runs N copies of the test in parallel, with the environment correctly set up to enable sharding. You can run a sharded test like this:

  ctest -R "foo_sharded"

Then, in bin/run-backend-tests.sh, keep a whitelist of tests that can be successfully run in sharded mode (i.e. there are no known conflicts from running multiple copies at once, and the speedup is worth the overhead). Use that list to run ctest twice, once in serial mode with no sharded tests (using -E to exclude those tests) and once in sharded mode running only the whitelisted sharded tests.

Each shard process logs to its own directory 'foo_test_shard_N' under the usual logs/be_tests/ root.

On my desktop machine, this improves expr-test runtime from ~10 minutes to about 5 minutes. More importantly, this opens up more opportunity for easy latency improvements if we break the large test cases into smaller ones that are parallelisable.

Change-Id: I865db25b07728f3886133316ded5122c60490967
---
M be/CMakeLists.txt
M bin/run-backend-tests.sh
A bin/run-sharded-test.sh
3 files changed, 98 insertions(+), 8 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/7467/2
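For readers who have not seen bin/run-sharded-test.sh, here is a minimal sketch of what a wrapper of that kind might look like. It is not the script from this change. GTEST_TOTAL_SHARDS and GTEST_SHARD_INDEX are the environment variables googletest reads to enable sharding; the function name run_sharded, its arguments, the log file name output.log and the exact directory naming are assumptions made for illustration only.

```shell
# Sketch only, not the script from the patch. Runs num_shards copies of a
# googletest binary in parallel, each with its own log directory.
# Hypothetical interface: run_sharded <test_binary> <num_shards> <log_root>
# Plain (global) variables are used to stay portable across sh and bash.
run_sharded() {
  test_bin="$1"
  num_shards="$2"
  log_root="$3"
  name="$(basename "$test_bin")"

  pids=""
  i=0
  while [ "$i" -lt "$num_shards" ]; do
    shard_dir="$log_root/${name}_shard_$i"
    mkdir -p "$shard_dir"
    # googletest picks its subset of tests from these two variables.
    GTEST_TOTAL_SHARDS="$num_shards" GTEST_SHARD_INDEX="$i" \
      "$test_bin" > "$shard_dir/output.log" 2>&1 &
    pids="$pids $!"
    i=$((i + 1))
  done

  # Wait for every shard and name the log of each one that failed.
  failed=0
  i=0
  for pid in $pids; do
    if ! wait "$pid"; then
      echo "shard $i failed, see $log_root/${name}_shard_$i/output.log" >&2
      failed=1
    fi
    i=$((i + 1))
  done
  return "$failed"
}
```

A call such as `run_sharded ./expr-test 4 logs/be_tests` (binary path and shard count are placeholders) would return non-zero if any shard fails. Printing the path of the failing shard's log is one possible answer to the review question about which output file to look at.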
[Impala-ASF-CR] (PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7467

Change subject: (PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode

(PREVIEW) IMPALA-5684: Optionally run be tests in sharded mode

Googletest supports sharded execution, where multiple independent processes can run a subset of a test suite in parallel on the same machine or different machines. This patch adds preliminary support for doing so for some of our backend tests, where the parallelism speedup is worth the extra overhead.

How we do this: for every backend test 'foo', add a new test target in CMake, called 'foo_sharded'. That target executes a wrapper script, bin/run-sharded-test.sh, which runs N copies of the test in parallel, with the environment correctly set up to enable sharding. You can run a sharded test like this:

  ctest -R "foo_sharded"

Then, in bin/run-backend-tests.sh, keep a whitelist of tests that can be successfully run in sharded mode (i.e. there are no known conflicts from running multiple copies at once, and the speedup is worth the overhead). Use that list to run ctest twice, once in serial mode with no sharded tests (using -E to exclude those tests) and once in sharded mode running only the whitelisted sharded tests.

Each shard process logs to its own directory 'foo_test_shard_N' under the usual logs/be_tests/ root.

On my desktop machine, this improves expr-test runtime from ~10 minutes to about 5 minutes. More importantly, this opens up more opportunity for easy latency improvements if we break the large test cases into smaller ones that are parallelisable.

Change-Id: I865db25b07728f3886133316ded5122c60490967
---
M be/CMakeLists.txt
M bin/run-backend-tests.sh
A bin/run-sharded-test.sh
3 files changed, 97 insertions(+), 8 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/7467/1
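The two ctest passes described in the commit message could be driven from a whitelist roughly as follows. This is a sketch under assumptions, not the contents of bin/run-backend-tests.sh. The helper names, the SHARDED_TESTS variable and its single entry are hypothetical. The sketch also assumes the serial pass should skip the plain versions of whitelisted tests so that they do not run twice, which is one reading of the script comment quoted in the review. Only the -R (include by regex) and -E (exclude by regex) options of ctest are relied on.

```shell
# Hypothetical whitelist of tests considered safe to run sharded.
SHARDED_TESTS="expr-test"

# Print an anchored regex matching only the whitelisted sharded targets,
# e.g. '^(expr-test_sharded|foo_sharded)$'.
sharded_regex() {
  regex=""
  for t in "$@"; do
    regex="${regex:+$regex|}${t}_sharded"
  done
  printf '^(%s)$\n' "$regex"
}

# Print the exclusion regex for the serial pass: every *_sharded target,
# plus the plain versions of whitelisted tests (assumed, see lead-in).
serial_exclude_regex() {
  regex=""
  for t in "$@"; do
    regex="${regex:+$regex|}$t"
  done
  printf '_sharded$|^(%s)$\n' "$regex"
}

# Not invoked here; needs a configured CMake build directory.
run_be_tests() {
  # Pass 1: serial run without any sharded targets.
  ctest -E "$(serial_exclude_regex $SHARDED_TESTS)" || return 1
  # Pass 2: only the whitelisted sharded targets.
  ctest -R "$(sharded_regex $SHARDED_TESTS)"
}
```

Test names containing regex metacharacters would need escaping before being spliced into these patterns; the sketch leaves that out.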