[kudu-CR] catalog manager tsk-itest: ensure that test eventually makes progress
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8567 )

Change subject: catalog_manager_tsk-itest: ensure that test eventually makes progress

catalog_manager_tsk-itest: ensure that test eventually makes progress

This test previously tried to introduce a lot of master leader elections by setting a very low heartbeat and failure interval. This worked, but sometimes worked so well that the test never made progress and couldn't obtain a stable leader long enough to create a table.

This patch changes the test to instead use a separate thread which triggers elections manually on all the leaders. The elections start off very frequent and then back off as the test progresses to ensure that by the end, the leaders do actually make progress.

I verified that this still covers the case of a failed write when writing TSKs by changing the RETURN_NOT_OK to a CHECK_OK when storing the TSK. With the CHECK_OK, the test failed nearly immediately.

Change-Id: I3ecda0c269225e7674bc384fee652576b110ae7b
Reviewed-on: http://gerrit.cloudera.org:8080/8567
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin
---
M src/kudu/integration-tests/catalog_manager_tsk-itest.cc
1 file changed, 46 insertions(+), 20 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Alexey Serbin: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/8567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3ecda0c269225e7674bc384fee652576b110ae7b
Gerrit-Change-Number: 8567
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
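The approach described in the commit message (a separate thread that forces elections very often at first and then backs off) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: `BackoffSchedule`, `ElectionDisruptor`, the multiplicative backoff, and the `trigger_election` callback are all hypothetical names and choices, and the callback stands in for whatever the real test uses to start an election on the masters.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

using Millis = std::chrono::milliseconds;

// Hypothetical helper: builds the delays between forced elections. Starts at
// `initial`, multiplies by `factor` after every round, and caps at `max`.
std::vector<Millis> BackoffSchedule(Millis initial, Millis max,
                                    double factor, int rounds) {
  assert(factor >= 1.0);
  std::vector<Millis> delays;
  Millis cur = initial;
  for (int i = 0; i < rounds; ++i) {
    delays.push_back(cur);
    Millis next(static_cast<long long>(cur.count() * factor));
    cur = std::min(next, max);
  }
  return delays;
}

// Hypothetical helper: runs `trigger_election` on a background thread once
// per entry in `delays`, sleeping for that entry afterwards. When the
// schedule is exhausted the thread exits, so the leaders are left alone and
// the test body can make progress.
class ElectionDisruptor {
 public:
  ElectionDisruptor(std::function<void()> trigger_election,
                    std::vector<Millis> delays)
      : trigger_election_(std::move(trigger_election)),
        delays_(std::move(delays)) {}

  ~ElectionDisruptor() { Stop(); }

  void Start() { thread_ = std::thread([this] { Run(); }); }

  // Waits for the schedule to finish.
  void Join() {
    if (thread_.joinable()) thread_.join();
  }

  // Requests an early exit; takes effect after the current sleep ends.
  void Stop() {
    stop_.store(true);
    Join();
  }

 private:
  void Run() {
    for (const Millis& delay : delays_) {
      if (stop_.load()) return;
      trigger_election_();
      std::this_thread::sleep_for(delay);
    }
  }

  std::function<void()> trigger_election_;
  std::vector<Millis> delays_;
  std::atomic<bool> stop_{false};
  std::thread thread_;
};
```

A test would construct the disruptor with a callback that starts an election on each master, call `Start()` before the workload, and `Stop()` afterwards. Because the delays only grow, the periods of leader stability get longer over time, which is the property the commit message relies on.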
[kudu-CR] catalog manager tsk-itest: ensure that test eventually makes progress
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/8567 )

Change subject: catalog_manager_tsk-itest: ensure that test eventually makes progress

Patch Set 2: Code-Review+2

--
To view, visit http://gerrit.cloudera.org:8080/8567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ecda0c269225e7674bc384fee652576b110ae7b
Gerrit-Change-Number: 8567
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
Gerrit-Comment-Date: Fri, 17 Nov 2017 23:40:17 +
Gerrit-HasComments: No
[kudu-CR] catalog manager tsk-itest: ensure that test eventually makes progress
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/8567 )

Change subject: catalog_manager_tsk-itest: ensure that test eventually makes progress

Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8567/1/src/kudu/integration-tests/catalog_manager_tsk-itest.cc
File src/kudu/integration-tests/catalog_manager_tsk-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8567/1/src/kudu/integration-tests/catalog_manager_tsk-itest.cc@18
PS1, Line 18: #include
> nit: maybe, replace with and put along with the rest of C++ heade
Done

http://gerrit.cloudera.org:8080/#/c/8567/1/src/kudu/integration-tests/catalog_manager_tsk-itest.cc@20
PS1, Line 20: #include
> warning: #includes are not sorted properly [llvm-include-order]
Done

--
To view, visit http://gerrit.cloudera.org:8080/8567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ecda0c269225e7674bc384fee652576b110ae7b
Gerrit-Change-Number: 8567
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
Gerrit-Comment-Date: Fri, 17 Nov 2017 21:48:39 +
Gerrit-HasComments: Yes
[kudu-CR] catalog manager tsk-itest: ensure that test eventually makes progress
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8567

to look at the new patch set (#2).

Change subject: catalog_manager_tsk-itest: ensure that test eventually makes progress

catalog_manager_tsk-itest: ensure that test eventually makes progress

This test previously tried to introduce a lot of master leader elections by setting a very low heartbeat and failure interval. This worked, but sometimes worked so well that the test never made progress and couldn't obtain a stable leader long enough to create a table.

This patch changes the test to instead use a separate thread which triggers elections manually on all the leaders. The elections start off very frequent and then back off as the test progresses to ensure that by the end, the leaders do actually make progress.

I verified that this still covers the case of a failed write when writing TSKs by changing the RETURN_NOT_OK to a CHECK_OK when storing the TSK. With the CHECK_OK, the test failed nearly immediately.

Change-Id: I3ecda0c269225e7674bc384fee652576b110ae7b
---
M src/kudu/integration-tests/catalog_manager_tsk-itest.cc
1 file changed, 46 insertions(+), 20 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/8567/2

--
To view, visit http://gerrit.cloudera.org:8080/8567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ecda0c269225e7674bc384fee652576b110ae7b
Gerrit-Change-Number: 8567
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
[kudu-CR] catalog manager tsk-itest: ensure that test eventually makes progress
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/8567 )

Change subject: catalog_manager_tsk-itest: ensure that test eventually makes progress

Patch Set 1: Code-Review+2

(2 comments)

Thank you for fixing the flake. It was more stable when we used separate threads for sending Raft heartbeats. I made a couple of updates to the test, increasing the HB interval, so the test became stable. However, I rolled that back with 9c1997a after the fix for KUDU-2149 was committed, just to have this test as a canary for things like KUDU-2149.

I think this fix is definitely a more comprehensive approach than just increasing the Raft HB interval.

http://gerrit.cloudera.org:8080/#/c/8567/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8567/1//COMMIT_MSG@10
PS1, Line 10: but
: sometimes worked so well that the test never made progress and couldn't
: obtain a stable leader long enough to create a table
lol :)

http://gerrit.cloudera.org:8080/#/c/8567/1/src/kudu/integration-tests/catalog_manager_tsk-itest.cc
File src/kudu/integration-tests/catalog_manager_tsk-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8567/1/src/kudu/integration-tests/catalog_manager_tsk-itest.cc@18
PS1, Line 18: #include
nit: maybe, replace with and put along with the rest of C++ headers?

--
To view, visit http://gerrit.cloudera.org:8080/8567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ecda0c269225e7674bc384fee652576b110ae7b
Gerrit-Change-Number: 8567
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Thu, 16 Nov 2017 06:36:59 +
Gerrit-HasComments: Yes
[kudu-CR] catalog manager tsk-itest: ensure that test eventually makes progress
Hello Alexey Serbin,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/8567

to review the following change.

Change subject: catalog_manager_tsk-itest: ensure that test eventually makes progress

catalog_manager_tsk-itest: ensure that test eventually makes progress

This test previously tried to introduce a lot of master leader elections by setting a very low heartbeat and failure interval. This worked, but sometimes worked so well that the test never made progress and couldn't obtain a stable leader long enough to create a table.

This patch changes the test to instead use a separate thread which triggers elections manually on all the leaders. The elections start off very frequent and then back off as the test progresses to ensure that by the end, the leaders do actually make progress.

I verified that this still covers the case of a failed write when writing TSKs by changing the RETURN_NOT_OK to a CHECK_OK when storing the TSK. With the CHECK_OK, the test failed nearly immediately.

Change-Id: I3ecda0c269225e7674bc384fee652576b110ae7b
---
M src/kudu/integration-tests/catalog_manager_tsk-itest.cc
1 file changed, 47 insertions(+), 20 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/8567/1

--
To view, visit http://gerrit.cloudera.org:8080/8567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3ecda0c269225e7674bc384fee652576b110ae7b
Gerrit-Change-Number: 8567
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Alexey Serbin
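The verification step in the commit message relies on the difference between propagating a failed status and crashing on it. The sketch below illustrates that difference with simplified stand-ins. `SimpleStatus`, `SKETCH_RETURN_NOT_OK`, `SKETCH_CHECK_OK`, and `StoreKey` are hypothetical and are not Kudu's actual definitions; they only mimic the general behavior of a macro that returns the error to the caller versus one that aborts the process.

```cpp
#include <cstdlib>
#include <iostream>
#include <string>
#include <utility>

// Hypothetical minimal status type; NOT kudu::Status.
struct SimpleStatus {
  bool ok;
  std::string msg;
  static SimpleStatus OK() { return {true, ""}; }
  static SimpleStatus IOError(std::string m) { return {false, std::move(m)}; }
};

// Stand-in for a "return on error" macro: hands the failure to the caller.
#define SKETCH_RETURN_NOT_OK(expr)       \
  do {                                   \
    SimpleStatus _s = (expr);            \
    if (!_s.ok) return _s;               \
  } while (0)

// Stand-in for a "check" macro: any failure terminates the process.
#define SKETCH_CHECK_OK(expr)                              \
  do {                                                     \
    SimpleStatus _s = (expr);                              \
    if (!_s.ok) {                                          \
      std::cerr << "Check failed: " << _s.msg << std::endl; \
      std::abort();                                        \
    }                                                      \
  } while (0)

// Hypothetical write that fails when, for example, leadership was lost.
SimpleStatus StoreKey(bool write_fails) {
  if (write_fails) return SimpleStatus::IOError("not the leader");
  return SimpleStatus::OK();
}

// Error-tolerant path: a failed write is reported, the process keeps running.
SimpleStatus StoreKeyPropagating(bool write_fails) {
  SKETCH_RETURN_NOT_OK(StoreKey(write_fails));
  return SimpleStatus::OK();
}

// Crash-on-error path: a failed write aborts the process.
SimpleStatus StoreKeyCrashing(bool write_fails) {
  SKETCH_CHECK_OK(StoreKey(write_fails));
  return SimpleStatus::OK();
}
```

With the propagating variant, a write that fails during an election is just an error the caller can handle. Swapping in the crashing variant turns every such failure into a process abort, so a test that actually provokes failed writes would be expected to fail quickly, which matches what the commit message reports.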