Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17909 )
Change subject: [master] KUDU-1885: re-resolve TSDescriptor proxies on network error
......................................................................


Patch Set 5: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17909/5/src/kudu/integration-tests/dns_alias-itest.cc
File src/kudu/integration-tests/dns_alias-itest.cc:

http://gerrit.cloudera.org:8080/#/c/17909/5/src/kudu/integration-tests/dns_alias-itest.cc@384
PS5, Line 384:   to the tserver with a new address
IIRC, the catalog manager doesn't assign tablet replicas to a tablet server it knows is down (controlled by --tserver_unresponsive_timeout_ms). Maybe add a blurb explaining that the catalog manager isn't able to see the change in the state of the tserver because of the default setting for --tserver_unresponsive_timeout_ms?

BTW, did you see any flakiness in this scenario when running in the TSAN configuration? If so, does it make sense to increase the --tserver_unresponsive_timeout_ms setting for the master?


http://gerrit.cloudera.org:8080/#/c/17909/5/src/kudu/integration-tests/dns_alias-itest.cc@421
PS5, Line 421:   ASSERT_OK(tserver_proxy->ListTablets(req, &resp, &controller));
nit: does it make sense to add a check for resp.has_error() as well, just for easier debugging if the call fails?


-- 
To view, visit http://gerrit.cloudera.org:8080/17909
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6245f75a232fd4827de684cfc04d6b6e53b7ddef
Gerrit-Change-Number: 17909
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ban...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 08 Oct 2021 19:30:18 +0000
Gerrit-HasComments: Yes