Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11608 )
Change subject: IMPALA-7597: wraps retries around Frontend metadata operations. ...................................................................... Patch Set 5: (5 comments) thx for the suggestions. I brought up lambdas before but need java 8-- added a todo to switch it over, which should make this part nicer. http://gerrit.cloudera.org:8080/#/c/11608/4/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/11608/4/fe/src/main/java/org/apache/impala/service/Frontend.java@681 PS4, Line 681: RI > I meant, log the exception stack trace too. LOG.warn("foo", e) instead of L Done http://gerrit.cloudera.org:8080/#/c/11608/5/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/11608/5/fe/src/main/java/org/apache/impala/service/Frontend.java@754 PS5, Line 754: List<? extends FeDb> dbs = getCatalog().getDbs(matcher); > mention no need to retry for this call path? added a comment in the retry wrapper that explains when retries are not needed. http://gerrit.cloudera.org:8080/#/c/11608/5/fe/src/main/java/org/apache/impala/service/Frontend.java@1463 PS5, Line 1463: FeTable table = getCatalog().getTable(request.getTable_name().getDb_name(), > this one? yup, wrapped this one and another. http://gerrit.cloudera.org:8080/#/c/11608/4/tests/custom_cluster/test_local_catalog.py File tests/custom_cluster/test_local_catalog.py: http://gerrit.cloudera.org:8080/#/c/11608/4/tests/custom_cluster/test_local_catalog.py@280 PS4, Line 280: catalogd_args="--catalog_topic_mode=minimal") > Ya, i meant to replace that "alter" with "refresh" which bumps the version good suggestion. simpler now. http://gerrit.cloudera.org:8080/#/c/11608/5/tests/custom_cluster/test_local_catalog.py File tests/custom_cluster/test_local_catalog.py: http://gerrit.cloudera.org:8080/#/c/11608/5/tests/custom_cluster/test_local_catalog.py@248 PS5, Line 248: attempt < 100 > Is this needed if we join for 30s? I saw that without the patch, the failure comes up pretty quickly. With the patch, there are lots of successes. The more queries run, the more memory was taken up to the point of oom'ing the test runner. So I've capped the number of queries but still guard using a timeout for the case where we're stuck in a longer retry loop. -- To view, visit http://gerrit.cloudera.org:8080/11608 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I43a21571d54a7966c5c68bea1fe69dbc62be2a0b Gerrit-Change-Number: 11608 Gerrit-PatchSet: 5 Gerrit-Owner: Vuk Ercegovac <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tianyi Wang <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: Vuk Ercegovac <[email protected]> Gerrit-Comment-Date: Thu, 18 Oct 2018 05:43:26 +0000 Gerrit-HasComments: Yes
