Matthew Jacobs has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) ......................................................................
Patch Set 3: > So the GVO failed because there's a custom cluster test that runs > impala with a very short (1ms) timeout for Kudu operations and then > runs two queries, each of which it expects to timeout while > retrieving table metadata. > > Since we reuse clients now, the metadata ends up being fetched by > the first query (even though the operation times out, it apparently > still completes afterwards) and then the second query doesn't time > out since the metadata is already preset (both tests currently > touch the same table, but changing the second one to a different > table doesn't seem to fix the problem, presumably the Kudu client > loads more tables than explicitly requested). > > Some possible solutions: > - Only run one of the queries. This would cause some coverage to be > lost (we could replace it by adding a test that checks for timeout > behavior in the BE, which I don't think currently exists, and would > work since the BE and FE still use different clients) > - Run both of the tests in separate custom_cluster tests. > custom_cluster tests are expensive (this one runs for ~20s) so it > seems like a lot of test overhead for little benefit. > - Provide some way to invalidate the Kudu clients, eg. have > 'invalidate metadata' clear the list of clients in KuduUtil. We may > want to do something like this anyways since in the current version > of the patch once a KuduClient is created for a particular Kudu > master, that client will live on as long as the impalad does, and > if something goes wrong (eg. the automatic token refresh that the > Kudu team implemented doesn't work for some reason) you're > basically screwed and won't be able to do anything with that Kudu > master without restarting the impalad. Good catch, yes that seems like the test won't work after your change. Let's do this: 1) Change the tests (I think you'll have to update both kudu-timeouts-impalad and kudu-timeouts-catalogd) to only run the first for now. I agree it's not a huge amount of lost coverage. Just comment out the rest and leave a TODO to add them back, w/ a reference to the JIRA in #2. 2) File a JIRA to provide some way to invalidate the Kudu clients. It's a bit different from HMS metadata since it's not really per table, it's per Kudu server. I'm not sure yet if this is something that will be worthwhile to expose for general use (e.g. a well documented stmt like 'invalidate metadata'), or if we should try to do something that would be more covert, e.g. a web endpoint (hacky). -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-HasComments: No
