Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE)
......................................................................


Patch Set 3:

> So the GVO failed because there's a custom cluster test that runs
 > impala with a very short (1ms) timeout for Kudu operations and then
 > runs two queries, each of which it expects to timeout while
 > retrieving table metadata.
 > 
 > Since we reuse clients now, the metadata ends up being fetched by
 > the first query (even though the operation times out, it apparently
 > still completes afterwards) and then the second query doesn't time
 > out since the metadata is already preset (both tests currently
 > touch the same table, but changing the second one to a different
 > table doesn't seem to fix the problem, presumably the Kudu client
 > loads more tables than explicitly requested).
 > 
 > Some possible solutions:
 > - Only run one of the queries. This would cause some coverage to be
 > lost (we could replace it by adding a test that checks for timeout
 > behavior in the BE, which I don't think currently exists, and would
 > work since the BE and FE still use different clients)
 > - Run both of the tests in separate custom_cluster tests.
 > custom_cluster tests are expensive (this one runs for ~20s) so it
 > seems like a lot of test overhead for little benefit.
 > - Provide some way to invalidate the Kudu clients, eg. have
 > 'invalidate metadata' clear the list of clients in KuduUtil. We may
 > want to do something like this anyways since in the current version
 > of the patch once a KuduClient is created for a particular Kudu
 > master, that client will live on as long as the impalad does, and
 > if something goes wrong (eg. the automatic token refresh that the
 > Kudu team implemented doesn't work for some reason) you're
 > basically screwed and won't be able to do anything with that Kudu
 > master without restarting the impalad.

Good catch, yes that seems like the test won't work after your change.

Let's do this:
1) Change the tests (I think you'll have to update both kudu-timeouts-impalad 
and kudu-timeouts-catalogd) to only run the first for now. I agree it's not a 
huge amount of lost coverage. Just comment out the rest and leave a TODO to add 
them back, w/ a reference to the JIRA in #2.
2) File a JIRA to provide some way to invalidate the Kudu clients. It's a bit 
different from HMS metadata since it's not really per table, it's per Kudu 
server. I'm not sure yet if this is something that will be worthwhile to expose 
for general use (e.g. a well documented stmt like 'invalidate metadata'), or if 
we should try to do something that would be more covert, e.g. a web endpoint 
(hacky).

-- 
To view, visit http://gerrit.cloudera.org:8080/6898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-HasComments: No

Reply via email to