Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15378 )

Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol
......................................................................


Patch Set 8:

(10 comments)

Responding to the latest set of comments. Code changes will follow soonish.

http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@947
PS8, Line 947: Retries, if enabled, are attempted
             :     for all exceptions other than QueryCancelledByShellException
             :     and MissingThriftMethodException
> My understanding of TApplicationExceptions is that they are thrown by the T
I totally agree with not retrying all exceptions. Only the ones which arise 
from the transport layer. I have made TApplicationExceptions as not retriable.


http://gerrit.cloudera.org:8080/#/c/15378/7/tests/custom_cluster/test_hs2_fault_injection.py
File tests/custom_cluster/test_hs2_fault_injection.py:

http://gerrit.cloudera.org:8080/#/c/15378/7/tests/custom_cluster/test_hs2_fault_injection.py@123
PS7, Line 123:   @pytest.mark.execute_serially
> For what it's worth, it seems that all of the custom cluster are serial, be
Yeah the cluster gets restarted for every test. I am also not sure if we have 
the capability to run tests in parallel for the custom cluster. I see many-many 
custom cluster tests use the '@pytest.mark.execute_serially' tag and many which 
don't.

I originally started writing these tests in an ImpalaTestSuite class.  But, 
realized that the 'test_connect' leaves an idle session since we don't get back 
a session handle (to close). I think the 'test_execute_query' also leaves 
behind and orphan query handle/fragments since we don't close the query (again 
because we don't have the query handle). In ImpalaTestSuite tests there is a 
metrics_verifier which is run at the end (after executing all tests in a class) 
and it fails because some of the metrics such as session_handle and fragments 
etc are non zero.

I could still split the tests, and use CustomClusterTestSuite for the ones 
which don't close properly and use ImpalaTestSuite for other, but I prefer 
having all tests in one common place.


http://gerrit.cloudera.org:8080/#/c/15378/8/tests/custom_cluster/test_hs2_fault_injection.py
File tests/custom_cluster/test_hs2_fault_injection.py:

http://gerrit.cloudera.org:8080/#/c/15378/8/tests/custom_cluster/test_hs2_fault_injection.py@30
PS8, Line 30:   """Class for injecting faults in the THttpClient."""
> should include an explanation of how faults are injected (e.g. they are inj
Done.


http://gerrit.cloudera.org:8080/#/c/15378/8/tests/custom_cluster/test_hs2_fault_injection.py@40
PS8, Line 40: fault
> nit: enable_fault would sound clearer
Done


http://gerrit.cloudera.org:8080/#/c/15378/8/tests/custom_cluster/test_hs2_fault_injection.py@81
PS8, Line 81:   """Fault injecting ImpalaHS2Client class using http as 
transport"""
> should mention the relation between this class and FaultInjectingHttpClient
Done


http://gerrit.cloudera.org:8080/#/c/15378/8/tests/custom_cluster/test_hs2_fault_injection.py@85
PS8, Line 85:     """Creates a transport with HTTP as the base."""
> Or possibly moved above the call to super(...) as a proper docstring.
Good point. I'll move it as a proper docstring.


http://gerrit.cloudera.org:8080/#/c/15378/8/tests/custom_cluster/test_hs2_fault_injection.py@101
PS8, Line 101: CustomClusterTestSuite
> why does this need to be a custom cluster test? you are just injecting faul
The test_connect test leaves an idle session since we don't get back a session 
handle. I think the test_execute_query also leaves behind and orphan query 
handle/fragments since we don't close the query (again because we don't have 
the query handle). In ImpalaTestSuite tests there is a metrics_verifier which 
is run at the end (after executing all tests in a class) and it fails because 
some of the metrics such as session_handle and fragments etc are non zero.


http://gerrit.cloudera.org:8080/#/c/15378/8/tests/custom_cluster/test_hs2_fault_injection.py@106
PS8, Line 106:     kerberos_host_fqdn = None
> why is this needed, just pass None as a parameter?
Done.


http://gerrit.cloudera.org:8080/#/c/15378/8/tests/custom_cluster/test_hs2_fault_injection.py@128
PS8, Line 128:     self.transport.fault(502, "Injected Fault", 0.20)
> whats the point in adding a frequency to these tests? why not always inject
I think the point of having a frequency is so that we have some faults and we 
are able to recover from them after retries. The retries only help if faults 
don't happen 100% of the time, else the retries will also fail.


http://gerrit.cloudera.org:8080/#/c/15378/8/tests/custom_cluster/test_hs2_fault_injection.py@129
PS8, Line 129:     self.connect()
> is there a way to actually verify that operations were retried?
I think we might be able to get it from stderr.



--
To view, visit http://gerrit.cloudera.org:8080/15378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5
Gerrit-Change-Number: 15378
Gerrit-PatchSet: 8
Gerrit-Owner: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: David Knupp <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Comment-Date: Thu, 19 Mar 2020 20:55:03 +0000
Gerrit-HasComments: Yes

Reply via email to