Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21153 )

Change subject: IMPALA-12913: Refactor Workload Management Custom Cluster Tests
......................................................................


Patch Set 8:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21153/6/tests/custom_cluster/test_query_log.py
File tests/custom_cluster/test_query_log.py:

http://gerrit.cloudera.org:8080/#/c/21153/6/tests/custom_cluster/test_query_log.py@51
PS6, Line 51:
> I wouldn't expect any difference in behavior between Beeswax and HS2, from
A few fields in the completed queries table differ depending on the client 
protocol.

TestQueryLogTableAll is the only test class that runs with both the beeswax 
and hs2 dimensions.  It contains unique tests that I want to run on both 
client protocols to cover any corner cases.  For example, only one DML query 
and one DDL query are asserted, so I test both protocols for those.  I also 
test one invalid query and all combinations of ignored queries to make sure 
that changing the client protocol does not send execution down an unexpected 
code path.  I deliberately duplicated tests only where I thought doing so 
added unique test coverage.


http://gerrit.cloudera.org:8080/#/c/21153/6/tests/custom_cluster/test_query_log.py@97
PS6, Line 97:     # causing the execution to take a very long time.
> Could you comment that we use async to avoid fetching the large result stri
Done


http://gerrit.cloudera.org:8080/#/c/21153/6/tests/custom_cluster/test_query_log.py@383
PS6, Line 383:                                     cluster_size=3,
> This isn't specific to query_log, could we move the sys db tests to a new f
Done


http://gerrit.cloudera.org:8080/#/c/21153/7/tests/custom_cluster/test_query_log.py
File tests/custom_cluster/test_query_log.py:

http://gerrit.cloudera.org:8080/#/c/21153/7/tests/custom_cluster/test_query_log.py@208
PS7, Line 208:       res = client.execute(select_sql)
> Why do you sleep between runs?
The original issue was that the cache was not populated when the queries ran 
too quickly.  I modified this code to check the complete-queries.written 
metric after each query, which ensures there is some downtime between 
queries.
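
For illustration only, here is a minimal sketch of the general "poll a metric 
instead of sleeping" pattern.  It is not the code in this patch and not the 
Impala test framework's wait_for_metric_value; the helper name wait_for_metric, 
its parameters, and the fake metric reader are all hypothetical.

```python
import time


def wait_for_metric(get_metric, expected, timeout_s=30.0, interval_s=0.5):
    """Poll get_metric() until it returns at least `expected`.

    Hypothetical helper: `get_metric` is any zero-argument callable that
    returns the current numeric value of the metric being watched.
    Raises TimeoutError if the value is not reached within `timeout_s`.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        value = get_metric()
        if value >= expected:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "metric stayed at %r, expected at least %r" % (value, expected))
        time.sleep(interval_s)


# Usage with a fake metric that increases by one on every read.
reads = {"count": 0}


def fake_metric():
    reads["count"] += 1
    return reads["count"]


final_value = wait_for_metric(fake_metric, expected=3, interval_s=0.01)
```

Compared with a fixed sleep, polling returns as soon as the condition holds and 
fails with a clear error if it never does.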


http://gerrit.cloudera.org:8080/#/c/21153/7/tests/custom_cluster/test_query_log.py@214
PS7, Line 214:     self.cluster.get_first_impalad().service.wait_for_metric_value(
> What are you actually waiting on. Is there a data-cache metric we can watch
The cache needs time to be populated; otherwise it is not used.  I modified 
this code to check a metric instead of sleeping.



--
To view, visit http://gerrit.cloudera.org:8080/21153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e3249a8f306cf43de0d6f6586711c779399e83b
Gerrit-Change-Number: 21153
Gerrit-PatchSet: 8
Gerrit-Owner: Jason Fehr <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Tue, 26 Mar 2024 22:28:52 +0000
Gerrit-HasComments: Yes
