[ 
https://issues.apache.org/jira/browse/IMPALA-13912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17939677#comment-17939677
 ] 

Csaba Ringhofer commented on IMPALA-13912:
------------------------------------------

A few pointers on how this could be implemented without rewriting a lot of 
tests:
CustomClusterTestSuite could save the arguments of the last started cluster, 
and if the next test uses the same args, no restart is needed.
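
A minimal sketch of that idea, assuming the cached state lives in a
module-level variable. The names _last_cluster_args, normalize_args and
ensure_cluster are hypothetical and do not exist in
custom_cluster_test_suite.py; start_cluster stands in for whatever actually
restarts the cluster.

```python
# Hypothetical sketch: none of these names come from custom_cluster_test_suite.py.
_last_cluster_args = None  # normalized args of the cluster started by this process


def normalize_args(**kwargs):
    """Return an order-independent, comparable form of the startup args."""
    # Keys are unique, so sorting never has to compare the values themselves.
    return tuple(sorted((k, v) for k, v in kwargs.items() if v is not None))


def ensure_cluster(start_cluster, **kwargs):
    """Start a cluster unless one with identical args is already running.

    Returns True if a restart happened, False if the running cluster is reused.
    """
    global _last_cluster_args
    wanted = normalize_args(**kwargs)
    if wanted == _last_cluster_args:
        return False
    start_cluster(**kwargs)
    # Record the args only after the start succeeded, so a failed start
    # is retried by the next test instead of being treated as a cache hit.
    _last_cluster_args = wanted
    return True
```

With this shape, the four test vectors of a test like test_ssl would trigger
one restart instead of four, as long as the decorator args are identical.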

The cluster restart happens here (if per class args):
https://github.com/apache/impala/blob/b121a40d20107bad6c04732ba580f26639acd43a/tests/common/custom_cluster_test_suite.py#L148
or here (if per test args):
https://github.com/apache/impala/blob/b121a40d20107bad6c04732ba580f26639acd43a/tests/common/custom_cluster_test_suite.py#L377

AFAIK pytest runs the tests in a given file in a single process, so a 
global/static var is enough to store the arguments of the running cluster. If 
it has to be passed between processes, then a file like 
/tmp/impala-custom-cluster-args.json could be used. Besides the startup args, 
the pids of the started processes could also be saved, so that a later test 
can check that the cluster is still alive.
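
The file-based variant could look roughly like the sketch below. The file
path is the one suggested above; the JSON layout ("args" and "pids" keys) and
the helper names are assumptions made for illustration.

```python
import json
import os

# Path from the comment above; the JSON layout is an assumption.
STATE_FILE = "/tmp/impala-custom-cluster-args.json"


def save_cluster_state(args, pids, path=STATE_FILE):
    """Record the startup args and the pids of the started daemons."""
    with open(path, "w") as f:
        json.dump({"args": args, "pids": pids}, f)


def pid_alive(pid):
    """Signal 0 only checks that the process exists (POSIX); nothing is sent."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def can_reuse_cluster(args, path=STATE_FILE):
    """True if the saved args match and every recorded process is still alive."""
    try:
        with open(path) as f:
            state = json.load(f)
    except (OSError, ValueError):  # missing, unreadable or corrupt file
        return False
    pids = state.get("pids") or []
    return (state.get("args") == args and bool(pids)
            and all(pid_alive(p) for p in pids))
```

A pid can be reused by an unrelated process, so this is only a best-effort
liveness check; a real implementation would probably also verify the process
name or probe the daemons directly.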

Some tests need a fresh cluster, for example if they parse the logs. In these 
cases this optimization has to be disabled.
This could be done by adding an argument to 
https://github.com/apache/impala/blob/b121a40d20107bad6c04732ba580f26639acd43a/tests/common/custom_cluster_test_suite.py#L156
 , e.g. force_restart=False
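
As a rough illustration of the opt-out, the sketch below uses a heavily
simplified stand-in for the decorator. The real
CustomClusterTestSuite.with_args takes more parameters and works differently;
the attribute names and the needs_restart helper are hypothetical.

```python
def with_args(force_restart=False, **cluster_args):
    """Simplified, hypothetical stand-in for CustomClusterTestSuite.with_args."""
    def decorate(func):
        # Attach the requested args and the opt-out flag to the test function.
        func.cluster_args = cluster_args
        func.force_restart = force_restart
        return func
    return decorate


def needs_restart(test_func, running_args):
    """Restart if the test demands a fresh cluster or the args differ."""
    if getattr(test_func, "force_restart", False):
        return True
    return getattr(test_func, "cluster_args", None) != running_args


@with_args(impalad_args="--some_flag")
def test_reuses_cluster():
    pass


@with_args(force_restart=True, impalad_args="--some_flag")
def test_parses_logs():
    pass
```

Defaulting force_restart to False means existing tests get the optimization
without changes, and only tests that depend on a fresh cluster need to be
touched.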

> Use SHARED_CLUSTER_ARGS in more custom cluster tests
> ----------------------------------------------------
>
>                 Key: IMPALA-13912
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13912
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Test
>            Reporter: Csaba Ringhofer
>            Assignee: Mihaly Szjatinya
>            Priority: Minor
>              Labels: ramp-up
>
> IMPALA-13503 allowed restarting the cluster only once per test suite in 
> CustomClusterTestSuites using per suite @CustomClusterTestSuite.with_args.
> There are many custom cluster tests that could be restructured to be much 
> faster this way, for example 
> https://github.com/apache/impala/blob/7f38c7ed61a0536c430311b3d4600aa0a16b988a/tests/custom_cluster/test_client_ssl.py#L100C7-L100C15
> {code}
>   @CustomClusterTestSuite.with_args(impalad_args=SSL_ARGS, statestored_args=SSL_ARGS,
>                                     catalogd_args=SSL_ARGS)
>   def test_ssl(self, vector):
> {code}
> The test above is run with 4 test vectors (see add_test_dimensions) and the 
> cluster is restarted each time. This is not needed, as the test vector doesn't 
> affect the cluster parameters. A possible fix for this is to split the test 
> suite into multiple suites where CustomClusterTestSuite.with_args is set per 
> suite instead of per test.
> Tests like this seem very common - the following rough estimate returns 189:
> {code}
> git grep -B5 "def test.*vector" | grep "CustomClusterTestSuite.with_args" | wc -l
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
