[jira] [Updated] (KUDU-3464) Failed mini-cluster creation leaves chronyd open in Python test infra

Marton Greber (Jira) Wed, 22 Mar 2023 08:10:11 -0700


     [ 
https://issues.apache.org/jira/browse/KUDU-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Marton Greber updated KUDU-3464:
--------------------------------
    Description: 
*Description:*
While adding extra startup flag support to the Python test infra, I was 
experimenting with negative tests. One example is a test class that specifies 
a wrong flag name.
{code:python}
class TestKuduTestStartupFlagsMasterWrongFlagName(KuduTestBase, CompatUnitTest):
    @classmethod
    def setUpClass(self):
        extra_master_flags=[("non_existent_flag","1")]
        extra_tserver_flags=[("tablet_apply_pool_overload_threshold_ms", "1")]
        error_msg = 'RUNTIME_ERROR'
        with self.assertRaisesRegex(self, Exception, error_msg):
            super(TestKuduTestStartupFlagsMasterWrongFlagName, self)\
                .setUpClass(extra_master_flags, extra_tserver_flags)

    def test_startup_flags_master_wrong_flag_name(self):
        pass

class TestKuduTestStartupFlagsTserverWrongFlagName(KuduTestBase, CompatUnitTest):
    @classmethod
    def setUpClass(self):
        extra_master_flags=[("check_expired_table_interval_seconds","1")]
        extra_tserver_flags=[("non_existent_flag","1")]
        error_msg = 'RUNTIME_ERROR'
        with self.assertRaisesRegex(self, Exception, error_msg):
            super(TestKuduTestStartupFlagsTserverWrongFlagName, self)\
                .setUpClass(extra_master_flags, extra_tserver_flags)

    def test_startup_flags_tserver_wrong_flag_name(self):
        pass
{code}
Run individually, these tests pass. Running them one after the other results 
in the following error:
{code:bash}
2023-03-22T14:48:01Z Fatal error : Another chronyd may already be running (pid=377244), check /tmp/kudutest-0/minicluster-data/chrony.0/chronyd.pid
Could not open connection to daemon
{code}
Setup times out with the above error when control flow reaches the second 
test:
{code:python}
_ ERROR at setup of TestKuduTestStartupFlagsTserverWrongFlagName.test_startup_flags_tserver_wrong_flag_name _
Exception: Error in response: {'code': 'TIMED_OUT', 'message': 'failed to start NTP server 0: failed to contact chronyd in 1.000s'}

During handling of the above exception, another exception occurred:

self = <class 'kudu.tests.test_common.TestKuduTestStartupFlagsTserverWrongFlagName'>

    @classmethod
    def setUpClass(self):
        extra_master_flags=[("check_expired_table_interval_seconds","1")]
        extra_tserver_flags=[("non_existent_flag","1")]
        error_msg = 'RUNTIME_ERROR'
        with self.assertRaisesRegex(self, Exception, error_msg):
            super(TestKuduTestStartupFlagsTserverWrongFlagName, self)\
>               .setUpClass(extra_master_flags, extra_tserver_flags)
E           TypeError: _formatMessage() missing 1 required positional argument: 'standardMsg'

kudu/tests/test_common.py:82: TypeError
{code}
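
A side note on the TypeError at the end of the traceback: it looks separate 
from the chronyd problem. Inside a classmethod, `self` is the class, so 
`self.assertRaisesRegex(self, ...)` hands unittest a class where it expects a 
test instance. When the raised message does not match the regex (here TIMED_OUT 
instead of RUNTIME_ERROR), unittest tries to report the mismatch through 
`_formatMessage` and fails. A minimal sketch that reproduces this with plain 
unittest only; `Demo`, `observe` and `observe_with_instance` are made-up names, 
not part of the Kudu test code:

```python
import unittest


class Demo(unittest.TestCase):
    @classmethod
    def check_on_class(cls):
        # Mirrors the report: the assertion helper is looked up on the class
        # and the class is passed where a test instance is expected.
        with cls.assertRaisesRegex(cls, Exception, "RUNTIME_ERROR"):
            raise Exception("TIMED_OUT: failed to contact chronyd")

    def runTest(self):
        # Present only so that Demo() can be instantiated directly.
        pass


def observe():
    """Call the class-level variant and report which error escapes."""
    try:
        Demo.check_on_class()
    except TypeError as e:
        return "TypeError", str(e)
    except AssertionError as e:
        return "AssertionError", str(e)
    return "no error", ""


def observe_with_instance():
    """Same check on a real instance, where _formatMessage is bound."""
    case = Demo()
    try:
        with case.assertRaisesRegex(Exception, "RUNTIME_ERROR"):
            raise Exception("TIMED_OUT: failed to contact chronyd")
    except AssertionError as e:
        return "AssertionError", str(e)
    return "no error", ""
```

With an instance, the mismatch surfaces as a regular AssertionError that names 
both the expected pattern and the actual message, which is easier to read than 
the TypeError.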
I suspect that when the first cluster creation fails, chronyd is not properly 
disposed of.
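
To illustrate the kind of cleanup I have in mind, here is a rough sketch, not 
the actual mini-cluster code. `HypotheticalMiniCluster`, `_start_daemons` and 
`start_and_report` are invented for this example, and a sleeping Python 
subprocess stands in for chronyd. The idea is that if startup fails after the 
helper process was launched, the helper is stopped before the error is 
re-raised:

```python
import subprocess
import sys


class HypotheticalMiniCluster:
    """Illustrative stand-in; not the real Kudu mini-cluster class."""

    def __init__(self):
        self.helper = None  # stands in for the chronyd process

    def start(self, flags):
        # Launch a long-lived helper first, as a stand-in for chronyd.
        self.helper = subprocess.Popen(
            [sys.executable, "-c", "import time; time.sleep(60)"])
        try:
            self._start_daemons(flags)
        except Exception:
            # Startup failed part-way: stop the helper before re-raising,
            # so a later start() does not find a stale process.
            self.stop()
            raise

    def _start_daemons(self, flags):
        # Simulate a daemon rejecting an unknown flag.
        for name, _value in flags:
            if name == "non_existent_flag":
                raise RuntimeError("RUNTIME_ERROR: unknown flag " + name)

    def stop(self):
        # Terminate the helper if it is still alive, then forget it.
        if self.helper is not None and self.helper.poll() is None:
            self.helper.terminate()
            self.helper.wait(timeout=10)
        self.helper = None


def start_and_report(flags):
    """Return (error message or None, whether a helper is still tracked)."""
    cluster = HypotheticalMiniCluster()
    try:
        cluster.start(flags)
        error = None
    except RuntimeError as e:
        error = str(e)
    still_running = cluster.helper is not None
    cluster.stop()
    return error, still_running
```

With this shape, a failed start leaves no helper behind, so a second cluster 
created in the same test run would not collide with a leftover pid file.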

(After the test run exits, the /tmp/kudutest-0 directory mentioned above is 
cleaned up properly.)

*Consequences:*

If developers writing Python tests get a flag name or value wrong more than 
once in the code, they get "Could not open connection to daemon" errors, which 
are not very helpful at first.

However, for correctly written test code this bug has no negative effect.
        Summary: Failed mini-cluster creation leaves chronyd open in Python 
test infra  (was: Failed mini-cluster creation leaves chronyd running in Python 
test infra)

> Failed mini-cluster creation leaves chronyd open in Python test infra
> ---------------------------------------------------------------------
>
>                 Key: KUDU-3464
>                 URL: https://issues.apache.org/jira/browse/KUDU-3464
>             Project: Kudu
>          Issue Type: Bug
>            Reporter: Marton Greber
>            Priority: Minor
>              Labels: client
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
