[ https://issues.apache.org/jira/browse/KUDU-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marton Greber updated KUDU-3464:
--------------------------------
Description:
*Description:*
While working on adding extra startup flag support in the Python test infra, I
was tinkering with adding negative tests. One example is when a wrong flag name
is specified in a test class:
{code:python}
class TestKuduTestStartupFlagsMasterWrongFlagName(KuduTestBase, CompatUnitTest):
    @classmethod
    def setUpClass(self):
        extra_master_flags=[("non_existent_flag","1")]
        extra_tserver_flags=[("tablet_apply_pool_overload_threshold_ms", "1")]
        error_msg = 'RUNTIME_ERROR'
        with self.assertRaisesRegex(self, Exception, error_msg):
            super(TestKuduTestStartupFlagsMasterWrongFlagName, self)\
                .setUpClass(extra_master_flags, extra_tserver_flags)

    def test_startup_flags_master_wrong_flag_name(self):
        pass


class TestKuduTestStartupFlagsTserverWrongFlagName(KuduTestBase,
                                                   CompatUnitTest):
    @classmethod
    def setUpClass(self):
        extra_master_flags=[("check_expired_table_interval_seconds","1")]
        extra_tserver_flags=[("non_existent_flag","1")]
        error_msg = 'RUNTIME_ERROR'
        with self.assertRaisesRegex(self, Exception, error_msg):
            super(TestKuduTestStartupFlagsTserverWrongFlagName, self)\
                .setUpClass(extra_master_flags, extra_tserver_flags)

    def test_startup_flags_tserver_wrong_flag_name(self):
        pass
{code}
By themselves, these tests run fine. Running them one after the other results in
the following error:
{code:bash}
2023-03-22T14:48:01Z Fatal error : Another chronyd may already be running (pid=377244), check /tmp/kudutest-0/minicluster-data/chrony.0/chronyd.pid
Could not open connection to daemon
{code}
The run times out with the above error when the control flow reaches the second
test:
{code:python}
_ ERROR at setup of TestKuduTestStartupFlagsTserverWrongFlagName.test_startup_flags_tserver_wrong_flag_name _
Exception: Error in response: {'code': 'TIMED_OUT', 'message': 'failed to start NTP server 0: failed to contact chronyd in 1.000s'}

During handling of the above exception, another exception occurred:

self = <class 'kudu.tests.test_common.TestKuduTestStartupFlagsTserverWrongFlagName'>

    @classmethod
    def setUpClass(self):
        extra_master_flags=[("check_expired_table_interval_seconds","1")]
        extra_tserver_flags=[("non_existent_flag","1")]
        error_msg = 'RUNTIME_ERROR'
        with self.assertRaisesRegex(self, Exception, error_msg):
            super(TestKuduTestStartupFlagsTserverWrongFlagName, self)\
>               .setUpClass(extra_master_flags, extra_tserver_flags)
E           TypeError: _formatMessage() missing 1 required positional argument: 'standardMsg'

kudu/tests/test_common.py:82: TypeError
{code}
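A note on the secondary TypeError in the trace: it looks like a side effect of calling assertRaisesRegex on the class inside a classmethod, not a second Kudu problem. When the raised exception (TIMED_OUT) does not match the expected regex (RUNTIME_ERROR), unittest tries to build a failure message through the test case object. Here that object is the class itself, so _formatMessage is called unbound and is one argument short. This is an inference from CPython's unittest behavior, not something the report states. A minimal standalone sketch (the Demo class and the exception text are made up for illustration):

```python
import unittest


class Demo(unittest.TestCase):
    @classmethod
    def check(cls):
        # Same call shape as in the report: the class itself is passed
        # where a TestCase instance is expected.
        with cls.assertRaisesRegex(cls, Exception, "RUNTIME_ERROR"):
            # The message does not match the regex, so unittest tries to
            # report an assertion failure via the test case object.
            raise Exception("TIMED_OUT: failed to contact chronyd")


try:
    Demo.check()
    outcome = "no error"
except TypeError as exc:
    # Reached because _formatMessage is invoked on the class, unbound.
    outcome = "TypeError: " + str(exc)
except AssertionError as exc:
    outcome = "AssertionError: " + str(exc)

print(outcome)
```

The exact wording of the TypeError message varies between Python versions, but it names the missing 'standardMsg' argument. With a real TestCase instance the same mismatch would surface as an ordinary AssertionError.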
I suspect that when the first cluster creation fails, chronyd is not properly
disposed of.
(After quitting the test execution, the referenced /tmp/kudutest-0 location is
properly cleaned up.)
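To illustrate the kind of cleanup this suspicion points at, here is a minimal sketch of a start routine that stops whatever it already launched when a later step fails. It is not Kudu's mini-cluster code: FakeDaemon, start_cluster, and the bad_flag switch are hypothetical stand-ins that only model the control flow.

```python
class FakeDaemon:
    """Hypothetical stand-in for a helper process such as chronyd."""

    def __init__(self, name):
        self.name = name
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False


def start_cluster(daemons, bad_flag=False):
    """Start daemons in order; on any failure, stop the ones already started."""
    started = []
    try:
        for daemon in daemons:
            daemon.start()
            started.append(daemon)
        if bad_flag:
            # Models a master or tserver rejecting an unknown flag after
            # the NTP helper is already up.
            raise RuntimeError("RUNTIME_ERROR: unknown flag")
    except Exception:
        # Undo partial startup in reverse order, then re-raise so the
        # caller still sees the original failure.
        for daemon in reversed(started):
            daemon.stop()
        raise


ntp = FakeDaemon("chronyd")
try:
    start_cluster([ntp], bad_flag=True)
except RuntimeError:
    pass

print(ntp.running)
```

If the real failure path skips the equivalent of the except branch, the helper process would outlive the failed start, which would match the "Another chronyd may already be running" message seen on the second attempt.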
*Consequences:*
If developers writing Python tests get a flag name or value wrong more than
once in the code, they get "Could not open connection to daemon" errors, which
are not very helpful at first.
For properly written test code, however, this bug has no negative effect.
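Until this is fixed, one way to confirm the diagnosis when the error shows up is to check whether the pid file named in the chronyd message still points at a live process. The helper below is a sketch for POSIX systems only; pid_file_is_live is a hypothetical name, and the path has to be taken from the actual error output.

```python
import os


def pid_file_is_live(pid_path):
    """Return True if pid_path names a process that still exists (POSIX only)."""
    try:
        with open(pid_path) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False  # missing, unreadable, or malformed pid file
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence; nothing is delivered
    except ProcessLookupError:
        return False  # stale pid file: no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True
```

For example, pid_file_is_live("/tmp/kudutest-0/minicluster-data/chrony.0/chronyd.pid") returning True between two test classes would suggest a chronyd left over from the failed start.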
Summary: Failed mini-cluster creation leaves chronyd open in Python
test infra (was: Failed mini-cluster creation leaves chronyd running in Python
test infra)
> Failed mini-cluster creation leaves chronyd open in Python test infra
> ---------------------------------------------------------------------
>
> Key: KUDU-3464
> URL: https://issues.apache.org/jira/browse/KUDU-3464
> Project: Kudu
> Issue Type: Bug
> Reporter: Marton Greber
> Priority: Minor
> Labels: client
>