[
https://issues.apache.org/jira/browse/CASSANDRA-14047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vincent White updated CASSANDRA-14047:
--------------------------------------
Attachment: trunk-patch_passes_test-debug.log
3_11-debug.log
trunk-debug.log
trunk-debug.log-2
On 3.11 I still see the {{UnknownColumnFamilyException}}, which does cause
the test to fail because it triggers the "Unexpected error in log" assertion
error when tearing down the test. Strangely, the test passes and doesn't hit
this on trunk with my patch even though the logs still contain the
{{UnknownColumnFamilyException}} (not sure if that's related to C*
version-specific config in dtests or something).
So the netty issue is unrelated to the flakiness of this test; I'm not sure
whether it should have its own ticket. I've attached a few sets of debug logs
that demonstrate the various behaviours with/without netty and with/without my
patch from the previous comment.
As for the test itself, it appears that the reads triggering the
{{UnknownColumnFamilyException}} actually come from the initialisation of
CassandraRoleManager, since they are for {{system_auth.roles}} (I believe
{{hasExistingRoles()}} in {{setupDefaultRole()}}). I'm not exactly sure what
the best way to resolve this is. The error isn't an issue for the role manager
itself, as it will simply retry later, and it doesn't affect the tests apart
from triggering the "Unexpected error in log" assertion. For the tests I guess
we could leave a gap between starting nodes, but it's probably more correct to
just ignore these errors. I've tested that
[https://github.com/vincewhite/cassandra-dtest/commit/7e48704713123a253a914802975f7163474ede9b]
resolves the failures, and I assume it's probably safe to ignore this error
for all of the tests in consistency_test, but I haven't looked into that at
this stage.
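
To illustrate the "ignore these errors" option, here is a minimal,
self-contained sketch of filtering error lines against a list of ignore
patterns before asserting on the log. It is not the dtest API and not the
contents of the commit linked above: {{IGNORE_LOG_PATTERNS}},
{{unexpected_errors()}} and the sample log lines are hypothetical names and
synthetic data used only for illustration.

```python
import re

# Hypothetical ignore list; the exact pattern used in the linked commit
# is not reproduced here.
IGNORE_LOG_PATTERNS = [r"UnknownColumnFamilyException"]


def unexpected_errors(log_lines, ignore_patterns=IGNORE_LOG_PATTERNS):
    """Return the ERROR lines that match none of the ignore patterns."""
    compiled = [re.compile(p) for p in ignore_patterns]
    return [
        line
        for line in log_lines
        if "ERROR" in line and not any(rx.search(line) for rx in compiled)
    ]


# Synthetic lines, not copied from any real system.log.
sample_log = [
    "INFO  [main] synthetic: node starting",
    "ERROR [worker] synthetic: UnknownColumnFamilyException during read",
    "ERROR [main] synthetic: some other failure",
]

remaining = unexpected_errors(sample_log)
```

With a filter like this, the teardown check would only fail on the last
sample line, since the role manager's retried read is treated as expected
noise.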
Also, these tests don't do anything fancy in regard to how they start the
cluster; they just use the normal {{cluster.start(wait_for_binary_proto=True,
wait_other_notice=True)}} call, so I guess this could cause random failures in
a lot of tests.
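
For comparison, the "gap between starting nodes" alternative could look
roughly like the sketch below. It assumes each node object exposes a
{{start(wait_for_binary_proto=...)}} method, as ccm nodes do;
{{start_with_gap()}} and the default gap length are hypothetical and have not
been tuned against this failure.

```python
import time


def start_with_gap(nodes, gap_seconds=5.0, sleep=time.sleep):
    """Start nodes one at a time, pausing between consecutive starts.

    `sleep` is injectable so the pause can be stubbed out in tests.
    """
    for index, node in enumerate(nodes):
        if index > 0:
            # Give the previously started node time to settle first.
            sleep(gap_seconds)
        node.start(wait_for_binary_proto=True)
```

In a dtest this would presumably be called as
{{start_with_gap(cluster.nodelist())}} in place of the single
{{cluster.start(...)}} call. It makes every test slower, which is one reason
ignoring the error may be preferable.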
> test_simple_strategy_each_quorum_users - consistency_test.TestAccuracy fails:
> Missing: ['127.0.0.3.* now UP']:
> --------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-14047
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14047
> Project: Cassandra
> Issue Type: Bug
> Components: Testing
> Reporter: Michael Kjellman
> Assignee: Vincent White
> Attachments: 3_11-debug.log, trunk-debug.log, trunk-debug.log-2,
> trunk-patch_passes_test-debug.log
>
>
> test_simple_strategy_each_quorum_users - consistency_test.TestAccuracy fails:
> Missing: ['127.0.0.3.* now UP']:
> 15 Nov 2017 11:23:37 [node1] Missing: ['127.0.0.3.* now UP']:
> INFO [main] 2017-11-15 11:21:32,452 YamlConfigura.....
> See system.log for remainder
> -------------------- >> begin captured logging << --------------------
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-v3VgyS
> dtest: DEBUG: Done setting configuration options:
> { 'initial_token': None,
> 'num_tokens': '32',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 10000,
> 'read_request_timeout_in_ms': 10000,
> 'request_timeout_in_ms': 10000,
> 'truncate_request_timeout_in_ms': 10000,
> 'write_request_timeout_in_ms': 10000}
> dtest: DEBUG: Testing single dc, users, each quorum reads
> --------------------- >> end captured logging << ---------------------
> File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
> File "/home/cassandra/cassandra-dtest/tools/decorators.py", line 48, in
> wrapped
> f(obj)
> File "/home/cassandra/cassandra-dtest/consistency_test.py", line 621, in
> test_simple_strategy_each_quorum_users
>
> self._run_test_function_in_parallel(TestAccuracy.Validation.validate_users,
> [self.nodes], [self.rf], combinations)
> File "/home/cassandra/cassandra-dtest/consistency_test.py", line 535, in
> _run_test_function_in_parallel
> self._start_cluster(save_sessions=True,
> requires_local_reads=requires_local_reads)
> File "/home/cassandra/cassandra-dtest/consistency_test.py", line 141, in
> _start_cluster
> cluster.start(wait_for_binary_proto=True, wait_other_notice=True)
> File
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/cluster.py",
> line 428, in start
> node.watch_log_for_alive(other_node, from_mark=mark)
> File
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line
> 520, in watch_log_for_alive
> self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout,
> filename=filename)
> File
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line
> 488, in watch_log_for
> raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + "
> [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" +
> reads[:50] + ".....\nSee {} for remainder".format(filename))
> "15 Nov 2017 11:23:37 [node1] Missing: ['127.0.0.3.* now UP']:\nINFO [main]
> 2017-11-15 11:21:32,452 YamlConfigura.....\nSee system.log for
> remainder\n-------------------- >> begin captured logging <<
> --------------------\ndtest: DEBUG: cluster ccm directory:
> /tmp/dtest-v3VgyS\ndtest: DEBUG: Done setting configuration options:\n{
> 'initial_token': None,\n 'num_tokens': '32',\n 'phi_convict_threshold':
> 5,\n 'range_request_timeout_in_ms': 10000,\n
> 'read_request_timeout_in_ms': 10000,\n 'request_timeout_in_ms': 10000,\n
> 'truncate_request_timeout_in_ms': 10000,\n 'write_request_timeout_in_ms':
> 10000}\ndtest: DEBUG: Testing single dc, users, each quorum
> reads\n--------------------- >> end captured logging << ---------------------"