[ https://issues.apache.org/jira/browse/CASSANDRA-11608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250495#comment-15250495 ]
Philip Thompson edited comment on CASSANDRA-11608 at 4/20/16 6:50 PM: ---------------------------------------------------------------------- Looking at the logs, it seems node4 was added before node3 was fully recognized as down. I tried setting {{wait-other-notice=True}} on the {{node3.stop()}} call, but that didn't fix the issue at all. I haven't checked the logs for those runs yet. http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/74/testReport/ http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/75/testReport/ was (Author: philipthompson): Looking at the logs, it seems node4 was added before node3 was fully recognized as down. I tried setting {{wait-other-notice=True}} on the {{node3.stop()}} call, but that didn't fix the issue at all. I haven't checked the logs for those runs yet. > dtest failure in > replace_address_test.TestReplaceAddress.replace_first_boot_test > -------------------------------------------------------------------------------- > > Key: CASSANDRA-11608 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11608 > Project: Cassandra > Issue Type: Test > Reporter: Jim Witschey > Assignee: Philip Thompson > Labels: dtest > > This looks like a timeout kind of flap. It's flapped once. Example failure: > http://cassci.datastax.com/job/cassandra-2.2_offheap_dtest/344/testReport/replace_address_test/TestReplaceAddress/replace_first_boot_test > Failed on CassCI build cassandra-2.2_offheap_dtest #344 - 2.2.6-tentative > {code} > Error Message > 15 Apr 2016 16:23:41 [node3] Missing: ['127.0.0.4.* now UP']: > INFO [main] 2016-04-15 16:21:32,345 Config.java:4..... > See system.log for remainder > -------------------- >> begin captured logging << -------------------- > dtest: DEBUG: cluster ccm directory: /mnt/tmp/dtest-4i5qkE > dtest: DEBUG: Custom init_config not found. Setting defaults. > dtest: DEBUG: Done setting configuration options: > { 'initial_token': None, > 'memtable_allocation_type': 'offheap_objects', > 'num_tokens': '32', > 'phi_convict_threshold': 5, > 'start_rpc': 'true'} > dtest: DEBUG: Starting cluster with 3 nodes. > dtest: DEBUG: 32 > dtest: DEBUG: Inserting Data... > dtest: DEBUG: Stopping node 3. > dtest: DEBUG: Testing node stoppage (query should fail). > dtest: DEBUG: Retrying read after timeout. Attempt #0 > dtest: DEBUG: Retrying read after timeout. Attempt #1 > dtest: DEBUG: Retrying request after UE. Attempt #2 > dtest: DEBUG: Retrying request after UE. Attempt #3 > dtest: DEBUG: Retrying request after UE. Attempt #4 > dtest: DEBUG: Starting node 4 to replace node 3 > dtest: DEBUG: Verifying querying works again. > dtest: DEBUG: Verifying tokens migrated sucessfully > dtest: DEBUG: ('WARN [main] 2016-04-15 16:21:21,068 TokenMetadata.java:196 - > Token -3855903180169109916 changing ownership from /127.0.0.3 to > /127.0.0.4\n', <_sre.SRE_Match object at 0x7fd21c0e2370>) > dtest: DEBUG: Try to restart node 3 (should fail) > dtest: DEBUG: [('WARN [GossipStage:1] 2016-04-15 16:21:22,942 > StorageService.java:1962 - Host ID collision for > 75916cc0-86ec-4136-b336-862a49953616 between /127.0.0.3 and /127.0.0.4; > /127.0.0.4 is the new owner\n', <_sre.SRE_Match object at 0x7fd1f83555e0>)] > --------------------- >> end captured logging << --------------------- > Stacktrace > File "/usr/lib/python2.7/unittest/case.py", line 329, in run > testMethod() > File "/home/automaton/cassandra-dtest/replace_address_test.py", line 212, > in replace_first_boot_test > node4.start(wait_for_binary_proto=True) > File "/home/automaton/ccm/ccmlib/node.py", line 610, in start > node.watch_log_for_alive(self, from_mark=mark) > File "/home/automaton/ccm/ccmlib/node.py", line 457, in watch_log_for_alive > self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, > filename=filename) > File "/home/automaton/ccm/ccmlib/node.py", line 425, in watch_log_for > raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " > [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + > reads[:50] + ".....\nSee {} for remainder".format(filename)) > "15 Apr 2016 16:23:41 [node3] Missing: ['127.0.0.4.* now UP']:\nINFO [main] > 2016-04-15 16:21:32,345 Config.java:4.....\nSee system.log for > remainder\n-------------------- >> begin captured logging << > --------------------\ndtest: DEBUG: cluster ccm directory: > /mnt/tmp/dtest-4i5qkE\ndtest: DEBUG: Custom init_config not found. Setting > defaults.\ndtest: DEBUG: Done setting configuration options:\n{ > 'initial_token': None,\n 'memtable_allocation_type': 'offheap_objects',\n > 'num_tokens': '32',\n 'phi_convict_threshold': 5,\n 'start_rpc': > 'true'}\ndtest: DEBUG: Starting cluster with 3 nodes.\ndtest: DEBUG: > 32\ndtest: DEBUG: Inserting Data...\ndtest: DEBUG: Stopping node 3.\ndtest: > DEBUG: Testing node stoppage (query should fail).\ndtest: DEBUG: Retrying > read after timeout. Attempt #0\ndtest: DEBUG: Retrying read after timeout. > Attempt #1\ndtest: DEBUG: Retrying request after UE. Attempt #2\ndtest: > DEBUG: Retrying request after UE. Attempt #3\ndtest: DEBUG: Retrying request > after UE. Attempt #4\ndtest: DEBUG: Starting node 4 to replace node 3\ndtest: > DEBUG: Verifying querying works again.\ndtest: DEBUG: Verifying tokens > migrated sucessfully\ndtest: DEBUG: ('WARN [main] 2016-04-15 16:21:21,068 > TokenMetadata.java:196 - Token -3855903180169109916 changing ownership from > /127.0.0.3 to /127.0.0.4\\n', <_sre.SRE_Match object at > 0x7fd21c0e2370>)\ndtest: DEBUG: Try to restart node 3 (should fail)\ndtest: > DEBUG: [('WARN [GossipStage:1] 2016-04-15 16:21:22,942 > StorageService.java:1962 - Host ID collision for > 75916cc0-86ec-4136-b336-862a49953616 between /127.0.0.3 and /127.0.0.4; > /127.0.0.4 is the new owner\\n', <_sre.SRE_Match object at > 0x7fd1f83555e0>)]\n--------------------- >> end captured logging << > ---------------------" > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)