[
https://issues.apache.org/jira/browse/CASSANDRA-9418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561704#comment-14561704
]
Joshua McKenzie commented on CASSANDRA-9418:
--------------------------------------------
A simple ccm PR ([link|https://github.com/pcmanus/ccm/pull/289]) and a dtest PR
([link|https://github.com/riptano/cassandra-dtest/pull/299]) have knocked off
226 more failures and gotten us down to 338 failures from 564 ([test run
here|http://cassci.datastax.com/view/trunk/job/trunk_dtest_win32/271/]).
The logs show a very high count of "Found running cassandra process with
pid: 15776. Killing." messages, which come from the dtest change to kill
running cassandra processes during Tester.setUp(). The matching ccm PR was
intended to give us something to correlate with those failures, so we could
find out whether specific tests were hanging and address them. However, this
first test run gave us over 100 instances of hung tests that had to be killed.
Hopefully there's a systemic infrastructural issue behind them that we can
address to clear up those errors.
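To get a quick count of those kills out of a saved console log, something like the following could work. It is only a sketch: the regex is built from the message text quoted above, and the function name find_killed_pids is made up for illustration, not part of dtest or ccm.

```python
import re

# Pattern built from the log message quoted in this comment; the pid varies.
KILL_RE = re.compile(r"Found running cassandra process with pid: (\d+)\. Killing\.")


def find_killed_pids(lines):
    """Return the pids from every 'Found running cassandra process' line, in order."""
    pids = []
    for line in lines:
        match = KILL_RE.search(line)
        if match:
            pids.append(int(match.group(1)))
    return pids
```

The length of the returned list is the number of processes that had to be killed in that run. Passing an open file object works as well as a list, since the function only iterates over lines.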
There's still a high number of errors indicating missing system.log, conf\*, or
bin\* files; there's some environmental silliness occurring that I haven't
gotten any clarity on yet, as it's CI-specific. The PR for ccm didn't actually
print to stderr, so once I figure out why CI / nosetests is absorbing that
output, I'll probably also put in some more debug information from ccm
regarding starting and stopping clusters and point CI to that debug branch to
get more information out of it.
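As a rough idea of what that extra start/stop debug output could look like, here is a minimal Python 3 sketch. The names debug and traced and the "[ccm-debug ...]" line format are hypothetical and do not exist in ccm. It only shows writing to stderr with an explicit flush, and it does not by itself explain why CI / nosetests swallows the output.

```python
import os
import sys
import time
from contextlib import contextmanager


def debug(message):
    """Write one timestamped line to stderr and flush it immediately."""
    # sys.stderr is looked up at call time, so a redirected stream is honored.
    stamp = time.strftime("%H:%M:%S")
    print("[ccm-debug pid=%d %s] %s" % (os.getpid(), stamp, message),
          file=sys.stderr, flush=True)


@contextmanager
def traced(action, cluster_name):
    """Log when an action on a cluster begins and ends, even if it raises."""
    debug("begin %s cluster=%s" % (action, cluster_name))
    started = time.time()
    try:
        yield
    finally:
        elapsed = time.time() - started
        debug("end %s cluster=%s after %.1fs" % (action, cluster_name, elapsed))
```

Wrapping a call as `with traced("start", "test"): cluster.start()` would leave a begin line with no matching end line whenever the process is killed mid-start, which is the kind of trace that would help pin down which tests hang.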
> Fix dtests on Windows
> ---------------------
>
> Key: CASSANDRA-9418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9418
> Project: Cassandra
> Issue Type: Bug
> Reporter: Joshua McKenzie
> Assignee: Joshua McKenzie
> Labels: Windows
> Fix For: 2.2.x
>
>
> There's a variety of infrastructural failures within dtest with regard to
> Windows that are causing tests to fail and those failures to cascade.
> Error: failure to delete commit log after a test / ccm cluster is stopped:
> {noformat}
> Traceback (most recent call last):
> File "C:\src\cassandra-dtest\dtest.py", line 452, in tearDown
> self._cleanup_cluster()
> File "C:\src\cassandra-dtest\dtest.py", line 172, in _cleanup_cluster
> self.cluster.remove()
> File "build\bdist.win-amd64\egg\ccmlib\cluster.py", line 212, in remove
> shutil.rmtree(self.get_path())
> File "C:\Python27\lib\shutil.py", line 247, in rmtree
> rmtree(fullname, ignore_errors, onerror)
> File "C:\Python27\lib\shutil.py", line 247, in rmtree
> rmtree(fullname, ignore_errors, onerror)
> File "C:\Python27\lib\shutil.py", line 252, in rmtree
> onerror(os.remove, fullname, sys.exc_info())
> File "C:\Python27\lib\shutil.py", line 250, in rmtree
> os.remove(fullname)
> WindowsError: [Error 5] Access is denied: 'c:\\temp\\dtest-4rxq2i\\test\\node1\\commitlogs\\CommitLog-5-1431969131917.log'
> {noformat}
> Cascading error: implication is that tests aren't shutting down correctly and
> subsequent tests cannot start:
> {noformat}
> 06:00:20 ERROR: test_incr_decr_super_remove (thrift_tests.TestMutations)
> 06:00:20 ----------------------------------------------------------------------
> 06:00:20 Traceback (most recent call last):
> 06:00:20 File "D:\jenkins\workspace\trunk_dtest_win32\cassandra-dtest\thrift_tests.py", line 55, in setUp
> 06:00:20 cluster.start()
> 06:00:20 File "build\bdist.win-amd64\egg\ccmlib\cluster.py", line 249, in start
> 06:00:20 p = node.start(update_pid=False, jvm_args=jvm_args, profile_options=profile_options)
> 06:00:20 File "build\bdist.win-amd64\egg\ccmlib\node.py", line 457, in start
> 06:00:20 common.check_socket_available(itf)
> 06:00:20 File "build\bdist.win-amd64\egg\ccmlib\common.py", line 341, in check_socket_available
> 06:00:20 raise UnavailableSocketError("Inet address %s:%s is not available: %s" % (addr, port, msg))
> 06:00:20 UnavailableSocketError: Inet address 127.0.0.1:9042 is not available: [Errno 10013] An attempt was made to access a socket in a way forbidden by its access permissions
> 06:00:20 -------------------- >> begin captured logging << --------------------
> 06:00:20 dtest: DEBUG: removing ccm cluster test at: d:\temp\dtest-a5iny5
> 06:00:20 dtest: DEBUG: cluster ccm directory: d:\temp\dtest-dalzcy
> 06:00:20 --------------------- >> end captured logging << ---------------------
> {noformat}
> I've also seen (and am debugging) an error where a node just fails to start
> via ccm.
> I'll update this ticket with PRs to dtest or other observations of interest.
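For the commit log deletion failure quoted in the description, one possible mitigation is to retry the removal a few times before giving up. The sketch below is written for Python 3 and is not what ccm's cluster.remove() does; the function name remove_tree_with_retry and its attempts and delay parameters are made up for illustration. A retry only helps if the file lock is transient. If a Cassandra process is still running and holding the file, that process has to be stopped first.

```python
import shutil
import time


def remove_tree_with_retry(path, attempts=5, delay=1.0):
    """Try shutil.rmtree up to `attempts` times, sleeping `delay` seconds between tries.

    Returns True once the tree is gone and False if every attempt failed.
    """
    for attempt in range(attempts):
        try:
            shutil.rmtree(path)
            return True
        except FileNotFoundError:
            # Nothing left to delete (for example, an earlier attempt finished the job).
            return True
        except OSError:
            # Covers "Access is denied" on Windows (PermissionError is an OSError).
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

A teardown could call `remove_tree_with_retry(cluster_path)` and log a warning when it returns False, rather than letting the exception fail the test. Here `cluster_path` stands in for whatever directory the test created.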