[ https://issues.apache.org/jira/browse/CASSANDRA-15686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17078840#comment-17078840 ]
David Capwell commented on CASSANDRA-15686: ------------------------------------------- [~mck] [~newkek] doesn't look like concurrent runners are safe =( {code} org.apache.cassandra.exceptions.ConfigurationException: 127.0.0.1:7014 is in use by another process. Change listen_address:storage_port in cassandra.yaml to values that do not conflict with other services at org.apache.cassandra.net.InboundConnectionInitiator.bind(InboundConnectionInitiator.java:159) at org.apache.cassandra.net.InboundConnectionInitiator.bind(InboundConnectionInitiator.java:181) at org.apache.cassandra.net.InboundSockets$InboundSocket.open(InboundSockets.java:95) at org.apache.cassandra.net.InboundSockets$InboundSocket.open(InboundSockets.java:82) at org.apache.cassandra.net.InboundSockets.open(InboundSockets.java:209) at org.apache.cassandra.net.ConnectionTest.lambda$doTest$8(ConnectionTest.java:241) at org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:262) at org.apache.cassandra.net.ConnectionTest.doTest(ConnectionTest.java:240) at org.apache.cassandra.net.ConnectionTest.test(ConnectionTest.java:229) at org.apache.cassandra.net.ConnectionTest.testCRCCorruption(ConnectionTest.java:725) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) {code} I ran with 8 cores and 16gb memory (matches HIGHER in circle ci) and saw more failures than normal (normally ~1-3 failures, had 9; most known flaky tests). At least for me, running with 8 runners didn't seem to improve job latency; felt the same (though only ran once so could have been a outlier). > Improvements in circle CI default config > ---------------------------------------- > > Key: CASSANDRA-15686 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15686 > Project: Cassandra > Issue Type: Bug > Components: Build > Reporter: Kevin Gallardo > Priority: Normal > > I have been looking at and played around with the [default CircleCI > config|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml], > a few comments/questions regarding the following topics: > * Python dtests do not run successfully (200-300 failures) on {{medium}} > instances, they seem to only run with small flaky failures on {{large}} > instances or higher > * Python Upgrade tests: > ** Do not seem to run without many failures on any instance types / any > parallelism setting > ** Do not seem to parallelize well, it seems each container is going to > download multiple C* versions > ** Additionally it seems the configuration is not up to date, as currently > we get errors because {{JAVA8_HOME}} is not set > * Unit tests do not seem to parallelize optimally, number of test runners do > not reflect the available CPUs on the container. Ideally if # of runners == # > of CPUs, build time is improved, on any type of instances. > ** For instance when using the current configuration, running on medium > instances, build will use 1 junit test runner, but 2 CPUs are available. If > using 2 runners, the build time is reduced from 19min (at the current main > config of parallelism=4) to 12min. > * There are some typos in the file, some dtests say "Run Unit Tests" but > they are JVM dtests (see > [here|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml#L1077], > > [here|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml#L1386]) > So some ways to process these would be: > * Do the Python dtests run successfully for anyone on {{medium}} instances? > If not, would it make sense to bump them to {{large}} so that they can be run > successfully? > * Does anybody ever run the python upgrade tests on CircleCI and what is the > configuration that makes it work? > * Would it make sense to either hardcode the number of test runners in the > unit tests with `-Dtest.runners` in the config file to reflect the number of > CPUs on the instances, or change the build so that it is able to detect the > appropriate number of core available automatically? > Additionally, it seems this default config file (config.yml) is not as well > maintained as the > [{{config-2_1.yml}}|https://github.com/apache/cassandra/blob/trunk/.circleci/config-2_1.yml] > (+its lowres/highres) version in the same folder (from CASSANDRA-14806). > What is the reasoning for maintaining these 2 versions of the build? Could > the better maintained version be used as the default? We could generate a > lowres version of the new config-2_1.yml, and rename it {{config.yml}} so > that it gets picked up by CircleCI automatically instead of the current > default. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org