Andres de la Peña created CASSANDRA-17685:
---------------------------------------------

             Summary: Test failure: 
transient_replication_test.py::TestTransientReplicationRepairStreamEntireSSTable::test_optimized_primary_range_repair
                 Key: CASSANDRA-17685
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17685
             Project: Cassandra
          Issue Type: Bug
          Components: Test/dtest/python
            Reporter: Andres de la Peña


The Python dtest 
{{transient_replication_test.py::TestTransientReplicationRepairStreamEntireSSTable::test_optimized_primary_range_repair}}
 is flaky at least in {{{}cassandra-4.1{}}}, with a flakiness < 1%.

I haven't seen the failure on Jenkins but on [this CircleCI 
run|https://app.circleci.com/pipelines/github/adelapena/cassandra/1663/workflows/c63703e3-8c7a-42c6-981a-53cb59babe1f/jobs/17476]
 for CASSANDRA-17458.

The failure can also be [reproduced in the 
multiplexer|https://app.circleci.com/pipelines/github/adelapena/cassandra/1666/workflows/6f925be1-c0df-4b2a-83e0-4612a46f32bd/jobs/17516],
 with 5 failures in 5000 iterations:
{code:java}
self = 
<transient_replication_test.TestTransientReplicationRepairStreamEntireSSTable 
object at 0x7f87951c77b8>

    @pytest.mark.no_vnodes
    def test_optimized_primary_range_repair(self):
        """ optimized primary range incremental repair from full replica should 
remove data on node3 """
        self._test_speculative_write_repair_cycle(primary_range=True,
                                                  optimized_repair=True,
                                                  repair_coordinator=self.node1,
>                                                 expect_node3_data=False)

transient_replication_test.py:523: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
transient_replication_test.py:473: in _test_speculative_write_repair_cycle
    with tm(self.node1) as tm1, tm(self.node2) as tm2, tm(self.node3) as tm3:
transient_replication_test.py:62: in __enter__
    self.start()
transient_replication_test.py:55: in start
    self.jmx.start()
tools/jmxutils.py:187: in start
    subprocess.check_output(args, stderr=subprocess.STDOUT)
/usr/lib/python3.6/subprocess.py:356: in check_output
    **kwargs).stdout
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

input = None, timeout = None, check = True
popenargs = (('/usr/lib/jvm/java-8-openjdk-amd64/bin/java', '-cp', 
'/usr/lib/jvm/java-8-openjdk-amd64/lib/tools.jar:/home/cassandr...t/tools/../lib/jolokia-jvm-1.6.2-agent.jar',
 'org.jolokia.jvmagent.client.AgentLauncher', '--host', '127.0.0.1', ...),)
kwargs = {'stderr': -2, 'stdout': -1}
process = <subprocess.Popen object at 0x7f877e977208>
stdout = b"Couldn't start agent for PID 11637\nPossible reason could be that 
port '8778' is already occupied.\nPlease check the standard output of the 
target process for a detailed error message.\n"
stderr = None, retcode = 1

    def run(*popenargs, input=None, timeout=None, check=False, **kwargs):
        """Run command with arguments and return a CompletedProcess instance.
    
        The returned instance will have attributes args, returncode, stdout and
        stderr. By default, stdout and stderr are not captured, and those 
attributes
        will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture 
them.
    
        If check is True and the exit code was non-zero, it raises a
        CalledProcessError. The CalledProcessError object will have the return 
code
        in the returncode attribute, and output & stderr attributes if those 
streams
        were captured.
    
        If timeout is given, and the process takes too long, a TimeoutExpired
        exception will be raised.
    
        There is an optional argument "input", allowing you to
        pass a string to the subprocess's stdin.  If you use this argument
        you may not also use the Popen constructor's "stdin" argument, as
        it will be used internally.
    
        The other arguments are the same as for the Popen constructor.
    
        If universal_newlines=True is passed, the "input" argument must be a
        string and stdout/stderr in the returned object will be strings rather 
than
        bytes.
        """
        if input is not None:
            if 'stdin' in kwargs:
                raise ValueError('stdin and input arguments may not both be 
used.')
            kwargs['stdin'] = PIPE
    
        with Popen(*popenargs, **kwargs) as process:
            try:
                stdout, stderr = process.communicate(input, timeout=timeout)
            except TimeoutExpired:
                process.kill()
                stdout, stderr = process.communicate()
                raise TimeoutExpired(process.args, timeout, output=stdout,
                                     stderr=stderr)
            except:
                process.kill()
                process.wait()
                raise
            retcode = process.poll()
            if check and retcode:
                raise CalledProcessError(retcode, process.args,
>                                        output=stdout, stderr=stderr)
E               subprocess.CalledProcessError: Command 
'('/usr/lib/jvm/java-8-openjdk-amd64/bin/java', '-cp', 
'/usr/lib/jvm/java-8-openjdk-amd64/lib/tools.jar:/home/cassandra/cassandra-dtest/tools/../lib/jolokia-jvm-1.6.2-agent.jar',
 'org.jolokia.jvmagent.client.AgentLauncher', '--host', '127.0.0.1', 'start', 
'11637')' returned non-zero exit status 1.

/usr/lib/python3.6/subprocess.py:438: CalledProcessError
{code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to