[
https://issues.apache.org/jira/browse/CASSANDRA-19093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sam Tunnicliffe updated CASSANDRA-19093:
----------------------------------------
Test and Documentation Plan: Run updated tests
Status: Patch Available (was: In Progress)
[https://github.com/apache/cassandra-dtest/pull/247]
I don't believe this is directly related to TCM, as I can trivially repro it on
an M2 MacBook running against the 5.0 branch and using commit
[{{b355b84c}}|https://github.com/apache/cassandra-dtest/commit/b355b84c5f7b53d390822332215e3751df562559]
of cassandra-dtest (before any TCM changes landed).

What happens is that at startup the initial view building task is submitted to
the optional tasks executor from {{CassandraDaemon::setup}}, but before it is
executed the {{CREATE VIEW}} statement is received and the build of
{{ks.t_by_v}} is started from the migration stage. When the initial view task
then runs on optional tasks, it forces the existing build to stop and then
resume, which adds the {{Resuming view build for range...}} log messages and
causes the test assertion to fail. It's possible that changes to the timing of
startup and initialization from TCM make this more likely to fail now.

The patch removes the check that "{{Resuming view build...}}" is absent from
the logs before restarting the cluster. Instead, it marks each logfile and only
does the subsequent grep from those points. Confirmed that with this patch the
error no longer repros on 5.0.
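
To illustrate the mark-then-grep idea, here is a minimal, self-contained sketch. It is not the code from the PR: the helpers {{mark_log}} and {{grep_log}} below are hypothetical stand-ins written against a plain file, and the log lines are made up for the demo. It only shows why grepping from a mark ignores a "Resuming view build" message logged before the restart.

```python
import os
import re
import tempfile


def mark_log(path):
    """Return the current size of the log in bytes, to be used as a mark."""
    return os.path.getsize(path) if os.path.exists(path) else 0


def grep_log(path, pattern, from_mark=0):
    """Return the lines matching `pattern` that were written after `from_mark`."""
    regex = re.compile(pattern)
    with open(path, "rb") as f:
        f.seek(from_mark)  # skip everything logged before the mark
        text = f.read().decode("utf-8", errors="replace")
    return [line for line in text.splitlines() if regex.search(line)]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "debug.log")

        # Before the restart: the startup race may already have logged a resume.
        with open(log, "w") as f:
            f.write("DEBUG Starting new view build for range (a,b]\n")
            f.write("DEBUG Resuming view build for range (a,b]\n")

        # Mark the log just before the (simulated) cluster restart.
        mark = mark_log(log)

        # After the restart: the interrupted build is resumed, as expected.
        with open(log, "a") as f:
            f.write("DEBUG Resuming view build for range (a,b]\n")

        whole_log = grep_log(log, "Resuming view build")
        after_mark = grep_log(log, "Resuming view build", from_mark=mark)
        return len(whole_log), len(after_mark)


if __name__ == "__main__":
    print(demo())
```

Grepping the whole log finds both messages, while grepping from the mark finds only the one written after the restart, so the post-restart assertion no longer depends on whether the startup race happened earlier.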
> Test Failure:
> materialized_views_test.TestMaterializedViews.test_interrupt_build_process
> ----------------------------------------------------------------------------------------
>
> Key: CASSANDRA-19093
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19093
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest/python
> Reporter: Michael Semb Wever
> Assignee: Sam Tunnicliffe
> Priority: Normal
> Fix For: 5.1-alpha1
>
>
> Seen in j11_dtests in CASSANDRA-19034
> https://app.circleci.com/pipelines/github/michaelsembwever/cassandra/259/workflows/f343d3e3-00cf-4e13-bb4d-bbfff1d3658c/jobs/21100/tests
> {noformat}
> AssertionError: assert not [('DEBUG [ViewBuildExecutor:2] 2023-11-25
> 10:20:56,917 ViewBuilderTask.java:128 - Resuming view build for range
> (-3458...token -5761824694134994220 with 1 covered keys\n', <re.Match object;
> span=(79, 98), match='Resuming view build'>), ...]
> + where [('DEBUG [ViewBuildExecutor:2] 2023-11-25 10:20:56,917
> ViewBuilderTask.java:128 - Resuming view build for range (-3458...token
> -5761824694134994220 with 1 covered keys\n', <re.Match object; span=(79, 98),
> match='Resuming view build'>), ...] = <bound method Node.grep_log of
> <ccmlib.node.Node object at 0x7f09f960c390>>('Resuming view build',
> filename='debug.log')
> + where <bound method Node.grep_log of <ccmlib.node.Node object at
> 0x7f09f960c390>> = <ccmlib.node.Node object at 0x7f09f960c390>.grep_log
> self = <materialized_views_test.TestMaterializedViews object at
> 0x7f09fa5f0250>
>     def test_interrupt_build_process(self):
>         """Test that an interrupted MV build process is resumed as it should"""
>
>         options = {'hinted_handoff_enabled': False}
>         if self.cluster.version() >= '4':
>             options['concurrent_materialized_view_builders'] = 4
>
>         session = self.prepare(options=options, install_byteman=True)
>         node1, node2, node3 = self.cluster.nodelist()
>
>         logger.debug("Avoid premature MV build finalization with byteman")
>         for node in self.cluster.nodelist():
>             if self.cluster.version() >= '4':
>                 node.byteman_submit([mk_bman_path('4.0/skip_view_build_finalization.btm')])
>                 node.byteman_submit([mk_bman_path('4.0/skip_view_build_task_finalization.btm')])
>             else:
>                 node.byteman_submit([mk_bman_path('pre4.0/skip_finish_view_build_status.btm')])
>                 node.byteman_submit([mk_bman_path('pre4.0/skip_view_build_update_distributed.btm')])
>
>         session.execute("CREATE TABLE t (id int PRIMARY KEY, v int, v2 text, v3 decimal)")
>
>         logger.debug("Inserting initial data")
>         for i in range(10000):
>             session.execute(
>                 "INSERT INTO t (id, v, v2, v3) VALUES ({v}, {v}, 'a', 3.0) IF NOT EXISTS".format(v=i)
>             )
>
>         logger.debug("Create a MV")
>         session.execute(("CREATE MATERIALIZED VIEW t_by_v AS SELECT * FROM t "
>                          "WHERE v IS NOT NULL AND id IS NOT NULL PRIMARY KEY (v, id)"))
>
>         logger.debug("Wait and ensure the MV build has started. Waiting up to 2 minutes.")
>         self._wait_for_view_build_start(session, 'ks', 't_by_v', wait_minutes=2)
>
>         logger.debug("Stop the cluster. Interrupt the MV build process.")
>         self.cluster.stop()
>
>         logger.debug("Checking logs to verify that the view build tasks have been created")
>         for node in self.cluster.nodelist():
>             assert node.grep_log('Starting new view build', filename='debug.log')
> >           assert not node.grep_log('Resuming view build', filename='debug.log')
> E AssertionError: assert not [('DEBUG [ViewBuildExecutor:2]
> 2023-11-25 10:20:56,917 ViewBuilderTask.java:128 - Resuming view build for
> range (-3458...token -5761824694134994220 with 1 covered keys\n', <re.Match
> object; span=(79, 98), match='Resuming view build'>), ...]
> E + where [('DEBUG [ViewBuildExecutor:2] 2023-11-25 10:20:56,917
> ViewBuilderTask.java:128 - Resuming view build for range (-3458...token
> -5761824694134994220 with 1 covered keys\n', <re.Match object; span=(79, 98),
> match='Resuming view build'>), ...] = <bound method Node.grep_log of
> <ccmlib.node.Node object at 0x7f09f960c390>>('Resuming view build',
> filename='debug.log')
> E + where <bound method Node.grep_log of <ccmlib.node.Node
> object at 0x7f09f960c390>> = <ccmlib.node.Node object at
> 0x7f09f960c390>.grep_log
> materialized_views_test.py:1129: AssertionError
> {noformat}