[ https://issues.apache.org/jira/browse/CASSANDRA-17244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545624#comment-17545624 ]
David Capwell commented on CASSANDRA-17244:
-------------------------------------------
OK, here is what I am seeing:
{code}
java.lang.NullPointerException
    at com.google.common.collect.Iterables.getOnlyElement(Iterables.java:254)
    at org.apache.cassandra.distributed.test.thresholds.TombstoneCountWarningTest.failThreshold(TombstoneCountWarningTest.java:287)
    at org.apache.cassandra.distributed.test.thresholds.TombstoneCountWarningTest.failThresholdSinglePartition(TombstoneCountWarningTest.java:199)
{code}
which maps to this code
{code}
enable(false);
warnings = CLUSTER.get(1).callsOnInstance(() -> {
    ClientWarn.instance.captureWarnings();
    try
    {
        QueryProcessor.execute(cql, org.apache.cassandra.db.ConsistencyLevel.ALL, QueryState.forInternalCalls());
        Assert.fail("Expected query failure");
    }
    catch (ReadFailureException e)
    {
        Assertions.assertThat(e).isNotInstanceOf(TombstoneAbortException.class);
    }
    return ClientWarn.instance.getWarnings();
}).call();
// client warnings are currently coordinator only, so if present only 1 is expected
if (isScan)
{
    // Scans perform multiple ReadCommands, which will not propagate the warnings to the top-level coordinator; so no warnings are expected
    Assertions.assertThat(warnings).isNull();
}
else
{
    Assertions.assertThat(Iterables.getOnlyElement(warnings))
              .startsWith("Read " + TOMBSTONE_FAIL + " live rows and " + (TOMBSTONE_FAIL + 1) + " tombstone cells for query " + cql);
}
{code}
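For reference, the NPE itself is just Guava behaviour: Iterables.getOnlyElement(iterable) calls iterable.iterator() straight away, so a null warnings list from getWarnings() blows up before any useful assertion message is produced. Minimal sketch outside the test harness:
{code}
import com.google.common.collect.Iterables;

import java.util.List;

public class GetOnlyElementNpeDemo
{
    public static void main(String[] args)
    {
        // Stand-in for ClientWarn.instance.getWarnings() returning null when nothing was captured.
        List<String> warnings = null;
        // Guava immediately calls warnings.iterator(), so this throws NullPointerException,
        // matching the Iterables.getOnlyElement frame in the stack trace above.
        Iterables.getOnlyElement(warnings);
    }
}
{code}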
So track warnings is no longer active, and we are validating that the old tombstone warning was present... which is done here, in org/apache/cassandra/db/ReadCommand.java:493 (onClose):
{code}
boolean warnTombstones = tombstones > warningThreshold && respectTombstoneThresholds;
if (warnTombstones)
{
    String msg = String.format("Read %d live rows and %d tombstone cells for query %1.512s; token %s (see tombstone_warn_threshold)",
                               liveRows, tombstones, ReadCommand.this.toCQLString(), currentKey.getToken());
    if (trackWarnings)
        MessageParams.add(ParamType.TOMBSTONE_WARNING, tombstones);
    else
        ClientWarn.instance.warn(msg);
    if (tombstones < failureThreshold)
    {
        metric.tombstoneWarnings.inc();
    }
    logger.warn(msg);
}
{code}
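To illustrate the "coordinator only" comment in the test, here is a stripped-down, purely hypothetical model of per-thread warning capture (not Cassandra's actual ClientWarn implementation): a warning recorded on a different thread, standing in for a replica's ReadStage, never lands in the list the capturing thread later reads back.
{code}
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in to show the capture-per-thread idea only.
public class ThreadLocalWarningsDemo
{
    private static final ThreadLocal<List<String>> CAPTURED = new ThreadLocal<>();

    static void captureWarnings() { CAPTURED.set(new ArrayList<>()); }

    static void warn(String msg)
    {
        List<String> warnings = CAPTURED.get();
        if (warnings != null)
            warnings.add(msg);
    }

    static List<String> getWarnings() { return CAPTURED.get(); }

    public static void main(String[] args) throws Exception
    {
        captureWarnings(); // the "coordinator" thread starts capturing

        // Stands in for a replica-side ReadStage thread emitting the tombstone warning.
        Thread readStage = new Thread(() -> warn("Read 100 live rows and 101 tombstone cells ..."));
        readStage.start();
        readStage.join();

        // The warning was recorded on another thread, so nothing shows up here.
        System.out.println("captured on capturing thread: " + getWarnings());
    }
}
{code}
Whether that is actually what happens here is exactly the open question below; the logs show node1 (the coordinator) did log the warning on its own ReadStage.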
Based off the logs I see the following:
{code}
WARN [ReadStage-1] node2 2022-06-02 19:41:29,697 ReadCommand.java:592 - Read 100 live rows and 101 tombstone cells for query SELECT * FROM distributed_test_keyspace.tbl WHERE pk = 1 ALLOW FILTERING; token -4069959284402364209 (see tombstone_warn_threshold)
WARN [ReadStage-1] node3 2022-06-02 19:41:29,701 ReadCommand.java:592 - Read 100 live rows and 101 tombstone cells for query SELECT * FROM distributed_test_keyspace.tbl WHERE pk = 1 ALLOW FILTERING; token -4069959284402364209 (see tombstone_warn_threshold)
WARN [ReadStage-1] node1 2022-06-02 19:41:29,701 ReadCommand.java:592 - Read 100 live rows and 101 tombstone cells for query SELECT * FROM distributed_test_keyspace.tbl WHERE pk = 1 ALLOW FILTERING; token -4069959284402364209 (see tombstone_warn_threshold)
{code}
so every node saw tombstone warnings, so the coordinator should have seen them too, and they should bubble up to the user... but didn't?
Full logs for this section of code
{code}
INFO [node1_isolatedExecutor:2] node1 2022-06-02 19:41:29,690 DatabaseDescriptor.java:3992 - updated read_thresholds_enabled to false
INFO [node2_isolatedExecutor:2] node2 2022-06-02 19:41:29,691 DatabaseDescriptor.java:3992 - updated read_thresholds_enabled to false
INFO [node3_isolatedExecutor:1] node3 2022-06-02 19:41:29,691 DatabaseDescriptor.java:3992 - updated read_thresholds_enabled to false
WARN 19:41:29 Read 100 live rows and 101 tombstone cells for query SELECT * FROM distributed_test_keyspace.tbl WHERE pk = 1 ALLOW FILTERING; token -4069959284402364209 (see tombstone_warn_threshold)
WARN [ReadStage-1] node2 2022-06-02 19:41:29,697 ReadCommand.java:592 - Read 100 live rows and 101 tombstone cells for query SELECT * FROM distributed_test_keyspace.tbl WHERE pk = 1 ALLOW FILTERING; token -4069959284402364209 (see tombstone_warn_threshold)
ERROR 19:41:29 Scanned over 101 tombstones during query 'SELECT * FROM distributed_test_keyspace.tbl WHERE pk = 1 ALLOW FILTERING' (last scanned row token was -4069959284402364209 and partion key was ((1), 100)); query aborted
ERROR [ReadStage-1] node2 2022-06-02 19:41:29,697 NoSpamLogger.java:111 - Scanned over 101 tombstones during query 'SELECT * FROM distributed_test_keyspace.tbl WHERE pk = 1 ALLOW FILTERING' (last scanned row token was -4069959284402364209 and partion key was ((1), 100)); query aborted
DEBUG [node1_isolatedExecutor:2] node1 2022-06-02 19:41:29,700 ReadCallback.java:146 - Failed; received 0 of 3 responses
WARN 19:41:29 Read 100 live rows and 101 tombstone cells for query SELECT * FROM distributed_test_keyspace.tbl WHERE pk = 1 ALLOW FILTERING; token -4069959284402364209 (see tombstone_warn_threshold)
WARN [ReadStage-1] node3 2022-06-02 19:41:29,701 ReadCommand.java:592 - Read 100 live rows and 101 tombstone cells for query SELECT * FROM distributed_test_keyspace.tbl WHERE pk = 1 ALLOW FILTERING; token -4069959284402364209 (see tombstone_warn_threshold)
ERROR 19:41:29 Scanned over 101 tombstones during query 'SELECT * FROM distributed_test_keyspace.tbl WHERE pk = 1 ALLOW FILTERING' (last scanned row token was -4069959284402364209 and partion key was ((1), 100)); query aborted
ERROR [ReadStage-1] node3 2022-06-02 19:41:29,701 NoSpamLogger.java:111 - Scanned over 101 tombstones during query 'SELECT * FROM distributed_test_keyspace.tbl WHERE pk = 1 ALLOW FILTERING' (last scanned row token was -4069959284402364209 and partion key was ((1), 100)); query aborted
WARN 19:41:29 Read 100 live rows and 101 tombstone cells for query SELECT * FROM distributed_test_keyspace.tbl WHERE pk = 1 ALLOW FILTERING; token -4069959284402364209 (see tombstone_warn_threshold)
WARN [ReadStage-1] node1 2022-06-02 19:41:29,701 ReadCommand.java:592 - Read 100 live rows and 101 tombstone cells for query SELECT * FROM distributed_test_keyspace.tbl WHERE pk = 1 ALLOW FILTERING; token -4069959284402364209 (see tombstone_warn_threshold)
ERROR 19:41:29 Scanned over 101 tombstones during query 'SELECT * FROM distributed_test_keyspace.tbl WHERE pk = 1 ALLOW FILTERING' (last scanned row token was -4069959284402364209 and partion key was ((1), 100)); query aborted
ERROR [ReadStage-1] node1 2022-06-02 19:41:29,701 StorageProxy.java:2148 - Scanned over 101 tombstones during query 'SELECT * FROM distributed_test_keyspace.tbl WHERE pk = 1 ALLOW FILTERING' (last scanned row token was -4069959284402364209 and partion key was ((1), 100)); query aborted
{code}
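Separate from the propagation question, the test would fail with a clearer message than an NPE if it checked for null before unwrapping; a sketch using the AssertJ/Guava calls the test already uses:
{code}
Assertions.assertThat(warnings).isNotNull();
Assertions.assertThat(Iterables.getOnlyElement(warnings))
          .startsWith("Read " + TOMBSTONE_FAIL + " live rows and "
                      + (TOMBSTONE_FAIL + 1) + " tombstone cells for query " + cql);
{code}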
> Fix org.apache.cassandra.distributed.test.trackwarnings.TombstoneCountWarningTest.failThresholdSinglePartition
> --------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-17244
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17244
> Project: Cassandra
> Issue Type: Bug
> Components: CI
> Reporter: Ekaterina Dimitrova
> Assignee: David Capwell
> Priority: Normal
> Fix For: 4.1-beta, 4.1.x, 4.x
>
>
> org.apache.cassandra.distributed.test.trackwarnings.TombstoneCountWarningTest.failThresholdSinglePartition failed [here|https://jenkins-cm4.apache.org/job/Cassandra-devbranch/1354/testReport/junit/org.apache.cassandra.distributed.test.trackwarnings/TombstoneCountWarningTest/failThresholdSinglePartition/]
> I didn't find any other occurrences, but it seems to me to be a legitimate failure.
> CC [~dcapwell] as I think you were working on those and you will probably make a better assessment than me. :)