[jira] [Comment Edited] (CASSANDRA-21164) Bulk import support at CL

Ariel Weisberg (Jira) Mon, 16 Feb 2026 14:46:57 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18058977#comment-18058977
 ]


Ariel Weisberg edited comment on CASSANDRA-21164 at 2/16/26 10:45 PM:
----------------------------------------------------------------------

An additional wrinkle here is that if you support import at QUORUM and a node 
comes back and has to import a huge number of sstables before it can execute 
a read, the read is going to time out. That's not an acceptable outcome, so 
we need to detect missing sstable imports and, if there are any, send a 
message back to the coordinator so it can pick a different full replica. Or we 
can just rely on read speculation to do this, but that still needs to be 
implemented for MT (I think), and it still adds extra latency.
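
To make the first option concrete, here is a rough sketch of the replica 
refusing a read while imports are outstanding and the coordinator moving on 
to another full replica. Every name in it (PendingImportRedirect, 
ReadResponse, respond, nextFullReplica) is made up for illustration and is 
not an existing Cassandra API. It also assumes that any pending import at all 
is enough to refuse the read, which may be too strict in practice.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: none of these names are Cassandra APIs.
final class PendingImportRedirect {
    // What a replica tells the coordinator about a read request.
    enum ReadResponse { DATA, IMPORTS_PENDING }

    // Replica side: refuse quickly instead of importing inline and timing out.
    // Assumption: any outstanding import is enough to refuse the read.
    static ReadResponse respond(int pendingImports) {
        return pendingImports == 0 ? ReadResponse.DATA : ReadResponse.IMPORTS_PENDING;
    }

    // Coordinator side: first full replica that has not already refused.
    static Optional<String> nextFullReplica(List<String> fullReplicas, Set<String> refused) {
        return fullReplicas.stream()
                           .filter(replica -> !refused.contains(replica))
                           .findFirst();
    }
}
```

An empty result from nextFullReplica means every full replica has refused, 
at which point the read would have to fail or wait.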

We took on a lot of extra complexity by moving distribution of sstable import 
into the database itself, and completing the feature will take a bit more 
for both witnesses and mutation tracking.


was (Author: aweisberg):
An additional wrinkle here is that if you support import at QUORUM and a node 
comes back and in order to execute a read it has to import a huge number of 
sstables it means the read is going to time out. That's not an acceptable 
outcome so we need to detect missing sstable imports and if they occur send a 
message back to the coordinator so it can pick a different full replica. Or we 
can just rely on read speculation to do this, but that still needs to be 
implemented for MT (I think) and that still adds extra latency.

We took on a lot of extra complexity by moving distribution of sstable import 
into the database itself and to complete the feature it will take quite a bit 
more for both witnesses and mutation tracking.

> Bulk import support at CL
> -------------------------
>
>                 Key: CASSANDRA-21164
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21164
>             Project: Apache Cassandra
>          Issue Type: Sub-task
>            Reporter: Ariel Weisberg
>            Priority: Normal
>
> Right now bulk import will fail with mutation tracked tables if a node is 
> down because it assumes that it will distribute at ALL, but things like 
> BulkWriter would rather distribute at QUORUM or some other consistency level.
> Another wrinkle is that import will distribute the data for you if it is a 
> mutation tracked table, but will silently import it at only one node if it 
> is not mutation tracked. For callers this is risky. It would be better if 
> distribution only occurred when the caller requested it, and if the caller 
> requests distribution and Cassandra will not distribute the data, an error 
> should be signaled. This prevents the case where the caller expects the data 
> to be distributed, it silently isn’t, and the data is only visible at ONE.
> Since mutation tracking requires distribution, an import whose caller does 
> not request distribution should fail and indicate that it needs to be 
> requested with distribution. Finicky, yes, but it fails fast with less risk 
> of data loss.



