[
https://issues.apache.org/jira/browse/CASSANDRA-20383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Caleb Rackliffe updated CASSANDRA-20383:
----------------------------------------
Reviewers: Blake Eggleston, Caleb Rackliffe (was: Caleb Rackliffe)
> CEP-45: Bulk transfer
> ---------------------
>
> Key: CASSANDRA-20383
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20383
> Project: Apache Cassandra
> Issue Type: New Feature
> Components: Consistency/Coordination
> Reporter: Blake Eggleston
> Assignee: Abe Ratnofsky
> Priority: Normal
> Time Spent: 8h 40m
> Remaining Estimate: 0h
>
> To support failure recovery, consistent sstable import, and other
> operations that transmit large amounts of data for log-replicated
> keyspaces, we need a method of staging data transfers and adding them as a
> logged operation.
> If large volumes of data are streamed and made visible to reads as soon as
> streams complete, read latency will increase and/or read outages will occur
> as replicas' data sets diverge in a way that can’t be handled (practically)
> via read reconciliation.
> For instance, imagine 5 GB of data being added to a node via sstable import or
> failure recovery. The read mutation id summary needs to account for that
> data, and if the read coordinator can’t find a quorum of nodes that either
> have or have not received this data, the read can’t be completed and will
> fail.
> To prevent this, we’d need large transfers to be performed as a two-stage
> process. First, data would need to be streamed to nodes under a unique id.
> Then, once the node coordinating the transfer has finished streaming data to its
> recipients, it would instruct the recipients to add this data to their
> dataset as a logged operation. This way, if the “add” step doesn’t complete,
> or if it races with reads, read reconciliation (or normal reconciliation) can
> perform the add step.
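
The two-stage process described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Cassandra code: the names Replica, stage, activate, reconcile and bulk_transfer are hypothetical, the "log" is a plain in-memory list, and reconciliation is reduced to replaying any activation that some replica logged and another missed. It shows that staged data stays invisible to reads until the logged "add" step runs, and that a replica which missed the add can be repaired later.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica (hypothetical): staged data is invisible until activated."""
    name: str
    staged: dict = field(default_factory=dict)  # transfer_id -> streamed rows
    log: list = field(default_factory=list)     # activated transfer ids, in order
    data: dict = field(default_factory=dict)    # rows visible to reads

    def stage(self, transfer_id, rows):
        # Stage 1: receive streamed data under a unique id; not yet readable.
        self.staged[transfer_id] = dict(rows)

    def activate(self, transfer_id):
        # Stage 2: add the staged data as a logged operation (idempotent).
        if transfer_id in self.log:
            return
        self.data.update(self.staged.pop(transfer_id))
        self.log.append(transfer_id)

    def read(self, key):
        return self.data.get(key)


def reconcile(replicas):
    """Replay any activation that some replica logged but another missed."""
    logged = {op for r in replicas for op in r.log}
    for r in replicas:
        for op in sorted(logged):
            if op not in r.log and op in r.staged:
                r.activate(op)


def bulk_transfer(replicas, transfer_id, rows, fail_activate_on=()):
    """Stream to every replica first, then instruct each one to activate.

    fail_activate_on simulates replicas that never receive the "add" step.
    """
    for r in replicas:
        r.stage(transfer_id, rows)
    for r in replicas:
        if r.name not in fail_activate_on:
            r.activate(transfer_id)


replicas = [Replica("a"), Replica("b"), Replica("c")]
bulk_transfer(replicas, "transfer-1", {"k": "v"}, fail_activate_on={"c"})
print([r.read("k") for r in replicas])  # ['v', 'v', None]: c only staged the data
reconcile(replicas)
print([r.read("k") for r in replicas])  # ['v', 'v', 'v']: add step completed on c
```

Because the streamed data is already staged on every recipient, completing a missed "add" step only needs the transfer id, not a second copy of the data. That is the property the description relies on when it says reconciliation can perform the add step.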