[ 
https://issues.apache.org/jira/browse/CASSANDRA-20383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abe Ratnofsky updated CASSANDRA-20383:
--------------------------------------
    Attachment: ci_summary.html

> CEP-45: Bulk transfer
> ---------------------
>
>                 Key: CASSANDRA-20383
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20383
>             Project: Apache Cassandra
>          Issue Type: New Feature
>          Components: Consistency/Coordination
>            Reporter: Blake Eggleston
>            Assignee: Abe Ratnofsky
>            Priority: Normal
>         Attachments: ci_summary.html
>
>          Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> To support failure recovery, consistent sstable import, and other
> operations that require transmitting large amounts of data for log-replicated
> keyspaces, we need a method of staging data transfers and adding them as a
> logged operation.
> If large volumes of data are streamed and made visible to reads as soon as
> streams complete, it will cause increases in read latency and/or read outages
> as replicas’ data sets diverge in a way that can’t be handled (practically)
> via read reconciliation.
> For instance, imagine 5GB of data being added to a node via sstable import or 
> failure recovery. The read mutation id summary needs to account for that 
> data, and if the read coordinator can’t find a quorum of nodes that either 
> have or have not received this data, the read can’t be completed and will 
> fail.
> To prevent this, we’d need large transfers to be performed as a two-stage 
> process. First, data would need to be streamed to nodes under a unique id. 
> Then, once the node coordinating transfer has finished streaming data to its 
> recipients, it would instruct the recipients to add this data to their 
> dataset as a logged operation. This way, if the “add” step doesn’t complete, 
> or if it races with reads, read reconciliation (or normal reconciliation) can 
> perform the add step.
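
The two-stage flow described in the issue can be sketched as a toy model. This is only an illustration under stated assumptions, not Cassandra's implementation or API: the names Replica, stage, activate, bulk_transfer, and reconcile are hypothetical, the "log" is a plain in-memory list, and Python is used for brevity rather than Cassandra's own Java. The sketch shows staged data staying invisible to reads until the logged "add" is applied, and an idempotent "add" that reconciliation can replay on replicas that missed it.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory stand-in for a replica; not a Cassandra class."""
    name: str
    staged: dict = field(default_factory=dict)   # transfer id -> rows, hidden from reads
    visible: dict = field(default_factory=dict)  # key -> value, served to reads
    log: list = field(default_factory=list)      # transfer ids whose "add" was applied

    def stage(self, transfer_id, rows):
        # Stage 1: accept streamed data under a unique id without exposing it.
        self.staged[transfer_id] = dict(rows)

    def activate(self, transfer_id):
        # Stage 2: the logged "add". Idempotent, so it is safe to replay.
        if transfer_id in self.log:
            return
        self.visible.update(self.staged.pop(transfer_id))
        self.log.append(transfer_id)


def bulk_transfer(replicas, transfer_id, rows, reachable_for_add):
    """Stream to every replica first, then issue the "add" to reachable ones.

    reachable_for_add is an assumption of this sketch: it simulates the
    "add" step failing to reach some replicas.
    """
    for replica in replicas:
        replica.stage(transfer_id, rows)
    for replica in replicas:
        if replica.name in reachable_for_add:
            replica.activate(transfer_id)


def reconcile(replicas):
    """Replay any logged "add" a replica has staged but not yet applied."""
    logged = set()
    for replica in replicas:
        logged.update(replica.log)
    for replica in replicas:
        for transfer_id in sorted(logged):
            if transfer_id in replica.staged:
                replica.activate(transfer_id)


replicas = [Replica("a"), Replica("b"), Replica("c")]
# Only replica "a" receives the "add"; "b" and "c" keep the data staged.
bulk_transfer(replicas, "t1", {"k": "v"}, reachable_for_add={"a"})
reconcile(replicas)
```

Because every replica already holds the staged data, completing a missed "add" during reconciliation only requires applying a small logged operation rather than moving the data itself.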



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

