Jacek Lewandowski created CASSANDRA-16619:
---------------------------------------------

             Summary: Loss of commit log data possible after sstable ingest
                 Key: CASSANDRA-16619
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16619
             Project: Cassandra
          Issue Type: Bug
            Reporter: Jacek Lewandowski


SSTable metadata contains commit log positions of the sstable. These positions 
are used to filter out mutations from the commit log on restart and only make 
sense for the node on which the data was flushed.

If an SSTable is moved between nodes, its commit log positions may cover 
regions that the receiving node has not yet flushed. Valid data can then be 
lost if those sections of the commit log need to be replayed.

Solution:
The chosen solution adds a new field to the sstable metadata (StatsMetadata): 
originatingHostId (UUID), which is the local host id of the node on which the 
sstable was created, or null if not known. Commit log intervals from an sstable 
are taken into account during Commit Log replay only when the originatingHostId 
of the sstable matches the local node's hostId.
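
The replay rule above can be illustrated with a minimal, self-contained sketch. 
It is not the actual patch: SSTableInfo, ReplayFilterSketch and 
intervalsToSkip are hypothetical names, and commit log positions are 
simplified to plain [start, end] pairs of longs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical stand-in for the per-sstable metadata described above.
final class SSTableInfo {
    final UUID originatingHostId;      // null when the origin is not known
    final long[][] commitLogIntervals; // simplified [start, end] positions

    SSTableInfo(UUID originatingHostId, long[][] commitLogIntervals) {
        this.originatingHostId = originatingHostId;
        this.commitLogIntervals = commitLogIntervals;
    }
}

final class ReplayFilterSketch {
    // Collects the intervals that replay may treat as already flushed.
    static List<long[]> intervalsToSkip(UUID localHostId, List<SSTableInfo> sstables) {
        List<long[]> result = new ArrayList<>();
        for (SSTableInfo sstable : sstables) {
            // A null or foreign origin means the positions describe another
            // node's commit log, so they are ignored here.
            if (localHostId.equals(sstable.originatingHostId)) {
                for (long[] interval : sstable.commitLogIntervals) {
                    result.add(interval);
                }
            }
        }
        return result;
    }
}
```

Note that an sstable with an unknown (null) origin is treated like a foreign 
one in this sketch, so its intervals never suppress replay.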

For new sstables the originatingHostId is set according to StorageService's 
local hostId.
For compacted sstables the originatingHostId is set according to 
StorageService's local hostId, and only commit log intervals from local 
sstables are preserved in the resulting sstable.
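
As a rough sketch of the compaction rule, assuming a simplified metadata 
holder: SketchMetadata, CompactionMetadataSketch and merge are hypothetical 
names, not classes from the Cassandra code base, and intervals are again 
plain [start, end] pairs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical, simplified view of the metadata relevant to this issue.
final class SketchMetadata {
    final UUID originatingHostId;          // null when the origin is not known
    final List<long[]> commitLogIntervals; // simplified [start, end] positions

    SketchMetadata(UUID originatingHostId, List<long[]> commitLogIntervals) {
        this.originatingHostId = originatingHostId;
        this.commitLogIntervals = commitLogIntervals;
    }
}

final class CompactionMetadataSketch {
    // Builds the metadata of the compaction result from its inputs.
    static SketchMetadata merge(UUID localHostId, List<SketchMetadata> inputs) {
        List<long[]> preserved = new ArrayList<>();
        for (SketchMetadata input : inputs) {
            // Intervals of sstables that originated elsewhere are dropped.
            if (localHostId.equals(input.originatingHostId)) {
                preserved.addAll(input.commitLogIntervals);
            }
        }
        // The result is written by this node, so it carries the local host id.
        return new SketchMetadata(localHostId, preserved);
    }
}
```

Dropping the foreign intervals at this point matters because the output is 
stamped with the local host id; keeping them would make them look local on 
the next replay.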




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

