[jira] [Updated] (CASSANDRA-14145) Detecting data resurrection during read

2018-09-14 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CASSANDRA-14145:
---
Labels: pull-request-available  (was: )

>  Detecting data resurrection during read
> 
>
> Key: CASSANDRA-14145
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14145
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Assignee: Sam Tunnicliffe
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0
>
>
> We have seen several bugs in which deleted data gets resurrected. We should 
> try to detect this on the read path and possibly fix it. Here is one example 
> of a bug which brought back data:
> A replica lost an sstable on startup, which caused that replica to lose the 
> tombstone but not the data. The tombstone was past gc grace, which means it 
> could resurrect data. We can detect such invalid states by looking at other 
> replicas. 
> If we are running incremental repair, Cassandra keeps repaired and 
> non-repaired data separate. Every time incremental repair runs, it moves 
> data from the non-repaired set to the repaired set. Repaired data across 
> all replicas should be 100% consistent. 
> Here is an example of how we can detect and mitigate the issue in most 
> cases. Say we have three machines: A, B and C. Each machine holds data 
> split between the repaired and non-repaired sets. 
> 1. Due to some bug, machine A brings back data D, and this data D is in the 
> repaired dataset. All other replicas have data D and tombstone T. 
> 2. A read for data D arrives from the application, involving replicas A and 
> B; the data being read is in the repaired set. A responds to the 
> coordinator with data D, and B sends nothing because its tombstone is past 
> gc grace. This causes a digest mismatch. 
> 3. This patch only kicks in when there is a digest mismatch. The 
> coordinator asks both replicas to send back all data, as we do today, but 
> with this patch each replica also reports which of the data it returns 
> comes from the repaired set and which from the non-repaired set. If the 
> data coming from the repaired sets does not match, we know something is 
> wrong. At this point the coordinator cannot determine whether replica A has 
> resurrected some data or replica B has lost some data, but we can still log 
> an error saying we hit an invalid state.
> 4. Besides logging, we can take this further and even correct the response 
> to the query. After logging the invalid state, we can ask replicas A and B 
> (and also C, if alive) to send back all data for this read, including 
> gcable tombstones. If any machine returns a tombstone which is newer than 
> the data, we know we cannot return the data. This way we avoid returning 
> data which has been deleted. 
> Some challenges with this: 
> 1. When data is moved from non-repaired to repaired, there could be a race. 
> We can look at which incremental repairs have promoted data on which 
> replica to avoid false positives. 
> 2. If the third replica is down and the live replica does not have a 
> tombstone, we won't be able to break the tie and decide whether the data 
> was actually deleted or resurrected. 
> 3. If the read is for the latest data only, we won't be able to detect the 
> problem, as the read will be served from non-repaired data. 
> 4. If the replica where we lose a tombstone is the last replica to compact 
> that tombstone, we won't be able to decide whether data is coming back or 
> the rest of the replicas have lost it, but we will still detect that 
> something is wrong. 
> 5. This won't affect 99.9% of read queries, since we only do extra work on 
> a digest mismatch.
> 6. CL.ONE reads will not be able to detect this. 
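
To make the detection flow concrete, here is a minimal, illustrative sketch of 
the coordinator-side check in steps 2 and 3. This is not code from any patch; 
all class and method names are hypothetical. The idea: once a digest mismatch 
triggers full data reads, each replica also returns a digest covering only the 
data it read from repaired sstables, and the coordinator flags an invalid 
state if those digests diverge.

{code:java}
import java.nio.ByteBuffer;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical sketch, not the committed implementation.
final class RepairedDataVerifier
{
    private static final Logger logger =
        Logger.getLogger(RepairedDataVerifier.class.getName());

    /**
     * @param repairedDigests digest of the repaired portion of each replica's
     *                        response, keyed by replica address
     * @return true if the repaired data agreed on all replicas
     */
    static boolean repairedDataMatches(Map<String, ByteBuffer> repairedDigests)
    {
        // ByteBuffer.equals() compares contents, so the digests collapse
        // into a single set entry when all replicas agree.
        Set<ByteBuffer> distinct = new HashSet<>(repairedDigests.values());
        if (distinct.size() <= 1)
            return true; // any mismatch is confined to non-repaired data

        // Step 3: we cannot tell whether one replica resurrected data or
        // another lost it, so log the invalid state for operators.
        logger.severe("Repaired data mismatch between replicas " +
                      repairedDigests.keySet());
        return false;
    }
}
{code}

Step 4 could then build on this: re-request the full data, including gcable 
tombstones, and suppress any row that is shadowed by a newer tombstone on any 
replica.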



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14145) Detecting data resurrection during read

2018-09-05 Thread Sam Tunnicliffe (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-14145:

   Resolution: Fixed
Fix Version/s: 4.0  (was: 4.x)
       Status: Resolved  (was: Ready to Commit)

Thanks all, committed to trunk in {{5fbb938adaafd91e7bea1672f09a03c7ac5b9b9d}}




[jira] [Updated] (CASSANDRA-14145) Detecting data resurrection during read

2018-09-01 Thread Sam Tunnicliffe (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-14145:

Status: Ready to Commit  (was: Patch Available)




[jira] [Updated] (CASSANDRA-14145) Detecting data resurrection during read

2018-08-29 Thread Jordan West (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan West updated CASSANDRA-14145:

Reviewers: Jordan West, Marcus Eriksson
 Reviewer:   (was: Marcus Eriksson)




[jira] [Updated] (CASSANDRA-14145) Detecting data resurrection during read

2018-08-11 Thread Sam Tunnicliffe (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-14145:

 Reviewer: Marcus Eriksson
Fix Version/s: 4.x
   Status: Patch Available  (was: In Progress)

I agree with Stefan that logging inconsistencies, so that operators can 
investigate further, is the sensible way to approach this initially, so I've 
taken a pass at this in the branch linked below.

On a digest mismatch, the coordinator adds a new parameter to the requests for 
the full data reads. When executing the query, replicas generate a digest of 
the portion of the data read from their repaired sstables. When the coordinator 
resolves the data requests, it checks for multiple repaired-data digests, and 
logs and increments a metric if it finds more than one. To mitigate false 
positives caused by sstables moving from pending to repaired at slightly 
different times on each replica, we also track whether the reads touched any 
sstables with pending, but locally uncommitted, repair sessions. If any replica 
had pending sessions during the read, we increment a different metric (I've 
called these confirmed and unconfirmed inconsistencies).
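
For illustration, a minimal sketch of the confirmed/unconfirmed split 
described above; the class, field and method names here are my own, not the 
ones in the patch:

{code:java}
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: a repaired-data mismatch only counts as "confirmed"
// when no replica involved in the read had pending, locally uncommitted
// repair sessions, since those can move sstables from pending to repaired
// at slightly different times on each replica.
final class RepairedDataMetrics
{
    static final AtomicLong confirmedInconsistencies = new AtomicLong();
    static final AtomicLong unconfirmedInconsistencies = new AtomicLong();

    static void recordMismatch(boolean anyReplicaHadPendingSessions)
    {
        if (anyReplicaHadPendingSessions)
            unconfirmedInconsistencies.incrementAndGet(); // possible false positive
        else
            confirmedInconsistencies.incrementAndGet();   // genuine divergence
    }
}
{code}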
 
Partition range reads don't make digest requests, so in order to detect 
inconsistency on that side of the read path the coordinator always adds the 
parameter requesting the repaired-status info. Although the overhead of 
tracking the repaired status should be minimal, this means that every range 
read will perform the additional work. With that in mind, to be conservative 
I've added separate config options/JMX operations to enable/disable it for 
single partition and range reads.
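
As a sketch of what those separate switches might look like in cassandra.yaml 
(the option names here are assumptions for illustration; see the committed 
patch for the real ones):

{code}
# Illustrative only: enable repaired-data tracking for single-partition
# reads while leaving it disabled for range reads.
repaired_data_tracking_for_partition_reads_enabled: true
repaired_data_tracking_for_range_reads_enabled: false
{code}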

I haven't added any dtests yet, but there's quite a bit of unit test coverage 
in the patch.

||branch||utest||dtest||
|[14145-trunk|https://github.com/beobal/cassandra/tree/14145-trunk]|[utest|https://circleci.com/gh/beobal/cassandra/300]|[vnodes|https://circleci.com/gh/beobal/cassandra/301] / [no_vnodes|https://circleci.com/gh/beobal/cassandra/302]|



[jira] [Updated] (CASSANDRA-14145) Detecting data resurrection during read

2018-01-03 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14145:
---
Description: updated; the current text is quoted in full at the top of this 
thread (this edit corrected "deduct" to "detect").