[jira] [Updated] (CASSANDRA-2590) row delete breaks read repair

2011-06-08 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2590:
--

Attachment: 2590-v4-0.7.txt

I think we're almost there.

The trick is that you actually need _both_ collectCollatedColumns and 
removeDeleted, since removeDeleted assumes collectCollatedColumns has already 
been called. That holds when we're merging versions from different sstables, 
but not when we're merging versions from different replicas, as in 
RowRepairResolver.

Added a test (testResolveDeletedSuper) to illustrate this.  It fails against v3 
(removeDeleted but no collectCollatedColumns) but passes with v4 (both).

> row delete breaks read repair 
> --
>
> Key: CASSANDRA-2590
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2590
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Aaron Morton
> Assignee: Aaron Morton
> Priority: Minor
> Fix For: 0.7.7, 0.8.1
>
> Attachments: 0001-2590-v3.patch, 
> 0001-cf-resolve-test-and-possible-solution-for-read-repai.patch, 2590-v2.txt, 
> 2590-v4-0.7.txt
>
>
> Related to CASSANDRA-2589.
>
> Reads at CL ALL can return inconsistent results after a row deletion. 
> Reproduced on the 0.7 and 0.8 source.
>
> Steps to reproduce:
> # Start a two-node cluster with RF 2 and HH turned off.
> # Insert rows via the CLI.
> # Flush both nodes.
> # Shut down node 1.
> # Connect to node 2 via the CLI and delete one row.
> # Bring up node 1.
> # Connect to node 1 via the CLI and issue a get at CL ALL.
> # The first get returns the deleted row; the second get returns zero rows.
>
> RowRepairResolver.resolveSuperSet() resolves a local CF holding the old row 
> columns with the remote CF, which is marked for deletion. CF.resolve() does 
> not pay attention to the deletion flags, so the resolved CF has both 
> markedForDeletion set and a column with a lower timestamp. The return value 
> of resolveSuperSet() is used as the result of the read without checking 
> whether the columns are still relevant.
>
> Also, when RowRepairResolver.maybeScheduleRepairs() runs it sends two 
> mutations. Node 1 is given the row-level deletion, and node 2 is given a 
> mutation to write the old (and now deleted) column from node 1. I have some 
> log traces for this if needed.
>
> A quick fix is to check for relevant columns in the RowRepairResolver; I will 
> attach one shortly.
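
The failure described in the report can be reproduced in miniature. What follows is a minimal sketch and not Cassandra's code: ToyRow, resolve and dropShadowedColumns are hypothetical names, column values are left out, and treating a column as shadowed when its timestamp is less than or equal to the row tombstone is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of one replica's view of a row; NOT Cassandra's ColumnFamily class. */
final class ToyRow {
    // Timestamp of the row-level delete, or Long.MIN_VALUE if the row was never deleted.
    long markedForDeleteAt = Long.MIN_VALUE;
    // Column name -> write timestamp (values omitted to keep the sketch short).
    final Map<String, Long> columns = new HashMap<>();

    /** Naive merge: keeps the newest tombstone and the newest copy of every column. */
    static ToyRow resolve(ToyRow a, ToyRow b) {
        ToyRow merged = new ToyRow();
        merged.markedForDeleteAt = Math.max(a.markedForDeleteAt, b.markedForDeleteAt);
        merged.columns.putAll(a.columns);
        b.columns.forEach((name, ts) -> merged.columns.merge(name, ts, Long::max));
        return merged;
    }

    /** Drops columns shadowed by the row tombstone but keeps the tombstone itself. */
    static ToyRow dropShadowedColumns(ToyRow row) {
        ToyRow cleaned = new ToyRow();
        cleaned.markedForDeleteAt = row.markedForDeleteAt;
        row.columns.forEach((name, ts) -> {
            // Assumption: a tie between a write and a delete goes to the delete.
            if (ts > row.markedForDeleteAt) cleaned.columns.put(name, ts);
        });
        return cleaned;
    }
}
```

With node 1 holding a column written at timestamp 1 and node 2 holding only a row tombstone at timestamp 2, resolve() returns a row that carries both the tombstone and the stale column, which matches the symptom in the report. Passing that result through dropShadowedColumns() leaves only the tombstone.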

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-2590) row delete breaks read repair

2011-06-08 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2590:
--

         Reviewer: jbellis
Affects Version/s:     (was: 0.7.5)
                       (was: 0.8 beta 1)
    Fix Version/s: 0.8.1
                   0.7.7


[jira] [Updated] (CASSANDRA-2590) row delete breaks read repair

2011-06-01 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2590:
--

Comment: was deleted

(was: Why do we need to add extra steps to RowRepairResolver instead of using 
the IdentityQueryFilter approach (which means it gets fixed for any local-only 
queries too)?)


[jira] [Updated] (CASSANDRA-2590) row delete breaks read repair

2011-05-09 Thread Aaron Morton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Morton updated CASSANDRA-2590:


Attachment: 0001-2590-v3.patch

2590-v3 uses removeDeleted() in RowRepairResolver.resolveSuperset() and 
includes tests in RowResolverTest.

CASSANDRA-2621 shows that QueryFilter.collectCollatedColumns() returns a CF 
with deleted columns and the caller should use removeDeleted. 

Continuing to use CF.resolve() seemed like the minimum change. Let me know if 
you think we should still use QueryFilter to resolve the differences. 


[jira] [Updated] (CASSANDRA-2590) row delete breaks read repair

2011-05-03 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2590:
--

Attachment: 2590-v2.txt

... but that's not what we want for RowRepairResolver. (I freely admit that 
dealing with tombstones is subtle and tricky. :)

removeDeleted will give you back a version of the row with any GC-able 
tombstones removed. That's not what we want for read repair; we want to 
preserve tombstones, but we want a "canonical" representation of only the 
minimum tombstones necessary.

Instead we want to do what you were doing with ensureRelevant and drop columns 
that are irrelevant. But it's a little more complex than that, because we have 
the same problem at the supercolumn level as at the row level.

Here's a patch that uses an IdentityQueryFilter to run through the isRelevant 
logic using the same supercolumn-aware code that we use when merging versions 
from different memtables/sstables.
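
To make the two-level problem concrete, here is a minimal sketch of a relevance pass that keeps tombstones but drops whatever they shadow, at both the row and the supercolumn level. It is a toy model under stated assumptions, not the patch and not the real IdentityQueryFilter or isRelevant code: ToySuperColumn, ToySuperRow and canonicalize are hypothetical names, and "shadowed" is assumed to mean a timestamp less than or equal to the covering tombstone.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy super column: its own tombstone plus subcolumn name -> timestamp. Hypothetical. */
final class ToySuperColumn {
    long markedForDeleteAt = Long.MIN_VALUE; // MIN_VALUE means "never deleted"
    final Map<String, Long> subcolumns = new TreeMap<>();
}

/** Toy row of super columns with a row-level tombstone. Hypothetical. */
final class ToySuperRow {
    long markedForDeleteAt = Long.MIN_VALUE;
    final Map<String, ToySuperColumn> superColumns = new TreeMap<>();

    /**
     * Returns a copy that keeps only what is still relevant: subcolumns newer
     * than every tombstone covering them, and super column tombstones newer
     * than the row tombstone. No tombstone is garbage-collected by age.
     */
    static ToySuperRow canonicalize(ToySuperRow row) {
        ToySuperRow out = new ToySuperRow();
        out.markedForDeleteAt = row.markedForDeleteAt;
        for (Map.Entry<String, ToySuperColumn> e : row.superColumns.entrySet()) {
            ToySuperColumn sc = e.getValue();
            // A subcolumn is shadowed by whichever covering tombstone is newer.
            long shadow = Math.max(row.markedForDeleteAt, sc.markedForDeleteAt);
            ToySuperColumn kept = new ToySuperColumn();
            // The super column tombstone is redundant if the row tombstone covers it.
            if (sc.markedForDeleteAt > row.markedForDeleteAt) {
                kept.markedForDeleteAt = sc.markedForDeleteAt;
            }
            for (Map.Entry<String, Long> sub : sc.subcolumns.entrySet()) {
                if (sub.getValue() > shadow) kept.subcolumns.put(sub.getKey(), sub.getValue());
            }
            boolean hasTombstone = kept.markedForDeleteAt != Long.MIN_VALUE;
            if (hasTombstone || !kept.subcolumns.isEmpty()) out.superColumns.put(e.getKey(), kept);
        }
        return out;
    }
}
```

The result still carries every tombstone a stale replica would need, which is the difference from a removeDeleted-style pass that also discards GC-able tombstones.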


[jira] [Updated] (CASSANDRA-2590) row delete breaks read repair

2011-05-01 Thread Aaron Morton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Morton updated CASSANDRA-2590:


Attachment: 0001-cf-resolve-test-and-possible-solution-for-read-repai.patch

Unit test showing that columns remain in a deleted CF after calling resolve(), 
plus a hack fix for the use case described above.
