subject:"\[jira\] \[Updated\] \(CASSANDRA\-14812\) Multiget Thrift query processor skips records in case of digest mismatch"

[jira] [Updated] (CASSANDRA-14812) Multiget Thrift query processor skips records in case of digest mismatch

2018-11-29 Thread Benedict (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14812:
-
Fix Version/s: 3.0.18

> Multiget Thrift query processor skips records in case of digest mismatch
> 
>
> Key: CASSANDRA-14812
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14812
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination, Core
>Reporter: Sivukhin Nikita
>Assignee: Benedict
>Priority: Critical
>  Labels: bug
> Fix For: 3.0.18
>
> Attachments: repro_script.py, requirements.txt, small_repro_script.py
>
>
> It seems that in Cassandra 3.0.0 a nasty bug was introduced in {{multiget}} 
> Thrift query processing logic. When one tries to read data from several 
> partitions with a single {{multiget}} query and {{DigestMismatch}} exception 
> is raised during this query processing, request coordinator prematurely 
> terminates response stream right at the point where the first 
> \{{DigestMismatch}} error is occurring. This leads to situation where clients 
> "do not see" some data contained in the database.
> We managed to reproduce this bug in all versions of Cassandra starting with 
> v3.0.0. The pre-release version 3.0.0-rc2 works correctly. It looks like 
> [refactoring of iterator transformation 
> hierarchy|https://github.com/apache/cassandra/commit/609497471441273367013c09a1e0e1c990726ec7]
>  related to CASSANDRA-9975 triggers incorrect behaviour.
> When concatenated iterator is returned from the 
> [StorageProxy.fetchRows(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/service/StorageProxy.java#L1770],
>  Cassandra starts to consume this combined iterator. Because of 
> {{DigestMismatch}} exception some elements of this combined iterator contain 
> additional {{ThriftCounter}}, that was added during 
> [DataResolver.resolve(...)|https://github.com/apache/cassandra/blob/ee9e06b5a75c0be954694b191ea4170456015b98/src/java/org/apache/cassandra/service/reads/DataResolver.java#L120]
>  execution. While consuming iterator for many partitions Cassandra calls 
> [BaseIterator.tryGetMoreContents(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/db/transform/BaseIterator.java#L115]
>  method that must switch from one partition iterator to another in case of 
> exhaustion of the former. In this case all Transformations contained in the 
> next iterator are applied to the combined BaseIterator that enumerates 
> partitions sequence which is wrong. This behaviour causes BaseIterator to 
> stop enumeration after it fully consumes partition with {{DigestMismatch}} 
> error, because this partition iterator has additional {{ThriftCounter}} data 
> limit.
> The attachment contains the python2 script [^small_repro_script.py] that 
> reproduces this bug within 3-nodes ccmlib controlled cluster. Also, there is 
> an extended version of this script - [^repro_script.py] - that contains more 
> logging information and provides the ability to test behavior for many 
> Cassandra versions (to run all test cases from repro_script.py you can call 
> {{python -m unittest2 -v repro_script.ThriftMultigetTestCase}}). All the 
> necessary dependencies contained in the [^requirements.txt]
>  
> This bug is critical in our production environment because we can't permit 
> any data skip.
> Any ideas about a patch for this issue?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-14812) Multiget Thrift query processor skips records in case of digest mismatch

2018-11-15 Thread C. Scott Andreas (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-14812:
-
Component/s: Core
 Coordination

> Multiget Thrift query processor skips records in case of digest mismatch
> 
>
> Key: CASSANDRA-14812
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14812
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination, Core
>Reporter: Sivukhin Nikita
>Priority: Critical
>  Labels: bug
> Attachments: repro_script.py, requirements.txt, small_repro_script.py
>
>
> It seems that in Cassandra 3.0.0 a nasty bug was introduced in {{multiget}} 
> Thrift query processing logic. When one tries to read data from several 
> partitions with a single {{multiget}} query and {{DigestMismatch}} exception 
> is raised during this query processing, request coordinator prematurely 
> terminates response stream right at the point where the first 
> \{{DigestMismatch}} error is occurring. This leads to situation where clients 
> "do not see" some data contained in the database.
> We managed to reproduce this bug in all versions of Cassandra starting with 
> v3.0.0. The pre-release version 3.0.0-rc2 works correctly. It looks like 
> [refactoring of iterator transformation 
> hierarchy|https://github.com/apache/cassandra/commit/609497471441273367013c09a1e0e1c990726ec7]
>  related to CASSANDRA-9975 triggers incorrect behaviour.
> When concatenated iterator is returned from the 
> [StorageProxy.fetchRows(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/service/StorageProxy.java#L1770],
>  Cassandra starts to consume this combined iterator. Because of 
> {{DigestMismatch}} exception some elements of this combined iterator contain 
> additional {{ThriftCounter}}, that was added during 
> [DataResolver.resolve(...)|https://github.com/apache/cassandra/blob/ee9e06b5a75c0be954694b191ea4170456015b98/src/java/org/apache/cassandra/service/reads/DataResolver.java#L120]
>  execution. While consuming iterator for many partitions Cassandra calls 
> [BaseIterator.tryGetMoreContents(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/db/transform/BaseIterator.java#L115]
>  method that must switch from one partition iterator to another in case of 
> exhaustion of the former. In this case all Transformations contained in the 
> next iterator are applied to the combined BaseIterator that enumerates 
> partitions sequence which is wrong. This behaviour causes BaseIterator to 
> stop enumeration after it fully consumes partition with {{DigestMismatch}} 
> error, because this partition iterator has additional {{ThriftCounter}} data 
> limit.
> The attachment contains the python2 script [^small_repro_script.py] that 
> reproduces this bug within 3-nodes ccmlib controlled cluster. Also, there is 
> an extended version of this script - [^repro_script.py] - that contains more 
> logging information and provides the ability to test behavior for many 
> Cassandra versions (to run all test cases from repro_script.py you can call 
> {{python -m unittest2 -v repro_script.ThriftMultigetTestCase}}). All the 
> necessary dependencies contained in the [^requirements.txt]
>  
> This bug is critical in our production environment because we can't permit 
> any data skip.
> Any ideas about a patch for this issue?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-14812) Multiget Thrift query processor skips records in case of digest mismatch

2018-10-12 Thread Sivukhin Nikita (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sivukhin Nikita updated CASSANDRA-14812:

Description: 
It seems that in Cassandra 3.0.0 a nasty bug was introduced in {{multiget}} 
Thrift query processing logic. When one tries to read data from several 
partitions with a single {{multiget}} query and {{DigestMismatch}} exception is 
raised during this query processing, request coordinator prematurely terminates 
response stream right at the point where the first \{{DigestMismatch}} error is 
occurring. This leads to situation where clients "do not see" some data 
contained in the database.

We managed to reproduce this bug in all versions of Cassandra starting with 
v3.0.0. The pre-release version 3.0.0-rc2 works correctly. It looks like 
[refactoring of iterator transformation 
hierarchy|https://github.com/apache/cassandra/commit/609497471441273367013c09a1e0e1c990726ec7]
 related to CASSANDRA-9975 triggers incorrect behaviour.

When concatenated iterator is returned from the 
[StorageProxy.fetchRows(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/service/StorageProxy.java#L1770],
 Cassandra starts to consume this combined iterator. Because of 
{{DigestMismatch}} exception some elements of this combined iterator contain 
additional {{ThriftCounter}}, that was added during 
[DataResolver.resolve(...)|https://github.com/apache/cassandra/blob/ee9e06b5a75c0be954694b191ea4170456015b98/src/java/org/apache/cassandra/service/reads/DataResolver.java#L120]
 execution. While consuming iterator for many partitions Cassandra calls 
[BaseIterator.tryGetMoreContents(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/db/transform/BaseIterator.java#L115]
 method that must switch from one partition iterator to another in case of 
exhaustion of the former. In this case all Transformations contained in the 
next iterator are applied to the combined BaseIterator that enumerates 
partitions sequence which is wrong. This behaviour causes BaseIterator to stop 
enumeration after it fully consumes partition with {{DigestMismatch}} error, 
because this partition iterator has additional {{ThriftCounter}} data limit.

The attachment contains the python2 script [^small_repro_script.py] that 
reproduces this bug within 3-nodes ccmlib controlled cluster. Also, there is an 
extended version of this script - [^repro_script.py] - that contains more 
logging information and provides the ability to test behavior for many 
Cassandra versions (to run all test cases from repro_script.py you can call 
{{python -m unittest2 -v repro_script.ThriftMultigetTestCase}}). All the 
necessary dependencies contained in the [^requirements.txt]

 
This bug is critical in our production environment because we can't permit any 
data skip.

Any ideas about a patch for this issue?

  was:
It seems that in Cassandra 3.0.0 a nasty bug was introduced in {{multiget}} 
Thrift query processing logic. When one tries to read data from several 
partitions with a single {{multiget}} query and {{DigestMismatch}} exception is 
raised during this query processing, request coordinator prematurely terminates 
response stream right at the point where the first \{{DigestMismatch}}error is 
occurring. This leads to situation where clients "do not see" some data 
contained in the database.

We managed to reproduce this bug in all versions of Cassandra starting with 
v3.0.0. The pre-release version 3.0.0-rc2 works correctly. It looks like 
[refactoring of iterator transformation 
hierarchy|https://github.com/apache/cassandra/commit/609497471441273367013c09a1e0e1c990726ec7]
 related to CASSANDRA-9975 triggers incorrect behaviour.

When concatenated iterator is returned from the 
[StorageProxy.fetchRows(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/service/StorageProxy.java#L1770],
 Cassandra starts to consume this combined iterator. Because of 
{{DigestMismatch}} exception some elements of this combined iterator contain 
additional {{ThriftCounter}}, that was added during 
[DataResolver.resolve(...)|https://github.com/apache/cassandra/blob/ee9e06b5a75c0be954694b191ea4170456015b98/src/java/org/apache/cassandra/service/reads/DataResolver.java#L120]
 execution. While consuming iterator for many partitions Cassandra calls 
[BaseIterator.tryGetMoreContents(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/db/transform/BaseIterator.java#L115]
 method that must switch from one partition iterator to another in case of 
exhaustion of the former. In this case all Transformations contained in the 
next iterator are applied to the combined BaseIterator that enumerates 
partitions sequence which is

[jira] [Updated] (CASSANDRA-14812) Multiget Thrift query processor skips records in case of digest mismatch

2018-10-12 Thread Sivukhin Nikita (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sivukhin Nikita updated CASSANDRA-14812:

Priority: Critical  (was: Major)

> Multiget Thrift query processor skips records in case of digest mismatch
> 
>
> Key: CASSANDRA-14812
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14812
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sivukhin Nikita
>Priority: Critical
>  Labels: bug
> Attachments: repro_script.py, requirements.txt, small_repro_script.py
>
>
> It seems that in Cassandra 3.0.0 a nasty bug was introduced in {{multiget}} 
> Thrift query processing logic. When one tries to read data from several 
> partitions with a single {{multiget}} query and {{DigestMismatch}} exception 
> is raised during this query processing, request coordinator prematurely 
> terminates response stream right at the point where the first 
> \{{DigestMismatch}} error is occurring. This leads to situation where clients 
> "do not see" some data contained in the database.
> We managed to reproduce this bug in all versions of Cassandra starting with 
> v3.0.0. The pre-release version 3.0.0-rc2 works correctly. It looks like 
> [refactoring of iterator transformation 
> hierarchy|https://github.com/apache/cassandra/commit/609497471441273367013c09a1e0e1c990726ec7]
>  related to CASSANDRA-9975 triggers incorrect behaviour.
> When concatenated iterator is returned from the 
> [StorageProxy.fetchRows(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/service/StorageProxy.java#L1770],
>  Cassandra starts to consume this combined iterator. Because of 
> {{DigestMismatch}} exception some elements of this combined iterator contain 
> additional {{ThriftCounter}}, that was added during 
> [DataResolver.resolve(...)|https://github.com/apache/cassandra/blob/ee9e06b5a75c0be954694b191ea4170456015b98/src/java/org/apache/cassandra/service/reads/DataResolver.java#L120]
>  execution. While consuming iterator for many partitions Cassandra calls 
> [BaseIterator.tryGetMoreContents(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/db/transform/BaseIterator.java#L115]
>  method that must switch from one partition iterator to another in case of 
> exhaustion of the former. In this case all Transformations contained in the 
> next iterator are applied to the combined BaseIterator that enumerates 
> partitions sequence which is wrong. This behaviour causes BaseIterator to 
> stop enumeration after it fully consumes partition with {{DigestMismatch}} 
> error, because this partition iterator has additional {{ThriftCounter}} data 
> limit.
> The attachment contains the python2 script [^small_repro_script.py] that 
> reproduces this bug within 3-nodes ccmlib controlled cluster. Also, there is 
> an extended version of this script - [^repro_script.py] - that contains more 
> logging information and provides the ability to test behavior for many 
> Cassandra versions (to run all test cases from repro_script.py you can call 
> {{python -m unittest2 -v repro_script.ThriftMultigetTestCase}}). All the 
> necessary dependencies contained in the [^requirements.txt]
>  
> This bug is critical in our production environment because we can't permit 
> any data skip.
> Any ideas about a patch for this issue?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-14812) Multiget Thrift query processor skips records in case of digest mismatch

2018-10-12 Thread Sivukhin Nikita (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sivukhin Nikita updated CASSANDRA-14812:

Description: 
It seems that in Cassandra 3.0.0 a nasty bug was introduced in {{multiget}} 
Thrift query processing logic. When one tries to read data from several 
partitions with a single {{multiget}} query and {{DigestMismatch}} exception is 
raised during this query processing, request coordinator prematurely terminates 
response stream right at the point where the first {{DigestMismatch}}error is 
occurring. This leads to situation where clients "do not see" some data 
contained in the database.

We managed to reproduce this bug in all versions of Cassandra starting with 
v3.0.0. The pre-release version 3.0.0-rc2 works correctly. It looks like 
[refactoring of iterator transformation 
hierarchy|https://github.com/apache/cassandra/commit/609497471441273367013c09a1e0e1c990726ec7]
 related to CASSANDRA-9975 triggers incorrect behaviour.

When concatenated iterator is returned from the 
[StorageProxy.fetchRows(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/service/StorageProxy.java#L1770],
 Cassandra starts to consume this combined iterator. Because of 
{{DigestMismatch}} exception some elements of this combined iterator contain 
additional {{ThriftCounter}}, that was added during 
[DataResolver.resolve(...)|https://github.com/apache/cassandra/blob/ee9e06b5a75c0be954694b191ea4170456015b98/src/java/org/apache/cassandra/service/reads/DataResolver.java#L120]
 execution. While consuming iterator for many partitions Cassandra calls 
[BaseIterator.tryGetMoreContents(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/db/transform/BaseIterator.java#L115]
 method that must switch from one partition iterator to another in case of 
exhaustion of the former. In this case all Transformations contained in the 
next iterator are applied to the combined BaseIterator that enumerates 
partitions sequence which is wrong. This behaviour causes BaseIterator to stop 
enumeration after it fully consumes partition with {{DigestMismatch}} error, 
because this partition iterator has additional {{ThriftCounter}} data limit.

The attachment contains the python2 script [^small_repro_script.py] that 
reproduces this bug within 3-nodes ccmlib controlled cluster. Also, there is an 
extended version of this script - [^repro_script.py] - that contains more 
logging information and provides the ability to test behavior for many 
Cassandra versions (to run all test cases from repro_script.py you can call 
{{python -m unittest2 -v repro_script.ThriftMultigetTestCase}}). All the 
necessary dependencies contained in the [^requirements.txt]

 
This bug is critical in our production environment because we can't permit any 
data skip.

Any ideas about a patch for this issue?

  was:
Since Cassandra 3.0.0 there is a subtle bug that relates to the {{multiget}} 
Thrift query. It appears in the case when you try to read many partitions and 
this read cause {{DigestMismatch}} for some partition. When this situation 
happened, Cassandra cut your response stream right at the point when the first 
{{DigestMismatch}} error occurred.

This bug reproduced in all versions of Cassandra since 3.0.0. The pre-release 
version 3.0.0-rc2 works fine (also, there is no such problem in Cassandra 2.x 
versions). Looks like the big refactoring related to the task CASSANDRA-9975 
([link to 
commit|https://github.com/apache/cassandra/commit/609497471441273367013c09a1e0e1c990726ec7])
 in partition iterator architecture causes the wrong behavior.

When concatenated iterator returned from the 
[StorageProxy.fetchRows(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/service/StorageProxy.java#L1770],
 Cassandra starts to consume this combined iterator. Because of 
{{DigestMismatch}} some elements of this combined iterator contains additional 
{{ThriftCounter}}, that was added during 
[DataResolver.resolve(...)|https://github.com/apache/cassandra/blob/ee9e06b5a75c0be954694b191ea4170456015b98/src/java/org/apache/cassandra/service/reads/DataResolver.java#L120]
 execution. While consuming iterator for many partitions, Cassandra calls 
[BaseIterator.tryGetMoreContents(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/db/transform/BaseIterator.java#L115]
 method that must switch from one partition iterator to another in case of the 
devastation of former. In this case, all transformations for next iterator 
applied to the whole BaseIterator that enumerate many partitions sequence. This 
behavior causes iterator to stop enumeration after it fully consumes partition 
with the {{DigestMismatch}} error

[jira] [Updated] (CASSANDRA-14812) Multiget Thrift query processor skips records in case of digest mismatch

2018-10-12 Thread Sivukhin Nikita (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sivukhin Nikita updated CASSANDRA-14812:

Description: 
It seems that in Cassandra 3.0.0 a nasty bug was introduced in {{multiget}} 
Thrift query processing logic. When one tries to read data from several 
partitions with a single {{multiget}} query and {{DigestMismatch}} exception is 
raised during this query processing, request coordinator prematurely terminates 
response stream right at the point where the first \{{DigestMismatch}}error is 
occurring. This leads to situation where clients "do not see" some data 
contained in the database.

We managed to reproduce this bug in all versions of Cassandra starting with 
v3.0.0. The pre-release version 3.0.0-rc2 works correctly. It looks like 
[refactoring of iterator transformation 
hierarchy|https://github.com/apache/cassandra/commit/609497471441273367013c09a1e0e1c990726ec7]
 related to CASSANDRA-9975 triggers incorrect behaviour.

When concatenated iterator is returned from the 
[StorageProxy.fetchRows(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/service/StorageProxy.java#L1770],
 Cassandra starts to consume this combined iterator. Because of 
{{DigestMismatch}} exception some elements of this combined iterator contain 
additional {{ThriftCounter}}, that was added during 
[DataResolver.resolve(...)|https://github.com/apache/cassandra/blob/ee9e06b5a75c0be954694b191ea4170456015b98/src/java/org/apache/cassandra/service/reads/DataResolver.java#L120]
 execution. While consuming iterator for many partitions Cassandra calls 
[BaseIterator.tryGetMoreContents(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/db/transform/BaseIterator.java#L115]
 method that must switch from one partition iterator to another in case of 
exhaustion of the former. In this case all Transformations contained in the 
next iterator are applied to the combined BaseIterator that enumerates 
partitions sequence which is wrong. This behaviour causes BaseIterator to stop 
enumeration after it fully consumes partition with {{DigestMismatch}} error, 
because this partition iterator has additional {{ThriftCounter}} data limit.

The attachment contains the python2 script [^small_repro_script.py] that 
reproduces this bug within 3-nodes ccmlib controlled cluster. Also, there is an 
extended version of this script - [^repro_script.py] - that contains more 
logging information and provides the ability to test behavior for many 
Cassandra versions (to run all test cases from repro_script.py you can call 
{{python -m unittest2 -v repro_script.ThriftMultigetTestCase}}). All the 
necessary dependencies contained in the [^requirements.txt]

 
This bug is critical in our production environment because we can't permit any 
data skip.

Any ideas about a patch for this issue?

  was:
It seems that in Cassandra 3.0.0 a nasty bug was introduced in {{multiget}} 
Thrift query processing logic. When one tries to read data from several 
partitions with a single {{multiget}} query and {{DigestMismatch}} exception is 
raised during this query processing, request coordinator prematurely terminates 
response stream right at the point where the first {{DigestMismatch}}error is 
occurring. This leads to situation where clients "do not see" some data 
contained in the database.

We managed to reproduce this bug in all versions of Cassandra starting with 
v3.0.0. The pre-release version 3.0.0-rc2 works correctly. It looks like 
[refactoring of iterator transformation 
hierarchy|https://github.com/apache/cassandra/commit/609497471441273367013c09a1e0e1c990726ec7]
 related to CASSANDRA-9975 triggers incorrect behaviour.

When concatenated iterator is returned from the 
[StorageProxy.fetchRows(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/service/StorageProxy.java#L1770],
 Cassandra starts to consume this combined iterator. Because of 
{{DigestMismatch}} exception some elements of this combined iterator contain 
additional {{ThriftCounter}}, that was added during 
[DataResolver.resolve(...)|https://github.com/apache/cassandra/blob/ee9e06b5a75c0be954694b191ea4170456015b98/src/java/org/apache/cassandra/service/reads/DataResolver.java#L120]
 execution. While consuming iterator for many partitions Cassandra calls 
[BaseIterator.tryGetMoreContents(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/db/transform/BaseIterator.java#L115]
 method that must switch from one partition iterator to another in case of 
exhaustion of the former. In this case all Transformations contained in the 
next iterator are applied to the combined BaseIterator that enumerates 
partitions sequence which is

[jira] [Updated] (CASSANDRA-14812) Multiget Thrift query processor skips records in case of digest mismatch

2018-10-12 Thread Sivukhin Nikita (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sivukhin Nikita updated CASSANDRA-14812:

Summary: Multiget Thrift query processor skips records in case of digest 
mismatch  (was: Multiget Thrift query processor skips records in case of 
DigestMismatch)

> Multiget Thrift query processor skips records in case of digest mismatch
> 
>
> Key: CASSANDRA-14812
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14812
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sivukhin Nikita
>Priority: Major
>  Labels: bug
> Attachments: repro_script.py, requirements.txt, small_repro_script.py
>
>
> Since Cassandra 3.0.0 there is a subtle bug that relates to the {{multiget}} 
> Thrift query. It appears in the case when you try to read many partitions and 
> this read cause {{DigestMismatch}} for some partition. When this situation 
> happened, Cassandra cut your response stream right at the point when the 
> first {{DigestMismatch}} error occurred.
> This bug reproduced in all versions of Cassandra since 3.0.0. The pre-release 
> version 3.0.0-rc2 works fine (also, there is no such problem in Cassandra 2.x 
> versions). Looks like the big refactoring related to the task CASSANDRA-9975 
> ([link to 
> commit|https://github.com/apache/cassandra/commit/609497471441273367013c09a1e0e1c990726ec7])
>  in partition iterator architecture causes the wrong behavior.
> When concatenated iterator returned from the 
> [StorageProxy.fetchRows(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/service/StorageProxy.java#L1770],
>  Cassandra starts to consume this combined iterator. Because of 
> {{DigestMismatch}} some elements of this combined iterator contains 
> additional {{ThriftCounter}}, that was added during 
> [DataResolver.resolve(...)|https://github.com/apache/cassandra/blob/ee9e06b5a75c0be954694b191ea4170456015b98/src/java/org/apache/cassandra/service/reads/DataResolver.java#L120]
>  execution. While consuming iterator for many partitions, Cassandra calls 
> [BaseIterator.tryGetMoreContents(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/db/transform/BaseIterator.java#L115]
>  method that must switch from one partition iterator to another in case of 
> the devastation of former. In this case, all transformations for next 
> iterator applied to the whole BaseIterator that enumerate many partitions 
> sequence. This behavior causes iterator to stop enumeration after it fully 
> consumes partition with the {{DigestMismatch}} error because this partition 
> has addition {{ThriftCounter}} data limit that was applied to the whole 
> composite iterator.
> The attachment contains the python2 script [^small_repro_script.py] that 
> reproduces this bug within 3-nodes ccmlib controlled cluster. Also, there is 
> an extended version of this script - [^repro_script.py] - that contains more 
> logging information and provides the ability to test behavior for many 
> Cassandra versions (to run all test cases from repro_script.py you can call 
> {{python -m unittest2 -v repro_script.ThriftMultigetTestCase}}). All the 
> necessary dependencies contained in the [^requirements.txt]
>  
> This bug is critical in our production environment because we can't permit 
> any data skip.
> Any ideas about a patch for this issue?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-14812) Multiget Thrift query processor skips records in case of digest mismatch

[jira] [Updated] (CASSANDRA-14812) Multiget Thrift query processor skips records in case of digest mismatch

[jira] [Updated] (CASSANDRA-14812) Multiget Thrift query processor skips records in case of digest mismatch

[jira] [Updated] (CASSANDRA-14812) Multiget Thrift query processor skips records in case of digest mismatch

[jira] [Updated] (CASSANDRA-14812) Multiget Thrift query processor skips records in case of digest mismatch

[jira] [Updated] (CASSANDRA-14812) Multiget Thrift query processor skips records in case of digest mismatch

[jira] [Updated] (CASSANDRA-14812) Multiget Thrift query processor skips records in case of digest mismatch

7 matches

Site Navigation

Mail list logo

Footer information