[ https://issues.apache.org/jira/browse/BEAM-14558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17550154#comment-17550154 ]
Danny McCormick commented on BEAM-14558:
----------------------------------------

This issue has been migrated to https://github.com/apache/beam/issues/21715

> Data missing when using CassandraIO.Read
> ----------------------------------------
>
>                 Key: BEAM-14558
>                 URL: https://issues.apache.org/jira/browse/BEAM-14558
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-cassandra
>    Affects Versions: 2.34.0, 2.35.0, 2.36.0, 2.37.0, 2.38.0, 2.39.0
>            Reporter: Christophe ROQUETTE
>            Priority: P1
>
> h2. Bug
> Data at the beginning or end of the token ring is never retrieved, due to a
> bad TokenRange request.
> This bug was introduced by BEAM-9008, in
> [this commit|https://github.com/apache/beam/commit/e12fc33e55e23db9f2aee330039d16dace34f9aa].
> A basic reproduction case and workarounds are available here:
> [Github/beam-cassandraio-bug|https://github.com/KriKroff/beam-cassandraio-bug]
>
> h2. Description
> When using {{CassandraIO}}, a list of token ranges is requested from the C*
> nodes in order to create splits within those ranges.
> Each split is represented as a RingRange, which results in a request to C* of
> the form
> {{TOKEN(partition_key) >= range_start AND TOKEN(partition_key) < range_end}}
> The token ring goes from Long.MIN_VALUE to Long.MAX_VALUE (so -2xxx to 2xxx).
> A range may contain the "join point" of the ring and is then represented as
> [2xxx, -2xxx].
> In this case (i.e. TokenRange isWrapping), the old implementation sent 2
> separate requests:
> * {{TOKEN(partition_key) >= range_start}} (to get results up to the end of
> the ring, i.e. Long.MAX_VALUE)
> * {{TOKEN(partition_key) < range_end}} (to get results from the beginning of
> the ring, i.e. Long.MIN_VALUE)
> This behavior is no longer implemented: all token ranges are queried the same
> way, even in the wrapping case.
> This results in a request like:
> {{TOKEN(partition_key) >= 2xxx AND TOKEN(partition_key) < -2xxx}}
> No token can satisfy both conditions, so the request returns 0 results and
> the data in the wrapping range is never retrieved.
>
> h2. Workarounds
> * Downgrade to 2.33.0
> * Use custom TokenRanges and a readAll implementation
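
The wrapping-range handling described in the issue can be sketched in plain Java as follows. This is a minimal illustration, not Beam's actual code: the class `WrappingTokenRange` and all of its methods are hypothetical names, the sample range values are made up, and the sketch assumes a range counts as wrapping when its start is greater than or equal to its end. It compares the single combined predicate, which matches nothing for a wrapping range, with the two-request form that the old implementation is described as using.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Beam's CassandraIO API.
final class WrappingTokenRange {

  // Assumption of this sketch: a range wraps around the ring when start >= end.
  static boolean isWrapping(long start, long end) {
    return start >= end;
  }

  // Single combined predicate, as in the buggy request from the issue.
  // For a wrapping range no token can satisfy both conditions.
  static boolean naiveMatches(long token, long start, long end) {
    return token >= start && token < end;
  }

  // Two-sided form: a wrapping range is the union of
  // [start, Long.MAX_VALUE] and [Long.MIN_VALUE, end).
  static boolean splitMatches(long token, long start, long end) {
    return isWrapping(start, end)
        ? (token >= start || token < end)
        : naiveMatches(token, start, end);
  }

  // Builds the WHERE conditions to send: two separate requests for a wrapping
  // range, one combined request otherwise.
  static List<String> whereClauses(String partitionKey, long start, long end) {
    String token = "TOKEN(" + partitionKey + ")";
    List<String> clauses = new ArrayList<>();
    if (isWrapping(start, end)) {
      clauses.add(token + " >= " + start); // up to the end of the ring
      clauses.add(token + " < " + end);    // from the beginning of the ring
    } else {
      clauses.add(token + " >= " + start + " AND " + token + " < " + end);
    }
    return clauses;
  }

  public static void main(String[] args) {
    long start = 9_000L;  // made-up values standing in for 2xxx and -2xxx
    long end = -9_000L;
    for (String clause : whereClauses("partition_key", start, end)) {
      System.out.println(clause);
    }
    System.out.println("naive matches 9500: " + naiveMatches(9_500L, start, end));
    System.out.println("split matches 9500: " + splitMatches(9_500L, start, end));
  }
}
```

In a real pipeline each returned clause would be issued as its own query and the results combined; how that maps onto the custom TokenRanges and readAll workaround is left to the reproduction repository linked in the issue.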