Christophe ROQUETTE created BEAM-14558:
------------------------------------------

             Summary: Data missing when using CassandraIO.Read
                 Key: BEAM-14558
                 URL: https://issues.apache.org/jira/browse/BEAM-14558
             Project: Beam
          Issue Type: Bug
          Components: io-java-cassandra
    Affects Versions: 2.39.0, 2.38.0, 2.37.0, 2.36.0, 2.35.0, 2.34.0
            Reporter: Christophe ROQUETTE


h2. Bug

Data at the beginning or end of the token ring is never retrieved, due to a bad 
TokenRange request.

This bug was introduced by BEAM-9008, in [this 
commit|https://github.com/apache/beam/commit/e12fc33e55e23db9f2aee330039d16dace34f9aa].

A basic reproduction case & workarounds are available here:

[Github/beam-cassandraio-bug|https://github.com/KriKroff/beam-cassandraio-bug]
h2. Description

When using {{CassandraIO}}, a list of token ranges is requested from the C* nodes 
in order to create splits in those ranges.
Each split is represented as a {{RingRange}}, which results in a request to C* of the 
form
{{TOKEN(partition_key) >= range_start AND TOKEN(partition_key) < range_end}}

The token ring goes from Long.MIN_VALUE to Long.MAX_VALUE (so -2xxx to 2xxx), so a 
range may contain the "join point" and be represented by [2xxx, -2xxx].

In this case (i.e. TokenRange isWrapping), the old implementation used to send 2 
different requests:
 * {{TOKEN(partition_key) >= range_start}} (to get results up to the end of the 
ring, i.e. Long.MAX_VALUE)
 * {{TOKEN(partition_key) < range_end}} (to get results from the beginning 
of the ring, i.e. Long.MIN_VALUE)
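
The two-request behavior described above can be sketched as follows. This is only an illustration, not the actual CassandraIO code: the class {{WrappingRangeQueries}} and the method {{predicates}} are hypothetical names, and a range is assumed to be wrapping when its start is greater than or equal to its end.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Beam: builds the WHERE predicates
// needed to cover the token range [start, end) on the ring.
public class WrappingRangeQueries {

    static List<String> predicates(String partitionKey, long start, long end) {
        String token = "TOKEN(" + partitionKey + ")";
        List<String> result = new ArrayList<>();
        if (start < end) {
            // Regular range: a single bounded predicate is enough.
            result.add(token + " >= " + start + " AND " + token + " < " + end);
        } else {
            // Wrapping range (assumed: start >= end): two separate requests,
            // one up to the end of the ring, one from its beginning.
            result.add(token + " >= " + start);
            result.add(token + " < " + end);
        }
        return result;
    }
}
```

For a wrapping range such as [100, -100], this sketch yields two predicates instead of one, so both ends of the ring are queried.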

Now, this behavior is no longer implemented and all token ranges are queried 
the same way, even in the wrapping case.
This results in a request like:
{{TOKEN(partition_key) >= 2xxx AND TOKEN(partition_key) < -2xxx}}
No token can satisfy both conditions, so this returns 0 results and the data in 
the wrapping range is never retrieved.
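
To make the failure concrete, here is a minimal sketch that evaluates both forms of the condition on plain {{long}} tokens. It does not talk to C* and is not the Beam implementation; {{WrappingRangeCheck}} and its methods are hypothetical names, and the wrapping test ({{start >= end}}) is an assumption.

```java
// Hypothetical model of the two query shapes, evaluated locally on long tokens.
public class WrappingRangeCheck {

    // Single conjunctive predicate, applied to every range.
    static boolean matchesSingleQuery(long token, long start, long end) {
        return token >= start && token < end;
    }

    // Wrapping ranges are covered by two independent requests (logical OR).
    static boolean matchesSplitQueries(long token, long start, long end) {
        if (start < end) {
            return token >= start && token < end;
        }
        return token >= start || token < end;
    }
}
```

With {{start = Long.MAX_VALUE - 100}} and {{end = Long.MIN_VALUE + 100}}, a token of {{Long.MAX_VALUE - 1}} is rejected by {{matchesSingleQuery}} but accepted by {{matchesSplitQueries}}, which mirrors the missing rows at the ends of the ring.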

 
h2. Workarounds
 * Downgrade to 2.33.0
 * Use custom TokenRanges & a readAll implementation



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
