damccorm opened a new issue, #20563:
URL: https://github.com/apache/beam/issues/20563

   There is a bug in the Kinesis connector which doesn't allow Beam to read 
from new shards if restored from an old enough checkpoint.
   
   When loading from an old checkpoint `ShardReadersPool` will identify that 
the old shard is closed and only attempts to read from successive shards. 
However when trying to find successive shards it will only look for shards that 
were children of the old shard, these shards may not exist if the checkpoint is 
old enough as expired shards are not returned when listing shards 
([docs](https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-after-resharding.html))
   
   Example 1:
   
   Checkpoint:
   
   (Alive)
    Shard 1
   
   At restoration:
   
                      (Alive)
                    \> Shard 2 \
    (Exp)     /                    \  (Alive)
    Shard 1                        -\> Shard 4
                \   (Alive)      / 
                  \> Shard 3 /
   
   In this example the connector will currently work correctly as Shard 2 and 3 
will be identified as the successive shards and will continue reading from them.
   
   Example 2:
   
   Checkpoint:
   
   (Alive)
    Shard 1
   
   At restoration:
   
                       (Exp)
                    \> Shard 2 \
    (Exp)     /                    \  (Alive)
    Shard 1                        -\> Shard 4
                \    (Exp)       / 
                  \> Shard 3 /
   
   In this example the connector currently won't work correctly as it wont 
identify Shard 4 as a successive shard as its not a child of Shard 1 and thus 
stop reading from the stream when it should start reading from Shard 4.
   
    
   
   Imported from Jira 
[BEAM-11014](https://issues.apache.org/jira/browse/BEAM-11014). Original Jira 
may contain additional context.
   Reported by: usamj.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to