Priyath Gregory created PULSAR-7:
------------------------------------
Summary: Possibility of data loss due to asynchronous
checkpointing in Kinesis consumer
Key: PULSAR-7
URL: https://issues.apache.org/jira/browse/PULSAR-7
Project: Pulsar
Issue Type: Bug
Reporter: Priyath Gregory
The KinesisRecordProcessor pushes each record to a blocking queue and
checkpoints shard progress at every checkpoint interval. At the point of
checkpointing, however, nothing guarantees that the records still pending in
the queue have been delivered, so an instance failure after a checkpoint can
lose those records.
Source:
[https://github.com/apache/pulsar/blob/master/pulsar-io/kinesis/src/main/java/org/apache/pulsar/io/kinesis/KinesisRecordProcessor.java]
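To make the failure mode concrete, here is a toy simulation of the sequence described above. It is not the connector's code: the class name, the use of plain long sequence numbers, and the "checkpointed" field are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class CheckpointRaceDemo {
    // Hypothetical stand-in for the shard checkpoint: the last sequence number marked as done.
    static long checkpointed = -1;

    public static List<Long> lostRecords() {
        BlockingQueue<Long> pending = new LinkedBlockingQueue<>();
        List<Long> delivered = new ArrayList<>();

        // The record processor receives sequence numbers 0..4 and enqueues them.
        for (long seq = 0; seq < 5; seq++) {
            pending.add(seq);
        }

        // Only the first two records are delivered before the checkpoint timer fires.
        delivered.add(pending.poll());
        delivered.add(pending.poll());

        // The checkpoint interval elapses: progress is recorded at the last *received* record.
        checkpointed = 4;

        // The instance fails here, and the in-memory queue content is gone.
        pending.clear();

        // After a restart, reading resumes after the checkpoint, so anything
        // at or below it that was never delivered is lost.
        List<Long> lost = new ArrayList<>();
        for (long seq = 0; seq <= checkpointed; seq++) {
            if (!delivered.contains(seq)) {
                lost.add(seq);
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        System.out.println("lost = " + lostRecords()); // prints "lost = [2, 3, 4]"
    }
}
```

Records 2 to 4 were covered by the checkpoint but never left the queue, which is the data loss this issue is about.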
The fix should ideally cover the following scenarios as far as I can see:
1. Periodic checkpointing at each checkpoint interval should be performed only
after the records pending up to that point have been processed.
2. The checkpoint made when onShardEnded is invoked should be performed only
after pending records are processed.
3. The checkpoint made when shutdownRequested is invoked should be performed
only after pending records are processed.
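One possible shape for such a fix is sketched below, assuming delivery acknowledgements are available per record. The class AckTrackingCheckpointer and its methods are hypothetical and are not part of the Pulsar or Kinesis client APIs. Sequence numbers are modeled as longs for brevity, whereas real Kinesis sequence numbers are opaque strings.

```java
import java.util.TreeMap;
import java.util.concurrent.TimeUnit;

public class AckTrackingCheckpointer {
    // Sequence number -> whether delivery has been confirmed (hypothetical bookkeeping).
    private final TreeMap<Long, Boolean> records = new TreeMap<>();
    // Highest sequence number whose predecessors have all been delivered; -1 if none yet.
    private long safe = -1;

    /** Called when a record is pushed to the queue. */
    public synchronized void received(long seq) {
        records.put(seq, false);
    }

    /** Called once delivery of a record has been confirmed. */
    public synchronized void delivered(long seq) {
        records.replace(seq, true);
        // Advance the safe point over the fully delivered prefix.
        while (!records.isEmpty() && records.firstEntry().getValue()) {
            safe = records.pollFirstEntry().getKey();
        }
        notifyAll();
    }

    /** Scenario 1: the periodic checkpoint uses this value instead of the last received record. */
    public synchronized long safeCheckpoint() {
        return safe;
    }

    /** Scenarios 2 and 3: wait for pending records to drain before the final checkpoint. */
    public synchronized boolean awaitDrained(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (!records.isEmpty()) {
                long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMs <= 0) {
                    return false; // timed out with records still pending
                }
                wait(remainingMs);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With this bookkeeping, a periodic checkpoint would never move past an undelivered record, and the shard-end and shutdown paths would checkpoint only if awaitDrained returns true. What to do when the wait times out is a design decision this sketch leaves open.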