[ https://issues.apache.org/jira/browse/FLINK-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16693524#comment-16693524 ]

ASF GitHub Bot commented on FLINK-5697:
---------------------------------------

mxm commented on a change in pull request #6980: [FLINK-5697] [kinesis] Add periodic per-shard watermark support
URL: https://github.com/apache/flink/pull/6980#discussion_r234580470
 
 

 ##########
 File path: flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java
 ##########
 @@ -78,6 +79,22 @@
  * A custom assigner implementation can be set via {@link #setShardAssigner(KinesisShardAssigner)} to optimize the
  * hash function or use static overrides to limit skew.
  *
+ * <p>In order for the consumer to emit watermarks, a timestamp assigner needs to be set via {@link
+ * #setPeriodicWatermarkAssigner(AssignerWithPeriodicWatermarks)} and the auto watermark emit
+ * interval configured via {@link
+ * org.apache.flink.api.common.ExecutionConfig#setAutoWatermarkInterval(long)}.
+ *
+ * <p>Watermarks can only advance when all shards of a subtask continuously deliver records. To
+ * avoid an inactive or closed shard to block the watermark progress, the idle timeout should be
+ * configured via configuration property {@link
+ * ConsumerConfigConstants#SHARD_IDLE_INTERVAL_MILLIS}. By default, shards won't be considered
+ * idle and watermark calculation will wait for newer records to arrive from all shards.
+ *
+ * <p>Note that re-sharding of the Kinesis stream while an application (that relies on
+ * the Kinesis records for watermarking) is running can lead to incorrect late events.
+ * This depends on how shards are assigned to subtasks and applies regardless of whether watermarks
+ * are generated in the source or a downstream operator.
 
 Review comment:
   Good to mention this here. The re-sharding logic can corrupt the 
watermarking logic, but that is not unique to this change. 
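
   To make the idle-timeout behaviour described in the Javadoc above concrete,
   here is a minimal, dependency-free sketch of how a per-subtask watermark
   could be derived from per-shard state. It is an illustration only and not
   the code in this pull request: the class ShardWatermarkSketch and all of
   its methods are hypothetical names, and the assumption is simply that the
   subtask watermark is the minimum of the highest timestamps seen on each
   non-idle shard.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of per-shard watermark tracking; not the connector's implementation. */
class ShardWatermarkSketch {

    /** Per-shard bookkeeping: highest event timestamp seen and when the last record arrived. */
    private static final class ShardState {
        long maxTimestamp = Long.MIN_VALUE;
        long lastRecordAtMillis;
    }

    private final Map<String, ShardState> shards = new HashMap<>();
    private final long idleTimeoutMillis; // <= 0 means shards are never considered idle
    private long lastEmitted = Long.MIN_VALUE;

    ShardWatermarkSketch(long idleTimeoutMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Registers a shard assigned to this subtask before any record has arrived from it. */
    void registerShard(String shardId, long nowMillis) {
        ShardState state = new ShardState();
        state.lastRecordAtMillis = nowMillis;
        shards.put(shardId, state);
    }

    /** Records the event timestamp of a record read from the given shard. */
    void onRecord(String shardId, long eventTimestamp, long nowMillis) {
        ShardState state = shards.computeIfAbsent(shardId, id -> new ShardState());
        state.maxTimestamp = Math.max(state.maxTimestamp, eventTimestamp);
        state.lastRecordAtMillis = nowMillis;
    }

    /**
     * Called periodically. Returns the minimum of the per-shard maximum timestamps,
     * skipping idle shards, and never moves backwards.
     */
    long currentWatermark(long nowMillis) {
        long min = Long.MAX_VALUE;
        for (ShardState state : shards.values()) {
            boolean idle = idleTimeoutMillis > 0
                    && nowMillis - state.lastRecordAtMillis >= idleTimeoutMillis;
            if (idle) {
                continue; // an idle shard no longer holds the watermark back
            }
            min = Math.min(min, state.maxTimestamp);
        }
        // If every shard is idle, min stays at MAX_VALUE and the watermark is left unchanged.
        if (min != Long.MAX_VALUE && min > lastEmitted) {
            lastEmitted = min;
        }
        return lastEmitted;
    }
}
```

   With no idle timeout, a shard that never delivers a record keeps the
   watermark at its initial value indefinitely. With a timeout, the same
   shard is skipped once the timeout has elapsed and the watermark follows
   the remaining active shards.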

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add per-shard watermarks for FlinkKinesisConsumer
> -------------------------------------------------
>
>                 Key: FLINK-5697
>                 URL: https://issues.apache.org/jira/browse/FLINK-5697
>             Project: Flink
>          Issue Type: New Feature
>          Components: Kinesis Connector, Streaming Connectors
>            Reporter: Tzu-Li (Gordon) Tai
>            Assignee: Thomas Weise
>            Priority: Major
>              Labels: pull-request-available
>
> It would be nice to bring the Kinesis consumer on par in functionality with 
> the Kafka consumer, since they share very similar abstractions. Per-partition 
> / per-shard watermarks are something we can also add to the Kinesis consumer.



