Github user StephanEwen commented on a diff in the pull request:
https://github.com/apache/flink/pull/1341#discussion_r44531824
--- Diff:
flink-streaming-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer.java
---
@@ -567,6 +604,75 @@ public void notifyCheckpointComplete(long checkpointId) throws Exception {
 		}
 		return partitionsToSub;
 	}
+
+	/**
+	 * Thread to periodically commit the current read offset into Zookeeper.
+	 */
+	private static class PeriodicOffsetCommitter extends Thread {
+		private long commitInterval;
+		private volatile boolean running = true;
+		private FlinkKafkaConsumer consumer;
+		private final Object stateUpdateLock = new Object();
+
+		public PeriodicOffsetCommitter(long commitInterval, FlinkKafkaConsumer consumer) {
+			this.commitInterval = commitInterval;
+			this.consumer = consumer;
+		}
+
+		@Override
+		public void run() {
+			try {
+				while (running) {
+					try {
+						Thread.sleep(commitInterval);
+
+						// ------------ commit current offsets ----------------
+
+						// create copy of current offsets
+						long[] currentOffsets;
+						synchronized (stateUpdateLock) {
+							currentOffsets = Arrays.copyOf(consumer.lastOffsets, consumer.lastOffsets.length);
+						}
+
+						Map<TopicPartition, Long> offsetsToCommit = new HashMap<>();
+						//noinspection unchecked
+						for (TopicPartition tp : (List<TopicPartition>)consumer.subscribedPartitions) {
+							int partition = tp.partition();
+							long offset = currentOffsets[partition];
+							long lastCommitted = consumer.commitedOffsets[partition];
+
+							if (offset != OFFSET_NOT_SET) {
+								if (offset > lastCommitted) {
+									offsetsToCommit.put(tp, offset);
+									LOG.debug("Committing offset {} for partition {}", offset, partition);
+								} else {
+									LOG.debug("Ignoring offset {} for partition {} because it is already committed", offset, partition);
+								}
+							}
+						}
+
+						consumer.offsetHandler.commit(offsetsToCommit);
+					} catch (InterruptedException e) {
+						// looks like the thread is being closed. Leave loop
+						break;
--- End diff --
Good style is to rethrow the exception if `running` is still true, and to
leave the loop quietly only when `running` is false. That way the fetcher
learns about unexpected interruptions of the thread.
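
A minimal sketch of that pattern, under stated assumptions: `Committer`, its `close()` method, and the `AtomicReference` error holder are hypothetical stand-ins and not code from this PR. How the real committer would report the failure to the fetcher is left open here.

```java
import java.util.concurrent.atomic.AtomicReference;

class PeriodicCommitterSketch {

    /** Hypothetical stand-in for the PeriodicOffsetCommitter in the diff. */
    static class Committer extends Thread {
        private final long commitInterval;
        private volatile boolean running = true;
        // Assumed error channel; the real code would notify the fetcher instead.
        private final AtomicReference<Throwable> error;

        Committer(long commitInterval, AtomicReference<Throwable> error) {
            this.commitInterval = commitInterval;
            this.error = error;
        }

        @Override
        public void run() {
            try {
                while (running) {
                    try {
                        Thread.sleep(commitInterval);
                        // ... commit the current offsets here ...
                    } catch (InterruptedException e) {
                        if (running) {
                            // Nobody asked us to stop: surface the interruption.
                            throw e;
                        }
                        // Regular shutdown: leave the loop quietly.
                        break;
                    }
                }
            } catch (Throwable t) {
                // Record the failure so the owner can learn about it.
                error.set(t);
            }
        }

        /** Orderly shutdown: clear the flag first, then wake the thread. */
        void close() {
            running = false;
            interrupt();
        }
    }

    /** Starts a committer, stops it one way or the other, returns the recorded error. */
    static Throwable runAndInterrupt(boolean orderly) {
        AtomicReference<Throwable> error = new AtomicReference<>();
        Committer committer = new Committer(60_000L, error);
        committer.start();
        if (orderly) {
            committer.close();
        } else {
            committer.interrupt();
        }
        try {
            committer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return error.get();
    }
}
```

With this shape, `runAndInterrupt(true)` records no error, while `runAndInterrupt(false)` records the `InterruptedException`. Setting `running = false` before calling `interrupt()` is what lets the catch block tell the two cases apart.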
---