Github user hmcl commented on a diff in the pull request:
https://github.com/apache/storm/pull/2465#discussion_r157537115
--- Diff:
external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpout.java
---
@@ -225,6 +243,25 @@ private long doSeek(TopicPartition tp, OffsetAndMetadata committedOffset) {
}
}
+ /**
+ * Checks If {@link OffsetAndMetadata} was committed by an instance of {@link KafkaSpout} in this topology.
+ * This info is used to decide if {@link FirstPollOffsetStrategy} should be applied
+ *
+ * @param committedOffset {@link OffsetAndMetadata} info committed to Kafka
+ * @return true if this topology committed this {@link OffsetAndMetadata}, false otherwise
+ */
+ private boolean isOffsetCommittedByThisTopology(OffsetAndMetadata committedOffset) {
+ try {
+ final KafkaSpout.Info info = JSON_MAPPER.readValue(committedOffset.metadata(), KafkaSpout.Info.class);
+ return info.getTopologyId().equals(context.getStormId());
+ } catch (IOException e) {
+ LOG.trace("Failed to deserialize {}. Error likely occurred because the last commit " +
--- End diff --
These log messages are only printed when the last commit was made by a
topology running an older version of Storm, and only until this topology makes
its first commit for each topic-partition. After that first commit, the messages
will no longer occur. I will look into whether caching this info makes sense.
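
For reference, a rough sketch of what such caching might look like. This is not code from the PR: `CommitOriginCache`, its methods, and the string-keyed partitions are all hypothetical, and the metadata check is passed in as a plain predicate standing in for the JSON deserialization. It assumes that once this topology has committed for a topic-partition, later commits for that partition in the same consumer group also come from this topology; a real version would probably need to clear entries on rebalance.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

/** Hypothetical helper (not part of KafkaSpout): caches positive "committed by this topology" results. */
final class CommitOriginCache {
    // Topic-partitions (keyed by a string such as "topic-0") already confirmed as ours.
    private final Set<String> confirmed = ConcurrentHashMap.newKeySet();
    // Stand-in for the metadata deserialization and topology id comparison.
    private final Predicate<String> metadataCheck;
    // Counts how often the (potentially logging) check actually ran; for illustration only.
    private int checkCalls = 0;

    CommitOriginCache(Predicate<String> metadataCheck) {
        this.metadataCheck = metadataCheck;
    }

    boolean isCommittedByThisTopology(String topicPartition, String metadata) {
        if (confirmed.contains(topicPartition)) {
            return true; // already confirmed, skip parsing and logging
        }
        checkCalls++;
        boolean ours = metadataCheck.test(metadata);
        if (ours) {
            // Only positive results are cached: a negative result can flip
            // to positive after this topology's first commit.
            confirmed.add(topicPartition);
        }
        return ours;
    }

    int checkCalls() {
        return checkCalls;
    }
}
```

With this shape, the check would run for a partition only until the first commit from this topology is observed, which is the same window in which the trace messages appear today.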
---