shanthoosh commented on a change in pull request #918: SAMZA-2094: Implement 
the StartpointVisitor for the KafkaSystemConsumer.
URL: https://github.com/apache/samza/pull/918#discussion_r261470344
 
 

 ##########
 File path: samza-kafka/src/main/java/org/apache/samza/system/kafka/KafkaSystemConsumer.java
 ##########
 @@ -330,6 +320,73 @@ public String getSystemName() {
     return systemName;
   }
 
+  @VisibleForTesting
+  static class KafkaStartpointRegistrationHandler implements StartpointVisitor {
+
+    private final Consumer kafkaConsumer;
+
+    KafkaStartpointRegistrationHandler(Consumer kafkaConsumer) {
+      this.kafkaConsumer = kafkaConsumer;
+    }
+
+    @Override
+    public void visit(SystemStreamPartition systemStreamPartition, StartpointSpecific startpointSpecific) {
+      TopicPartition topicPartition = toTopicPartition(systemStreamPartition);
+      long offsetInStartpoint = Long.parseLong(startpointSpecific.getSpecificOffset());
+      LOG.info("Updating the consumer fetch offsets of topic partition: {} to {}.", topicPartition, offsetInStartpoint);
+
+      // KafkaConsumer is not thread-safe.
+      synchronized (kafkaConsumer) {
+        kafkaConsumer.seek(topicPartition, offsetInStartpoint);
+      }
+    }
+
+    @Override
+    public void visit(SystemStreamPartition systemStreamPartition, StartpointTimestamp startpointTimestamp) {
+      Long timestampInStartpoint = startpointTimestamp.getTimestampOffset();
+      TopicPartition topicPartition = toTopicPartition(systemStreamPartition);
+      Map<TopicPartition, Long> topicPartitionsToTimeStamps = ImmutableMap.of(topicPartition, timestampInStartpoint);
+
+      // Look up the offset by timestamp.
+      LOG.info("Looking up the offsets of the topic partition: {} by timestamp: {}.", topicPartition, timestampInStartpoint);
+      Map<TopicPartition, OffsetAndTimestamp> topicPartitionToOffsetTimestamps = kafkaConsumer.offsetsForTimes(topicPartitionsToTimeStamps);
 
 Review comment:
   Good question. 
   An excerpt from the Kafka javadoc linked in the above comment:
   ```
   If the message format version in a partition is before 0.10.0, i.e. the messages do not have timestamps, null will be returned for that partition
   ```
   
   The KafkaConsumer `offsetsForTimes` API would return null for a partition when used with a Kafka broker version < `0.10.0`.
   
   Starting from 1.0, Samza supports Kafka version `0.11.0.2`. According to the [kafka-compatibility-matrix](https://cwiki.apache.org/confluence/display/KAFKA/Compatibility+Matrix), the kafka-client version used by Samza is not compatible with broker versions < `0.10.0`.
   
   Since Samza doesn't support Kafka brokers < `0.10.0`, I'm not sure we should handle the null value returned by this API.
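
   If we do decide to guard against a null entry anyway, a minimal sketch could look like the following. The class name `OffsetLookupSketch` and the helper `offsetFor` are hypothetical and not part of this PR, and a small stand-in class replaces Kafka's `OffsetAndTimestamp` so that the snippet compiles on its own:

```java
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical sketch, not code from this PR.
class OffsetLookupSketch {

  // Stand-in for org.apache.kafka.clients.consumer.OffsetAndTimestamp,
  // reduced to the two accessors relevant here.
  static final class OffsetAndTimestamp {
    private final long offset;
    private final long timestamp;

    OffsetAndTimestamp(long offset, long timestamp) {
      this.offset = offset;
      this.timestamp = timestamp;
    }

    long offset() {
      return offset;
    }

    long timestamp() {
      return timestamp;
    }
  }

  // Returns the looked-up offset for the partition, or an empty OptionalLong
  // when the lookup result has no entry or a null entry for that partition.
  static <K> OptionalLong offsetFor(Map<K, OffsetAndTimestamp> lookupResult, K partition) {
    OffsetAndTimestamp entry = lookupResult.get(partition);
    return entry == null ? OptionalLong.empty() : OptionalLong.of(entry.offset());
  }
}
```

   The caller would then seek only when the result is present. What to do in the empty case (skip the seek, log, or fail) is left open here.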
