AmatyaAvadhanula commented on code in PR #12792:
URL: https://github.com/apache/druid/pull/12792#discussion_r930653764


##########
extensions-core/kinesis-indexing-service/src/main/java/org/apache/druid/indexing/kinesis/KinesisSequenceNumber.java:
##########
@@ -93,21 +101,57 @@ public static boolean isValidAWSKinesisSequence(String sequenceNumber)
     return !(END_OF_SHARD_MARKER.equals(sequenceNumber)
              || NO_END_SEQUENCE_NUMBER.equals(sequenceNumber)
              || EXPIRED_MARKER.equals(sequenceNumber)
+             || UNREAD_TRIM_HORIZON.equals(sequenceNumber)
+             || UNREAD_LATEST.equals(sequenceNumber)
       );
   }
 
   @Override
   public int compareTo(OrderedSequenceNumber<String> o)
   {
     KinesisSequenceNumber num = (KinesisSequenceNumber) o;
+    if (isUnread() && num.isUnread()) {
+      return 0;
+    } else if (isUnread()) {
+      return -1;
+    } else if (num.isUnread()) {
+      return 1;
+    }
     if (isMaxSequenceNumber && num.isMaxSequenceNumber) {
       return 0;
     } else if (isMaxSequenceNumber) {
       return 1;
     } else if (num.isMaxSequenceNumber) {
       return -1;
-    } else {
-      return this.intSequence.compareTo(new BigInteger(o.get()));
     }
+    return this.intSequence.compareTo(new BigInteger(o.get()));
+  }
+
+  @Override
+  public boolean isAvailableWithEarliest(OrderedSequenceNumber<String> earliest)
+  {
+    if (isUnread()) {
+      return true;
+    }
+    return super.isAvailableWithEarliest(earliest);
+  }
+
+  @Override
+  public boolean isMoreToReadBeforeReadingRecord(OrderedSequenceNumber<String> end)
+  {
+    if (isUnreadSequence(end.get())) {

Review Comment:
   > Kinesis sequence numbers are inclusive, i.e., if current sequence == end sequence, there are more records left to read. However, the equality check is exclusive when dealing with UNREAD tokens.
   
   Yes, this matters once the supervisor has finalized the end offsets using the current offsets.
   Partition assignment is (re)computed only for shards that have not been read AND have not caught up.
   The other shards have already been read from, so they are not assigned.
   Unread shards need to be caught up before the task can stop. The current check for Kinesis sequence numbers is exclusive, so even if the end offset is UNREAD, the shard will be assigned and the task would not stop early; it would wait for the supervisor instead.
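
   To make the ordering concrete, here is a minimal, self-contained sketch of the comparison rules in the diff above. It is not the Druid class: `UnreadOrderingSketch`, the `MAX` marker, and the string values of the markers are hypothetical placeholders. The body of `isMoreToRead` is also an assumption, since the diff cuts off right after the `isUnreadSequence(end.get())` check; it assumes an UNREAD end offset means nothing more to read, and an inclusive comparison otherwise.

```java
import java.math.BigInteger;

// Hypothetical, simplified stand-in for KinesisSequenceNumber (not the Druid class).
final class UnreadOrderingSketch
{
  // Placeholder marker values; the real constants' values are not shown in the diff.
  static final String UNREAD_TRIM_HORIZON = "UNREAD_TRIM_HORIZON";
  static final String UNREAD_LATEST = "UNREAD_LATEST";
  // Placeholder for the markers that set isMaxSequenceNumber in the real class.
  static final String MAX = "MAX";

  static boolean isUnread(String seq)
  {
    return UNREAD_TRIM_HORIZON.equals(seq) || UNREAD_LATEST.equals(seq);
  }

  // Same order of checks as the compareTo in the diff:
  // UNREAD sorts lowest, MAX sorts highest, everything else compares numerically.
  static int compare(String a, String b)
  {
    if (isUnread(a) && isUnread(b)) {
      return 0;
    } else if (isUnread(a)) {
      return -1;
    } else if (isUnread(b)) {
      return 1;
    }
    if (MAX.equals(a) && MAX.equals(b)) {
      return 0;
    } else if (MAX.equals(a)) {
      return 1;
    } else if (MAX.equals(b)) {
      return -1;
    }
    return new BigInteger(a).compareTo(new BigInteger(b));
  }

  // Assumption: an UNREAD end offset is treated exclusively (nothing left to read);
  // any other end offset is inclusive, so current == end still has a record to read.
  static boolean isMoreToRead(String current, String end)
  {
    if (isUnread(end)) {
      return false;
    }
    return compare(current, end) <= 0;
  }

  public static void main(String[] args)
  {
    System.out.println(isMoreToRead("7", "7"));
    System.out.println(isMoreToRead(UNREAD_LATEST, UNREAD_LATEST));
  }
}
```

   Under these assumptions, a shard whose current and end offsets are both UNREAD reports no more records to read, while a numeric shard with current == end still reports one.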



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

