jon-wei commented on a change in pull request #8870: Additional Kinesis resharding fixes
URL: https://github.com/apache/incubator-druid/pull/8870#discussion_r350567464
##########
File path:
indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisor.java
##########
@@ -3066,6 +3143,39 @@ protected abstract void updateLatestSequenceFromStream(
return makeSequenceNumber(seq, false);
}
+ /**
+ * If a task finishes reading a shard but no data was actually ingested, the task will not publish any segments.
+ * In that case, a separate message indicating that shards were closed needs to be sent to the supervisor,
+ * containing the IDs of the closed shards. The supervisor should mark those shards with the end-of-shard marker
+ * in metadata storage.
+ *
+ * @param closedShards Set of closed shards
+ * @return true if the update was successful
+ */
+ public boolean updateClosedShards(Set<PartitionIdType> closedShards)
+ {
+ // Mark partitions as closed in metadata
+ @SuppressWarnings("unchecked")
+ SeekableStreamDataSourceMetadata<PartitionIdType, SequenceOffsetType> currentMetadata =
+ (SeekableStreamDataSourceMetadata<PartitionIdType, SequenceOffsetType>) indexerMetadataStorageCoordinator.getDataSourceMetadata(
+ dataSource);
+
+ SeekableStreamDataSourceMetadata<PartitionIdType, SequenceOffsetType> cleanedMetadata =
+ createDataSourceMetadataWithClosedPartitions(currentMetadata, closedShards);
+
+ try {
+ boolean success = indexerMetadataStorageCoordinator.resetDataSourceMetadata(dataSource, cleanedMetadata);
+ if (!success) {
+ // If this fails, a subsequent task will be reassigned the closed shard and will eventually retry this.
+ log.error("Failed to update datasource metadata[%s] with expired partitions removed", cleanedMetadata);
Review comment:
This area has been removed; the metadata commit is now handled in the main
publishing logic.