findingrish commented on code in PR #17025:
URL: https://github.com/apache/druid/pull/17025#discussion_r1756105964
##########
server/src/main/java/org/apache/druid/segment/metadata/AbstractSegmentMetadataCache.java:
##########
@@ -732,18 +718,35 @@ public Set<SegmentId> refreshSegmentsForDataSource(final String dataSource, fina
log.debug("Refreshing metadata for datasource[%s].", dataSource);
+ final Set<SegmentId> retVal = new HashSet<>();
+
+ ConcurrentSkipListMap<SegmentId, AvailableSegmentMetadata> datasourceSegments = segmentMetadataInfo.get(dataSource);
+ // this datasource no longer exists, skip refresh
+ if (datasourceSegments == null) {
+ return retVal;
+ }
+
+ // Skip refreshing tombstone segments. These segments lack data or column information.
+ // Additionally, segment metadata queries, which are not yet implemented for tombstone segments
+ // (see: https://github.com/apache/druid/pull/12137) do not provide metadata for tombstones,
+ // leading to indefinite refresh attempts for these segments.
+ Set<SegmentId> segmentsWithoutTombstone =
Review Comment:
I changed the approach yesterday. Instead of filtering out the tombstone
segments at the end, just before the refresh, I now ensure they are never
marked for refresh in the first place.
A segment is marked for refresh in the following scenarios:
* Segment is added.
* Datasource signature is built and schema for segment is missing.
* Metadata for the segment is fetched and schema for the segment is missing.
I have ensured that a tombstone segment is never marked for refresh in any
of these cases.
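The idea can be illustrated with a minimal sketch. It is not the code from this PR: `RefreshTracker`, `Segment`, and all method names below are hypothetical stand-ins, and the real cache tracks far more state. The sketch only shows the three scenarios above funnelling through a single guard that skips tombstones.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical, simplified stand-in for the cache's refresh bookkeeping.
class RefreshTracker {
  // Hypothetical minimal segment descriptor; this is not Druid's DataSegment.
  static final class Segment {
    final String id;
    final boolean tombstone;

    Segment(String id, boolean tombstone) {
      this.id = id;
      this.tombstone = tombstone;
    }
  }

  private final Set<String> segmentsNeedingRefresh = new ConcurrentSkipListSet<>();

  // Single guard shared by every scenario: tombstones are never queued.
  // Returns true if the segment is eligible and is now queued for refresh.
  boolean markForRefresh(Segment segment) {
    if (segment.tombstone) {
      return false;
    }
    segmentsNeedingRefresh.add(segment.id);
    return true;
  }

  // Scenario 1: a segment is added.
  void onSegmentAdded(Segment segment) {
    markForRefresh(segment);
  }

  // Scenario 2: the datasource signature is built and the schema is missing.
  void onSchemaMissingWhileBuildingSignature(Segment segment) {
    markForRefresh(segment);
  }

  // Scenario 3: segment metadata is fetched and the schema is missing.
  void onSchemaMissingAfterMetadataFetch(Segment segment) {
    markForRefresh(segment);
  }

  Set<String> getSegmentsNeedingRefresh() {
    return segmentsNeedingRefresh;
  }
}
```

Because no path can queue a tombstone, the refresh step needs no filtering of its own.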
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]