kfaraz commented on code in PR #18844:
URL: https://github.com/apache/druid/pull/18844#discussion_r2660439181
##########
server/src/main/java/org/apache/druid/metadata/SqlSegmentsMetadataQuery.java:
##########
@@ -1701,6 +1705,132 @@ private SegmentSchemaRecord mapToSchemaRecord(ResultSet resultSet)
}
}
+ /**
+ * Retrieves all unique compaction state fingerprints currently referenced by used segments.
+ * This is used for delta syncs to determine which fingerprints are still active.
+ *
+ * @return Set of compaction state fingerprints
+ */
+ public Set<String> retrieveAllUsedCompactionStateFingerprints()
+ {
+ final String sql = StringUtils.format(
+ "SELECT DISTINCT compaction_state_fingerprint FROM %s "
+ + "WHERE used = true AND compaction_state_fingerprint IS NOT NULL",
+ dbTables.getSegmentsTable()
+ );
+
+ return Set.copyOf(
+ handle.createQuery(sql)
+ .setFetchSize(connector.getStreamingFetchSize())
+ .mapTo(String.class)
+ .list()
+ );
+ }
+
+ /**
+ * Retrieves all compaction states for used segments (full sync).
+ * Fetches from compaction_states table where the fingerprint is referenced by used segments.
+ *
+ * @return List of CompactionStateRecord objects
+ */
+ public List<CompactionStateRecord> retrieveAllUsedCompactionStates()
+ {
+ final String sql = StringUtils.format(
+ "SELECT cs.fingerprint, cs.payload FROM %s cs "
+ + "WHERE cs.used = true "
+ + "AND cs.fingerprint IN ("
+ + " SELECT DISTINCT compaction_state_fingerprint FROM %s "
+ + " WHERE used = true AND compaction_state_fingerprint IS NOT NULL"
+ + ")",
+ dbTables.getCompactionStatesTable(),
+ dbTables.getSegmentsTable()
Review Comment:
Let's skip the segments table here and query the compaction states table
directly.
It is okay to retrieve some compaction states that are not currently
referenced by any used segment. The background cleanup threads will take care
of the stale references, and subsequent syncs of the cache will add or remove
compaction states in the in-memory cache as applicable.
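As a rough illustration of the suggestion, the query could drop the subquery on the segments table and filter only on the `used` flag of the compaction states table. This is a minimal sketch, not the PR's actual code: the class and method names are hypothetical, plain `String.format` stands in for Druid's `StringUtils.format`, and the table name is passed in rather than read from `dbTables`. The column names (`fingerprint`, `payload`, `used`) are taken from the diff above.

```java
// Hypothetical helper, for illustration only.
final class CompactionStateQuerySketch
{
  private CompactionStateQuerySketch() {}

  /**
   * Builds a query that reads used compaction states directly from the
   * compaction states table, without consulting the segments table.
   * States that no used segment references may be returned as well.
   */
  static String buildUsedCompactionStatesSql(String compactionStatesTable)
  {
    return String.format(
        "SELECT fingerprint, payload FROM %s WHERE used = true",
        compactionStatesTable
    );
  }
}
```

In the method under review, the resulting string would replace the two-table query, with `dbTables.getCompactionStatesTable()` as the only format argument.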