gianm commented on code in PR #17785:
URL: https://github.com/apache/druid/pull/17785#discussion_r2012764162


##########
indexing-service/src/main/java/org/apache/druid/indexing/overlord/http/OverlordDataSourcesResource.java:
##########
@@ -66,19 +67,22 @@ public class OverlordDataSourcesResource
   private static final Logger log = new 
Logger(OverlordDataSourcesResource.class);
 
   private final SegmentsMetadataManager segmentsMetadataManager;
+  private final IndexerMetadataStorageCoordinator metadataStorageCoordinator;

Review Comment:
   I just realized that the `SegmentsMetadataManager` is the Coordinator 
version of a metadata cache. Do you think we're moving towards getting rid of 
it completely on the Overlord side of things, in favor of using only 
`IndexerMetadataStorageCoordinator`?
   
   It looks like after this patch, the `SegmentsMetadataManager` is still used 
in two places on the OL:
   
   - this class, for `markAsUsedNonOvershadowedSegmentsInInterval`, 
`markAsUsedNonOvershadowedSegments`, and `markSegmentAsUsed`.
   - in `OverlordCompactionScheduler`, for 
`getSnapshotOfDataSourcesWithAllUsedSegments`. to me it seems like this could 
already be replaced by 
`IndexerSQLMetadataStorageCoordinator#retrieveAllUsedSegments`. (although the 
`SegmentsMetadataManager` version would perform better, since it caches the 
snapshot, and the `IndexerSQLMetadataStorageCoordinator` version needs to build 
a timeline. I am not sure how much this matters.)



##########
server/src/main/java/org/apache/druid/metadata/segment/cache/SegmentRecord.java:
##########
@@ -0,0 +1,106 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.metadata.segment.cache;
+
+import org.apache.druid.java.util.common.DateTimes;
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.timeline.SegmentId;
+import org.joda.time.DateTime;
+
+import javax.annotation.Nullable;
+import java.sql.ResultSet;
+
+/**
+ * Represents a single record in the druid_segments table.
+ */
+class SegmentRecord
+{
+  private static final Logger log = new Logger(SegmentRecord.class);
+
+  private final SegmentId segmentId;
+  private final boolean isUsed;
+  private final DateTime lastUpdatedTime;
+
+  SegmentRecord(SegmentId segmentId, boolean isUsed, DateTime lastUpdatedTime)
+  {
+    this.segmentId = segmentId;
+    this.isUsed = isUsed;
+    this.lastUpdatedTime = lastUpdatedTime;
+  }
+
+  public SegmentId getSegmentId()
+  {
+    return segmentId;
+  }
+
+  public boolean isUsed()
+  {
+    return isUsed;
+  }
+
+  public DateTime getLastUpdatedTime()
+  {
+    return lastUpdatedTime;
+  }
+
+  /**
+   * Creates a SegmentRecord from the given result set.
+   *
+   * @return null if an error occurred while reading the record.
+   */
+  @Nullable
+  static SegmentRecord fromResultSet(ResultSet r)
+  {
+    String serializedId = null;
+    String dataSource = null;
+    try {
+      serializedId = r.getString("id");
+      dataSource = r.getString("dataSource");
+
+      final DateTime lastUpdatedTime = nullSafeDate(r.getString(
+          "used_status_last_updated"));
+
+      final SegmentId segmentId = SegmentId.tryParse(dataSource, serializedId);
+      if (segmentId == null) {
+        log.noStackTrace().error(
+            "Could not parse Segment ID[%s] of datasource[%s]",
+            serializedId, dataSource
+        );
+        return null;
+      } else {
+        return new SegmentRecord(segmentId, true, lastUpdatedTime);
+      }
+    }
+    catch (Exception e) {
+      log.noStackTrace().error(

Review Comment:
   When might this happen? Does it mean we will ignore segments that we can't 
read? That would be worrying, because `Exception` is very broad. Is there some 
kind of scenario you have in mind where a failure is possible?
   
   Also, `.noStackTrace().error` is not a combination I expect to see. 
Typically `error` is reserved for serious problems, and for serious problems we 
should always have a stack trace.



##########
processing/src/main/java/org/apache/druid/common/utils/IdUtils.java:
##########
@@ -167,4 +169,24 @@ public static SegmentId getValidSegmentId(String 
dataSource, String serializedSe
       return parsedSegmentId;
     }
   }
+
+  /**
+   * Tries to parse the given serialized ID as {@link SegmentId}s of the given

Review Comment:
   This method is suspicious to me, since it encourages ignoring 
seemingly-invalid segment IDs. Where are the places that we are encountering 
segment IDs that may not be valid? Is it ok to ignore them?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to