kfaraz commented on code in PR #18844:
URL: https://github.com/apache/druid/pull/18844#discussion_r2624159447


##########
server/src/main/java/org/apache/druid/segment/metadata/CompactionStateManager.java:
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.segment.metadata;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.cache.Cache;
+import com.google.common.cache.CacheBuilder;
+import com.google.common.cache.CacheLoader;
+import com.google.common.collect.Lists;
+import com.google.common.util.concurrent.Striped;
+import com.google.inject.Inject;
+import org.apache.druid.error.InternalServerError;
+import org.apache.druid.guice.ManageLifecycle;
+import org.apache.druid.java.util.common.DateTimes;
+import org.apache.druid.java.util.common.ISE;
+import org.apache.druid.java.util.common.StringUtils;
+import org.apache.druid.java.util.common.lifecycle.LifecycleStart;
+import org.apache.druid.java.util.common.lifecycle.LifecycleStop;
+import org.apache.druid.java.util.emitter.EmittingLogger;
+import org.apache.druid.metadata.MetadataStorageTablesConfig;
+import org.apache.druid.metadata.SQLMetadataConnector;
+import org.apache.druid.timeline.CompactionState;
+import org.joda.time.DateTime;
+import org.skife.jdbi.v2.Handle;
+import org.skife.jdbi.v2.PreparedBatch;
+import org.skife.jdbi.v2.Query;
+import org.skife.jdbi.v2.SQLStatement;
+import org.skife.jdbi.v2.Update;
+
+import javax.annotation.Nonnull;
+import javax.annotation.Nullable;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.locks.Lock;
+
+/**
+ * Manages the persistence and retrieval of {@link CompactionState} objects in the metadata storage.
+ * <p>
+ * Compaction states are uniquely identified by their fingerprints, which are SHA-256 hashes of their content. A cache
+ * of compaction states using the fingerprints as keys is maintained in memory to optimize retrieval performance.
+ * </p>
+ * <p>
+ * A striped locking mechanism is used to ensure thread-safe persistence of compaction states on a per-datasource basis.
+ * </p>
+ */
+@ManageLifecycle
+public class CompactionStateManager

Review Comment:
   Yes, the plan was to deprecate `CompactSegments` once compaction supervisors 
took off. I don't fully recall whether compaction supervisors are already marked 
GA. They would also have to be made the default before we can start 
deprecating `CompactSegments`.
   
   But I feel all of this should be out of scope for the current PR.
   
   If supporting the fingerprint logic in `CompactSegments` is not additional 
work and does not complicate the flow, we can leave it as is.
   
   My only concern is that there should be just one service that is responsible 
for persisting new fingerprints. I would prefer that to be the Overlord, so 
that it always has a consistent cache state. So we either just don't support 
fingerprints on the Coordinator or we handle persistence by calling an Overlord 
API.
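   
   The second option (the Coordinator delegating persistence to the Overlord) could look roughly like the sketch below. Every type and method name here is hypothetical and does not exist in Druid or in this PR. An in-memory map stands in for the metadata table, a direct method call stands in for the Overlord API, and hashing the serialized state string with SHA-256 is an assumption about how the fingerprint is derived.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: all names below are hypothetical, not Druid classes.
final class FingerprintPersistenceSketch
{
  /** Single write path for fingerprints, shared by both services. */
  interface FingerprintPersister
  {
    /** Persists the state if it is not stored yet and returns its fingerprint. */
    String persistIfAbsent(String dataSource, String serializedState);
  }

  /** Overlord side: the only component that writes, so its view stays consistent. */
  static final class OverlordPersister implements FingerprintPersister
  {
    // Stand-in for the metadata table plus the in-memory cache.
    private final Map<String, String> statesByFingerprint = new ConcurrentHashMap<>();

    @Override
    public String persistIfAbsent(String dataSource, String serializedState)
    {
      final String fingerprint = sha256Hex(serializedState);
      // putIfAbsent keeps the first write and makes repeated calls idempotent.
      statesByFingerprint.putIfAbsent(fingerprint, serializedState);
      return fingerprint;
    }

    int persistedCount()
    {
      return statesByFingerprint.size();
    }
  }

  /** Coordinator side: never writes directly, only forwards to the Overlord. */
  static final class CoordinatorDelegate implements FingerprintPersister
  {
    // In a real deployment this would be a client for an Overlord API (assumption).
    private final FingerprintPersister overlord;

    CoordinatorDelegate(FingerprintPersister overlord)
    {
      this.overlord = overlord;
    }

    @Override
    public String persistIfAbsent(String dataSource, String serializedState)
    {
      return overlord.persistIfAbsent(dataSource, serializedState);
    }
  }

  /** Lower-case hex SHA-256 of the UTF-8 bytes of the given content. */
  static String sha256Hex(String content)
  {
    try {
      final MessageDigest digest = MessageDigest.getInstance("SHA-256");
      final byte[] hash = digest.digest(content.getBytes(StandardCharsets.UTF_8));
      final StringBuilder hex = new StringBuilder(hash.length * 2);
      for (byte b : hash) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    }
    catch (NoSuchAlgorithmException e) {
      // SHA-256 is a mandatory algorithm on every Java platform.
      throw new IllegalStateException("SHA-256 is not available", e);
    }
  }
}
```

With this shape, a fingerprint requested through `CoordinatorDelegate` and one requested directly from `OverlordPersister` resolve to the same stored entry, so only the Overlord's store and cache ever change.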
   
   (I have yet to go through the whole PR to identify all the call sites that 
may persist a compaction state. So far, I have only found the one in 
`CompactionConfigBasedJobTemplate`.)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

