Jackie-Jiang commented on code in PR #17503:
URL: https://github.com/apache/pinot/pull/17503#discussion_r2894399667
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/UpsertContext.java:
##########
@@ -192,6 +192,16 @@ public File getTableIndexDir() {
return _tableIndexDir;
}
+ /**
+ * Returns true if the table configuration has settings that can lead to inconsistent upsert metadata
+ * during segment replacement after force commit. This happens when:
+ * - Partial upsert is enabled (records need to be merged with previous values)
+ * - dropOutOfOrderRecord is enabled with NONE consistency mode (records may have been dropped)
+ */
+ public boolean isTableTypeInconsistentDuringConsumption() {
Review Comment:
I don't follow this name. What does "table type inconsistent" stand for?
Certain setups require upsert/dedup metadata to be consistent across
replicas during consumption, which is why I was suggesting the name
`requireConsistentMetadataDuringConsumption`.
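To illustrate the suggested name, here is a minimal sketch of how the check could read. It assumes the two conditions listed in the Javadoc above; the class `UpsertSettingsSketch`, its enums, and its fields are hypothetical stand-ins and not the actual `UpsertContext` API.

```java
// Hypothetical, simplified stand-in for the relevant part of UpsertContext.
// All type and field names here are assumptions made for illustration only.
final class UpsertSettingsSketch {
  enum UpsertMode { FULL, PARTIAL }
  enum ConsistencyMode { NONE, SYNC, SNAPSHOT }

  private final UpsertMode _upsertMode;
  private final boolean _dropOutOfOrderRecord;
  private final ConsistencyMode _consistencyMode;

  UpsertSettingsSketch(UpsertMode upsertMode, boolean dropOutOfOrderRecord, ConsistencyMode consistencyMode) {
    _upsertMode = upsertMode;
    _dropOutOfOrderRecord = dropOutOfOrderRecord;
    _consistencyMode = consistencyMode;
  }

  /**
   * Returns true when the setup requires upsert metadata to stay consistent
   * across replicas during consumption (the two cases from the Javadoc).
   */
  boolean requireConsistentMetadataDuringConsumption() {
    // Partial upsert merges each record with the previous values.
    if (_upsertMode == UpsertMode.PARTIAL) {
      return true;
    }
    // Out-of-order records may have been dropped without a consistency mode.
    return _dropOutOfOrderRecord && _consistencyMode == ConsistencyMode.NONE;
  }
}
```

With this name the call site states a requirement (`if (context.requireConsistentMetadataDuringConsumption()) { ... }`) rather than describing the table as inconsistent.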
##########
pinot-spi/src/main/java/org/apache/pinot/spi/utils/ConsumingSegmentConsistencyModeListener.java:
##########
@@ -0,0 +1,134 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.spi.utils;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.atomic.AtomicReference;
+import org.apache.pinot.spi.config.provider.PinotClusterConfigChangeListener;
+import org.apache.pinot.spi.utils.CommonConstants.ConfigChangeListenerConstants;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+
+/**
+ * Singleton class to manage the configuration for force commit on consuming segments
+ * for upsert tables with inconsistent state configurations (partial upsert or dropOutOfOrderRecord=true or
+ * outOfOrderColumn). By default, the force commit and reload is disabled
+ * This configuration is dynamically updatable via ZK cluster config without requiring a server restart.
+ */
+public class ConsumingSegmentConsistencyModeListener implements PinotClusterConfigChangeListener {
+ private static final Logger LOGGER = LoggerFactory.getLogger(ConsumingSegmentConsistencyModeListener.class);
+ private static final ConsumingSegmentConsistencyModeListener INSTANCE = new ConsumingSegmentConsistencyModeListener();
+
+ public enum Mode {
+ /**
+ * Force commit is disabled for tables with inconsistent state configurations.
+ * Safe option that prevents potential data inconsistency issues.
+ */
+ RESTRICTED(false),
+
+ /**
+ * Force commit is enabled but tables with partial upsert or dropOutOfOrderRecord=true (with replication > 1)
+ * will have their upsert metadata reverted when inconsistencies are detected.
+ */
+ PROTECTED(true),
+
+ /**
+ * Force commit is enabled for all tables regardless of their configuration.
+ * Use with caution as this may cause data inconsistency for partial-upsert tables
+ * or upsert tables with dropOutOfOrderRecord/outOfOrderRecordColumn enabled when replication > 1.
+ * Inconsistency checks and metadata revert are skipped.
+ */
+ UNSAFE(true);
+
+ public static final Mode DEFAULT_CONSUMING_SEGMENT_CONSISTENCY_MODE = RESTRICTED;
Review Comment:
(nit) Given the class structure, we can name it `DEFAULT_MODE` for concision
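A rough sketch of the enum with the shorter constant name follows. The constants and their boolean arguments are taken from the diff above; the wrapper class `ConsistencyModeSketch`, the field `_forceCommitAllowed`, and its getter are hypothetical names, and reading the boolean as "force commit allowed" is an assumption based on the Javadoc.

```java
// Hypothetical wrapper so the sketch compiles on its own; in the PR the enum
// is nested inside ConsumingSegmentConsistencyModeListener.
final class ConsistencyModeSketch {
  enum Mode {
    RESTRICTED(false),
    PROTECTED(true),
    UNSAFE(true);

    // The enclosing type already says "consistency mode", so the short name is enough.
    static final Mode DEFAULT_MODE = RESTRICTED;

    // Assumed meaning of the constructor argument: whether force commit is allowed.
    private final boolean _forceCommitAllowed;

    Mode(boolean forceCommitAllowed) {
      _forceCommitAllowed = forceCommitAllowed;
    }

    boolean isForceCommitAllowed() {
      return _forceCommitAllowed;
    }
  }

  private ConsistencyModeSketch() {
  }
}
```

Callers would then write `Mode.DEFAULT_MODE` instead of `Mode.DEFAULT_CONSUMING_SEGMENT_CONSISTENCY_MODE`.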