Copilot commented on code in PR #16594:
URL: https://github.com/apache/pinot/pull/16594#discussion_r2277320383


##########
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/task/TableTaskTypeConfig.java:
##########
@@ -0,0 +1,150 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.spi.config.table.task;
+
+import java.util.Collections;
+import java.util.Map;
+import java.util.Optional;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.exception.InvalidTableConfigException;
+import org.apache.pinot.spi.exception.InvalidTableTaskConfigException;
+import org.apache.pinot.spi.exception.TaskNotAllowedForTableException;
+import org.apache.pinot.spi.utils.CommonConstants;
+import org.quartz.CronScheduleBuilder;
+
+
+/**
+ * Base class for all table task type configurations.
+ * <p>
+ * Attributes in this class are generally managed by the Pinot controller 
rather than the task itself.
+ * Eg: schedule, priority, minionInstanceTag, etc
+ *
+ * Though some may be managed by the task but are still common across all task 
types. Eg: maxNumTasks.
+ *
+ * Attributes of subclass can be of 2 types
+ * 1. Required -> Such attributes should either have defaults or should fail 
validation if not set.
+ * 2. Optional -> Such attributes should be Optional<> so that the 
implementation knows that it can be empty.
+ *
+ * This setup helps -
+ * 1. Provide visibility and documentation of all configs and their defaults 
available for a task type.
+ * 2. Enable easier validation and correction of configs through above 
awareness.
+ * 3. Abstract backward compatibility / config names of configs making the 
generator / executor simpler.
+ */
+public abstract class TableTaskTypeConfig {
+
+  /**
+   * Schedule in quartz cron format. eg: "0 0/5 * * * ?" to run every 5 minutes
+   * The default value is null, which means the task is not scheduled by 
default (except if it's a default task)
+   */
+  protected final Optional<String> _schedule;
+
+  /**
+   * Minion Tag identifier to schedule a task on a dedicated pool of minion 
hosts which match the tag.
+   * Default value matches the default minion instance tag.
+   */
+  protected final String _minionInstanceTag;
+
+  /**
+   * Maximum number of sub-tasks that can be executed in a single trigger of 
the segment refresh task.
+   * This is used to control helix resource usage on controller.
+   * If more sub-tasks are needed to be merged / reloaded, multiple triggers 
of the task will be needed.
+   * This is going to be bounded by the cluster limit of max number of 
subtasks possible
+   */
+  private final int _maxNumSubTasks;
+
+  /**
+   * The actual user defined configuration for the task type.
+   * This is not supposed to be used by any task generator / executor
+   * but is used by controller for persisting user defined configs
+   * This is required because the attribute values in the subclass are 
modified based on defaults, corrections, etc
+   */
+  protected Map<String, String> _configs;
+
+  protected TableTaskTypeConfig(Map<String, String> configs) {
+    // TODO - Move the constants from pinot-controller to pinot-spi and use 
them here.
+    _schedule = Optional.ofNullable(configs.get("schedule"));
+    _minionInstanceTag = configs.getOrDefault("minionInstanceTag",
+        CommonConstants.Helix.UNTAGGED_MINION_INSTANCE);
+    _maxNumSubTasks = Integer.parseInt(configs.getOrDefault("tableMaxNumTasks",
+        String.valueOf(getDefaultMaxNumSubTasks())));
+    _configs = configs;
+  }
+
+  /**
+   * Validates the task type configuration.
+   * TODO - This should be called by PinotTaskGenerator before generating 
tasks. It shouldn't be called by each task.
+   * @return true if the configuration is valid, false otherwise.
+   */
+  public final void checkValidity(TableConfig tableConfig) throws 
InvalidTableConfigException {
+    validateCronSchedule();
+    // TODO - check minion instance tag is valid.
+    checkIfAllowedForTable(tableConfig);
+    checkTaskConfigValiditiy();
+  }
+
+  /**
+   * Corrects the task type configuration if needed to make it pass validations
+   * This will only work if the validation failed due to {@link 
InvalidTableTaskConfigException}.
+   * This is typically used when task generator runs but the current task type 
configuration is invalid
+   * due to modifications in validation logic across Pinot versions.
+   * TODO - This should be called by PinotTaskGenerator before generating 
tasks. It shouldn't be called by each task.
+   */
+  public abstract TableTaskTypeConfig getCorrectedConfig();
+
+  /**
+   * Returns an unmodifiable copy of the task type configuration.
+   */
+  public Map<String, String> getConfigs() {
+    return Collections.unmodifiableMap(_configs);
+  }
+
+  public int getMaxNumSubTasks() {
+    return _maxNumSubTasks;
+  }
+
+  // Can be overridden by subclasses
+  protected int getDefaultMaxNumSubTasks() {
+    return 1000;
+  }
+
+  /**
+   * Checks if the task type is allowed for the given table configuration.
+   * @throws TaskNotAllowedForTableException with message on why the 
configuration is not valid.
+   */
+  protected abstract void checkIfAllowedForTable(TableConfig tableConfig) 
throws TaskNotAllowedForTableException;
+
+  /**
+   * Validates the task type configuration.
+   * @throws InvalidTableTaskConfigException with message on why the 
configuration is not valid.
+   */
+  protected abstract void checkTaskConfigValiditiy() throws 
InvalidTableTaskConfigException;

Review Comment:
   Method name has a typo: 'checkTaskConfigValiditiy' should be 
'checkTaskConfigValidity'
   ```suggestion
     protected abstract void checkTaskConfigValidity() throws 
InvalidTableTaskConfigException;
   ```



##########
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/task/TableTaskConfigRegistry.java:
##########
@@ -0,0 +1,134 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.spi.config.table.task;
+
+import java.lang.reflect.Constructor;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Set;
+import org.apache.pinot.spi.annotations.table.TableTaskConfig;
+import org.apache.pinot.spi.utils.PinotReflectionUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+
+/**
+ * Registry for all {@link TableTaskTypeConfig}.
+ */
+public class TableTaskConfigRegistry {
+  private static final Logger LOGGER = 
LoggerFactory.getLogger(TableTaskConfigRegistry.class);
+
+  /**
+   * The package regex pattern for auto-registered {@link TableTaskTypeConfig}.
+   */
+  private static final String TABLE_TASK_CONFIG_PACKAGE_REGEX_PATTERN = 
".*\\.plugin\\.config\\.table\\.task\\..*";
+
+  private final Map<String, Class<? extends TableTaskTypeConfig>> 
_tableTaskConfigRegistry = new HashMap<>();
+
+  /**
+   * Registers the table task type configurations via reflection.
+   * NOTE: To plugin a class using reflection, the class should include 
".plugin.config.table.task" in its class
+   * path. This convention can significantly reduce the time of class scanning.
+   */
+  public TableTaskConfigRegistry() {
+    long startTimeMs = System.currentTimeMillis();
+    Set<Class<?>> classes = getTableTaskConfigClasses();
+    for (Class<?> clazz : classes) {
+      TableTaskConfig annotation = clazz.getAnnotation(TableTaskConfig.class);
+      if (annotation.enabled()) {
+        try {
+          // Validate that the class extends TableTaskTypeConfig
+          if (!TableTaskTypeConfig.class.isAssignableFrom(clazz)) {
+            LOGGER.error("Class {} does not extend TableTaskTypeConfig, 
skipping registration", clazz);
+            continue;
+          }
+
+          @SuppressWarnings("unchecked")
+          Class<? extends TableTaskTypeConfig> taskConfigClass = (Class<? 
extends TableTaskTypeConfig>) clazz;
+
+          // Extract task type from class name (e.g., SegmentPurgeTaskConfig 
-> SegmentPurgeTask)
+          String taskType = getTaskTypeFromClassName(clazz.getSimpleName());
+          registerTableTaskConfig(taskType, taskConfigClass);
+        } catch (Exception e) {
+          LOGGER.error("Caught exception while registering table task config: 
{}, skipping it", clazz, e);
+        }
+      }
+    }
+    LOGGER.info("Initialized TableTaskConfigRegistry with {} table task 
configs: {} in {}ms",
+        _tableTaskConfigRegistry.size(), _tableTaskConfigRegistry.keySet(),
+        System.currentTimeMillis() - startTimeMs);
+  }
+
+  public static Set<Class<?>> getTableTaskConfigClasses() {
+    return PinotReflectionUtils
+        .getClassesThroughReflection(TABLE_TASK_CONFIG_PACKAGE_REGEX_PATTERN, 
TableTaskConfig.class);
+  }
+
+  /**
+   * Registers a table task type configuration class.
+   */
+  public void registerTableTaskConfig(String taskType, Class<? extends 
TableTaskTypeConfig> taskConfigClass) {
+    _tableTaskConfigRegistry.put(taskType, taskConfigClass);
+  }
+
+  /**
+   * Returns all registered task types.
+   */
+  public Set<String> getAllTaskTypes() {
+    return _tableTaskConfigRegistry.keySet();
+  }
+
+  /**
+   * Returns the table task config class for the given task type.
+   */
+  public Class<? extends TableTaskTypeConfig> getTableTaskConfigClass(String 
taskType) {
+    return _tableTaskConfigRegistry.get(taskType);
+  }
+
+  /**
+   * Creates a table task config instance for the given task type with the 
provided configuration.
+   */
+  public TableTaskTypeConfig createTableTaskConfig(String taskType, 
Map<String, String> configs) {
+    Class<? extends TableTaskTypeConfig> taskConfigClass = 
_tableTaskConfigRegistry.get(taskType);
+    if (taskConfigClass == null) {
+      return null;
+    }
+
+    try {
+      Constructor<? extends TableTaskTypeConfig> constructor =
+          taskConfigClass.getDeclaredConstructor(Map.class);
+      constructor.setAccessible(true);

Review Comment:
   Using `setAccessible(true)` to bypass access controls can be a security 
risk. Consider making the constructor public or using a factory method instead 
of reflection with access override.
   ```suggestion
             taskConfigClass.getConstructor(Map.class);
   ```



##########
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/task/TableTaskTypeConfig.java:
##########
@@ -0,0 +1,150 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.spi.config.table.task;
+
+import java.util.Collections;
+import java.util.Map;
+import java.util.Optional;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.exception.InvalidTableConfigException;
+import org.apache.pinot.spi.exception.InvalidTableTaskConfigException;
+import org.apache.pinot.spi.exception.TaskNotAllowedForTableException;
+import org.apache.pinot.spi.utils.CommonConstants;
+import org.quartz.CronScheduleBuilder;
+
+
+/**
+ * Base class for all table task type configurations.
+ * <p>
+ * Attributes in this class are generally managed by the Pinot controller 
rather than the task itself.
+ * Eg: schedule, priority, minionInstanceTag, etc
+ *
+ * Though some may be managed by the task but are still common across all task 
types. Eg: maxNumTasks.
+ *
+ * Attributes of subclass can be of 2 types
+ * 1. Required -> Such attributes should either have defaults or should fail 
validation if not set.
+ * 2. Optional -> Such attributes should be Optional<> so that the 
implementation knows that it can be empty.
+ *
+ * This setup helps -
+ * 1. Provide visibility and documentation of all configs and their defaults 
available for a task type.
+ * 2. Enable easier validation and correction of configs through above 
awareness.
+ * 3. Abstract backward compatibility / config names of configs making the 
generator / executor simpler.
+ */
+public abstract class TableTaskTypeConfig {
+
+  /**
+   * Schedule in quartz cron format. eg: "0 0/5 * * * ?" to run every 5 minutes
+   * The default value is null, which means the task is not scheduled by 
default (except if it's a default task)
+   */
+  protected final Optional<String> _schedule;
+
+  /**
+   * Minion Tag identifier to schedule a task on a dedicated pool of minion 
hosts which match the tag.
+   * Default value matches the default minion instance tag.
+   */
+  protected final String _minionInstanceTag;
+
+  /**
+   * Maximum number of sub-tasks that can be executed in a single trigger of 
the segment refresh task.
+   * This is used to control helix resource usage on controller.
+   * If more sub-tasks are needed to be merged / reloaded, multiple triggers 
of the task will be needed.
+   * This is going to be bounded by the cluster limit of max number of 
subtasks possible
+   */
+  private final int _maxNumSubTasks;
+
+  /**
+   * The actual user defined configuration for the task type.
+   * This is not supposed to be used by any task generator / executor
+   * but is used by controller for persisting user defined configs
+   * This is required because the attribute values in the subclass are 
modified based on defaults, corrections, etc
+   */
+  protected Map<String, String> _configs;
+
+  protected TableTaskTypeConfig(Map<String, String> configs) {
+    // TODO - Move the constants from pinot-controller to pinot-spi and use 
them here.
+    _schedule = Optional.ofNullable(configs.get("schedule"));
+    _minionInstanceTag = configs.getOrDefault("minionInstanceTag",
+        CommonConstants.Helix.UNTAGGED_MINION_INSTANCE);
+    _maxNumSubTasks = Integer.parseInt(configs.getOrDefault("tableMaxNumTasks",
+        String.valueOf(getDefaultMaxNumSubTasks())));

Review Comment:
   The code uses `Integer.parseInt()` without handling potential 
`NumberFormatException`. If the config value is not a valid integer, this will 
cause a runtime exception during object construction.
   ```suggestion
       String maxNumSubTasksStr = configs.getOrDefault("tableMaxNumTasks",
           String.valueOf(getDefaultMaxNumSubTasks()));
       int maxNumSubTasks;
       try {
         maxNumSubTasks = Integer.parseInt(maxNumSubTasksStr);
       } catch (NumberFormatException e) {
         throw new InvalidTableTaskConfigException(
             String.format("Invalid value for tableMaxNumTasks: '%s'. Must be a 
valid integer.", maxNumSubTasksStr), e);
       }
       _maxNumSubTasks = maxNumSubTasks;
   ```



##########
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/task/TableTaskTypeConfig.java:
##########
@@ -0,0 +1,150 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.spi.config.table.task;
+
+import java.util.Collections;
+import java.util.Map;
+import java.util.Optional;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.exception.InvalidTableConfigException;
+import org.apache.pinot.spi.exception.InvalidTableTaskConfigException;
+import org.apache.pinot.spi.exception.TaskNotAllowedForTableException;
+import org.apache.pinot.spi.utils.CommonConstants;
+import org.quartz.CronScheduleBuilder;
+
+
+/**
+ * Base class for all table task type configurations.
+ * <p>
+ * Attributes in this class are generally managed by the Pinot controller 
rather than the task itself.
+ * Eg: schedule, priority, minionInstanceTag, etc
+ *
+ * Though some may be managed by the task but are still common across all task 
types. Eg: maxNumTasks.
+ *
+ * Attributes of subclass can be of 2 types
+ * 1. Required -> Such attributes should either have defaults or should fail 
validation if not set.
+ * 2. Optional -> Such attributes should be Optional<> so that the 
implementation knows that it can be empty.
+ *
+ * This setup helps -
+ * 1. Provide visibility and documentation of all configs and their defaults 
available for a task type.
+ * 2. Enable easier validation and correction of configs through above 
awareness.
+ * 3. Abstract backward compatibility / config names of configs making the 
generator / executor simpler.
+ */
+public abstract class TableTaskTypeConfig {
+
+  /**
+   * Schedule in quartz cron format. eg: "0 0/5 * * * ?" to run every 5 minutes
+   * The default value is null, which means the task is not scheduled by 
default (except if it's a default task)
+   */
+  protected final Optional<String> _schedule;
+
+  /**
+   * Minion Tag identifier to schedule a task on a dedicated pool of minion 
hosts which match the tag.
+   * Default value matches the default minion instance tag.
+   */
+  protected final String _minionInstanceTag;
+
+  /**
+   * Maximum number of sub-tasks that can be executed in a single trigger of 
the segment refresh task.
+   * This is used to control helix resource usage on controller.
+   * If more sub-tasks are needed to be merged / reloaded, multiple triggers 
of the task will be needed.
+   * This is going to be bounded by the cluster limit of max number of 
subtasks possible
+   */
+  private final int _maxNumSubTasks;
+
+  /**
+   * The actual user defined configuration for the task type.
+   * This is not supposed to be used by any task generator / executor
+   * but is used by controller for persisting user defined configs
+   * This is required because the attribute values in the subclass are 
modified based on defaults, corrections, etc
+   */
+  protected Map<String, String> _configs;
+
+  protected TableTaskTypeConfig(Map<String, String> configs) {
+    // TODO - Move the constants from pinot-controller to pinot-spi and use 
them here.
+    _schedule = Optional.ofNullable(configs.get("schedule"));
+    _minionInstanceTag = configs.getOrDefault("minionInstanceTag",
+        CommonConstants.Helix.UNTAGGED_MINION_INSTANCE);
+    _maxNumSubTasks = Integer.parseInt(configs.getOrDefault("tableMaxNumTasks",
+        String.valueOf(getDefaultMaxNumSubTasks())));
+    _configs = configs;
+  }
+
+  /**
+   * Validates the task type configuration.
+   * TODO - This should be called by PinotTaskGenerator before generating 
tasks. It shouldn't be called by each task.
+   * @return true if the configuration is valid, false otherwise.
+   */
+  public final void checkValidity(TableConfig tableConfig) throws 
InvalidTableConfigException {
+    validateCronSchedule();
+    // TODO - check minion instance tag is valid.
+    checkIfAllowedForTable(tableConfig);
+    checkTaskConfigValiditiy();

Review Comment:
   Method name has a typo: 'checkTaskConfigValiditiy' should be 
'checkTaskConfigValidity'
   ```suggestion
       checkTaskConfigValidity();
   ```



##########
pinot-controller/src/main/java/org/apache/pinot/controller/util/TaskConfigUtils.java:
##########
@@ -60,7 +60,11 @@ public static void validateTaskConfigs(TableConfig 
tableConfig, Schema schema, P
         if (taskGenerator != null) {
           Map<String, String> taskConfigs = taskConfigEntry.getValue();
           doCommonTaskValidations(tableConfig, taskType, taskConfigs);
-          taskGenerator.validateTaskConfigs(tableConfig, schema, taskConfigs);
+          if (taskConfig.getTaskConfig(taskType) != null) {
+            taskConfig.getTaskConfig(taskType).checkValidity(tableConfig);
+          } else {
+            taskGenerator.validateTaskConfigs(tableConfig, schema, 
taskConfigs);
+          }

Review Comment:
   [nitpick] The nested if conditions create complex control flow. Consider 
extracting this logic into a separate method for better readability and 
maintainability.
   ```suggestion
             validateSingleTaskType(taskConfig, taskType, tableConfig, 
taskGenerator, schema, taskConfigs);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to