kfaraz commented on code in PR #17562: URL: https://github.com/apache/druid/pull/17562#discussion_r1883880455
########## docs/ingestion/tasks.md: ########## @@ -470,6 +470,8 @@ The following parameters apply to all task types. |`storeEmptyColumns`|Enables the task to store empty columns during ingestion. When `true`, Druid stores every column specified in the [`dimensionsSpec`](ingestion-spec.md#dimensionsspec). When `false`, Druid SQL queries referencing empty columns will fail. If you intend to leave `storeEmptyColumns` disabled, you should either ingest dummy data for empty columns or else not query on empty columns.<br/><br/>When set in the task context, `storeEmptyColumns` overrides the system property [`druid.indexer.task.storeEmptyColumns`](../configuration/index.md#additional-peon-configuration).|`true`| |`taskLockTimeout`|Task lock timeout in milliseconds. For more details, see [Locking](#locking).<br/><br/>When a task acquires a lock, it sends a request via HTTP and awaits until it receives a response containing the lock acquisition result. As a result, an HTTP timeout error can occur if `taskLockTimeout` is greater than `druid.server.http.maxIdleTime` of Overlords.|300000| |`useLineageBasedSegmentAllocation`|Enables the new lineage-based segment allocation protocol for the native Parallel task with dynamic partitioning. This option should be off during the replacing rolling upgrade from one of the Druid versions between 0.19 and 0.21 to Druid 0.22 or higher. Once the upgrade is done, it must be set to `true` to ensure data correctness.|`false` in 0.21 or earlier, `true` in 0.22 or later| +|`lookupLoadingMode`|Controls the lookup loading behavior in tasks. This property supports three values: `ALL` mode loads all the lookups, `NONE` mode does not load any lookups and `ONLY_REQUIRED` mode loads the lookups specified with context key `lookupsToLoad`.|`ALL`| +|`lookupsToLoad`|Comma separated list of lookup names that will be loaded in tasks. This property is required only if the `lookupLoadingMode` is set to `ONLY_REQUIRED`. |`null`| Review Comment: Nit: Changed `comma-separated list` to `list`. "Comma-separated list" sounds a little ambiguous as it could mean a single string that has comma-separated values like ``` "lookupsToLoad": "lookup1, lookup2, lookup3" ``` whereas what we need is more like ``` "lookupsToLoad": ["lookup1", "lookup2", "lookup3"] ``` ```suggestion |`lookupsToLoad`|List of lookup names to load in tasks. This property is required only if the `lookupLoadingMode` is set to `ONLY_REQUIRED`. |`null`| ``` ########## docs/ingestion/tasks.md: ########## @@ -470,6 +470,8 @@ The following parameters apply to all task types. |`storeEmptyColumns`|Enables the task to store empty columns during ingestion. When `true`, Druid stores every column specified in the [`dimensionsSpec`](ingestion-spec.md#dimensionsspec). When `false`, Druid SQL queries referencing empty columns will fail. If you intend to leave `storeEmptyColumns` disabled, you should either ingest dummy data for empty columns or else not query on empty columns.<br/><br/>When set in the task context, `storeEmptyColumns` overrides the system property [`druid.indexer.task.storeEmptyColumns`](../configuration/index.md#additional-peon-configuration).|`true`| |`taskLockTimeout`|Task lock timeout in milliseconds. For more details, see [Locking](#locking).<br/><br/>When a task acquires a lock, it sends a request via HTTP and awaits until it receives a response containing the lock acquisition result. As a result, an HTTP timeout error can occur if `taskLockTimeout` is greater than `druid.server.http.maxIdleTime` of Overlords.|300000| |`useLineageBasedSegmentAllocation`|Enables the new lineage-based segment allocation protocol for the native Parallel task with dynamic partitioning. This option should be off during the replacing rolling upgrade from one of the Druid versions between 0.19 and 0.21 to Druid 0.22 or higher. Once the upgrade is done, it must be set to `true` to ensure data correctness.|`false` in 0.21 or earlier, `true` in 0.22 or later| +|`lookupLoadingMode`|Controls the lookup loading behavior in tasks. This property supports three values: `ALL` mode loads all the lookups, `NONE` mode does not load any lookups and `ONLY_REQUIRED` mode loads the lookups specified with context key `lookupsToLoad`.|`ALL`| Review Comment: We should call out that the user must not specify this context parameter for MSQ tasks and kill tasks as the system computed value is always the right choice: - MSQControllerTask - hardcoded to NONE, cannot be overridden - MSQWorkerTask - ONLY_REQUIRED, lookupsToLoad are identified by the controller task by parsing the SQL - `kill` task - hardcoded to NONE, cannot be overridden In the future, we might implement auto-detection of required lookups for batch ingest, streaming ingest and compact tasks too. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
