vtlim commented on code in PR #12772:
URL: https://github.com/apache/druid/pull/12772#discussion_r923862200


##########
docs/ingestion/native-batch-input-source.md:
##########
@@ -707,18 +707,21 @@ Compared to the other native batch InputSources, SQL 
InputSource behaves differe
 * Similar to file-based input formats, any updates to existing data will 
replace the data in segments specific to the intervals specified in the 
`granularitySpec`.
 
 
-## Combining input sources
+## Combining input source
 
-The Combining input source is used to read data from multiple InputSources. 
This input source should be only used if all the delegate input sources are
- _splittable_ and can be used by the [Parallel task](./native-batch.md). This 
input source will identify the splits from its delegates and each split will be 
processed by a worker task. Similar to other input sources, this input source 
supports a single `inputFormat`. Therefore, please note that delegate input 
sources requiring an `inputFormat` must have the same format for input data.
+The Combining input source lets you read data from multiple input sources.
+It identifies the splits from delegate input sources and uses a worker task to 
process each split.
+Use the Combining input source only if all the delegates are _splittable_ and 
can be used by the [Parallel task](./native-batch.md). 
 
-|property|description|required?|
+Similar to other input sources, the Combining input source supports a single 
`inputFormat`.
+Delegate input sources that require an `inputFormat` must have the same format 
for input data.
+
+|Property|Description|Required|
 |--------|-----------|---------|
 |type|This should be "combining".|Yes|
-|delegates|List of _splittable_ InputSources to read data from.|Yes|
-
-Sample spec:
+|delegates|List of _splittable_ input sources to read data from.|Yes|

Review Comment:
   ```suggestion
   |delegates|List of splittable input sources to read data from.|Yes|
   ```



##########
docs/ingestion/native-batch-input-source.md:
##########
@@ -707,18 +707,21 @@ Compared to the other native batch InputSources, SQL 
InputSource behaves differe
 * Similar to file-based input formats, any updates to existing data will 
replace the data in segments specific to the intervals specified in the 
`granularitySpec`.
 
 
-## Combining input sources
+## Combining input source
 
-The Combining input source is used to read data from multiple InputSources. 
This input source should be only used if all the delegate input sources are
- _splittable_ and can be used by the [Parallel task](./native-batch.md). This 
input source will identify the splits from its delegates and each split will be 
processed by a worker task. Similar to other input sources, this input source 
supports a single `inputFormat`. Therefore, please note that delegate input 
sources requiring an `inputFormat` must have the same format for input data.
+The Combining input source lets you read data from multiple input sources.
+It identifies the splits from delegate input sources and uses a worker task to 
process each split.
+Use the Combining input source only if all the delegates are _splittable_ and 
can be used by the [Parallel task](./native-batch.md). 

Review Comment:
   ```suggestion
   Use the Combining input source only if all the delegates are splittable and 
can be used by the [Parallel task](./native-batch.md). 
   ```



##########
docs/ingestion/native-batch-input-source.md:
##########
@@ -748,3 +751,8 @@ Sample spec:
 ...
 ```
 
+For the Combining input source to read data correctly, set the value of 
`maxNumConcurrentSubTasks` in `tuningConfig` as follows:
+- more or equal to 1 for `range` or `single_dim` types
+- more or equal to 2 for `hashed` or `dynamic` types

Review Comment:
   ```suggestion
   - `range` or `single_dim` partitioning: greater than or equal to 1
   - `hashed` or `dynamic` partitioning: greater than or equal to 2
   ```
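
For illustration only, a minimal sketch of how the rule in this suggestion could be checked before a spec is submitted. The helper name `meets_minimum` and the `MIN_SUBTASKS` table are hypothetical and not part of Druid; the sketch assumes `tuningConfig` has been loaded as a Python dict and that both `partitionsSpec.type` and `maxNumConcurrentSubTasks` are set explicitly.

```python
# Minimum maxNumConcurrentSubTasks per secondary partitioning type,
# copied from the two bullets in the suggestion above.
MIN_SUBTASKS = {
    "range": 1,
    "single_dim": 1,
    "hashed": 2,
    "dynamic": 2,
}


def meets_minimum(tuning_config: dict) -> bool:
    """Return True if maxNumConcurrentSubTasks satisfies the documented minimum.

    Hypothetical helper: raises KeyError if either field is missing or the
    partitioning type is not one of the four listed above.
    """
    partition_type = tuning_config["partitionsSpec"]["type"]
    configured = tuning_config["maxNumConcurrentSubTasks"]
    return configured >= MIN_SUBTASKS[partition_type]


if __name__ == "__main__":
    example = {
        "type": "index_parallel",
        "maxNumConcurrentSubTasks": 2,
        "partitionsSpec": {"type": "dynamic"},
    }
    print(meets_minimum(example))
```

A check like this only mirrors the documented thresholds; it does not validate the rest of the ingestion spec.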



##########
docs/ingestion/native-batch-input-source.md:
##########
@@ -748,3 +751,8 @@ Sample spec:
 ...
 ```
 
+For the Combining input source to read data correctly, set the value of 
`maxNumConcurrentSubTasks` in `tuningConfig` as follows:

Review Comment:
   This adds a bit more context:
   ```suggestion
   The [secondary partitioning method](native-batch.md#partitionsspec) 
determines the requisite number of concurrent worker tasks that run in parallel 
to complete ingestion with the Combining input source.
   Set this value in `maxNumConcurrentSubTasks` in `tuningConfig` based on the 
secondary partitioning method:
   ```



##########
docs/ingestion/native-batch-input-source.md:
##########
@@ -707,18 +707,21 @@ Compared to the other native batch InputSources, SQL 
InputSource behaves differe
 * Similar to file-based input formats, any updates to existing data will 
replace the data in segments specific to the intervals specified in the 
`granularitySpec`.
 
 
-## Combining input sources
+## Combining input source
 
-The Combining input source is used to read data from multiple InputSources. 
This input source should be only used if all the delegate input sources are
- _splittable_ and can be used by the [Parallel task](./native-batch.md). This 
input source will identify the splits from its delegates and each split will be 
processed by a worker task. Similar to other input sources, this input source 
supports a single `inputFormat`. Therefore, please note that delegate input 
sources requiring an `inputFormat` must have the same format for input data.
+The Combining input source lets you read data from multiple input sources.
+It identifies the splits from delegate input sources and uses a worker task to 
process each split.
+Use the Combining input source only if all the delegates are _splittable_ and 
can be used by the [Parallel task](./native-batch.md). 
 
-|property|description|required?|
+Similar to other input sources, the Combining input source supports a single 
`inputFormat`.
+Delegate input sources that require an `inputFormat` must have the same format 
for input data.
+
+|Property|Description|Required|
 |--------|-----------|---------|
 |type|This should be "combining".|Yes|

Review Comment:
   ```suggestion
   |type|Set the value to `combining`.|Yes|
   ```
   Consider updating the `type` fields in the same way, following the style used in
https://docs.imply.io/latest/druid/ingestion/native-batch/#dynamic-partitioning



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

