sthetland commented on a change in pull request #9360: Create splits of multiple files for parallel indexing
URL: https://github.com/apache/druid/pull/9360#discussion_r381510488
 
 

 ##########
 File path: docs/ingestion/native-batch.md
 ##########
 @@ -226,7 +224,14 @@ The tuningConfig is optional and default parameters will be used if no tuningCon
 `SplitHintSpec` is used to give a hint when the supervisor task creates input splits.
 Note that each worker task processes a single input split. You can control the amount of data each worker task will read during the first phase.
 
-Currently only one splitHintSpec, i.e., `segments`, is available.
+#### `MaxSizeSplitHintSpec`
+
+`MaxSizeSplitHintSpec` is respected by all splittable input sources except for the HTTP input source.
+
+|property|description|default|required?|
+|--------|-----------|-------|---------|
+|type|This should always be `maxSize`.|none|yes|
+|maxSplitSize|Maximum number of bytes of input files to process in a single task. If a single file is larger than this number, it will be processed by itself in a single task (splitting a large file is not supported yet).|500MB|no|
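
For context on the hunk above, a hedged sketch of how this hint might be specified in an ingestion spec's `tuningConfig` (JSON shape inferred from the property table in the diff; placement and the `index_parallel` type are assumptions, not verified against the final merged docs):

```json
{
  "tuningConfig": {
    "type": "index_parallel",
    "splitHintSpec": {
      "type": "maxSize",
      "maxSplitSize": 500000000
    }
  }
}
```

With this hint, the supervisor task would group input files into splits of at most roughly 500 MB each, and each worker task would process one such split.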
 
 Review comment:
   Could this match the wording used below, so: 
   "....in a single task. (Files are never split across tasks.)"
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
