jihoonson commented on a change in pull request #8925: Parallel indexing single dim partitions
URL: https://github.com/apache/incubator-druid/pull/8925#discussion_r354516221
 
 

 ##########
 File path: docs/ingestion/native-batch.md
 ##########
 @@ -241,18 +241,37 @@ Currently only one splitHintSpec, i.e., `segments`, is 
available.
 
 ### `partitionsSpec`
 
-PartitionsSpec is to describe the secondary partitioning method.
+PartitionsSpec is used to describe the secondary partitioning method.
 You should use different partitionsSpec depending on the [rollup 
mode](../ingestion/index.md#rollup) you want.
-For perfect rollup, you should use `hashed`.
+For perfect rollup, you should use either `hashed` (partitioning based on the 
hash of dimensions in each row) or
+`single_dim` (based on ranges of a single dimension. For best-effort rollup, 
you should use `dynamic`.
+
+For perfect rollup, `hashed` partitioning is recommended in most cases, as it 
will improve indexing
 
 Review comment:
   I think it's worth clearly stating the pros and cons of each partitions spec instead of promoting `hashed` partitioning.
   
   - With `dynamic` partitioning, you can expect the fastest ingestion speed of all the partitions specs. It also always guarantees a well-balanced distribution of segment sizes.
   - With `hashed`, your wording is correct.
   - With `single_dim`, the partitioning can be skewed depending on the partition key, but the broker can use the partition information to prune segments early at query time. If the query has a filter on the partition key column, the broker can filter out segments that contain only values that do not satisfy the filter.
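
   To make the `single_dim` pruning point concrete, here is a minimal sketch of the idea, not Druid's actual broker code. The `Segment` class, the `may_contain` and `prune` helpers, and the segment names and range boundaries are all hypothetical; the sketch only assumes each segment records an inclusive start and exclusive end for the partition key, and that the query has an equality filter on that key.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a segment built with range partitioning."""
    name: str
    start: Optional[str]  # inclusive lower bound; None means unbounded
    end: Optional[str]    # exclusive upper bound; None means unbounded


def may_contain(segment: Segment, value: str) -> bool:
    """Return True if the segment's key range could include `value`."""
    above_start = segment.start is None or value >= segment.start
    below_end = segment.end is None or value < segment.end
    return above_start and below_end


def prune(segments: List[Segment], value: str) -> List[Segment]:
    """Keep only segments that could satisfy an equality filter on the key."""
    return [s for s in segments if may_contain(s, value)]


# Three illustrative segments covering the whole key space.
segments = [
    Segment("seg_0", None, "g"),
    Segment("seg_1", "g", "p"),
    Segment("seg_2", "p", None),
]

# A filter such as `key = 'kr'` only needs to read one segment.
survivors = prune(segments, "kr")
print([s.name for s in survivors])  # prints ['seg_1']
```

   With `hashed` or `dynamic`, a given key value is not confined to one contiguous range, so this kind of range check is not available and every segment in the interval stays a candidate.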
