This is an automated email from the ASF dual-hosted git repository.

techdocsmith pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git


The following commit(s) were added to refs/heads/master by this push:
     new e1d80d0  Docs - note when partitioning using concatenated dimensions (#11506)
e1d80d0 is described below

commit e1d80d05a2ba045e333ed182a906f2959baa9f4c
Author: Peter Marshall <[email protected]>
AuthorDate: Mon Aug 30 19:59:24 2021 +0100

    Docs - note when partitioning using concatenated dimensions (#11506)
    
    LGTM
    
    * Update native-batch.md
    
    Knowledge from https://the-asf.slack.com/archives/CJ8D1JTB8/p1595434977062400
    
    * Update native-batch.md
    
    * Fixed broken link + some grammar
    
    * Update docs/ingestion/native-batch.md
    
    Co-authored-by: Charles Smith <[email protected]>
    
    * Update docs/ingestion/native-batch.md
    
    Co-authored-by: Charles Smith <[email protected]>
    
    * Update docs/ingestion/native-batch.md
    
    Co-authored-by: Charles Smith <[email protected]>
    
    * Update docs/ingestion/native-batch.md
    
    Co-authored-by: Charles Smith <[email protected]>
    
    * Update native-batch.md
    
    Some grammatical wizardry.
    
    * Update native-batch.md
    
    * Update docs/ingestion/native-batch.md
    
    Co-authored-by: Charles Smith <[email protected]>
    
    * Update docs/ingestion/native-batch.md
    
    Co-authored-by: Charles Smith <[email protected]>
    
    * Update docs/ingestion/native-batch.md
    
    Co-authored-by: Charles Smith <[email protected]>
    
    * Update docs/ingestion/native-batch.md
    
    Co-authored-by: Charles Smith <[email protected]>
    
    * Update docs/ingestion/native-batch.md
    
    Co-authored-by: Charles Smith <[email protected]>
    
    * Update docs/ingestion/native-batch.md
    
    Co-authored-by: Charles Smith <[email protected]>
    
    * Update docs/ingestion/native-batch.md
    
    Co-authored-by: Charles Smith <[email protected]>
    
    * Update docs/ingestion/native-batch.md
    
    Co-authored-by: Charles Smith <[email protected]>
    
    * Update docs/ingestion/native-batch.md
    
    Co-authored-by: Charles Smith <[email protected]>
    
    * Update docs/ingestion/native-batch.md
    
    Co-authored-by: Charles Smith <[email protected]>
    
    * Update docs/ingestion/native-batch.md
    
    Co-authored-by: Charles Smith <[email protected]>
    
    * Apply suggestions from code review
    
    remove orphaned links
    
    Co-authored-by: Charles Smith <[email protected]>
    Co-authored-by: Charles Smith <[email protected]>
---
 docs/ingestion/native-batch.md | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/docs/ingestion/native-batch.md b/docs/ingestion/native-batch.md
index 9c1ef73..27b2c0b 100644
--- a/docs/ingestion/native-batch.md
+++ b/docs/ingestion/native-batch.md
@@ -370,11 +370,14 @@ Druid currently supports only one partition function.
 
 The Parallel task will use one subtask when you set `maxNumConcurrentSubTasks` to 1.
 
-> Be aware that, with this technique, segment sizes could be skewed if your chosen `partitionDimension` is also skewed in source data.
-
-> While it is technically possible to concatenate multiple dimensions into a single new dimension
-> that you go on to specify in `partitionDimension`, remember that you _must_ then use this newly concatenated dimension at query time
-> in order for segment pruning to be effective.
+When you use this technique to partition your data, segment sizes may be unequally distributed if the data
+in your `partitionDimension` is also unequally distributed.  Therefore, to avoid imbalance in data layout,
+ review the distribution of values in your source data before deciding on a partitioning strategy.
+
+For segment pruning to be effective and translate into better query performance, you must use
+the `partitionDimension` at query time.  You can concatenate values from multiple
+dimensions into a new dimension to use as the `partitionDimension`. In this case, you
+must use that new dimension in your native filter `WHERE` clause.
 
 |property|description|default|required?|
 |--------|-----------|-------|---------|

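The concatenated-dimension approach that the added text describes can be sketched as follows. This is only an illustration, not part of the commit: it assumes the `single_dim` partitionsSpec and an expression transform, and the column names (`country`, `city`, `country_city`), the separator, and the row target are all hypothetical. The point it shows is that the dimension named in `partitionDimension` is the same one the query filters on.

```python
import json

# Hypothetical names, chosen only for illustration; none come from the commit.
CONCAT_DIMENSION = "country_city"
SEPARATOR = "|"


def concat_value(*parts):
    """Join component values the same way the ingestion-time expression does."""
    return SEPARATOR.join(parts)


# Ingestion side: an expression transform derives the new dimension
# from two assumed source columns, `country` and `city`.
transform_spec = {
    "transforms": [
        {
            "type": "expression",
            "name": CONCAT_DIMENSION,
            "expression": "concat(country, '|', city)",
        }
    ]
}

# Ingestion side: partition on the derived dimension
# (assumes single-dimension range partitioning; the row target is arbitrary).
partitions_spec = {
    "type": "single_dim",
    "partitionDimension": CONCAT_DIMENSION,
    "targetRowsPerSegment": 5000000,
}

# Query side: a native selector filter on the SAME derived dimension.
# Filtering only on `country` or `city` would not reference the partition dimension.
query_filter = {
    "type": "selector",
    "dimension": CONCAT_DIMENSION,
    "value": concat_value("US", "Seattle"),
}

if __name__ == "__main__":
    print(json.dumps(
        {
            "transformSpec": transform_spec,
            "partitionsSpec": partitions_spec,
            "filter": query_filter,
        },
        indent=2,
    ))
```

Building the filter value with the same helper that mirrors the ingestion expression keeps the two sides in agreement. If the separator or the column order differs between them, the filter value will not match any stored value.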
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]