dclim commented on a change in pull request #6326: Add support hash 
partitioning by a subset of dimensions to indexTask
URL: https://github.com/apache/incubator-druid/pull/6326#discussion_r221719109
 
 

 ##########
 File path: docs/content/ingestion/native_tasks.md
 ##########
 @@ -475,6 +475,7 @@ The tuningConfig is optional and default parameters will 
be used if no tuningCon
 |maxBytesInMemory|Used in determining when intermediate persists to disk 
should occur. Normally this is computed internally and user does not need to 
set it. This value represents number of bytes to aggregate in heap memory 
before persisting. This is based on a rough estimate of memory usage and not 
actual usage. The maximum heap memory usage for indexing is maxBytesInMemory * 
(2 + maxPendingPersists)|1/6 of max JVM memory|no|
 |maxTotalRows|Total number of rows in segments waiting for being pushed. Used 
in determining when intermediate pushing should occur.|20000000|no|
 |numShards|Directly specify the number of shards to create. If this is 
specified and 'intervals' is specified in the granularitySpec, the index task 
can skip the determine intervals/partitions pass through the data. numShards 
cannot be specified if targetPartitionSize is set.|null|no|
+|partitionDimensions|The dimensions to partition on. Leave blank to select all 
dimensions. Only used with numShards > 1, will be ignored when 
targetPartitionSize or maxTotalRows is set.|null|no|
 
 Review comment:
   Why does this get ignored if targetPartitionSize/maxTotalRows is set? That's 
also a bit weird since those parameters have non-zero default values if not 
provided by the user. Wouldn't it get ignored if forceGuaranteedRollup is false?
   
   I also agree that more documentation on why you would want to use this, and 
how it would give you better data locality, would be helpful.
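
   To illustrate the data-locality point, here is a minimal sketch of hash 
partitioning on a subset of dimensions. It is not Druid's implementation: the 
`shard_for` helper, the SHA-256 based hash, and the example dimension names are 
all assumptions made for illustration. It only shows that rows sharing the same 
values for the partition dimensions map to the same shard.

```python
import hashlib
import json


def shard_for(row, partition_dimensions, num_shards):
    """Return a shard index for `row` (hypothetical helper, not a Druid API).

    An empty or missing `partition_dimensions` means "use all dimensions",
    mirroring the "leave blank to select all dimensions" wording in the doc.
    """
    dims = partition_dimensions or sorted(row)
    # Build a stable key from only the chosen dimensions.
    key = json.dumps([(d, row.get(d)) for d in dims])
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Two rows that agree on "country" but differ on "user" (made-up dimensions).
row_a = {"country": "US", "user": "alice"}
row_b = {"country": "US", "user": "bob"}

# Partitioning on "country" alone puts both rows in the same shard.
same_shard = shard_for(row_a, ["country"], 4) == shard_for(row_b, ["country"], 4)
```

   With all dimensions hashed, rows like these have no such guarantee and may 
be spread across shards, which is why a filter on `country` could then touch 
more segments.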

