jihoonson commented on a change in pull request #8257: Add support for parallel native indexing with shuffle for perfect rollup
URL: https://github.com/apache/incubator-druid/pull/8257#discussion_r313665107
##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/common/task/AbstractBatchIndexTask.java
##########
@@ -377,6 +403,44 @@ static Granularity findGranularityFromSegments(List<DataSegment> segments)
     }
   }
 
+  /**
+   * Creates shard specs based on the given configurations. The return value is a map between intervals created
+   * based on the segment granularity and the shard specs to be created.
+   * Note that the shard specs to be created is a pair of {@link ShardSpecFactory} and number of segments per interval
+   * and filled only when {@link #isGuaranteedRollup} = true. Otherwise, the return value contains only the set of
+   * intervals generated based on the segment granularity.
+   */
+  protected static Map<Interval, Pair<ShardSpecFactory, Integer>> createShardSpecWithoutInputScan(
+      GranularitySpec granularitySpec,
+      IndexIOConfig ioConfig,
+      IndexTuningConfig tuningConfig,
+      PartitionsSpec nonNullPartitionsSpec
+  )
+  {
+    final Map<Interval, Pair<ShardSpecFactory, Integer>> allocateSpec = new HashMap<>();
+    final SortedSet<Interval> intervals = granularitySpec.bucketIntervals().get();
+
+    if (isGuaranteedRollup(ioConfig, tuningConfig)) {
+      // Overwrite mode, guaranteed rollup: shardSpecs must be known in advance.
+      assert nonNullPartitionsSpec instanceof HashedPartitionsSpec;

Review comment:
Hmm, the point about `DeterminePartitionsJob` is correct, but this assertion is here because the index task and the parallel index task currently support only the hashed partitions spec. The range partitions spec will be supported as well, and this assertion will be removed at that point. Or, if you're asking about the code comment, the index task already has a similar mode that determines partitions automatically, and this method is not called in that mode.
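For readers following the thread, the return-value shape described in the Javadoc can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in the pull request: `ShardSpecSketch`, `HashedSpec`, `RangeSpec`, and `shardCountsPerInterval` are hypothetical stand-ins, intervals are plain strings, and the shard spec factory is reduced to a bare shard count. The sketch also uses an explicit type check where the diff uses `assert`, since Java assertions are only evaluated when the JVM runs with `-ea`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical stand-ins for illustration; these are NOT Druid's real classes.
final class ShardSpecSketch {
  interface PartitionsSpec {}

  static final class HashedSpec implements PartitionsSpec {
    final int numShards;

    HashedSpec(int numShards) {
      this.numShards = numShards;
    }
  }

  static final class RangeSpec implements PartitionsSpec {}

  // Maps each interval to a shard count when rollup is guaranteed,
  // or to null (interval keys only) otherwise.
  static Map<String, Integer> shardCountsPerInterval(
      SortedSet<String> intervals,
      boolean guaranteedRollup,
      PartitionsSpec spec
  )
  {
    final Map<String, Integer> result = new HashMap<>();
    if (guaranteedRollup) {
      // Explicit check instead of `assert`, which is skipped unless -ea is set.
      if (!(spec instanceof HashedSpec)) {
        throw new IllegalArgumentException("only hashed partitioning is supported in this sketch");
      }
      final int numShards = ((HashedSpec) spec).numShards;
      for (String interval : intervals) {
        result.put(interval, numShards);
      }
    } else {
      for (String interval : intervals) {
        result.put(interval, null);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    final SortedSet<String> intervals = new TreeSet<>();
    intervals.add("2019-01-01/2019-01-02");
    intervals.add("2019-01-02/2019-01-03");
    System.out.println(shardCountsPerInterval(intervals, true, new HashedSpec(3)));
    System.out.println(shardCountsPerInterval(intervals, false, new RangeSpec()));
  }
}
```

Whether a thrown exception or a plain `assert` is the better fit depends on how temporary the restriction is; the review comment indicates the check is expected to go away once range partitioning is supported.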