snleee commented on a change in pull request #7481:
URL: https://github.com/apache/pinot/pull/7481#discussion_r726555357
##########
File path:
pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/mergerollup/MergeRollupTaskGenerator.java
##########
@@ -388,6 +425,13 @@ private boolean validate(TableConfig tableConfig, String taskType) {
return true;
}
+ /**
+ * Check if the segment span multiple buckets
+ */
+ private boolean hasSpilledOverData(SegmentZKMetadata segmentZKMetadata, long bucketMs) {
+ return segmentZKMetadata.getStartTimeMs() / bucketMs != segmentZKMetadata.getEndTimeMs() / bucketMs;
Review comment:
Is `startTime <= endTime` guaranteed for `segmentZKMetadata`?
If not, we may need to check `startTime / bucketMs < endTime / bucketMs` instead.
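To illustrate the concern, here is a minimal standalone sketch comparing the two checks. The class and method names (`BucketSpillCheck`, `spansBucketsNotEqual`, `spansBucketsLessThan`) are hypothetical and take plain `long` values in place of `SegmentZKMetadata`; the one-day bucket size is also an assumption chosen for the example. The checks only disagree when the start time falls in a later bucket than the end time.

```java
// Hypothetical helper, not part of the PR: it isolates the bucket arithmetic
// so the two comparisons can be tried on plain timestamps.
final class BucketSpillCheck {
  private BucketSpillCheck() {
  }

  // Same comparison as the diff: true whenever the two bucket indices differ.
  static boolean spansBucketsNotEqual(long startTimeMs, long endTimeMs, long bucketMs) {
    return startTimeMs / bucketMs != endTimeMs / bucketMs;
  }

  // Suggested alternative: true only when the end bucket is after the start bucket.
  static boolean spansBucketsLessThan(long startTimeMs, long endTimeMs, long bucketMs) {
    return startTimeMs / bucketMs < endTimeMs / bucketMs;
  }

  public static void main(String[] args) {
    long bucketMs = 86_400_000L; // assumed bucket size: one day in milliseconds

    // Segment entirely inside bucket 0: neither check reports spill-over.
    System.out.println(spansBucketsNotEqual(0L, 86_399_999L, bucketMs)); // false
    System.out.println(spansBucketsLessThan(0L, 86_399_999L, bucketMs)); // false

    // Segment crossing from bucket 0 into bucket 1: both checks agree.
    System.out.println(spansBucketsNotEqual(86_399_999L, 86_400_000L, bucketMs)); // true
    System.out.println(spansBucketsLessThan(86_399_999L, 86_400_000L, bucketMs)); // true

    // Inverted metadata (start after end): only the != check reports spill-over.
    System.out.println(spansBucketsNotEqual(86_400_000L, 0L, bucketMs)); // true
    System.out.println(spansBucketsLessThan(86_400_000L, 0L, bucketMs)); // false
  }
}
```

Whether inverted metadata should count as spilled-over data or be rejected outright is a separate question that this sketch does not answer.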
##########
File path:
pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/mergerollup/MergeRollupTaskUtils.java
##########
@@ -34,7 +35,8 @@ private MergeRollupTaskUtils() {
MergeTask.ROUND_BUCKET_TIME_PERIOD_KEY,
MergeTask.MERGE_TYPE_KEY,
MergeTask.MAX_NUM_RECORDS_PER_SEGMENT_KEY,
- MergeTask.MAX_NUM_RECORDS_PER_TASK_KEY
+ MergeTask.MAX_NUM_RECORDS_PER_TASK_KEY,
+ MergeRollupTask.NUM_PARALLEL_BUCKETS
Review comment:
Why don't we put this in `MergeTask` instead of `MergeRollupTask`?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]