jihoonson commented on issue #7048: Make IngestSegmentFirehoseFactory splittable for parallel ingestion
URL: https://github.com/apache/incubator-druid/pull/7048#issuecomment-462896179

@glasser good questions!

For (a), thank you for reminding me about `SegmentListUsedAction`. I think it's the best option, and you can set taskToolBox and use it. For `TaskToolboxConsumingFirehoseFactory`, I guess you want to extract `IndexTask.setFirehoseFactoryToolbox()` to `IngestSegmentFirehoseFactory`. Please go for it if you think it's simpler.

For (b), the first segment should be partially overshadowed. The part of v1 @ 1:00-2:00 should be available for input. `VersionedIntervalTimeline.lookup()` is responsible for doing this. If `usedSegments` contains the segments v1 @ 1:00-3:00 and v2 @ 2:00-4:00, `lookup()` would return two `TimelineObjectHolder`s, one for v1 @ 1:00-2:00 and one for v2 @ 2:00-4:00. However, if `usedSegments` contains only v1 @ 1:00-3:00, `lookup()` just returns a single `TimelineObjectHolder` covering that whole segment. So you may want to adjust the `intervals` of the granularitySpec for sub tasks as well, to prune out the overshadowed parts. I'm not sure how this problem is currently being handled, though.
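To make the overshadowing behavior in (b) concrete, here is a minimal toy sketch. It does not use Druid's `VersionedIntervalTimeline` or `TimelineObjectHolder`; the `OvershadowSketch` class, the `Seg` type, and this `lookup` method are hypothetical stand-ins, with intervals simplified to whole hours and versions compared as plain strings. It only illustrates why a sub task that sees a single segment in isolation gets the whole interval back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class OvershadowSketch {
    /** Toy segment (hypothetical): a version plus a half-open [start, end) interval in hours. */
    public static final class Seg {
        final String version;
        final int start;
        final int end;

        public Seg(String version, int start, int end) {
            this.version = version;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return version + " @ " + start + ":00-" + end + ":00";
        }
    }

    /**
     * Toy lookup: split time at every segment boundary, pick the highest
     * version covering each slice, and merge adjacent slices that come
     * from the same source segment.
     */
    public static List<Seg> lookup(List<Seg> usedSegments) {
        TreeSet<Integer> bounds = new TreeSet<>();
        for (Seg s : usedSegments) {
            bounds.add(s.start);
            bounds.add(s.end);
        }

        List<Seg> visible = new ArrayList<>();
        Seg lastSource = null; // segment that produced the last visible piece
        Integer prev = null;
        for (Integer b : bounds) {
            if (prev != null) {
                Seg winner = null;
                for (Seg s : usedSegments) {
                    boolean covers = s.start <= prev && b <= s.end;
                    if (covers && (winner == null || s.version.compareTo(winner.version) > 0)) {
                        winner = s;
                    }
                }
                if (winner != null) {
                    int last = visible.size() - 1;
                    if (winner == lastSource && visible.get(last).end == prev) {
                        // Same source segment continues: extend the last piece.
                        visible.set(last, new Seg(winner.version, visible.get(last).start, b));
                    } else {
                        visible.add(new Seg(winner.version, prev, b));
                    }
                }
                lastSource = winner;
            }
            prev = b;
        }
        return visible;
    }

    public static void main(String[] args) {
        Seg v1 = new Seg("v1", 1, 3);
        Seg v2 = new Seg("v2", 2, 4);

        // Both segments known: v1 is trimmed to its visible part.
        System.out.println(lookup(Arrays.asList(v1, v2))); // [v1 @ 1:00-2:00, v2 @ 2:00-4:00]

        // Only v1 known (as in a sub task): nothing overshadows it.
        System.out.println(lookup(Arrays.asList(v1)));     // [v1 @ 1:00-3:00]
    }
}
```

In this toy model, the visible intervals from the first call are what a supervisor could hand to each sub task as its `intervals`, so that the v1 sub task reads only 1:00-2:00 even though it cannot see v2 itself.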
