jihoonson commented on issue #7048: Make IngestSegmentFirehoseFactory 
splittable for parallel ingestion
URL: https://github.com/apache/incubator-druid/pull/7048#issuecomment-462896179
 
 
   @glasser good questions!
   
   For (a), thank you for reminding me about `SegmentListUsedAction`. I think 
it's the best option; you can set taskToolBox and use it. For 
`TaskToolboxConsumingFirehoseFactory`, I guess you want to move 
`IndexTask.setFirehoseFactoryToolbox()` into `IngestSegmentFirehoseFactory`. 
Please go for it if you think it's simpler.
   
   For (b), the first segment should be partially overshadowed, so the part of 
v1 @ 1:00-2:00 should still be available as input. 
`VersionedIntervalTimeline.lookup()` is responsible for this. If 
`usedSegments` contains v1 @ 1:00-3:00 and v2 @ 2:00-4:00, `lookup()` returns 
two `TimelineObjectHolder`s: v1 @ 1:00-2:00 and v2 @ 2:00-4:00. However, if 
`usedSegments` contains only v1 @ 1:00-3:00, `lookup()` returns a single 
`TimelineObjectHolder` covering that whole segment.
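
   To make the two cases above concrete, here is a minimal toy model of that 
behavior. It is a Python sketch, not Druid's Java API: `Segment`, `Holder`, and 
this `lookup()` are hypothetical names, times are plain integer hours, and 
versions are assumed to compare as strings.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    version: str
    start: int  # hour, inclusive
    end: int    # hour, exclusive


class Holder(NamedTuple):
    start: int
    end: int
    segment: Segment


def lookup(used_segments: List[Segment], start: int, end: int) -> List[Holder]:
    """Toy stand-in for a timeline lookup: the highest version wins per slice."""
    # Cut the query interval at every segment boundary that falls inside it.
    points = {start, end}
    for seg in used_segments:
        for p in (seg.start, seg.end):
            if start < p < end:
                points.add(p)
    cuts = sorted(points)

    holders: List[Holder] = []
    for lo, hi in zip(cuts, cuts[1:]):
        covering = [s for s in used_segments if s.start <= lo and hi <= s.end]
        if not covering:
            continue  # no segment covers this slice
        winner = max(covering, key=lambda s: s.version)
        # Merge adjacent slices that are served by the same segment.
        if holders and holders[-1].segment == winner and holders[-1].end == lo:
            holders[-1] = holders[-1]._replace(end=hi)
        else:
            holders.append(Holder(lo, hi, winner))
    return holders


v1 = Segment("v1", 1, 3)
v2 = Segment("v2", 2, 4)

both = lookup([v1, v2], 1, 4)      # v1 is cut down to 1:00-2:00
only_v1 = lookup([v1], 1, 4)       # v1 is returned whole, 1:00-3:00
```

   In this model, `both` holds two entries (v1 @ 1-2 and v2 @ 2-4), while 
`only_v1` holds one entry for v1 @ 1-3.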
   
   So, you may want to adjust `intervals` of the granularitySpec for sub tasks 
as well to prune out the overshadowed parts. I'm not sure how this problem is 
currently being handled though.
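
   One possible way to compute the pruned intervals is sketched below in 
Python. It is only an illustration under assumptions: `visible_intervals()` is 
a hypothetical helper, not existing Druid code; segments are 
`(version, start, end)` tuples with integer hours, and versions compare as 
strings.

```python
from typing import List, Tuple

SegmentTuple = Tuple[str, int, int]  # (version, start hour, end hour)


def visible_intervals(
    segment: SegmentTuple, others: List[SegmentTuple]
) -> List[Tuple[int, int]]:
    """Return the parts of `segment` not covered by any higher-version segment."""
    version, seg_start, seg_end = segment
    remaining = [(seg_start, seg_end)]
    for o_version, o_start, o_end in others:
        if o_version <= version:
            continue  # only newer versions can overshadow
        next_remaining = []
        for start, end in remaining:
            if o_end <= start or end <= o_start:
                next_remaining.append((start, end))  # no overlap, keep as-is
                continue
            if start < o_start:
                next_remaining.append((start, o_start))  # piece before overlap
            if o_end < end:
                next_remaining.append((o_end, end))  # piece after overlap
        remaining = next_remaining
    return remaining


v1 = ("v1", 1, 3)
v2 = ("v2", 2, 4)

v1_intervals = visible_intervals(v1, [v2])  # only 1:00-2:00 survives
v2_intervals = visible_intervals(v2, [v1])  # v2 is untouched
```

   A sub task reading v1 would then get `v1_intervals` as its `intervals` 
instead of the segment's full 1:00-3:00.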
