glasser commented on issue #7048: Make IngestSegmentFirehoseFactory splittable 
for parallel ingestion
URL: https://github.com/apache/incubator-druid/pull/7048#issuecomment-462904598
 
 
   But doesn't that holder come from the timeline, so it only works if the 
timeline was constructed from the full set of segments?
   
   I think what this means is that `IngestSegmentFirehoseFactory.connect` needs 
to *always* call SegmentListAction/VersionedIntervalTimeline on the *full* 
original interval, even in a subtask.  However, it would call fetchSegments 
only on the selected segments, and it should skip any elements of 
timelineSegments that aren't among the selected segments.
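
To illustrate why the timeline has to come from the full segment set, here is a minimal sketch. It does not use the real Druid classes: `Segment`, `Holder`, `buildTimeline`, and `holdersForSplit` are hypothetical stand-ins, versions are plain integers, and overshadowing is reduced to "highest version wins on each elementary interval".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class TimelineSplitSketch {
    // Hypothetical stand-in for a segment: half-open interval [start, end) plus a version.
    static final class Segment {
        final String id; final long start; final long end; final int version;
        Segment(String id, long start, long end, int version) {
            this.id = id; this.start = start; this.end = end; this.version = version;
        }
    }

    // Hypothetical stand-in for a timeline holder: the visible slice of one segment.
    static final class Holder {
        final long start; final long end; final String segmentId;
        Holder(long start, long end, String segmentId) {
            this.start = start; this.end = end; this.segmentId = segmentId;
        }
        @Override public String toString() { return segmentId + "[" + start + "," + end + ")"; }
    }

    // Cut time at every segment boundary; on each piece the highest version wins.
    // Adjacent pieces of the same segment are not merged, to keep the sketch short.
    static List<Holder> buildTimeline(List<Segment> segments) {
        TreeSet<Long> bounds = new TreeSet<>();
        for (Segment s : segments) { bounds.add(s.start); bounds.add(s.end); }
        List<Holder> holders = new ArrayList<>();
        Long prev = null;
        for (Long b : bounds) {
            if (prev != null) {
                Segment winner = null;
                for (Segment s : segments) {
                    boolean covers = s.start <= prev && s.end >= b;
                    if (covers && (winner == null || s.version > winner.version)) winner = s;
                }
                if (winner != null) holders.add(new Holder(prev, b, winner.id));
            }
            prev = b;
        }
        return holders;
    }

    // Build the timeline from the FULL set, then keep only this split's holders.
    static List<Holder> holdersForSplit(List<Segment> fullSet, Set<String> selectedIds) {
        List<Holder> out = new ArrayList<>();
        for (Holder h : buildTimeline(fullSet)) {
            if (selectedIds.contains(h.segmentId)) out.add(h);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Segment> full = Arrays.asList(new Segment("A", 0, 10, 1), new Segment("B", 5, 15, 2));
        Set<String> split = new HashSet<>(Arrays.asList("A"));
        System.out.println(holdersForSplit(full, split));       // timeline from the full set
        System.out.println(buildTimeline(full.subList(0, 1)));  // timeline from the split alone
    }
}
```

With the full set, segment A contributes only its visible slice `[0,5)`, because B overshadows it from 5 onward. Building the timeline from the split's segments alone yields `[0,10)` for A, so the subtask would also read rows that B has replaced.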
   
   This implies that instead of `segments` being an alternate option for 
`interval`, `interval` always needs to be specified.
   
   Alternatively, the split operation needs to provide both the full list of 
segments and the split-specific segment list to each split firehose factory.  
(Or at least include an extra list of overlapping segments.)
   
   
   (In other news, my TaskToolboxConsumingFirehoseFactory idea runs into 
trouble because CombiningFirehoseFactory is in druid-server which doesn't have 
access to druid-indexing-service's TaskToolbox type. I suppose the interface 
could declare `setTaskToolbox(Object)` and let IngestSegmentFirehoseFactory do 
a typecast?)
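
A rough sketch of that `setTaskToolbox(Object)` idea follows. All names other than `setTaskToolbox` are hypothetical: `FakeTaskToolbox` stands in for the real TaskToolbox type, `ToolboxConsumer` for the interface that would live in the lower-level module, and the two factory classes only model the forwarding and the cast, not any actual Druid behavior.

```java
import java.util.Arrays;
import java.util.List;

class ToolboxInjectionSketch {
    // Hypothetical stand-in for the toolbox type that only the higher-level module can see.
    static final class FakeTaskToolbox {
        final String taskId;
        FakeTaskToolbox(String taskId) { this.taskId = taskId; }
    }

    // Hypothetical interface for the lower-level module. It takes Object because
    // that module cannot reference the toolbox type.
    interface ToolboxConsumer {
        void setTaskToolbox(Object toolbox);
    }

    // Models a combining factory: it never inspects the toolbox, it only forwards it
    // to delegates that opted in by implementing ToolboxConsumer.
    static final class CombiningFactory implements ToolboxConsumer {
        private final List<?> delegates;
        CombiningFactory(List<?> delegates) { this.delegates = delegates; }
        @Override public void setTaskToolbox(Object toolbox) {
            for (Object d : delegates) {
                if (d instanceof ToolboxConsumer) ((ToolboxConsumer) d).setTaskToolbox(toolbox);
            }
        }
    }

    // Models the factory that actually needs the toolbox and does the typecast.
    static final class SegmentReadingFactory implements ToolboxConsumer {
        private FakeTaskToolbox toolbox;
        @Override public void setTaskToolbox(Object toolbox) {
            if (!(toolbox instanceof FakeTaskToolbox)) {
                throw new IllegalArgumentException("expected a FakeTaskToolbox, got " + toolbox);
            }
            this.toolbox = (FakeTaskToolbox) toolbox;
        }
        String toolboxTaskId() { return toolbox == null ? null : toolbox.taskId; }
    }

    public static void main(String[] args) {
        SegmentReadingFactory reader = new SegmentReadingFactory();
        CombiningFactory combining = new CombiningFactory(Arrays.asList(reader, new Object()));
        combining.setTaskToolbox(new FakeTaskToolbox("task-1"));
        System.out.println(reader.toolboxTaskId());
    }
}
```

The cost of this design is that the compiler no longer checks the argument type, so a wrong object is only caught at runtime by the `instanceof` guard.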
