glasser commented on issue #7048: Make IngestSegmentFirehoseFactory splittable 
for parallel ingestion
URL: https://github.com/apache/incubator-druid/pull/7048#issuecomment-462529816
 
 
   Good question. I was kind of imagining you would set taskGranularity equal 
to your output segmentGranularity so that each subtask would write one segment. 
You're right that things can get unbalanced though.
   
   Are you imagining that the split implementation would query the segment 
metadata to learn all the segment sizes, and the user would specify bytes per 
split? Would we try not to divide any input segment, and instead just chunk 
whole segments together?
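
   To make the question concrete, here is a rough sketch of what "chunk whole 
segments together up to a byte budget" could look like. Everything in it is 
hypothetical: `SizeBasedSplitter`, `SegmentInfo`, and `maxBytesPerSplit` are 
names made up for illustration, not Druid classes or settings, and the segment 
sizes are assumed to have already been read from segment metadata. It packs 
greedily in the order given and never divides a segment, so a segment larger 
than the budget simply gets a split of its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from Druid.
final class SizeBasedSplitter {
    /** Minimal stand-in for the per-segment metadata a split planner would need. */
    static final class SegmentInfo {
        final String id;
        final long sizeBytes;

        SegmentInfo(String id, long sizeBytes) {
            this.id = id;
            this.sizeBytes = sizeBytes;
        }
    }

    /**
     * Greedily packs whole segments, in the order given, into splits whose
     * total size stays at or below maxBytesPerSplit. A single segment that is
     * larger than the budget is never divided; it ends up alone in a split.
     */
    static List<List<SegmentInfo>> chunk(List<SegmentInfo> segments, long maxBytesPerSplit) {
        List<List<SegmentInfo>> splits = new ArrayList<>();
        List<SegmentInfo> current = new ArrayList<>();
        long currentBytes = 0;
        for (SegmentInfo segment : segments) {
            // Close the current split if adding this segment would exceed the budget.
            if (!current.isEmpty() && currentBytes + segment.sizeBytes > maxBytesPerSplit) {
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(segment);
            currentBytes += segment.sizeBytes;
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }
}
```

   Calling `SizeBasedSplitter.chunk(segments, maxBytesPerSplit)` returns one 
list of segments per subtask. Greedy packing in input order is just the 
simplest choice; it keeps time-adjacent segments together but does not try to 
balance the splits optimally.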
   
   This seems like a reasonable option to want, but I kind of feel like people 
might still want to get started with the simpler "I know my peons can handle an 
hour of data, just split by hours" approach anyway... so implementing one of 
these options doesn't necessarily stop us from implementing the other later.
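
   For comparison, the simpler "just split by hours" option could be sketched 
roughly as below. Again the names are hypothetical (`TimeBucketSplitter` and 
`splitByTime` are not Druid APIs), the `granularity` argument stands in for the 
taskGranularity setting discussed above, and each input segment is assumed to 
fall entirely inside one bucket, so it is keyed by its interval start only.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these names come from Druid.
final class TimeBucketSplitter {
    /** Rounds a timestamp down to the start of its granularity bucket (UTC-aligned). */
    static Instant bucketStart(Instant t, Duration granularity) {
        long g = granularity.toMillis();
        return Instant.ofEpochMilli(Math.floorDiv(t.toEpochMilli(), g) * g);
    }

    /**
     * Groups segment ids by the time bucket containing each segment's start.
     * Assumes no segment spans more than one bucket. The returned map is
     * sorted by bucket start; each entry would become one subtask.
     */
    static Map<Instant, List<String>> splitByTime(Map<String, Instant> segmentStarts,
                                                  Duration granularity) {
        Map<Instant, List<String>> splits = new TreeMap<>();
        for (Map.Entry<String, Instant> entry : segmentStarts.entrySet()) {
            Instant bucket = bucketStart(entry.getValue(), granularity);
            splits.computeIfAbsent(bucket, k -> new ArrayList<>()).add(entry.getKey());
        }
        return splits;
    }
}
```

   With `Duration.ofHours(1)` this yields one split per hour of input data. 
It needs no size information at all, which is why it is the easier one to 
start with, at the cost of unbalanced splits when some hours hold far more 
data than others.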
