josephglanville edited a comment on issue #5584: Decoupling FirehoseFactory and 
InputRowParser
URL: 
https://github.com/apache/incubator-druid/issues/5584#issuecomment-408612801
 
 
   @jihoonson we actually make use of the parse batch support as well.
   
   That said, IMO `InputRowParser` should still operate on a single iteration 
unit.
   
   A `Reader` interface that returns an iterator seems like the right choice.
   It should also remove some weirdness I remember seeing around resetting 
parsers when starting a new file, since that reset could move into the `Reader`.
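
A minimal sketch of that idea, under stated assumptions: `RowParser` is a simplified, hypothetical stand-in for `InputRowParser` (it is not Druid's actual interface), rows are plain maps, and `LineReader` and `CsvLineParser` are illustrative names only. The reader owns the file, hands the parser one line at a time, and performs the per-file reset itself.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical stand-in for InputRowParser: one iteration unit in, zero or more rows out.
interface RowParser<T> {
    List<Map<String, String>> parseBatch(T unit);

    // Clears per-file state; the Reader calls this, not the ingestion code.
    default void reset() {}
}

// Example of a stateful parser: the first line after a reset is the CSV header.
final class CsvLineParser implements RowParser<String> {
    private String[] header;

    @Override
    public List<Map<String, String>> parseBatch(String line) {
        String[] cells = line.split(",");
        if (header == null) {
            header = cells;                 // header line yields no rows
            return Collections.emptyList();
        }
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < header.length && i < cells.length; i++) {
            row.put(header[i], cells[i]);
        }
        return Collections.singletonList(row);
    }

    @Override
    public void reset() {
        header = null;
    }
}

// Hypothetical Reader: a list of lines stands in for a file.
final class LineReader implements Iterable<Map<String, String>> {
    private final List<String> lines;
    private final RowParser<String> parser;

    LineReader(List<String> lines, RowParser<String> parser) {
        this.lines = lines;
        this.parser = parser;
    }

    @Override
    public Iterator<Map<String, String>> iterator() {
        parser.reset();                     // the reset lives here, once per file
        final Iterator<String> units = lines.iterator();
        return new Iterator<Map<String, String>>() {
            private Iterator<Map<String, String>> current = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Lazily pull the next line until some parsed rows are available.
                while (!current.hasNext() && units.hasNext()) {
                    current = parser.parseBatch(units.next()).iterator();
                }
                return current.hasNext();
            }

            @Override
            public Map<String, String> next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }
}
```

Iterating a `LineReader` over the lines `a,b`, `1,2`, `3,4` yields two rows keyed by `a` and `b`. Iterating it again yields the same rows, because each call to `iterator()` resets the parser before reading the header.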
   
   I don't see a reason not to support `Reader` for realtime ingestion too. 
A `ReaderInputRowParser` that wraps an inner `InputRowParser` would be an 
awesome and flexible way to support entire files being pushed down streaming 
pipelines. Perhaps integrating it at the level below that makes more sense, 
though, so that the batch size isn't the entire file.
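
A rough sketch of the wrapping parser, with the same caveats: `RowParser` is a simplified, hypothetical stand-in for `InputRowParser`, rows are plain maps, and `KeyValueLineParser` is an illustrative inner parser. The wrapper treats one streamed message as a whole file, splits it into lines, and delegates each line to the inner parser. Note that it returns every row of the file as one batch, which is exactly the batch-size concern raised above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for InputRowParser: one iteration unit in, zero or more rows out.
interface RowParser<T> {
    List<Map<String, String>> parseBatch(T unit);

    // Clears per-file state before a new file is parsed.
    default void reset() {}
}

// Illustrative inner parser: each line is a single "key=value" pair.
final class KeyValueLineParser implements RowParser<String> {
    @Override
    public List<Map<String, String>> parseBatch(String line) {
        int eq = line.indexOf('=');
        if (eq < 0) {
            return Collections.emptyList(); // skip lines that are not key=value
        }
        return Collections.singletonList(
            Collections.singletonMap(line.substring(0, eq), line.substring(eq + 1)));
    }
}

// Wraps a per-line parser so that one message carrying a whole file can be parsed.
final class ReaderInputRowParser implements RowParser<String> {
    private final RowParser<String> inner;

    ReaderInputRowParser(RowParser<String> inner) {
        this.inner = inner;
    }

    @Override
    public List<Map<String, String>> parseBatch(String fileContents) {
        inner.reset();                      // each message is a new file
        List<Map<String, String>> rows = new ArrayList<>();
        for (String line : fileContents.split("\r?\n")) {
            if (!line.isEmpty()) {
                rows.addAll(inner.parseBatch(line));
            }
        }
        return rows;                        // batch size == the entire file
    }
}
```

Integrating at the level below would mean returning an iterator from the wrapper instead of a fully materialized list, so that downstream code can pull rows incrementally.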

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

