josephglanville commented on issue #5584: Decoupling FirehoseFactory and 
InputRowParser
URL: 
https://github.com/apache/incubator-druid/issues/5584#issuecomment-410030450
 
 
   One thing that has become apparent is the `StringInputRowParser` is a bit of 
a bad citizen.
   It extends `ByteBufferInputRowParser` aka `InputRowParser<ByteBuffer>`.
   
   This is not a problem in and of itself and it's completely reasonable to 
have a input row parser designed to read UTF-8 encoded strings from 
`ByteBuffer`.
   
   The issue is that isn't how it's used, it's primarily used via function with 
signature the `InputRow parse(String)`, the fact this is `String` rather than 
`ByteBuffer` not only violates the assumption of the interface but it also 
means that users of `StringInputRowParser` aren't using the `InputRowParser` 
interface at all, effectively making any code built on StringInputRowParser not 
very generic.
   
   It seems to me that it would make sense for there to be 
`StringInputRowParser` that is actually `InputRowParser<String>` and 
`StringDecoderInputRowParser` which inherits from `ByteBufferInputRowParser` 
reads data from ByteBuffer decoding UTF-8.
   
   I would also move to put add a `reset()` to the `InputRowParser` interface 
as it's already on `Parser`. Currently there is `startFileFromBeginning()` on 
`StringInputRowParser` that calls the same function on the inner `Parser`. 
Seems as this may be useful in general for resetting the state of stateful 
parsers when opening the next object/input stream.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org

Reply via email to