gianm opened a new pull request, #15692:
URL: https://github.com/apache/druid/pull/15692

   Fixes a bug where KafkaInputFormat parsed incoming JSON as 
newline-delimited (as in batch ingestion) rather than as a whole entity 
(as is typical for streaming ingestion).
   
   Background:
   
   JsonInputFormat has a `withLineSplittable` method that can be used to 
control whether JSON is read line-by-line, or as a whole. The intent is that in 
streaming ingestion, `lineSplittable` is false (although it can be overridden 
by `assumeNewlineDelimited`), and in batch ingestion, `lineSplittable` is true.
   
   When a `json` format is wrapped by a `kafka` format, `lineSplittable` 
isn't set properly. This patch updates KafkaInputFormat to set it on an 
underlying `json` format.
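
   To illustrate the difference, here is a minimal, self-contained sketch. It is not the code in this patch: `SketchJsonFormat`, `SketchKafkaFormat`, `split`, and `splitValue` are hypothetical stand-ins, and only the `withLineSplittable` name is borrowed from the description above. It assumes the wrapper forces `lineSplittable` to false on the wrapped format, so a pretty-printed message is handed to the parser as one entity instead of being cut at each newline.

```java
import java.util.List;

// Hypothetical stand-in for a JSON input format; not Druid's actual class.
final class SketchJsonFormat {
    private final boolean lineSplittable;

    SketchJsonFormat(boolean lineSplittable) {
        this.lineSplittable = lineSplittable;
    }

    // Returns a copy with the flag changed, leaving the original untouched.
    SketchJsonFormat withLineSplittable(boolean value) {
        return new SketchJsonFormat(value);
    }

    // Returns the chunks that would each be handed to a JSON parser.
    List<String> split(String payload) {
        if (lineSplittable) {
            return List.of(payload.split("\n")); // batch style: one chunk per line
        }
        return List.of(payload); // streaming style: the whole payload is one chunk
    }
}

// Hypothetical wrapper, assumed to force whole-entity parsing on the wrapped format.
final class SketchKafkaFormat {
    private final SketchJsonFormat valueFormat;

    SketchKafkaFormat(SketchJsonFormat valueFormat) {
        this.valueFormat = valueFormat.withLineSplittable(false);
    }

    List<String> splitValue(String payload) {
        return valueFormat.split(payload);
    }
}

final class LineSplitSketch {
    public static void main(String[] args) {
        // A single pretty-printed JSON message that spans three lines.
        String payload = "{\n  \"a\": 1\n}";
        SketchJsonFormat batchStyle = new SketchJsonFormat(true);

        // Line splitting cuts the message into three fragments, none valid JSON.
        System.out.println(batchStyle.split(payload).size()); // prints 3

        // The wrapper overrides the flag, so the message stays whole.
        System.out.println(new SketchKafkaFormat(batchStyle).splitValue(payload).size()); // prints 1
    }
}
```

   In this sketch the wrapper applies the override in its constructor; where the real patch applies it is not shown here.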
   
   The tests for KafkaInputFormat were overriding the `lineSplittable` 
parameter explicitly, which made them unrepresentative of what happens in 
production. Now they omit the parameter and get the production behavior.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

