clintropolis commented on issue #11003:
URL: https://github.com/apache/druid/issues/11003#issuecomment-800791722


   I am not familiar enough with protobuf encoded files to know if this will 
work, but the error you are seeing comes from combining `inputSource` with a 
`parser`. To avoid it you need to use the older `parser` based ingestion spec, 
see https://druid.apache.org/docs/latest/ingestion/index.html#parser-deprecated. 
Parser based specs take no `inputSource` or `inputFormat`; iirc "firehoses" 
are used in place of the input source.
   
   The protobuf parser expects to be handed the bytes of one encoded proto 
message at a time, so any file reader would need to split the underlying file 
into individual message blobs to feed to the parser. That is the part that 
makes me unsure that protobuf files would work correctly with batch ingestion. 
For example, the CSV parser is fed single lines from an underlying text file, 
where each line is expected to be a CSV row. A protobuf file parser would need 
to do something similar with the binary message blobs in the file, and I'm not 
sure that just having the message schema is enough for that to work.
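
   To illustrate the framing problem: encoded proto messages are not 
self-delimiting, so a file needs some convention that marks where each message 
ends. One common convention (the one used by `writeDelimitedTo` / 
`parseDelimitedFrom` in protobuf-java) is a varint length prefix before each 
message. The sketch below assumes the file was written that way, which may not 
be true for the files in this issue, and it is not how Druid reads anything. 
It only splits a stream into per-message blobs; the function names are made up 
for the example and the payloads are placeholder bytes.

```python
import io
from typing import BinaryIO, Iterator, Optional


def encode_varint(value: int) -> bytes:
    """Encode a non-negative int as a base-128 varint (low 7 bits first)."""
    if value < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(stream: BinaryIO) -> Optional[int]:
    """Read one varint; return None if the stream ends cleanly before it."""
    result = 0
    shift = 0
    while True:
        raw = stream.read(1)
        if not raw:
            if shift == 0:
                return None  # clean end of stream between messages
            raise EOFError("stream ended inside a length prefix")
        byte = raw[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7


def write_delimited(stream: BinaryIO, payload: bytes) -> None:
    """Write one message as <varint length><payload>."""
    stream.write(encode_varint(len(payload)))
    stream.write(payload)


def read_delimited(stream: BinaryIO) -> Iterator[bytes]:
    """Yield each length-prefixed blob, one per encoded message."""
    while True:
        size = read_varint(stream)
        if size is None:
            return
        blob = stream.read(size)
        if len(blob) != size:
            raise EOFError("stream ended inside a message")
        yield blob


if __name__ == "__main__":
    buf = io.BytesIO()
    # Placeholder payloads standing in for encoded proto messages.
    for fake_message in (b"\x08\x01", b"", b"\x12\x03abc"):
        write_delimited(buf, fake_message)
    buf.seek(0)
    for blob in read_delimited(buf):
        print(len(blob), blob)
```

   Each blob yielded by `read_delimited` is what a per-message parser would 
expect to receive. If the file is just messages concatenated with no framing, 
there is nothing to split on, and the schema alone does not recover the 
boundaries.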
   
   I think that even if a protobuf `inputFormat` did exist, it might still 
have this issue. A file based protobuf decoder might need to be a specialized 
`InputFormat` implementation, separate from the format that processes 
individual streaming messages (again, I'm not very familiar with the file side 
of things).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


