gianm commented on a change in pull request #10232:
URL: https://github.com/apache/druid/pull/10232#discussion_r469298104
##########
File path: web-console/src/utils/ingestion-spec.tsx
##########
@@ -2663,7 +2663,7 @@ function guessInputFormat(sampleData: string[]):
InputFormat {
return inputFormatFromType('orc');
}
// Avro OCF 4 byte magic header:
https://avro.apache.org/docs/current/spec.html#Object+Container+Files
- if (sampleDatum.startsWith('Obj1')) {
+ if (sampleDatum.startsWith('Obj')) {
Review comment:
It'd be really great if we could test the proper thing here, but I'm not
sure how either.
@vogievetsky Any ideas about how we could actually check all 4 bytes here?
(See https://github.com/apache/druid/issues/10229; the "1" should actually be
the unprintable character 0x01.)
Also, are there tests for this logic block in general? Where could we add
one for this particular case?
##########
File path:
extensions-core/avro-extensions/src/test/java/org/apache/druid/data/input/avro/AvroFlattenerMakerTest.java
##########
@@ -78,7 +78,8 @@ public void getRootField()
flattener.getRootField(record, "someNull")
);
Assert.assertEquals(
- record.getSomeFixed(),
+ // Casted to an array by transformValue
+ record.getSomeFixed().bytes(),
Review comment:
What data type is "someFixed"? This code is expecting it to be
translated as a `byte[]`. I think most downstream users of these parsed datums
will be interpreting them using Rows.objectToStrings, which is going to
translate a `byte[]` into a base64 encoded string. Is that sensible behavior?
Would you be able to add some documentation to this page:
https://druid.apache.org/docs/latest/ingestion/data-formats.html#avro-ocf about
how the various Avro types that we support will be handled?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]