ahmedahamid commented on a change in pull request #2792: [GOBBLIN-933] add
support for array of unions in json schema
URL: https://github.com/apache/incubator-gobblin/pull/2792#discussion_r341779189
##########
File path: gobblin-core/src/test/resources/converter/jsonToAvroRecord.json
##########
@@ -1,4 +1,6 @@
-{"nullableField": null,
+{
+ "arrayFieldWithUnion": ["arrU1", "arrU2", "arrU3"],
Review comment:
Just following up: I tried adding a `null` value to this array but the
`JsonRecordAvroSchemaToAvroConverterTest.testConverter` failed:
```java
Could not convert field arrayFieldWithUnion
org.apache.gobblin.converter.DataConversionException: Could not convert
field arrayFieldWithUnion
at
org.apache.gobblin.converter.avro.JsonRecordAvroSchemaToAvroConverter.convertNestedRecord(JsonRecordAvroSchemaToAvroConverter.java:127)
at
org.apache.gobblin.converter.avro.JsonRecordAvroSchemaToAvroConverter.convertRecord(JsonRecordAvroSchemaToAvroConverter.java:71)
at
org.apache.gobblin.converter.avro.JsonRecordAvroSchemaToAvroConverterTest.testConverter(JsonRecordAvroSchemaToAvroConverterTest.java:69)
...
Caused by: java.lang.RuntimeException: Field: arrayFieldWithUnion is not
nullable and contains a null value
at
org.apache.gobblin.converter.avro.JsonElementConversionFactory$JsonElementConverter.convert(JsonElementConversionFactory.java:278)
at
org.apache.gobblin.converter.avro.JsonElementConversionWithAvroSchemaFactory$ArrayConverter.convertField(JsonElementConversionWithAvroSchemaFactory.java:98)
at
org.apache.gobblin.converter.avro.JsonElementConversionFactory$JsonElementConverter.convert(JsonElementConversionFactory.java:280)
at
org.apache.gobblin.converter.avro.JsonRecordAvroSchemaToAvroConverter.convertNestedRecord(JsonRecordAvroSchemaToAvroConverter.java:125)
... 51 more
```
I think the key issue here is that `UnionConverter` does not determine its
nullability correctly; it uses the nullability it receives from
`JsonElementConversionWithAvroSchemaFactory.getConvertor()` to decide its own
nullability. This is incorrect because even though `arrayFieldWithUnion` is a
non-nullable array, the union within its `items` _is_ nullable, i.e.
`UnionConverter` would return `false` if you invoke `isNullable()` on it even
though `["string", "null"]` is nullable because it is construing
`arrayFieldWithUnion`'s non-nullability as its own (see the values
`UnionConverter` provides to `super`).
One possible fix could be to have `UnionConverter` override `isNullable()`
so it can decide its own nullability based on its schemas, e.g.
```java
@Override
public boolean isNullable() {
return this.firstSchema.getType() == Schema.Type.NULL ||
this.secondSchema.getType() == Schema.Type.NULL;
}
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services