Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/16066 )
Change subject: WIP IMPALA-9482 Support for BINARY columns ...................................................................... Patch Set 2: (6 comments) http://gerrit.cloudera.org:8080/#/c/16066/2/be/src/runtime/types.cc File be/src/runtime/types.cc: http://gerrit.cloudera.org:8080/#/c/16066/2/be/src/runtime/types.cc@122 PS2, Line 122: ToThrift(PrimitiveType ptype) Should this function work as the inverse of ThriftToType() ? If so, shouldn't it take a AuxColumnType parameter as well? http://gerrit.cloudera.org:8080/#/c/16066/2/be/src/runtime/types.cc@222 PS2, Line 222: ColumnType::ToThrift(TColumnType* thrift_type) Same as above, maybe it needs now an additional AuxColumnType parameter? http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/analysis/CastExpr.java File fe/src/main/java/org/apache/impala/analysis/CastExpr.java: http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/analysis/CastExpr.java@126 PS2, Line 126: // No built-in function needed for BINARY <-> STRING conversion. : if (fromType.getPrimitiveType() == PrimitiveType.BINARY || : toType.getPrimitiveType() == PrimitiveType.BINARY){ : continue; : } BINARY<->STRING conversions are no-op conversins, so maybe this block should be moved after L191. http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/catalog/Type.java File fe/src/main/java/org/apache/impala/catalog/Type.java: http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/catalog/Type.java@820 PS2, Line 820: // STRING <-> BINARY conversion is not lossy, but implicit cast is not allowed. I'm probably misunderstanding something but the commit msg suggests that BINARY to STRING implicit conversion is supported. "UDF/UDAFs that expect STRING argument accept BINARY too, while in Hive explicit cast is needed in this case." Please clarify. http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/util/AvroSchemaConverter.java File fe/src/main/java/org/apache/impala/util/AvroSchemaConverter.java: http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/util/AvroSchemaConverter.java@154 PS2, Line 154: case BINARY: return Schema.create(Schema.Type.STRING); I'm not sure about this, maybe binary should be converted to avro bytes. The avro documentation on primitive types states that: bytes: sequence of 8-bit unsigned bytes string: unicode character sequence http://gerrit.cloudera.org:8080/#/c/16066/2/testdata/bin/generate-schema-statements.py File testdata/bin/generate-schema-statements.py: http://gerrit.cloudera.org:8080/#/c/16066/2/testdata/bin/generate-schema-statements.py@213 PS2, Line 213: string Again, avro has bytes type which might be a better fit for binary. -- To view, visit http://gerrit.cloudera.org:8080/16066 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I36861a9ca6c2047b0d76862507c86f7f153bc582 Gerrit-Change-Number: 16066 Gerrit-PatchSet: 2 Gerrit-Owner: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Attila Jeges <[email protected]> Gerrit-Reviewer: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Thu, 18 Jun 2020 15:24:12 +0000 Gerrit-HasComments: Yes
