Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16066 )

Change subject: WIP IMPALA-9482 Support for BINARY columns
......................................................................


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/16066/2/be/src/runtime/types.cc
File be/src/runtime/types.cc:

http://gerrit.cloudera.org:8080/#/c/16066/2/be/src/runtime/types.cc@122
PS2, Line 122: ToThrift(PrimitiveType ptype)
Should this function work as the inverse of ThriftToType() ?
If so, shouldn't it take a AuxColumnType parameter as well?


http://gerrit.cloudera.org:8080/#/c/16066/2/be/src/runtime/types.cc@222
PS2, Line 222: ColumnType::ToThrift(TColumnType* thrift_type)
Same as above, maybe it needs now an additional AuxColumnType parameter?


http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/analysis/CastExpr.java
File fe/src/main/java/org/apache/impala/analysis/CastExpr.java:

http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/analysis/CastExpr.java@126
PS2, Line 126:        // No built-in function needed for BINARY <-> STRING 
conversion.
             :         if (fromType.getPrimitiveType() == PrimitiveType.BINARY 
||
             :             toType.getPrimitiveType() == PrimitiveType.BINARY){
             :           continue;
             :         }
BINARY<->STRING conversions are no-op conversins, so maybe this block should be 
moved after L191.


http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/catalog/Type.java
File fe/src/main/java/org/apache/impala/catalog/Type.java:

http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/catalog/Type.java@820
PS2, Line 820: // STRING <->  BINARY conversion is not lossy, but implicit cast 
is not allowed.
I'm probably misunderstanding something but the commit msg suggests that BINARY 
to STRING implicit conversion is supported.

"UDF/UDAFs that expect STRING argument accept BINARY too, while in
    Hive explicit cast is needed in this case."

Please clarify.


http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/util/AvroSchemaConverter.java
File fe/src/main/java/org/apache/impala/util/AvroSchemaConverter.java:

http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/util/AvroSchemaConverter.java@154
PS2, Line 154: case BINARY: return Schema.create(Schema.Type.STRING);
I'm not sure about this, maybe binary should be converted to avro bytes. The 
avro documentation on primitive types states that:
bytes: sequence of 8-bit unsigned bytes
string: unicode character sequence


http://gerrit.cloudera.org:8080/#/c/16066/2/testdata/bin/generate-schema-statements.py
File testdata/bin/generate-schema-statements.py:

http://gerrit.cloudera.org:8080/#/c/16066/2/testdata/bin/generate-schema-statements.py@213
PS2, Line 213: string
Again, avro has bytes type which might be a better fit for binary.



--
To view, visit http://gerrit.cloudera.org:8080/16066
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36861a9ca6c2047b0d76862507c86f7f153bc582
Gerrit-Change-Number: 16066
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Thu, 18 Jun 2020 15:24:12 +0000
Gerrit-HasComments: Yes

Reply via email to