Alex Behm has posted comments on this change.

Change subject: IMPALA-3687: Use type from Avro schema during schema 
reconciliation.
......................................................................


Patch Set 2:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/3331/4//COMMIT_MSG
Commit Message:

Line 7: IMPALA-3687: Use type from avro schema during schema reconcilation.
Always prefer the Avro field name during schema reconciliation.


Line 9: There is no effect that we change the name of a string/char/varchar
This blurb is very difficult to follow. It would be good to summarize the new 
behavior, i.e., in case of inconsistencies we generally prefer the name, type, 
and comment from the Avro schema, except for CHAR/VARCHAR where we retain the 
CHAR/VARCHAR type but keep the name and comment from the Avro column.


http://gerrit.cloudera.org:8080/#/c/3331/2/fe/src/main/java/com/cloudera/impala/util/AvroSchemaUtils.java
File fe/src/main/java/com/cloudera/impala/util/AvroSchemaUtils.java:

PS2, Line 116: 
             : 
             : 
             : 
> I know but the code does not "reconcile" that at all right?
Bharath is correct that the comment had an original intention. The comment was 
put here to describe what will happen in cases like "create table t (ts 
timestamp) stored as avro". The resulting table will have a STRING type. But 
the 'root cause' of this behavior is actually in how we synthesize an Avro 
schema from the column definitions, so the comment is somewhat out of place 
here. Let's move this comment into AvroSchemaConverter.java and rephrase as 
appropriate.


http://gerrit.cloudera.org:8080/#/c/3331/4/fe/src/main/java/com/cloudera/impala/util/AvroSchemaUtils.java
File fe/src/main/java/com/cloudera/impala/util/AvroSchemaUtils.java:

PS4, Line 113: A
grammar; also could be rephrased for clarity, e.g.:

A CHAR/VARCHAR column definition maps to an Avro STRING. The reconciled column 
will preserve the CHAR/VARCHAR type but prefer the column name and comment from 
the Avro column.

also move this blurb to L134


http://gerrit.cloudera.org:8080/#/c/3331/2/testdata/workloads/functional-query/queries/QueryTest/avro-schema-changes.test
File 
testdata/workloads/functional-query/queries/QueryTest/avro-schema-changes.test:

PS2, Line 71: # IMPALA-3687
> I am fine with anything.. what do you think @Alex?
Agree that avro-schema-resolution.test is more appropriate.


http://gerrit.cloudera.org:8080/#/c/3331/4/testdata/workloads/functional-query/queries/QueryTest/avro-schema-changes.test
File 
testdata/workloads/functional-query/queries/QueryTest/avro-schema-changes.test:

Line 71: # IMPALA-3687. Create a table that has different avro schema from its 
column definition
Create a table with mismatched column definitions and Avro schema.


Line 86: LOCATION 
'$FILESYSTEM_PREFIX/test-warehouse/alltypesaggmultifilesnopart_avro_snap'
remove this clause - it only adds confusion


Line 95: {"type": ["float", "null"], "name": "float_col_changed", "doc": "float 
comment changed"},
typo: changed


-- 
To view, visit http://gerrit.cloudera.org:8080/3331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3e43b2885853c2b4f207a45a873c9d7f31379cd
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Huaisi Xu <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Huaisi Xu <[email protected]>
Gerrit-HasComments: Yes

Reply via email to