Bharath Vissapragada has posted comments on this change. Change subject: IMPALA:3314: Fix avro scanner crash in partitioned multi-file-format tables ......................................................................
Patch Set 1: (18 comments) http://gerrit.cloudera.org:8080/#/c/3045/1//COMMIT_MSG Commit Message: Line 7: IMPALA:3314: Fix avro scanner crash in partitioned multi-file-format tables > I think a more appropriate title would be something lile: Done Line 9: Bug: Impalads crash if we query a partitioned table with multiple file > Description seems too complicated. I think Impalads will crash when queryin Done Line 12: Cause: This happens because we don't set avroSchema_ in HdfsTable during > during metadata load -> during the alteration operation Done http://gerrit.cloudera.org:8080/#/c/3045/1/be/src/exec/hdfs-avro-scanner.cc File be/src/exec/hdfs-avro-scanner.cc: Line 63: return Status("Missing avro schema in scan node"); > Let's include the possible workaround. something like: Done Line 739: if (record.schema == NULL) return Status("Missing avro schema in scan node"); > same comment regarding mentioning workarounds Done http://gerrit.cloudera.org:8080/#/c/3045/1/fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java File fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java: Line 1007: if (loadTableSchema) loadSchema(client, msTbl); > call setAvroSchema() in the loadTableSchema case Done Line 1012: setAvroSchema(client, msTbl); > why do we need this? Removed. Line 1261: * Sets avroSchema_ if the table or any of the partitions in the table are backed > are stored as Avro. Done Line 1262: * by AVRO file format. Additionally this method also reconciles the schema if > missing comma after Additionally Done Line 1263: * the column definitions from the metastore differ from the actual schema. > 'actual schema' is vague, say 'differ from the Avro schema' Done Line 1267: String inputFormat = msTbl.getSd().getInputFormat(); > Add Preconditions check that asserts loadSchema() has already been called. Done Line 1268: if (HdfsFileFormat.fromJavaClassName(inputFormat) == HdfsFileFormat.AVRO || > Is this first check still needed? An unpartitioned table should have a sing I didn't remove it so that I can short-circuit hasParitionOfFileFormat() call, but looks like an unnecessary optimization. Removed it. Line 1270: // Look for avro schema in TBLPROPERTIES and in SERDEPROPERTIES, with the latter > Use "Avro" everywhere consistently Done Line 1277: // Set the avroSchema_ if it is found in any of the search locations. This is > Comment could be misleading because it's not just "useful" in the described Done Line 1291: // At this point, nonPartFieldSchemas_ already has the field schemas from HMS. > Confusing comment because this is a precondition of this function, and not Given we are adding the Precond check above, removing this. Line 1330: * Loads table schema from Hive Metastore. It also loads column stats. > Loads the table schema and column stats from the Hive Metastore. Done http://gerrit.cloudera.org:8080/#/c/3045/1/testdata/datasets/functional/functional_schema_template.sql File testdata/datasets/functional/functional_schema_template.sql: Line 1902: functional > Let's try not to add one-off-tables for bugfixes in the schema template. Im Done http://gerrit.cloudera.org:8080/#/c/3045/1/testdata/workloads/functional-query/queries/QueryTest/mixed-format.test File testdata/workloads/functional-query/queries/QueryTest/mixed-format.test: Line 28: # IMPALA-3314: Run a query on multi-file-format table with an avro partition > This test feels more like it should go into TestAvroSchemaResolution. Done -- To view, visit http://gerrit.cloudera.org:8080/3045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I09262d3a7b85a2263c721f3beafd0cab2a1bdf4b Gerrit-PatchSet: 1 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-HasComments: Yes
