Adam Holley has posted comments on this change. ( http://gerrit.cloudera.org:8080/8928 )
Change subject: IMPALA-4323: "SET ROW FORMAT" option added to "ALTER TABLE" command ...................................................................... Patch Set 4: (14 comments) Applied changes and answered questions. http://gerrit.cloudera.org:8080/#/c/8928/3/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: http://gerrit.cloudera.org:8080/#/c/8928/3/fe/src/main/cup/sql-parser.cup@1007 PS3, Line 1007: // row_format_val cannot be used as it conflicts with cache_op_val. > I think they conflict because cache_opt_val and row_format_val both accept Hack won't work since the code block is after the parser branch and the parser branch is the unresolvable issue. I'm uploading an alternate approach where a row_format_val_req is defined that is not optional, and row_format_val uses row_format_val_req to not duplicate the grammar. http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/analysis/AlterTableSetRowFormatStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableSetRowFormatStmt.java: http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/analysis/AlterTableSetRowFormatStmt.java@72 PS4, Line 72: affectedPartitions = ((HdfsTable) tbl).getPartitions(); > I think we should allow changing the table-level row format even if one of The current behavior of file format is to pick up the file format when the partition is instantiated. Is this incorrect for file format and it should pick up the table level formatting unless explicitly set at the partition level? I only put this check in since any delimiter formats appear to be ignored for all but these two file formats. Ran a quick test: create table tabdel (c1 string, c2 string, c3 string) partitioned by (c0 int) stored as textfile location "/temp"; insert into tabdel partition(c0=0) values ("the","quick","brown"); insert into tabdel partition(c0=1) values ("the0","quick0","brown0"); select * from tabdel; # 2 rows returned alter table tabdel set fileformat PARQUET; select * from tabdel; # 2 rows returned insert into tabdel partition(c0=2) values ("the0","quick0","brown0"); # new partition created select * from tabdel; # 3 rows returned alter table tabdel partition(c0=1) set fileformat PARQUET; select * from tabdel; # breaks the c0=1 partition alter table tabdel partition(c0=1) set fileformat TEXTFILE; select * from tabdel; # 3 rows returned http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@460 PS4, Line 460: reloadFileMetadata = alterTableSetRowFormat(tbl, > I wonder why we need to reload the file metadata for this operation and SET This is mapped to "loadTableSchema" in the load method. If I remove setting this to true, the selects do not reflect the changes unless I call "invalidate metadata" manually. http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2266 PS4, Line 2266: * changing the row format the table metadata is marked as invalid and will be > I don't think this part is correct. If we return true the file metadata is Done http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2267 PS4, Line 2267: * reloaded on the next access. The return value is true if the fileMetadata needs > Why would we even need to reload the file metadata? It appears to be cached. If not reloaded, alter statements do not take effect unless "invalidate metadata" is called manually. http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2285 PS4, Line 2285: if (tbl instanceof HdfsTable) > style nit: we always use { } for multi-line if Removed the conditional because of the precondiition moved to to top of the function. http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2290 PS4, Line 2290: Preconditions.checkArgument(tbl instanceof HdfsTable); > This can go at the top if this function. Done http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java: http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@471 PS4, Line 471: AnalyzesOk("alter table functional_seq.alltypes set row format delimited " + > also tests for the PARTITION clause Done http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@473 PS4, Line 473: AnalysisError("alter table functional_kudu.alltypes set row format delimited " + > test the PARTITION clause on an unpartitioned HDFS table Done http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@476 PS4, Line 476: AnalysisError("alter table functional_parquet.alltypes set row format delimited " + > write these negative tests in a loop over a list of databases Done http://gerrit.cloudera.org:8080/#/c/8928/4/testdata/workloads/functional-query/queries/QueryTest/alter-table.test File testdata/workloads/functional-query/queries/QueryTest/alter-table.test: http://gerrit.cloudera.org:8080/#/c/8928/4/testdata/workloads/functional-query/queries/QueryTest/alter-table.test@1338 PS4, Line 1338: # IMPALA-4323 - Test alter row format statement. Ensure alter statement updates metadata. > We generally use this style: Done http://gerrit.cloudera.org:8080/#/c/8928/4/testdata/workloads/functional-query/queries/QueryTest/alter-table.test@1339 PS4, Line 1339: create table $DATABASE.delimited (c1 string, c2 string, c3 string) > no need to specify $DATABASE because this session is already using $DATABAS Done http://gerrit.cloudera.org:8080/#/c/8928/4/testdata/workloads/functional-query/queries/QueryTest/alter-table.test@1347 PS4, Line 1347: ---- QUERY > Wow multiple QUERIES sections in a single test case work? I thought we need Done http://gerrit.cloudera.org:8080/#/c/8928/4/testdata/workloads/functional-query/queries/QueryTest/alter-table.test@1351 PS4, Line 1351: 'the','quick\002brown','fox\002jumped' > Can we craft the test in a way that doesn't change how many fields are in a Done -- To view, visit http://gerrit.cloudera.org:8080/8928 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I96e347463504915a6f33932552e4d1f61e9b1154 Gerrit-Change-Number: 8928 Gerrit-PatchSet: 4 Gerrit-Owner: Adam Holley <[email protected]> Gerrit-Reviewer: Adam Holley <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: John Russell <[email protected]> Gerrit-Reviewer: Kim Jin Chul <[email protected]> Gerrit-Comment-Date: Wed, 10 Jan 2018 17:03:29 +0000 Gerrit-HasComments: Yes
