Adam Holley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8928 )

Change subject: IMPALA-4323: "SET ROW FORMAT" option added to "ALTER TABLE" 
command
......................................................................


Patch Set 4:

(14 comments)

Applied changes and answered questions.

http://gerrit.cloudera.org:8080/#/c/8928/3/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/8928/3/fe/src/main/cup/sql-parser.cup@1007
PS3, Line 1007: // row_format_val cannot be used as it conflicts with 
cache_op_val.
> I think they conflict because cache_opt_val and row_format_val both accept
Hack won't work since the code block is after the parser branch and the parser 
branch is the unresolvable issue.  I'm uploading an alternate approach where a 
row_format_val_req is defined that is not optional, and row_format_val uses 
row_format_val_req to not duplicate the grammar.


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/analysis/AlterTableSetRowFormatStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableSetRowFormatStmt.java:

http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/analysis/AlterTableSetRowFormatStmt.java@72
PS4, Line 72:       affectedPartitions = ((HdfsTable) tbl).getPartitions();
> I think we should allow changing the table-level row format even if one of
The current behavior of file format is to pick up the file format when the 
partition is instantiated.  Is this incorrect for file format and it should 
pick up the table level formatting unless explicitly set at the partition level?

I only put this check in since any delimiter formats appear to be ignored for 
all but these two file formats.

Ran a quick test:
create table tabdel (c1 string, c2 string, c3 string) partitioned by (c0 int) 
stored as textfile location "/temp";
insert into tabdel partition(c0=0) values ("the","quick","brown");
insert into tabdel partition(c0=1) values ("the0","quick0","brown0");
select * from tabdel; # 2 rows returned
alter table tabdel set fileformat PARQUET;
select * from tabdel; # 2 rows returned
insert into tabdel partition(c0=2) values ("the0","quick0","brown0"); # new 
partition created
select * from tabdel; # 3 rows returned
alter table tabdel partition(c0=1) set fileformat PARQUET;
select * from tabdel; # breaks the c0=1 partition
alter table tabdel partition(c0=1) set fileformat TEXTFILE;
select * from tabdel; # 3 rows returned


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@460
PS4, Line 460:           reloadFileMetadata = alterTableSetRowFormat(tbl,
> I wonder why we need to reload the file metadata for this operation and SET
This is mapped to "loadTableSchema" in the load method.  If I remove setting 
this to true, the selects do not reflect the changes unless I call "invalidate 
metadata" manually.


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2266
PS4, Line 2266:    * changing the row format the table metadata is marked as 
invalid and will be
> I don't think this part is correct. If we return true the file metadata is
Done


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2267
PS4, Line 2267:    * reloaded on the next access.  The return value is true if 
the fileMetadata needs
> Why would we even need to reload the file metadata?
It appears to be cached.  If not reloaded, alter statements do not take effect 
unless "invalidate metadata" is called manually.


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2285
PS4, Line 2285:         if (tbl instanceof HdfsTable)
> style nit: we always use { } for multi-line if
Removed the conditional because of the precondiition moved to to top of the 
function.


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2290
PS4, Line 2290:         Preconditions.checkArgument(tbl instanceof HdfsTable);
> This can go at the top if this function.
Done


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@471
PS4, Line 471:     AnalyzesOk("alter table functional_seq.alltypes set row 
format delimited " +
> also tests for the PARTITION clause
Done


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@473
PS4, Line 473:     AnalysisError("alter table functional_kudu.alltypes set row 
format delimited " +
> test the PARTITION clause on an unpartitioned HDFS table
Done


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@476
PS4, Line 476:     AnalysisError("alter table functional_parquet.alltypes set 
row format delimited " +
> write these negative tests in a loop over a list of databases
Done


http://gerrit.cloudera.org:8080/#/c/8928/4/testdata/workloads/functional-query/queries/QueryTest/alter-table.test
File testdata/workloads/functional-query/queries/QueryTest/alter-table.test:

http://gerrit.cloudera.org:8080/#/c/8928/4/testdata/workloads/functional-query/queries/QueryTest/alter-table.test@1338
PS4, Line 1338: # IMPALA-4323 - Test alter row format statement.  Ensure alter 
statement updates metadata.
> We generally use this style:
Done


http://gerrit.cloudera.org:8080/#/c/8928/4/testdata/workloads/functional-query/queries/QueryTest/alter-table.test@1339
PS4, Line 1339: create table $DATABASE.delimited (c1 string, c2 string, c3 
string)
> no need to specify $DATABASE because this session is already using $DATABAS
Done


http://gerrit.cloudera.org:8080/#/c/8928/4/testdata/workloads/functional-query/queries/QueryTest/alter-table.test@1347
PS4, Line 1347: ---- QUERY
> Wow multiple QUERIES sections in a single test case work? I thought we need
Done


http://gerrit.cloudera.org:8080/#/c/8928/4/testdata/workloads/functional-query/queries/QueryTest/alter-table.test@1351
PS4, Line 1351: 'the','quick\002brown','fox\002jumped'
> Can we craft the test in a way that doesn't change how many fields are in a
Done



--
To view, visit http://gerrit.cloudera.org:8080/8928
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I96e347463504915a6f33932552e4d1f61e9b1154
Gerrit-Change-Number: 8928
Gerrit-PatchSet: 4
Gerrit-Owner: Adam Holley <[email protected]>
Gerrit-Reviewer: Adam Holley <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: John Russell <[email protected]>
Gerrit-Reviewer: Kim Jin Chul <[email protected]>
Gerrit-Comment-Date: Wed, 10 Jan 2018 17:03:29 +0000
Gerrit-HasComments: Yes

Reply via email to