Amos Bird has posted comments on this change. Change subject: IMPALA-1654: Support general predicates in most partition DDL operations. ......................................................................
Patch Set 9: (46 comments) Thanks for the detailed review. I may need some time to refactor this patch. The plan is: 1. delegate the partition pruning work to catalogd 2. switch to bulk ddl update 3. access control 4. testing stuff 5. new partition set class What do you think? http://gerrit.cloudera.org:8080/#/c/1563/9/be/src/service/query-exec-state.cc File be/src/service/query-exec-state.cc: Line 530: const TDdlExecResponse* ddl_resp = catalog_op_executor_->ddl_exec_response(); > Add a new SetResultSet(TDdlExecResponse*) helper function, and use it here Done http://gerrit.cloudera.org:8080/#/c/1563/9/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: Line 206: struct TShowFilesParams { > I know we talked about this before but I'm bringing it up again because we' I think the best way to avoid inconsistency is doing partition pruning in Catalogd. How about I cutting out all the analysis work and delegate them to catalogd? http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: Line 1549: // clause in an INSERT statement. Partition clause contains dynamic and static > I don't think the partition clause for INSERT is different now. Dynamic par Done http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetLocationStmt.java File fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetLocationStmt.java: Line 52: Preconditions.checkState(getPartitionSpec().getPartitionIds().size() == 1); > use isEmpty() It should match one and only one partition. I don't think isEmpty() can achieve this. Line 71: if (partitions.size() != 1) > use !isEmpty() see last reply. http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/analysis/AlterTableStmt.java File fe/src/main/java/com/cloudera/impala/analysis/AlterTableStmt.java: Line 91: // Register this table so we can evaluate partition predicates. > Resolve and analyze this table ref so we can evaluate partition predicates. Done Line 92: tableRef_ = analyzer.resolveTableRef(tableRef_); > Add a Preconditions.checkState(tableRef_ instanceof BaseTableRef); Done http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/analysis/AnalysisContext.java File fe/src/main/java/com/cloudera/impala/analysis/AnalysisContext.java: Line 435: Preconditions.checkState( > I spent a lot of time thinking about the implications of these changes. I t this ACL thing is a little obscure for me. I may need some time to figure it out. http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/analysis/ComputeStatsStmt.java File fe/src/main/java/com/cloudera/impala/analysis/ComputeStatsStmt.java: Line 278: tableRef_.analyze(analyzer); > Add Preconditions check that tableRef is a base table ref. Done http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/analysis/LoadDataStmt.java File fe/src/main/java/com/cloudera/impala/analysis/LoadDataStmt.java: Line 200: if (partitionSpec_.getPartitions().size() != 1) { > use isEmpty() see previous reply. http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java File fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java: Line 51: private TableRef tableRef_; > You don't need to set this. Instead you can use: Done Line 57: // Flag used to tell if it is using the old KV style PartitionSpec because only they > Referring to "old" behavior is generally not useful, because somebody readi Done Line 58: // require partition specs in a particular KV form. See PARTITION_EXPR_ERROR." > extra " at end Done Line 85: List partitionIds = Lists.newArrayList(); > List<Long> Done Line 130: if (isKV_) { > It seems more appropriate to distinguish between a PartitionSpec and a Part This was discussed in Patchset 1. You mentioned "Ideally, we'd only have a single PartitionSpec to capture all scenarios." I agree we should split these two senarios. I will start refactoring soon. Line 228: // one partition SlotRef such that conjuncts like "SlotRef <op> Literal" works > work Done Line 229: // while "SlotRef <op> SlotRef" and "Literal <op> Literal" don't. > Why these restrictions? We should be able to handle "SlotRef <op> SlotRef" This was also about indicating user errors too. I think using more than one partition slotrefs in a predicate is counterintuitive either as "Literal <op> Literal". Line 242: List<Expr> conjuncts = Lists.newArrayList(); > Move this into a separate function and explain the reason for the transform Done Line 245: if (e instanceof BinaryPredicate) { > To avoid deep indentation we typically prefer a style like this: can't use continue here because of the conditionless statement at the end of the loop. I'm gonna refactor it to a new function. Line 247: if (bp.getOp().equals(EQ)) { > In order to make this completely bulletproof I think we need to fold the co I don't think it is necessary here. Because what we are trying to do is backward compatibility. And the previous KV style can only appear like this. Line 256: // so the old way of NULL partition specification doesn't break. > "old way" is not a useful description. Instead you should describe the beha Done Line 276: throw new AnalysisException("Partition expr evaluation failed in the backend."); > pass in e as the cause Done Line 279: if (!conjuncts.isEmpty()) { > remove Done Line 294: * Used only in ALTER TABLE ADD PARTITION. > another indication that we should probably have two separate classes agreed. I will refactor this together with PartitionId related stuff. http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/catalog/Catalog.java File fe/src/main/java/com/cloudera/impala/catalog/Catalog.java: Line 417: if (partitionIds.isEmpty()) > style: always use {} for multi-line ifs Done http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java File fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java: Line 639: * Gets hdfs partitions by partition id list. Returns null if no match was found. > why not return an empty list? Done Line 643: for(Long id: partitionIds) { > style: space after for Done Line 651: * Gets the hdfs partition by partition id list. Returns null if no match was found. > remove "list" Done Line 950: if (partitions == null) return; > Seems weird to silently accept a null input. This is a private function, le Done Line 951: for(HdfsPartition partition: partitions) { > style: space after for Done http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/planner/HdfsPartitionPruner.java File fe/src/main/java/com/cloudera/impala/planner/HdfsPartitionPruner.java: Line 91: * If 'allowEmpty' is True, partitions that contain no file may survive during > Suggest rephrasing to: Done Line 476: public List<SlotId> getSlotIds() { > getPartitionSlotIds() Done http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java File fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java: Line 110: "Partition spec in ADD PARTITION/LOAD DATA statement should be of " + > leave out the ADD PARTITION/LOAD DATA part Done Line 111: "this format: <COL> = <LITERAL>, <LITERAL> = <COL> or <COL> is null."); > IS NULL (capital) Done Line 118: AnalysisError("alter table functional.alltypes drop" + > nit: let's use consistent spacing/chopping (some inconsistency, look at oth Done Line 129: "alter table functional.alltypes add " + > extra space here, see comment above regarding splitting up string literals Done Line 179: AnalysisError("alter table functional.alltypes drop " + > already covered in the add/drop loop at the top Done Line 187: "exactly one partition SlotRef."); > SlotRef is an internal term, prefer "column reference" Done Line 202: "Partition expr does not contain any partition."); > This msg isn't quite accurate, it should say something like "No matching pa Done Line 502: "All partition expr conjuncts should contain and only contain " + > 'conjunct' is probably confusing to users, can you to something like: Done http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/test/java/com/cloudera/impala/analysis/ParserTest.java File fe/src/test/java/com/cloudera/impala/analysis/ParserTest.java: Line 1709 > Why not change these negative parser tests tests into positive ones? Done http://gerrit.cloudera.org:8080/#/c/1563/9/shell/impala_client.py File shell/impala_client.py: Line 444: excluded_query_types = ['use', 'drop'] > some ALTER statements now expect a result set and some don't, does this cha which tests should I run? http://gerrit.cloudera.org:8080/#/c/1563/9/testdata/workloads/functional-query/queries/QueryTest/alter-table.test File testdata/workloads/functional-query/queries/QueryTest/alter-table.test: Line 515: alter table t_part drop partition (j=2, s is null) > the old test was legitimate and interesting, let's not change it. If you wa Done http://gerrit.cloudera.org:8080/#/c/1563/9/testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates.test File testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates.test: Line 1: ==== > It feels like we are missing tests around the COL=NULL and COL='' rewrites Done Line 42: show files in p1 partition (j<2, k="a") > what should happen when I write something like It shows you files in the returned partition set. http://gerrit.cloudera.org:8080/#/c/1563/9/tests/metadata/test_ddl.py File tests/metadata/test_ddl.py: Line 507: def test_partition_ddl_predicates(self, vector): > use the unique_database test fixture here. you'll need to rebase to get it, Done -- To view, visit http://gerrit.cloudera.org:8080/1563 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f Gerrit-PatchSet: 9 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Amos Bird <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Amos Bird <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Jim Apple <[email protected]> Gerrit-HasComments: Yes
