Amos Bird has posted comments on this change.

Change subject: IMPALA-1654: Support general predicates in most partition DDL 
operations.
......................................................................


Patch Set 9:

(46 comments)

Thanks for the detailed review. I may need some time to refactor this patch. 
The plan is:
1. delegate the partition pruning work to catalogd
2. switch to bulk ddl update
3. access control
4. testing stuff
5. new partition set class

What do you think?

http://gerrit.cloudera.org:8080/#/c/1563/9/be/src/service/query-exec-state.cc
File be/src/service/query-exec-state.cc:

Line 530:   const TDdlExecResponse* ddl_resp = 
catalog_op_executor_->ddl_exec_response();
> Add a new SetResultSet(TDdlExecResponse*) helper function, and use it here 
Done


http://gerrit.cloudera.org:8080/#/c/1563/9/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

Line 206: struct TShowFilesParams {
> I know we talked about this before but I'm bringing it up again because we'
I think the best way to avoid inconsistency is doing partition pruning in 
Catalogd. How about I cutting out all the analysis work and delegate them to 
catalogd?


http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

Line 1549: // clause in an INSERT statement. Partition clause contains dynamic 
and static
> I don't think the partition clause for INSERT is different now. Dynamic par
Done


http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetLocationStmt.java
File 
fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetLocationStmt.java:

Line 52:       
Preconditions.checkState(getPartitionSpec().getPartitionIds().size() == 1);
> use isEmpty()
It should match one and only one partition. I don't think isEmpty() can achieve 
this.


Line 71:         if (partitions.size() != 1)
> use !isEmpty()
see last reply.


http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/analysis/AlterTableStmt.java
File fe/src/main/java/com/cloudera/impala/analysis/AlterTableStmt.java:

Line 91:     // Register this table so we can evaluate partition predicates.
> Resolve and analyze this table ref so we can evaluate partition predicates.
Done


Line 92:     tableRef_ = analyzer.resolveTableRef(tableRef_);
> Add a Preconditions.checkState(tableRef_ instanceof BaseTableRef);
Done


http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/analysis/AnalysisContext.java
File fe/src/main/java/com/cloudera/impala/analysis/AnalysisContext.java:

Line 435:         Preconditions.checkState(
> I spent a lot of time thinking about the implications of these changes. I t
this ACL thing is a little obscure for me. I may need some time to figure it 
out.


http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/analysis/ComputeStatsStmt.java
File fe/src/main/java/com/cloudera/impala/analysis/ComputeStatsStmt.java:

Line 278:         tableRef_.analyze(analyzer);
> Add Preconditions check that tableRef is a base table ref.
Done


http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/analysis/LoadDataStmt.java
File fe/src/main/java/com/cloudera/impala/analysis/LoadDataStmt.java:

Line 200:         if (partitionSpec_.getPartitions().size() != 1) {
> use isEmpty()
see previous reply.


http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java
File fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java:

Line 51:   private TableRef tableRef_;
> You don't need to set this. Instead you can use:
Done


Line 57:   // Flag used to tell if it is using the old KV style PartitionSpec 
because only they
> Referring to "old" behavior is generally not useful, because somebody readi
Done


Line 58:   // require partition specs in a particular KV form. See 
PARTITION_EXPR_ERROR."
> extra " at end
Done


Line 85:     List partitionIds = Lists.newArrayList();
> List<Long>
Done


Line 130:     if (isKV_) {
> It seems more appropriate to distinguish between a PartitionSpec and a Part
This was discussed in Patchset 1. You mentioned "Ideally, we'd only have a 
single PartitionSpec to capture all scenarios." I agree we should split these 
two senarios. I will start refactoring soon.


Line 228:     // one partition SlotRef such that conjuncts like "SlotRef <op> 
Literal" works
> work
Done


Line 229:     // while "SlotRef <op> SlotRef" and "Literal <op> Literal" don't.
> Why these restrictions? We should be able to handle "SlotRef <op> SlotRef" 
This was also about indicating user errors too. I think using more than one 
partition slotrefs in a predicate is counterintuitive either as "Literal <op> 
Literal".


Line 242:     List<Expr> conjuncts = Lists.newArrayList();
> Move this into a separate function and explain the reason for the transform
Done


Line 245:       if (e instanceof BinaryPredicate) {
> To avoid deep indentation we typically prefer a style like this:
can't use continue here because of the conditionless statement at the end of 
the loop. I'm gonna refactor it to a new function.


Line 247:         if (bp.getOp().equals(EQ)) {
> In order to make this completely bulletproof I think we need to fold the co
I don't think it is necessary here. Because what we are trying to do is 
backward compatibility. And the previous KV style can only appear like this.


Line 256:             // so the old way of NULL partition specification doesn't 
break.
> "old way" is not a useful description. Instead you should describe the beha
Done


Line 276:       throw new AnalysisException("Partition expr evaluation failed 
in the backend.");
> pass in e as the cause
Done


Line 279:     if (!conjuncts.isEmpty()) {
> remove
Done


Line 294:    * Used only in ALTER TABLE ADD PARTITION.
> another indication that we should probably have two separate classes
agreed. I will refactor this together with PartitionId related stuff.


http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/catalog/Catalog.java
File fe/src/main/java/com/cloudera/impala/catalog/Catalog.java:

Line 417:     if (partitionIds.isEmpty())
> style: always use {} for multi-line ifs
Done


http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java
File fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java:

Line 639:    * Gets hdfs partitions by partition id list. Returns null if no 
match was found.
> why not return an empty list?
Done


Line 643:     for(Long id: partitionIds) {
> style: space after for
Done


Line 651:    * Gets the hdfs partition by partition id list. Returns null if no 
match was found.
> remove "list"
Done


Line 950:     if (partitions == null) return;
> Seems weird to silently accept a null input. This is a private function, le
Done


Line 951:     for(HdfsPartition partition: partitions) {
> style: space after for
Done


http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/main/java/com/cloudera/impala/planner/HdfsPartitionPruner.java
File fe/src/main/java/com/cloudera/impala/planner/HdfsPartitionPruner.java:

Line 91:    * If 'allowEmpty' is True, partitions that contain no file may 
survive during
> Suggest rephrasing to:
Done


Line 476:   public List<SlotId> getSlotIds() {
> getPartitionSlotIds()
Done


http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java:

Line 110:                   "Partition spec in ADD PARTITION/LOAD DATA 
statement should be of " +
> leave out the ADD PARTITION/LOAD DATA part
Done


Line 111:                   "this format: <COL> = <LITERAL>, <LITERAL> = <COL> 
or <COL> is null.");
> IS NULL (capital)
Done


Line 118:     AnalysisError("alter table functional.alltypes drop" +
> nit: let's use consistent spacing/chopping (some inconsistency, look at oth
Done


Line 129:         "alter table functional.alltypes add " +
> extra space here, see comment above regarding splitting up string literals
Done


Line 179:     AnalysisError("alter table functional.alltypes drop " +
> already covered in the add/drop loop at the top
Done


Line 187:         "exactly one partition SlotRef.");
> SlotRef is an internal term, prefer "column reference"
Done


Line 202:         "Partition expr does not contain any partition.");
> This msg isn't quite accurate, it should say something like "No matching pa
Done


Line 502:                   "All partition expr conjuncts should contain and 
only contain " +
> 'conjunct' is probably confusing to users, can you to something like:
Done


http://gerrit.cloudera.org:8080/#/c/1563/9/fe/src/test/java/com/cloudera/impala/analysis/ParserTest.java
File fe/src/test/java/com/cloudera/impala/analysis/ParserTest.java:

Line 1709
> Why not change these negative parser tests tests into positive ones?
Done


http://gerrit.cloudera.org:8080/#/c/1563/9/shell/impala_client.py
File shell/impala_client.py:

Line 444:     excluded_query_types = ['use', 'drop']
> some ALTER statements now expect a result set and some don't, does this cha
which tests should I run?


http://gerrit.cloudera.org:8080/#/c/1563/9/testdata/workloads/functional-query/queries/QueryTest/alter-table.test
File testdata/workloads/functional-query/queries/QueryTest/alter-table.test:

Line 515: alter table t_part drop partition (j=2, s is null)
> the old test was legitimate and interesting, let's not change it. If you wa
Done


http://gerrit.cloudera.org:8080/#/c/1563/9/testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates.test
File 
testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates.test:

Line 1: ====
> It feels like we are missing tests around the COL=NULL and COL='' rewrites 
Done


Line 42: show files in p1 partition (j<2, k="a")
> what should happen when I write something like
It shows you files in the returned partition set.


http://gerrit.cloudera.org:8080/#/c/1563/9/tests/metadata/test_ddl.py
File tests/metadata/test_ddl.py:

Line 507:   def test_partition_ddl_predicates(self, vector):
> use the unique_database test fixture here. you'll need to rebase to get it,
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/1563
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f
Gerrit-PatchSet: 9
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Amos Bird <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Amos Bird <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-HasComments: Yes

Reply via email to