[Impala-ASF-CR] IMPALA-1654: Partition expr in DDL operations.

Alex Behm (Code Review) Thu, 18 Aug 2016 00:01:24 -0700

Alex Behm has posted comments on this change.

Change subject: IMPALA-1654: Partition expr in DDL operations.
......................................................................



Patch Set 1:

(36 comments)

http://gerrit.cloudera.org:8080/#/c/3942/1/be/src/service/query-exec-state.cc
File be/src/service/query-exec-state.cc:

Line 937: void ImpalaServer::QueryExecState::SetResultSet() {
make const TDdlExecResponse* a param to this function to make it clear when it 
makes sense to call this


http://gerrit.cloudera.org:8080/#/c/3942/1/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

Line 897:     // Need to check partition is an opt_partition_spec in analyzer 
to avoid
suggest minor rephrase:

// Need to check in analysis that the partition set only matches a single 
partition. Avoids a reduce/reduce conflict and allows the user to select a 
partition without fully specifying all partition-key values.


Line 1562: // A partition spec is a set of expressions used to select a list of 
partitions
Update comment (still says 'partition spec')


http://gerrit.cloudera.org:8080/#/c/3942/1/fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetLocationStmt.java
File 
fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetLocationStmt.java:

Line 79:               "can only match one partition.");
in the error message include the names of the matched partitions


http://gerrit.cloudera.org:8080/#/c/3942/1/fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetStmt.java
File fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetStmt.java:

Line 37:   PartitionSet getPartitionSet() {
single line


http://gerrit.cloudera.org:8080/#/c/3942/1/fe/src/main/java/com/cloudera/impala/analysis/AnalysisContext.java
File fe/src/main/java/com/cloudera/impala/analysis/AnalysisContext.java:

Line 442:             analysisResult_.isAlterTableStmt() ||
Please take a look at my comment on the previous patch set. Summary: We should 
remove these checks and temporarily disable auth when analyzing a PartitionSet.


http://gerrit.cloudera.org:8080/#/c/3942/1/fe/src/main/java/com/cloudera/impala/analysis/PartitionSet.java
File fe/src/main/java/com/cloudera/impala/analysis/PartitionSet.java:

Line 20:  * Represents a partition set - a collection of partition expressions.
Represents a set of partitions resulting from evaluating a list of partition 
conjuncts against a table's partition list.


Line 31:   // does or does not contain any partition matched by the given 
partition expr.
exprs


Line 34:   private Privilege privilegeRequirement_;
There are a bunch of common members and analysis code between PartitionSpec and 
PartitionSet. Some of the analysis  code is subtle, so it would be good to 
factor out the commonalities into a PartitionSpecBase.


Line 44:   public String getTbl() {
single lines where possible


Line 93:             "Dynamic partition spec is not supported.");
Error is somewhat cryptic. It should contain:
1. The offending expression
2. An understandable explanation for the failure and what the expected input is

Maybe something like the following, you get the idea:

Invalid column reference 'expr.toSql() goes here' in partition spec. A 
partition spec may only contain predicates.


Line 99:     // Verify the partition exprs. Make sure every conjunct only 
contains exactly
Update comment.


Line 103:     for (Expr e : conjuncts) {
We still need to check that e only references partition columns - and no 
non-partition columns.

Also add a test for that.


Line 105:       if (e.isLiteral()) {
I think it's reasonable to disallow constant expressions here, so you can just 
check e.isConstant(). In that case I think we should throw rather than just 
issue a warning.


Line 132:   // Transform <COL> = NULL into IsNull expr; <String COL> = '' into 
IsNull expr and
The NULL transformation looks good, but I don't think the string transformation 
is necessary or correct. The partition pruner should already be able to handle 
the string_col = '' case, and string partition-key values are case sensitive, 
so it's not correct to lower case them.


http://gerrit.cloudera.org:8080/#/c/3942/1/fe/src/main/java/com/cloudera/impala/catalog/Catalog.java
File fe/src/main/java/com/cloudera/impala/catalog/Catalog.java:

Line 390:   public List<HdfsPartition> getHdfsPartitions(Table tbl,
do we really need this function? seems like the caller could do 
tbl.getPartitionsFromPartitionSet() directly


http://gerrit.cloudera.org:8080/#/c/3942/1/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java:

Line 892:    * Drops the partition specified in 'partitionSpec' from 'tbl'. 
Throws a
Update comment


Line 897:   public Table dropPartitions(Table tbl, 
List<List<TPartitionKeyValue>> partitionIds)
partitionIds -> partitionSet


http://gerrit.cloudera.org:8080/#/c/3942/1/fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java
File fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java:

Line 629:    * Gets hdfs partitions by partition id list.
Update comment, remove @param


Line 983:    * Drops the partitions from HdfsTable. Cleans up its metadata from 
all the mappings
Drops the given partitions from this table...


Line 985:    * Returns a list of partitions been dropped.
Returns the list of partitions that were dropped.


Line 1923:    * @param partitionSet
remove, we generally don't use the @param stuff


Line 1939:       // Get a list of HdfsPartition objects for the given partition 
ids.
partition ids -> partition set

please also fix elsewhere


http://gerrit.cloudera.org:8080/#/c/3942/1/fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java
File fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java:

Line 450:           alterTableSetTblProperties(tbl,
please condense these lengthy function calls to wrap around like this:

function(a, b, c, d, e,
    f, g, h, i, j);


Line 465:           alterTableUpdateStats(
condense wrapping


Line 486:             alterPartitionSetCached(tbl,
wrapping


Line 503:         loadTableMetadata(tbl,
wrapping


Line 1784:    * Drops existing partitions from the given table in Hive. If the 
partition is cached,
If a partition is cached ...


Line 1810:     List<HdfsPartition> parts = catalog_.getHdfsPartitions(tbl, 
partitionSet);
why not use tbl.getPartitionsFromPartitionSet() directly and get rid of Catalog?


Line 1834:         } catch (NoSuchObjectException e) {
The error behavior is a somewhat strange now because a user can write ALTER 
TABLE t DROP PARTITION (year < 2010) and then have that statement fail because 
partition 1998 did not exist. By definition the user only wants to drop 
partitions that exist, so the new more general case seems different from the 
original fully-specified single partition.

How does Hive behave with respect to IF EXISTS when non-equality predicate are 
used in a partition set?

What do you think is the right behavior with respect to IF EXISTS?


Line 1969:       long numTargetPartitions = 0L;
unused?


Line 1978:       numUpdatedPartitions.setRef(numTargetPartitions);
can't you set to modifiedParts.size() directly?


Line 2057:       numUpdatedPartitions.setRef(numTargetPartitions);
use modifiedParts.size() directly?


Line 2246:         ++ numTargetPartitions;
extra space


Line 2616:   private void bulkDropPartitions(String dbName, String tableName,
please remove this work-in-progress from the patch set, I'm still thinking 
about whether we need to switch to the bulk api immediately


http://gerrit.cloudera.org:8080/#/c/3942/1/shell/impala_client.py
File shell/impala_client.py:

Line 450:     excluded_query_types = ['use', 'drop']
Did you do the manual testing I suggested in the last round?


-- 
To view, visit http://gerrit.cloudera.org:8080/3942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Amos Bird <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-1654: Partition expr in DDL operations.

Reply via email to