Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15985 )

Change subject: IMPALA-9778: Refactor partition modifications in DDL/DMLs
......................................................................


Patch Set 11:

(17 comments)

Thanks for the feedbacks!

http://gerrit.cloudera.org:8080/#/c/15985/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15985/10//COMMIT_MSG@13
PS10, Line 13:  - HdfsPartition can be shared across table versions. In full 
catalog
> Are there specific bugs that are fixed by this patch? Or it this a class of
I can't find any bug JIRAs about this. But I think there are potential bugs. 
Currently we modify partitions in the following pattern:
1) modify partition in place, e.g. updating locaion, file format, stats etc.
2) apply ALTER operation in HMS
3) always mark the partition "dirty"
4) reload "dirty" partition from HMS and reload their file metadata

If step 2) and 4) fails (e.g. when HMS is temporarily out of service due to 
network issue), the partition meta will be inconsistent with HMS. Since we 
modify partition in place, we have no way to rollback.


http://gerrit.cloudera.org:8080/#/c/15985/9/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/15985/9/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@86
PS9, Line 86: public class HdfsPartition implements FeFsPartition, 
PrunablePartition {
> is it possible to annotate tihs as @Immutable? I think that'll trigger erro
No, it's a pity that this class only has shallow immutability. All its fields 
are final. But some fields are still referencing mutable objects like 
HdfsTable, LiteralExpr, InFlightEvents etc. We need more refactoring to make 
these immutable. However, things like InFlightEvents are only used in the 
catalogd so they can be mutable. I think we just need to make sure all the 
values sent to coordinators won't change.


http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@519
PS10, Line 519:     public final ImmutableList<String> sdBucketCols;
> Wonder if we should change these to ImmutableList/ImmutableMap so that it's
Done


http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@564
PS10, Line 564:       sdBucketCols = other.sdBucketCols;
> Comment that we can safely reuse these because they're immutable. Maybe not
Done


http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@589
PS10, Line 589:   private final long numRows_;
> Maybe move this static value to a different block, e.g. above 585. It's a b
Done


http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@627
PS10, Line 627:  Populated
              :   // when the partition is being built in 
Builder#setPartitionStatsBytes().
> I think this needs to get updated - we should not be mutating this with set
Nice catch! Done.


http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@967
PS10, Line 967:     public HdfsPartition build() {
> Is it valid to continue using the Builder after build() is called?
Yes, the build() method is idempotent if there are no more modifications after 
the first call. I use it in toHmsPartition() to build a temp instance for 
calling HdfsPartition#toHmsPartition(). TBO, this is not ideal... But there are 
some logics (like toHmsPartition) that need to be implemented in both 
HdfsPartition and HdfsPartition.Builder. We need more refactoring to merge 
these methods...


http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@444
PS10, Line 444:
> nit: we usually leave blank lines in-between multi-line methods.
Done


http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@447
PS10, Line 447:    */
> Might help to briefly comment these methods, e.g. this one should document
Done


http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@455
PS10, Line 455:    */
> Probably worth adding a brief comment, e.g
Done


http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@803
PS10, Line 803:    */
> Maybe update to createPartitionBuilder?
Done


http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@897
PS10, Line 897:       // Store partitions with null partition values separately
> nit: add a blank line between methods.
Done


http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@945
PS10, Line 945:    * HdfsPartition that was dropped or null if the partition 
does not exist.
> We should document this in the comments for these variables.
Done for dirtyPartitions_. nullPartitionIds_ and partitionValuesMap_ have such 
comments.


http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1428
PS10, Line 1428:    * table partitions.
> Can you document inprogressPartBuilders?
Done


http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2062
PS10, Line 2062:
> Maybe we can remove this reference, since we are always doing this now.
Removed the whole comment block since we can't updating the existing partition 
in place now.


http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/catalog/Table.java@802
PS10, Line 802:   /**
> These methods deserve a short comment - how they fit into the execution of
Done


http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/15985/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@849
PS10, Line 849: modification
> typo: modification
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/15985
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib52e5810d01d5e0c910daacb9c98977426d3914c
Gerrit-Change-Number: 15985
Gerrit-PatchSet: 11
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Anurag Mantripragada <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]>
Gerrit-Comment-Date: Mon, 08 Jun 2020 08:23:32 +0000
Gerrit-HasComments: Yes

Reply via email to