Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16082 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified 
tables (primitive types)
......................................................................


Patch Set 2:

(11 comments)

Did a first pass..some comments below.

http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@673
PS2, Line 673:   protected HdfsPartition(HdfsPartition other) {
Looking at the changes in https://issues.apache.org/jira/browse/IMPALA-9778,  
the HdfsPartition class  has been restructured quite a bit.. although it seems 
your changes should be ok since you are only creating a clone.


http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
File fe/src/main/java/org/apache/impala/planner/HashJoinNode.java:

http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java@178
PS2, Line 178:       if (!isAcidJoin_ || detailLevel.ordinal() == 
TExplainLevel.VERBOSE.ordinal()) {
For the anti join should we not show the join conditions in the Impala query 
profile ? (query profile output is in EXTENDED mode, not VERBOSE).


http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/planner/JoinNode.java@83
PS2, Line 83:   // True if this join is used to do the join between insert and 
deleta delta files.
nit: spelling 'deleta'


http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/planner/JoinNode.java@88
PS2, Line 88:     displayName_ = "ACID " + displayName_;
The  'ACID' prefix for a HashJoin could be misleading to end users since a plan 
operator itself is not ACID in DBMS terms.  Could we use something like. 'DELTA 
TABLE HASH JOIN'  (open to other suggestions).


http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1453
PS2, Line 1453:       FeFsPartition fsPartition = (FeFsPartition)partition;
The casting  is redundant since 'partition' is already of type FeFsPartition.


http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1454
PS2, Line 1454:       for (HdfsPartition.FileDescriptor fd : 
fsPartition.getFileDescriptors()) {
Is there any indicator in the HMS's partition object that indicates whether any 
delete was done ?  It does seem expensive to  check all the file descriptors 
within a partition since in vast majority of cases nothing may have been 
deleted or the 'delete_delta' is found only towards the end. However,  if such 
metadata does not exist,  then I suppose this is the only alternative.


http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1480
PS2, Line 1480:     String[] acidFields = {"originaltransaction", "bucket", 
"rowid"};
The design doc suggests the anti-join is on 4 fields..is the partition id not 
needed in this join ?


http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1551
PS2, Line 1551:         hdfsTblRef.getDesc(), conjuncts, insertPartitions, 
hdfsTblRef, aggInfo,
The  'aggInfo' is meant for the simple 'select count(*) from table'  type of 
query where the aggregation value can be found from the row count in the 
metadata.  But in this case, since  we want to compute this value only after 
the anti-join has been done,  I think it should be null.


http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1591
PS2, Line 1591:           ret.add(new BinaryPredicate(
Call analyze for the BinaryPredicate.  Also, could you add a simple test with 
explain_level=3 to check for the join condition ? I don't think I saw one.


http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1605
PS2, Line 1605:    * @return the cloned partitoin
nit: spelling 'partitoin'


http://gerrit.cloudera.org:8080/#/c/16082/2/testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test
File testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test:

http://gerrit.cloudera.org:8080/#/c/16082/2/testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test@58
PS2, Line 58: |  row-size=100B cardinality=143
The cardinality estimate for this hash join would need some adjustment in the 
future..right now it is showing the default behavior of a Left Outer Join, 
which is not accurate for the join between the delta and delete_deltas.



--
To view, visit http://gerrit.cloudera.org:8080/16082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
Gerrit-Change-Number: 16082
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Jun 2020 00:27:49 +0000
Gerrit-HasComments: Yes

Reply via email to