Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16082 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified 
tables (primitive types)

Patch Set 11:


File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

PS11, Line 1368:
               :   @Override
               :   public long getWriteId() {
               :     return writeId_;
               :   }
               :   @Override
               :   public HdfsPartition genInsertDeltaPartition() {
               :     ImmutableList<byte[]> fileDescriptors = 
!encodedInsertFileDescriptors_.isEmpty() ?
               :         encodedInsertFileDescriptors_ : 
               :     return new HdfsPartition.Builder(this)
               :         .setId(id_)
               :         .setFileDescriptors(fileDescriptors)
               :         .build();
               :   }
               :   @Override
               :   public HdfsPartition genDeleteDeltaPartition() {
               :     if (encodedDeleteFileDescriptors_.isEmpty()) return null;
               :     return new HdfsPartition.Builder(this)
               :         .setId(id_)
               :         .setFileDescriptors(encodedDeleteFileDescriptors_)
               :         .build();
               :   }
nit: Could you move these non-static methods to somewhere above the 
HdfsPartition.Builder? Other codes here are static subclass, fields or methods.

File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

PS11, Line 1453:     final String NOT_SUPPORTED_YET = "This query is not 
supported on full ACID tables " +
               :         "that have deleted rows. As a workaround you can run a 
major compaction.";
Maybe mention that the query references complex types?

Not sure how complex it is to add a test for this. It may worths some test 
coverage on it. Or maybe we'll support reading full-ACID complex types soon?

PS11, Line 1468: List<? extends FeFsPartition> partitions
unused arguement

PS11, Line 1471: hdfsTblRef.getUniqueAlias()
Looks like if the tableRef doesn't have an explict alias, this is the full 
qualified table name. Could you add some tests to make sure the resolution 
works as expected? I.e. tests that the full-acid table is used more than once 
and all occurances don't have explicit alias, e.g.

 select count(*) from (
   select * from functional_orc_def.alltypes_deleted_rows where id % 2 = 0
   union all
   select * from functional_orc_def.alltypes_deleted_rows where id % 2 != 0
 ) t;

 select id from functional_orc_def.alltypes_deleted_rows where id % 2 = 0 limit 
 union all
 select max(id) from functional_orc_def.alltypes_deleted_rows;

PS11, Line 1537: isPartitionKeyScan
Could you add some tests for isPartitionKeyScan=true? IIUC, we only scan one 
row per partition when isPartitionKeyScan=true. If the first row from the 
insert delta doesn't match the first row from the delete delta, we will 
consider this partition as non-empty, no matter whether there are other delete 
rows that could match the first row from the insert delta.

I think it's worth to add a test like this:

 create table my_acid (id int) partitioned by (p int) stored as orc
 insert into my_acid partition(p=0) values (0), (1), (2);
 // (p=1, id=0) may be the first row of insert delta
 insert into my_acid partition(p=1) values (0), (1), (2);
 // (p=1, id=2) may be the first row of delete delta
 delete from my_acid where p = 1 and id = 2;
 delete from my_acid where p = 1 and id = 1;
 delete from my_acid where p = 1 and id = 0;
 // Run this in Impala. The result should be 0 since partition(p=1) is empty.
 select max(p) from my_acid;

PS11, Line 1541: isPartitionKeyScan
Should we always set this to true to scan all delete rows?

File testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test:

PS11, Line 714: ====
Could you add some tests to make sure that the query hints are respected? Also 
make sure the "using" syntax is correctly handled. E.g.

 select straight_join a.id from functional.alltypes a
   join /* +BROADCAST */ functional_orc_def.alltypes_deleted_rows b
   using (id);

  select straight_join a.id from functional.alltypes a
   join /* +SHUFFLE */ functional_orc_def.alltypes_deleted_rows b
   using (id);

File testdata/workloads/functional-query/queries/QueryTest/full-acid-scans.test:

PS11, Line 21: insert into acid values (5), (5), (5);
Could you add a test for insert overwrite?

PS11, Line 52: select count(*) from functional_orc_def.alltypes_deleted_rows;
Could you add a test for show partitions on this table? Will the deleted 
partitions still show up?

To view, visit http://gerrit.cloudera.org:8080/16082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
Gerrit-Change-Number: 16082
Gerrit-PatchSet: 11
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Fri, 10 Jul 2020 02:59:55 +0000
Gerrit-HasComments: Yes

Reply via email to