Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 )
Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) ...................................................................... Patch Set 11: (9 comments) http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@1368 PS11, Line 1368: : @Override : public long getWriteId() { : return writeId_; : } : : @Override : public HdfsPartition genInsertDeltaPartition() { : ImmutableList<byte[]> fileDescriptors = !encodedInsertFileDescriptors_.isEmpty() ? : encodedInsertFileDescriptors_ : encodedFileDescriptors_; : return new HdfsPartition.Builder(this) : .setId(id_) : .setFileDescriptors(fileDescriptors) : .build(); : } : : @Override : public HdfsPartition genDeleteDeltaPartition() { : if (encodedDeleteFileDescriptors_.isEmpty()) return null; : return new HdfsPartition.Builder(this) : .setId(id_) : .setFileDescriptors(encodedDeleteFileDescriptors_) : .build(); : } nit: Could you move these non-static methods to somewhere above the HdfsPartition.Builder? Other codes here are static subclass, fields or methods. http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1453 PS11, Line 1453: final String NOT_SUPPORTED_YET = "This query is not supported on full ACID tables " + : "that have deleted rows. As a workaround you can run a major compaction."; Maybe mention that the query references complex types? Not sure how complex it is to add a test for this. It may worths some test coverage on it. Or maybe we'll support reading full-ACID complex types soon? http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1468 PS11, Line 1468: List<? extends FeFsPartition> partitions unused arguement http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1471 PS11, Line 1471: hdfsTblRef.getUniqueAlias() Looks like if the tableRef doesn't have an explict alias, this is the full qualified table name. Could you add some tests to make sure the resolution works as expected? I.e. tests that the full-acid table is used more than once and all occurances don't have explicit alias, e.g. select count(*) from ( select * from functional_orc_def.alltypes_deleted_rows where id % 2 = 0 union all select * from functional_orc_def.alltypes_deleted_rows where id % 2 != 0 ) t; select id from functional_orc_def.alltypes_deleted_rows where id % 2 = 0 limit 10 union all select max(id) from functional_orc_def.alltypes_deleted_rows; http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1537 PS11, Line 1537: isPartitionKeyScan Could you add some tests for isPartitionKeyScan=true? IIUC, we only scan one row per partition when isPartitionKeyScan=true. If the first row from the insert delta doesn't match the first row from the delete delta, we will consider this partition as non-empty, no matter whether there are other delete rows that could match the first row from the insert delta. I think it's worth to add a test like this: create table my_acid (id int) partitioned by (p int) stored as orc tblproperties('transactional'='true'); insert into my_acid partition(p=0) values (0), (1), (2); // (p=1, id=0) may be the first row of insert delta insert into my_acid partition(p=1) values (0), (1), (2); // (p=1, id=2) may be the first row of delete delta delete from my_acid where p = 1 and id = 2; delete from my_acid where p = 1 and id = 1; delete from my_acid where p = 1 and id = 0; // Run this in Impala. The result should be 0 since partition(p=1) is empty. select max(p) from my_acid; http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1541 PS11, Line 1541: isPartitionKeyScan Should we always set this to true to scan all delete rows? http://gerrit.cloudera.org:8080/#/c/16082/11/testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test File testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test: http://gerrit.cloudera.org:8080/#/c/16082/11/testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test@714 PS11, Line 714: ==== Could you add some tests to make sure that the query hints are respected? Also make sure the "using" syntax is correctly handled. E.g. select straight_join a.id from functional.alltypes a join /* +BROADCAST */ functional_orc_def.alltypes_deleted_rows b using (id); select straight_join a.id from functional.alltypes a join /* +SHUFFLE */ functional_orc_def.alltypes_deleted_rows b using (id); http://gerrit.cloudera.org:8080/#/c/16082/11/testdata/workloads/functional-query/queries/QueryTest/full-acid-scans.test File testdata/workloads/functional-query/queries/QueryTest/full-acid-scans.test: http://gerrit.cloudera.org:8080/#/c/16082/11/testdata/workloads/functional-query/queries/QueryTest/full-acid-scans.test@21 PS11, Line 21: insert into acid values (5), (5), (5); Could you add a test for insert overwrite? http://gerrit.cloudera.org:8080/#/c/16082/11/testdata/workloads/functional-query/queries/QueryTest/full-acid-scans.test@52 PS11, Line 52: select count(*) from functional_orc_def.alltypes_deleted_rows; Could you add a test for show partitions on this table? Will the deleted partitions still show up? -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 11 Gerrit-Owner: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Fri, 10 Jul 2020 02:59:55 +0000 Gerrit-HasComments: Yes
