[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. IMPALA-8793: Implement TRUNCATE for insert-only ACID tables This commit adds support for the TRUNCATE statement for transactional tables. TRUNCATE on transactional tables doesn't remove the files, it just creates new empty ACID base directories. Although from a filesystem point of view the directories won't be completely empty. They will contain the hidden '_empty' file, so FileSystemUtil.listFiles() will list them. It's similar to Hive's behavior which creates the magic file '_orc_acid_version' in each transactional directory. Except Impala only creates these '_empty' files in empty base directories. All the functionality is implemented in the catalog, therefore transaction handling, table locking, heartbeating also happens there. Added the new Transaction class that should prevent leakage of transactions, it should be used in try-with-resources statements. Testing: Added backend tests that truncate non-partitioned and partitioned tables as well. The tests check whether the statistics were removed. The tests also check if Hive sees the effect of truncation. Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Reviewed-on: http://gerrit.cloudera.org:8080/14071 Reviewed-by: Zoltan Borok-Nagy Tested-by: Impala Public Jenkins --- M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/catalog/Catalog.java A fe/src/main/java/org/apache/impala/catalog/Transaction.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test M tests/metadata/test_hms_integration.py M tests/query_test/test_acid.py 11 files changed, 368 insertions(+), 47 deletions(-) Approvals: Zoltan Borok-Nagy: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 12 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 11: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 11 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 27 Aug 2019 18:51:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4855/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 11 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 27 Aug 2019 14:44:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 11: Code-Review+2 (1 comment) Carrying +2 http://gerrit.cloudera.org:8080/#/c/14071/10//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14071/10//COMMIT_MSG@13 PS10, Line 13: s > typo? Done -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 11 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 27 Aug 2019 14:43:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Hello Yongzhi Chen, Vihang Karajgaonkar, Tim Armstrong, Csaba Ringhofer, Dinesh Garg, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14071 to look at the new patch set (#11). Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. IMPALA-8793: Implement TRUNCATE for insert-only ACID tables This commit adds support for the TRUNCATE statement for transactional tables. TRUNCATE on transactional tables doesn't remove the files, it just creates new empty ACID base directories. Although from a filesystem point of view the directories won't be completely empty. They will contain the hidden '_empty' file, so FileSystemUtil.listFiles() will list them. It's similar to Hive's behavior which creates the magic file '_orc_acid_version' in each transactional directory. Except Impala only creates these '_empty' files in empty base directories. All the functionality is implemented in the catalog, therefore transaction handling, table locking, heartbeating also happens there. Added the new Transaction class that should prevent leakage of transactions, it should be used in try-with-resources statements. Testing: Added backend tests that truncate non-partitioned and partitioned tables as well. The tests check whether the statistics were removed. The tests also check if Hive sees the effect of truncation. Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a --- M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/catalog/Catalog.java A fe/src/main/java/org/apache/impala/catalog/Transaction.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test M tests/metadata/test_hms_integration.py M tests/query_test/test_acid.py 11 files changed, 368 insertions(+), 47 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/11 -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 11 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 10: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/14071/10//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14071/10//COMMIT_MSG@13 PS10, Line 13: s typo? -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 10 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 27 Aug 2019 14:15:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4384/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 10 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 27 Aug 2019 13:00:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4383/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 27 Aug 2019 12:35:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Hello Yongzhi Chen, Vihang Karajgaonkar, Tim Armstrong, Csaba Ringhofer, Dinesh Garg, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14071 to look at the new patch set (#10). Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. IMPALA-8793: Implement TRUNCATE for insert-only ACID tables This commit adds support for the TRUNCATE statement for transactional tables. TRUNCATE on transactional tables doesn't remove the files, it just creates new empty ACID base directories. Although from a filessystem point of view the directories won't be completely empty. They will contain the hidden '_empty' file, so FileSystemUtil.listFiles() will list them. It's similar to Hive's behavior which creates the magic file '_orc_acid_version' in each transactional directory. Except Impala only creates these '_empty' files in empty base directories. All the functionality is implemented in the catalog, therefore transaction handling, table locking, heartbeating also happens there. Added the new Transaction class that should prevent leakage of transactions, it should be used in try-with-resources statements. Testing: Added backend tests that truncate non-partitioned and partitioned tables as well. The tests check whether the statistics were removed. The tests also check if Hive sees the effect of truncation. Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a --- M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/catalog/Catalog.java A fe/src/main/java/org/apache/impala/catalog/Transaction.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test M tests/metadata/test_hms_integration.py M tests/query_test/test_acid.py 11 files changed, 368 insertions(+), 47 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/10 -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 10 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Hello Yongzhi Chen, Vihang Karajgaonkar, Tim Armstrong, Csaba Ringhofer, Dinesh Garg, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14071 to look at the new patch set (#8). Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. IMPALA-8793: Implement TRUNCATE for insert-only ACID tables This commit adds support for the TRUNCATE statement for transactional tables. TRUNCATE on transactional tables doesn't remove the files, it just creates new empty ACID base directories. Although from a filessystem point of view the directories won't be completely empty. They will contain the hidden '_empty' file, so FileSystemUtil.listFiles() will list them. It's similar to Hive's behavior which creates the magic file '_orc_acid_version' in each transactional directory. Except Impala only creates these '_empty' files in empty base directories. All the functionality is implemented in the catalog, therefore transaction handling, table locking, heartbeating also happens there. Added the new Transaction class that should prevent leakage of transactions, it should be used in try-with-resources statements. Testing: Added backend tests that truncate non-partitioned and partitioned tables as well. The tests check whether the statistics were removed. The tests also check if Hive sees the effect of truncation. Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a --- M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/catalog/Catalog.java A fe/src/main/java/org/apache/impala/catalog/Transaction.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test M tests/metadata/test_hms_integration.py M tests/query_test/test_acid.py 11 files changed, 368 insertions(+), 47 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/8 -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 7: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/4367/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 26 Aug 2019 16:35:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 6: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/4366/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 26 Aug 2019 16:27:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Hello Yongzhi Chen, Vihang Karajgaonkar, Tim Armstrong, Csaba Ringhofer, Dinesh Garg, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14071 to look at the new patch set (#7). Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. IMPALA-8793: Implement TRUNCATE for insert-only ACID tables This commit adds support for the TRUNCATE statement for transactional tables. TRUNCATE on transactional tables doesn't remove the files, it just creates new empty ACID base directories. Although from a filessystem point of view the directories won't be completely empty. They will contain the hidden '_empty' file, so FileSystemUtil.listFiles() will list them. It's similar to Hive's behavior which creates the magic file '_orc_acid_version' in each transactional directory. Except Impala only creates these '_empty' files in empty base directories. All the functionality is implemented in the catalog, therefore transaction handling, table locking, heartbeating also happens there. Added the new Transaction class that should prevent leakage of transactions, it should be used in try-with-resources statements. Testing: Added backend tests that truncate non-partitioned and partitioned tables as well. The tests check whether the statistics were removed. The tests also check if Hive sees the effect of truncation. Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/catalog/Catalog.java A fe/src/main/java/org/apache/impala/catalog/Transaction.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test M tests/metadata/test_hms_integration.py M tests/query_test/test_acid.py 9 files changed, 353 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/7 -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/14071/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14071/5//COMMIT_MSG@10 PS5, Line 10: it : just creates new empty ACID base directories. > Can you mention the _empty file + that this is the same as the way Hive wor Extended the commit message. http://gerrit.cloudera.org:8080/#/c/14071/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/14071/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1912 PS5, Line 1912: _empty > It would be nice to mention that Hive works the same way + maybe add a the Added link to the IMPALA Jira, I don't know the exact Hive Jira when they introduced _orc_acid_version. Hive uses this file differently than we use this "_empty" file. It creates the magic file in every transactional directory and as its name suggests it stores the ACID version being used. I'm not sure that it is a good idea to comment about Hive's behavior here because it could be distracting. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 26 Aug 2019 15:57:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Hello Yongzhi Chen, Vihang Karajgaonkar, Tim Armstrong, Csaba Ringhofer, Dinesh Garg, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14071 to look at the new patch set (#6). Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. IMPALA-8793: Implement TRUNCATE for insert-only ACID tables This commit adds support for the TRUNCATE statement for transactional tables. TRUNCATE on transactional tables doesn't remove the files, it just creates new empty ACID base directories. Although from a filessystem point of view the directories won't be completely empty. They will contain the hidden '_empty' file, so FileSystemUtil.listFiles() will list them. It's similar to Hive's behavior which creates the magic file '_orc_acid_version' in each transactional directory. Except Impala only creates these '_empty' files in empty base directories. All the functionality is implemented in the catalog, therefore transaction handling, table locking, heartbeating also happens there. Added the new Transaction class that should prevent leakage of transactions, it should be used in try-with-resources statements. Testing: Added backend tests that truncate non-partitioned and partitioned tables as well. The tests check whether the statistics were removed. The tests also check if Hive sees the effect of truncation. Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/catalog/Catalog.java A fe/src/main/java/org/apache/impala/catalog/Transaction.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test M tests/metadata/test_hms_integration.py M tests/query_test/test_acid.py 9 files changed, 336 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/6 -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 5: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/4365/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 26 Aug 2019 15:38:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 5: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/14071/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14071/5//COMMIT_MSG@10 PS5, Line 10: it : just creates new empty ACID base directories. Can you mention the _empty file + that this is the same as the way Hive works? http://gerrit.cloudera.org:8080/#/c/14071/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/14071/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1912 PS5, Line 1912: _empty It would be nice to mention that Hive works the same way + maybe add a the Hive Jira about this. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 26 Aug 2019 15:15:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Hello Yongzhi Chen, Vihang Karajgaonkar, Tim Armstrong, Csaba Ringhofer, Dinesh Garg, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14071 to look at the new patch set (#5). Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. IMPALA-8793: Implement TRUNCATE for insert-only ACID tables This commit adds support for the TRUNCATE statement for transactional tables. TRUNCATE on transactional tables doesn't remove the files, it just creates new empty ACID base directories. All the functionality is implemented in the catalog, therefore transaction handling, table locking, heartbeating also happens there. Added the new Transaction class that should prevent leakage of transactions, it should be used in try-with-resources statements. Testing: Added backend tests that truncate non-partitioned and partitioned tables as well. The tests check whether the statistics were removed. The tests also check if Hive sees the effect of truncation. Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/catalog/Catalog.java A fe/src/main/java/org/apache/impala/catalog/Transaction.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test M tests/metadata/test_hms_integration.py M tests/query_test/test_acid.py 9 files changed, 336 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/5 -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/14071/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/14071/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1850 PS4, Line 1850: try (MetaStoreClient msClient = catalog_.getMetaStoreClient()) > The common pattern seems to be to hold only one msClient at the same time i I agree that it is something that we should clean up. The MetastoreShim methods usually take an IMetaStoreClient, methods from other classes access to a Catalog object so they can just ask another client from the pool. Maybe we should always pass around the pool? The cleanup might require quite a few changes so I think we should do it in a separate jira. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 26 Aug 2019 13:47:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 4: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/14071/4/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java: http://gerrit.cloudera.org:8080/#/c/14071/4/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@241 PS4, Line 241: // Correct validWriteIdList is needed : // to commit the alter partitions operation in hms side. : client.alter_partitions(dbName, tblName, partitions, null, : tblTxn.validWriteIds, tblTxn.writeId); Ouch, thanks for fixing this. http://gerrit.cloudera.org:8080/#/c/14071/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/14071/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1850 PS4, Line 1850: try (MetaStoreClient msClient = catalog_.getMetaStoreClient()) The common pattern seems to be to hold only one msClient at the same time in a single DDL operation, while here we can hold two as unsetTableColStats also asks for one. getMetaStoreClient() asks for a free client from the pool, so it is not just returning the same client. We also seem to try to hold them for as little time as possible, while here we call out to another component (HDFS) while we are holding the HMS client. I don't know how often to we have problems with running out of HMS client, so this may not matter in practice. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 23 Aug 2019 17:15:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 4: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/4344/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 23 Aug 2019 16:42:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 4: Sorry about the diff between PS3 and PS4, I thought I already upload a rebase commit after PS3. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 23 Aug 2019 16:10:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/14071/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java File fe/src/main/java/org/apache/impala/catalog/Catalog.java: http://gerrit.cloudera.org:8080/#/c/14071/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java@705 PS3, Line 705: public void lockTableInTransaction(String dbName, String tableName, long txnId, > We could get the caller to pass in a Transaction object, I think I'd prefer Done. I kept the long for lockTableInternal() because it can take 0L as no transaction, and it's also a private method. http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1838 PS2, Line 1838: unsetPartitionsColStats(table.getMetaStoreTable(), hmsPartitions, : writeId, txn.getId()); : } : // Remove COLUMN_STATS_ACCURATE property from the table. : unsetTableColStats(table.getMetaStoreTable(), txn.getId() > Thanks for the notice, will fix it in a separate rebase commit. Done. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 23 Aug 2019 16:03:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Hello Yongzhi Chen, Vihang Karajgaonkar, Tim Armstrong, Csaba Ringhofer, Dinesh Garg, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14071 to look at the new patch set (#4). Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. IMPALA-8793: Implement TRUNCATE for insert-only ACID tables This commit adds support for the TRUNCATE statement for transactional tables. TRUNCATE on transactional tables doesn't remove the files, it just creates new empty ACID base directories. All the functionality is implemented in the catalog, therefore transaction handling, table locking, heartbeating also happens there. Added the new Transaction class that should prevent leakage of transactions, it should be used in try-with-resources statements. Testing: Added backend tests that truncate non-partitioned and partitioned tables as well. The tests check whether the statistics were removed. The tests also check if Hive sees the effect of truncation. Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java A fe/src/main/java/org/apache/impala/catalog/Transaction.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test M tests/metadata/test_hms_integration.py M tests/query_test/test_acid.py 10 files changed, 336 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/4 -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 3: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/14071/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java File fe/src/main/java/org/apache/impala/catalog/Catalog.java: http://gerrit.cloudera.org:8080/#/c/14071/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java@705 PS3, Line 705: public void lockTableInTransaction(String dbName, String tableName, long txnId, We could get the caller to pass in a Transaction object, I think I'd prefer the strongly-typed interface. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 22 Aug 2019 18:24:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 3: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/4331/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 22 Aug 2019 14:38:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 3: (10 comments) http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Catalog.java File fe/src/main/java/org/apache/impala/catalog/Catalog.java: http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Catalog.java@732 PS2, Line 732: st > Can this be called with txnId == -1? I would change this to <=0 or add a pr This is a private method, this should only be called via the public LockTableStandalone() and LockTableInTransaction() methods. Via these methods txnId can only be >=0. Added precondition check in case someone else adds a new method that calls this one. http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java File fe/src/main/java/org/apache/impala/catalog/Transaction.java: http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java@17 PS2, Line 17: Transaction > Does this class need to be thread-safe? I can see there are possible races The idea is to use it from a single thread in a try-with-resources statement, so I rather keep it simple unless a good use case comes that requires it to be thread-safe. http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java@20 PS2, Line 20: transactionId_ > can we make this final? I see that you are using this to keep track of whet See answer above. http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java@38 PS2, Line 38: checkState(transactionId_ > 0) > Does this need to throw an exception? What is the behavior from HMS if a cl I think we shouldn't commit a transaction twice so it checks that Impalas's logic works well. HMS throws NoSuchTxnException in case of already committed transactions. http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java@46 PS2, Line 46: transactionId_ > nit, can we change this to if (transactionId_ < 0) return; makes it easier I agree, done. http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1800 PS2, Line 1800: creation > typo: creation Done http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1821 PS2, Line 1821: ly Impala cannot upda > Its unclear why we are clearing this property. Can you add a comment explai Explained it in code comment. http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1829 PS2, Line 1829: org.apache.hadoop.hive.metastore.api.Partition hmsPart = : ((HdfsPartition)part).toHmsPartition(); > Do we have to alter the partition if COLUMN_STATS_ACCURATE was set? In a ta We talked about it offline, but for the record: It's safer to always remove this property in case of writes. E.g. if Hive computes statistics and later Impala INSERTs data without invalidating its metadata first, then Impala won't clear COLUMN_STATS_ACCURATE (because without invalidate metadata it won't see this property and wants to save an RPC). In the end Hive will return wrong results. http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1838 PS2, Line 1838: } : // For partitioned tables we need to alter all the partitions in HMS. : if (!hmsPartitions.isEmpty()) { : unsetPartitionsColStats(table.getMetaStoreTable(), hmsPartitions, : writeId, txn.getId()); > Note that this will break when https://gerrit.cloudera.org/#/c/14066 is mer Thanks for the notice, will fix it in a separate rebase commit. http://gerrit.cloudera.org:8080/#/c/14071/2/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test File testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test: http://gerrit.cloudera.org:8080/#/c/14071/2/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test@96 PS2, Line 96: truncate table pt; > Shouldn't this have a RESULTS section? No, we check a Hive query in test_acid.py. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Hello Yongzhi Chen, Vihang Karajgaonkar, Tim Armstrong, Csaba Ringhofer, Dinesh Garg, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14071 to look at the new patch set (#3). Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. IMPALA-8793: Implement TRUNCATE for insert-only ACID tables This commit adds support for the TRUNCATE statement for transactional tables. TRUNCATE on transactional tables doesn't remove the files, it just creates new empty ACID base directories. All the functionality is implemented in the catalog, therefore transaction handling, table locking, heartbeating also happens there. Added the new Transaction class that should prevent leakage of transactions, it should be used in try-with-resources statements. Testing: Added backend tests that truncate non-partitioned and partitioned tables as well. The tests check whether the statistics were removed. The tests also check if Hive sees the effect of truncation. Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a --- M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java A fe/src/main/java/org/apache/impala/catalog/Transaction.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test M tests/metadata/test_hms_integration.py M tests/query_test/test_acid.py 9 files changed, 333 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/3 -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 2: (5 comments) http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Catalog.java File fe/src/main/java/org/apache/impala/catalog/Catalog.java: http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Catalog.java@732 PS2, Line 732: 0L Can this be called with txnId == -1? I would change this to <=0 or add a precondition for 0<=txnId http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1800 PS2, Line 1800: createion typo: creation http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1829 PS2, Line 1829: if (hmsPart.getParameters() != null) { : hmsPart.getParameters().remove(StatsSetupConst.COLUMN_STATS_ACCURATE); Do we have to alter the partition if COLUMN_STATS_ACCURATE was set? In a table that is generally written by Impala it will be never set and we could spare the alter partitions RPC. This optimization was implemented for INSERT in https://gerrit.cloudera.org/#/c/14037/8/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1838 PS2, Line 1838: unsetPartitionsColStats(table.getMetaStoreTable(), hmsPartitions, : writeId, txn.getId()); : } : // Remove COLUMN_STATS_ACCURATE property from the table. : unsetTableColStats(table.getMetaStoreTable(), txn.getId() Note that this will break when https://gerrit.cloudera.org/#/c/14066 is merged. http://gerrit.cloudera.org:8080/#/c/14071/2/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test File testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test: http://gerrit.cloudera.org:8080/#/c/14071/2/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test@96 PS2, Line 96: truncate table pt; Shouldn't this have a RESULTS section? -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 16 Aug 2019 19:16:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 2: (5 comments) http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java File fe/src/main/java/org/apache/impala/catalog/Transaction.java: http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java@17 PS2, Line 17: Transaction Does this class need to be thread-safe? I can see there are possible races with respect to the transactionId_ being used as the flag to track if this transaction is open or close. http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java@20 PS2, Line 20: transactionId_ can we make this final? I see that you are using this to keep track of whether the transaction is open or close. May be you can use a AtomicBoolean to keep track of that separately so that commit() can be made thread-safe. http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java@38 PS2, Line 38: checkState(transactionId_ > 0) Does this need to throw an exception? What is the behavior from HMS if a client calls commit on a already committed transaction? http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java@46 PS2, Line 46: transactionId_ nit, can we change this to if (transactionId_ < 0) return; makes it easier to read in my opinion. http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1821 PS2, Line 1821: COLUMN_STATS_ACCURATE Its unclear why we are clearing this property. Can you add a comment explaining the same? -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 16 Aug 2019 17:22:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Zoltan Borok-Nagy has removed a vote on this change. Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Removed Code-Review+2 by Zoltan Borok-Nagy -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 2: Code-Review+2 +2ed it so you can carry it if you fix those nits -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 16 Aug 2019 16:03:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 2: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/4280/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 16 Aug 2019 15:08:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 2: (12 comments) http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java File fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java: http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java@71 PS1, Line 71: } > I think this line can be removed, the check on line 70 already blocks the F Done http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java File fe/src/main/java/org/apache/impala/catalog/Catalog.java: http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@684 PS1, Line 684:* heartbeating the lock. This function is for locks that doesn't belong to a > Might be worth explicitly saying that the client is responsible for calling Done http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@687 PS1, Line 687: he particular table is. > I think we should hide this from callers. How about we add two public varia Modified the lockTable() interface as suggested. Yeah it would be useful to create a LockGuard as well, but the code structure is a bit different for drop table and I don't want to mess it up. But we plan to do some cleanup soon. http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@704 PS1, Line 704:*/ > Is there a corresponding remove lock part? For truncate, the lock is removed when the transaction is aborted/committed. DROP table uses this method also, it removes the lock manually because DROP doesn't need to run in transaction. http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Transaction.java File fe/src/main/java/org/apache/impala/catalog/Transaction.java: http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Transaction.java@17 PS1, Line 17: public class Transaction implements AutoCloseable { > This is really nice Thanks! http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Transaction.java@37 PS1, Line 37: public void commit() throws TransactionException { > Precondition check that transactionId > 0? Done http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1817 PS1, Line 1817: > typo: hmsPartitions Done http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1832 PS1, Line 1832: hmsPartitions.add(hmsPart); > Can you comment what this special case is required for? Do we have a test c It's skipped for non-partitioned tables. Extended the comment about it. The tests check column stats-property removal for both partitioned and non-partitioned tables. http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1854 PS1, Line 1854:* @param writeId the write id of the new base directory. > Could you comment a little more on why a transactional table does not remov Extended truncateTransactionalTable()'s comment. http://gerrit.cloudera.org:8080/#/c/14071/1/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test File testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test: PS1: > Can you also test truncating a completely empty table? I think that's a som Added a TRUNCATE stmt just after table creation. http://gerrit.cloudera.org:8080/#/c/14071/1/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test@40 PS1, Line 40: > Can you truncate the table again here, to confirm that it's idempotent (i.e Sure, done. http://gerrit.cloudera.org:8080/#/c/14071/1/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test@90 PS1, Line 90: > Do we need to confirm that the stats are correct? Unfortunately for partitioned tables we can only check it implicitly. We truncate the table here, then in test_acid.py::test_acid_truncate() we issue a count(*) query towards Hive and if it sees 0 rows then it means we successfully invalidated the statistics. Other examples for this are the test_*_statschg methods in test_acid.py. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Hello Yongzhi Chen, Tim Armstrong, Dinesh Garg, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14071 to look at the new patch set (#2). Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. IMPALA-8793: Implement TRUNCATE for insert-only ACID tables This commit adds support for the TRUNCATE statement for transactional tables. TRUNCATE on transactional tables doesn't remove the files, it just creates new empty ACID base directories. All the functionality is implemented in the catalog, therefore transaction handling, table locking, heartbeating also happens there. Added the new Transaction class that should prevent leakage of transactions, it should be used in try-with-resources statements. Testing: Added backend tests that truncate non-partitioned and partitioned tables as well. The tests check whether the statistics were removed. The tests also check if Hive sees the effect of truncation. Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a --- M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java A fe/src/main/java/org/apache/impala/catalog/Transaction.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test M tests/metadata/test_hms_integration.py M tests/query_test/test_acid.py 9 files changed, 330 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/2 -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 1: (9 comments) I think this is pretty good, some cleanup and some tests suggested. http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java File fe/src/main/java/org/apache/impala/catalog/Catalog.java: http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@684 PS1, Line 684:* heartbeating the lock if it doesn't have a transaction context. Might be worth explicitly saying that the client is responsible for calling releaseTableLock()? http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@687 PS1, Line 687: 0 for standalone locks. I think we should hide this from callers. How about we add two public variants of lockTable() - lockTableInTransaction() that takes a Transaction argument, and lockTableStandalone() that doesn't take a txnId argument. They could both use this function as the private implementation. It would also be nice to use the auto-close pattern for standalone locks, but that's unrelated to this change. http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Transaction.java File fe/src/main/java/org/apache/impala/catalog/Transaction.java: http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Transaction.java@17 PS1, Line 17: public class Transaction implements AutoCloseable { This is really nice http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Transaction.java@37 PS1, Line 37: public void commit() throws TransactionException { Precondition check that transactionId > 0? http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1817 PS1, Line 1817: hmsPartitons typo: hmsPartitions http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1832 PS1, Line 1832: if (!hmsPartitons.isEmpty()) { Can you comment what this special case is required for? Do we have a test case that exercises the case when we skip unsetting the stats. http://gerrit.cloudera.org:8080/#/c/14071/1/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test File testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test: PS1: Can you also test truncating a completely empty table? I think that's a somewhat interesting case. http://gerrit.cloudera.org:8080/#/c/14071/1/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test@40 PS1, Line 40: Can you truncate the table again here, to confirm that it's idempotent (i.e. truncating a truncated table doesn't break anything). http://gerrit.cloudera.org:8080/#/c/14071/1/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test@90 PS1, Line 90: truncate table pt; Do we need to confirm that the stats are correct? -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dinesh Garg (430) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Thu, 15 Aug 2019 21:08:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 1: (3 comments) A couple of my comments. http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java File fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java: http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java@71 PS1, Line 71: analyzer.ensureTableNotFullAcid(table_); I think this line can be removed, the check on line 70 already blocks the FullAcid table. http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java File fe/src/main/java/org/apache/impala/catalog/Catalog.java: http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@704 PS1, Line 704: if (txnId == 0L) transactionKeepalive_.addLock(lockId, ctx); Is there a corresponding remove lock part? http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1854 PS1, Line 1854: for (FeFsPartition part: partitions) { Could you comment a little more on why a transactional table does not remove files? Can that cause file leaks? -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Thu, 15 Aug 2019 18:49:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14071 ) Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. Patch Set 1: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/4265/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 15 Aug 2019 18:41:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14071 Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables .. IMPALA-8793: Implement TRUNCATE for insert-only ACID tables This commit adds support for the TRUNCATE statement for transactional tables. TRUNCATE on transactional tables doesn't remove the files, it just creates new empty ACID base directories. All the functionality is implemented in the catalog, therefore transaction handling, table locking, heartbeating also happens there. Added the new Transaction class that should prevent leakage of transactions, it should be used in try-with-resources statements. Testing: Added backend tests that truncate non-partitioned and partitioned tables as well. The tests check whether the statistics were removed. The tests also check if Hive sees the effect of truncation. Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a --- M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java A fe/src/main/java/org/apache/impala/catalog/Transaction.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test M tests/metadata/test_hms_integration.py M tests/query_test/test_acid.py 9 files changed, 294 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/1 -- To view, visit http://gerrit.cloudera.org:8080/14071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a Gerrit-Change-Number: 14071 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy