[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..

IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

This commit adds support for the TRUNCATE statement for transactional
tables. TRUNCATE on transactional tables doesn't remove the files, it
just creates new empty ACID base directories.

Although from a filesystem point of view the directories won't be
completely empty. They will contain the hidden '_empty' file, so
FileSystemUtil.listFiles() will list them. It's similar to Hive's
behavior which creates the magic file '_orc_acid_version' in each
transactional directory. Except Impala only creates these '_empty'
files in empty base directories.

All the functionality is implemented in the catalog, therefore
transaction handling, table locking, heartbeating also happens there.
Added the new Transaction class that should prevent leakage of
transactions, it should be used in try-with-resources statements.

Testing:
Added backend tests that truncate non-partitioned and partitioned
tables as well.
The tests check whether the statistics were removed.
The tests also check if Hive sees the effect of truncation.

Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Reviewed-on: http://gerrit.cloudera.org:8080/14071
Reviewed-by: Zoltan Borok-Nagy 
Tested-by: Impala Public Jenkins 
---
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
A fe/src/main/java/org/apache/impala/catalog/Transaction.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test
M tests/metadata/test_hms_integration.py
M tests/query_test/test_acid.py
11 files changed, 368 insertions(+), 47 deletions(-)

Approvals:
  Zoltan Borok-Nagy: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 12
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 11: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 11
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 27 Aug 2019 18:51:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4855/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 11
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 27 Aug 2019 14:44:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-27 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 11: Code-Review+2

(1 comment)

Carrying +2

http://gerrit.cloudera.org:8080/#/c/14071/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14071/10//COMMIT_MSG@13
PS10, Line 13: s
> typo?
Done



--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 11
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 27 Aug 2019 14:43:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-27 Thread Zoltan Borok-Nagy (Code Review)
Hello Yongzhi Chen, Vihang Karajgaonkar, Tim Armstrong, Csaba Ringhofer, Dinesh 
Garg, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14071

to look at the new patch set (#11).

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..

IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

This commit adds support for the TRUNCATE statement for transactional
tables. TRUNCATE on transactional tables doesn't remove the files, it
just creates new empty ACID base directories.

Although from a filesystem point of view the directories won't be
completely empty. They will contain the hidden '_empty' file, so
FileSystemUtil.listFiles() will list them. It's similar to Hive's
behavior which creates the magic file '_orc_acid_version' in each
transactional directory. Except Impala only creates these '_empty'
files in empty base directories.

All the functionality is implemented in the catalog, therefore
transaction handling, table locking, heartbeating also happens there.
Added the new Transaction class that should prevent leakage of
transactions, it should be used in try-with-resources statements.

Testing:
Added backend tests that truncate non-partitioned and partitioned
tables as well.
The tests check whether the statistics were removed.
The tests also check if Hive sees the effect of truncation.

Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
---
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
A fe/src/main/java/org/apache/impala/catalog/Transaction.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test
M tests/metadata/test_hms_integration.py
M tests/query_test/test_acid.py
11 files changed, 368 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/11
--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 11
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-27 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 10: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14071/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14071/10//COMMIT_MSG@13
PS10, Line 13: s
typo?



--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 10
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 27 Aug 2019 14:15:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4384/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 10
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 27 Aug 2019 13:00:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4383/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 27 Aug 2019 12:35:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-27 Thread Zoltan Borok-Nagy (Code Review)
Hello Yongzhi Chen, Vihang Karajgaonkar, Tim Armstrong, Csaba Ringhofer, Dinesh 
Garg, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14071

to look at the new patch set (#10).

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..

IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

This commit adds support for the TRUNCATE statement for transactional
tables. TRUNCATE on transactional tables doesn't remove the files, it
just creates new empty ACID base directories.

Although from a filessystem point of view the directories won't be
completely empty. They will contain the hidden '_empty' file, so
FileSystemUtil.listFiles() will list them. It's similar to Hive's
behavior which creates the magic file '_orc_acid_version' in each
transactional directory. Except Impala only creates these '_empty'
files in empty base directories.

All the functionality is implemented in the catalog, therefore
transaction handling, table locking, heartbeating also happens there.
Added the new Transaction class that should prevent leakage of
transactions, it should be used in try-with-resources statements.

Testing:
Added backend tests that truncate non-partitioned and partitioned
tables as well.
The tests check whether the statistics were removed.
The tests also check if Hive sees the effect of truncation.

Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
---
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
A fe/src/main/java/org/apache/impala/catalog/Transaction.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test
M tests/metadata/test_hms_integration.py
M tests/query_test/test_acid.py
11 files changed, 368 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/10
--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 10
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-27 Thread Zoltan Borok-Nagy (Code Review)
Hello Yongzhi Chen, Vihang Karajgaonkar, Tim Armstrong, Csaba Ringhofer, Dinesh 
Garg, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14071

to look at the new patch set (#8).

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..

IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

This commit adds support for the TRUNCATE statement for transactional
tables. TRUNCATE on transactional tables doesn't remove the files, it
just creates new empty ACID base directories.

Although from a filessystem point of view the directories won't be
completely empty. They will contain the hidden '_empty' file, so
FileSystemUtil.listFiles() will list them. It's similar to Hive's
behavior which creates the magic file '_orc_acid_version' in each
transactional directory. Except Impala only creates these '_empty'
files in empty base directories.

All the functionality is implemented in the catalog, therefore
transaction handling, table locking, heartbeating also happens there.
Added the new Transaction class that should prevent leakage of
transactions, it should be used in try-with-resources statements.

Testing:
Added backend tests that truncate non-partitioned and partitioned
tables as well.
The tests check whether the statistics were removed.
The tests also check if Hive sees the effect of truncation.

Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
---
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
A fe/src/main/java/org/apache/impala/catalog/Transaction.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test
M tests/metadata/test_hms_integration.py
M tests/query_test/test_acid.py
11 files changed, 368 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/8
--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 7:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/4367/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 26 Aug 2019 16:35:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 6:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/4366/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 26 Aug 2019 16:27:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-26 Thread Zoltan Borok-Nagy (Code Review)
Hello Yongzhi Chen, Vihang Karajgaonkar, Tim Armstrong, Csaba Ringhofer, Dinesh 
Garg, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14071

to look at the new patch set (#7).

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..

IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

This commit adds support for the TRUNCATE statement for transactional
tables. TRUNCATE on transactional tables doesn't remove the files, it
just creates new empty ACID base directories.

Although from a filessystem point of view the directories won't be
completely empty. They will contain the hidden '_empty' file, so
FileSystemUtil.listFiles() will list them. It's similar to Hive's
behavior which creates the magic file '_orc_acid_version' in each
transactional directory. Except Impala only creates these '_empty'
files in empty base directories.

All the functionality is implemented in the catalog, therefore
transaction handling, table locking, heartbeating also happens there.
Added the new Transaction class that should prevent leakage of
transactions, it should be used in try-with-resources statements.

Testing:
Added backend tests that truncate non-partitioned and partitioned
tables as well.
The tests check whether the statistics were removed.
The tests also check if Hive sees the effect of truncation.

Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
A fe/src/main/java/org/apache/impala/catalog/Transaction.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test
M tests/metadata/test_hms_integration.py
M tests/query_test/test_acid.py
9 files changed, 353 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/7
--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-26 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14071/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14071/5//COMMIT_MSG@10
PS5, Line 10: it
: just creates new empty ACID base directories.
> Can you mention the _empty file + that this is the same as the way Hive wor
Extended the commit message.


http://gerrit.cloudera.org:8080/#/c/14071/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14071/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1912
PS5, Line 1912: _empty
> It would be nice to mention that Hive works the same way + maybe add a the
Added link to the IMPALA Jira, I don't know the exact Hive Jira when they 
introduced _orc_acid_version.

Hive uses this file differently than we use this "_empty" file. It creates the 
magic file in every transactional directory and as its name suggests it stores 
the ACID version being used. I'm not sure that it is a good idea to comment 
about Hive's behavior here because it could be distracting.



--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 26 Aug 2019 15:57:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-26 Thread Zoltan Borok-Nagy (Code Review)
Hello Yongzhi Chen, Vihang Karajgaonkar, Tim Armstrong, Csaba Ringhofer, Dinesh 
Garg, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14071

to look at the new patch set (#6).

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..

IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

This commit adds support for the TRUNCATE statement for transactional
tables. TRUNCATE on transactional tables doesn't remove the files, it
just creates new empty ACID base directories.

Although from a filessystem point of view the directories won't be
completely empty. They will contain the hidden '_empty' file, so
FileSystemUtil.listFiles() will list them. It's similar to Hive's
behavior which creates the magic file '_orc_acid_version' in each
transactional directory. Except Impala only creates these '_empty'
files in empty base directories.

All the functionality is implemented in the catalog, therefore
transaction handling, table locking, heartbeating also happens there.
Added the new Transaction class that should prevent leakage of
transactions, it should be used in try-with-resources statements.

Testing:
Added backend tests that truncate non-partitioned and partitioned
tables as well.
The tests check whether the statistics were removed.
The tests also check if Hive sees the effect of truncation.

Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
A fe/src/main/java/org/apache/impala/catalog/Transaction.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test
M tests/metadata/test_hms_integration.py
M tests/query_test/test_acid.py
9 files changed, 336 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/6
--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 5:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/4365/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 26 Aug 2019 15:38:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-26 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 5: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14071/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14071/5//COMMIT_MSG@10
PS5, Line 10: it
: just creates new empty ACID base directories.
Can you mention the _empty file + that this is the same as the way Hive works?


http://gerrit.cloudera.org:8080/#/c/14071/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14071/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1912
PS5, Line 1912: _empty
It would be nice to mention that Hive works the same way + maybe add a the Hive 
Jira about this.



--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 26 Aug 2019 15:15:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-26 Thread Zoltan Borok-Nagy (Code Review)
Hello Yongzhi Chen, Vihang Karajgaonkar, Tim Armstrong, Csaba Ringhofer, Dinesh 
Garg, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14071

to look at the new patch set (#5).

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..

IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

This commit adds support for the TRUNCATE statement for transactional
tables. TRUNCATE on transactional tables doesn't remove the files, it
just creates new empty ACID base directories.

All the functionality is implemented in the catalog, therefore
transaction handling, table locking, heartbeating also happens there.
Added the new Transaction class that should prevent leakage of
transactions, it should be used in try-with-resources statements.

Testing:
Added backend tests that truncate non-partitioned and partitioned
tables as well.
The tests check whether the statistics were removed.
The tests also check if Hive sees the effect of truncation.

Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
A fe/src/main/java/org/apache/impala/catalog/Transaction.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test
M tests/metadata/test_hms_integration.py
M tests/query_test/test_acid.py
9 files changed, 336 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/5
--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-26 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14071/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14071/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1850
PS4, Line 1850: try (MetaStoreClient msClient = catalog_.getMetaStoreClient())
> The common pattern seems to be to hold only one msClient at the same time i
I agree that it is something that we should clean up. The MetastoreShim methods 
usually take an IMetaStoreClient, methods from other classes access to a 
Catalog object so they can just ask another client from the pool. Maybe we 
should always pass around the pool?

The cleanup might require quite a few changes so I think we should do it in a 
separate jira.



--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 26 Aug 2019 13:47:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-23 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 4: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14071/4/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/14071/4/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@241
PS4, Line 241: // Correct validWriteIdList is needed
 : // to commit the alter partitions operation in hms side.
 : client.alter_partitions(dbName, tblName, partitions, null,
 : tblTxn.validWriteIds, tblTxn.writeId);
Ouch, thanks for fixing this.


http://gerrit.cloudera.org:8080/#/c/14071/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14071/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1850
PS4, Line 1850: try (MetaStoreClient msClient = catalog_.getMetaStoreClient())
The common pattern seems to be to hold only one msClient at the same time in a 
single DDL operation, while here we can hold two as unsetTableColStats also 
asks for one.  getMetaStoreClient() asks for a free client from the pool, so it 
is not just returning the same client.

We also seem to try to hold them for as little time as possible, while here we 
call out to another component (HDFS) while we are holding the HMS client.

I don't know how often to we have problems with running out of HMS client, so 
this may not matter in practice.



--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 23 Aug 2019 17:15:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 4:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/4344/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 23 Aug 2019 16:42:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-23 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 4:

Sorry about the diff between PS3 and PS4, I thought I already upload a rebase 
commit after PS3.


--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 23 Aug 2019 16:10:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-23 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14071/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

http://gerrit.cloudera.org:8080/#/c/14071/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java@705
PS3, Line 705:   public void lockTableInTransaction(String dbName, String 
tableName, long txnId,
> We could get the caller to pass in a Transaction object, I think I'd prefer
Done. I kept the long for lockTableInternal() because it can take 0L as no 
transaction, and it's also a private method.


http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1838
PS2, Line 1838:   unsetPartitionsColStats(table.getMetaStoreTable(), 
hmsPartitions,
  :   writeId, txn.getId());
  : }
  : // Remove COLUMN_STATS_ACCURATE property from the table.
  : unsetTableColStats(table.getMetaStoreTable(), 
txn.getId()
> Thanks for the notice, will fix it in a separate rebase commit.
Done.



--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 23 Aug 2019 16:03:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-23 Thread Zoltan Borok-Nagy (Code Review)
Hello Yongzhi Chen, Vihang Karajgaonkar, Tim Armstrong, Csaba Ringhofer, Dinesh 
Garg, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14071

to look at the new patch set (#4).

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..

IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

This commit adds support for the TRUNCATE statement for transactional
tables. TRUNCATE on transactional tables doesn't remove the files, it
just creates new empty ACID base directories.

All the functionality is implemented in the catalog, therefore
transaction handling, table locking, heartbeating also happens there.
Added the new Transaction class that should prevent leakage of
transactions, it should be used in try-with-resources statements.

Testing:
Added backend tests that truncate non-partitioned and partitioned
tables as well.
The tests check whether the statistics were removed.
The tests also check if Hive sees the effect of truncation.

Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
A fe/src/main/java/org/apache/impala/catalog/Transaction.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test
M tests/metadata/test_hms_integration.py
M tests/query_test/test_acid.py
10 files changed, 336 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/4
--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 3: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14071/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

http://gerrit.cloudera.org:8080/#/c/14071/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java@705
PS3, Line 705:   public void lockTableInTransaction(String dbName, String 
tableName, long txnId,
We could get the caller to pass in a Transaction object, I think I'd prefer the 
strongly-typed interface.



--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 22 Aug 2019 18:24:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 3:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/4331/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 22 Aug 2019 14:38:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-22 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 3:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Catalog.java@732
PS2, Line 732: st
> Can this be called with txnId == -1? I would change this to <=0 or add a pr
This is a private method, this should only be called via the public 
LockTableStandalone() and LockTableInTransaction() methods. Via these methods 
txnId can only be >=0.

Added precondition check in case someone else adds a new method that calls this 
one.


http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java
File fe/src/main/java/org/apache/impala/catalog/Transaction.java:

http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java@17
PS2, Line 17: Transaction
> Does this class need to be thread-safe? I can see there are possible races
The idea is to use it from a single thread in a try-with-resources statement, 
so I rather keep it simple unless a good use case comes that requires it to be 
thread-safe.


http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java@20
PS2, Line 20: transactionId_
> can we make this final? I see that you are using this to keep track of whet
See answer above.


http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java@38
PS2, Line 38: checkState(transactionId_ > 0)
> Does this need to throw an exception? What is the behavior from HMS if a cl
I think we shouldn't commit a transaction twice so it checks that Impalas's 
logic works well.

HMS throws NoSuchTxnException in case of already committed transactions.


http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java@46
PS2, Line 46: transactionId_
> nit, can we change this to if (transactionId_ < 0) return; makes it easier
I agree, done.


http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1800
PS2, Line 1800: creation
> typo: creation
Done


http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1821
PS2, Line 1821: ly Impala cannot upda
> Its unclear why we are clearing this property. Can you add a comment explai
Explained it in code comment.


http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1829
PS2, Line 1829: org.apache.hadoop.hive.metastore.api.Partition 
hmsPart =
  : ((HdfsPartition)part).toHmsPartition();
> Do we have to alter the partition if COLUMN_STATS_ACCURATE was set? In a ta
We talked about it offline, but for the record:
It's safer to always remove this property in case of writes. E.g. if Hive 
computes statistics and later Impala INSERTs data without invalidating its 
metadata first, then Impala won't clear COLUMN_STATS_ACCURATE (because without 
invalidate metadata it won't see this property and wants to save an RPC). In 
the end Hive will return wrong results.


http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1838
PS2, Line 1838: }
  : // For partitioned tables we need to alter all the 
partitions in HMS.
  : if (!hmsPartitions.isEmpty()) {
  :   unsetPartitionsColStats(table.getMetaStoreTable(), 
hmsPartitions,
  :   writeId, txn.getId());
> Note that this will break when https://gerrit.cloudera.org/#/c/14066 is mer
Thanks for the notice, will fix it in a separate rebase commit.


http://gerrit.cloudera.org:8080/#/c/14071/2/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test
File testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test:

http://gerrit.cloudera.org:8080/#/c/14071/2/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test@96
PS2, Line 96: truncate table pt;
> Shouldn't this have a RESULTS section?
No, we check a Hive query in test_acid.py.



--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071

[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-22 Thread Zoltan Borok-Nagy (Code Review)
Hello Yongzhi Chen, Vihang Karajgaonkar, Tim Armstrong, Csaba Ringhofer, Dinesh 
Garg, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14071

to look at the new patch set (#3).

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..

IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

This commit adds support for the TRUNCATE statement for transactional
tables. TRUNCATE on transactional tables doesn't remove the files, it
just creates new empty ACID base directories.

All the functionality is implemented in the catalog, therefore
transaction handling, table locking, heartbeating also happens there.
Added the new Transaction class that should prevent leakage of
transactions, it should be used in try-with-resources statements.

Testing:
Added backend tests that truncate non-partitioned and partitioned
tables as well.
The tests check whether the statistics were removed.
The tests also check if Hive sees the effect of truncation.

Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
---
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
A fe/src/main/java/org/apache/impala/catalog/Transaction.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test
M tests/metadata/test_hms_integration.py
M tests/query_test/test_acid.py
9 files changed, 333 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/3
--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-16 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Catalog.java@732
PS2, Line 732: 0L
Can this be called with txnId == -1? I would change this to <=0 or add a 
precondition for 0<=txnId


http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1800
PS2, Line 1800: createion
typo: creation


http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1829
PS2, Line 1829: if (hmsPart.getParameters() != null) {
  :   
hmsPart.getParameters().remove(StatsSetupConst.COLUMN_STATS_ACCURATE);
Do we have to alter the partition if COLUMN_STATS_ACCURATE was set? In a table 
that is generally written by Impala it will be never set and we could spare the 
alter partitions RPC. This optimization was implemented for INSERT in 
https://gerrit.cloudera.org/#/c/14037/8/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java


http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1838
PS2, Line 1838:   unsetPartitionsColStats(table.getMetaStoreTable(), 
hmsPartitions,
  :   writeId, txn.getId());
  : }
  : // Remove COLUMN_STATS_ACCURATE property from the table.
  : unsetTableColStats(table.getMetaStoreTable(), 
txn.getId()
Note that this will break when https://gerrit.cloudera.org/#/c/14066 is merged.


http://gerrit.cloudera.org:8080/#/c/14071/2/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test
File testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test:

http://gerrit.cloudera.org:8080/#/c/14071/2/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test@96
PS2, Line 96: truncate table pt;
Shouldn't this have a RESULTS section?



--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 16 Aug 2019 19:16:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-16 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java
File fe/src/main/java/org/apache/impala/catalog/Transaction.java:

http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java@17
PS2, Line 17: Transaction
Does this class need to be thread-safe? I can see there are possible races with 
respect to the transactionId_ being used as the flag to track if this 
transaction is open or close.


http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java@20
PS2, Line 20: transactionId_
can we make this final? I see that you are using this to keep track of whether 
the transaction is open or close. May be you can use a AtomicBoolean to keep 
track of that separately so that commit() can be made thread-safe.


http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java@38
PS2, Line 38: checkState(transactionId_ > 0)
Does this need to throw an exception? What is the behavior from HMS if a client 
calls commit on a already committed transaction?


http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/catalog/Transaction.java@46
PS2, Line 46: transactionId_
nit, can we change this to if (transactionId_ < 0) return; makes it easier to 
read in my opinion.


http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14071/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1821
PS2, Line 1821: COLUMN_STATS_ACCURATE
Its unclear why we are clearing this property. Can you add a comment explaining 
the same?



--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 16 Aug 2019 17:22:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-16 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has removed a vote on this change.

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Removed Code-Review+2 by Zoltan Borok-Nagy 
--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-16 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 2: Code-Review+2

+2ed it so you can carry it if you fix those nits


--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 16 Aug 2019 16:03:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 2:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/4280/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 16 Aug 2019 15:08:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-16 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 2:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
File fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java:

http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java@71
PS1, Line 71:   }
> I think this line can be removed, the check on line 70 already blocks the F
Done


http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@684
PS1, Line 684:* heartbeating the lock. This function is for locks that 
doesn't belong to a
> Might be worth explicitly saying that the client is responsible for calling
Done


http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@687
PS1, Line 687: he particular table is.
> I think we should hide this from callers. How about we add two public varia
Modified the lockTable() interface as suggested.

Yeah it would be useful to create a LockGuard as well, but the code structure 
is a bit different for drop table and I don't want to mess it up. But we plan 
to do some cleanup soon.


http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@704
PS1, Line 704:*/
> Is there a corresponding remove lock part?
For truncate, the lock is removed when the transaction is aborted/committed.

DROP table uses this method also, it removes the lock manually because DROP 
doesn't need to run in transaction.


http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Transaction.java
File fe/src/main/java/org/apache/impala/catalog/Transaction.java:

http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Transaction.java@17
PS1, Line 17: public class Transaction implements AutoCloseable {
> This is really nice
Thanks!


http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Transaction.java@37
PS1, Line 37:   public void commit() throws TransactionException {
> Precondition check that transactionId > 0?
Done


http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1817
PS1, Line 1817:
> typo: hmsPartitions
Done


http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1832
PS1, Line 1832:   hmsPartitions.add(hmsPart);
> Can you comment what this special case is required for? Do we have a test c
It's skipped for non-partitioned tables. Extended the comment about it.

The tests check column stats-property removal for both partitioned and 
non-partitioned tables.


http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1854
PS1, Line 1854:* @param writeId the write id of the new base directory.
> Could you comment a little more on why a transactional table does not remov
Extended truncateTransactionalTable()'s comment.


http://gerrit.cloudera.org:8080/#/c/14071/1/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test
File testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test:

PS1:
> Can you also test truncating a completely empty table? I think that's a som
Added a TRUNCATE stmt just after table creation.


http://gerrit.cloudera.org:8080/#/c/14071/1/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test@40
PS1, Line 40: 
> Can you truncate the table again here, to confirm that it's idempotent (i.e
Sure, done.


http://gerrit.cloudera.org:8080/#/c/14071/1/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test@90
PS1, Line 90: 
> Do we need to confirm that the stats are correct?
Unfortunately for partitioned tables we can only check it implicitly.

We truncate the table here, then in test_acid.py::test_acid_truncate() we issue 
a count(*) query towards Hive and if it sees 0 rows then it means we 
successfully invalidated the statistics.

Other examples for this are the test_*_statschg methods in test_acid.py.



--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071

[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-16 Thread Zoltan Borok-Nagy (Code Review)
Hello Yongzhi Chen, Tim Armstrong, Dinesh Garg, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14071

to look at the new patch set (#2).

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..

IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

This commit adds support for the TRUNCATE statement for transactional
tables. TRUNCATE on transactional tables doesn't remove the files, it
just creates new empty ACID base directories.

All the functionality is implemented in the catalog, therefore
transaction handling, table locking, heartbeating also happens there.
Added the new Transaction class that should prevent leakage of
transactions, it should be used in try-with-resources statements.

Testing:
Added backend tests that truncate non-partitioned and partitioned
tables as well.
The tests check whether the statistics were removed.
The tests also check if Hive sees the effect of truncation.

Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
---
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
A fe/src/main/java/org/apache/impala/catalog/Transaction.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test
M tests/metadata/test_hms_integration.py
M tests/query_test/test_acid.py
9 files changed, 330 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/2
--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 1:

(9 comments)

I think this is pretty good, some cleanup and some tests suggested.

http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@684
PS1, Line 684:* heartbeating the lock if it doesn't have a transaction 
context.
Might be worth explicitly saying that the client is responsible for calling 
releaseTableLock()?


http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@687
PS1, Line 687: 0 for standalone locks.
I think we should hide this from callers. How about we add two public variants 
of lockTable() - lockTableInTransaction() that takes a Transaction argument, 
and lockTableStandalone() that doesn't take a txnId argument. They could both 
use this function as the private implementation.

It would also be nice to use the auto-close pattern for standalone locks, but 
that's unrelated to this change.


http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Transaction.java
File fe/src/main/java/org/apache/impala/catalog/Transaction.java:

http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Transaction.java@17
PS1, Line 17: public class Transaction implements AutoCloseable {
This is really nice


http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Transaction.java@37
PS1, Line 37:   public void commit() throws TransactionException {
Precondition check that transactionId > 0?


http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1817
PS1, Line 1817: hmsPartitons
typo: hmsPartitions


http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1832
PS1, Line 1832: if (!hmsPartitons.isEmpty()) {
Can you comment what this special case is required for? Do we have a test case 
that exercises the case when we skip unsetting the stats.


http://gerrit.cloudera.org:8080/#/c/14071/1/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test
File testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test:

PS1:
Can you also test truncating a completely empty table? I think that's a 
somewhat interesting case.


http://gerrit.cloudera.org:8080/#/c/14071/1/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test@40
PS1, Line 40: 
Can you truncate the table again here, to confirm that it's idempotent (i.e. 
truncating a truncated table doesn't break anything).


http://gerrit.cloudera.org:8080/#/c/14071/1/testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test@90
PS1, Line 90: truncate table pt;
Do we need to confirm that the stats are correct?



--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dinesh Garg (430)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 15 Aug 2019 21:08:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-15 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 1:

(3 comments)

A couple of my comments.

http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
File fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java:

http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java@71
PS1, Line 71: analyzer.ensureTableNotFullAcid(table_);
I think this line can be removed, the check on line 70 already blocks the 
FullAcid table.


http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@704
PS1, Line 704:   if (txnId == 0L) transactionKeepalive_.addLock(lockId, 
ctx);
Is there a corresponding remove lock part?


http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14071/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1854
PS1, Line 1854: for (FeFsPartition part: partitions) {
Could you comment a little more on why a transactional table does not remove 
files? Can that cause file leaks?



--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 15 Aug 2019 18:49:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14071 )

Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/4265/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 15 Aug 2019 18:41:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

2019-08-15 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14071


Change subject: IMPALA-8793: Implement TRUNCATE for insert-only ACID tables
..

IMPALA-8793: Implement TRUNCATE for insert-only ACID tables

This commit adds support for the TRUNCATE statement for transactional
tables. TRUNCATE on transactional tables doesn't remove the files, it
just creates new empty ACID base directories.

All the functionality is implemented in the catalog, therefore
transaction handling, table locking, heartbeating also happens there.
Added the new Transaction class that should prevent leakage of
transactions, it should be used in try-with-resources statements.

Testing:
Added backend tests that truncate non-partitioned and partitioned
tables as well.
The tests check whether the statistics were removed.
The tests also check if Hive sees the effect of truncation.

Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
---
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
A fe/src/main/java/org/apache/impala/catalog/Transaction.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
A testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test
M tests/metadata/test_hms_integration.py
M tests/query_test/test_acid.py
9 files changed, 294 insertions(+), 23 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14071/1
--
To view, visit http://gerrit.cloudera.org:8080/14071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic749b7f27da157e1c0ebf9b7e9b6ee09afad122a
Gerrit-Change-Number: 14071
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy