[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..

IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

Hive depends on the COLUMN_STATS_ACCURATE property to tell whether the
stored statistics are accurate. After Impala inserts data, it does
not bring the statistics values (for example numRows) up to date.
Impala should unset COLUMN_STATS_ACCURATE to tell Hive that the
stored stats are no longer accurate.
The patch implements the following.
After Impala inserts data:
Remove COLUMN_STATS_ACCURATE from table properties if it exists
Remove COLUMN_STATS_ACCURATE from partition params if it exists
Add helper methods to handle alter table/partition for acid
tables.

Implements the stats changes above for both acid/non-acid tables.
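
The core of the change can be pictured as dropping one key from a metastore parameter map. The following is a minimal Python sketch, assuming the table properties and partition params are plain dictionaries. The helper name clear_stats_accurate and the sample values are hypothetical, and this is not the Java code in CatalogOpExecutor.

```python
COLUMN_STATS_ACCURATE = "COLUMN_STATS_ACCURATE"


def clear_stats_accurate(params):
    """Remove COLUMN_STATS_ACCURATE from a parameter dict if present.

    Returns True when the dict was modified, so that a caller could skip
    the alter table/partition call when there is nothing to change.
    """
    return params.pop(COLUMN_STATS_ACCURATE, None) is not None


# Hypothetical table properties and partition params (assumed values).
table_props = {"numRows": "4", COLUMN_STATS_ACCURATE: '{"BASIC_STATS":"true"}'}
partition_params = {"numRows": "4"}

table_changed = clear_stats_accurate(table_props)
partition_changed = clear_stats_accurate(partition_params)
```

Returning a flag keeps the "if it exists" condition explicit: only a map that actually held the property needs to be written back.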

Tests:
Manual tests.
Run core tests.
Add ee tests to test interop with Hive for acid/external tables.

Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Reviewed-on: http://gerrit.cloudera.org:8080/14037
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A testdata/workloads/functional-query/queries/QueryTest/acid-clear-statsaccurate.test
A testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test
M tests/query_test/test_acid.py
6 files changed, 340 insertions(+), 5 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/14037
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 8
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 7: Verified+1



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 7
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 18:55:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-13 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 6: Code-Review+2

Okay, thanks for the explanation!



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 14:41:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-13 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 6:

(1 comment)

Because a partition's COLUMN_STATS_ACCURATE property is never shown in "show 
create table" or "show partitions", I added a Hive select count(*) query to the 
tests. It shows that the patch works, because Hive does not return wrong count 
values after an Impala insert.

http://gerrit.cloudera.org:8080/#/c/14037/6/testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test
File 
testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test:

http://gerrit.cloudera.org:8080/#/c/14037/6/testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test@45
PS6, Line 45: 
> Again, no checks about the stats for the partitioned table.
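For the same reason: the partition's COLUMN_STATS_ACCURATE property is not shown 
by "show create table" or "show partitions", so the test relies on the Hive 
select count(*) check instead.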
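
The reasoning behind the count(*) check can be sketched with a toy model. It assumes that Hive may answer count(*) from the stored numRows when COLUMN_STATS_ACCURATE marks the basic stats as accurate (Hive has the hive.compute.query.using.stats option for this) and scans the data otherwise. The function name, the sample values and the simplified property format are assumptions made for illustration only.

```python
import json


def hive_count_star(params, rows_on_disk):
    """Toy model: trust numRows only if basic stats are marked accurate."""
    accurate = json.loads(params.get("COLUMN_STATS_ACCURATE", "{}"))
    if accurate.get("BASIC_STATS") == "true":
        return int(params["numRows"])  # answered from metadata
    return rows_on_disk  # answered by scanning the data


# Stats were computed when the partition held 3 rows (assumed values).
params = {"numRows": "3", "COLUMN_STATS_ACCURATE": '{"BASIC_STATS":"true"}'}

# One more row is inserted without updating the stats: stale answer.
stale = hive_count_star(params, rows_on_disk=4)

# With the property removed, the toy model falls back to scanning.
params.pop("COLUMN_STATS_ACCURATE")
fresh = hive_count_star(params, rows_on_disk=4)
```

In this model the stale call returns the old row count and the second call returns the real one, which is the difference the Hive query in the test is meant to expose.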




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 14:19:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-13 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 6:

Because a partition's COLUMN_STATS_ACCURATE property is never shown in "show 
create table" or "show partitions", I added a Hive select count(*) query to the 
tests. It shows that the patch works, because Hive does not return wrong count 
values after an Impala insert.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 14:17:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-13 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14037/6/testdata/workloads/functional-query/queries/QueryTest/acid-clear-statsaccurate.test
File 
testdata/workloads/functional-query/queries/QueryTest/acid-clear-statsaccurate.test:

http://gerrit.cloudera.org:8080/#/c/14037/6/testdata/workloads/functional-query/queries/QueryTest/acid-clear-statsaccurate.test@52
PS6, Line 52:
Didn't you want to add a 'show create table' statement for the partitioned 
table to check the removal of COLUMN_STATS_ACCURATE?


http://gerrit.cloudera.org:8080/#/c/14037/6/testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test
File 
testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test:

http://gerrit.cloudera.org:8080/#/c/14037/6/testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test@45
PS6, Line 45: 
Again, no checks about the stats for the partitioned table.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 14:00:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4222/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 12 Aug 2019 19:24:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-12 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 5:

(2 comments)

Submitted patch set 6.

http://gerrit.cloudera.org:8080/#/c/14037/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14037/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3762
PS3, Line 3762: table.getDb().getName(), table.getName());
  :   }
  : }
> You could get the write id of INSERT the same way as we get the transaction
I concluded that the write ID is the same from the following test. The listings 
below show the write ID in the PARTITIONS table after each of these steps:
1. Hive computes stats for the whole table (analyze table 
insertonly_part_colstats compute statistics for columns;).
2. Impala inserts a row, which removes COLUMN_STATS_ACCURATE.
3. Hive computes stats for the whole table again.
You can see that the write ID in PARTITIONS increases by only 1 each time, which 
means no write ID is wasted. This is consistent with Hive's insert statement; 
see the last select on PARTITIONS, which was run after Hive inserted a row into 
the 2010-01-01 partition:
HMS_home_yongzhi_Impala_cdp=> select * from "PARTITIONS" where "TBL_ID"=3274;
 PART_ID | CREATE_TIME | LAST_ACCESS_TIME |   PART_NAME   | SD_ID | TBL_ID | WRITE_ID
---------+-------------+------------------+---------------+-------+--------+----------
   15991 |  1565289739 |                0 | ds=2010-01-01 | 19250 |   3274 |       18
   15992 |  1565289739 |                0 | ds=2010-01-02 | 19251 |   3274 |       18
(2 rows)

HMS_home_yongzhi_Impala_cdp=> select * from "PARTITIONS" where "TBL_ID"=3274;
 PART_ID | CREATE_TIME | LAST_ACCESS_TIME |   PART_NAME   | SD_ID | TBL_ID | WRITE_ID
---------+-------------+------------------+---------------+-------+--------+----------
   15991 |  1565289739 |                0 | ds=2010-01-01 | 19250 |   3274 |       19
   15992 |  1565289739 |                0 | ds=2010-01-02 | 19251 |   3274 |       18
(2 rows)

HMS_home_yongzhi_Impala_cdp=> select * from "PARTITION_PARAMS" where "PART_ID"=15991;
 PART_ID |       PARAM_KEY       | PARAM_VALUE
---------+-----------------------+-------------
   15991 | transient_lastDdlTime | 1565633500
   15991 | numFiles              | 4
   15991 | totalSize             | 8
   15991 | numRows               | 4
   15991 | rawDataSize           | 4
(5 rows)

HMS_home_yongzhi_Impala_cdp=> select * from "TBLS" where "TBL_NAME"='insertonly_part_colstats';
HMS_home_yongzhi_Impala_cdp=> select * from "PARTITIONS" where "TBL_ID"=3274;
 PART_ID | CREATE_TIME | LAST_ACCESS_TIME |   PART_NAME   | SD_ID | TBL_ID | WRITE_ID
---------+-------------+------------------+---------------+-------+--------+----------
   15991 |  1565289739 |                0 | ds=2010-01-01 | 19250 |   3274 |       20
   15992 |  1565289739 |                0 | ds=2010-01-02 | 19251 |   3274 |       20
(2 rows)

HMS_home_yongzhi_Impala_cdp=> select * from "PARTITIONS" where "TBL_ID"=3274;
 PART_ID | CREATE_TIME | LAST_ACCESS_TIME |   PART_NAME   | SD_ID | TBL_ID | WRITE_ID
---------+-------------+------------------+---------------+-------+--------+----------
   15991 |  1565289739 |                0 | ds=2010-01-01 | 19250 |   3274 |       21
   15992 |  1565289739 |                0 | ds=2010-01-02 | 19251 |   3274 |       20
(2 rows)
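
The write IDs in the four PARTITIONS listings above can also be checked mechanically. This small Python sketch copies the WRITE_ID values from those listings and confirms that the highest write ID grows by exactly 1 from one listing to the next. The variable names are illustrative only.

```python
# WRITE_ID per partition, copied from the four PARTITIONS listings above.
snapshots = [
    {"ds=2010-01-01": 18, "ds=2010-01-02": 18},
    {"ds=2010-01-01": 19, "ds=2010-01-02": 18},
    {"ds=2010-01-01": 20, "ds=2010-01-02": 20},
    {"ds=2010-01-01": 21, "ds=2010-01-02": 20},
]

# Highest write ID seen in each listing.
highest = [max(snapshot.values()) for snapshot in snapshots]

# Difference between consecutive listings; 1 everywhere means no gaps.
steps = [later - earlier for earlier, later in zip(highest, highest[1:])]
```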


http://gerrit.cloudera.org:8080/#/c/14037/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14037/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3718
PS5, Line 3718: if (update.isSetTransaction_id()) {
  :   transactionId = update.getTransaction_id();
  : }
> nit: fits single line
Done




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 12 Aug 2019 18:43:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-12 Thread Yongzhi Chen (Code Review)
Hello Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14037

to look at the new patch set (#6).

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..

IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

Hive depends on the COLUMN_STATS_ACCURATE property to tell whether the
stored statistics are accurate. After Impala inserts data, it does
not bring the statistics values (for example numRows) up to date.
Impala should unset COLUMN_STATS_ACCURATE to tell Hive that the
stored stats are no longer accurate.
The patch implements the following.
After Impala inserts data:
Remove COLUMN_STATS_ACCURATE from table properties if it exists
Remove COLUMN_STATS_ACCURATE from partition params if it exists
Add helper methods to handle alter table/partition for acid
tables.

Implements the stats changes above for both acid/non-acid tables.

Tests:
Manual tests.
Run core tests.
Add ee tests to test interop with Hive for acid/external tables.

Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
---
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A testdata/workloads/functional-query/queries/QueryTest/acid-clear-statsaccurate.test
A testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test
M tests/query_test/test_acid.py
6 files changed, 340 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/14037/6

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-12 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 5:

(2 comments)

Did a quick initial pass over it. Overall it looks good to me, but I'm planning 
to do another pass tomorrow.

http://gerrit.cloudera.org:8080/#/c/14037/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14037/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3762
PS3, Line 3762: table.getDb().getName(), table.getName());
  :   }
  : }
> From my test, it seems the same value. How to get Insert statement's writeI
You could get the write id of INSERT the same way as we get the transaction id, 
i.e. by putting it in the relevant thrift object and transferring it from the 
coordinator.

But since allocateTableWriteId() returns the same write id I think it's not a 
problem to get it this way. It's just one extra round-trip to HMS.
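
The point that a repeated allocation returns the same write id can be illustrated with a toy allocator. This is only a model of the behaviour described in the comment above, not the HMS client API. The class ToyWriteIdAllocator, its method and the sample transaction ids are hypothetical.

```python
class ToyWriteIdAllocator:
    """Hands out one write id per (transaction, table) pair."""

    def __init__(self):
        self._next_id = {}    # table name -> next unused write id
        self._allocated = {}  # (txn id, table name) -> write id

    def allocate_table_write_id(self, txn_id, table):
        key = (txn_id, table)
        if key not in self._allocated:
            write_id = self._next_id.get(table, 1)
            self._next_id[table] = write_id + 1
            self._allocated[key] = write_id
        return self._allocated[key]


allocator = ToyWriteIdAllocator()
# A first call for a transaction, for example on behalf of the INSERT.
insert_write_id = allocator.allocate_table_write_id(txn_id=7, table="t")
# A later call for the same transaction returns the same id.
catalog_write_id = allocator.allocate_table_write_id(txn_id=7, table="t")
# A different transaction on the same table gets the next id.
other_write_id = allocator.allocate_table_write_id(txn_id=8, table="t")
```

Under this model, asking again costs an extra call but never consumes a new write id, which matches the trade-off described above.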


http://gerrit.cloudera.org:8080/#/c/14037/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14037/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3718
PS5, Line 3718: if (update.isSetTransaction_id()) {
  :   transactionId = update.getTransaction_id();
  : }
nit: fits single line




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 12 Aug 2019 17:08:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4216/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Sat, 10 Aug 2019 20:20:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-10 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 5:

(2 comments)

Submitted patch set 5.

http://gerrit.cloudera.org:8080/#/c/14037/3/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/14037/3/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@110
PS3, Line 110:
 :   /**
> This signature should be changed in the Hive2 MetastoreShim
Done


http://gerrit.cloudera.org:8080/#/c/14037/3/tests/query_test/test_acid.py
File tests/query_test/test_acid.py:

http://gerrit.cloudera.org:8080/#/c/14037/3/tests/query_test/test_acid.py@104
PS3, Line 104: pIfABFS.hive
> This sounds weird, it would be good to investigate the cause, but I am ok w
I will do more research later. I have a feeling that the Hive client and the 
Impala client do not use the same thread (at least sometimes), so they sometimes 
cannot see each other's changes because one makes its call before the other has 
really finished.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Sat, 10 Aug 2019 20:08:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-10 Thread Yongzhi Chen (Code Review)
Hello Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14037

to look at the new patch set (#5).

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..

IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

Hive depends on the COLUMN_STATS_ACCURATE property to tell whether the
stored statistics are accurate. After Impala inserts data, it does
not bring the statistics values (for example numRows) up to date.
Impala should unset COLUMN_STATS_ACCURATE to tell Hive that the
stored stats are no longer accurate.
The patch implements the following.
After Impala inserts data:
Remove COLUMN_STATS_ACCURATE from table properties if it exists
Remove COLUMN_STATS_ACCURATE from partition params if it exists
Add helper methods to handle alter table/partition for acid
tables.

Implements the stats changes above for both acid/non-acid tables.

Tests:
Manual tests.
Run core tests.
Add ee tests to test interop with Hive for acid/external tables.

Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
---
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A testdata/workloads/functional-query/queries/QueryTest/acid-clear-statsaccurate.test
A testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test
M tests/query_test/test_acid.py
6 files changed, 342 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/14037/5

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-10 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 4: Code-Review+1

(2 comments)

Apart from the build fix needed for Hive 2 builds, this looks good to me.

http://gerrit.cloudera.org:8080/#/c/14037/3/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/14037/3/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@110
PS3, Line 110:
 :   /**
> Done
This signature should be changed in the Hive2 MetastoreShim
too (this led to a compilation error in the precommit build).


http://gerrit.cloudera.org:8080/#/c/14037/3/tests/query_test/test_acid.py
File tests/query_test/test_acid.py:

http://gerrit.cloudera.org:8080/#/c/14037/3/tests/query_test/test_acid.py@104
PS3, Line 104: pIfABFS.hive
> I have to use hive to create tables to make the test work for I need hive t
This sounds weird, it would be good to investigate the cause, but I am ok with 
leaving it as it is for this patch.

It is known that without SYNC_DDL other Impala coordinators may not see new 
tables for some time, but this is surprising with Hive: Impala's CREATE TABLE 
should wait for the table to be written to the Metastore, and Hive queries 
always check the Metastore for fresh data as far as I know.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 4
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Sat, 10 Aug 2019 14:46:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-09 Thread Yongzhi Chen (Code Review)
Hello Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14037

to look at the new patch set (#4).

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..

IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

Hive depends on the COLUMN_STATS_ACCURATE property to tell whether the
stored statistics are accurate. After Impala inserts data, it does
not bring the statistics values (for example numRows) up to date.
Impala should unset COLUMN_STATS_ACCURATE to tell Hive that the
stored stats are no longer accurate.
The patch implements the following.
After Impala inserts data:
Remove COLUMN_STATS_ACCURATE from table properties if it exists
Remove COLUMN_STATS_ACCURATE from partition params if it exists
Add helper methods to handle alter table/partition for acid
tables.

Implements the stats changes above for both acid/non-acid tables.

Tests:
Manual tests.
Run core tests.
Add ee tests to test interop with Hive for acid/external tables.

Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
---
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A testdata/workloads/functional-query/queries/QueryTest/acid-clear-statsaccurate.test
A testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test
M tests/query_test/test_acid.py
6 files changed, 340 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/14037/4

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 4
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4205/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 09 Aug 2019 15:19:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-09 Thread Yongzhi Chen (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14037

to look at the new patch set (#3).

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..

IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

After Impala inserts data:
Remove COLUMN_STATS_ACCURATE from table properties if it exists
Remove COLUMN_STATS_ACCURATE from partition params if it exists
Add helper methods to handle alter table/partition for acid
tables.
Implemented the stats change for both acid/non-acid tables.

Tests:
Manual tests.
Add ee tests to test interop with Hive for acid/external tables.

Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
---
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A testdata/workloads/functional-query/queries/QueryTest/acid-clear-statsaccurate.test
A testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test
M tests/query_test/test_acid.py
6 files changed, 338 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/14037/3

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-08 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4197/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 2
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 09 Aug 2019 02:21:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-08 Thread Yongzhi Chen (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14037

to look at the new patch set (#2).

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..

IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

After Impala inserts data:
Remove COLUMN_STATS_ACCURATE from table properties if it exists
Remove COLUMN_STATS_ACCURATE from partition params if it exists
Add helper methods to handle alter table/partition for acid table.
Implemented for both acid/non-acid tables.

Tests:
Manual tests.
Add ee test to test interop with Hive for acid table.

Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
---
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A testdata/workloads/functional-query/queries/QueryTest/acid-clear-statsaccurate.test
A testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test
M tests/query_test/test_acid.py
6 files changed, 332 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/14037/2

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 2
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-08 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/4187/ : Initial code 
review checks failed. See linked job for details on the failure.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 08 Aug 2019 21:58:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-08 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/14037/1/tests/query_test/test_acid.py
File tests/query_test/test_acid.py:

http://gerrit.cloudera.org:8080/#/c/14037/1/tests/query_test/test_acid.py@23
PS1, Line 23: import pytest
flake8: F401 'pytest' imported but unused


http://gerrit.cloudera.org:8080/#/c/14037/1/tests/query_test/test_acid.py@24
PS1, Line 24: import time
flake8: F401 'time' imported but unused


http://gerrit.cloudera.org:8080/#/c/14037/1/tests/query_test/test_acid.py@93
PS1, Line 93: #
flake8: E265 block comment should start with '# '


http://gerrit.cloudera.org:8080/#/c/14037/1/tests/query_test/test_acid.py@96
PS1, Line 96: #
flake8: E265 block comment should start with '# '
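
For the two F401 findings the fix is simply to delete the unused import
lines. For E265, the comment needs a space after the hash. A minimal sketch
of that rewrite, assuming a simplified reading of the rule (the helper
fix_block_comment is hypothetical and is not how flake8 itself works):

```python
def fix_block_comment(line):
    """Insert the space that E265 asks for after a leading '#'.

    Simplified assumption: a line needs fixing when, after indentation, it
    starts with '#' followed directly by a character other than a space,
    '#', '!' or ':'. All other lines are returned unchanged.
    """
    stripped = line.lstrip()
    indent = line[:len(line) - len(stripped)]
    if (stripped.startswith("#") and len(stripped) > 1
            and stripped[1] not in " #!:"):
        return indent + "# " + stripped[1:]
    return line
```

For example, fix_block_comment("#check stats") returns "# check stats",
while a bare "#" or a line of code is left as it is.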




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 08 Aug 2019 21:18:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

2019-08-08 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14037 )


Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..

IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

After Impala inserts data:
  Remove COLUMN_STATS_ACCURATE from the table properties if it exists.
  Remove COLUMN_STATS_ACCURATE from the partition params if it exists.
Add helper methods to handle alter table/partition for ACID tables.
Implemented for both ACID and non-ACID tables.

Tests:
Manual tests.
Add an ee test to verify interop with Hive for ACID tables.

Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
---
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A testdata/workloads/functional-query/queries/QueryTest/acid-clear-statsaccurate.test
A testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test
M tests/query_test/test_acid.py
6 files changed, 326 insertions(+), 5 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/14037/1

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen