[Impala-ASF-CR](2.x) IMPALA-6131: Track time of last statistics update in metadata
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10484 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata

IMPALA-6131: Track time of last statistics update in metadata

The timestamp of the last COMPUTE STATS operation is saved to table property "impala.lastComputeStatsTime". The format is the same as in "transient_lastDdlTime", so the two can be compared to check if the schema has changed since computing statistics.

Other changes:
- Handling of "transient_lastDdlTime" is simplified - the old logic set it to current time + 1, if the old version was >= current time, to ensure that it is always increased by DDL operations. This was useful in the past, as IMPALA-387 used lastDdlTime to check if partition data needs to be reloaded, but since IMPALA-1480, Impala does not rely on lastDdlTime at all.
- Computing / setting stats on HDFS tables no longer increases "transient_lastDdlTime".
- When Kudu tables are (re)loaded, it is checked if their HMS representation is up to date, and if it is, then IMetaStoreClient.alter_table() is not called. The old logic always called alter_table() after loading metadata from Kudu. This change was needed to ensure that "transient_lastDdlTime" works similarly in HDFS and Kudu tables, and should also make (re)loading Kudu tables faster.

Notes:
- Kudu will be able to sync its tables to HMS in the near future (see KUDU-2191), so the Kudu metadata handling in Impala may need to be redesigned.

Testing:
tests/metadata/test_last_ddl_time_update.py is extended by
- also checking "impala.lastComputeStatsTime"
- testing more SQL statements
- tests for Kudu tables

Note that test_last_ddl_time_update.py is run only in exhaustive testing.
Change-Id: Ibda49725d3e76456f2d1b3edd1bf117b0174e234
Reviewed-on: http://gerrit.cloudera.org:8080/10484
Reviewed-by: Alex Behm
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/metadata/test_last_ddl_time_update.py
7 files changed, 226 insertions(+), 173 deletions(-)

Approvals:
  Alex Behm: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: Ibda49725d3e76456f2d1b3edd1bf117b0174e234
Gerrit-Change-Number: 10484
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Impala Public Jenkins
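
The commit message says the two properties share a format so that they can be compared. A minimal sketch of such a comparison follows, assuming both values are stored as strings holding seconds since the epoch and that the table properties are already available as a plain mapping. The helper names are hypothetical and are not part of Impala or of this change.

```python
from typing import Mapping, Optional

# Property names taken from the commit message.
LAST_DDL_TIME = "transient_lastDdlTime"
LAST_COMPUTE_STATS_TIME = "impala.lastComputeStatsTime"


def _parse_epoch_seconds(props: Mapping[str, str], key: str) -> Optional[int]:
    """Return the property as an int, or None if it is missing or malformed.

    Assumption: the value is a string of seconds since the epoch.
    """
    value = props.get(key)
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


def ddl_after_last_compute_stats(props: Mapping[str, str]) -> Optional[bool]:
    """Hypothetical helper: True if the last DDL is newer than the last
    COMPUTE STATS, False if not, None if either timestamp is unavailable."""
    ddl = _parse_epoch_seconds(props, LAST_DDL_TIME)
    stats = _parse_epoch_seconds(props, LAST_COMPUTE_STATS_TIME)
    if ddl is None or stats is None:
        return None
    return ddl > stats
```

A caller could treat a `True` result as a hint that statistics may be stale and a `None` result as "unknown", for example when COMPUTE STATS has never been run on the table.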
[Impala-ASF-CR](2.x) IMPALA-6131: Track time of last statistics update in metadata
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10484 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata

Patch Set 1: Verified+1

--
To view, visit http://gerrit.cloudera.org:8080/10484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibda49725d3e76456f2d1b3edd1bf117b0174e234
Gerrit-Change-Number: 10484
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Thu, 24 May 2018 03:59:47 +
Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-6131: Track time of last statistics update in metadata
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10484 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata

Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2538/

--
To view, visit http://gerrit.cloudera.org:8080/10484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibda49725d3e76456f2d1b3edd1bf117b0174e234
Gerrit-Change-Number: 10484
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Thu, 24 May 2018 00:22:56 +
Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-6131: Track time of last statistics update in metadata
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/10484 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata

Patch Set 1: Code-Review+2

--
To view, visit http://gerrit.cloudera.org:8080/10484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibda49725d3e76456f2d1b3edd1bf117b0174e234
Gerrit-Change-Number: 10484
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Comment-Date: Wed, 23 May 2018 16:49:12 +
Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-6131: Track time of last statistics update in metadata
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/10484 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata

Patch Set 1:

The commit on master ( https://gerrit.cloudera.org/#/c/10116/ ) changed AuthorizationTest.java to fix a broken test. That test is not present in 2.x, so the changes to that file were both unnecessary and conflicting.

--
To view, visit http://gerrit.cloudera.org:8080/10484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibda49725d3e76456f2d1b3edd1bf117b0174e234
Gerrit-Change-Number: 10484
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Comment-Date: Wed, 23 May 2018 11:55:21 +
Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-6131: Track time of last statistics update in metadata
Csaba Ringhofer has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10484 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata

IMPALA-6131: Track time of last statistics update in metadata

The timestamp of the last COMPUTE STATS operation is saved to table property "impala.lastComputeStatsTime". The format is the same as in "transient_lastDdlTime", so the two can be compared to check if the schema has changed since computing statistics.

Other changes:
- Handling of "transient_lastDdlTime" is simplified - the old logic set it to current time + 1, if the old version was >= current time, to ensure that it is always increased by DDL operations. This was useful in the past, as IMPALA-387 used lastDdlTime to check if partition data needs to be reloaded, but since IMPALA-1480, Impala does not rely on lastDdlTime at all.
- Computing / setting stats on HDFS tables no longer increases "transient_lastDdlTime".
- When Kudu tables are (re)loaded, it is checked if their HMS representation is up to date, and if it is, then IMetaStoreClient.alter_table() is not called. The old logic always called alter_table() after loading metadata from Kudu. This change was needed to ensure that "transient_lastDdlTime" works similarly in HDFS and Kudu tables, and should also make (re)loading Kudu tables faster.

Notes:
- Kudu will be able to sync its tables to HMS in the near future (see KUDU-2191), so the Kudu metadata handling in Impala may need to be redesigned.

Testing:
tests/metadata/test_last_ddl_time_update.py is extended by
- also checking "impala.lastComputeStatsTime"
- testing more SQL statements
- tests for Kudu tables

Note that test_last_ddl_time_update.py is run only in exhaustive testing.
Change-Id: Ibda49725d3e76456f2d1b3edd1bf117b0174e234
---
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/metadata/test_last_ddl_time_update.py
7 files changed, 226 insertions(+), 173 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/10484/1

--
To view, visit http://gerrit.cloudera.org:8080/10484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibda49725d3e76456f2d1b3edd1bf117b0174e234
Gerrit-Change-Number: 10484
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer
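
For readers following the "transient_lastDdlTime" simplification in the commit message, the old rule it describes can be sketched as below. This is an illustration only, not the Impala source: the function names are hypothetical, timestamps are assumed to be integer seconds since the epoch, and the "simplified" variant assumes the new logic just records the current time, which the commit message does not spell out.

```python
import time


def old_last_ddl_time(previous: int, now: int) -> int:
    """Old rule as worded in the commit message: if the stored value is
    already >= the current time, use current time + 1."""
    if previous >= now:
        return now + 1
    return now


def simplified_last_ddl_time(now: int) -> int:
    """Assumed simplified rule: record the current time as-is, with no
    attempt to force a strictly larger value."""
    return now


if __name__ == "__main__":
    now = int(time.time())
    # Two DDL operations within the same second: the old rule bumps the
    # second timestamp, the assumed simplified rule does not.
    print(old_last_ddl_time(previous=now, now=now) - now)
    print(simplified_last_ddl_time(now) - now)
```

Passing `now` in as a parameter instead of reading the clock inside the functions keeps the sketch deterministic and easy to test.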