[ https://issues.apache.org/jira/browse/IMPALA-387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16489472#comment-16489472 ]
ASF subversion and git services commented on IMPALA-387:
--------------------------------------------------------
Commit c98c01c55d7f6af7e536347986c5b22841bc78e7 in impala's branch
refs/heads/2.x from [~csringhofer]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=c98c01c ]
IMPALA-6131: Track time of last statistics update in metadata
The timestamp of the last COMPUTE STATS operation is saved to
table property "impala.lastComputeStatsTime". The format is
the same as in "transient_lastDdlTime", so the two can be
compared to check if the schema has changed since computing
statistics.
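As an illustration of that comparison, here is a minimal sketch. It assumes both
properties hold seconds since the epoch as decimal strings (the usual format of
"transient_lastDdlTime") and that the table properties are already available as
a plain dict. The function name is hypothetical and not part of Impala.

```python
def ddl_since_compute_stats(tbl_properties):
    """Return True if the last DDL is newer than the last COMPUTE STATS.

    Returns None when either property is missing, e.g. when stats were
    never computed. Assumes both values are epoch seconds stored as strings.
    """
    stats_time = tbl_properties.get("impala.lastComputeStatsTime")
    ddl_time = tbl_properties.get("transient_lastDdlTime")
    if stats_time is None or ddl_time is None:
        return None
    return int(ddl_time) > int(stats_time)
```

A result of True only suggests that the schema may have changed since the
statistics were computed. It does not say what changed.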
Other changes:
- Handling of "transient_lastDdlTime" is simplified - the old
logic set it to current time + 1 if the previous value was
>= current time, to ensure that it is always increased by
DDL operations. This was useful in the past, as IMPALA-387
used lastDdlTime to check if partition data needs to be
reloaded, but since IMPALA-1480, Impala does not rely on
lastDdlTime at all.
- Computing / setting stats on HDFS tables no longer increases
"transient_lastDdlTime".
- When Kudu tables are (re)loaded, it is checked if their
HMS representation is up to date, and if it is, then
IMetaStoreClient.alter_table() is not called. The old
logic always called alter_table() after loading metadata
from Kudu. This change was needed to ensure that
"transient_lastDdlTime" works similarly in HDFS and Kudu
tables, and should also make (re)loading Kudu tables faster.
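The difference between the old and the simplified "transient_lastDdlTime"
handling can be sketched as below. This is a literal reading of the first
bullet above, not a copy of Impala's code. The function names and the "now"
parameter are hypothetical and exist only to make the example deterministic.

```python
import time


def old_last_ddl_time(previous, now=None):
    """Old behaviour as described: bump to now + 1 if previous >= now."""
    now = int(time.time()) if now is None else now
    return now + 1 if previous >= now else now


def new_last_ddl_time(now=None):
    """Simplified behaviour: always use the current time in epoch seconds."""
    return int(time.time()) if now is None else now
```

With the simplified version, two DDL operations within the same second can
leave the value unchanged, which is acceptable once nothing relies on it
increasing strictly.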
Notes:
- Kudu will be able to sync its tables to HMS in the near
future (see KUDU-2191), so the Kudu metadata handling in
Impala may need to be redesigned.
Testing:
tests/metadata/test_last_ddl_time_update.py is extended by
- also checking "impala.lastComputeStatsTime"
- testing more SQL statements
- adding tests for Kudu tables
Note that test_last_ddl_time_update.py is run only in
exhaustive testing.
Change-Id: Ibda49725d3e76456f2d1b3edd1bf117b0174e234
Reviewed-on: http://gerrit.cloudera.org:8080/10484
Reviewed-by: Alex Behm <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Make "REFRESH" command a SQL statement rather than RPC
> ------------------------------------------------------
>
> Key: IMPALA-387
> URL: https://issues.apache.org/jira/browse/IMPALA-387
> Project: IMPALA
> Issue Type: New Feature
> Affects Versions: Impala 1.0
> Reporter: Lenni Kuff
> Assignee: Alan Choi
> Priority: Major
> Fix For: Impala 1.1
>
>
> It would be good to make "REFRESH" a first-class SQL statement rather than
> just an RPC. This would allow users to submit refreshes outside of the
> impala-shell (for example - via JDBC/ODBC). Initially, we would need to
> support both a full catalog refresh as well as a table-level refresh:
> REFRESH;
> REFRESH <table name>;
> IMPALA-339 may introduce some additional syntax to choose between a RELOAD
> and a REFRESH so that should be covered as well.
> As part of the change the impala-shell should be updated to submit refreshes
> using the regular "query" API rather than calling ResetCatalog/ResetTable.