Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10116 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata
......................................................................


Patch Set 10:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1235
PS10, Line 1235:    * TODO: the following paragraph seems at least partially 
obsolate
> Why not clean it up then? Point (1) is obsolete but point (2) is still vali
Done


http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@244
PS10, Line 244:             Long.toString(System.currentTimeMillis() / 1000));
> Doesn't the HMS set this in alter_table()?
It sets it if the property does not exist, but this would not work well for 
Kudu tables: after calling alter_table here, the table will not be loaded from 
HMS again (unlike in HDFS tables, where this is done to ensure HMS-Impala 
consistency). This means that if I would remove "transient_lastDdlTime" here, 
then HMS would calculate a valid value, but Impala catalogd would not know 
about this value.


http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@785
PS10, Line 785:       msTbl.putToParameters("impala.lastComputeStatsTime",
> Create a constant for the table property in Table similar to what we have i
I have put the constant to HdfsTable, because all the other property keys 
reside their, even those that are used by Kudu tables too.


http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2868
PS10, Line 2868:         // HMS updates transient_lastDdlTime if it is removed.
> I understand what this comment says, but I don't understand it's relevance
This comment remained here by mistake.


http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2870
PS10, Line 2870:             Long.toString(System.currentTimeMillis() / 1000));
> We're doing "System.currentTimeMillis() / 1000" in quite a few places, I su
I have created a bit different helper function.


http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py
File tests/metadata/test_last_ddl_time_update.py:

http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py@74
PS10, Line 74:     def run_test(self, query, expect_changed_ddl_time, 
expect_changed_stats_time):
> Passing a ";" delimited string is really weird. Why not pass a list of quer
The "invalidate metadata %(TBL)s; describe %(TBL)s" combo was used the check 
that in case of Kudu tables, lastDdlTime is not increased by reloading the 
table (it used to increase it), so an extra query is needed after INVALIDATE 
METADATA to load the table. The two can be separated (like for other init 
queries) but I felt that they belong together.

I replaced the string splitting with variable argument list - I can replace 
this with two separate calls if needed.


http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py@93
PS10, Line 93:       # Hive uses a seconds granularity on the last ddl time.
> Not just Hive - we do too. Just say "All times are stored in seconds" or so
Isn't it an HMS convention in this case? Or is there a reason behind not using 
higher precision timestamp?


http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py@155
PS10, Line 155:     h.expect_no_time_change("drop incremental stats %(TBL)s 
partition (j=1, s='2012')")
> use "drop stats" instead to wipe everything (including incremental stats)
Are you sure about this? I use DROP INCREMENTAL STATS on purpose to check +1 
statement's effect in the test suite.



--
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Fri, 11 May 2018 17:26:43 +0000
Gerrit-HasComments: Yes

Reply via email to