Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11224 )

Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd
......................................................................


Patch Set 1:

(30 comments)

http://gerrit.cloudera.org:8080/#/c/11224/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11224/1//COMMIT_MSG@11
PS1, Line 11: HDFS tables
why only HDFS tables?


http://gerrit.cloudera.org:8080/#/c/11224/1//COMMIT_MSG@13
PS1, Line 13: query/DML finishes its
            : execution
curious why at end of execution instead of at end of planning?


http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/catalog/catalog-server.cc@165
PS1, Line 165: UpdateUsedTableNames
nit: i think "ReportTableUsage" might be a better name? This current name makes 
it sound like it is replacing a list instead of just marking that the tables 
got used. And, maybe we might want to report some more details such as the 
count of queries in a particular interval that used the table.


http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/catalog/catalog-server.cc@169
PS1, Line 169:     if (!status.ok()) LOG(ERROR) << status.GetDetail();
this should probably be WARNING instead of ERROR since it doesn't affect 
correctness.


http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/catalog/catalog-server.cc@171
PS1, Line 171:   void InvalidateUnusedTables(TInvalidateUnusedTablesResponse& 
resp,
any particular reason to expose this via RPC instead of just automatic in the 
background?


http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/catalog/catalog-server.cc@537
PS1, Line 537:     const char msg[] = "unused_table_ttl_sec not specified";
I think if you just pass a string constant, it'll end up in static storage, and 
you wouldn't need to use the "copying" approach right?


http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/catalog/catalog-server.cc@539
PS1, Line 539:     document->AddMember("error", value, 
document->GetAllocator());
maybe just early-return here so that the rest of the function doesn't need to 
be nested?


http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/catalog/catalog-server.cc@542
PS1, Line 542:       unused_table_ttl_sec = 
boost::lexical_cast<int>(unused_table_ttl_sec_it->second);
can you tighten the "catch" block to only cover this line, and then 
early-return from it? That way the majority of the function won't be indented.


http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/catalog/catalog-server.cc@547
PS1, Line 547:       if (status.ok()) {
if not OK don't we want to put this in the 'error'? I think if we can refactor 
this whole function into another one that just returns Status, then we could 
consolidate all of the various code that fills in the "error" field in the 
document into one place


http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/catalog/catalog-server.cc@549
PS1, Line 549: tables
nit: can we name this thrift field "removed_tables" or "invalidated_tables" or 
something so it's clearer?


http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/catalog/catalog-server.cc@570
PS1, Line 570:         const char msg[] = "invalid unused_table_ttl_sec value";
             :         Value value(msg, arraysize(msg));
             :
nit: indentation off


http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/common/global-flags.cc@202
PS1, Line 202: catalog invalidates tables older than this"
             :     "threshold
nit: I think this doc should be expanded out to explain a bit more about the 
usage/semantics.


http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/exec/catalog-op-executor.h
File be/src/exec/catalog-op-executor.h:

http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/exec/catalog-op-executor.h@76
PS1, Line 76:   TInvalidateUnusedTablesResponse* resp);
nit: indentation


http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/exec/catalog-op-executor.cc
File be/src/exec/catalog-op-executor.cc:

http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/exec/catalog-op-executor.cc@336
PS1, Line 336: InvalidateUnusedTable
nit: this method name is missing an 's'


http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/service/client-request-state.cc@528
PS1, Line 528:   // Update used tables in this query.
I think this should be deferred onto some queue rather than done inline, 
otherwise this can block the user, right? eg if we have a bounded queue of 
usages, and some background thread which pulls items off the queue once every 
few seconds and sends them to catalogd, we avoid this critical-path RPC


http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/service/client-request-state.cc@534
PS1, Line 534:     req.tables.emplace_back(name);
nit: std::move(name)


http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/service/client-request-state.cc@537
PS1, Line 537:   catalog_op_executor_.reset(
             :       new CatalogOpExecutor(exec_env_, frontend_, 
server_profile_));
hrm, I'm surprised this is a member variable instead of just in local scope. If 
it's a member variable, is it already initialized here?


http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@84
PS1, Line 84: import static jodd.util.ThreadUtil.sleep;
I think this is coming by some accidental transitive dependency. Why not just 
use Thread.sleep or Guava's Uninterruptibles.sleep if you don't want to handle 
interruption?


http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@252
PS1, Line 252:     if (unusedTableTtlSec != 0) {
can we do a check that it's not negative?

also, mind refactoring this code into a new method like 
'startAutoInvalidationThread' or something?


http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@253
PS1, Line 253:       new Thread(new Runnable() {
this thread should have a name and also probably be marked as a daemon thread.


http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@255
PS1, Line 255:         {
what's with this scope?


http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@262
PS1, Line 262:             invalidateUnusedTables(req);
I think it makes sense to put a try/catch around this so that, even if the call 
throws some kind of weird runtime exception, we can log it and keep going. 
Otherwise we'll silently lose this thread and start to "leak" memory


http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1916
PS1, Line 1916: getTable(tableName.db_name, tableName.table_name).
do we have to worry about null here? seems plausible that a table might have 
gotten renamed/removed


http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1925
PS1, Line 1925:     Calendar oldestTableAllowed = null;
I think it would be safer to avoid any calendar calculations and instead just 
use a monotonic timestamp (System.nanotime gives you a monotonic clock)


http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1935
PS1, Line 1935: FeFsTable
why only evict FeFsTable? I agree they are the largest, but seems unexpected to 
ignore the other tables. If we are going to do this, I think it makes sense to 
actually put the 'lastUsedTime' bit in the HdfsTable class instead of in the 
Table class, since that will also avoid the extra memory usage in 
'IncompleteTable'.


http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1935
PS1, Line 1935: req.isSetUnused_table_ttl_sec()
isn't this handled up above?


http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1936
PS1, Line 1936: table.lastUsedTime
can we add a getter?


http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/Table.java@111
PS1, Line 111:   protected Calendar lastUsedTime;
this seems bulky. Per comment elsewhere how about just using a long, and 
setting it to System.nanotime?


http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java:

http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@29
PS1, Line 29: import com.google.common.collect.Iterables;
unused


http://gerrit.cloudera.org:8080/#/c/11224/1/tests/custom_cluster/test_automatic_invalidation.py
File tests/custom_cluster/test_automatic_invalidation.py:

http://gerrit.cloudera.org:8080/#/c/11224/1/tests/custom_cluster/test_automatic_invalidation.py@24
PS1, Line 24: class TestAutomaticCatalogInvalidation(CustomClusterTestSuite):
I think this test would probably be easier to write in Java so you can directly 
inspect the state of the Catalog instead of having to do it e2e like this.

If you do this in the Java side, you could also easily mock the advancement of 
time. For example, you could add a Guava 'Ticker' as a static member of 
CatalogServiceCatalog. In normal runtime the ticker would be set to be the 
systemTicker, and in the test, you can mock it out to make time advance 
deterministically.



--
To view, visit http://gerrit.cloudera.org:8080/11224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89
Gerrit-Change-Number: 11224
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Wed, 15 Aug 2018 19:55:51 +0000
Gerrit-HasComments: Yes

Reply via email to