Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11224 )
Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd ...................................................................... Patch Set 4: (13 comments) http://gerrit.cloudera.org:8080/#/c/11224/3/be/src/catalog/catalog-util.cc File be/src/catalog/catalog-util.cc: http://gerrit.cloudera.org:8080/#/c/11224/3/be/src/catalog/catalog-util.cc@256 PS3, Line 256: bool TTableName::operator<(const impala::TTableName& that) const { > This is only for the following map usage in thrift. Renaming is not possibl Could we make that a list of pairs instead to get around this? This works fine for now, but if for example we at some point added a third field in TTableName (eg a "catalog" level to the naming) we'd have a really hard-to-find bug since it's unlikely anyone would remember to update this comparator. Furthermore, such a bug could cause bad crashes since operator== and operator< would end up inconsistent. http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@209 PS4, Line 209: invalidate_tables_timeout nit: this should have a suffix like "_sec" or "_s" so it's clear the units http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@218 PS4, Line 218: invalidate_tables_gc_ if the other two flags are hidden, maybe we shouldn't document them? It seems inconsistent to hide them and then mention them in a non-hidden flag. http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java File fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java: http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@86 PS4, Line 86: /** : * A callback called after a shrinking pass is triggered. Used for test purposes. : */ wrong doc http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@112 PS4, Line 112: Preconditions.checkArgument(unusedTableTtlSec >= 0); add back the message so that if the user hits this, they know which configuration they got wrong http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@141 PS4, Line 141: if (poolName.contains("Old")) { maybe invert this and 'continue' so you can unnest the rest of the block? http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@148 PS4, Line 148: it's not a implementation of more concise: "it does not implement" http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@209 PS4, Line 209: if (table.isLoaded()) { style nit: I think easier to read this loop if we invert this: "if (!table.isLoaded()) continue;" (generally I find it easier to read code when there is less nesting going on, by handling error or "unusual" cases up front in the control flow) http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@211 PS4, Line 211: if (inactivityTime > retireAgeNano) { same here http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@255 PS4, Line 255: unusedTableTtlNano_ this should also be changed to be more often than the TTL for the same reason as below. http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@260 PS4, Line 260: unusedTableTtlSec needs update http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@266 PS4, Line 266: error this should probably be a WARNING -- I think ERROR should be reserved for unrecoverable user-impacting errors, whereas this is a recoverable error. I think we should also ensure that the error message explains the context and what will happen as a result. For example, something like: LOG.warning("Unexpected exception thrown while attempting to automatically invalidate tables. Will retry in {} seconds", sleepTime, e) that way if the user sees the message they will know that this doesn't mean that some user query was failed, or that data was lost, or whatever. Also, I think we should make sure that if an exception is thrown, we still sleep before retrying. Otherwise we might get into some state with a tight loop. Perhaps we should just add a fixed wait of 5 seconds or something here in the catch clause? http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java File fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java: http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java@56 PS4, Line 56: .invalidateTable(new TTableName(/*dbName=*/"functional", /*tblName=*/"alltypes"), sorry, in my previous comment I meant that you can use the 'dbName' and 'tblName' variables that you defined just above. -- To view, visit http://gerrit.cloudera.org:8080/11224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89 Gerrit-Change-Number: 11224 Gerrit-PatchSet: 4 Gerrit-Owner: Tianyi Wang <[email protected]> Gerrit-Reviewer: Andrew Sherman <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Tianyi Wang <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: Vuk Ercegovac <[email protected]> Gerrit-Comment-Date: Wed, 05 Sep 2018 18:04:37 +0000 Gerrit-HasComments: Yes
