Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11224 )
Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd ...................................................................... Patch Set 2: (46 comments) http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/11224/1/be/src/catalog/catalog-server.cc@537 PS1, Line 537: > How do you measure the length of a string literal without using an array? I mean: static const char* kMsg = "unused_table_ttl_sec not specified"; Value value(kMsg, strlen(kMsg); Even though strlen looks like a function call here, gcc -O1 or higher will optimize out the strlen to just the known constant size, see https://godbolt.org/z/3AMhwL http://gerrit.cloudera.org:8080/#/c/11224/2/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/11224/2/be/src/common/global-flags.cc@208 PS2, Line 208: Talbe typo: Table nit: I think it's better to avoid talking too much about implementation mechanism in the doc here. How about something like: If a table has not been referenced in a query for more than the configured amount of time, the catalog server will automatically evict its cached metadata about this table. This has the same effect as a user-initiated "INVALIDATE METADATA" statement on the unused table. Configuring this to 0 disables automatic invalidation of tables. http://gerrit.cloudera.org:8080/#/c/11224/2/be/src/common/global-flags.cc@207 PS2, Line 207: DEFINE_int32(unused_table_ttl_sec, 0, "Catalogd invalidates unused tables older than this" : "threshold. Talbe usage is reported periodically from impalad to catalogd. Then" : "catalog invalidates old unused tables as if they are invalidated by a user." : "0 disables this feature."); : : DEFINE_bool(invalidate_tables_on_memory_pressure, false, "Catalog invalidates recently" : "unused tables when the old GC generation is almost full."); nit: missing extra spaces at end of each line before the end quote http://gerrit.cloudera.org:8080/#/c/11224/2/be/src/exec/catalog-op-executor.h File be/src/exec/catalog-op-executor.h: http://gerrit.cloudera.org:8080/#/c/11224/2/be/src/exec/catalog-op-executor.h@72 PS2, Line 72: Status ReportTableUsage(const TReportTableUsageRequest& req, nit: add some docs http://gerrit.cloudera.org:8080/#/c/11224/2/common/thrift/CatalogService.thrift File common/thrift/CatalogService.thrift: http://gerrit.cloudera.org:8080/#/c/11224/2/common/thrift/CatalogService.thrift@415 PS2, Line 415: num_usage nit: I think "usage_count" or "num_usages" would be better http://gerrit.cloudera.org:8080/#/c/11224/2/common/thrift/CatalogService.thrift@419 PS2, Line 419: 1: required map<CatalogObjects.TTableName, TTableUsage> tables; nit: extra space in here http://gerrit.cloudera.org:8080/#/c/11224/2/common/thrift/CatalogService.thrift@423 PS2, Line 423: nit: no need for empty line http://gerrit.cloudera.org:8080/#/c/11224/2/common/thrift/CatalogService.thrift@461 PS2, Line 461: TReportTableUsageResponse ReportTableUsage(1: TReportTableUsageRequest req); nit: doc http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@141 PS2, Line 141: if (ImpaladTableUsageTracker.INSTANCE != null) { : ImpaladTableUsageTracker.INSTANCE.addTables(tbls); : } I think we need to do this both on the views _and_ the expanded tables, not just the top-level views. Otherwise, if some tables are only accessed via views, we'll never update their lastUsed time, and they'll get constantly evicted. If we do this at the end of the function instead of at the top, then I think we'll cover both the views and the underlying tables, right? http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@255 PS1, Line 255: catalogdTableShrinker_ = new CatalogdTableShrinker(this, > It's initialization scope like a constructor hm, maybe it would be simpler to just put this as a local variable inside 'run'? http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@252 PS2, Line 252: throw new IllegalArgumentException("unused_table_ttl_sec flag must be a " + : "non-negative integer."); I think doing this validation in the ctor below makes more sense -- maybe just plumb the gflags structure in directly so that you can keep all of the validation with the code that uses it, and use a factory method so you can return null if it's disabled in the configuration? http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1920 PS2, Line 1920: for (TTableName tableName : req.tables.keySet()) { nit: indentation http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1935 PS2, Line 1935: setCatalogdTableShrinker is this for tests? If so please annotate @VisibleForTesting http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java File fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java: PS2: license http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@23 PS2, Line 23: table nit: tables http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@30 PS2, Line 30: CatalogdTableShrinker I think "Invalidator" is better than "Shrinker", because we are actually evicting whole tables, not shrinking them down to some compressed representation. http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@32 PS2, Line 32: // Plugable time source for tests. Defined as static to avoid passing nit: use /** javadoc format */ so that IDEs can highlight the docs nicely http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@34 PS2, Line 34: static Ticker timeSource_ = Ticker.systemTicker(); nit: probably should be TIME_SOURCE as something that's mostly-constant. You can annotate @VisibleForTesting to make it clearer why it exists. http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@35 PS2, Line 35: private CatalogServiceCatalog catalog_; can this be final? http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@55 PS2, Line 55: ne.addNotificationListener(new NotificationListener() { I think breaking out the NotificationListener subclass and Thread subclass below into their own private inner classes would make this file a bit more readable http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@63 PS2, Line 63: if (!info.getGcAction().equals("end of major GC")) return; oof, is this the only way to do this? what other kind of events might we see? looking at the fields of 'getMemoryUsageAfterGc', it seems like all events are "GC finished". One thought: I know I'd suggested using the notification listener interface initially, but if we already have a thread that is doing periodic checks for TTL purposes, maybe it's easier to eliminate a bunch of this code and instead just use GarbageCollectorMXBean.getLastGcInfo() in that notification loop? If, in that loop, we see that the GC count for old gen increased, and the usage after the GC was still high, we can do the cleanup? then we don't have the complexity of the 'gcTrigger' wait/notify http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@65 PS2, Line 65: MemoryUsage tenuredGenUsage = mem.get("PS Old Gen"); is this always the name of the old gen, even if a different collector is configured? Could this be null if enabling some future GC like Shenandoah, etc? I wonder if there is any less fragile way to find the old gen http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@66 PS2, Line 66: 7 / 10 please add a constant for this fraction. I think it's fine to use floating point math here to make things simpler. Also, I think we should consider a slightly lower amount, even though it might seem wasteful. My reasoning is that the default configuration for CMSInitiatingOccupancyFraction is 68%. That's the percent heap usage at which a concurrent GC is triggered. So, if, *after* a GC has finished, we have more than 70% heap usage, then we'll have already triggered a back-to-back second CMS cycle. If instead we triggered this at 60% post-GC, then we'd have time to run our eviction logic before reaching the 68% threshold again, and we'd actually be able to collect heap in the next triggered GC. Making this configurable, if it's not too much of a pain in the ass, might be useful for us to test and determine a good default empirically. http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@89 PS2, Line 89: .timedWait(CatalogdTableShrinker.this, unusedTableTtlSec_); this doesn't seem quite right. Even if the TTL is 100 seconds, that doesnt' mean we only want to _check_ every 100 seconds. Otherwise, we could see a table when it was used 99sec ago, and then go back to sleep for another 100 seconds http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@117 PS2, Line 117: invalidate10percent think this should be coded more generically to invalidate a provided ratio, and have a config for that ratio http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@124 PS2, Line 124: // TODO: use quick select as an alternative, why not use a PriorityQueue above instead of a List? You can determine the 10% count by doing one pass through the tables/dbs, and then limit the size of the priority queue to the computed count. Of course the count could change slightly in between but it probably doesn't matter. http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@135 PS2, Line 135: catalog_.invalidateTable(tTableName, tblWasRemoved, dbWasAdded); let's LOG.info here http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@144 PS2, Line 144: retireAgeSecs * 1000000000L I think using one of the TimeUnit conversion functions is easier to read here http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@148 PS2, Line 148: catalog_.invalidateTable(tTableName, tblWasRemoved, dbWasAdded); same http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/Db.java File fe/src/main/java/org/apache/impala/catalog/Db.java: http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/Db.java@26 PS2, Line 26: import org.apache.impala.thrift.TReportTableUsageRequest; unused http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java: http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@114 PS2, Line 114: throw new IllegalArgumentException("unused_table_ttl_sec flag must be a " + same comment as elsewhere - better to keep the validation with the code that uses these configs http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@117 PS2, Line 117: ImpaladTableUsageTracker.create(); shouldn't you be assigning this to the member variable? http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java File fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java: http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@32 PS2, Line 32: /** : * Report table usage to catalogd. : */ can you expand this javadoc to talk a bit more about the role of this class? http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@36 PS2, Line 36: ImpaladCatalog nit: wrong class name http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@43 PS2, Line 43: reportThread can you give this thread a name? http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@54 PS2, Line 54: INSTANCE = new ImpaladTableUsageTracker(); can we avoid having yet another singleton? Singletons make testing harder, etc. Maybe this could be instantiated as a member of Frontend? Seems like the table metadata loader could grab it from Frontend pretty easily. http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@61 PS2, Line 61: tableName : style nit: we seem to more commonly skip the space in between the variable name and the ':'. I don't really care but Vuk pointed this out in a review earlier today so figured I'd note it :) http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@73 PS2, Line 73: while (true) { I think it would be good to add a sleep in here as well so that we rate limit the reporting of usage to once every few seconds or something. There isn't any need to report usage immediately, and this will reduce the load on catalogd, etc. If that makes testing inconvenient, you could put in a @VisibleForTesting global that you could set back down to 0 to encourage tests to run faster. http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@83 PS2, Line 83: LOG.warn("Thrift serialization failure: ", e); : } catch (Exception e) { : LOG.error("Unexpected exception thrown: ", e); I think you could probably combine both of these into a multi-catch close, and probably should downgrade the ERROR log to WARN since it is a recoverable error (no incorrect results or failed queries resulted). The error message should also be clearer to the user what the effect of the error was. For example, "Unable to report table usage information to catalog server" http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@89 PS2, Line 89: nit: extra space http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: PS2: do we need to mark usage for all of the other calls in this file? http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@728 PS2, Line 728: refreshLastUsedTime why not just put this in 'load()' since it seems you always call it after load? http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/test/java/org/apache/impala/catalog/CatalogdTableShrinkerTest.java File fe/src/test/java/org/apache/impala/catalog/CatalogdTableShrinkerTest.java: http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/test/java/org/apache/impala/catalog/CatalogdTableShrinkerTest.java@92 PS2, Line 92: synchronized (cleaner) { : cleaner.notify(); : } this seems to be coupled pretty tightly to the implementation details of the class under test. Can we add some function like 'wakeNowForTests()' instead? http://gerrit.cloudera.org:8080/#/c/11224/2/tests/custom_cluster/test_automatic_invalidation.py File tests/custom_cluster/test_automatic_invalidation.py: http://gerrit.cloudera.org:8080/#/c/11224/2/tests/custom_cluster/test_automatic_invalidation.py@42 PS2, Line 42: time.sleep(20) if we add a log message when auto-invalidation happens, we can change this to tail the logs and wait for it, right? then we'd be less sensitive to test timing http://gerrit.cloudera.org:8080/#/c/11224/2/tests/custom_cluster/test_automatic_invalidation.py@51 PS2, Line 51: """IMPALA-5789: Test that always false filter filters out splits when file-level : filtering is disabled.""" ? same above http://gerrit.cloudera.org:8080/#/c/11224/2/tests/custom_cluster/test_automatic_invalidation.py@60 PS2, Line 60: time.sleep(20) : assert metadata_cache_string not in urllib.urlopen(url).read() : similarly, to avoid test timing dependency, would be better to loop here and check until it succeeds (with a long timeout) -- To view, visit http://gerrit.cloudera.org:8080/11224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89 Gerrit-Change-Number: 11224 Gerrit-PatchSet: 2 Gerrit-Owner: Tianyi Wang <tw...@cloudera.com> Gerrit-Reviewer: Andrew Sherman <asher...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com> Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-Comment-Date: Tue, 21 Aug 2018 01:42:25 +0000 Gerrit-HasComments: Yes