Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11224 )
Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd ...................................................................... Patch Set 3: (42 comments) http://gerrit.cloudera.org:8080/#/c/11224/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11224/3//COMMIT_MSG@13 PS3, Line 13: 2. If the old GC generation is almost full, a centen percentage of LRU certain http://gerrit.cloudera.org:8080/#/c/11224/3//COMMIT_MSG@14 PS3, Line 14: could be enabled nit: "can be enabled" (if you say "could" it makes me think that this patch chose not to implement it but maybe will add that in the future, but it seems this is in the patch) (I don't really care that much in the context of the commit message, just wanted to give the tip for better writing clarity) http://gerrit.cloudera.org:8080/#/c/11224/3//COMMIT_MSG@18 PS3, Line 18: unsed nit: used http://gerrit.cloudera.org:8080/#/c/11224/3/be/src/catalog/catalog-util.cc File be/src/catalog/catalog-util.cc: http://gerrit.cloudera.org:8080/#/c/11224/3/be/src/catalog/catalog-util.cc@256 PS3, Line 256: bool TTableName::operator<(const impala::TTableName& that) const { hmm... it seems risky to do operator-overloading on 'operator<' but not operator>, operator>=, operator==, etc. The style guide (https://google.github.io/styleguide/cppguide.html#Operator_Overloading) also says: "If you define an operator, also define any related operators that make sense, and make sure they are defined consistently. For example, if you overload <, overload all the comparison operators, and make sure < and > never return true for the same arguments." Given that, I think it's better to define this as a named function/functor TableNameLessThan and specify that where necessary http://gerrit.cloudera.org:8080/#/c/11224/3/be/src/catalog/catalog.h File be/src/catalog/catalog.h: http://gerrit.cloudera.org:8080/#/c/11224/3/be/src/catalog/catalog.h@121 PS3, Line 121: Status ReportTableUsage(const TReportTableUsageRequest& req); nit: doc http://gerrit.cloudera.org:8080/#/c/11224/3/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/11224/3/be/src/common/global-flags.cc@220 PS3, Line 220: ratio of " : "the numbers I think it's clearer to say "the fraction of tables" and also name the flag with the word "fraction" instead of "ratio". I think this is more consistent terminology with other software -- eg there are various JVM flags ending in "Ratio" and they take integers indicating a "1:N" ratio rather than a floating point fraction. http://gerrit.cloudera.org:8080/#/c/11224/3/be/src/service/fe-support.cc File be/src/service/fe-support.cc: http://gerrit.cloudera.org:8080/#/c/11224/3/be/src/service/fe-support.cc@541 PS3, Line 541: status.AddDetail("Error making an RPC call to Catalog server."); shouldn't this 'status' go somewhere? eg use it to throw an exception or put it into 'result.status' or something? As is, it's just ignored. http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@150 PS3, Line 150: ImpaladTableUsageTracker.maybeAddTables(fe_.getImpaladTableUsageTracker() I guess you are doing this so that you can handle a 'null' usage tracker, but seems like it's not worth the confusion. Wouldn't it be simpler to just always instantiate the usage tracker, and have it no-op out the calls if disabled? (or even create a simple interface and instantiate a "noop" implementation of that interface, so you dont have to worry about a null) http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2002 PS3, Line 2002: ReportTableUsage style nit: lower case 'r' http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java File fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java: http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@69 PS3, Line 69: * invalidation is executed. nit: was executed http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@85 PS3, Line 85: if (invalidateTableOnMemoryPressure) { do you mind refactoring out the contents of this block to a new function like "bool tryInstallGcListener()" or something, just to keep the constructor a bit more tidy? http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@98 PS3, Line 98: ((NotificationEmitter) gcbean) is this cast guaranteed safe? I think it's probably worth an instanceof check before downcast just to insulate us from a potential JVM change down the line (eg if they decide to stop supporting notifications, we will gracefully degrade) http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@105 PS3, Line 105: in the GC. "in the GC MXBean" nit: as a general rule, we should try to have warning messages explain what the effect is. In this case, we should probably say something like "Continuing without GC-triggered invalidation of tables." or "Continuing with only TTL-based table invalidation" or something like that, so that the operator knows that there was a problem, but that its effect is limited to that particular functionality. http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@112 PS3, Line 112: daemon thread nit: I think the "daemon thread" part is self-explanatory, since the jstack output already indicates whether or not a thread is a daemon http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@121 PS3, Line 121: if (unusedTableTtlSec < 0) { : throw new IllegalArgumentException( : "unused_table_ttl_sec flag must be a " + "non-negative integer."); this can just be a Preconditions.checkArgument() with the same effect http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@138 PS3, Line 138: private boolean detectGC() { This function should be renamed to indicate its full scope. When I first read the name I thought it was only detecting whether a GC occurred, but in fact it is also checking whether the post-GC heap was too full. Maybe a better name would be something like "shouldEvictFromFullHeapAfterGC", and add a short javadoc to indicate? http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@154 PS3, Line 154: if (table.isLoaded()) tables.add(table); can we add some kind of check here that the table's lastUsedTime is non-zero? I'm a little afraid of us introducing a bug or race where a table gets loaded but doesn't get its lastUsedTime set, and then we end up evicting it very quickly after. That would be a tricky bug for us to diagnose. http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@178 PS3, Line 178: long inactivityTime = now - table.getLastUsedTime(); same as above -- we should be careful about not exposing a "zero" lastUsedTime. Perhaps we could even just put this check in 'getLastUsedTime' itself as a Preconditions.checkState? http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@186 PS3, Line 186: seconds nit: space before 'seconds' http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java File fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java: http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@89 PS2, Line 89: > If we want the timeout to be accurate we need to keep a list of timeout in agreed that absolute accuracy would require the full list of expiration times in a priority queue or something. However, right now the error is up to 2x of the configuration. If we instead always slept for 1 second between checks, then we would only be up to 1 second "late". I think a reasonable amount of error might be to sleep for min(5 minutes, 10% of the configured interval) or something -- this ensures that no matter how long your expiration interval is, we'd be at most 5 minutes late. And if you set a short interval like 60 seconds for testing purposes, we'll only be 6 seconds late. http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/catalog/CatalogdTableShrinker.java@124 PS2, Line 124: > I think that's still O(n lgn) and only marginally better than sorting. I was thinking something like: int count = 0; for (Db db : catalog_.getAllDbs()) { for (Table t : db.getTables()) { if (t.isLoaded()) count++; } } int numToRemove = count/10; PriorityQueue<Table> toRemove = new PriorityQueue<>(...comparator...); for (Db db : catalog_.getAllDbs()) { for (Table t : db.getTables()) { toRemove.add(t); if (toRemove.size() > numToRemove) toRemove.poll(); } } then we end up with O(k) space and O(n lg k) time. I guess your point is valid in that k = n/10 here so the runtimes are asymptotically equivalent. So, maybe the savings here aren't that big a deal. Up to you. http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/Db.java File fe/src/main/java/org/apache/impala/catalog/Db.java: http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/Db.java@26 PS3, Line 26: import org.apache.impala.thrift.TReportTableUsageRequest; unused? also, sorting http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java: http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@26 PS3, Line 26: import org.apache.hadoop.yarn.webapp.hamlet2.Hamlet; oops? http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@101 PS3, Line 101: private ImpaladTableUsageTracker impaladTableUsageTracker_; unused? http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java File fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java: http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@34 PS3, Line 34: and the data think the word "report" is missing in here http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@39 PS3, Line 39: @VisibleForTesting this isn't visible http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@61 PS3, Line 61: if (unusedTableTtlSec < 0) { : throw new IllegalArgumentException( Preconditions.checkArgument http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@83 PS3, Line 83: 0 1? http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@93 PS3, Line 93: Thread.sleep(REPORT_INTERVAL_MS); I think it's worth adding some randomness here so that, in a large cluster, the reports from multiple nodes don't all get synchronized. Keep in mind that typically all impalads start at the same time, and also that some blocking event on the catalogd side (like a GC pause) will end up encouraging periodic requestors to all be on the same synchronized period. If we randomly sleep for between 0.5 and 1.5x the report interval then the requests will nicely spread out and not cause any spikes on the remote node. http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@96 PS3, Line 96: while (req_.tables.isEmpty()) wait(); it may be better to just 'continue' if it's empty. Otherwise, in the normal case of an idle Impalad which gets its first query (which has multiple tables referenced), we'll end up waking on the first table reference, immediately constructing a report for that first table, and then going back to sleep. So, the query causes two reports instead of just one. If instead we just went back to sleep here we'd probably only send one report. http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/Table.java@108 PS3, Line 108: nanoseconds should probably document "as returned by CatalogdTableInvalidator.nanoTime". Also should probably document that this is only set on the catalogd side, not kept up to date on the impalad, right? http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/main/java/org/apache/impala/catalog/Table.java@173 PS3, Line 173: public long getLastUsedTime() { return lastUsedTime_; } can we Preconditions.checkState(!storedInImpaladCatalogCache_) here? this should only be called on the catalog side I think? http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/11224/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@728 PS2, Line 728: > There is a load() of each type of table ... what about putting it in 'getExistingTable'? that seems to be called any time a table is touched. http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java File fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java: http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java@47 PS3, Line 47: "functional", "alltypes"), dbName, tblName http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java@50 PS3, Line 50: CatalogdTableInvalidator.TIME_SOURCE = ticker; probably needs an equivalent "undo" of this back to System ticker or else other tests that run after this one in the same JVM will continue to be polluted by this test setup. http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java@52 PS3, Line 52: 2, false, 0.6, 0.1) can you add comments like: /*paramName=*/2, /*otherParamName=*/false here? These can be checked for validity using ErrorProne (https://github.com/google/error-prone/blob/master/docs/bugpattern/argumentselectiondefects/NamedParameters.md) and provide some nice documentation value at the call site. I normally try to use this style of comments whenever there's a function that takes several arguments of the same or similar type since it's easy to accidentally swap them. http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java@53 PS3, Line 53: assert !catalog_.getDb(dbName).getTable(tblName).isLoaded(); for Java unit tests we should use Assert.assertTrue and Assert.assertFalse. I'm not sure if we enable Java assertions ('-ea' flag on the JVM) for our test runs at the moment. http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java@56 PS3, Line 56: assert table.getLastUsedTime() == 0; this assertion surprises me. Shouldn't a table, when loaded, have a "lastUsedTime" of now? Is this just because the ticker is starting at time 0? Maybe it would be clearer if the ticker started at time 1000 or something, so ti's clear that we're asserting that the initial "lastUsedTime" is the time that the table was loaded http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java@58 PS3, Line 58: // Get notified when the shrinking is done and proceed with the tests. instead of the callback and this clever Boolean[] thing, what if we just added a @VisibleForTesting integer inside CatalogdTableInvalidator called 'triggerCount' that we could loop on and check once every 10ms or something? I think net lines of code would be far less and easier to follow. Saving the extra few milliseconds by using wait/notify isn't worth the code complexity in the context of a test http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java@69 PS3, Line 69: nanoSecsPerSec nit: make this a class-level constant NANOS_PER_SEC, or just use TimeUnit APIs to convert to/from http://gerrit.cloudera.org:8080/#/c/11224/3/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java@81 PS3, Line 81: catalog_.setCatalogdTableInvalidator(null); the stopping and invalidation should probably be an a @After method rather than here so that you're sure to stop it even on a test failure. http://gerrit.cloudera.org:8080/#/c/11224/2/tests/custom_cluster/test_automatic_invalidation.py File tests/custom_cluster/test_automatic_invalidation.py: http://gerrit.cloudera.org:8080/#/c/11224/2/tests/custom_cluster/test_automatic_invalidation.py@42 PS2, Line 42: time.sleep(10) > Do we flush the log then? I think we already start the daemons with a default of -logbufsecs=1 so the log will automatically flush once a second, rigth? -- To view, visit http://gerrit.cloudera.org:8080/11224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89 Gerrit-Change-Number: 11224 Gerrit-PatchSet: 3 Gerrit-Owner: Tianyi Wang <[email protected]> Gerrit-Reviewer: Andrew Sherman <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tianyi Wang <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Comment-Date: Sun, 26 Aug 2018 15:24:29 +0000 Gerrit-HasComments: Yes
