Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11341 )
Change subject: IMPALA-7424: Optimize in-memory representation of incremental stats ...................................................................... Patch Set 4: (7 comments) http://gerrit.cloudera.org:8080/#/c/11341/3/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java File fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java: http://gerrit.cloudera.org:8080/#/c/11341/3/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java@337 PS3, Line 337: > Done. Do you recall where the '400' came from in the first place? Looking through the HLL code it seems like we actually set HLL_LEN=1024. But then we do some run-length coding and thrift encoding. Was 400 just an empirical estimate? I suppose we can gather a new empirical estimate to make this more accurate, but I think it makes sense to do in a follow-up so we can try to get these committed while they are smaller patches. http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java File fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java: http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java@334 PS4, Line 334: checkNotNull checkNotNull returns its argument, so you can inline this into the parameter below (setPartition_stats(checkNotNull(..))) http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@637 PS4, Line 637: public void setPartitionStatsBytes(byte[] partitionStats, boolean hasIncrStats) { worth a precondition check that, if hasIncrStats is true, then partitionStats is non-null? http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@654 PS4, Line 654: PartitionStatsUtil.deletePartStats(this); still think this is a surprising side effect for a method that looks like it's just a setter. Given the parameter here is always 'hmsParameters_', maybe better to make this method take no arguments, and call it something like 'extractAndCompressPartitionStats()' and javadoc it to say that it removes the partition stats from the HMS parameters, compresses them, and stores them in the appropriate members? http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java File fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java: http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java@61 PS4, Line 61: if (!part.hasIncrementalStats()) return null; At a high level, I'm a little confused in this patch about the boolean hasIncrementalStats. It seems like you're setting that boolean based on whether the decoded TPartitionStats has the intermediate_col_stats member. But, this function seems like the old behavior was to return the TPartitionStats regardless -- ie even if it has a TTableStats 'stats' member but _not_ the 'intermediate_col_stats' member. Maybe we can chat about this? http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java@127 PS4, Line 127: "Error saving partition stats: " nit: why not put this substring as part of debugString instead of repeating it twice? http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java: http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@58 PS4, Line 58: private byte[] partitionStats_; this and below can be final -- To view, visit http://gerrit.cloudera.org:8080/11341 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I39f02ebfa0c6e9b0baedd0d76058a1b34efb5a02 Gerrit-Change-Number: 11341 Gerrit-PatchSet: 4 Gerrit-Owner: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com> Gerrit-Comment-Date: Tue, 28 Aug 2018 22:51:15 +0000 Gerrit-HasComments: Yes