Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11341 )

Change subject: IMPALA-7424: Optimize in-memory representation of incremental 
stats
......................................................................


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/11341/3/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
File fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java:

http://gerrit.cloudera.org:8080/#/c/11341/3/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java@337
PS3, Line 337:
> Done.
Do you recall where the '400' came from in the first place? Looking through the 
HLL code it seems like we actually set HLL_LEN=1024. But then we do some 
run-length coding and thrift encoding. Was 400 just an empirical estimate? I 
suppose we can gather a new empirical estimate to make this more accurate, but 
I think it makes sense to do in a follow-up so we can try to get these 
committed while they are smaller patches.


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
File fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java:

http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java@334
PS4, Line 334: checkNotNull
checkNotNull returns its argument, so you can inline this into the parameter 
below (setPartition_stats(checkNotNull(..)))


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@637
PS4, Line 637:   public void setPartitionStatsBytes(byte[] partitionStats, 
boolean hasIncrStats) {
worth a precondition check that, if hasIncrStats is true, then partitionStats 
is non-null?


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@654
PS4, Line 654:       PartitionStatsUtil.deletePartStats(this);
still think this is a surprising side effect for a method that looks like it's 
just a setter.

Given the parameter here is always 'hmsParameters_', maybe better to make this 
method take no arguments, and call it something like 
'extractAndCompressPartitionStats()' and javadoc it to say that it removes the 
partition stats from the HMS parameters, compresses them, and stores them in 
the appropriate members?


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java
File fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java:

http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java@61
PS4, Line 61:       if (!part.hasIncrementalStats()) return null;
At a high level, I'm a little confused in this patch about the boolean 
hasIncrementalStats. It seems like you're setting that boolean based on whether 
the decoded TPartitionStats has the intermediate_col_stats member. But, this 
function seems like the old behavior was to return the TPartitionStats 
regardless -- ie even if it has a TTableStats 'stats' member but _not_ the 
'intermediate_col_stats' member. Maybe we can chat about this?


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java@127
PS4, Line 127: "Error saving partition stats: "
nit: why not put this substring as part of debugString instead of repeating it 
twice?


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java:

http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@58
PS4, Line 58:   private byte[] partitionStats_;
this and below can be final



--
To view, visit http://gerrit.cloudera.org:8080/11341
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I39f02ebfa0c6e9b0baedd0d76058a1b34efb5a02
Gerrit-Change-Number: 11341
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Tue, 28 Aug 2018 22:51:15 +0000
Gerrit-HasComments: Yes

Reply via email to