Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11341 )

Change subject: IMPALA-7424: Optimize in-memory representation of incremental 
stats
......................................................................


Patch Set 4:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/11341/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11341/4//COMMIT_MSG@7
PS4, Line 7: Optimize
nit: Reduce


http://gerrit.cloudera.org:8080/#/c/11341/4//COMMIT_MSG@10
PS4, Line 10: HMS parameters map of partition objects. Each of these strings 
are Java
nit: ... when stored in catalogd.


http://gerrit.cloudera.org:8080/#/c/11341/4//COMMIT_MSG@11
PS4, Line 11: takes
nit: take


http://gerrit.cloudera.org:8080/#/c/11341/4//COMMIT_MSG@17
PS4, Line 17: on-disk
persistent?


http://gerrit.cloudera.org:8080/#/c/11341/4//COMMIT_MSG@26
PS4, Line 26: optimizes
nit: improves


http://gerrit.cloudera.org:8080/#/c/11341/4/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/11341/4/common/thrift/CatalogObjects.thrift@297
PS4, Line 297: Deflate compressed byte[]
still having trouble understanding this one. how about: "byte[] representation 
of TPartitionStats for this partition that is compressed using 
'deflate-compression'"?


http://gerrit.cloudera.org:8080/#/c/11341/4/common/thrift/CatalogService.thrift
File common/thrift/CatalogService.thrift:

http://gerrit.cloudera.org:8080/#/c/11341/4/common/thrift/CatalogService.thrift@283
PS4, Line 283: includes
nit: should include (for consistency)


http://gerrit.cloudera.org:8080/#/c/11341/4/common/thrift/CatalogService.thrift@284
PS4, Line 284: and then
nit: that is


http://gerrit.cloudera.org:8080/#/c/11341/4/common/thrift/CatalogService.thrift@302
PS4, Line 302: want_partition_incremental_stats
sync this with the field name in that struct.


http://gerrit.cloudera.org:8080/#/c/11341/4/common/thrift/CatalogService.thrift@307
PS4, Line 307: want_partition_incremental_stats
sync


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java:

http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@a446
PS4, Line 446:
any reason to get rid of these?


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
File fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java:

http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java@120
PS4, Line 120:    * are stored as a deflate-compressed byte array to reduce 
memory footprint.
perhaps include the util method to use to turn it into a TPartitionStats?


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java@158
PS4, Line 158: serialized
nit: remove


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@527
PS4, Line 527: Set once and reused
             :   // since thrift deserialization is costly.
remove


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@637
PS4, Line 637:   public void setPartitionStatsBytes(byte[] partitionStats, 
boolean hasIncrStats) {
you mentioned on L527 that this should be set once-- perhaps enforce it?


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@638
PS4, Line 638:     partitionStats_ = partitionStats;
is there some sanity check here, for example, hasIncrStats => partitionStats != 
null ?


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java
File fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java:

http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java@119
PS4, Line 119:           (partStats));
strange formatting


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java@139
PS4, Line 139: tition.hasIncrementalStats()
what happens with the non-incremental stats in this case?
(e.g., TPartitionStats.stats)


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java:

http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@60
PS4, Line 60: Set once and reused
            :   // since thrift deserialization is costly.
this clarification refers to the current case where we deserialize for each 
access. a new reader does not have this context, so I'd omit this part.


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/util/CompressionUtil.java
File fe/src/main/java/org/apache/impala/util/CompressionUtil.java:

http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/util/CompressionUtil.java@48
PS4, Line 48: incremental stats
keep it more general here


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/main/java/org/apache/impala/util/CompressionUtil.java@61
PS4, Line 61: incr stats, internal error
keep it general, and consistent with the error message for compressing.


http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/test/java/org/apache/impala/util/CompressionUtilTest.java
File fe/src/test/java/org/apache/impala/util/CompressionUtilTest.java:

http://gerrit.cloudera.org:8080/#/c/11341/4/fe/src/test/java/org/apache/impala/util/CompressionUtilTest.java@35
PS4, Line 35:     Assert.assertEquals(new String(decompressed), test);
include empty string/bytes.



--
To view, visit http://gerrit.cloudera.org:8080/11341
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I39f02ebfa0c6e9b0baedd0d76058a1b34efb5a02
Gerrit-Change-Number: 11341
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Tue, 28 Aug 2018 22:51:44 +0000
Gerrit-HasComments: Yes

Reply via email to