Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4029: Reduce memory requirements for storing file 
metadata
......................................................................


Patch Set 2:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/6406/2/common/fbs/CMakeLists.txt
File common/fbs/CMakeLists.txt:

PS2, Line 64: message(${JAVA_FE_ARGS})
Looks like debug printing. Remove? (below too)


http://gerrit.cloudera.org:8080/#/c/6406/2/fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java
File fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java:

PS2, Line 67: storageIdtoInt
(Unrelated to your change). Looks obsolete.


http://gerrit.cloudera.org:8080/#/c/6406/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

Line 119:       FlatBufferBuilder fbb = new FlatBufferBuilder(1);
Preconditions.checkState(isBlockMdSupported(fs))?


PS2, Line 124: } else {
             :         locations = fileSystem.getFileBlockLocations(fileStatus, 
0, fileStatus.getLen());
             :       }
Do we really need this? Per my understanding, we only fetch 
LocatedFileStatus'es going forward as we don't want to do another RPC to 
getFileBlockLocations(). In another words, is there some caller that requires 
this?


Line 141:         ListMap<TNetworkAddress> hostIndex) {
Preconditions.checkState(!isBlockMdSupported(fs))


PS2, Line 425: fbFileBlock_.replicaHostIdxs(replicaIdx);
- replicaHostIdxs is [ushort] as per fbs definition in which case idx should be 
a short? That REPLICA_HOST_CACHE_MASK could be made made short and avoid 
typecasting in L361. 

- Also that way we could use only REPLICA_HOST_CACHE_MASK and use 
~REPLICA_HOST_CACHE_MASK for unmasking instead of a separate 
REPLICA_HOST_IDX_MASK. Feel free to ignore this if you think that affects 
readability.


Line 463:   public static class FileDescStats {
For supportability purposes, I'm wondering if we should track the 
remote_network_addresses and present a rolled up version of stats so that we 
can know which DNs to look at when we see -ve disk IDs.


http://gerrit.cloudera.org:8080/#/c/6406/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

PS2, Line 344: =
nit: spacing


Line 365:   private static String findCommonPrefix(String a, String b) {
Use Strings.commonPrefix() from guava?


Line 1933:       Collections.sort(orderedFds, new Comparator<FileDescriptor>() {
Can we just make FileDescriptor implement Comparator<FileDescriptor> and add a 
compare() method there. I think that'd be cleaner.


http://gerrit.cloudera.org:8080/#/c/6406/2/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

Line 93:   protected boolean storedInImpaladCatalogCache_ = false;
What is the motivation to track this? A quick look at the places it is used, it 
seems like they are only called when it actually true (for example 
resetPartitions()). Are there callers when its false too? Like tests?


-- 
To view, visit http://gerrit.cloudera.org:8080/6406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I483d3cadc9d459f71a310c35a130d073597b0983
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to