Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
......................................................................


Patch Set 7:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@221
PS7, Line 221:     // are excluded because they are considered hidden from 
Impala's perspecitve
Shrink:

// Number of hidden files excluded from file metadata loading . More details at 
isValidDataFile().


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@226
PS7, Line 226:     // Number of files skipped while computeing the block 
metadata information.
Shrink:

// Number of files skipped from file metadata loading because the the files 
have not changed since the last load. More details at hasFileChanged().


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@248
PS7, Line 248:     // If set to true, reloads the block metadata only when the 
files in this path
reloads the file metadata (let's try to use the new "file metadata" terminology 
consistently)


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@249
PS7, Line 249:     // has changed since last load (more details in 
hasFileChanged()).
have changed


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@785
PS7, Line 785:    * much higher throughput of RPC calls for 
listStatus/listFiles. Based on our
... RPC calls for listStatus/listFiles. For simplicity, the filesystem type is 
determined based on the table's root path and not for each partition 
individually. Based on our experiments, S3 showed ...


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@791
PS7, Line 791:    * This method is not optimized for tables with mixed 
partition types (partitions mapped
I don'd think we need this much detail, how about folding a simple sentence 
into the existing paragraph above, see suggestion above.


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@794
PS7, Line 794:    * This may not work well with the mixed partition types and 
needs to fixed.
Don't think this is needed.


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@819
PS7, Line 819:     // For tables without partitions, we have no block metadata 
to load.
remove ","


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@841
PS7, Line 841:           LOG.error("Encountered an error loading block metadata 
for table: " +
Could this flood the log when HDFS is under pressure or there is an 
intermittent Kerberos issue? I'm thinking we should aggregate these into a 
single log message. Do you think having every single path helps with 
supportability? My feeling is it's more important to note how many tasks 
failed, but I'll defer to you if you think having the paths is useful.



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 7
Gerrit-Owner: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Vuk Ercegovac <[email protected]>
Gerrit-Comment-Date: Sat, 21 Oct 2017 00:12:39 +0000
Gerrit-HasComments: Yes

Reply via email to