Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
......................................................................


Patch Set 4:

Just tried out the latest patch and metadata loading is 5.4x faster. 

With the patch, metadata loading for 80 partitions with 250K files finished in 
27 seconds, compared to 146 seconds without it. 

Most of the CPU time is spent in the RemoteIterator (FileSystem$5.hasNext() 
accounts for about 53% of the samples in the profile below). To further speed 
up metadata loading, I recommend using a thread pool. 

Stack Trace     Sample Count    Percentage(%)
org.apache.impala.catalog.HdfsTable.load(boolean, IMetaStoreClient, Table)     509     74.307
   org.apache.impala.catalog.HdfsTable.load(boolean, IMetaStoreClient, Table, boolean, boolean, Set)     509     74.307
      org.apache.impala.catalog.HdfsTable.loadAllPartitions(List, Table)     507     74.015
         org.apache.impala.catalog.HdfsTable.loadMetadataAndDiskIds(FileSystem, List, HashMap)     497     72.555
            org.apache.impala.catalog.HdfsTable.loadBlockMetadata(FileSystem, Path, HashMap, Map)     472     68.905
               org.apache.hadoop.fs.FileSystem$5.hasNext()     365     53.285
                  org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.hasNext()     339     49.489
                     org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.hasNextNoFilter()     258     37.664
                        org.apache.hadoop.hdfs.DFSClient.listPaths(String, byte[], boolean)     258     37.664
                           com.sun.proxy.$Proxy21.getListing(String, byte[], boolean)     258     37.664
                              org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Object, Method, Object[])     258     37.664
                                 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(Method, Object[])     258     37.664
                                    java.lang.reflect.Method.invoke(Object, Object[])     258     37.664
                     org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus.makeQualifiedLocated(URI, Path)     81      11.825
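
To illustrate the thread pool suggestion, here is a minimal sketch of listing 
partition directories in parallel. It is not taken from the patch: the class 
ParallelPartitionLoader, the method loadAll, and the lister callback are all 
hypothetical names, and partition paths are plain strings to keep the example 
free of Hadoop dependencies. In HdfsTable the callback would presumably wrap 
the per-partition listing work (for example, what loadBlockMetadata does 
today).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not part of Impala or of this patch.
final class ParallelPartitionLoader {
  private ParallelPartitionLoader() {}

  /**
   * Runs `lister` once per partition directory on a fixed-size thread pool
   * and returns the results keyed by directory, in submission order.
   * `lister` is an assumed stand-in for the per-directory listing call.
   */
  public static <T> Map<String, List<T>> loadAll(
      List<String> partitionDirs, Function<String, List<T>> lister, int numThreads) {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      // Submit every directory first so the listings overlap.
      Map<String, Future<List<T>>> pending = new LinkedHashMap<>();
      for (String dir : partitionDirs) {
        Callable<List<T>> task = () -> lister.apply(dir);
        pending.put(dir, pool.submit(task));
      }
      // Then collect the results; get() blocks until each task is done.
      Map<String, List<T>> result = new LinkedHashMap<>();
      for (Map.Entry<String, Future<List<T>>> entry : pending.entrySet()) {
        result.put(entry.getKey(), entry.getValue().get());
      }
      return result;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while loading partitions", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Partition listing failed", e.getCause());
    } finally {
      // No-op after success; cancels outstanding tasks after a failure.
      pool.shutdownNow();
    }
  }
}
```

Since each listing ends in getListing RPCs (per the trace above), the pool 
size would likely need to be bounded and tuned so that parallel loading does 
not put too much load on the NameNode.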

-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-HasComments: No
