Mostafa Mokhtar has posted comments on this change.
Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
......................................................................
Patch Set 4:
Just tried out the latest patch: metadata loading is 5.4x faster.
With the patch, metadata loading for 80 partitions with 250K files finished in
27 seconds, compared to 146 seconds without it.
Most of the CPU time is spent in the RemoteIterator. To further speed up
metadata loading, I recommend using a thread pool.
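A minimal sketch of the thread-pool idea, not the patch itself. It assumes the per-partition listing can be run independently for each partition directory. ParallelPartitionLister, listAll, and the lister callback are hypothetical names; in the catalog, the callback would stand in for the per-directory RemoteIterator loop.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not part of Impala: fans per-partition listings out to a
// fixed-size thread pool and collects the results in input order.
final class ParallelPartitionLister {
  private ParallelPartitionLister() {}

  // 'lister' is an assumed stand-in for whatever lists the files of one
  // partition directory (for example, a loop over a RemoteIterator).
  static Map<String, List<String>> listAll(List<String> partitionDirs,
      Function<String, List<String>> lister, int numThreads) {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      // Submit one listing task per partition directory.
      List<Future<List<String>>> futures = new ArrayList<>();
      for (String dir : partitionDirs) {
        Callable<List<String>> task = () -> lister.apply(dir);
        futures.add(pool.submit(task));
      }
      // Wait for each task and keep the results keyed by directory.
      Map<String, List<String>> result = new LinkedHashMap<>();
      for (int i = 0; i < partitionDirs.size(); i++) {
        result.put(partitionDirs.get(i), futures.get(i).get());
      }
      return result;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    } catch (ExecutionException e) {
      throw new RuntimeException(e.getCause());
    } finally {
      pool.shutdownNow();
    }
  }
}
```

The pool size would need tuning so that concurrent listing calls do not overload the NameNode; the right value is not something this sketch can determine.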
Stack Trace | Sample Count | Percentage (%)
org.apache.impala.catalog.HdfsTable.load(boolean, IMetaStoreClient, Table) | 509 | 74.307
org.apache.impala.catalog.HdfsTable.load(boolean, IMetaStoreClient, Table, boolean, boolean, Set) | 509 | 74.307
org.apache.impala.catalog.HdfsTable.loadAllPartitions(List, Table) | 507 | 74.015
org.apache.impala.catalog.HdfsTable.loadMetadataAndDiskIds(FileSystem, List, HashMap) | 497 | 72.555
org.apache.impala.catalog.HdfsTable.loadBlockMetadata(FileSystem, Path, HashMap, Map) | 472 | 68.905
org.apache.hadoop.fs.FileSystem$5.hasNext() | 365 | 53.285
org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.hasNext() | 339 | 49.489
org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.hasNextNoFilter() | 258 | 37.664
org.apache.hadoop.hdfs.DFSClient.listPaths(String, byte[], boolean) | 258 | 37.664
com.sun.proxy.$Proxy21.getListing(String, byte[], boolean) | 258 | 37.664
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Object, Method, Object[]) | 258 | 37.664
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(Method, Object[]) | 258 | 37.664
java.lang.reflect.Method.invoke(Object, Object[]) | 258 | 37.664
org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus.makeQualifiedLocated(URI, Path) | 81 | 11.825
--
To view, visit http://gerrit.cloudera.org:8080/5148
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-HasComments: No