[ https://issues.apache.org/jira/browse/HBASE-15885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15299373#comment-15299373 ]
Guanghao Zhang commented on HBASE-15885:
----------------------------------------

Yes, it changes the behavior when computeHDFSBlocksDistribution(fs) throws an exception. When I profiled the region close/open process, computeHDFSBlocksDistribution took about 10 ms. Avoiding the three HDFS block distribution calculations may save 3 * 10 ms of region not-serving time.

> Compute StoreFile HDFS Blocks Distribution when needed
> ------------------------------------------------------
>
>                 Key: HBASE-15885
>                 URL: https://issues.apache.org/jira/browse/HBASE-15885
>             Project: HBase
>          Issue Type: Improvement
>          Components: HFile
>    Affects Versions: 2.0.0
>            Reporter: Guanghao Zhang
>         Attachments: HBASE-15885.patch
>
>
> Currently, opening a StoreFileReader always computes the HDFS blocks
> distribution. When a region is balanced, this increases the region
> not-serving time, because the region must first be closed on rs A and then
> opened on rs B. On close, the region first runs preFlush, then flushes the
> new updates to a new store file. The new store file is first written to the
> tmp directory and then moved to the column family directory. These steps
> open a StoreFileReader twice, so the HDFS blocks distribution is computed
> twice. When the region is opened on rs B, a StoreFileReader is opened and the
> HDFS blocks distribution is computed again. So balancing a region computes
> the HDFS blocks distribution three times for each new store file. This
> increases the region not-serving time, and the HDFS blocks distribution is
> not needed when closing a region.
> The three related methods in HStore:
> 1. validateStoreFile(...)
> 2. commitFile(...)
> 3. openStoreFiles(...)
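
The "compute when needed" idea can be illustrated with a minimal Java sketch. This is not the attached patch and not HBase code: LazyValue and SketchStoreFileReader are hypothetical names, and the string result stands in for a real blocks distribution. It only shows that opening a reader costs nothing until the distribution is first requested, and that the value is then computed once.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper (not an HBase class): defers an expensive
// computation until the first call to get(), then caches the result.
final class LazyValue<T> {
    private final Supplier<T> supplier;
    private T value;
    private boolean computed;

    LazyValue(Supplier<T> supplier) {
        this.supplier = supplier;
    }

    synchronized T get() {
        if (!computed) {
            // If the supplier throws, nothing is cached and the exception
            // reaches the caller of get() instead of the code that opened
            // the reader.
            value = supplier.get();
            computed = true;
        }
        return value;
    }
}

// Hypothetical stand-in for a store file reader. The real computation
// would contact HDFS; here it only counts how often it runs.
final class SketchStoreFileReader {
    static final AtomicInteger COMPUTATIONS = new AtomicInteger();

    private final LazyValue<String> blocksDistribution;

    SketchStoreFileReader(String path) {
        // Opening the reader only registers the computation.
        this.blocksDistribution = new LazyValue<>(() -> {
            COMPUTATIONS.incrementAndGet();
            return "distribution-of-" + path;
        });
    }

    String getBlocksDistribution() {
        return blocksDistribution.get();
    }
}
```

In this sketch, opening three readers (as in the validate, commit, and open steps above) triggers no computation. Only a reader whose distribution is actually requested pays the cost, and it pays it once. A side effect, under the same assumptions, is that a failure in the computation surfaces at first use rather than at open time, which matches the behavior change mentioned in the comment.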