[ https://issues.apache.org/jira/browse/HBASE-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kannan Muthukkaruppan updated HBASE-3693:
-----------------------------------------

    Description: 
We noticed that there are lots of listStatus calls on the ColumnFamily 
directories within each region, coming from this codepath:

{code}
compactionSelection()
 --> isMajorCompaction 
    --> getLowestTimestamp()
       -->  FileStatus[] stats = fs.listStatus(p);
{code}

So we take this hit on every compactionSelection() call. While not immediately 
an issue, log inspection shows that this currently accounts for a large number 
of RPCs to the namenode, which seems like unnecessary load to be sending there.

It seems like it would be easy to cache the timestamp for each opened/created 
StoreFile in memory in the region server, and avoid going to DFS each time 
for this information.
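
A minimal sketch of the caching idea, not the actual patch: the class name 
StoreFileTimestampCache, its methods, and the dfsLookup function are all 
hypothetical. dfsLookup stands in for whatever DFS call currently fetches a 
file's timestamp (the listStatus call in the codepath above), so the sketch 
stays free of Hadoop dependencies.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch: keeps one timestamp per store file path in memory. */
class StoreFileTimestampCache {
    private final Map<String, Long> timestamps = new ConcurrentHashMap<>();
    // Assumption: stand-in for the DFS lookup that costs a namenode RPC.
    private final Function<String, Long> dfsLookup;

    StoreFileTimestampCache(Function<String, Long> dfsLookup) {
        this.dfsLookup = dfsLookup;
    }

    /** Record the timestamp once, when a store file is opened or created. */
    void onFileOpened(String path) {
        timestamps.computeIfAbsent(path, dfsLookup);
    }

    /** Drop the entry when a file goes away, e.g. after a compaction. */
    void onFileRemoved(String path) {
        timestamps.remove(path);
    }

    /**
     * Lowest timestamp across the given files. Files already seen are
     * answered from memory; only unseen files trigger the DFS lookup.
     * Returns Long.MAX_VALUE when no paths are given.
     */
    long getLowestTimestamp(Iterable<String> paths) {
        long lowest = Long.MAX_VALUE;
        for (String path : paths) {
            long ts = timestamps.computeIfAbsent(path, dfsLookup);
            lowest = Math.min(lowest, ts);
        }
        return lowest;
    }
}
```

With something like this, repeated compactionSelection() calls over the same 
set of files would pay the DFS cost once per file rather than once per check. 
Entries have to be dropped when files are removed, otherwise the map grows 
without bound.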


  was:
We noticed that are lots of listStatus calls on the ColumnFamily directories 
within each regions, coming from this codepath:

{code}
compactionSelection()
 --> isMajorCompaction 
    --> getLowestTimestamp()
       -->  FileStatus[] stats = fs.listStatus(p);
{code}

So on every compactionSelection() we're taking this hit. While not immediately 
an issue, just from log inspection, this accounts for quite a large number of 
RPCs to namenode at the moment and seems like an unnecessary load to be sending 
to the namenode.

Seems like it would be easy to cache the timestamp for each opened/created 
StoreFile, in memory, in the region server, and avoid going to DFS each time 
for this information.



> isMajorCompaction() check triggers lots of listStatus DFS RPC calls from HBase
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-3693
>                 URL: https://issues.apache.org/jira/browse/HBASE-3693
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Liyin Tang
>
> We noticed that there are lots of listStatus calls on the ColumnFamily 
> directories within each region, coming from this codepath:
> {code}
> compactionSelection()
>  --> isMajorCompaction 
>     --> getLowestTimestamp()
>        -->  FileStatus[] stats = fs.listStatus(p);
> {code}
> So we take this hit on every compactionSelection() call. While not 
> immediately an issue, log inspection shows that this currently accounts for 
> a large number of RPCs to the namenode, which seems like unnecessary load to 
> be sending there.
> It seems like it would be easy to cache the timestamp for each opened/created 
> StoreFile in memory in the region server, and avoid going to DFS each time 
> for this information.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
