[ 
https://issues.apache.org/jira/browse/HBASE-17057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710244#comment-15710244
 ] 

Ashu Pachauri commented on HBASE-17057:
---------------------------------------

I talked to [~eclark] offline. It turns out that compaction throttling has 
nothing to do with dropping page cache as such; the throttle threshold was 
only used as a hint about the total size of the files involved in a compaction 
request. In the old world, compactions piggybacked on the store file scanners 
that were already open, so we considered it more efficient not to drop pages 
during compactions that were small enough, rather than risk dropping pages for 
storefiles that were probably already being read. However, since we now use 
private readers for compactions by default, we should also drop pages for 
minor compactions by default.
I'll add a patch that introduces a config to drop page cache for minor and 
major compactions. This config will be set to true by default, but someone who 
is not using private readers can choose to turn it off (though I doubt turning 
it off will have any positive impact, especially in large clusters).
For the master branch, this JIRA will address passing the drop cache hint 
correctly; I'll open a separate issue (or find one if it already exists) to 
make sure we honor the hint in the compaction path.
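
To make the change in behavior concrete, here is a minimal sketch of the 
decision described above. It is not the patch itself: the class name, the 
method names and both config keys are hypothetical, a plain Map stands in for 
the real configuration object, and it assumes one flag per compaction type and 
a strict "greater than" comparison against the throttle threshold.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: none of these names come from the HBase code base.
class DropCachePolicy {
    // Hypothetical config keys; the actual patch may name them differently.
    static final String MINOR_DROP_KEY = "example.compaction.minor.pagecache.drop";
    static final String MAJOR_DROP_KEY = "example.compaction.major.pagecache.drop";

    // Old behavior as described in the comment: the throttle threshold doubles
    // as a size hint, so only "large" compaction requests drop pages.
    static boolean oldShouldDropCache(long totalFileSizeBytes, long throttleThresholdBytes) {
        return totalFileSizeBytes > throttleThresholdBytes;
    }

    // Proposed behavior: a flag per compaction type, true unless turned off.
    static boolean shouldDropCache(Map<String, String> conf, boolean isMajor) {
        String key = isMajor ? MAJOR_DROP_KEY : MINOR_DROP_KEY;
        return Boolean.parseBoolean(conf.getOrDefault(key, "true"));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Nothing configured: minor compactions drop pages by default.
        System.out.println(shouldDropCache(conf, false));
        // Someone not using private readers turns it off for minor compactions.
        conf.put(MINOR_DROP_KEY, "false");
        System.out.println(shouldDropCache(conf, false));
        // Major compactions are controlled by their own flag and are unaffected.
        System.out.println(shouldDropCache(conf, true));
    }
}
```

With the old size-based check, a small compaction never drops pages no matter 
how much data minor compactions read in aggregate; with the flags, the choice 
no longer depends on the size of an individual request.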

> Minor compactions should also drop page cache behind reads
> ----------------------------------------------------------
>
>                 Key: HBASE-17057
>                 URL: https://issues.apache.org/jira/browse/HBASE-17057
>             Project: HBase
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Ashu Pachauri
>            Assignee: Ashu Pachauri
>
> Long compactions currently drop cache behind reads/writes so that they don't 
> pollute the page cache, but short compactions don't do that. The bulk of the 
> data is actually read during minor compactions rather than major compactions, 
> and it thrashes the page cache since it's mostly not needed. 
> We should drop page cache behind minor compactions too. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
