[
https://issues.apache.org/jira/browse/HBASE-23066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984143#comment-16984143
]
ramkrishna.s.vasudevan commented on HBASE-23066:
------------------------------------------------
bq. On a side note (not related to this issue), when cache on write is ON
and prefetch is also ON, do we cache the flushed files twice? When a file is
written, its blocks have already been added to the cache. Later, as part of
HFile reader open, the prefetch threads will read them again and add them to
the cache!
I checked this part. It seems we just read the block, and if it is already in
the cache we simply return it, because HFileReaderImpl#readBlock() returns
early when the block is already cached.
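
To illustrate the behaviour described above, here is a minimal, self-contained
sketch of a read-through block cache. It is not the HBase code: the class
BlockCacheSketch and its methods writeBlock, readBlock, loadFromDisk and
diskReads are hypothetical names used only for illustration. The point is that
a prefetch pass over blocks already cached on write only hits the cache and
does not read or insert them a second time.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a block cache plus reader; not the HBase API.
final class BlockCacheSketch {
    private final Map<String, byte[]> cache = new HashMap<>();
    private int diskReads = 0;

    // Write path: with cache-on-write enabled, the block goes into the
    // cache at the moment it is written.
    void writeBlock(String blockKey, byte[] data, boolean cacheOnWrite) {
        if (cacheOnWrite) {
            cache.put(blockKey, data);
        }
    }

    // Read path (also used by prefetch): return early on a cache hit,
    // otherwise load the block and cache it.
    byte[] readBlock(String blockKey) {
        byte[] cached = cache.get(blockKey);
        if (cached != null) {
            return cached; // already cached: no disk read, no second insert
        }
        diskReads++;
        byte[] loaded = loadFromDisk(blockKey);
        cache.put(blockKey, loaded);
        return loaded;
    }

    // Placeholder for the slow backing filesystem read.
    private byte[] loadFromDisk(String blockKey) {
        return blockKey.getBytes();
    }

    int diskReads() {
        return diskReads;
    }

    public static void main(String[] args) {
        BlockCacheSketch sketch = new BlockCacheSketch();
        sketch.writeBlock("block-1", new byte[] {1}, true); // flush with cache on write
        sketch.readBlock("block-1");                        // prefetch on reader open
        System.out.println("disk reads after prefetch: " + sketch.diskReads());
    }
}
```

In this sketch the prefetch read of "block-1" is served from the cache, so the
only extra cost is the cache lookup itself.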
bq.The comment from @chenxu seems valid. Should we see that angle also?
OK. We can look into that, but should it be part of this JIRA, or should we
raise a separate JIRA to address it?
> Allow cache on write during compactions when prefetching is enabled
> -------------------------------------------------------------------
>
> Key: HBASE-23066
> URL: https://issues.apache.org/jira/browse/HBASE-23066
> Project: HBase
> Issue Type: Improvement
> Components: Compaction, regionserver
> Affects Versions: 1.4.10
> Reporter: Jacob LeBlanc
> Assignee: Jacob LeBlanc
> Priority: Minor
> Fix For: 2.3.0, 1.6.0
>
> Attachments: HBASE-23066.patch, performance_results.png,
> prefetchCompactedBlocksOnWrite.patch
>
>
> In cases where users care a lot about read performance for tables that are
> small enough to fit into a cache (or the cache is large enough),
> prefetchOnOpen can be enabled to make the entire table available in cache
> after the initial region opening is completed. Any new data can also be
> guaranteed to be in cache with the cacheBlocksOnWrite setting.
> However, the missing piece is when all blocks are evicted after a compaction.
> We found very poor performance after compactions for tables under heavy read
> load and a slower backing filesystem (S3). After a compaction the prefetching
> threads need to compete with threads servicing read requests and get
> constantly blocked as a result.
> This is a proposal to introduce a new cache configuration option that would
> cache blocks on write during compaction for any column family that has
> prefetch enabled. This would virtually guarantee that all blocks are kept in
> cache after the initial prefetch on open is completed, allowing for steady
> read performance despite a slow backing file system.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)