[ 
https://issues.apache.org/jira/browse/HBASE-23066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986152#comment-16986152
 ] 

Jacob LeBlanc commented on HBASE-23066:
---------------------------------------

Apologies for the delayed response on my part.

Just as an FYI, we manually deployed this patch (my original version) to our 
production environment and have been running it with good results for about a 
month and a half.
{quote}With this new config, what we do is write to cache along with the HFile 
create itself. Blocks are added to cache as and when it is written to the 
HFile. So its aggressive. Ya it helps to make the new File data available from 
time 0 itself. The concern is this in a way demands 2x cache size. Because the 
compacting files data might be already there in the cache. While the new file 
write, those old files are still valid. The new one is not even committed by 
the RS.
{quote}
If evictblocksonclose is not enabled (it looks like it is disabled by default), 
wouldn't the old file data still be in the cache even after compaction is 
finished? Granted, once compaction is done that data will no longer be 
accessed, will age out, and will be evicted if necessary, but the same amount 
of data is read into the cache both with and without this new setting, is it 
not? When prefetching is enabled, the only difference is that the new file is 
cached a little earlier; other than that, the caching requirements seem to be 
the same. I'm not sure I understand why 2x cache size is needed - perhaps I am 
missing something. Having evictblocksonclose enabled does change things: 
because the order of caching and evicting changes, you would need 2x the cache 
size compared to normal.
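
To make the comparison concrete, here is a rough back-of-the-envelope model of 
the peak cache demand of a single compaction in the four combinations. It is 
only an illustrative sketch, not HBase code: the class and method names are 
made up, it assumes the old files are fully cached beforehand, and it ignores 
the fact that a real LRU cache would simply evict once it is full.

```java
public class CompactionCachePeak {

    /**
     * Peak number of cached bytes demanded by one compaction under a
     * simplified model (hypothetical helper, not part of HBase).
     */
    public static long peakDemand(long oldBytes, long newBytes,
                                  boolean evictOnClose, boolean cacheOnCompaction) {
        long cached = oldBytes;   // assumption: old files are fully cached
        long peak = cached;

        if (cacheOnCompaction) {
            // New blocks are cached while the old files are still open.
            cached += newBytes;
            peak = Math.max(peak, cached);
        }

        // Compaction finishes and the old files are closed.
        if (evictOnClose) {
            cached -= oldBytes;
        }

        if (!cacheOnCompaction) {
            // Prefetch loads the new file only after compaction is done.
            cached += newBytes;
            peak = Math.max(peak, cached);
        }
        return peak;
    }

    public static void main(String[] args) {
        long size = 100; // arbitrary units; new file assumed as large as the old ones
        for (boolean evict : new boolean[] {false, true}) {
            for (boolean cacheOnWrite : new boolean[] {false, true}) {
                System.out.println("evictOnClose=" + evict
                    + " cacheOnCompaction=" + cacheOnWrite
                    + " peak=" + peakDemand(size, size, evict, cacheOnWrite));
            }
        }
    }
}
```

Under these assumptions the new setting only raises the peak when 
evictblocksonclose is enabled; with it disabled, the peak is the same either 
way.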
{quote}Also IMHO no need to check that based on whether prefetch is on or not! 
Make this conf name and doc clear what it is doing and what is the size 
expectations.
{quote}
Coming from the perspective of my organization's requirements - this would not 
work well for us as we only want data to be cached on compaction for tables 
where prefetching is enabled:

The clear intention of enabling prefetching on a table is to keep as much data 
in the read cache as possible to ensure consistently fast reads. Without this 
configuration, however, read performance consistently drops sharply whenever a 
compaction finishes, because large parts of the table are effectively dropped 
from the cache. (Strictly speaking, the pre-compaction data is still there 
unless evictblocksonclose is enabled, but it is cached under the old file 
names, which will never be accessed again after compaction, so the effect is 
the same as dropping it.) This configuration is meant to mitigate that effect 
and better achieve the read performance that prefetching is after. The 
intention is *not* just to cache everything that gets compacted.

So caching all compacted data on all tables does not meet this requirement and 
in fact would cause problems if it were to be used. In our use cases we have 
several tables where we write and compact a lot but where we don't want to 
prefetch those tables into our cache. Caching all blocks on compaction would 
cause big problems where we'd evict data we care about in favor of data we will 
never/rarely read.
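
The behaviour I am describing boils down to a simple rule, sketched below. 
This is only an illustration of the intent, not the code in the patch; the 
class, method, and parameter names are hypothetical.

```java
public class CompactionCacheDecision {

    /**
     * Cache blocks written by a compaction only when the (hypothetical)
     * global switch is on AND the column family has prefetching enabled.
     */
    public static boolean shouldCacheCompactedBlocks(boolean cacheCompactedBlocksEnabled,
                                                     boolean prefetchBlocksOnOpen) {
        return cacheCompactedBlocksEnabled && prefetchBlocksOnOpen;
    }

    public static void main(String[] args) {
        // A table we prefetch: compacted blocks go into the cache.
        System.out.println("prefetched table:  " + shouldCacheCompactedBlocks(true, true));
        // A write-heavy table we do not prefetch: the cache is left alone.
        System.out.println("write-heavy table: " + shouldCacheCompactedBlocks(true, false));
    }
}
```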

An alternative to having this setting contingent on prefetching would be to 
add a CACHE_BLOCKS_ON_COMPACTION attribute to ColumnFamilyDescriptor. Then we 
could choose to turn it on for the same CFs where we also have prefetching. 
This seems like a bigger code/documentation change, whereas my original 
intention on this patch was to keep it small and focused for the only use case 
I could think of (why else would someone want to cache blocks during compaction 
except if they were prefetching?). But if a per-column family setting is 
preferred, then I could try making those changes.
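
As a rough idea of what the per-column-family variant might look like, here is 
a self-contained sketch. It deliberately uses a stand-in class rather than the 
real ColumnFamilyDescriptor; CACHE_BLOCKS_ON_COMPACTION does not exist today, 
and the default of false is my assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in for a column family descriptor; not the real HBase class. */
public class FamilySettingsSketch {

    // Hypothetical attribute name taken from the proposal above.
    public static final String CACHE_BLOCKS_ON_COMPACTION = "CACHE_BLOCKS_ON_COMPACTION";

    private final Map<String, String> values = new HashMap<>();

    public FamilySettingsSketch setValue(String key, String value) {
        values.put(key, value);
        return this;
    }

    /** Defaults to false so existing tables keep their current behaviour. */
    public boolean isCacheBlocksOnCompaction() {
        return Boolean.parseBoolean(values.getOrDefault(CACHE_BLOCKS_ON_COMPACTION, "false"));
    }

    public static void main(String[] args) {
        FamilySettingsSketch readHeavy = new FamilySettingsSketch()
            .setValue(CACHE_BLOCKS_ON_COMPACTION, "true");
        FamilySettingsSketch writeHeavy = new FamilySettingsSketch();

        System.out.println("read-heavy CF:  " + readHeavy.isCacheBlocksOnCompaction());
        System.out.println("write-heavy CF: " + writeHeavy.isCacheBlocksOnCompaction());
    }
}
```

With something like this, the setting could be turned on independently of 
prefetching, at the cost of one more knob to document.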

I welcome input from you experts. Thanks!

> Allow cache on write during compactions when prefetching is enabled
> -------------------------------------------------------------------
>
>                 Key: HBASE-23066
>                 URL: https://issues.apache.org/jira/browse/HBASE-23066
>             Project: HBase
>          Issue Type: Improvement
>          Components: Compaction, regionserver
>    Affects Versions: 1.4.10
>            Reporter: Jacob LeBlanc
>            Assignee: Jacob LeBlanc
>            Priority: Minor
>             Fix For: 2.3.0, 1.6.0
>
>         Attachments: HBASE-23066.patch, performance_results.png, 
> prefetchCompactedBlocksOnWrite.patch
>
>
> In cases where users care a lot about read performance for tables that are 
> small enough to fit into a cache (or the cache is large enough), 
> prefetchOnOpen can be enabled to make the entire table available in cache 
> after the initial region opening is completed. Any new data can also be 
> guaranteed to be in cache with the cacheBlocksOnWrite setting.
> However, the missing piece is when all blocks are evicted after a compaction. 
> We found very poor performance after compactions for tables under heavy read 
> load and a slower backing filesystem (S3). After a compaction the prefetching 
> threads need to compete with threads servicing read requests and get 
> constantly blocked as a result. 
> This is a proposal to introduce a new cache configuration option that would 
> cache blocks on write during compaction for any column family that has 
> prefetch enabled. This would virtually guarantee all blocks are kept in cache 
> after the initial prefetch on open is completed allowing for guaranteed 
> steady read performance despite a slow backing file system.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
