[
https://issues.apache.org/jira/browse/HBASE-23066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986152#comment-16986152
]
Jacob LeBlanc commented on HBASE-23066:
---------------------------------------
Apologies for the delayed response on my part.
Just as an FYI, we manually deployed this patch (my original version) to our
production environment and have been running it with good results for about a
month and a half.
{quote}With this new config, what we do is write to cache along with the HFile
create itself. Blocks are added to cache as and when it is written to the
HFile. So its aggressive. Ya it helps to make the new File data available from
time 0 itself. The concern is this in a way demands 2x cache size. Because the
compacting files data might be already there in the cache. While the new file
write, those old files are still valid. The new one is not even committed by
the RS.
{quote}
Without evictblocksonclose enabled (it looks like it is disabled by default), wouldn't
the old file data still be in the cache even after compaction is finished? Granted, once
compaction is done those blocks will no longer be accessed, will age out, and will be
evicted if necessary, but the same amount of data is read into the cache both with and
without this new setting, isn't it? When prefetching is enabled, the only difference is
that the caching of the new file happens a little earlier; other than that, the caching
requirements seem to be the same. I'm not sure I understand why 2x cache size would be
needed - perhaps I am missing something. Having evictblocksonclose enabled does change
things, and would mean you need 2x the cache size compared to normal, since this setting
changes the ordering of caching/evicting.
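For my own clarity, here is a minimal sketch of checking the two flags in question
(assuming the config key names I believe CacheConfig uses, hbase.rs.evictblocksonclose
and hbase.rs.cacheblocksonwrite, both defaulting to false - please correct me if the
names or defaults are off):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class CacheFlagCheck {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();

    // Whether a file's cached blocks are dropped when the file is closed
    // (e.g. when the pre-compaction files are archived). I believe this defaults to false.
    boolean evictOnClose = conf.getBoolean("hbase.rs.evictblocksonclose", false);

    // Whether blocks are added to the block cache as they are written out.
    boolean cacheOnWrite = conf.getBoolean("hbase.rs.cacheblocksonwrite", false);

    System.out.println("evictblocksonclose=" + evictOnClose
        + ", cacheblocksonwrite=" + cacheOnWrite);
  }
}
{code}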
{quote}Also IMHO no need to check that based on whether prefetch is on or not!
Make this conf name and doc clear what it is doing and what is the size
expectations.
{quote}
From the perspective of my organization's requirements, this would not work well for
us, as we only want data to be cached on compaction for tables where prefetching is
enabled:
The clear intention of enabling prefetching on a table is to keep as much data in the
read cache as possible to ensure consistently fast reads, but without this configuration
there are huge drops in read performance whenever a compaction finishes, because large
parts of the table are effectively dropped from the cache (the pre-compaction data is
actually still there unless evictblocksonclose is enabled, but it belongs to the old
file names, which will never be accessed again after compaction is finished, so it
amounts to the same thing as dropping the data). This configuration is meant to mitigate
that effect and better achieve the read performance sought by prefetching. The intention
is *not* simply to cache everything that gets compacted.
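For context, this is roughly how we turn prefetching on per column family today (a
sketch against the 2.x client API; on 1.x branches I believe the equivalent is
HColumnDescriptor#setPrefetchBlocksOnOpen, and the table/CF names below are just
placeholders):
{code:java}
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class EnablePrefetch {
  public static void main(String[] args) throws Exception {
    try (Connection conn = ConnectionFactory.createConnection();
         Admin admin = conn.getAdmin()) {
      // Placeholder table/CF names; substitute the tables you actually want fully cached.
      ColumnFamilyDescriptor cfd = ColumnFamilyDescriptorBuilder
          .newBuilder(Bytes.toBytes("cf"))
          .setPrefetchBlocksOnOpen(true) // prefetch this CF into the block cache on region open
          .build();
      admin.modifyColumnFamily(TableName.valueOf("mytable"), cfd);
    }
  }
}
{code}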
So caching all compacted data on all tables does not meet this requirement and would in
fact cause problems if it were used. In our use cases we have several tables that we
write to and compact heavily, but that we don't want prefetched into our cache. Caching
all blocks on compaction would cause big problems, because we'd evict data we care about
in favor of data we will rarely or never read.
An alternative to making this setting contingent on prefetching would be to add a
CACHE_BLOCKS_ON_COMPACTION attribute to ColumnFamilyDescriptor. Then we could choose to
turn it on for the same CFs where we also have prefetching enabled. This seems like a
bigger code/documentation change, whereas my original intention with this patch was to
keep it small and focused on the only use case I could think of (why else would someone
want to cache blocks during compaction except if they were prefetching?). But if a
per-column-family setting is preferred, I could try making those changes.
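To make the idea concrete (purely hypothetical - no such attribute exists today, and the
name and plumbing below are just placeholders I'm using for illustration), the per-CF
alternative might look something like this on the client side:
{code:java}
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

public class PerCfCompactionCaching {
  public static void main(String[] args) {
    ColumnFamilyDescriptor cfd = ColumnFamilyDescriptorBuilder
        .newBuilder(Bytes.toBytes("cf"))
        .setPrefetchBlocksOnOpen(true)
        // Hypothetical attribute mirroring PREFETCH_BLOCKS_ON_OPEN: the compaction write
        // path on the server side would consult it when building its CacheConfig.
        .setValue(Bytes.toBytes("CACHE_BLOCKS_ON_COMPACTION"), Bytes.toBytes("true"))
        .build();
    System.out.println(cfd);
  }
}
{code}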
I welcome input from you experts. Thanks!
> Allow cache on write during compactions when prefetching is enabled
> -------------------------------------------------------------------
>
> Key: HBASE-23066
> URL: https://issues.apache.org/jira/browse/HBASE-23066
> Project: HBase
> Issue Type: Improvement
> Components: Compaction, regionserver
> Affects Versions: 1.4.10
> Reporter: Jacob LeBlanc
> Assignee: Jacob LeBlanc
> Priority: Minor
> Fix For: 2.3.0, 1.6.0
>
> Attachments: HBASE-23066.patch, performance_results.png,
> prefetchCompactedBlocksOnWrite.patch
>
>
> In cases where users care a lot about read performance for tables that are
> small enough to fit into a cache (or the cache is large enough),
> prefetchOnOpen can be enabled to make the entire table available in cache
> after the initial region opening is completed. Any new data can also be
> guaranteed to be in cache with the cacheBlocksOnWrite setting.
> However, the missing piece is when all blocks are evicted after a compaction.
> We found very poor performance after compactions for tables under heavy read
> load and a slower backing filesystem (S3). After a compaction the prefetching
> threads need to compete with threads servicing read requests and get
> constantly blocked as a result.
> This is a proposal to introduce a new cache configuration option that would
> cache blocks on write during compaction for any column family that has
> prefetch enabled. This would virtually guarantee all blocks are kept in cache
> after the initial prefetch on open is completed allowing for guaranteed
> steady read performance despite a slow backing file system.