[
https://issues.apache.org/jira/browse/HBASE-23066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941073#comment-16941073
]
Jacob LeBlanc commented on HBASE-23066:
---------------------------------------
I've run some performance tests to demonstrate the effectiveness of the patch.
I have not patched our production cluster yet, as I'm waiting on confirmation
from the AWS service team that I won't be overwriting AWS-specific changes in
the HStore class, but I've done some sampling on a test cluster.
The basic setup is an EMR cluster running HBase 1.4.9 backed by S3, with
Ganglia installed to capture the metrics. I have a stress tester executing
about 1000 scans per second against a 1.5 GB region. Prefetching is enabled.
One region server is unpatched (or has the new configuration setting disabled),
and one region server is patched with the new configuration option enabled.
I then execute the following test (steps 1 and 3 are sketched in code after the list):
1. Move the region to the desired region server (either patched or unpatched).
2. Wait for prefetching to complete and for mean scan times to normalize.
3. Execute a major compaction on the target region.
4. Check region server UI / logs to see when the compaction completes.
5. Collect data from ganglia.
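For anyone wanting to repeat this, steps 1 and 3 can also be scripted against the
Java Admin API rather than the shell. This is only a minimal sketch assuming the
HBase 1.x client; the encoded region name, region name, and destination server
name below are placeholders rather than values from my cluster:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class MoveAndCompact {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      // Step 1: move the region to the desired region server (patched or unpatched).
      // The destination uses the 1.x-style "host,port,startcode" server name.
      admin.move(Bytes.toBytes("<encodedRegionName>"),
                 Bytes.toBytes("<host>,16020,<startcode>"));

      // Step 3: request a major compaction of the target region. The call is
      // asynchronous, so completion is still confirmed via the region server
      // UI / logs as in step 4.
      admin.majorCompactRegion(Bytes.toBytes("<regionName>"));
    }
  }
}
{code}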
One issue I identified with my test is that the scans aren't as random as they
should be, so on the unpatched server I believe data is getting cached on read
more quickly after compaction than it would be if my scans were truly random.
I can improve the test (e.g. with a randomized scan load like the sketch
below), but the results still validate the patch.
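The improvement would be for each stress-tester thread to start every scan at a
uniformly random row so no part of the region is favored. A minimal sketch of
that kind of load (the table name, key format, and key range are placeholders,
not my actual schema):
{code:java}
import java.util.concurrent.ThreadLocalRandom;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class RandomScanLoad {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("<table>"))) {
      while (true) {
        // Pick a uniformly random start key; zero-padded numeric keys as a placeholder.
        long start = ThreadLocalRandom.current().nextLong(1_000_000L);
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes(String.format("%010d", start)));
        scan.setCaching(10);    // short scans, as in the ~1000 scans/sec load
        int remaining = 10;     // read only a handful of rows per scan
        try (ResultScanner scanner = table.getScanner(scan)) {
          for (Result r : scanner) {
            if (--remaining <= 0) {
              break;
            }
          }
        }
      }
    }
  }
}
{code}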
Baseline mean scan time was about 20 - 60 milliseconds. After compaction the
results were:
Trial 1 (unpatched): mean scan time peaked at over 27000 milliseconds, and
stayed above 5000 milliseconds for 3 minutes
Trial 2 (unpatched): mean scan time peaked at over 27000 milliseconds, and
stayed above 5000 milliseconds for 3.5 minutes
Trial 3 (patched): mean scan time peaked at 282 milliseconds for a single time
sample
Trial 4 (patched): mean scan time peaked at just over 1300 milliseconds and
remained above 1000 milliseconds for 30 seconds
Trial 5 (patched): no noticeable spike in mean scan time
I've attached a picture of a graph of the results.
> Allow cache on write during compactions when prefetching is enabled
> -------------------------------------------------------------------
>
> Key: HBASE-23066
> URL: https://issues.apache.org/jira/browse/HBASE-23066
> Project: HBase
> Issue Type: Improvement
> Components: Compaction, regionserver
> Affects Versions: 1.4.10
> Reporter: Jacob LeBlanc
> Assignee: Jacob LeBlanc
> Priority: Minor
> Fix For: 1.5.0, 2.3.0
>
> Attachments: HBASE-23066.patch, performance_results.png,
> prefetchCompactedBlocksOnWrite.patch
>
>
> In cases where users care a lot about read performance for tables that are
> small enough to fit into a cache (or the cache is large enough),
> prefetchOnOpen can be enabled to make the entire table available in cache
> after the initial region opening is completed. Any new data can also be
> guaranteed to be in cache with the cacheBlocksOnWrite setting.
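> For reference, both existing knobs can be set per column family through the
> 1.x client API (as well as globally in hbase-site.xml). A minimal sketch,
> assuming a placeholder table "my_table" with family "cf":
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.HColumnDescriptor;
> import org.apache.hadoop.hbase.HTableDescriptor;
> import org.apache.hadoop.hbase.TableName;
> import org.apache.hadoop.hbase.client.Admin;
> import org.apache.hadoop.hbase.client.Connection;
> import org.apache.hadoop.hbase.client.ConnectionFactory;
> import org.apache.hadoop.hbase.util.Bytes;
>
> public class EnableCachingOptions {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = HBaseConfiguration.create();
>     try (Connection conn = ConnectionFactory.createConnection(conf);
>          Admin admin = conn.getAdmin()) {
>       TableName table = TableName.valueOf("my_table");
>       // Fetch the existing descriptor so other family attributes are preserved.
>       HTableDescriptor desc = admin.getTableDescriptor(table);
>       HColumnDescriptor cf = desc.getFamily(Bytes.toBytes("cf"));
>       cf.setPrefetchBlocksOnOpen(true);  // prefetch all blocks when a region opens
>       cf.setCacheDataOnWrite(true);      // cache newly written data blocks
>       admin.modifyColumn(table, cf);
>     }
>   }
> }
> {code}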
> However, the missing piece is when all blocks are evicted after a compaction.
> We found very poor performance after compactions for tables under heavy read
> load on a slower backing filesystem (S3). After a compaction, the prefetching
> threads have to compete with threads servicing read requests and are
> constantly blocked as a result.
> This is a proposal to introduce a new cache configuration option that would
> cache blocks on write during compaction for any column family that has
> prefetching enabled. This would virtually guarantee that all blocks remain in
> cache after the initial prefetch on open completes, allowing for steady read
> performance despite a slow backing filesystem.
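> Conceptually, the change amounts to one extra condition when deciding whether
> a compaction writer should cache the blocks it produces. The sketch below is
> only an illustration of the idea, not the actual patch; the configuration key
> and the helper method name are invented for the example:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HColumnDescriptor;
>
> public class CacheCompactedBlocksSketch {
>   // Illustration only: the real key and wiring live in the attached patch.
>   static boolean cacheCompactedBlocksOnWrite(Configuration conf,
>                                              HColumnDescriptor family) {
>     boolean optionEnabled =
>         conf.getBoolean("hbase.rs.cachecompactedblocksonwrite", false);
>     // Only cache compaction output for families that prefetch on open, so the
>     // cache ends up holding the same blocks prefetching would otherwise have
>     // to re-read from the slow filesystem.
>     return optionEnabled && family.isPrefetchBlocksOnOpen();
>   }
> }
> {code}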
--
This message was sent by Atlassian Jira
(v8.3.4#803005)