Elliott Clark created HBASE-14708:
-------------------------------------
Summary: Use copy on write TreeMap for region location cache
Key: HBASE-14708
URL: https://issues.apache.org/jira/browse/HBASE-14708
Project: HBase
Issue Type: Improvement
Components: Client
Affects Versions: 1.1.2
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
Fix For: 2.0.0, 1.2.0, 1.3.0
Attachments: location_cache_times.pdf, result.csv
Internally, a co-worker profiled their application that was talking to HBase.
More than 60% of the time was spent locating a region. This was while the
cluster was stable and no regions were moving.
To figure out if there was a faster way to cache region location I wrote up a
benchmark here: https://github.com/elliottneilclark/benchmark-hbase-cache
This tries to simulate a heavy load on the location cache.
* 24 different threads.
* 2 Deleting location data
* 2 Adding location data
* Using floor to get the result.
To repeat my work, just run ./run.sh and it should produce a result.csv.
Results:
ConcurrentSkipListMap is a good middle ground. It's got equal speed for reading
and writing.
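For illustration, a floor-based lookup on a ConcurrentSkipListMap might look
roughly like the sketch below. The class name, the String keys and the String
server names are assumptions made for brevity (a real client would key on
byte[] row keys and store location objects); this is not code from the
benchmark repository.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical stand-in for a region location cache. String keys and values
// replace the byte[] row keys and location objects a real client would use.
final class SkipListLocationCache {
    // Maps region start key -> server name (both illustrative).
    private final ConcurrentSkipListMap<String, String> byStartKey =
            new ConcurrentSkipListMap<>();

    void add(String startKey, String server) {
        byStartKey.put(startKey, server);
    }

    void remove(String startKey) {
        byStartKey.remove(startKey);
    }

    // Returns the server cached for the greatest start key that is less than
    // or equal to row, or null if no such entry is cached.
    String locate(String row) {
        Map.Entry<String, String> entry = byStartKey.floorEntry(row);
        return entry == null ? null : entry.getValue();
    }
}
```

Reads and writes both go through the same concurrent structure here, which
fits the "equal speed" observation above.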
However, most operations will not need to remove or add a region location.
There will potentially be several orders of magnitude more reads of cached
locations than operations that clear the cache.
So I propose a copy on write tree map.
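A minimal sketch of the idea, under stated assumptions: the class name,
String keys and String server names are hypothetical, and this is not
necessarily how the eventual patch is implemented. Readers do a floor lookup
on an immutable snapshot without locking; writers copy the whole map, modify
the copy, and publish it through a volatile field.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical copy-on-write location cache (illustrative names and types).
final class CopyOnWriteLocationCache {
    // A published snapshot is never mutated; writers replace it wholesale.
    private volatile TreeMap<String, String> snapshot = new TreeMap<>();

    // Lock-free read: one volatile read, then a floor lookup on the snapshot.
    String locate(String row) {
        Map.Entry<String, String> entry = snapshot.floorEntry(row);
        return entry == null ? null : entry.getValue();
    }

    // Writers are serialized so concurrent updates cannot lose each other.
    synchronized void add(String startKey, String server) {
        TreeMap<String, String> copy = new TreeMap<>(snapshot);
        copy.put(startKey, server);
        snapshot = copy; // publish the new snapshot to readers
    }

    synchronized void remove(String startKey) {
        TreeMap<String, String> copy = new TreeMap<>(snapshot);
        copy.remove(startKey);
        snapshot = copy;
    }
}
```

Each write costs a full copy of the map, which is the trade being made: writes
get slower so that the far more frequent reads never take a lock.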