[
https://issues.apache.org/jira/browse/HBASE-14708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14979964#comment-14979964
]
Hiroshi Ikeda commented on HBASE-14708:
---------------------------------------
Breaking a contract is a bug in itself, and it is dangerous to allow.
{quote}
So I benchmarked that.
{quote}
Oops, sorry. I should have checked more carefully.
{quote}
If I went with the atomic reference I would have gone with the compare and
swap. It would have bloated the size of each method.
{quote}
I am more worried that the cost of creating and copying the internal map is
wasted whenever the swap fails, and that both this wasted cost and the chance
of a failed swap may grow with the size of the map.
BTW, how about using ConcurrentHashMap or something similar, with a simple
wrapper around the byte array as the key? NavigableMap implementations require
several byte comparisons per lookup.
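
A rough sketch of what I mean by a wrapper follows. ByteArrayKey and
HashLocationCache are hypothetical names, and the String value stands in for
whatever location object the cache really holds. A raw byte[] cannot be used
as a hash key directly because arrays use identity equals/hashCode, hence the
wrapper.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical wrapper giving a byte[] value-based equals/hashCode.
final class ByteArrayKey {
    private final byte[] bytes;
    private final int hash;

    ByteArrayKey(byte[] bytes) {
        this.bytes = bytes.clone();               // defensive copy keeps the key immutable
        this.hash = Arrays.hashCode(this.bytes);  // computed once, reused on every lookup
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ByteArrayKey
            && Arrays.equals(bytes, ((ByteArrayKey) o).bytes);
    }

    @Override
    public int hashCode() {
        return hash;
    }
}

// Hypothetical cache keyed by a region start key; the value type is a placeholder.
class HashLocationCache {
    private final ConcurrentHashMap<ByteArrayKey, String> cache =
        new ConcurrentHashMap<>();

    void put(byte[] startKey, String location) {
        cache.put(new ByteArrayKey(startKey), location);
    }

    String get(byte[] startKey) {
        return cache.get(new ByteArrayKey(startKey));
    }
}
```

Note that a hash map only answers exact-key lookups. It has no equivalent of
floor, so this only helps where the exact start key is already known.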
> Use copy on write TreeMap for region location cache
> ---------------------------------------------------
>
> Key: HBASE-14708
> URL: https://issues.apache.org/jira/browse/HBASE-14708
> Project: HBase
> Issue Type: Improvement
> Components: Client
> Affects Versions: 1.1.2
> Reporter: Elliott Clark
> Assignee: Elliott Clark
> Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14708-v2.patch, HBASE-14708-v3.patch,
> HBASE-14708-v4.patch, HBASE-14708-v5.patch, HBASE-14708-v6.patch,
> HBASE-14708-v7.patch, HBASE-14708.patch, location_cache_times.pdf, result.csv
>
>
> Internally a co-worker profiled their application that was talking to HBase.
> More than 60% of the time was spent locating a region. This was while the cluster
> was stable and no regions were moving.
> To figure out if there was a faster way to cache region location I wrote up a
> benchmark here: https://github.com/elliottneilclark/benchmark-hbase-cache
> This tries to simulate a heavy load on the location cache.
> * 24 different threads.
> * 2 Deleting location data
> * 2 Adding location data
> * Using floor to get the result.
> To repeat my work just run ./run.sh and it should produce a result.csv
> Results:
> ConcurrentSkipListMap is a good middle ground. It's got equal speed for
> reading and writing.
> However most operations will not need to remove or add a region location.
> There will be potentially several orders of magnitude more reads for cached
> locations than there will be on clearing the cache.
> So I propose a copy on write tree map.
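
For reference, the copy-on-write idea in the description above can be sketched
roughly as follows. This is not the code from the attached patches: the class
name is made up, String keys stand in for byte[] row keys, and writers are
serialized with synchronized rather than compare-and-swap. Reads take no lock
and use floor on an immutable snapshot.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a copy-on-write location cache; not the patch itself.
class CopyOnWriteLocationCache {
    // Readers only ever see a fully built map; it is never mutated after publication.
    private volatile TreeMap<String, String> snapshot = new TreeMap<>();

    // Read path: find the region whose start key is the greatest key <= row.
    String locate(String row) {
        Map.Entry<String, String> e = snapshot.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    // Write path: copy, modify the copy, then publish it with a volatile write.
    synchronized void add(String startKey, String location) {
        TreeMap<String, String> copy = new TreeMap<>(snapshot);
        copy.put(startKey, location);
        snapshot = copy;
    }

    synchronized void remove(String startKey) {
        TreeMap<String, String> copy = new TreeMap<>(snapshot);
        copy.remove(startKey);
        snapshot = copy;
    }
}
```

Every write pays for a full copy, which is acceptable only if reads outnumber
writes by a wide margin, as the description argues they do.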