[
https://issues.apache.org/jira/browse/HBASE-15716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stack updated HBASE-15716:
--------------------------
Attachment: Screen Shot 2016-04-27 at 9.49.35 AM.png
hits.png
15716.prune.synchronizations.v3.patch
This patch plugs the 'hole' identified in the scenario above: during scanner
creation we get the mvcc readpoint at p1, but before we can add ourselves to
the region scannerReadPoints map, the readpoint moves forward to p2; then a
call to getSmallestReadPoint comes in, and Cells between p1 and p2 are purged,
corrupting our scan 'view'.
We plug the hole by doing a check-and-put: scanner creation does not progress
until we are sure that what is registered in scannerReadPoints is the current
readpoint. If it is not, we go around again until what is in scannerReadPoints
matches the current state of the mvcc read point.
We now do two reads of an atomic long (mvcc#getReadPoint), one on either side
of the scannerReadPoints.put, in place of synchronizing across the atomic long
read and the update of the scannerReadPoints Map.
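
A minimal sketch of the check-and-put loop described above, for illustration
only; it is not the code in the attached patch. The class ReadPointRegistry
and the methods register, unregister and advanceTo are hypothetical names, the
AtomicLong stands in for mvcc#getReadPoint, and a ConcurrentHashMap stands in
for the region-scoped scannerReadPoints map. The order of reads in
getSmallestReadPoint is also an assumption.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the region state discussed in this issue.
class ReadPointRegistry {
    // Stand-in for the mvcc read point (an atomic long).
    private final AtomicLong mvccReadPoint = new AtomicLong();
    // Stand-in for scannerReadPoints: scanner -> read point it registered.
    private final ConcurrentMap<Object, Long> scannerReadPoints =
        new ConcurrentHashMap<>();

    // Register a scanner without taking a lock. Loop until the value we
    // published in the map is still the current read point.
    long register(Object scanner) {
        while (true) {
            long readPoint = mvccReadPoint.get();        // first read
            scannerReadPoints.put(scanner, readPoint);   // publish
            if (mvccReadPoint.get() == readPoint) {      // second read
                return readPoint;                        // nothing moved
            }
            // The read point advanced between the two reads; go around.
        }
    }

    void unregister(Object scanner) {
        scannerReadPoints.remove(scanner);
    }

    // Smallest read point any registered scanner may still need.
    // Assumption: read the mvcc value first, then scan the map.
    long getSmallestReadPoint() {
        long smallest = mvccReadPoint.get();
        for (long registered : scannerReadPoints.values()) {
            smallest = Math.min(smallest, registered);
        }
        return smallest;
    }

    // Move the read point forward (never backward), as a writer would.
    void advanceTo(long newReadPoint) {
        mvccReadPoint.accumulateAndGet(newReadPoint, Math::max);
    }
}
```

In this sketch a scanner that loses the race simply overwrites its own map
entry with the newer read point on the next pass, so the entry it ends up with
is the one that was current at the second read.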
The difference in throughput is pretty dramatic: 220k ops/second vs 290k
ops/second (roughly 30%). See the attached hits.png. I also include the flight
recorder recording, which shows lock incidence is gone.
Let me check my work by doing a few more runs. [~lhofhansl] what do you think
of the latest patch? Can you find a hole in it?
> HRegion#RegionScannerImpl scannerReadPoints synchronization costs
> -----------------------------------------------------------------
>
> Key: HBASE-15716
> URL: https://issues.apache.org/jira/browse/HBASE-15716
> Project: HBase
> Issue Type: Bug
> Components: Performance
> Reporter: stack
> Attachments: 15716.prune.synchronizations.patch,
> 15716.prune.synchronizations.v3.patch, Screen Shot 2016-04-26 at 2.05.45
> PM.png, Screen Shot 2016-04-26 at 2.06.14 PM.png, Screen Shot 2016-04-26 at
> 2.07.06 PM.png, Screen Shot 2016-04-26 at 2.25.26 PM.png, Screen Shot
> 2016-04-26 at 6.02.29 PM.png, Screen Shot 2016-04-27 at 9.49.35 AM.png,
> hits.png, remove_cslm.patch
>
>
> Here is a [~lhofhansl] special.
> When we construct the region scanner, we get our read point and then store it
> with the scanner instance in a Region scoped CSLM. This is done under a
> synchronize on the CSLM.
> This synchronize on a region-scoped Map when creating region scanners is the
> outstanding point of lock contention according to flight recorder (my
> workload is workload c, random reads).