[jira] [Commented] (HBASE-20236) [locking] Write-time worst offenders

2018-06-23 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16521376#comment-16521376
 ] 

stack commented on HBASE-20236:
---

Attached latest (after HBASE-20780). [^2.0623.111354.lock.svg] 

> [locking] Write-time worst offenders
> 
>
> Key: HBASE-20236
> URL: https://issues.apache.org/jira/browse/HBASE-20236
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Affects Versions: 2.0.0-beta-2
>Reporter: stack
>Priority: Major
> Attachments: 2.0623.111354.lock.svg, HBASE-20236.master.001.patch, 
> no_semaphore_vs_semaphore.png
>
>
> Messing w/ my new toy, here are the worst locking offenders; they must be bad if 
> they show up in this sampling profiler:
> {code}
>  7 Total: 769321884622 (99.24%)  samples: 2965
>   8   [ 0] java.util.concurrent.Semaphore$NonfairSync
>   9   [ 1] sun.misc.Unsafe.park
>  10   [ 2] java.util.concurrent.locks.LockSupport.park
>  11   [ 3] 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt
>  12   [ 4] 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly
>  13   [ 5] 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly
>  14   [ 6] java.util.concurrent.Semaphore.acquire
>  15   [ 7] 
> org.apache.hadoop.hbase.ipc.FastPathBalancedQueueRpcExecutor$FastPathHandler.getCallRunner
>  16   [ 8] org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run
>  17
>  18 Total: 4284274263 (0.55%)  samples: 23543
>  19   [ 0] org.apache.hadoop.hbase.regionserver.MutableSegment
>  20   [ 1] org.apache.hadoop.hbase.ByteBufferKeyValue.getSequenceId
>  21   [ 2] org.apache.hadoop.hbase.regionserver.Segment.updateMetaInfo
>  22   [ 3] org.apache.hadoop.hbase.regionserver.Segment.internalAdd
>  23   [ 4] org.apache.hadoop.hbase.regionserver.MutableSegment.add
>  24   [ 5] org.apache.hadoop.hbase.regionserver.AbstractMemStore.internalAdd
>  25   [ 6] org.apache.hadoop.hbase.regionserver.AbstractMemStore.add
>  26   [ 7] org.apache.hadoop.hbase.regionserver.AbstractMemStore.add
>  27   [ 8] org.apache.hadoop.hbase.regionserver.HStore.add
>  28   [ 9] org.apache.hadoop.hbase.regionserver.HRegion.applyToMemStore
>  29   [10] org.apache.hadoop.hbase.regionserver.HRegion.access$600
>  30   [11] 
> org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation.applyFamilyMapToMemStore
>  31   [12] 
> org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation.lambda$writeMiniBatchOperationsToMemStore$0
>  32   [13] 
> org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation$$Lambda$442.1445825895.visit
>  33   [14] 
> org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation.visitBatchOperations
>  34   [15] 
> org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation.writeMiniBatchOperationsToMemStore
>  35   [16] 
> org.apache.hadoop.hbase.regionserver.HRegion$MutationBatchOperation.writeMiniBatchOperationsToMemStore
>  36   [17] org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate
>  37   [18] org.apache.hadoop.hbase.regionserver.HRegion.batchMutate
>  38   [19] org.apache.hadoop.hbase.regionserver.HRegion.batchMutate
>  39   [20] org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp
>  40   [21] 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp
>  41   [22] 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation
>  42   [23] org.apache.hadoop.hbase.regionserver.RSRpcServices.multi
>  43   [24] 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod
>  44   [25] org.apache.hadoop.hbase.ipc.RpcServer.call
>  45   [26] org.apache.hadoop.hbase.ipc.CallRunner.run
>  46   [27] org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run
>  47   [28] org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run
>  48
>  49 Total: 717708856 (0.09%)  samples: 214
>  50   [ 0] java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync
>  51   [ 1] sun.misc.Unsafe.park
>  52   [ 2] java.util.concurrent.locks.LockSupport.park
>  53   [ 3] 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt
>  54   [ 4] java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued
>  55   [ 5] java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire
>  56   [ 6] java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock
>  57   [ 7] org.apache.hadoop.hbase.regionserver.HRegion.blockUpdates
>  58   [ 8] 
> org.apache.hadoop.hbase.regionserver.RegionServicesForStores.blockUpdates
>  59   [ 9] 
> org.apache.hadoop.hbase.regionserver.CompactingMemStore.flushInMemory
>  60   [10] 
> org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable.run
>  61   [11] java.util.concurrent.ThreadPoolExecutor.runWorker
>  62   [12] 

[jira] [Commented] (HBASE-20236) [locking] Write-time worst offenders

2018-06-23 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16521375#comment-16521375
 ] 

stack commented on HBASE-20236:
---

So, MutableSegment locking has been cleaned up in HBASE-20411. The fast path 
still shows up as a locking offender, but I just tried removing it again and, 
again, there is a slowdown when it is turned off; it looks bad in the locking 
profile, but our throughput benefits. I would be interested in making it run 
more smoothly. Let me attach a graph of the current write-lock offenders. Mostly 
reading and writing... off the socket, MVCC, and of course the RPC scheduling 
semaphore (20%).

[jira] [Commented] (HBASE-20236) [locking] Write-time worst offenders

2018-04-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16437860#comment-16437860
 ] 

stack commented on HBASE-20236:
---

I moved the MutableSegment locking to its own issue, HBASE-20411.

[jira] [Commented] (HBASE-20236) [locking] Write-time worst offenders

2018-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423678#comment-16423678
 ] 

Hadoop QA commented on HBASE-20236:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
1s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 7s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 35s{color} 
| {color:red} hbase-server generated 6 new + 182 unchanged - 6 fixed = 188 
total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
19s{color} | {color:red} hbase-server: The patch generated 2 new + 282 
unchanged - 0 fixed = 284 total (was 282) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
34s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
17m 55s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}159m 36s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}200m 50s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestPerColumnFamilyFlush |
|   | hadoop.hbase.regionserver.TestDefaultMemStore |
|   | hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController |
|   | hadoop.hbase.regionserver.TestCompactingToCellFlatMapMemStore |
|   | hadoop.hbase.regionserver.TestCompactingMemStore |
|   | hadoop.hbase.regionserver.TestHStore |
|   | hadoop.hbase.regionserver.TestWalAndCompactingMemStoreFlush |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20236 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12917304/HBASE-20236.master.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux cdfd0933ee08 4.4.0-104-generic #127-Ubuntu SMP Mon Dec 11 
12:16:42 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 219625233c |

[jira] [Commented] (HBASE-20236) [locking] Write-time worst offenders

2018-04-02 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423317#comment-16423317
 ] 

stack commented on HBASE-20236:
---

[~eshcar] This showed up in a write-only run where I'm using the default 
scheduler. This is the locking profile. We hold a synchronized lock while we 
update three volatiles every time there's an edit. I was looking at whether we 
could create an instance of the mem sizing while synchronized on the Segment 
(see the sketch after the trace below).

{code}
Total: 477709588748 (98.47%)  samples: 582970
  [ 0] org.apache.hadoop.hbase.regionserver.MutableSegment
  [ 1] org.apache.hadoop.hbase.regionserver.Segment.incSize
  [ 2] org.apache.hadoop.hbase.regionserver.Segment.updateMetaInfo
  [ 3] org.apache.hadoop.hbase.regionserver.Segment.internalAdd
  [ 4] org.apache.hadoop.hbase.regionserver.MutableSegment.add
  [ 5] org.apache.hadoop.hbase.regionserver.AbstractMemStore.internalAdd
  [ 6] org.apache.hadoop.hbase.regionserver.AbstractMemStore.add
  [ 7] org.apache.hadoop.hbase.regionserver.AbstractMemStore.add
  [ 8] org.apache.hadoop.hbase.regionserver.HStore.add
  [ 9] org.apache.hadoop.hbase.regionserver.HRegion.applyToMemStore
  [10] org.apache.hadoop.hbase.regionserver.HRegion.access$600
  [11] 
org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation.applyFamilyMapToMemStore
  [12] 
org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation.lambda$writeMiniBatchOperationsToMemStore$0
  [13] 
org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation$$Lambda$122.1325836816.visit
  [14] 
org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation.visitBatchOperations
  [15] 
org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation.writeMiniBatchOperationsToMemStore
  [16] 
org.apache.hadoop.hbase.regionserver.HRegion$MutationBatchOperation.writeMiniBatchOperationsToMemStore
  [17] org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate
  [18] org.apache.hadoop.hbase.regionserver.HRegion.batchMutate
  [19] org.apache.hadoop.hbase.regionserver.HRegion.batchMutate
  [20] org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp
  [21] org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp
  [22] 
org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation
  [23] org.apache.hadoop.hbase.regionserver.RSRpcServices.multi
  [24] 
org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod
  [25] org.apache.hadoop.hbase.ipc.RpcServer.call
  [26] org.apache.hadoop.hbase.ipc.CallRunner.run
  [27] org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run
  [28] org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run
{code}
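
To make the contention concrete, here is a minimal sketch of the pattern the profile 
points at (the class and field names here are hypothetical, not the actual Segment 
code): taking the segment monitor just to bump a few size counters on every edit, next 
to a lock-free alternative that accumulates the same counters without blocking writers. 
The lock-free variant is only an illustration of one possible direction, not the fix 
that eventually landed.
{code}
import java.util.concurrent.atomic.LongAdder;

class SegmentSizingSketch {
  // Contended variant: every edit serializes on the segment monitor just to
  // bump three counters, which is what shows up as MutableSegment lock time.
  private long dataSize, heapSize, cellCount;
  synchronized void incSizeLocked(long data, long heap) {
    dataSize += data;
    heapSize += heap;
    cellCount++;
  }

  // Lock-free variant: writers never block each other. Readers see counters
  // that are individually accurate but not an atomic triple, which is usually
  // acceptable for memstore sizing and flush decisions.
  private final LongAdder dataAdder = new LongAdder();
  private final LongAdder heapAdder = new LongAdder();
  private final LongAdder cellAdder = new LongAdder();
  void incSizeLockFree(long data, long heap) {
    dataAdder.add(data);
    heapAdder.add(heap);
    cellAdder.increment();
  }
}
{code}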

[jira] [Commented] (HBASE-20236) [locking] Write-time worst offenders

2018-03-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418474#comment-16418474
 ] 

stack commented on HBASE-20236:
---

Attached no_semaphore_vs_semaphore.png. First run is no semaphore.

Seems like improving the coordination between threads via the Semaphore could 
yield some nice gains.

[jira] [Commented] (HBASE-20236) [locking] Write-time worst offenders

2018-03-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418472#comment-16418472
 ] 

stack commented on HBASE-20236:
---

Hmm... No. The Semaphore is slightly faster (3-4%) even though it has this 
crazy-looking locking profile:
{code}
   1 Started [lock] profiling
   2 --- Execution profile ---
   3 Total samples: 2388092
   4 Skipped:   608 (0.03%)
   5
   6 Frame buffer usage:0.2034%
   7
   8 Total: 730190050914 (97.47%)  samples: 2367078
   9   [ 0] java.util.concurrent.Semaphore$NonfairSync
  10   [ 1] sun.misc.Unsafe.park
  11   [ 2] java.util.concurrent.locks.LockSupport.park
  12   [ 3] 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt
  13   [ 4] 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly
  14   [ 5] 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly
  15   [ 6] java.util.concurrent.Semaphore.acquire
  16   [ 7] 
org.apache.hadoop.hbase.ipc.FastPathBalancedQueueRpcExecutor$FastPathHandler.getCallRunner
  17   [ 8] org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run
  18
  19 Total: 15140093209 (2.02%)  samples: 1
  20   [ 0] java.lang.Object
  21   [ 1] sun.nio.ch.ServerSocketChannelImpl.accept
  22   [ 2] org.eclipse.jetty.server.ServerConnector.accept
  23   [ 3] org.eclipse.jetty.server.AbstractConnector$Acceptor.run
  24   [ 4] org.eclipse.jetty.util.thread.QueuedThreadPool.runJob
  25   [ 5] org.eclipse.jetty.util.thread.QueuedThreadPool$2.run
  26   [ 6] java.lang.Thread.run
  27
  28 Total: 1256664266 (0.17%)  samples: 4945
  29   [ 0] java.util.concurrent.ConcurrentHashMap
  30   [ 1] 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.
  31   [ 2] 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.
  32   [ 3] 
org.apache.hadoop.hbase.regionserver.ReversedRegionScannerImpl.
  33   [ 4] 
org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner
  34   [ 5] org.apache.hadoop.hbase.regionserver.HRegion.getScanner
  35   [ 6] org.apache.hadoop.hbase.regionserver.HRegion.getScanner
  36   [ 7] org.apache.hadoop.hbase.regionserver.HRegion.getScanner
  37   [ 8] org.apache.hadoop.hbase.regionserver.RSRpcServices.get
  38   [ 9] org.apache.hadoop.hbase.regionserver.RSRpcServices.get
  39   [10] 
org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod
  40   [11] org.apache.hadoop.hbase.ipc.RpcServer.call
  41   [12] org.apache.hadoop.hbase.ipc.CallRunner.run
  42   [13] org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run
  43   [14] org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run
  44
  45 Total: 662087825 (0.09%)  samples: 3570
  46   [ 0] org.apache.hadoop.hdfs.PeerCache
  47   [ 1] com.google.common.collect.LinkedListMultimap.size
  48   [ 2] org.apache.hadoop.hdfs.PeerCache.putInternal
  49   [ 3] org.apache.hadoop.hdfs.PeerCache.put
  50   [ 4] org.apache.hadoop.hdfs.RemoteBlockReader2.close
  51   [ 5] org.apache.hadoop.hdfs.DFSInputStream.actualGetFromOneDataNode
  52   [ 6] org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange
  53   [ 7] org.apache.hadoop.hdfs.DFSInputStream.pread
  54   [ 8] org.apache.hadoop.hdfs.DFSInputStream.read
  55   [ 9] org.apache.hadoop.fs.FSDataInputStream.read
  56   [10] org.apache.hadoop.hbase.io.hfile.HFileBlock.positionalReadWithExtra
  57   [11] 
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readAtOffset
  58   [12] 
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal
  59   [13] 
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData
  60   [14] org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock
  61   [15] 
org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo
  62   [16] 
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo
  63   [17] 
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo
  64   [18] org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter
  65   [19] org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek
  66   [20] org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners
  67   [21] org.apache.hadoop.hbase.regionserver.StoreScanner.
  68   [22] org.apache.hadoop.hbase.regionserver.HStore.createScanner
  69   [23] org.apache.hadoop.hbase.regionserver.HStore.getScanner
  70   [24] 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners
  71   [25] 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.
  72   [26] 
org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner
  73   [27] org.apache.hadoop.hbase.regionserver.HRegion.getScanner
  74   [28] org.apache.hadoop.hbase.regionserver.HRegion.getScanner
  75   [29] org.apache.hadoop.hbase.regionserver.HRegion.getScanner
  76   [30] 

[jira] [Commented] (HBASE-20236) [locking] Write-time worst offenders

2018-03-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418453#comment-16418453
 ] 

stack commented on HBASE-20236:
---

I just tried a run where we skip the Semaphore and just go to the queue 
instead. Seems to run faster: about 78k ops/second vs 73k ops/second doing workloadc 
(100% reads). 1.2.7 does 87k ops/second.

Here is the patch:
{code}
diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/FastPathBalancedQueueRpcExecutor.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/FastPat
index eaea34d..4a65ada 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/FastPathBalancedQueueRpcExecutor.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/FastPathBalancedQueueRpcExecutor.java
@@ -80,7 +80,7 @@ public class FastPathBalancedQueueRpcExecutor extends BalancedQueueRpcExecutor {
 // if an empty queue of CallRunners so we are available for direct handoff when one comes in.
 final Deque fastPathHandlerStack;
 // Semaphore to coordinate loading of fastpathed loadedTask and our running it.
-private Semaphore semaphore = new Semaphore(0);
+// private Semaphore semaphore = new Semaphore(0);
 // The task we get when fast-pathing.
 private CallRunner loadedCallRunner;

@@ -98,6 +98,7 @@ public class FastPathBalancedQueueRpcExecutor extends BalancedQueueRpcExecutor {
   if (cr == null) {
 // Else, if a fastPathHandlerStack present and no callrunner in Q, register ourselves for
 // the fastpath handoff done via fastPathHandlerStack.
+/*
 if (this.fastPathHandlerStack != null) {
   this.fastPathHandlerStack.push(this);
   this.semaphore.acquire();
@@ -105,8 +106,11 @@ public class FastPathBalancedQueueRpcExecutor extends BalancedQueueRpcExecutor {
   this.loadedCallRunner = null;
 } else {
   // No fastpath available. Block until a task comes available.
+  // */
   cr = super.getCallRunner();
+/*
 }
+*/
   }
   return cr;
 }
@@ -116,9 +120,12 @@ public class FastPathBalancedQueueRpcExecutor extends BalancedQueueRpcExecutor {
  * @return True if we successfully loaded our task
  */
 boolean loadCallRunner(final CallRunner cr) {
+  /*
   this.loadedCallRunner = cr;
   this.semaphore.release();
   return true;
+  */
+  return false;
 }
   }
 }
{code}

YCSB figures:
{code}
 {
  "metric" : "OVERALL",
  "measurement" : "RunTime(ms)",
  "value" : 1200135
}, {
  "metric" : "OVERALL",
  "measurement" : "Throughput(ops/sec)",
  "value" : 78433.52456182013
},
{code}

So, is the Semaphore broken, or did I imagine a speedup from the fastpath?

Here is what the locking profile looks like w/ no Semaphore:

{code}
   1 Started [lock] profiling
   2 --- Execution profile ---
   3 Total samples: 113365
   4
   5 Frame buffer usage:0.173%
   6
   7 Total: 15130846044 (61.74%)  samples: 1
   8   [ 0] java.lang.Object
   9   [ 1] sun.nio.ch.ServerSocketChannelImpl.accept
  10   [ 2] org.eclipse.jetty.server.ServerConnector.accept
  11   [ 3] org.eclipse.jetty.server.AbstractConnector$Acceptor.run
  12   [ 4] org.eclipse.jetty.util.thread.QueuedThreadPool.runJob
  13   [ 5] org.eclipse.jetty.util.thread.QueuedThreadPool$2.run
  14   [ 6] java.lang.Thread.run
  15
  16 Total: 3752193080 (15.31%)  samples: 44234
  17   [ 0] java.util.concurrent.locks.ReentrantLock$NonfairSync
  18   [ 1] sun.misc.Unsafe.park
  19   [ 2] java.util.concurrent.locks.LockSupport.park
  20   [ 3] 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt
  21   [ 4] java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued
  22   [ 5] java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire
  23   [ 6] java.util.concurrent.locks.ReentrantLock$NonfairSync.lock
  24   [ 7] java.util.concurrent.locks.ReentrantLock.lock
  25   [ 8] java.util.concurrent.LinkedBlockingQueue.poll
  26   [ 9] 
org.apache.hadoop.hbase.ipc.FastPathBalancedQueueRpcExecutor$FastPathHandler.getCallRunner
  27   [10] org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run
  28
  29 Total: 1277795387 (5.21%)  samples: 4895
  30   [ 0] java.util.concurrent.ConcurrentHashMap
  31   [ 1] org.apache.hadoop.hbase.regionserver.HRegion.getScanner
  32   [ 2] org.apache.hadoop.hbase.regionserver.HRegion.getScanner
  33   [ 3] org.apache.hadoop.hbase.regionserver.HRegion.getScanner
  34   [ 4] org.apache.hadoop.hbase.regionserver.RSRpcServices.get
  35   [ 5] org.apache.hadoop.hbase.regionserver.RSRpcServices.get
  36   [ 6] 
org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod
  37   [ 7] org.apache.hadoop.hbase.ipc.RpcServer.call
  38   [ 8] org.apache.hadoop.hbase.ipc.CallRunner.run
  39   [ 9] org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run
  40   [10] 

[jira] [Commented] (HBASE-20236) [locking] Write-time worst offenders

2018-03-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417809#comment-16417809
 ] 

stack commented on HBASE-20236:
---

[~eshcar] Thank you. Funny, I was looking at this yesterday. Sorry the doc is 
poor. I was trying to read what was there yesterday and came to the same conclusion 
about my hackery. The class comment tries to give the high-level notion of what 
is going on here; i.e. if a Handler is available when a Call is queued, skip 
the queue and pass the Call directly to the Handler so it can run it 
immediately, avoiding queuing and coordination costs; otherwise, add the Call to the 
queue for processing later when a Handler is free. See HBASE-16023 for 
measurements and some interesting commentary by [~ikeda], a contributor who is sharp 
on concurrency.

The Semaphore is meant to coordinate the Reader's hand-off of the Call to 
the Handler. The Handler may be occupied running an existing Call. If you 
remove the Semaphore, I'm afraid a Handler could be asked to run a Call while 
in the midst of running another. Yesterday I spent five minutes trying to 
figure out whether there was a way we could purge the Semaphore. BTW, this hand-off 
from Readers to Handlers is our most expensive operation when a read-only workload 
is served mostly from cache; having the Readers run the Call rather than pass it to a 
Handler at least doubles our throughput (but it means there is no scheduling).
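
For illustration only, here is a rough sketch of the hand-off idea the zero-permit 
Semaphore enables (this is not the actual FastPathBalancedQueueRpcExecutor code; the 
names HandoffHandlerSketch, loadTask, and take are made up): an idle handler parks on 
the Semaphore until a reader loads a task and releases the permit.
{code}
import java.util.concurrent.Semaphore;

class HandoffHandlerSketch {
  // Zero permits: take() blocks until some reader thread calls loadTask().
  private final Semaphore ready = new Semaphore(0);
  private volatile Runnable task;

  // Reader side: found this handler idle (e.g. popped off a stack of waiters),
  // so hand it the work directly and wake it.
  boolean loadTask(Runnable r) {
    this.task = r;
    this.ready.release();
    return true;
  }

  // Handler side: nothing queued, so park here until a reader hands us a task.
  Runnable take() throws InterruptedException {
    ready.acquire();
    Runnable r = this.task;
    this.task = null;
    return r;
  }
}
{code}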

I used to be intimate with the machinations here but would need to spend time 
to refresh (and improve the doc). Any input would be much appreciated (I could even 
try stuff if you want me to -- I'm doing perf work generally these days... 
currently making the report you asked for over on HBASE-20188).


[jira] [Commented] (HBASE-20236) [locking] Write-time worst offenders

2018-03-28 Thread Eshcar Hillel (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417228#comment-16417228
 ] 

Eshcar Hillel commented on HBASE-20236:
---

I can take a look at this.
I see the semaphore used in {{FastPathBalancedQueueRpcExecutor}} is initialized 
with zero permits, so acquire succeeds only after a release has been executed; 
otherwise the thread is blocked (parking). I am trying to understand why the 
semaphore is needed and what it protects against. It is mostly related to 
{{loadedCallRunner}}, but there is no documentation in the code of what exactly 
is going on. I wonder what happens if I simply remove the semaphore? [~stack] 
you are the one who signed off on this code; can you explain it, or do you know 
someone who can, or have a pointer to a blog or any documentation?
