[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98
[ https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Biju Nair updated HBASE-15971:
------------------------------
    Component/s: Scheduler

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> ---------------------------------------------------------
>
>                 Key: HBASE-15971
>                 URL: https://issues.apache.org/jira/browse/HBASE-15971
>             Project: HBase
>          Issue Type: Sub-task
>          Components: rpc, Scheduler
>    Affects Versions: 1.3.0, 2.0.0
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 1.3.0, 2.0.0
>
>         Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch,
> HBASE-15971.branch-1.002.patch, Screen Shot 2016-06-10 at 5.08.24 PM.png,
> Screen Shot 2016-06-10 at 5.08.26 PM.png, branch-1.hits.png, branch-1.png,
> flight_recording_10172402220203_28.branch-1.jfr,
> flight_recording_10172402220203_29.09820.0.98.20.jfr, handlers.fp.png,
> hits.fp.png, hits.patched1.0.vs.unpatched1.0.vs.098.png, run_ycsb.sh
>
>
> branch-1 is slower than 0.98 doing YCSB random read/workloadC. It seems to be
> doing about 1/2 the throughput of 0.98.
> In branch-1, we have low handler occupancy compared to 0.98. Hacking in
> reader thread occupancy metric, is about the same in both. In parent issue,
> hacking out the scheduler, I am able to get branch-1 to go 3x faster so will
> dig in here.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98
stack updated HBASE-15971:
--------------------------
       Resolution: Fixed
    Fix Version/s: 1.3.0
                   2.0.0
           Status: Resolved  (was: Patch Available)

Pushed to branch-1.3+. The test failure looks unrelated and passes locally. Will revert if it shows up again. Thanks for the reviews.
[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98
stack updated HBASE-15971:
--------------------------
         Hadoop Flags: Incompatible change,Reviewed
         Release Note: Change the default RPC scheduler from 'deadline' to 'fifo' so that it is the same as in branch 0.98. 'deadline' was of questionable benefit and carries a high scheduling cost. To re-enable 'deadline', set hbase.ipc.server.callqueue.type to 'deadline' in your hbase-site.xml.
    Affects Version/s: 1.3.0
                       2.0.0
               Status: Patch Available  (was: Open)
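For reference, a minimal hbase-site.xml fragment for the setting named in the release note might look like the sketch below. The property name and the 'deadline' value are taken from the note; the surrounding `<property>` layout is the usual Hadoop-style configuration form, and where the fragment sits in an existing file is left to the operator.

```xml
<!-- Sketch only: re-enable the 'deadline' RPC scheduler described in the release note. -->
<property>
  <name>hbase.ipc.server.callqueue.type</name>
  <value>deadline</value>
</property>
```

Leaving the property unset keeps the new 'fifo' default described above.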
[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98
stack updated HBASE-15971:
--------------------------
    Attachment: HBASE-15971.branch-1.002.patch
[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98
stack updated HBASE-15971:
--------------------------
    Attachment: flight_recording_10172402220203_28.branch-1.jfr

For branch-1.
[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98
stack updated HBASE-15971:
--------------------------
    Attachment: flight_recording_10172402220203_29.09820.0.98.20.jfr

Attaching 0.98.20 JFR.
[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98
stack updated HBASE-15971:
--------------------------
    Attachment: Screen Shot 2016-06-10 at 5.08.24 PM.png
                Screen Shot 2016-06-10 at 5.08.26 PM.png

Comparing JFR output under the same load, the block seek takes more CPU when it is passed the 1.0 Cell, and the heavy use of thread locals in 1.0 also seems to cost.

On the other hand, the locking/contention profile looks worse for 0.98 than for 1.0, with more time lost waiting on locks. 0.98 spends more time than 1.0 waiting on the regionscanner registration lock, and it also has the LinkedList blocking when doing a response.
[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98
stack updated HBASE-15971:
--------------------------
    Attachment: hits.patched1.0.vs.unpatched1.0.vs.098.png

One difference is the sort by priority in the scheduler, in SimpleRpcScheduler. In 1.0 we do the following as our default scheduler:

{code}
CallPriorityComparator callPriority = new CallPriorityComparator(conf, this.priority);
callExecutor = new BalancedQueueRpcExecutor("B.default", handlerCount, numCallQueues,
    conf, abortable, BoundedPriorityBlockingQueue.class, maxQueueLength, callPriority);
{code}

In 0.98 we do:

{code}
callExecutor = new BalancedQueueRpcExecutor("B.Default", handlerCount,
    numCallQueues, maxQueueLength, conf, abortable);
{code}

In the graph, you see three humps:

* The first is branch-1 with the same default as 0.98. It does 290k with 24% of CPU idle.
* The second is the branch-1 default. It does 210k with 40% of CPU idle.
* The third is the 0.98 default, with 21% of CPU idle.

For the record, the load is workloadc using asynchbase (because it seems to be able to put up more load):

{code}
# Start 25 rounds of YCSB clients on every host listed in /tmp/slaves.
for round in $(seq 0 24); do
  for host in $(cat /tmp/slaves); do
    echo "$host"
    ssh "$host" "sh -c 'nohup ./bin/run_ycsb.sh > /dev/null 2>&1 &'"
  done
done
{code}

The script is attached (stolen from Busbey).
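To make the difference between the two queue types concrete, here is a small, self-contained sketch using only JDK classes. It is not HBase code: `Call` is a hypothetical stand-in for an RPC call, and `java.util.concurrent.PriorityBlockingQueue` is used in place of HBase's `BoundedPriorityBlockingQueue`. It only shows how the dequeue order differs; it does not measure the scheduling cost discussed above. The JDK priority queue is heap-backed, so each insert does comparator work that a linked FIFO queue avoids, which is the kind of overhead the comment is pointing at.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.PriorityBlockingQueue;

class QueueOrderSketch {
    /** Hypothetical stand-in for an RPC call: arrival sequence number plus a deadline. */
    static final class Call {
        final int seq;
        final long deadline;
        Call(int seq, long deadline) { this.seq = seq; this.deadline = deadline; }
    }

    /** Three calls in arrival order; the earliest arrival has the latest deadline. */
    static List<Call> sampleCalls() {
        List<Call> calls = new ArrayList<>();
        calls.add(new Call(0, 300L));
        calls.add(new Call(1, 100L));
        calls.add(new Call(2, 200L));
        return calls;
    }

    /** Offers the calls in arrival order, then drains the queue and records the seq order. */
    static List<Integer> drainOrder(BlockingQueue<Call> queue, List<Call> calls) {
        for (Call c : calls) {
            queue.offer(c);
        }
        List<Integer> order = new ArrayList<>();
        Call next;
        while ((next = queue.poll()) != null) {
            order.add(next.seq);
        }
        return order;
    }

    /** FIFO queue: calls come out in the order they arrived. */
    static List<Integer> fifoOrder() {
        return drainOrder(new LinkedBlockingQueue<Call>(), sampleCalls());
    }

    /** Priority queue sorted by deadline: every insert runs the comparator. */
    static List<Integer> deadlineOrder() {
        Comparator<Call> byDeadline = Comparator.comparingLong((Call c) -> c.deadline);
        return drainOrder(new PriorityBlockingQueue<Call>(11, byDeadline), sampleCalls());
    }

    public static void main(String[] args) {
        System.out.println("fifo:     " + fifoOrder());     // [0, 1, 2]
        System.out.println("deadline: " + deadlineOrder()); // [1, 2, 0]
    }
}
```

For a uniform random-read workload such as workloadc, where no call obviously deserves to jump the queue, the reordering buys little, which is consistent with the observation above that the 0.98-style default runs faster.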
[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98
stack updated HBASE-15971:
--------------------------
    Attachment: run_ycsb.sh

Busbey's script, hacked.
[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98
stack updated HBASE-15971:
--------------------------
    Attachment: handlers.fp.png

Handlers are still erratic but the peaks are higher. When the queue backs up, the peaks fall as though we are 'off' the fast path.

Throughput seems sensitive to the number of Readers. With 6, as in this case, we go fast. With 12, we slow down. It seems the Readers also act as a bit of a bottleneck, which means the Handlers get to take the fast path more often.

If I drive even more load, ops go up to 320k, with 15% idle, so there is more to be had here. Work still to do.
[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98
stack updated HBASE-15971:
--------------------------
    Attachment: hits.fp.png

Here are the hits I was getting with this patch. This is with 6 Readers and 48 Handlers (I have 48 CPUs). The extreme right of the graph is when I had 48 Readers and 48 Handlers.
[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98
stack updated HBASE-15971:
--------------------------
    Attachment: HBASE-15971.branch-1.001.patch
[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98
stack updated HBASE-15971:
--------------------------
    Attachment: 098.png
                branch-1.png
                branch-1.hits.png
                098.hits.png

Here are graphs of me running ycsb (both asynchbase and hbase10) on 8 servers pounding a single node carrying all regions. See how we do about 125k in branch-1 and 300k in 0.98. See also how the handler occupancy is lower in branch-1.