[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2019-05-17 Thread Biju Nair (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Biju Nair updated HBASE-15971:
--
Component/s: Scheduler

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc, Scheduler
>Affects Versions: 1.3.0, 2.0.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 1.3.0, 2.0.0
>
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> HBASE-15971.branch-1.002.patch, Screen Shot 2016-06-10 at 5.08.24 PM.png, 
> Screen Shot 2016-06-10 at 5.08.26 PM.png, branch-1.hits.png, branch-1.png, 
> flight_recording_10172402220203_28.branch-1.jfr, 
> flight_recording_10172402220203_29.09820.0.98.20.jfr, handlers.fp.png, 
> hits.fp.png, hits.patched1.0.vs.unpatched1.0.vs.098.png, run_ycsb.sh
>
>
> branch-1 is slower than 0.98 doing YCSB random read/workloadC. It seems to be 
> doing about 1/2 the throughput of 0.98.
> In branch-1, we have low handler occupancy compared to 0.98. After hacking in 
> a reader thread occupancy metric, reader occupancy is about the same in both. 
> In the parent issue, with the scheduler hacked out, I am able to get branch-1 
> to go 3x faster, so I will dig in here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-14 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15971:
--
   Resolution: Fixed
Fix Version/s: 1.3.0, 2.0.0
   Status: Resolved  (was: Patch Available)

Pushed to branch-1.3+. The test failure looks unrelated and passes locally. Will 
revert if it shows again. Thanks for the reviews.

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Affects Versions: 2.0.0, 1.3.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> HBASE-15971.branch-1.002.patch, Screen Shot 2016-06-10 at 5.08.24 PM.png, 
> Screen Shot 2016-06-10 at 5.08.26 PM.png, branch-1.hits.png, branch-1.png, 
> flight_recording_10172402220203_28.branch-1.jfr, 
> flight_recording_10172402220203_29.09820.0.98.20.jfr, handlers.fp.png, 
> hits.fp.png, hits.patched1.0.vs.unpatched1.0.vs.098.png, run_ycsb.sh


[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-14 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15971:
--
 Hadoop Flags: Incompatible change,Reviewed
 Release Note: Change the default rpc scheduler from 'deadline' to 'fifo' so it 
is the same as in branch 0.98. 'deadline' was of questionable benefit but had a 
high scheduling cost. To re-enable 'deadline', set 
hbase.ipc.server.callqueue.type to 'deadline' in your hbase-site.xml.
Affects Version/s: 1.3.0, 2.0.0
   Status: Patch Available  (was: Open)
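
For reference, the release note's instruction would look roughly like this in 
hbase-site.xml. This is a minimal sketch: the property name and value come 
from the release note itself, and the fragment is assumed to sit inside the 
file's existing <configuration> element.

```xml
<!-- Re-enable the 'deadline' rpc scheduler (the new default is 'fifo'). -->
<property>
  <name>hbase.ipc.server.callqueue.type</name>
  <value>deadline</value>
</property>
```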

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Affects Versions: 2.0.0, 1.3.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> HBASE-15971.branch-1.002.patch, Screen Shot 2016-06-10 at 5.08.24 PM.png, 
> Screen Shot 2016-06-10 at 5.08.26 PM.png, branch-1.hits.png, branch-1.png, 
> flight_recording_10172402220203_28.branch-1.jfr, 
> flight_recording_10172402220203_29.09820.0.98.20.jfr, handlers.fp.png, 
> hits.fp.png, hits.patched1.0.vs.unpatched1.0.vs.098.png, run_ycsb.sh


[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-13 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15971:
--
Attachment: HBASE-15971.branch-1.002.patch

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> HBASE-15971.branch-1.002.patch, Screen Shot 2016-06-10 at 5.08.24 PM.png, 
> Screen Shot 2016-06-10 at 5.08.26 PM.png, branch-1.hits.png, branch-1.png, 
> flight_recording_10172402220203_28.branch-1.jfr, 
> flight_recording_10172402220203_29.09820.0.98.20.jfr, handlers.fp.png, 
> hits.fp.png, hits.patched1.0.vs.unpatched1.0.vs.098.png, run_ycsb.sh


[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-11 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15971:
--
Attachment: flight_recording_10172402220203_28.branch-1.jfr

For branch-1

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> Screen Shot 2016-06-10 at 5.08.24 PM.png, Screen Shot 2016-06-10 at 5.08.26 
> PM.png, branch-1.hits.png, branch-1.png, 
> flight_recording_10172402220203_28.branch-1.jfr, 
> flight_recording_10172402220203_29.09820.0.98.20.jfr, handlers.fp.png, 
> hits.fp.png, hits.patched1.0.vs.unpatched1.0.vs.098.png, run_ycsb.sh


[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-11 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15971:
--
Attachment: flight_recording_10172402220203_29.09820.0.98.20.jfr

Attaching 0.98.20 JFR

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> Screen Shot 2016-06-10 at 5.08.24 PM.png, Screen Shot 2016-06-10 at 5.08.26 
> PM.png, branch-1.hits.png, branch-1.png, 
> flight_recording_10172402220203_29.09820.0.98.20.jfr, handlers.fp.png, 
> hits.fp.png, hits.patched1.0.vs.unpatched1.0.vs.098.png, run_ycsb.sh


[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15971:
--
Attachment: Screen Shot 2016-06-10 at 5.08.24 PM.png
Screen Shot 2016-06-10 at 5.08.26 PM.png

Comparing JFR output under the same load, the block seek takes more CPU when 
passed the 1.0 Cell, and the heavy use of thread locals in 1.0 also seems to 
cost. On the other hand, the locking/contention profile looks worse for 0.98 
than for 1.0, with more time lost waiting on locks. 0.98 spends more time than 
1.0 waiting on the regionscanner registration lock, and it has the LinkedList 
blocking when doing a response.

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> Screen Shot 2016-06-10 at 5.08.24 PM.png, Screen Shot 2016-06-10 at 5.08.26 
> PM.png, branch-1.hits.png, branch-1.png, handlers.fp.png, hits.fp.png, 
> hits.patched1.0.vs.unpatched1.0.vs.098.png, run_ycsb.sh


[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15971:
--
Attachment: hits.patched1.0.vs.unpatched1.0.vs.098.png

One difference is the sort in the scheduler by priority in SimpleRpcScheduler.

In 1.0 we do the following as our default scheduler:
{code}
CallPriorityComparator callPriority = new CallPriorityComparator(conf, this.priority);
callExecutor = new BalancedQueueRpcExecutor("B.default", handlerCount, numCallQueues,
  conf, abortable, BoundedPriorityBlockingQueue.class, maxQueueLength, callPriority);
{code}

In 0.98 we do:

{code}
callExecutor = new BalancedQueueRpcExecutor("B.Default", handlerCount,
  numCallQueues, maxQueueLength, conf, abortable);
{code}
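
To make the difference between the two defaults concrete, here is a small 
standalone sketch of what a priority-sorted call queue does that a plain FIFO 
queue does not. It is not HBase code: the Call class and its deadline field 
are hypothetical, java.util.concurrent.PriorityBlockingQueue stands in for 
BoundedPriorityBlockingQueue, and a simple deadline comparator stands in for 
CallPriorityComparator. The ordered queue has to run comparisons on every 
insert and removal, which is the kind of per-call work the FIFO queue avoids; 
the sketch only shows the ordering, it does not measure the cost.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.PriorityBlockingQueue;

public class CallQueueOrderSketch {
    /** Hypothetical stand-in for an RPC call; not an HBase class. */
    static final class Call {
        final String name;
        final long deadline; // assumed priority key: smaller is served first

        Call(String name, long deadline) {
            this.name = name;
            this.deadline = deadline;
        }
    }

    /** Three calls in arrival order: a scan first, then two gets. */
    static List<Call> sampleCalls() {
        List<Call> calls = new ArrayList<>();
        calls.add(new Call("scan", 300));
        calls.add(new Call("get-a", 100));
        calls.add(new Call("get-b", 200));
        return calls;
    }

    /** Offers every call to the queue, then drains it and records the order. */
    static List<String> drainOrder(BlockingQueue<Call> queue, List<Call> calls) {
        for (Call call : calls) {
            queue.offer(call);
        }
        List<String> order = new ArrayList<>();
        Call next;
        while ((next = queue.poll()) != null) {
            order.add(next.name);
        }
        return order;
    }

    /** FIFO queue: calls come out in arrival order, no comparisons needed. */
    public static List<String> fifoOrder() {
        BlockingQueue<Call> queue = new LinkedBlockingQueue<>();
        return drainOrder(queue, sampleCalls());
    }

    /** Priority queue: each offer/poll reorders a heap using the comparator. */
    public static List<String> priorityOrder() {
        BlockingQueue<Call> queue = new PriorityBlockingQueue<>(
            11, Comparator.comparingLong((Call c) -> c.deadline));
        return drainOrder(queue, sampleCalls());
    }

    public static void main(String[] args) {
        System.out.println("fifo:     " + fifoOrder());
        System.out.println("priority: " + priorityOrder());
    }
}
```

Running main prints the arrival order (scan, get-a, get-b) for the FIFO queue 
and the deadline order (get-a, get-b, scan) for the priority queue.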

In the graph, you see three humps. The first is branch-1 with the same default 
as 0.98. It does 290k with 24% idle. Next is branch-1 default. It does 210k 
with 40% of cpu idle. The third hump is default 0.98 with 21% of cpu idle.

Loading for the record is workloadc using asynchbase (because it seems to be 
able to put up more load):

{code}
# Start 25 YCSB clients on every host listed in /tmp/slaves.
% for n in $(seq 0 24); do
    for host in $(cat /tmp/slaves); do
      echo "$host"
      ssh "$host" "sh -c 'nohup ./bin/run_ycsb.sh > /dev/null 2>&1 &'"
    done
  done
{code}

The script is attached (stolen from Busbey)



> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> branch-1.hits.png, branch-1.png, handlers.fp.png, hits.fp.png, 
> hits.patched1.0.vs.unpatched1.0.vs.098.png, run_ycsb.sh


[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15971:
--
Attachment: run_ycsb.sh

Busbey script hacked.

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> branch-1.hits.png, branch-1.png, handlers.fp.png, hits.fp.png, 
> hits.patched1.0.vs.unpatched1.0.vs.098.png, run_ycsb.sh


[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-09 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15971:
--
Attachment: handlers.fp.png

Handlers are still erratic but the peaks are higher. When the queue backs up, 
the peaks fall as though we are 'off' the fast path. It seems sensitive to the 
number of Readers: with 6, as in this case, we go fast; with 12, we slow down. 
It seems the Readers also act as a bit of a bottleneck, which means the 
Handlers get to take the fast path more often. If I drive even more load, ops 
go up to 320k. That is at 15% idle so there is more to be had here.

Work still to do.

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> branch-1.hits.png, branch-1.png, handlers.fp.png, hits.fp.png


[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-09 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15971:
--
Attachment: hits.fp.png

Here are the hits I was getting w/ this patch... This is 6 Readers and 48 
Handlers (I have 48 cpus). The extreme right is when I had 48 Readers and 48 
Handlers.
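
A 6 Reader / 48 Handler setup like the one above could be expressed in 
hbase-site.xml roughly as follows. This is a sketch under assumptions: the 
message does not say how the counts were set, and the two property names are 
assumed to be the ones controlling the RPC reader pool and the handler count 
in this HBase line, so check them against your version before relying on them.

```xml
<!-- Assumed property for the number of RPC Reader threads. -->
<property>
  <name>hbase.ipc.server.read.threadpool.size</name>
  <value>6</value>
</property>
<!-- Assumed property for the number of RPC Handler threads. -->
<property>
  <name>hbase.regionserver.handler.count</name>
  <value>48</value>
</property>
```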

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> branch-1.hits.png, branch-1.png, hits.fp.png


[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-09 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15971:
--
Attachment: HBASE-15971.branch-1.001.patch

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> branch-1.hits.png, branch-1.png


[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-06 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15971:
--
Attachment: 098.png
branch-1.png
branch-1.hits.png
098.hits.png

Here are graphs of me running ycsb -- both asynchbase and hbase10 -- on 8 
servers pounding a single node carrying all regions. See how we do about 125k 
in branch-1 and 300k in 0.98. See how the handler occupancy is less in branch-1.

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: 098.hits.png, 098.png, branch-1.hits.png, branch-1.png