[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15391276#comment-15391276 ]

Hadoop QA commented on HBASE-14479:
---
(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 10m 17s | branch-1 passed |
| +1 | compile | 0m 37s | branch-1 passed with JDK v1.8.0 |
| +1 | compile | 0m 35s | branch-1 passed with JDK v1.7.0_80 |
| +1 | checkstyle | 1m 18s | branch-1 passed |
| +1 | mvneclipse | 0m 32s | branch-1 passed |
| +1 | findbugs | 1m 52s | branch-1 passed |
| +1 | javadoc | 0m 42s | branch-1 passed with JDK v1.8.0 |
| +1 | javadoc | 0m 32s | branch-1 passed with JDK v1.7.0_80 |
| +1 | mvninstall | 0m 45s | the patch passed |
| +1 | compile | 0m 38s | the patch passed with JDK v1.8.0 |
| +1 | javac | 0m 38s | the patch passed |
| +1 | compile | 0m 35s | the patch passed with JDK v1.7.0_80 |
| +1 | javac | 0m 35s | the patch passed |
| +1 | checkstyle | 0m 58s | the patch passed |
| +1 | mvneclipse | 0m 21s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| -1 | hadoopcheck | 14m 35s | Patch causes 11 errors with Hadoop v2.6.1. |
| -1 | hadoopcheck | 16m 20s | Patch causes 11 errors with Hadoop v2.6.2. |
| -1 | hadoopcheck | 18m 3s | Patch causes 11 errors with Hadoop v2.6.3. |
| -1 | hadoopcheck | 19m 47s | Patch causes 11 errors with Hadoop v2.7.1. |
| +1 | findbugs | 2m 9s | the patch passed |
| +1 | javadoc | 0m 25s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 33s | the patch passed with JDK v1.7.0_80 |
| -1 | unit | 85m 30s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 25s | Patch does not generate ASF License warnings. |
| | | 128m 56s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.ipc.TestRpcClientLeaks |
| | hadoop.hbase.procedure.TestProcedureManager |
| | hadoop.hbase.master.balancer.TestRegionLocationFinder |
| Timed out junit tests | org.apache.hadoop.hbase.ipc.TestAsyncIPC |
| | org.apache.hadoop.hbase.ipc.TestIPC |
| | org.apache.hadoop.hbase.security.TestAsyncSecureIPC |
| | org.apache.hadoop.hbase.security.TestSecureIPC |

|| Subsystem || Report/Notes ||
| JIRA Patch
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389724#comment-15389724 ]

Hadoop QA commented on HBASE-14479:
---
(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| -1 | patch | 0m 3s | HBASE-14479 does not apply to master. Rebase required? Wrong branch? See https://yetus.apache.org/documentation/0.2.1/precommit-patchnames for help. |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12819646/HBASE-14479-V4-experimental_branch-1.patch |
| JIRA Issue | HBASE-14479 |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/2729/console |
| Powered by | Apache Yetus 0.2.1 http://yetus.apache.org |

This message was automatically generated.

> Apply the Leader/Followers pattern to RpcServer's Reader
>
> Key: HBASE-14479
> URL: https://issues.apache.org/jira/browse/HBASE-14479
> Project: HBase
> Issue Type: Improvement
> Components: IPC/RPC, Performance
> Reporter: Hiroshi Ikeda
> Assignee: Hiroshi Ikeda
> Priority: Minor
> Attachments: HBASE-14479-V2 (1).patch, HBASE-14479-V2.patch, HBASE-14479-V2.patch, HBASE-14479-V3-experimental_branch-1.patch, HBASE-14479-V4-experimental_branch-1.patch, HBASE-14479.patch, flamegraph-19152.svg, flamegraph-32667.svg, gc.png, gets.png, io.png, median.png
>
> {{RpcServer}} uses multiple selectors to read data for load distribution, but the distribution is just done by round-robin. It is uncertain, especially over a long run, whether load is equally divided and resources are used without being wasted.
> Moreover, multiple selectors may cause excessive context switches, which give priority to low latency (while we just add the requests to queues), and this can reduce throughput of the whole server.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389713#comment-15389713 ]

Hadoop QA commented on HBASE-14479:
---
(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| -1 | patch | 0m 4s | HBASE-14479 does not apply to master. Rebase required? Wrong branch? See https://yetus.apache.org/documentation/0.2.1/precommit-patchnames for help. |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12819644/HBASE-14479-V3-experimental_branch-1.patch |
| JIRA Issue | HBASE-14479 |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/2728/console |
| Powered by | Apache Yetus 0.2.1 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389699#comment-15389699 ]

Hiroshi Ikeda commented on HBASE-14479:
---
Oops, some files are missing. I'll correct that...
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378767#comment-15378767 ]

Hiroshi Ikeda commented on HBASE-14479:
---
To reduce the overhead of unnecessarily changing registrations, we should postpone turning the read flag back on and handing the leader exclusive access to the socket until we find that we cannot construct a task even after retrieving data from the socket. Additionally, we should retrieve data into an off-heap buffer whose size is equal to or larger than the socket's native buffer, to reduce the overhead of native calls.

Of course, retrieving all available tasks from a socket at once risks memory shortage and unfair execution across connections. To prevent that unfairness, we should queue at most one task per connection. That does not mean one connection cannot execute multiple tasks simultaneously; the restriction applies only to queued tasks waiting for execution. Put another way, just before executing a task, we should delegate to another follower to execute or queue the next task, or hand over leadership, as described above.

AdaptiveLifoCoDelCallQueue is not appropriate when clients can send multiple requests simultaneously: since it is not realistic to retrieve all requests at once, requests will be executed in whatever order they become available under congestion, and later-retrieved requests will unfairly be executed ahead of earlier ones.
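The at-most-one-queued-task-per-connection idea above can be sketched as follows. This is a minimal single-threaded illustration, not HBase code; {{Conn}}, {{FairDispatcher}}, {{dispatch}} and {{runOne}} are hypothetical names. Each connection keeps its own pending list, but the shared run queue holds at most one entry per connection, and the next task is re-queued just before the current one executes:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch: the shared run queue holds at most one entry per
// connection, so one chatty client cannot flood it with queued tasks.
final class Conn {
    final String name;
    final Queue<Runnable> pending = new ArrayDeque<>();
    boolean enqueued = false;   // is this connection already in the run queue?
    Conn(String name) { this.name = name; }
}

final class FairDispatcher {
    private final Queue<Conn> runQueue = new ArrayDeque<>();

    /** Add a task; enqueue the connection only if it is not queued already. */
    synchronized void dispatch(Conn c, Runnable task) {
        c.pending.add(task);
        if (!c.enqueued) {
            c.enqueued = true;
            runQueue.add(c);
        }
    }

    /**
     * Run one task from the head connection; re-queue the connection first if
     * more tasks remain ("delegate the next task before executing"). A real
     * server would run the task outside the lock; this toy keeps it simple.
     */
    synchronized boolean runOne() {
        Conn c = runQueue.poll();
        if (c == null) return false;
        Runnable task = c.pending.poll();
        if (!c.pending.isEmpty()) {
            runQueue.add(c);
        } else {
            c.enqueued = false;
        }
        task.run();
        return true;
    }
}
```

With two connections, where A submits two tasks before B submits one, the tasks drain as A, B, A, so a burst from one connection does not starve the other.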
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376201#comment-15376201 ]

Hiroshi Ikeda commented on HBASE-14479:
---
bq. The doRunLoop will doRead for each key gotten on a select.

Reader.doRunLoop calls doRead(key) once for each selected key, and doRead calls Connection.readAndProcess() once per call. readAndProcess reads and processes at most one request from the socket per invocation: it allocates a buffer whose size exactly matches the request's, reads the data, and calls process(). That means doRunLoop processes at most one request per selected key, and the following request must be selected again before it can be processed.

That would be fine if clients issued one request at a time, and it naturally yields round-robin behavior across the channels registered with the selector. But it is questionable for asynchronous multiple requests over one socket, because of overhead that includes unnecessary calls to Selector.select(). If SASL is used and the request contains multiple substantial requests, all of them are processed in processUnwrappedData with a while loop.
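The one-request-per-call shape of readAndProcess described above can be sketched like this. {{FrameReader}} and {{readOneRequest}} are illustrative names, and the bare 4-byte length prefix is a simplification of HBase's actual wire format; the point is only that each invocation consumes at most one complete frame, so the next one must wait for the key to be selected again:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Simplified sketch: consume at most one 4-byte-length-prefixed frame per call.
final class FrameReader {
    final List<String> processed = new ArrayList<>();

    /** @return true if one complete request was consumed from {@code in}. */
    boolean readOneRequest(ByteBuffer in) {
        if (in.remaining() < 4) return false;       // length prefix not yet here
        in.mark();
        int len = in.getInt();
        if (in.remaining() < len) {                 // body still partial:
            in.reset();                             // undo the prefix read
            return false;
        }
        byte[] body = new byte[len];                // buffer sized to the request
        in.get(body);
        processed.add(new String(body, StandardCharsets.UTF_8)); // "process()"
        return true;                                // stop here; next frame waits
    }
}
```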
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368055#comment-15368055 ]

stack commented on HBASE-14479:
---
bq. I found that the method Reader.doRead(SelectionKey) just does one request for each call, regardless of whether the next request is available...

How do you mean [~ikeda]? The doRunLoop will doRead for each key gotten on a select.

bq. BTW, in order to resolve this, when we read as many requests from a connection as possible, the queue will easily become full and it will be difficult to handle requests fairly as to connections. I think it is better to cap the count of the requests simultaneously executing for each connection, according to the current requests queued (instead of using a fixed bounded queue).

Sounds good. I can test any experiments you might want to try. Thanks.
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15360884#comment-15360884 ]

Hiroshi Ikeda commented on HBASE-14479:
---
RpcServer.Responder is a sort of safety net, used when the native send buffer of a socket is full, and it is rarely used if clients are well behaved and wait for the response to each request. That means YCSB must be issuing multiple requests simultaneously over one connection.

I checked the source of RpcServer and found that Reader.doRead(SelectionKey) handles just one request per call, regardless of whether the next request is already available, unless the requests come through SASL. That makes the patch in this issue change the registration of a connection's key unnecessarily for each request, causing overhead (as shown by sun.nio.ch.EPollArrayWrapper::updateRegistrations, though I didn't expect such different throughputs).

BTW, if we resolve this by reading as many requests from a connection as possible, the queue will easily become full and it will be difficult to handle requests fairly across connections. I think it is better to cap the number of requests executing simultaneously for each connection, according to the number of requests currently queued (instead of using a fixed bounded queue).
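One possible shape for the capping idea in the last paragraph, as a sketch under assumptions ({{PerConnectionCap}} and its linear scaling rule are mine, not from any patch here): derive the per-connection in-flight budget from the current depth of the shared call queue, so the budget shrinks under congestion but never drops below one request:

```java
// Hypothetical sketch: per-connection admission derived from queue depth,
// instead of a fixed bounded call queue.
final class PerConnectionCap {
    final int maxPerConnection;   // in-flight budget when the server is idle
    final int queueCapacity;      // soft capacity of the shared call queue

    PerConnectionCap(int maxPerConnection, int queueCapacity) {
        this.maxPerConnection = maxPerConnection;
        this.queueCapacity = queueCapacity;
    }

    /** How many requests one connection may have in flight right now. */
    int allowed(int currentQueueDepth) {
        // Scale the budget down linearly as the shared queue fills, but always
        // admit at least one request so no connection starves outright.
        int free = Math.max(0, queueCapacity - currentQueueDepth);
        return Math.max(1, (int) ((long) maxPerConnection * free / queueCapacity));
    }

    /** Should this connection be allowed to submit another request? */
    boolean admit(int connInFlight, int currentQueueDepth) {
        return connInFlight < allowed(currentQueueDepth);
    }
}
```

So an idle server grants each connection its full budget, while a saturated queue throttles every connection down to one request at a time.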
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359168#comment-15359168 ]

stack commented on HBASE-14479:
---
I tried this again with a totally random read workload (all from cache). Readers are here at safe point:
{code}
"RpcServer.reader=0,bindAddress=ve0528.halxg.cloudera.com,port=16020" #34 daemon prio=5 os_prio=0 tid=0x7fb669c7f1e0 nid=0x1c7e8 waiting on condition [0x7fae4d244000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x7faf661d4c00> (a java.util.concurrent.Semaphore$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
        at java.util.concurrent.Semaphore.acquire(Semaphore.java:312)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:688)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:669)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{code}
...i.e. at the new semaphore. Throughput is way down... 150k ops/s vs 380k ops/s.
Looking w/ honest profiler, the call stack is way different, w/ current branch-1 spending most of its time responding.

Current branch-1:
{code}
Tree Profile:
(t 100.0,s 5.2) org.apache.hadoop.hbase.ipc.RpcServer$Responder::run
  (t 94.8,s 0.0) org.apache.hadoop.hbase.ipc.RpcServer$Responder::doRunLoop
    (t 81.0,s 0.6) org.apache.hadoop.hbase.ipc.RpcServer$Responder::doAsyncWrite
      (t 79.9,s 1.1) org.apache.hadoop.hbase.ipc.RpcServer$Responder::processAllResponses
        (t 76.4,s 0.6) org.apache.hadoop.hbase.ipc.RpcServer$Responder::processResponse
          (t 75.9,s 0.0) org.apache.hadoop.hbase.ipc.RpcServer::channelWrite
            (t 73.6,s 0.0) org.apache.hadoop.hbase.ipc.BufferChain::write
              (t 72.4,s 2.3) sun.nio.ch.SocketChannelImpl::write
                (t 67.8,s 0.6) sun.nio.ch.IOUtil::write
                  (t 62.1,s 0.0) sun.nio.ch.SocketDispatcher::writev
                    (t 62.1,s 62.1) sun.nio.ch.FileDispatcherImpl::writev0
                  (t 2.3,s 0.6) sun.nio.ch.Util::getTemporaryDirectBuffer
                    (t 1.7,s 0.0) java.lang.ThreadLocal::get
                      (t 1.7,s 0.0) java.lang.ThreadLocal$ThreadLocalMap::access$000
                        (t 1.7,s 1.7) java.lang.ThreadLocal$ThreadLocalMap::getEntry
                  (t 0.6,s 0.0) sun.nio.ch.IOVecWrapper::get
                    (t 0.6,s 0.0) java.lang.ThreadLocal::get
                      (t 0.6,s 0.0) java.lang.ThreadLocal$ThreadLocalMap::access$000
                        (t 0.6,s 0.6) java.lang.ThreadLocal$ThreadLocalMap::getEntry
                  (t 0.6,s 0.6) sun.nio.ch.Util::offerLastTemporaryDirectBuffer
                  (t 0.6,s 0.0) java.nio.DirectByteBuffer::put
                    (t 0.6,s 0.6) java.nio.Buffer::limit
                    (t 0.6,s 0.6) java.nio.Buffer::position
                  (t 0.6,s 0.0) sun.nio.ch.IOVecWrapper::putLen
                    (t 0.6,s 0.6) sun.nio.ch.NativeObject::putLong
                (t 1.1,s 0.0) java.nio.channels.spi.AbstractInterruptibleChannel::begin
                  (t 1.1,s 0.0) java.nio.channels.spi.AbstractInterruptibleChannel::blockedOn
                    (t 1.1,s 0.0) java.lang.System$2::blockedOn
                      (t 1.1,s 1.1) java.lang.Thread::blockedOn
                (t 1.1,s 1.1) sun.nio.ch.SocketChannelImpl::writerCleanup
              (t 1.1,s 1.1) java.nio.Buffer::hasRemaining
...
{code}

With patch:
{code}
Tree Profile:
(t 100.0,s 2.2) java.lang.Thread::run
  (t 97.8,s 0.0) java.util.concurrent.ThreadPoolExecutor$Worker::run
    (t 97.8,s 0.0) java.util.concurrent.ThreadPoolExecutor::runWorker
      (t 97.8,s 0.1) org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader::run
        (t 97.7,s 0.2) org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader::doRunLoop
          (t 63.9,s 0.9) org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader::leading
            (t 59.1,s 0.0) sun.nio.ch.SelectorImpl::select
              (t 59.1,s 0.0) sun.nio.ch.SelectorImpl::select
                (t 59.1,s 0.0) sun.nio.ch.SelectorImpl::lockAndDoSelect
                  (t 59.1,s 0.1) sun.nio.ch.EPollSelectorImpl::doSelect
                    (t 49.2,s 0.0) sun.nio.ch.EPollArrayWrapper::poll
                      (t 43.2,s 0.9)
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15333124#comment-15333124 ]

stack commented on HBASE-14479:
---
Just putting a placeholder here: our RpcScheduler is configurable; default is FIFO. If we do the request on the Reader thread -- not handing off to the Handler -- then we go much faster. Over in https://issues.apache.org/jira/browse/HBASE-15967?focusedCommentId=15317950=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15317950, [~ikeda] suggests doing all requests irrespective of priority on the Reader until we get close to a limit; then we switch to queuing and respecting priority.

Meantime, there is the FB experience, which the lads have codified in AdaptiveLifoCoDelCallQueue, where we FIFO until we become loaded and then go LIFO with a controlled delay that has us shedding load rather than becoming swamped.

The default should be a conflation of the two notions above. TODO. The FB lads are going to come back w/ some more input from running AdaptiveLifoCoDelCallQueue. That'll help.
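The FIFO-until-loaded-then-LIFO behavior described above can be sketched like so. This toy switches on queue length alone; the real AdaptiveLifoCoDelCallQueue also considers how long calls have been sitting in the queue (CoDel-style sojourn time) and can shed load, so treat this as shape only:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy sketch: serve oldest-first while lightly loaded, newest-first once the
// backlog passes a threshold (so fresh requests still see low latency under
// overload, at the expense of the stale backlog).
final class AdaptiveLifoQueue<T> {
    private final Deque<T> deque = new ArrayDeque<>();
    private final int lifoThreshold;

    AdaptiveLifoQueue(int lifoThreshold) { this.lifoThreshold = lifoThreshold; }

    synchronized void offer(T t) { deque.addLast(t); }

    /** FIFO below the threshold; LIFO above it. */
    synchronized T poll() {
        if (deque.isEmpty()) return null;
        return deque.size() > lifoThreshold ? deque.pollLast() : deque.pollFirst();
    }
}
```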
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992886#comment-14992886 ]

Hiroshi Ikeda commented on HBASE-14479:
---
I didn't realize that data wrapped by SASL might create several tasks. When we execute one task within the receiving thread, we should have followers process the rest of the tasks.
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982952#comment-14982952 ]

stack commented on HBASE-14479:
---
YCSB does more than one connection... one per client (25 in this case).
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977559#comment-14977559 ]

Hiroshi Ikeda commented on HBASE-14479:
---
I have an idea that a simple scheduler could execute tasks on (almost) the same thread under low load, using a queue for tasks, instead of keeping an exclusive thread pool in RpcExecutor. Pseudo code:
{code}
void RpcScheduler.dispatch(callRunner) {
  queue.offer(callRunner);
  if (threadsExecutingTasks < MAX_THREADS_EXECUTING_TASKS) {
    threadsExecutingTasks++;
    while ((task = queue.poll()) != null) {
      execute(task); // In most cases under low load, this is just the task we added.
    }
    threadsExecutingTasks--;
  }
}
{code}
This is based on the assumption that we can borrow some threads from RpcServer for a while. In the actual code, I would use AtomicLong to manage the numbers of threads and tasks.
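A runnable rendering of the pseudocode above, using an atomic counter for the number of draining threads as the comment suggests; {{InlineScheduler}} is an illustrative name, not code from any patch here:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: the dispatching thread drains the queue itself when fewer than
// maxDrainers threads are already doing so, so under low load the task usually
// runs on the reader thread with no handoff to a dedicated pool.
final class InlineScheduler {
    private final Queue<Runnable> queue = new ConcurrentLinkedQueue<>();
    private final AtomicInteger draining = new AtomicInteger();
    private final int maxDrainers;

    InlineScheduler(int maxDrainers) { this.maxDrainers = maxDrainers; }

    void dispatch(Runnable callRunner) {
        queue.offer(callRunner);
        int n = draining.get();
        // Try to become one of the draining threads; give up if at the cap.
        while (n < maxDrainers && !draining.compareAndSet(n, n + 1)) {
            n = draining.get();
        }
        if (n >= maxDrainers) return;   // an active drainer will run our task
        try {
            Runnable task;
            while ((task = queue.poll()) != null) {
                task.run();             // usually just the task we added
            }
        } finally {
            draining.decrementAndGet();
        }
    }
}
```

Note the sketch has the classic race where a task is offered just as the last drainer exits its loop; a real implementation would re-check the queue after decrementing, which is part of why the counts need careful atomic management.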
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14973862#comment-14973862 ]

Hiroshi Ikeda commented on HBASE-14479:
---
That's interesting. I don't know the details of YCSB, but does it use multiple connections?

The selectors without the patch consume more CPU time than the single selector with the patch. Park/unpark in the semaphore with the patch consumes non-trivial CPU time. The semaphore releases a permit for each connection ready to read, but there may well be context switches to wake up threads, with park/unpark overhead that is large compared to the cost of their tasks.
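For reference, the leader hand-off that the semaphore implements can be sketched structurally like this ({{LeaderFollowers}} and {{oneTurn}} are hypothetical names; the patch's real loop differs). The point is the ordering: the leader releases the token, promoting a follower, before it processes the event, and the acquire/release pair is where the park/unpark cost discussed above is paid:

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Structural sketch of the Leader/Followers hand-off: one permit means one
// leader; only the leader waits on the event source, and it promotes a
// follower before processing what it received.
final class LeaderFollowers {
    private final Semaphore leaderToken = new Semaphore(1);
    private final Supplier<Runnable> waitForEvent; // stands in for Selector.select()

    LeaderFollowers(Supplier<Runnable> waitForEvent) {
        this.waitForEvent = waitForEvent;
    }

    /** One leader turn: wait for an event, hand off leadership, process. */
    void oneTurn() {
        leaderToken.acquireUninterruptibly();  // become the leader (or park here)
        Runnable event;
        try {
            event = waitForEvent.get();        // only the leader touches the source
        } finally {
            leaderToken.release();             // promote a follower BEFORE processing
        }
        if (event != null) event.run();        // process concurrently with new leader
    }
}
```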
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14966381#comment-14966381 ] Hiroshi Ikeda commented on HBASE-14479: --- {quote} I could experiment with removing queues to see if it buys us throughput. {quote} Some tasks take time to execute, and before dispatching a task within the same thread we should call key.interestOps(OP_READ) so that the selector resumes receiving data from the corresponding connection; otherwise parallelized scans in Phoenix, or other such tricks, might see reduced performance. Even so, supporting fairness across parallelized scans (HBASE-12790) becomes difficult for scans coming over the same TCP stream.
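A hypothetical fragment (invented class and method names) illustrating the point above: before running a request on the current thread, re-arm OP_READ so the selector can keep reading further requests from the same connection while this one executes.

```java
import java.nio.channels.SelectionKey;

/** Hypothetical sketch: re-enable read interest before inline execution so a
 *  long-running task does not stall its own socket. */
class ReenableRead {
    static void process(SelectionKey key, Runnable task) {
        // Re-register the read interest that was cleared while we held the key.
        key.interestOps(key.interestOps() | SelectionKey.OP_READ);
        key.selector().wakeup(); // let a blocked select() observe the change
        task.run();              // run the task on this thread afterwards
    }
}
```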
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14965933#comment-14965933 ] stack commented on HBASE-14479: --- bq. In this jira issue, the benefit is not from this main advantage because we just add the requests to queues. Yeah, in a new issue, should we pull out the queue... Or, rather, I suppose in a follow-on I could experiment with removing queues to see if it buys us throughput. If it does, then we could look into redoing scheduling so it was like the 'Bound handle/thread association' from the paper. Thanks for looking at the WAL [~ikeda] I have a sense that this pattern might help with the multiple-syncing threads... but let me try.
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964423#comment-14964423 ] Hiroshi Ikeda commented on HBASE-14479: --- Sorry for my late response. {quote} (It seems a strange usage but I want to put it aside for now). Does this mean you want to change the patch or just that you think it fine as is; it is just that the implementation is a little odd (all executors are contending on single instance of the Reader Runnable)? {quote} Yes, it works as is. It feels old-fashioned that the executor creates a fixed number of threads and each thread is arranged to take just one task; executors are meant to handle tasks independently of threads. I think it would be enough to explicitly create threads in a thread group, but anyway that is not a practical problem. As for FSHLog, I have taken a little time to think about it, but it is too complex and I can't say anything for certain. I want to make one thing clear about the pattern: in general, the main advantage of the Leader/Followers pattern is that it allows us to make a response without blocking, in the same thread. In this JIRA issue the benefit does not come from this main advantage, because we just add the requests to queues.
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955352#comment-14955352 ] stack commented on HBASE-14479: --- So, played with the patch and see what is going on now (it uses the idiom described in the paper, flipping gating on a semaphore to go from leader to follower). I'd be up for committing to master and branch-1 with some commentary added. Waiting on a response to the question above when [~ikeda] has a moment to reply.
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954471#comment-14954471 ] stack commented on HBASE-14479: --- bq. (It seems a strange usage but I want to put it aside for now). Does this mean you want to change the patch or just that you think it fine as is; it is just that the implementation is a little odd (all executors are contending on single instance of the Reader Runnable)? bq. That intends just event dispatching while key != null (corresponding to the transition following->processing) ... hmm. I think I should just run this and see how it operates in action if only for my own education.
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14953700#comment-14953700 ] stack commented on HBASE-14479: --- A few comments: + There is only one Reader thread so how can there be leaders and followers? + If only one Reader thread, could we discard it and let the Listener thread do the dispatch? + Patch could do with a few comments, including a link to the pattern being implemented. For example, what is going on here: {code} + SelectionKey key = selectedKeyQueue.poll(); + if (key != null) { + processing(key); + continue; + } {code} We are the leader and we keep processing the queue till no more keys... then we fall through to do similar, relinquishing the lock if there is no more to do? Patch does some nice cleanup. Just trying to understand it better. Thanks [~ikeda]
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954386#comment-14954386 ] Hiroshi Ikeda commented on HBASE-14479: --- Sorry, I have little time now, so I'll respond to a few of the comments. {quote} + There is only one Reader thread so how can there be leaders and followers? + If only one Reader thread, could we discard and let the Listener thread do the dispatch? {quote} {{Reader}} is just a runnable task, not a subclass of {{Thread}}. The threads are created in a thread pool (it seems a strange usage, but I want to put that aside for now). {quote} + SelectionKey key = selectedKeyQueue.poll(); + if (key != null) { + processing(key); + continue; + } {quote} That is just event dispatching while {{key}} != null (corresponding to the transition following->processing), and the semaphore releases threads so that exactly one thread encounters {{key}} == null and promotes itself to leader (following->leading).
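A hypothetical sketch (all names invented; a {{Supplier}} stands in for the real select()) of the transitions described above: pooled threads drain the selected-key queue (following->processing), and the one thread that finds it empty becomes leader, runs the select, and releases one permit per ready key plus one more to hand leadership on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical Leader/Followers sketch gated on a semaphore. */
class LfSketch {
    private final Queue<Integer> selectedKeyQueue = new ConcurrentLinkedQueue<>();
    private final Semaphore permits = new Semaphore(1); // bootstrap one leader
    private final Supplier<List<Integer>> selectOnce;   // stands in for select()
    final List<Integer> processed = new ArrayList<>();

    LfSketch(Supplier<List<Integer>> selectOnce) { this.selectOnce = selectOnce; }

    /** One worker's loop; each pooled thread would run this repeatedly. */
    void workerLoop(int iterations) {
        for (int i = 0; i < iterations; i++) {
            permits.acquireUninterruptibly();     // wait to be released
            Integer key = selectedKeyQueue.poll();
            if (key != null) {                    // following -> processing
                processed.add(key);
                continue;
            }
            // following -> leading: this thread saw an empty queue, so it
            // performs the select and wakes followers for the new keys.
            List<Integer> ready = selectOnce.get();
            selectedKeyQueue.addAll(ready);
            permits.release(ready.size() + 1);    // +1 hands leadership on
        }
    }
}
```

Single-threaded this degenerates to select-then-process; with several threads the same accounting lets exactly one select run at a time while the others process keys.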
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14952235#comment-14952235 ] Nicolas Liochon commented on HBASE-14479: - Yeah, I tried to get rid of this array of readers a while back, but I didn't push the patch because I didn't get any significant result. Nice work, [~ikeda]
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14952159#comment-14952159 ] stack commented on HBASE-14479: --- [~nkeywal] FYI. You'll like this one. [~ikeda] Should we use this pattern elsewhere, say, in the handoff to syncer threads in the WAL? See http://hbase.apache.org/xref/org/apache/hadoop/hbase/regionserver/wal/FSHLog.html#1770 If I understand the pattern right, we could purge Readers and have Handlers themselves do the select and read from the socket (one less handoff)?
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946617#comment-14946617 ] Hadoop QA commented on HBASE-14479: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12765335/HBASE-14479-V2%20%281%29.patch against master branch at commit d80c7e95ec7ef4811b83224a87ee6883f67f2d62. ATTACHMENT ID: 12765335 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15899//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15899//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15899//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15899//console This message is automatically generated.
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946256#comment-14946256 ] stack commented on HBASE-14479: --- {code} kalashnikov:hbase.git.commit stack$ python ./dev-support/findHangingTests.py https://builds.apache.org/job/PreCommit-HBASE-Build/15893//consoleText Fetching https://builds.apache.org/job/PreCommit-HBASE-Build/15893//consoleText Building remotely on ubuntu-2 (docker Ubuntu ubuntu) in workspace /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build Testing patch for HBASE-14479. Testing patch on branch master. Printing hanging tests Hanging test : org.apache.hadoop.hbase.util.TestHBaseFsck Hanging test : org.apache.hadoop.hbase.namespace.TestNamespaceAuditor Hanging test : org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer2 Hanging test : org.apache.hadoop.hbase.mob.mapreduce.TestMobSweeper Hanging test : org.apache.hadoop.hbase.mob.compactions.TestMobCompactor Printing Failing tests {code} These failed. I'm just going to disable them all... they fail regularly. Let me make issues.
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946178#comment-14946178 ] Hadoop QA commented on HBASE-14479: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12765295/HBASE-14479-V2.patch against master branch at commit 0ea1f8122709302ee19279aaa438b37dac30c25b. ATTACHMENT ID: 12765295 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15893//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15893//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15893//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15893//console This message is automatically generated.
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946052#comment-14946052 ] stack commented on HBASE-14479: --- Here's a link: http://www.kircher-schwanninger.de/michael/publications/lf.pdf I like the explanation here too: http://stackoverflow.com/questions/3058272/explain-leader-follower-pattern Patch seems good. You tried it, [~ikeda]? (If you'd messed up, unit tests would be failing...) Any way we could figure out if there is a benefit? I can try running it on a cluster and see. Thanks [~ikeda]
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906191#comment-14906191 ] Hadoop QA commented on HBASE-14479: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12762069/HBASE-14479.patch against master branch at commit 5b7894f92ba3e9ff700da1e9194ebb4774d8b71e. ATTACHMENT ID: 12762069 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap org.apache.hadoop.hbase.master.TestRollingRestart org.apache.hadoop.hbase.master.TestTableLockManager org.apache.hadoop.hbase.master.handler.TestEnableTableHandler org.apache.hadoop.hbase.master.TestRegionPlacement org.apache.hadoop.hbase.master.TestRestartCluster org.apache.hadoop.hbase.master.TestMasterFailover org.apache.hadoop.hbase.regionserver.TestPerColumnFamilyFlush org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedure org.apache.hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures org.apache.hadoop.hbase.TestRegionRebalancing org.apache.hadoop.hbase.master.TestMasterRestartAfterDisablingTable org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole org.apache.hadoop.hbase.regionserver.TestHRegionOnCluster org.apache.hadoop.hbase.master.TestDistributedLogSplitting org.apache.hadoop.hbase.TestFullLogReconstruction {color:red}-1 core zombie tests{color}. 
There are 23 zombie test(s): at org.apache.hadoop.hbase.util.TestHBaseFsck.testQuarantineCorruptHFile(TestHBaseFsck.java:2231) at org.apache.hadoop.hbase.util.TestHBaseFsck.testQuarantineMissingHFile(TestHBaseFsck.java:2373) at org.apache.hadoop.hbase.util.TestHBaseFsck.testFixHdfsHolesNotWorkingWithNoHdfsChecking(TestHBaseFsck.java:2152) at org.apache.hadoop.hbase.util.TestHBaseFsck.testCoveredStartKey(TestHBaseFsck.java:1145) at org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface.testIncrementHook(TestRegionObserverInterface.java:220) at org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface.testRegionObserver(TestRegionObserverInterface.java:124) at org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface.testRowMutation(TestRegionObserverInterface.java:183) at org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface.testCheckAndPutHooks(TestRegionObserverInterface.java:248) at org.apache.hadoop.hbase.TestIOFencing.testFencingAroundCompactionAfterWALSync(TestIOFencing.java:238) at org.apache.hadoop.hbase.client.TestFromClientSide.testJiraTest861(TestFromClientSide.java:2275) at org.apache.hadoop.hbase.client.TestReplicasClient.testGetNoResultNotStaleSleepRegionWithReplica(TestReplicasClient.java:370) at org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testSSHCleanupDaugtherRegionsOfAbortedSplit(TestSplitTransactionOnCluster.java:1003) at org.apache.hadoop.hbase.client.TestClientTimeouts.testAdminTimeout(TestClientTimeouts.java:108) at
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906191#comment-14906191 ] Hadoop QA commented on HBASE-14479: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12762069/HBASE-14479.patch against master branch at commit 5b7894f92ba3e9ff700da1e9194ebb4774d8b71e. ATTACHMENT ID: 12762069 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests:
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap
org.apache.hadoop.hbase.master.TestRollingRestart
org.apache.hadoop.hbase.master.TestTableLockManager
org.apache.hadoop.hbase.master.handler.TestEnableTableHandler
org.apache.hadoop.hbase.master.TestRegionPlacement
org.apache.hadoop.hbase.master.TestRestartCluster
org.apache.hadoop.hbase.master.TestMasterFailover
org.apache.hadoop.hbase.regionserver.TestPerColumnFamilyFlush
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase
org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedure
org.apache.hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures
org.apache.hadoop.hbase.TestRegionRebalancing
org.apache.hadoop.hbase.master.TestMasterRestartAfterDisablingTable
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
org.apache.hadoop.hbase.regionserver.TestHRegionOnCluster
org.apache.hadoop.hbase.master.TestDistributedLogSplitting
org.apache.hadoop.hbase.TestFullLogReconstruction
{color:red}-1 core zombie tests{color}.
There are 23 zombie test(s):
at org.apache.hadoop.hbase.util.TestHBaseFsck.testQuarantineCorruptHFile(TestHBaseFsck.java:2231)
at org.apache.hadoop.hbase.util.TestHBaseFsck.testQuarantineMissingHFile(TestHBaseFsck.java:2373)
at org.apache.hadoop.hbase.util.TestHBaseFsck.testFixHdfsHolesNotWorkingWithNoHdfsChecking(TestHBaseFsck.java:2152)
at org.apache.hadoop.hbase.util.TestHBaseFsck.testCoveredStartKey(TestHBaseFsck.java:1145)
at org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface.testIncrementHook(TestRegionObserverInterface.java:220)
at org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface.testRegionObserver(TestRegionObserverInterface.java:124)
at org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface.testRowMutation(TestRegionObserverInterface.java:183)
at org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface.testCheckAndPutHooks(TestRegionObserverInterface.java:248)
at org.apache.hadoop.hbase.TestIOFencing.testFencingAroundCompactionAfterWALSync(TestIOFencing.java:238)
at org.apache.hadoop.hbase.client.TestFromClientSide.testJiraTest861(TestFromClientSide.java:2275)
at org.apache.hadoop.hbase.client.TestReplicasClient.testGetNoResultNotStaleSleepRegionWithReplica(TestReplicasClient.java:370)
at org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testSSHCleanupDaugtherRegionsOfAbortedSplit(TestSplitTransactionOnCluster.java:1003)
at org.apache.hadoop.hbase.client.TestClientTimeouts.testAdminTimeout(TestClientTimeouts.java:108)
at
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907559#comment-14907559 ] Hadoop QA commented on HBASE-14479:
---
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12762308/HBASE-14479-V2.patch against master branch at commit dff86542d558394cc87ede256bd5432d071ed73f.
ATTACHMENT ID: 12762308
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1).
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
{color:red}-1 core zombie tests{color}.
There is 1 zombie test(s):
at org.apache.hadoop.hbase.http.TestGlobalFilter.testServletFilter(TestGlobalFilter.java:137)
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15733//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15733//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15733//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15733//console
This message is automatically generated.

> Apply the Leader/Followers pattern to RpcServer's Reader
> --------------------------------------------------------
> Key: HBASE-14479
> URL: https://issues.apache.org/jira/browse/HBASE-14479
> Project: HBase
> Issue Type: Improvement
> Components: IPC/RPC, Performance
> Reporter: Hiroshi Ikeda
> Assignee: Hiroshi Ikeda
> Priority: Minor
> Attachments: HBASE-14479-V2.patch, HBASE-14479.patch
>
> {{RpcServer}} uses multiple selectors to read data for load distribution, but the distribution is done only by round-robin. It is uncertain, especially over a long run, whether the load is divided equally and resources are used without waste.
> Moreover, multiple selectors may cause excessive context switches that favor low latency (while the requests are simply added to queues), which can reduce the throughput of the whole server.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
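To illustrate the pattern the issue proposes, here is a minimal, self-contained Leader/Followers sketch. This is hypothetical illustration code, not HBase's actual `RpcServer` or the attached patch: a pool of threads shares one event source; exactly one thread (the leader) waits on it at a time, and after taking an event it promotes a follower to be the new leader before processing the event itself. A `BlockingQueue` stands in for the NIO selector that the real Reader would wait on.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of the Leader/Followers pattern (not HBase code).
// Only one thread at a time (the leader) blocks on the shared event
// source; the others (followers) wait to be promoted. The leader hands
// off leadership before processing, so the event is handled on the same
// thread that detected it, avoiding a cross-thread hand-off.
public class LeaderFollowersPool {
  // Stands in for a selector; in the real server this would be I/O readiness.
  private final BlockingQueue<Runnable> events = new LinkedBlockingQueue<>();
  private final Object leaderLock = new Object();
  private boolean leaderActive = false;

  public void submit(Runnable event) {
    events.add(event);
  }

  private void workerLoop() {
    try {
      while (true) {
        // Compete for leadership: only one thread may wait on the source.
        synchronized (leaderLock) {
          while (leaderActive) {
            leaderLock.wait();
          }
          leaderActive = true;
        }
        Runnable event = events.take(); // leader blocks for the next event
        // Promote a follower to leader BEFORE doing the work, so the
        // source is never left unattended while this thread processes.
        synchronized (leaderLock) {
          leaderActive = false;
          leaderLock.notify();
        }
        event.run(); // process the event as an ordinary worker
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public void start(int threads) {
    for (int i = 0; i < threads; i++) {
      Thread t = new Thread(this::workerLoop, "lf-worker-" + i);
      t.setDaemon(true);
      t.start();
    }
  }
}
```

The design choice this demonstrates is the contrast drawn in the description: instead of round-robin assignment of connections to fixed selector threads (where load balance depends on luck of the draw), any idle thread can become the leader, so work naturally flows to whichever thread is free.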