Just to clarify: please ignore the patch attached to HBASE-11306. There is no conclusion yet about the behavior reported in HBASE-11306; the shared connection is just a suspect. Thanks.
On Tue, Jul 15, 2014 at 1:52 PM, Rural Hunter <[email protected]> wrote:

> Hi Tian Qiang,
>
> Thanks for the detailed explanation. I have deployed the latest code of the
> 0.96 branch with HBASE-11277 applied. I will keep monitoring to see whether
> the problem persists and whether HBASE-11306 is necessary.
>
> On 2014/7/15 11:06, Qiang Tian wrote:
>
>> Hi, below are more details.
>>
>> The read0 stack trace you see means the reader wants to read something from
>> a client RPC call. Andrew's test shows it is reading the RPC request data
>> (reasonable, since the other metadata is quite small). Although the client
>> follows a request-receive style, when multiple clients share the connection
>> (the default case), the synchronization window when writing to the same
>> channel is quite small. If the requests carry large payloads, there might
>> be a sudden rush at the transport layer, which might leave the RPC server
>> unable to receive the data in time due to congestion control. Without
>> HBASE-11277, the reader gets a 0-byte read again and again...
>>
>> With HBASE-11277 the problem should be resolved, since we get back to fully
>> non-blocking IO, but it is still worth investigating non-shared connections
>> under high workload (HBASE-11306).
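
For anyone following along, the "0 byte read" in the quoted explanation can be reproduced outside HBase with a small sketch. This is not the HBase RPC reader code. It assumes a `java.nio.channels.Pipe` as a stand-in for the client socket, and the class and method names (`NonBlockingReadSketch`, `readWhenEmpty`, `readAfterWrite`) are made up for illustration. It only shows that a non-blocking channel returns 0 from `read` when no data has arrived yet, so a reader that keeps retrying simply spins until the bytes show up.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

// Hypothetical illustration only; a Pipe stands in for the RPC socket.
public class NonBlockingReadSketch {

    // Reads once from a non-blocking channel that has no data yet.
    // A non-blocking read with nothing available returns 0 instead of waiting.
    static int readWhenEmpty() throws IOException {
        Pipe pipe = Pipe.open();
        try {
            pipe.source().configureBlocking(false);
            return pipe.source().read(ByteBuffer.allocate(16));
        } finally {
            pipe.sink().close();
            pipe.source().close();
        }
    }

    // Writes `size` bytes (keep it small, well below the pipe buffer),
    // then polls the non-blocking source until everything has been read.
    // Returns the total number of bytes read.
    static int readAfterWrite(int size) throws IOException {
        Pipe pipe = Pipe.open();
        try {
            pipe.source().configureBlocking(false);

            ByteBuffer out = ByteBuffer.allocate(size);
            while (out.hasRemaining()) {
                pipe.sink().write(out); // the sink stays in blocking mode
            }

            ByteBuffer in = ByteBuffer.allocate(size);
            int total = 0;
            while (total < size) {
                int n = pipe.source().read(in);
                if (n < 0) {
                    break; // end of stream
                }
                // n == 0 means "nothing available yet"; a real server would
                // go back to its Selector here rather than retry in a loop.
                total += n;
            }
            return total;
        } finally {
            pipe.sink().close();
            pipe.source().close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read on empty channel: " + readWhenEmpty());
        System.out.println("bytes read after write: " + readAfterWrite(5));
    }
}
```

The polling loop is only there to keep the sketch short. Whether a 0-byte read is retried immediately or handed back to the selector is exactly the kind of difference the quoted message attributes to HBASE-11277.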
