[ https://issues.apache.org/jira/browse/HBASE-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900048#comment-13900048 ]
Hudson commented on HBASE-10506:
--------------------------------
FAILURE: Integrated in HBase-0.98 #154 (See
[https://builds.apache.org/job/HBase-0.98/154/])
HBASE-10506 Fail-fast if client connection is lost before the real call be
executed in RPC layer (liangxie: rev 1567842)
* /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/CallRunner.java
> Fail-fast if client connection is lost before the real call be executed in
> RPC layer
> ------------------------------------------------------------------------------------
>
> Key: HBASE-10506
> URL: https://issues.apache.org/jira/browse/HBASE-10506
> Project: HBase
> Issue Type: Bug
> Components: IPC/RPC
> Affects Versions: 0.94.3
> Reporter: Liang Xie
> Assignee: Liang Xie
> Fix For: 0.98.0, 0.96.2, 0.99.0, 0.94.17
>
> Attachments: HBASE-10506-0.94.txt, HBASE-10506-trunk.txt
>
>
> In the current HBase RPC implementation, the connection is not re-checked
> just before the "call" is invoked. If there is a GC pause, some other OS
> scheduling delay, or a nearly full call queue (e.g. the server side is slow
> or hung due to some issue), and the client side has a small RPC timeout
> value, the client connection may already be lost by the time the request is
> taken from the call queue. We should add some fail-fast code before the real
> "call" is invoked; otherwise executing it just wastes server-side resources.
> Here is a stack trace from our production env:
> 2014-02-11,18:16:19,525 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call get([B@3eae6c77,
> {"timeRange":[0,9223372036854775807],"totalColumns":1,"cacheBlocks":true,"families":{"X":["T"]},"maxVersions":1,"row":"074103000000001-m8997060"}),
> rpc version=1, client version=29, methodsFingerPrint=-241105381 from
> 10.101.10.181:43252: output error
> 2014-02-11,18:16:19,526 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 151 on 12600 caught a ClosedChannelException, this means that the
> server was processing a request but the client went away. The error message
> was: null
> 2014-02-11,18:16:19,797 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting call
> get([B@3f10ffd2,
> {"timeRange":[0,9223372036854775807],"totalColumns":1,"cacheBlocks":true,"families":{"X":["T"]},"maxVersions":1,"row":"4245978-m7281526"}),
> rpc version=1, client version=29, methodsFingerPrint=-241105381 from
> 10.101.10.181:43259 after 0 ms, since caller disconnected
> at
> org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:450)
> at
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3633)
> at
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3590)
> at
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3615)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4414)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4387)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2075)
> at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at
> org.apache.hadoop.hbase.ipc.SecureRpcEngine$Server.call(SecureRpcEngine.java:460)
> at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1457)
> 2014-02-11,18:16:19,802 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call get([B@3f10ffd2,
> {"timeRange":[0,9223372036854775807],"totalColumns":1,"cacheBlocks":true,"families":{"X":["T"]},"maxVersions":1,"row":"4245978-m7281526"}),
> rpc version=1, client version=29, methodsFingerPrint=-241105381 from
> 10.101.10.181:43259: output error
> 2014-02-11,18:16:19,802 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 46 on 12600 caught a ClosedChannelException, this means that the
> server was processing a request but the client went away. The error message
> was: null
> With this fix, we can at least reduce the probability of hitting this :)
> Upstream Hadoop already has this check, see:
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java#L2034-L2036
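
The fail-fast idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual CallRunner patch: the class name FailFastCallRunner and the method runIfConnected are hypothetical, and the client connection is modeled as a plain java.nio.channels.Channel.

```java
import java.nio.channels.Channel;

/**
 * Hypothetical sketch of a fail-fast guard; this is NOT the actual
 * HBase CallRunner, only an illustration of the idea in the issue.
 */
final class FailFastCallRunner {
    private final Channel clientChannel; // channel the request arrived on
    private final Runnable call;         // the real, possibly expensive, call

    FailFastCallRunner(Channel clientChannel, Runnable call) {
        this.clientChannel = clientChannel;
        this.call = call;
    }

    /**
     * Re-checks the client channel right before doing the work, since the
     * request may have waited in the call queue for a long time.
     *
     * @return true if the call was executed, false if it was skipped
     *         because the channel was already closed
     */
    boolean runIfConnected() {
        if (!clientChannel.isOpen()) {
            // Nobody is left to read the response, so skip the work.
            return false;
        }
        call.run();
        return true;
    }
}
```

Note that isOpen() only reports whether the channel has been closed on the server side, so a check like this narrows the window rather than closing it. The client can still go away while the call is running.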