[ https://issues.apache.org/jira/browse/HDDS-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16646564#comment-16646564 ]
Nanda kumar commented on HDDS-625:
----------------------------------
The problem seems to be in Ratis.
{code:java}
2018-10-11 20:15:38,026 DEBUG util.TimeoutScheduler: schedule a task: timeout
3000 ms, sid 2
2018-10-11 20:15:38,026 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
submitting a new request [seq=2] in client-A3773C27AF63->RAFT: requests[2..2],
nextSeqNum=3, firstSubmitted=0, replied? true, delayed=[]? submitted
2018-10-11 20:15:38,026 DEBUG netty.NettyClientHandler: [id: 0xb2e515b7,
L:/127.0.0.1:59923 - R:/127.0.0.1:9858] OUTBOUND DATA: streamId=3 padding=0
endStream=false length=299
bytes=00000001260a500a1057fa58c4f1304f32b785a3773c27af63122431373035353534302d633835652d346334302d613938392d3761356235663465626634621a...
2018-10-11 20:15:38,037 DEBUG netty.NettyClientHandler: [id: 0xb2e515b7,
L:/127.0.0.1:59923 - R:/127.0.0.1:9858] INBOUND DATA: streamId=3 padding=0
endStream=false length=214
bytes=00000000d10a500a1057fa58c4f1304f32b785a3773c27af63122431373035353534302d633835652d346334302d613938392d3761356235663465626634621a...
2018-10-11 20:15:38,037 DEBUG client.RaftClient: client-A3773C27AF63: receive*
RaftClientReply:client-A3773C27AF63->17055540-c85e-4c40-a989-7a5b5f4ebf4b@group-1E6748549C2B,
cid=2, SUCCESS, commits[17055540-c85e-4c40-a989-7a5b5f4ebf4b:c19]
2018-10-11 20:15:38,038 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
set reply
RaftClientReply:client-A3773C27AF63->17055540-c85e-4c40-a989-7a5b5f4ebf4b@group-1E6748549C2B,
cid=2, SUCCESS, commits[17055540-c85e-4c40-a989-7a5b5f4ebf4b:c19] for seq=2 in
client-A3773C27AF63->RAFT: requests[2..2]
2018-10-11 20:15:38,038 DEBUG scm.XceiverClientRatis: received reply cmdType:
PutBlock
traceID: "468d48e9-d447-403c-8c73-c3ba4a593017"
containerID: 9
datanodeUuid: "17055540-c85e-4c40-a989-7a5b5f4ebf4b"
putBlock {
  blockData {
    blockID {
      containerID: 9
      localID: 100877542185500679
    }
    metadata {
      key: "TYPE"
      value: "KEY"
    }
    chunks {
      chunkName: "8764262b17d9c32a8650d17e91b7600b_stream_146224af-cabe-42f2-84c5-a94b9742fe25_chunk_1"
      offset: 0
      len: 6774
    }
  }
}
for request:
RaftClientReply:client-A3773C27AF63->17055540-c85e-4c40-a989-7a5b5f4ebf4b@group-1E6748549C2B,
cid=2, SUCCESS, commits[17055540-c85e-4c40-a989-7a5b5f4ebf4b:c19] exception:
null
2018-10-11 20:15:38,047 DEBUG ipc.Client: IPC Client (973576304) connection to
localhost/127.0.0.1:9862 from nvadivelu sending #6
org.apache.hadoop.ozone.protocol.OzoneManagerProtocol.commitKey
2018-10-11 20:15:38,048 DEBUG ipc.Client: IPC Client (973576304) connection to
localhost/127.0.0.1:9862 from nvadivelu got value #6
2018-10-11 20:15:38,048 DEBUG ipc.ProtobufRpcEngine: Call: commitKey took 2ms
2018-10-11 20:15:40,756 DEBUG util.TimeoutScheduler: run a task: sid 0
2018-10-11 20:15:41,021 DEBUG util.TimeoutScheduler: run a task: sid 1
2018-10-11 20:15:41,027 DEBUG util.TimeoutScheduler: run a task: sid 2
2018-10-11 20:15:41,028 DEBUG util.TimeoutScheduler: Schedule a shutdown task:
grace 1 m, sid 3
2018-10-11 20:15:42,582 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:15:47,971 DEBUG ipc.Client: IPC Client (973576304) connection to
localhost/127.0.0.1:9860 from nvadivelu: closed
2018-10-11 20:15:47,971 DEBUG ipc.Client: IPC Client (973576304) connection to
localhost/127.0.0.1:9860 from nvadivelu: stopped, remaining connections 1
2018-10-11 20:15:48,051 DEBUG ipc.Client: IPC Client (973576304) connection to
localhost/127.0.0.1:9862 from nvadivelu: closed
2018-10-11 20:15:48,051 DEBUG ipc.Client: IPC Client (973576304) connection to
localhost/127.0.0.1:9862 from nvadivelu: stopped, remaining connections 0
2018-10-11 20:15:52,586 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:16:02,586 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:16:12,587 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:16:22,590 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:16:32,593 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:16:41,030 DEBUG util.TimeoutScheduler: shutdown scheduler: sid 3
2018-10-11 20:16:42,598 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:16:52,602 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:17:02,607 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:17:12,611 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:17:22,613 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:17:32,613 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:17:42,616 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:17:52,619 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:18:02,623 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:18:12,627 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:18:22,630 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:18:32,635 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:18:42,635 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:18:52,640 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:19:02,641 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:19:12,642 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:19:22,643 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:19:32,647 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:19:42,649 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:19:52,652 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
2018-10-11 20:20:02,655 DEBUG util.SlidingWindow: client-A3773C27AF63->RAFT:
requests[]
{code}
We have a {{TimeoutScheduler}}, which is used to send async requests using
{{SlidingWindow}}; this might be causing the problem.
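To make the suspicion concrete, here is a minimal sketch of one general mechanism that would produce this kind of log. It is not Ratis code: it uses only {{java.util.concurrent}}, and the names {{LingeringSchedulerSketch}}, {{newScheduler}}, {{ticksWhileIdle}} and {{stop}} are hypothetical. It shows a periodic task that keeps running on a non-daemon thread although no work is pending, which keeps the JVM alive until the scheduler is shut down explicitly. Whether {{TimeoutScheduler}} or {{SlidingWindow}} actually behaves this way is an assumption that still needs to be confirmed.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names come from Ratis or Ozone.
class LingeringSchedulerSketch {

    // Builds a single-threaded scheduler whose worker thread is daemon or non-daemon.
    static ScheduledExecutorService newScheduler(boolean daemon) {
        ThreadFactory factory = runnable -> {
            Thread t = new Thread(runnable, "sketch-scheduler");
            t.setDaemon(daemon);
            return t;
        };
        return Executors.newSingleThreadScheduledExecutor(factory);
    }

    // Schedules a periodic task, waits while nothing else is pending,
    // and returns how many times the task ran in the meantime.
    static int ticksWhileIdle(ScheduledExecutorService scheduler, long waitMillis) {
        AtomicInteger ticks = new AtomicInteger();
        scheduler.scheduleAtFixedRate(ticks::incrementAndGet, 0, 10, TimeUnit.MILLISECONDS);
        try {
            Thread.sleep(waitMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ticks.get();
    }

    // Cancels the periodic task and reports whether the worker thread terminated.
    static boolean stop(ScheduledExecutorService scheduler) {
        scheduler.shutdownNow();
        try {
            return scheduler.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Non-daemon worker: if stop() were never called, the JVM would not
        // exit when main() returns, because the periodic task keeps the thread alive.
        ScheduledExecutorService scheduler = newScheduler(false);
        int ticks = ticksWhileIdle(scheduler, 200);
        System.out.println("periodic task ran " + ticks + " time(s) with nothing pending");
        System.out.println("terminated after explicit shutdown: " + stop(scheduler));
    }
}
```

If something like this is happening in the client, the thread dump attached to the issue should show a live non-daemon scheduler thread after {{commitKey}} has returned.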
cc/ [~szetszwo]
> putKey hangs for a long time after completion, sometimes forever
> ----------------------------------------------------------------
>
> Key: HDDS-625
> URL: https://issues.apache.org/jira/browse/HDDS-625
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Client
> Reporter: Arpit Agarwal
> Priority: Blocker
> Attachments: ozone-shell-thread-dump.txt
>
>
> putKey hangs, sometimes forever.
> TRACE log output in comment below.