[
https://issues.apache.org/jira/browse/TAJO-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485011#comment-14485011
]
Hyunsik Choi commented on TAJO-1540:
------------------------------------
I agree. It is really necessary.
> RpcCallback must be able to handle TimeoutException or cancel.
> --------------------------------------------------------------
>
> Key: TAJO-1540
> URL: https://issues.apache.org/jira/browse/TAJO-1540
> Project: Tajo
> Issue Type: Bug
> Reporter: Hyoungjun Kim
>
> I investigated the locking in CallFuture while reviewing TAJO-1469. In
> CallFuture, run() and get() should be synchronized with each other. The
> current code looks as if this were implemented, but it is not. If the
> following situation occurs, some resources or tasks will be lost forever.
> 1. Worker: TaskRunner sends a GetTask request.
> 2. QM: QueryMaster selects a proper task and calls RpcCallback.
> 3. Worker: AsyncRpcClient receives the response and calls
> CallFuture.run(response).
> 3-1. Worker: If a TimeoutException occurs after 1), between 2) and 3),
> TaskRunner can't receive the response and doesn't run the allocated task,
> but QM doesn't know about that.
> We should fix this problem in the RPC module and add proper cancel logic.
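
To illustrate the race described in the issue, here is a minimal Java sketch of a future whose run(), get() and cancel() share one lock, so that a response arriving after a timeout is rejected instead of being silently dropped. The class name SketchCallFuture, the State enum, and the boolean return value of run() are assumptions made for this example; this is not Tajo's actual CallFuture and not the fix that was committed.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch only; names and behavior are assumptions, not Tajo code. */
final class SketchCallFuture<T> {
  private enum State { PENDING, DONE, CANCELLED }

  private final Object lock = new Object();
  private State state = State.PENDING;
  private T response;

  /**
   * Called by the RPC client thread when a response arrives.
   * Returns false if the waiter has already timed out or cancelled,
   * so the caller can report the undelivered response to the sender.
   */
  boolean run(T value) {
    synchronized (lock) {
      if (state != State.PENDING) {
        return false; // late response: nobody will consume it
      }
      response = value;
      state = State.DONE;
      lock.notifyAll();
      return true;
    }
  }

  /** Explicit cancel; returns false if a response was already accepted. */
  boolean cancel() {
    synchronized (lock) {
      if (state != State.PENDING) {
        return false;
      }
      state = State.CANCELLED;
      lock.notifyAll();
      return true;
    }
  }

  /** Waits for the response; a timeout marks the future as cancelled under the lock. */
  T get(long timeout, TimeUnit unit) throws InterruptedException, TimeoutException {
    long deadline = System.nanoTime() + unit.toNanos(timeout);
    synchronized (lock) {
      while (state == State.PENDING) {
        long remaining = deadline - System.nanoTime();
        if (remaining <= 0) {
          // Same lock as run(), so run() cannot slip in between check and throw.
          state = State.CANCELLED;
          throw new TimeoutException("no response within timeout");
        }
        TimeUnit.NANOSECONDS.timedWait(lock, remaining);
      }
      if (state == State.CANCELLED) {
        throw new CancellationException();
      }
      return response;
    }
  }
}
```

With this shape, step 3-1 no longer loses the task silently: when run() returns false, the RPC client knows the response was not delivered and could notify the QueryMaster so the task can be reassigned. How that notification would be sent is left open here.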
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)