[ 
https://issues.apache.org/jira/browse/TAJO-1469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14481179#comment-14481179
 ] 

ASF GitHub Bot commented on TAJO-1469:
--------------------------------------

Github user babokim commented on the pull request:

    https://github.com/apache/tajo/pull/480#issuecomment-90046382
  
    I investigated the locking in CallFuture. CallFuture should synchronize 
run() and get() with each other. The current code looks as if it does this, 
but it does not. If the following situation occurs, some resources or tasks 
will be lost forever.
    
    1. Worker: TaskRunner sends a GetTask request.
    2. QM: QueryMaster selects a proper task and calls RpcCallback.
    3. Worker: AsyncRpcClient receives the response and calls 
CallFuture.run(response).
    3-1. Worker: If a TimeoutException occurs after step 1, between steps 2 
and 3, TaskRunner never receives the response and doesn't run the allocated 
task, but QM doesn't know about that.
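
    The scenario above can be sketched in plain Java. This is only an 
illustration under assumptions: HandoffFuture is a hypothetical stand-in, not 
Tajo's CallFuture, and its method shapes are guesses made for the example. The 
idea is that run() and get() share one lock, so a timeout and a late response 
cannot both "win", and run() reports whether the response was actually 
delivered.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for CallFuture; not the actual Tajo class. */
final class HandoffFuture<T> {
  private final Object lock = new Object();
  private T response;
  private boolean done;       // run() delivered a response
  private boolean abandoned;  // get() gave up (timeout or interrupt)

  /**
   * Called by the RPC thread when the response arrives.
   * Returns false if the waiter already gave up, so the caller can hand the
   * task or resource back instead of silently losing it.
   */
  boolean run(T value) {
    synchronized (lock) {
      if (abandoned || done) {
        return false;
      }
      response = value;
      done = true;
      lock.notifyAll();
      return true;
    }
  }

  /** Waits for the response; marks the future abandoned if it gives up. */
  T get(long timeout, TimeUnit unit) throws InterruptedException, TimeoutException {
    long deadline = System.nanoTime() + unit.toNanos(timeout);
    synchronized (lock) {
      while (!done) {
        long remaining = deadline - System.nanoTime();
        if (remaining <= 0) {
          abandoned = true;  // decided under the same lock that run() takes
          throw new TimeoutException("no response within the timeout");
        }
        try {
          TimeUnit.NANOSECONDS.timedWait(lock, remaining);
        } catch (InterruptedException e) {
          abandoned = true;
          throw e;
        }
      }
      return response;
    }
  }
}
```

    With something like this, the code that calls run(response) could check 
the return value and report the undelivered task, which is the piece of 
information QM is missing in step 3-1.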
    
    If I am wrong, please let me know.
    If I am right, this patch is a temporary solution and we need to 
create another issue for this problem. 


> allocateQueryMaster can leak resources if it times-out (3sec, hardcoded)
> ------------------------------------------------------------------------
>
>                 Key: TAJO-1469
>                 URL: https://issues.apache.org/jira/browse/TAJO-1469
>             Project: Tajo
>          Issue Type: Bug
>            Reporter: Navis
>            Assignee: Navis
>
> {code}
>     WorkerResourceAllocationResponse response = null;
>     try {
>       response = callFuture.get(3, TimeUnit.SECONDS);
>     } catch (Throwable t) {
>       LOG.error(t, t);
>       return null;
>     }
> {code}
> If it times out (or is interrupted), the allocated resources can never be 
> retrieved.
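
One possible way to avoid the leak described above, sketched with the JDK's 
CompletableFuture rather than Tajo's own future type. TimedAllocation, 
awaitOrRelease and the releaseLate callback are hypothetical names introduced 
for this example. When the wait times out or is interrupted, the sketch 
registers a callback so that a response arriving later is handed to a release 
routine instead of being dropped.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

/** Hypothetical helper; the names are illustrative and not part of Tajo. */
final class TimedAllocation {
  private TimedAllocation() {}

  /**
   * Waits up to the given timeout for an allocation response.
   * Returns the response, or null if none was obtained in time. In the null
   * case caused by a timeout or interrupt, a late response is passed to
   * releaseLate so the allocated resources can be given back.
   */
  static <T> T awaitOrRelease(CompletableFuture<T> future, long timeout, TimeUnit unit,
                              Consumer<? super T> releaseLate) {
    try {
      return future.get(timeout, unit);
    } catch (TimeoutException e) {
      // If the future completed in the meantime, the callback runs right away.
      future.thenAccept(releaseLate);
      return null;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();  // preserve the interrupt status
      future.thenAccept(releaseLate);
      return null;
    } catch (ExecutionException e) {
      return null;  // the allocation itself failed, so there is nothing to release
    }
  }
}
```

Unlike catching Throwable and returning null, this distinguishes the timeout 
and interrupt cases from a failed allocation, and only the first two need the 
cleanup callback.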



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
