Navis created TAJO-1399:
---------------------------

             Summary: TajoResourceAllocator might hang on network error
                 Key: TAJO-1399
                 URL: https://issues.apache.org/jira/browse/TAJO-1399
             Project: Tajo
          Issue Type: Bug
          Components: rpc
            Reporter: Navis
            Assignee: Navis


{code}
CallFuture<WorkerResourceAllocationResponse> callBack = new CallFuture<WorkerResourceAllocationResponse>();

...

RpcConnectionPool connPool = RpcConnectionPool.getPool();
NettyClientBase tmClient = null;
try {
  ServiceTracker serviceTracker =
      queryTaskContext.getQueryMasterContext().getWorkerContext().getServiceTracker();
  tmClient = connPool.getConnection(serviceTracker.getUmbilicalAddress(),
      QueryCoordinatorProtocol.class, true);
  QueryCoordinatorProtocolService masterClientService = tmClient.getStub();
  masterClientService.allocateWorkerResources(null, request, callBack);
} catch (Throwable e) {
  LOG.error(e.getMessage(), e);
} finally {
  connPool.releaseConnection(tmClient);
}

WorkerResourceAllocationResponse response = null;
while(!stopped.get()) {
  try {
    response = callBack.get(3, TimeUnit.SECONDS);
    ...
{code}

If "callBack" is never registered with Netty (for example, because the
connection failed), the exception is only logged and the allocator thread then
waits on a future that can never complete. The thread may block forever,
which can leak it.
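
One possible direction, sketched under assumptions rather than as the actual
fix: fail the future as soon as dispatch throws, and bound the number of timed
waits. Everything below is hypothetical and independent of Tajo.
CompletableFuture stands in for CallFuture, and the names AllocatorWaitSketch,
Sender, dispatch and awaitResponse are invented for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch, not Tajo code: CompletableFuture plays the role of CallFuture.
class AllocatorWaitSketch {

  /** Hypothetical sender; throwing from send() models a failed connection. */
  interface Sender<T> {
    void send(CompletableFuture<T> callback) throws Exception;
  }

  /** Dispatches the request; on failure, fails the future so no waiter blocks on it. */
  static <T> CompletableFuture<T> dispatch(Sender<T> sender) {
    CompletableFuture<T> callback = new CompletableFuture<>();
    try {
      sender.send(callback);
    } catch (Exception e) {
      // Propagate the failure to the waiter instead of only logging it.
      callback.completeExceptionally(e);
    }
    return callback;
  }

  /** Waits in short slices; returns null if stopped, failed, interrupted, or out of attempts. */
  static <T> T awaitResponse(CompletableFuture<T> callback, AtomicBoolean stopped,
                             long sliceMillis, int maxAttempts) {
    for (int attempt = 0; attempt < maxAttempts && !stopped.get(); attempt++) {
      try {
        return callback.get(sliceMillis, TimeUnit.MILLISECONDS);
      } catch (TimeoutException e) {
        // No response yet; try again until the attempt limit is reached.
      } catch (ExecutionException e) {
        return null; // dispatch failed or the call completed with an error
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return null;
      }
    }
    return null;
  }
}
```

With this shape, a failed dispatch makes the first get() return immediately
with an ExecutionException, and a lost response is abandoned after
maxAttempts slices instead of being retried until "stopped" is set.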



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
