Navis created TAJO-1399:
---------------------------
Summary: TajoResourceAllocator might hang on network error
Key: TAJO-1399
URL: https://issues.apache.org/jira/browse/TAJO-1399
Project: Tajo
Issue Type: Bug
Components: rpc
Reporter: Navis
Assignee: Navis
{code}
CallFuture<WorkerResourceAllocationResponse> callBack =
    new CallFuture<WorkerResourceAllocationResponse>();
...
RpcConnectionPool connPool = RpcConnectionPool.getPool();
NettyClientBase tmClient = null;
try {
  ServiceTracker serviceTracker =
      queryTaskContext.getQueryMasterContext().getWorkerContext().getServiceTracker();
  tmClient = connPool.getConnection(serviceTracker.getUmbilicalAddress(),
      QueryCoordinatorProtocol.class, true);
  QueryCoordinatorProtocolService masterClientService = tmClient.getStub();
  masterClientService.allocateWorkerResources(null, request, callBack);
} catch (Throwable e) {
  LOG.error(e.getMessage(), e);
} finally {
  connPool.releaseConnection(tmClient);
}
WorkerResourceAllocationResponse response = null;
while(!stopped.get()) {
  try {
    response = callBack.get(3, TimeUnit.SECONDS);
    ...
{code}
If "callBack" is never registered with Netty (for example, because the
connection failed), the catch block only logs the error and the allocator
thread then blocks forever on a future that will never complete, possibly
leaking the thread.
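
One possible direction, sketched below under stated assumptions, is to fail the future as soon as the send throws, so the waiting loop wakes up instead of polling an empty future. This is not the actual Tajo patch: java.util.concurrent.CompletableFuture stands in for Tajo's CallFuture, and the names AllocatorWaitSketch, RpcSend and sendAndWait are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

public class AllocatorWaitSketch {

  /** Hypothetical stand-in for the RPC send; throws if the connection fails. */
  public interface RpcSend<T> {
    void send(CompletableFuture<T> callback) throws Exception;
  }

  /**
   * Sends the request and waits for the response. Returns null if the send
   * failed, the call completed with an error, or the allocator was stopped.
   */
  public static <T> T sendAndWait(RpcSend<T> rpc, AtomicBoolean stopped, long pollMillis)
      throws InterruptedException {
    // Assumption: CompletableFuture plays the role of Tajo's CallFuture here.
    CompletableFuture<T> callback = new CompletableFuture<>();
    try {
      rpc.send(callback);
    } catch (Exception e) {
      // Key step: fail the future instead of only logging, so the loop
      // below does not wait on a callback that was never registered.
      callback.completeExceptionally(e);
    }
    while (!stopped.get()) {
      try {
        return callback.get(pollMillis, TimeUnit.MILLISECONDS);
      } catch (TimeoutException e) {
        // Still pending: loop again and re-check the stopped flag.
      } catch (ExecutionException e) {
        // The send or the call itself failed: give up instead of retrying.
        return null;
      }
    }
    return null;
  }

  public static void main(String[] args) throws InterruptedException {
    AtomicBoolean stopped = new AtomicBoolean(false);
    // Successful send: the callback is completed with a response.
    String ok = sendAndWait(cb -> cb.complete("allocated"), stopped, 100);
    // Failed send: the exception is propagated into the future.
    String failed = sendAndWait(cb -> {
      throw new java.io.IOException("connection refused");
    }, stopped, 100);
    System.out.println(ok);
    System.out.println(failed);
  }
}
```

In this sketch the failed send returns immediately rather than blocking until "stopped" is set. Whether the real allocator should give up, retry the connection, or report the failure upstream is a separate design decision that the sketch does not settle.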
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)