Stephan Erb created AURORA-1533:
-----------------------------------
Summary: Transient connection errors can leave client in
irrecoverable state
Key: AURORA-1533
URL: https://issues.apache.org/jira/browse/AURORA-1533
Project: Aurora
Issue Type: Bug
Reporter: Stephan Erb
Priority: Minor
During a cluster update, some of our schedulers returned an unknown error to
connecting clients ([relevant
code|https://github.com/apache/aurora/blob/b712d577364f6b1613b54ba696bac4ddc255ae58/src/main/python/apache/aurora/client/api/scheduler_client.py#L268]).
Long-running clients failed to recover from these errors because the code
assumed the connection was already established. Subsequent scheduling calls
therefore failed with the following exception:
{code}
File "venv/local/lib/python2.7/site-packages/apache/aurora/client/api/__init__.py" in query_no_configs
  140. raise self.ThriftInternalError(e.args[0])
Exception Type: ThriftInternalError
Exception Value: Error during thrift call getTasksWithoutConfigs to testcluster: 'NoneType' object has no attribute 'getTasksWithoutConfigs'
{code}
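To make the failure mode concrete, here is a minimal, self-contained sketch. It is not the actual scheduler_client.py code: FlakyTransport, FakeSchedulerStub, NaiveProxy and RecoveringProxy are hypothetical names invented for illustration, and the "connected" flag is only one assumed way a client could end up believing it has a connection when it does not.

```python
class TransientConnectError(Exception):
    """Hypothetical stand-in for a transient scheduler-side failure."""


class FakeSchedulerStub(object):
    """Hypothetical stub exposing the one call seen in the traceback."""

    def getTasksWithoutConfigs(self, query):
        return "tasks for %s" % query


class FlakyTransport(object):
    """Hypothetical transport whose first `failures` connects raise."""

    def __init__(self, failures):
        self.failures = failures

    def connect(self):
        if self.failures > 0:
            self.failures -= 1
            raise TransientConnectError("scheduler unavailable")
        return FakeSchedulerStub()


class NaiveProxy(object):
    """Assumed buggy pattern: marks itself connected before connect() succeeds."""

    def __init__(self, transport):
        self._transport = transport
        self._client = None
        self._connected = False

    def client(self):
        if not self._connected:
            # Flag is set too early, so it survives a failed connect().
            self._connected = True
            self._client = self._transport.connect()
        return self._client


class RecoveringProxy(NaiveProxy):
    """Possible fix: only treat a real client object as 'connected'."""

    def client(self):
        if self._client is None:
            # Retries on the next call if the previous connect() raised.
            self._client = self._transport.connect()
        return self._client


def call_twice(proxy):
    """First call hits the transient error, second call is the retry."""
    try:
        proxy.client().getTasksWithoutConfigs("job")
    except TransientConnectError:
        pass  # the web service would report this request as failed
    return proxy.client().getTasksWithoutConfigs("job")
```

With NaiveProxy(FlakyTransport(failures=1)), call_twice raises an AttributeError on 'NoneType' for every later call, which matches the shape of the exception above. With RecoveringProxy the second call reconnects and succeeds. A real fix would likely also need to reset the cached client when an established connection breaks, which this sketch does not cover.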
Background: We are using the Python client to dispatch calls to Aurora from
within a long-running web service. The connection is kept open as long as the
web service is running.