[
https://issues.apache.org/jira/browse/IMPALA-7238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531704#comment-16531704
]
Joe McDonnell commented on IMPALA-7238:
---------------------------------------
This looks very similar to IMPALA-4454. By default, the connection times out
after 45 seconds. The impalad logs show that creating the database took longer
than that, roughly 47.5 seconds between analysis and the response:
{noformat}
I0628 06:22:53.785679 23644 Frontend.java:1014] Analyzing query: CREATE
DATABASE testcreateexternaltable_23808_vu8cqo
...
I0628 06:23:41.316967 23644 ImpaladCatalog.java:178] Adding:
DATABASE:testcreateexternaltable_23808_vu8cqo version: 4854 size: 157
I0628 06:23:41.318166 23644 impala-hs2-server.cc:475] ExecuteStatement():
return_val=TExecuteStatementResp {
01: status (struct) = TStatus {
01: statusCode (i32) = 0,
},
02: operationHandle (struct) = TOperationHandle {
01: operationId (struct) = THandleIdentifier {
01: guid (string) = "\x9d?^7\x1awG\xfd\x00\x00\x00\x00(\xed\xa7\x03",
02: secret (string) = "\x9d?^7\x1awG\xfd\x00\x00\x00\x00(\xed\xa7\x03",
},
02: operationType (i32) = 0,
03: hasResultSet (bool) = true,
},
}
I0628 06:23:41.318212 23644 impala-server.cc:1907] Connection from client
::1:52744 closed, closing 1 associated session(s)
I0628 06:23:41.318222 23644 impala-server.cc:1086] UnregisterQuery():
query_id=fd47771a375e3f9d:3a7ed2800000000
I0628 06:23:41.318224 23644 impala-server.cc:1173] Cancel():
query_id=fd47771a375e3f9d:3a7ed2800000000
...
I0628 06:23:46.593343 27496 impala-server.cc:1194] CloseSessionInternal():
Invalid session id: f54064f9a4604f23:fb686144269fc8b1
I0628 06:23:46.593348 27496 impala-server.cc:1194] CloseSessionInternal():
Invalid session id: f54064f9a4604f23:fb686144269fc8b1{noformat}
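As a quick check of the timing, here is a minimal sketch that computes the gap
between the "Analyzing query" line and the ExecuteStatement() response line in
the excerpt above and compares it with the 45 second default. The helper names
are made up for illustration, and the year is an assumption (glog timestamps do
not carry one); it does not affect the difference.

```python
from datetime import datetime

# Timestamps copied from the impalad log excerpt (glog format: MMDD HH:MM:SS.ffffff).
ANALYZE_START = "0628 06:22:53.785679"   # Frontend.java: Analyzing query
RESPONSE_SENT = "0628 06:23:41.318166"   # impala-hs2-server.cc: ExecuteStatement()

# The default client connection timeout mentioned in the comment.
CLIENT_TIMEOUT_S = 45.0


def parse_glog_time(stamp, year=2018):
    """Parse a glog timestamp. The year is an assumption; glog omits it."""
    return datetime.strptime("%d %s" % (year, stamp), "%Y %m%d %H:%M:%S.%f")


def elapsed_seconds(start, end):
    """Return the number of seconds between two glog timestamps."""
    return (parse_glog_time(end) - parse_glog_time(start)).total_seconds()


elapsed = elapsed_seconds(ANALYZE_START, RESPONSE_SENT)
exceeded = elapsed > CLIENT_TIMEOUT_S
print("elapsed=%.3fs exceeded_timeout=%s" % (elapsed, exceeded))
```

The statement took about 47.5 seconds on the server, so a client that waits
only 45 seconds has already stopped listening when the response is sent.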
When the client times out, the database isn't cleaned up (or the statement is
retried), so later attempts keep failing with the "already exists" error. One
fix is to use a higher connection timeout for this test, similar to
IMPALA-4454.
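To illustrate the suspected failure mode, here is a toy sketch using only the
Python standard library. It is not Impala or impyla code: the socket pair, the
in-memory set standing in for the Hive Metastore, and the function name are all
hypothetical. It shows how a client-side read timeout can leave a database
behind on the server, so that a second create with the same name fails.

```python
import socket

# Hypothetical stand-ins: a connected socket pair plays the client and impalad,
# and a set plays the Hive Metastore's list of databases.
client, server = socket.socketpair()
metastore_dbs = set()


def server_create_database(name):
    """Toy server-side CREATE DATABASE that fails if the name is taken."""
    if name in metastore_dbs:
        return b"AlreadyExistsException"
    metastore_dbs.add(name)
    return b"OK"


# First attempt: the client sends the request but gives up waiting before the
# server has replied (a very short timeout stands in for the 45 seconds).
client.settimeout(0.05)
client.sendall(b"demo_db")
try:
    client.recv(64)
    first_attempt = "ok"
except socket.timeout:
    first_attempt = "timed out"

# The server still completes the request after the client stopped waiting.
# Its reply is never delivered, so the client never learns the database exists.
server_create_database(server.recv(64).decode())

# Second attempt with the same name: the server answers promptly this time,
# but the database left over from the first attempt makes the create fail.
client.sendall(b"demo_db")
server.sendall(server_create_database(server.recv(64).decode()))
second_attempt = client.recv(64).decode()

print(first_attempt, second_attempt, sorted(metastore_dbs))

client.close()
server.close()
```

With a timeout longer than the time the server needs, the first attempt would
receive its reply and the leftover database would not go unnoticed, which is
the reasoning behind raising the timeout for this test.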
> test_kudu.TestCreateExternalTable sees unique database already exists
> ---------------------------------------------------------------------
>
> Key: IMPALA-7238
> URL: https://issues.apache.org/jira/browse/IMPALA-7238
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Affects Versions: Impala 3.1.0
> Reporter: Joe McDonnell
> Priority: Critical
> Labels: broken-build, flaky
>
> All of the tests from query_test.test_kudu.TestCreateExternalTable fail with
> an error like:
> {noformat}
> /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py:704:
> in err_if_rpc_not_ok
> raise HiveServer2Error(resp.status.errorMessage)
> E HiveServer2Error: ImpalaRuntimeException: Error making 'createDatabase'
> RPC to Hive Metastore:
> E CAUSED BY: AlreadyExistsException: Database
> testcreateexternaltable_23808_vu8cqo already exists{noformat}
> It looks like the failures all happen at once in a single process. The first
> test to fail is test_kudu.TestCreateExternalTable.test_col_types. It takes 52
> seconds, whereas all the other tests fail almost immediately. It also has an
> extra error on stderr:
> {noformat}
> -- connecting to: localhost:21000
> MainThread: Failed to open transport (tries_left=3)
> Traceback (most recent call last):
> File
> "/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py",
> line 940, in _execute
> return func(request)
> File
> "/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py",
> line 265, in ExecuteStatement
> return self.recv_ExecuteStatement()
> File
> "/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py",
> line 276, in recv_ExecuteStatement
> (fname, mtype, rseqid) = self._iprot.readMessageBegin()
> File
> "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py",
> line 126, in readMessageBegin
> sz = self.readI32()
> File
> "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py",
> line 206, in readI32
> buff = self.trans.readAll(4)
> File
> "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py",
> line 58, in readAll
> chunk = self.read(sz - have)
> File
> "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py",
> line 159, in read
> self.__rbuf = StringIO(self.__trans.read(max(sz, self.__rbuf_size)))
> File
> "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSocket.py",
> line 105, in read
> buff = self.handle.recv(sz)
> timeout: timed out
> MainThread: Error closing Impala cursor: Invalid session id:
> f54064f9a4604f23:fb686144269fc8b1{noformat}
> The other failures don't have this.
> This happened only once, so it is definitely intermittent. This has some
> similarity to IMPALA-6933, but this looks like a repeated failure in a single
> process, not a concurrency issue.