driazati opened a new issue, #13205: URL: https://github.com/apache/tvm/issues/13205
Seen on main in https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/4580/tests/. The RPC server name is the same for every failing test, and the tests all failed on the same shard.

```
failed on setup with "RuntimeError: Cannot request hexagon-dev.5788 after 5 retry, last_error:Traceback (most recent call last):
  5: TVMFuncCall
        at /workspace/src/runtime/c_runtime_api.cc:477
  4: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
        at /workspace/include/tvm/runtime/packed_func.h:1217
  3: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::$_0> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
        at /workspace/include/tvm/runtime/packed_func.h:1213
  2: tvm::runtime::$_0::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
        at /workspace/src/runtime/rpc/rpc_socket_impl.cc:132
  1: tvm::runtime::RPCClientConnect(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, tvm::runtime::TVMArgs)
        at /workspace/src/runtime/rpc/rpc_socket_impl.cc:112
  0: tvm::runtime::RPCConnect(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, tvm::runtime::TVMArgs)
        at /workspace/src/runtime/rpc/rpc_socket_impl.cc:72
  File "/workspace/src/runtime/rpc/rpc_socket_impl.cc", line 72
TVMError:
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: (sock.Connect(addr)) is false: Connect to 127.0.0.1:65535 failed"

Stacktrace
request = <FixtureRequest for <Function test_reduce_map[in_shape0-0-False-argmax-float32]>>

    def fill(request):
        item = request._pyfuncitem
        fixturenames = getattr(item, "fixturenames", None)
        if fixturenames is None:
            fixturenames = request.fixturenames

        if hasattr(item, 'callspec'):
            for param, val in sorted_by_dependency(item.callspec.params, fixturenames):
                if val is not None and is_lazy_fixture(val):
                    item.callspec.params[param] = request.getfixturevalue(val.name)
                elif param not in item.funcargs:
>                   item.funcargs[param] = request.getfixturevalue(param)

/venv/apache-tvm-py3.8/lib/python3.8/site-packages/pytest_lazyfixture.py:37:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
python/tvm/contrib/hexagon/pytest_plugin.py:278: in hexagon_session
    with hexagon_launcher.create_session() as session:
python/tvm/contrib/hexagon/session.py:109: in __enter__
    raise exception
python/tvm/contrib/hexagon/session.py:92: in __enter__
    self._rpc = tracker.request(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <tvm.rpc.client.TrackerSession object at 0x7f14670bc880>
key = 'hexagon-dev.5788', priority = 0, session_timeout = 0, max_retry = 5
session_constructor_args = ['tvm.contrib.hexagon.create_hexagon_session', 'hexagon-rpc', 262144, '', 268435456]

    def request(
        self, key, priority=1, session_timeout=0, max_retry=5, session_constructor_args=None
    ):
        """Request a new connection from the tracker.

        Parameters
        ----------
        key : str
            The type key of the device.

        priority : int, optional
            The priority of the request.

        session_timeout : float, optional
            The duration of the session, allows server to kill
            the connection when duration is longer than this value.
            When duration is zero, it means the request must always be kept alive.

        max_retry : int, optional
            Maximum number of times to retry before give up.

        session_constructor_args : list, optional
            List of additional arguments to passed as the remote session constructor.
            The first element of the list is always a string specifying the name of
            the session constructor, the following args are the positional args to that function.
        """
        last_err = None
        for _ in range(max_retry):
            try:
                if self._sock is None:
                    self._connect()
                base.sendjson(self._sock, [base.TrackerCode.REQUEST, key, "", priority])
                value = base.recvjson(self._sock)
                if value[0] != base.TrackerCode.SUCCESS:
                    raise RuntimeError("Invalid return value %s" % str(value))
                url, port, matchkey = value[1]
                return connect(
                    url,
                    port,
                    matchkey,
                    session_timeout,
                    session_constructor_args=session_constructor_args,
                )
            except socket.error as err:
                self.close()
                last_err = err
            except TVMError as err:
                last_err = err
>       raise RuntimeError(
            "Cannot request %s after %d retry, last_error:%s" % (key, max_retry, str(last_err))
        )
E       RuntimeError: Cannot request hexagon-dev.5788 after 5 retry, last_error:Traceback (most recent call last):
E         5: TVMFuncCall
E               at /workspace/src/runtime/c_runtime_api.cc:477
E         4: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
E               at /workspace/include/tvm/runtime/packed_func.h:1217
E         3: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::$_0> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
E               at /workspace/include/tvm/runtime/packed_func.h:1213
E         2: tvm::runtime::$_0::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
E               at /workspace/src/runtime/rpc/rpc_socket_impl.cc:132
E         1: tvm::runtime::RPCClientConnect(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, tvm::runtime::TVMArgs)
E               at /workspace/src/runtime/rpc/rpc_socket_impl.cc:112
E         0: tvm::runtime::RPCConnect(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, tvm::runtime::TVMArgs)
E               at /workspace/src/runtime/rpc/rpc_socket_impl.cc:72
E         File "/workspace/src/runtime/rpc/rpc_socket_impl.cc", line 72
E       TVMError:
E       ---------------------------------------------------------------
E       An error occurred during the execution of TVM.
E       For more information, please see: https://tvm.apache.org/docs/errors.html
E       ---------------------------------------------------------------
E         Check failed: (sock.Connect(addr)) is false: Connect to 127.0.0.1:65535 failed

python/tvm/rpc/client.py:416: RuntimeError
```
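For triage, the final check in the trace (`sock.Connect(addr)` failing for 127.0.0.1:65535) can be probed outside TVM. The snippet below is only a sketch using the Python standard library; `can_connect` is a hypothetical helper name and not part of TVM or the Hexagon test plugin, and it assumes it is run on the same CI node while the RPC server is expected to be up.

```python
import socket


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical triage helper, not part of TVM: it only checks whether
    something is listening at the address the tracker handed back.
    """
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError,
        # timeout) when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Address taken from the failing check in the trace above.
    print(can_connect("127.0.0.1", 65535))
```

If this reports `False` while the tracker still hands out that address for `hexagon-dev.5788`, that would suggest the server registered with the tracker is no longer listening, which would be consistent with every failing test hitting the same server name on the same shard.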
