Hi Alexander,
On 6/4/21 1:46 PM, Alexander Grund wrote:
======================================================================
ERROR: test_process_group_as_module_member (__main__.C10dProcessGroupSerialization)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/eb-3tUIrb/tmpFJsxiC/lib/python3.8/site-packages/torch/testing/_internal/common_distributed.py", line 146, in wrapper
    return func(*args, **kwargs)
  File "distributed/test_jit_c10d.py", line 228, in test_process_group_as_module_member
    self.checkModule(TestModule(), (torch.rand((2, 3)),))
  File "distributed/test_jit_c10d.py", line 216, in __init__
    tcp_store = _create_tcp_store()
  File "distributed/test_jit_c10d.py", line 40, in _create_tcp_store
    return torch.classes.dist_c10d.TCPStore(addr, port, 1, True, timeout_millisecond)
RuntimeError: Address already in use
Please report that issue in the PyTorch GitHub repo. This is something
they need to fix, or at least investigate, first.
Thanks for your recommendation! The issue is:
https://github.com/pytorch/pytorch/issues/59441
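For context, here is a minimal sketch of how "Address already in use" arises at the socket level, and how a test can ask the OS for a free port. It is not taken from the PyTorch test suite; the helper names find_free_port and port_in_use are hypothetical, and I have not verified how test_jit_c10d.py actually chooses its port.

```python
import errno
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))  # port 0 tells the OS to pick any free port
        return s.getsockname()[1]


def port_in_use(port, host="127.0.0.1"):
    """Return True if binding to (host, port) fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # any other failure is unrelated, so re-raise it
        return False
```

Note that a port returned by find_free_port can still be taken by another process before it is used, so this narrows the race but does not remove it.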
/Ole