leezu commented on issue #18564:
URL: https://github.com/apache/incubator-mxnet/issues/18564#issuecomment-700907614


   Also flaky on CI. @ArmageddonKnight, can you take a look at why the test is causing a segfault?
   
   ```
   [2020-09-29T17:47:32.445Z] tests/python/gpu/test_profiler_gpu.py::test_gpu_memory_profiler_gluon
   [2020-09-29T17:47:32.445Z] Fatal Python error: Segmentation fault
   [2020-09-29T17:47:32.445Z]
   [2020-09-29T17:47:32.445Z] Thread 0x00007f1161e1c700 (most recent call first):
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/execnet/gateway_base.py", line 400 in read
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/execnet/gateway_base.py", line 432 in from_io
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/execnet/gateway_base.py", line 967 in _thread_receiver
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/execnet/gateway_base.py", line 220 in run
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/execnet/gateway_base.py", line 285 in _perform_spawn
   [2020-09-29T17:47:32.445Z] 
   [2020-09-29T17:47:32.445Z] Current thread 0x00007f11633a4740 (most recent call first):
   [2020-09-29T17:47:32.445Z]   File "/work/mxnet/python/mxnet/ndarray/ndarray.py", line 2907 in backward
   [2020-09-29T17:47:32.445Z]   File "/work/mxnet/tests/python/gpu/test_profiler_gpu.py", line 129 in test_gpu_memory_profiler_gluon
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/_pytest/python.py", line 167 in pytest_pyfunc_call
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/callers.py", line 187 in _multicall
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 87 in <lambda>
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 93 in _hookexec
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/hooks.py", line 286 in __call__
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/_pytest/python.py", line 1445 in runtest
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 134 in pytest_runtest_call
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/callers.py", line 187 in _multicall
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 87 in <lambda>
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 93 in _hookexec
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/hooks.py", line 286 in __call__
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 210 in <lambda>
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 237 in from_call
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 210 in call_runtest_hook
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/flaky/flaky_pytest_plugin.py", line 129 in call_and_report
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 99 in runtestprotocol
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 84 in pytest_runtest_protocol
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/flaky/flaky_pytest_plugin.py", line 92 in pytest_runtest_protocol
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/callers.py", line 187 in _multicall
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 87 in <lambda>
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 93 in _hookexec
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/hooks.py", line 286 in __call__
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/xdist/remote.py", line 87 in run_one_test
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/xdist/remote.py", line 70 in pytest_runtestloop
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/callers.py", line 187 in _multicall
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 87 in <lambda>
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 93 in _hookexec
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/hooks.py", line 286 in __call__
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/_pytest/main.py", line 247 in _main
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/_pytest/main.py", line 197 in wrap_session
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/_pytest/main.py", line 240 in pytest_cmdline_main
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/callers.py", line 187 in _multicall
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 87 in <lambda>
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 93 in _hookexec
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/pluggy/hooks.py", line 286 in __call__
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/xdist/remote.py", line 258 in <module>
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/execnet/gateway_base.py", line 1084 in executetask
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/execnet/gateway_base.py", line 220 in run
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/execnet/gateway_base.py", line 285 in _perform_spawn
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/execnet/gateway_base.py", line 267 in integrate_as_primary_thread
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/execnet/gateway_base.py", line 1060 in serve
   [2020-09-29T17:47:32.445Z]   File "/usr/local/lib/python3.6/dist-packages/execnet/gateway_base.py", line 1554 in serve
   [2020-09-29T17:47:32.445Z]   File "<string>", line 8 in <module>
   [2020-09-29T17:47:32.445Z]   File "<string>", line 1 in <module>
   [2020-09-29T17:47:32.445Z] tests/python/gpu/test_profiler_gpu.py::test_gpu_memory_profiler_symbolic
   [2020-09-29T17:47:32.699Z] [gw0] [ 90%] PASSED tests/python/gpu/test_profiler_gpu.py::test_gpu_memory_profiler_symbolic
   [2020-09-29T17:47:32.699Z] tests/python/gpu/test_profiler_gpu.py::test_profile_create_domain
   [2020-09-29T17:47:32.699Z] [gw0] [ 90%] PASSED tests/python/gpu/test_profiler_gpu.py::test_profile_create_domain
   [2020-09-29T17:47:32.699Z] [gw3] [ 90%] PASSED tests/python/gpu/test_gluon_gpu.py::test_cosine_loss[False]
   [2020-09-29T17:47:32.699Z] [gw1] node down: Not properly terminated
   [2020-09-29T17:47:32.699Z] [gw1] [ 91%] FAILED tests/python/gpu/test_profiler_gpu.py::test_gpu_memory_profiler_gluon
   [2020-09-29T17:47:32.699Z]
   [2020-09-29T17:47:32.699Z] replacing crashed worker gw1
   ```
   
   
https://jenkins.mxnet-ci.amazon-ml.com/blue/rest/organizations/jenkins/pipelines/mxnet-validation/pipelines/unix-gpu/branches/PR-19185/runs/2/nodes/277/steps/307/log/?start=0
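
Since the crash is intermittent, one way to estimate how often it happens outside CI might be to rerun the single test in fresh processes and count the runs killed by SIGSEGV. This is only a rough sketch, not something from the CI setup: the helper name `count_segfaults` and the default of 20 runs are made up for illustration, and it assumes a POSIX system, where `subprocess` reports a child killed by signal N as return code -N.

```python
import signal
import subprocess
import sys


def count_segfaults(cmd, runs=20):
    """Run `cmd` in a fresh process `runs` times; return how many runs died from SIGSEGV.

    Hypothetical helper for illustration only (POSIX-specific).
    """
    crashes = 0
    for _ in range(runs):
        # stderr is left attached so any faulthandler dump stays visible.
        proc = subprocess.run(cmd, stdout=subprocess.DEVNULL)
        # On POSIX a negative return code -N means the child was killed by signal N.
        if proc.returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes
```

For the test in the log above, the command could be something like `[sys.executable, "-X", "faulthandler", "-m", "pytest", "tests/python/gpu/test_profiler_gpu.py::test_gpu_memory_profiler_gluon"]`, run without xdist so the segfault takes down the pytest process itself and shows up in the return code.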

