lebeg commented on issue #12443: Revert "Subgraph API for integrating accelerators with MXNet (#12157)"
URL: https://github.com/apache/incubator-mxnet/pull/12443#issuecomment-418138778

My reasoning was as follows. The Git history for this file shows that this was the last commit to touch it:

```
commit a64cf7d9c8c1c473e201b5bd68ab9af6bf7365ba
Author: reminisce <[email protected]>
Date:   Thu Aug 30 19:13:33 2018 -0700

    Subgraph API for integrating accelerators with MXNet (#12157)

commit 2193819d40792d0526118819b991111e7ac4162d
Author: Sam Skalicky <[email protected]>
Date:   Sun Aug 12 12:43:19 2018 -0700

    [MXNET-788] Fix for issue #11733 pooling op test (#12067)
```

The build that failed was from 03-Sep-2018 06:00. It showed multiple errors that had not been seen before and that are probably related to an inconsistent state of CUDA memory:

```
test_operator_gpu.test_countsketch ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=104987558 to reproduce.
ERROR
test_operator_gpu.test_sparse_nd_basic ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=2134146737 to reproduce.
ERROR
test_operator_gpu.test_exc_multiple_waits ... ok
test_operator_gpu.test_lstm_bidirectional ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=200476953 to reproduce.
ERROR
test_operator_gpu.test_sparse_nd_setitem ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=2082345391 to reproduce.
ERROR
test_operator_gpu.test_exc_post_fail ... ok
test_operator_gpu.test_gru_sym ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1532640391 to reproduce.
ERROR
test_operator_gpu.test_exc_mutable_var_fail ... ok
test_operator_gpu.test_sparse_nd_slice ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1828661033 to reproduce.
ERROR
test_operator_gpu.test_ndarray_elementwise ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1460065938 to reproduce.
ERROR
test_operator_gpu.test_gru_bidirectional ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=16762643 to reproduce.
ERROR
test_operator_gpu.test_ndarray_elementwisesum ... [06:59:47] src/operator/tensor/./.././../common/../operator/mxnet_op.h:622: Check failed: (err) == (cudaSuccess) Name: mxnet_generic_kernel ErrStr:an illegal memory access was encountered
/work/runtime_functions.sh: line 639: 8 Aborted (core dumped) nosetests-2.7 $NOSE_COVERAGE_ARGUMENTS --with-xunit --xunit-file nosetests_gpu.xml --verbose tests/python/gpu
```

I then looked at which test had been executed just before:

```
test_operator_gpu.test_exc_imperative ... ok
test_operator_gpu.test_subgraph_exe ... [06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp0. Excluding nodes _plus0, and retrying
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp0. Excluding nodes _plus0, and retrying
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp0. Excluding nodes _plus0, and retrying
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp0. Excluding nodes _plus0, and retrying
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp0. Excluding nodes _plus0, and retrying
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp0. Excluding nodes _plus0, and retrying
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp0. Excluding nodes _plus0, and retrying
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp0. Excluding nodes _plus0, and retrying
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp1. Excluding nodes _plus1, and retrying
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp1. Excluding nodes _plus1, and retrying
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp1. Excluding nodes _plus1, and retrying
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp1. Excluding nodes _plus1, and retrying
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp1. Excluding nodes _plus1, and retrying
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp1. Excluding nodes _plus1, and retrying
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp1. Excluding nodes _plus1, and retrying
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp1. Excluding nodes _plus1, and retrying
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp1. Excluding nodes _plus1, and retrying
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp1. Excluding nodes _plus1, and retrying
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp1. Excluding nodes _plus1, and retrying
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp1. Excluding nodes _plus1, and retrying
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp1. Excluding nodes _plus1, and retrying
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp1. Excluding nodes _plus1, and retrying
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp1. Excluding nodes _plus1, and retrying
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node exp1. Excluding nodes _plus1, and retrying
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:741: The graph has no attribute of subgraph_property attached. The original graph is returned.
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:741: The graph has no attribute of subgraph_property attached. The original graph is returned.
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node sin3. Excluding nodes _plus3, and retrying
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node sin3. Excluding nodes _plus3, and retrying
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node sin3. Excluding nodes _plus3, and retrying
[06:59:45] src/executor/graph_executor.cc:1486: SubgraphPropertyOpNameSet for subgraph property default has been assigned a value. Please make sure it is initialized only for the testing purpose.
[06:59:45] src/operator/subgraph/partition_graph.cc:335: Found a cycle when BFS from node sin3. Excluding nodes _plus3, and retrying
```

All of this made me think the issue might be related to the PR mentioned above, #12157.
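
For anyone who wants to repeat the "which test ran just before the errors started" step on a long log, here is a minimal sketch. It assumes verbose nosetests output, where each test starts with `<module>.test_<name> ...` and its result (`ok`, `ERROR` or `FAIL`) is the last word before the next test starts. The helper names and the sample log are made up for illustration and are not taken from the CI output above.

```python
import re

# Assumption: a test starts with "<module>.test_<name> ..." (verbose nosetests style).
TEST_START = re.compile(r"(\S+\.test_\w+) \.\.\.")
# Assumption: the result is the last word of the text belonging to that test.
STATUS = re.compile(r"\b(ok|ERROR|FAIL)\s*$")


def parse_results(log_text):
    """Return (test_name, status) pairs in log order; status is None if no result was found."""
    starts = list(TEST_START.finditer(log_text))
    results = []
    for i, match in enumerate(starts):
        # The text of one test runs until the next test starts (or the log ends).
        end = starts[i + 1].start() if i + 1 < len(starts) else len(log_text)
        status = STATUS.search(log_text[match.end():end])
        results.append((match.group(1), status.group(1) if status else None))
    return results


def last_before_first_error(results):
    """Return the name of the test listed right before the first ERROR, or None."""
    for i, (_, status) in enumerate(results):
        if status == "ERROR":
            return results[i - 1][0] if i > 0 else None
    return None


# Hypothetical, shortened log used only to exercise the helpers.
sample_log = (
    "suite.test_alpha ... ok\n"
    "suite.test_beta ... [12:00:00] some warning printed by the library\n"
    "suite.test_gamma ... [INFO] seed message\n"
    "ERROR\n"
)

results = parse_results(sample_log)
suspect = last_before_first_error(results)
print(suspect)
```

This only narrows down where to look. A test that shows up right before the first `ERROR` is a suspect, not proof of the cause.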
