unrealwill opened a new issue #8029: Bug on sum
URL: https://github.com/apache/incubator-mxnet/issues/8029

## Environment info
Operating System: Ubuntu 16.04
Package used (Python/R/Scala/Julia): Python
MXNet version: '0.11.0' (pip3 install mxnet-cu80==0.11.0)

I got a very strange crash in my code. Even though the crash points to a specific line of code, you will probably have a hard time with this bug (it is probably a bug in the core of the graph). Have fun :)

I managed to reproduce the crash with the simplified version below:

```
import mxnet as mx
import numpy as np
from mxnet.io import DataDesc


def bugBroadcast():
    bs = 16
    inpsize = 784
    featsize = 128

    data = mx.symbol.Variable('data')
    bin = mx.symbol.FullyConnected(data=data, num_hidden=featsize)

    m0 = bin + bin + bin
    # m0 = bin + bin  # this works
    m1 = bin + bin + bin
    # m1 = bin + bin  # this works

    cum0 = mx.symbol.sum(m0)
    cum1 = mx.symbol.sum(m1)
    s1 = cum0 + cum1

    diff = bin + bin
    # diff = bin  # this works
    s2 = mx.symbol.sum(diff)

    res = s1 + s2

    out = mx.symbol.MakeLoss(res)  # this doesn't work
    # out = mx.symbol.MakeLoss(mx.symbol.BlockGrad(res))  # this works
    # out = mx.symbol.MakeLoss(s1)  # this works
    # out = mx.symbol.MakeLoss(s2)  # this works

    # also crashes with context=mx.gpu()
    m = mx.mod.Module(symbol=out, context=mx.cpu(), data_names=["data"], label_names=[])
    m.bind(data_shapes=[DataDesc("data", (bs, inpsize))])
    m.init_params()
    m.init_optimizer(optimizer='adam')

    dd = np.random.randn(bs, inpsize)
    iter = mx.io.NDArrayIter(dd, batch_size=bs, shuffle=True)
    batch = iter.next()

    m.forward(batch)
    res = m.get_outputs()
    m.backward()
    m.update()
    print(res[0].asnumpy())


if __name__ == '__main__':
    bugBroadcast()
```

Here is the crash:

> [17:25:45] /home/username/mxnet/dmlc-core/include/dmlc/./logging.h:308: [17:25:45] src/core/graph.cc:50: Check failed: it != node2index_.end() && it->first == nptr.get()

And the associated stack trace:

> Stack trace returned 10 entries:
> [bt] (0) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(+0x27364e1) [0x7f08a4b5b4e1]
> [bt] (1) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm12IndexedGraphC1ERKNS_5GraphE+0x2ed) [0x7f08a4b5c4dd]
> [bt] (2) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm5Graph13indexed_graphEv+0x2f) [0x7f08a4b5dc1f]
> [bt] (3) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13AssignContextEN4nnvm5GraphERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES3_St4lessISC_ESaISt4pairIKSC_S3_EEERKSt6vectorIS3_SaIS3_EESQ_SQ_mm+0x5f) [0x7f08a3b5c47f]
> [bt] (4) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor9InitGraphEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES4_St4lessISD_ESaISt4pairIKSD_S4_EEERKSt6vectorIS4_SaIS4_EESR_SR_RKSN_INS_9OpReqTypeESaISS_EE+0xe6) [0x7f08a3b61736]
> [bt] (5) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor4InitEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES4_St4lessISD_ESaISt4pairIKSD_S4_EEERKSt6vectorIS4_SaIS4_EESR_SR_RKSt13unordered_mapISD_NS2_6TShapeESt4hashISD_ESt8equal_toISD_ESaISG_ISH_ST_EEERKSS_ISD_iSV_SX_SaISG_ISH_iEEERKSN_INS_9OpReqTypeESaIS18_EERKSt13unordered_setISD_SV_SX_SaISD_EEPSN_INS_7NDArrayESaIS1I_EES1L_S1L_PSS_ISD_S1I_SV_SX_SaISG_ISH_S1I_EEEPNS_8ExecutorERKSS_INS2_9NodeEntryES1I_NS2_13NodeEntryHashENS2_14NodeEntryEqualESaISG_IKS1S_S1I_EEE+0x11f) [0x7f08a3b6d4ff]
> [bt] (6) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet8Executor10SimpleBindEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES3_St4lessISC_ESaISt4pairIKSC_S3_EEERKSt6vectorIS3_SaIS3_EESQ_SQ_RKSt13unordered_mapISC_NS1_6TShapeESt4hashISC_ESt8equal_toISC_ESaISF_ISG_SS_EEERKSR_ISC_iSU_SW_SaISF_ISG_iEEERKSM_INS_9OpReqTypeESaIS17_EERKSt13unordered_setISC_SU_SW_SaISC_EEPSM_INS_7NDArrayESaIS1H_EES1K_S1K_PSR_ISC_S1H_SU_SW_SaISF_ISG_S1H_EEEPS0_+0x233) [0x7f08a3b6e313]
> [bt] (7) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(MXExecutorSimpleBind+0x2d4a) [0x7f08a3b2b2da]
> [bt] (8) /home/username/venvmxnet/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call_unix64+0x4c) [0x7f08bfbb1e20]
> [bt] (9) /home/username/venvmxnet/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call+0x2eb) [0x7f08bfbb188b]
>
> Traceback (most recent call last):
>   File "/home/username/mxnet/python/mxnet/symbol.py", line 1473, in simple_bind
>     ctypes.byref(exe_handle)))
>   File "/home/username/mxnet/python/mxnet/base.py", line 129, in check_call
>     raise MXNetError(py_str(_LIB.MXGetLastError()))
> mxnet.base.MXNetError: [17:25:45] src/core/graph.cc:50: Check failed: it != node2index_.end() && it->first == nptr.get()
>
> Stack trace returned 10 entries:
> [bt] (0) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(+0x27364e1) [0x7f08a4b5b4e1]
> [bt] (1) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm12IndexedGraphC1ERKNS_5GraphE+0x2ed) [0x7f08a4b5c4dd]
> [bt] (2) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm5Graph13indexed_graphEv+0x2f) [0x7f08a4b5dc1f]
> [bt] (3) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13AssignContextEN4nnvm5GraphERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES3_St4lessISC_ESaISt4pairIKSC_S3_EEERKSt6vectorIS3_SaIS3_EESQ_SQ_mm+0x5f) [0x7f08a3b5c47f]
> [bt] (4) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor9InitGraphEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES4_St4lessISD_ESaISt4pairIKSD_S4_EEERKSt6vectorIS4_SaIS4_EESR_SR_RKSN_INS_9OpReqTypeESaISS_EE+0xe6) [0x7f08a3b61736]
> [bt] (5) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor4InitEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES4_St4lessISD_ESaISt4pairIKSD_S4_EEERKSt6vectorIS4_SaIS4_EESR_SR_RKSt13unordered_mapISD_NS2_6TShapeESt4hashISD_ESt8equal_toISD_ESaISG_ISH_ST_EEERKSS_ISD_iSV_SX_SaISG_ISH_iEEERKSN_INS_9OpReqTypeESaIS18_EERKSt13unordered_setISD_SV_SX_SaISD_EEPSN_INS_7NDArrayESaIS1I_EES1L_S1L_PSS_ISD_S1I_SV_SX_SaISG_ISH_S1I_EEEPNS_8ExecutorERKSS_INS2_9NodeEntryES1I_NS2_13NodeEntryHashENS2_14NodeEntryEqualESaISG_IKS1S_S1I_EEE+0x11f) [0x7f08a3b6d4ff]
> [bt] (6) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet8Executor10SimpleBindEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES3_St4lessISC_ESaISt4pairIKSC_S3_EEERKSt6vectorIS3_SaIS3_EESQ_SQ_RKSt13unordered_mapISC_NS1_6TShapeESt4hashISC_ESt8equal_toISC_ESaISF_ISG_SS_EEERKSR_ISC_iSU_SW_SaISF_ISG_iEEERKSM_INS_9OpReqTypeESaIS17_EERKSt13unordered_setISC_SU_SW_SaISC_EEPSM_INS_7NDArrayESaIS1H_EES1K_S1K_PSR_ISC_S1H_SU_SW_SaISF_ISG_S1H_EEEPS0_+0x233) [0x7f08a3b6e313]
> [bt] (7) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(MXExecutorSimpleBind+0x2d4a) [0x7f08a3b2b2da]
> [bt] (8) /home/username/venvmxnet/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call_unix64+0x4c) [0x7f08bfbb1e20]
> [bt] (9) /home/username/venvmxnet/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call+0x2eb) [0x7f08bfbb188b]
>
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "bugbroadcast.py", line 50, in <module>
>     bugBroadcast()
>   File "bugbroadcast.py", line 36, in bugBroadcast
>     m.bind(data_shapes=[DataDesc("data", (bs, inpsize))])
>   File "/home/username/mxnet/python/mxnet/module/module.py", line 417, in bind
>     state_names=self._state_names)
>   File "/home/username/mxnet/python/mxnet/module/executor_group.py", line 231, in __init__
>     self.bind_exec(data_shapes, label_shapes, shared_group)
>   File "/home/username/mxnet/python/mxnet/module/executor_group.py", line 327, in bind_exec
>     shared_group))
>   File "/home/username/mxnet/python/mxnet/module/executor_group.py", line 603, in _bind_ith_exec
>     shared_buffer=shared_data_arrays, **input_shapes)
>   File "/home/username/mxnet/python/mxnet/symbol.py", line 1479, in simple_bind
>     raise RuntimeError(error_msg)
> RuntimeError: simple_bind error. Arguments:
> data: (16, 784)
> [17:25:45] src/core/graph.cc:50: Check failed: it != node2index_.end() && it->first == nptr.get()
>
> Stack trace returned 10 entries:
> [bt] (0) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(+0x27364e1) [0x7f08a4b5b4e1]
> [bt] (1) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm12IndexedGraphC1ERKNS_5GraphE+0x2ed) [0x7f08a4b5c4dd]
> [bt] (2) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm5Graph13indexed_graphEv+0x2f) [0x7f08a4b5dc1f]
> [bt] (3) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13AssignContextEN4nnvm5GraphERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES3_St4lessISC_ESaISt4pairIKSC_S3_EEERKSt6vectorIS3_SaIS3_EESQ_SQ_mm+0x5f) [0x7f08a3b5c47f]
> [bt] (4) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor9InitGraphEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES4_St4lessISD_ESaISt4pairIKSD_S4_EEERKSt6vectorIS4_SaIS4_EESR_SR_RKSN_INS_9OpReqTypeESaISS_EE+0xe6) [0x7f08a3b61736]
> [bt] (5) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor4InitEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES4_St4lessISD_ESaISt4pairIKSD_S4_EEERKSt6vectorIS4_SaIS4_EESR_SR_RKSt13unordered_mapISD_NS2_6TShapeESt4hashISD_ESt8equal_toISD_ESaISG_ISH_ST_EEERKSS_ISD_iSV_SX_SaISG_ISH_iEEERKSN_INS_9OpReqTypeESaIS18_EERKSt13unordered_setISD_SV_SX_SaISD_EEPSN_INS_7NDArrayESaIS1I_EES1L_S1L_PSS_ISD_S1I_SV_SX_SaISG_ISH_S1I_EEEPNS_8ExecutorERKSS_INS2_9NodeEntryES1I_NS2_13NodeEntryHashENS2_14NodeEntryEqualESaISG_IKS1S_S1I_EEE+0x11f) [0x7f08a3b6d4ff]
> [bt] (6) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet8Executor10SimpleBindEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES3_St4lessISC_ESaISt4pairIKSC_S3_EEERKSt6vectorIS3_SaIS3_EESQ_SQ_RKSt13unordered_mapISC_NS1_6TShapeESt4hashISC_ESt8equal_toISC_ESaISF_ISG_SS_EEERKSR_ISC_iSU_SW_SaISF_ISG_iEEERKSM_INS_9OpReqTypeESaIS17_EERKSt13unordered_setISC_SU_SW_SaISC_EEPSM_INS_7NDArrayESaIS1H_EES1K_S1K_PSR_ISC_S1H_SU_SW_SaISF_ISG_S1H_EEEPS0_+0x233) [0x7f08a3b6e313]
> [bt] (7) /home/username/mxnet/python/mxnet/../../lib/libmxnet.so(MXExecutorSimpleBind+0x2d4a) [0x7f08a3b2b2da]
> [bt] (8) /home/username/venvmxnet/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call_unix64+0x4c) [0x7f08bfbb1e20]
> [bt] (9) /home/username/venvmxnet/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call+0x2eb) [0x7f08bfbb188b]

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
