perdasilva opened a new issue #17020: [CUDA 9.0] NVRTC Compilation failed
URL: https://github.com/apache/incubator-mxnet/issues/17020

## Description

Since #15167, the CD pipeline for CUDA 9.0 has been failing.

### Error Message

Many examples can be taken from the CD pipeline, [e.g.](http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/restricted-mxnet-cd%2Fmxnet-cd-release-job/detail/mxnet-cd-release-job/276/pipeline)

```
======================================================================
ERROR: test_operator_gpu.test_batchnorm_training
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/usr/local/lib/python2.7/dist-packages/nose/util.py", line 620, in newfunc
    return func(*arg, **kw)
  File "/work/mxnet/tests/python/gpu/../unittest/common.py", line 177, in test_new
    orig_test(*args, **kwargs)
  File "/work/mxnet/tests/python/gpu/../unittest/test_operator.py", line 1830, in test_batchnorm_training
    check_batchnorm_training('default')
  File "/work/mxnet/tests/python/gpu/../unittest/test_operator.py", line 1769, in check_batchnorm_training
    check_numeric_gradient(test, in_location, mean_std, numeric_eps=1e-2, rtol=0.16, atol=1e-2)
  File "/work/mxnet/python/mxnet/test_utils.py", line 1101, in check_numeric_gradient
    symbolic_grads = {k:executor.grad_dict[k].asnumpy() for k in grad_nodes}
  File "/work/mxnet/python/mxnet/test_utils.py", line 1101, in <dictcomp>
    symbolic_grads = {k:executor.grad_dict[k].asnumpy() for k in grad_nodes}
  File "/work/mxnet/python/mxnet/ndarray/ndarray.py", line 2532, in asnumpy
    ctypes.c_size_t(data.size)))
  File "/work/mxnet/python/mxnet/base.py", line 255, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
MXNetError: [21:10:06] src/operator/fusion/fused_op.cu:558: Check failed: compileResult == NVRTC_SUCCESS (6 vs. 0) : NVRTC Compilation failed.
Please set environment variable MXNET_USE_FUSION to 0.
```

## To Reproduce

Run the CD pipeline for cu90 and/or cu90mkl.

## What have you tried to solve it?

I noticed that USE_NVTX=1 wasn't set in the [make configuration](https://github.com/apache/incubator-mxnet/blob/master/make/pip/pip_linux_cu90.mk) for CUDA 9.0, but this had no effect.