xidulu opened a new issue #18198:
URL: https://github.com/apache/incubator-mxnet/issues/18198


   ## Description
   
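   Ran the GPU unit tests with the command below. Most tests fail; the first failure, shown under Error Message, is a CUDA `invalid device ordinal` error raised while seeding the GPU random generator.
   `DMLC_LOG_STACK_TRACE_DEPTH=10 MXNET_MODULE_SEED=781106105 MXNET_ENGINE_TYPE=NaiveEngine pytest tests/python/gpu/test_operator_gpu.py`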
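
For anyone scripting reruns, the invocation above can be wrapped in Python. This is only a sketch: `build_repro` and `run_repro` are hypothetical helper names, not part of MXNet, and it assumes `pytest` is on `PATH` and the working directory is the MXNet source root. The environment variables and test path are copied from the command above.

```python
import os
import subprocess

# Environment variables copied verbatim from the command in this report.
REPRO_ENV = {
    "DMLC_LOG_STACK_TRACE_DEPTH": "10",
    "MXNET_MODULE_SEED": "781106105",
    "MXNET_ENGINE_TYPE": "NaiveEngine",
}

# Test file from the report, relative to the MXNet source root (assumed cwd).
TEST_FILE = "tests/python/gpu/test_operator_gpu.py"


def build_repro(extra_args=()):
    """Return (argv, env) for the reported pytest run.

    extra_args is a hypothetical convenience for appending pytest options
    such as ["-x"] or ["-k", "<test name>"].
    """
    env = dict(os.environ)   # start from the caller's environment
    env.update(REPRO_ENV)    # overlay the variables used in the report
    argv = ["pytest", TEST_FILE, *extra_args]
    return argv, env


def run_repro(extra_args=()):
    """Run the reported command and return pytest's exit code."""
    argv, env = build_repro(extra_args)
    return subprocess.call(argv, env=env)
```

Passing `["-k", "test_batchnorm_backwards_notrain"]` as `extra_args` would limit the run to the first failing test shown in the error message.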
   
   ### Error Message
   ```
   tests/python/gpu/test_operator_gpu.py .........s.s...................... [  5%]
   .........................FFFFFsF.FFFFFFFFFFFFFFFFFFFFFFFFFF.FFFFFFFFFFFF [ 16%]
   FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF.FFF.FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 28%]
   FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFsFssFFFFFF [ 39%]
   FFFFFFFFFFFFFFFFsFFFFFFFFFFFFFFFFFFFsFFFFFFFFFFFsFFFFFFFsFFFsFFFFFFFFFFF [ 50%]
   FFFFF.FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFsFFFFFFFFFFFFFFFFFFFFFFFFF [ 62%]
   FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFsFFFFFFFFFsFFFFFF.FFFFFFFFFFFF [ 73%]
   FFFFFFFFFFFFFFFFFFF....F.FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF....FFFFFFFFFFFFF [ 84%]
   FFFxxxFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 96%]
   FFFFFFFFFFFFFFFFFFFFFFFF                                                 [100%]
   
   =================================== FAILURES ===================================
   _______________________ test_batchnorm_backwards_notrain _______________________
   
   args = (), kwargs = {}, test_count = 1, env_seed_str = None, i = 0
   this_test_seed = 1871614074, log_level = 10
   post_test_state = ('MT19937', array([ 793462385, 4162567913, 2690816661, 3146259572, 1379942102,
           894119658,  364406528, 36749442..., 3314795127, 3420630909, 2538379262,
          3698999054, 2822638424,  471751221, 3037373484], dtype=uint32), 1, 0, 0.0)
   
       @functools.wraps(orig_test)
       def test_new(*args, **kwargs):
           test_count = int(os.getenv('MXNET_TEST_COUNT', '1'))
           env_seed_str = os.getenv('MXNET_TEST_SEED')
           for i in range(test_count):
               if seed is not None:
                   this_test_seed = seed
                   log_level = logging.INFO
               elif env_seed_str is not None:
                   this_test_seed = int(env_seed_str)
                   log_level = logging.INFO
               else:
                   this_test_seed = np.random.randint(0, np.iinfo(np.int32).max)
                   log_level = logging.DEBUG
               post_test_state = np.random.get_state()
               np.random.seed(this_test_seed)
   >           mx.random.seed(this_test_seed)
   
   tests/python/unittest/common.py:206: 
   _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ 
   python/mxnet/random.py:96: in seed
       check_call(_LIB.MXRandomSeed(seed_state))
   _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ 
   
   ret = -1
   
       def check_call(ret):
           """Check the return value of C API call.
       
           This function will raise an exception when an error occurs.
           Wrap every API call with this function.
       
           Parameters
           ----------
           ret : int
               return value from API calls.
           """
           if ret != 0:
   >           raise get_last_ffi_error()
   E           mxnet.base.MXNetError: Traceback (most recent call last):
   E             [bt] (5) /home/ubuntu/mxnet_master_develop/python/mxnet/../../build/libmxnet.so(MXRandomSeed+0x1a) [0x7f89ce0a0c0a]
   E             [bt] (4) /home/ubuntu/mxnet_master_develop/python/mxnet/../../build/libmxnet.so(mxnet::resource::ResourceManagerImpl::SeedRandom(unsigned int)+0x30b) [0x7f89d11b081b]
   E             [bt] (3) /home/ubuntu/mxnet_master_develop/python/mxnet/../../build/libmxnet.so(mxnet::engine::NaiveEngine::PushAsync(std::function<void (mxnet::RunContext, mxnet::engine::CallbackOnComplete)>, mxnet::Context, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, mxnet::FnProperty, int, char const*, bool)+0x43b) [0x7f89ce1d08eb]
   E             [bt] (2) /home/ubuntu/mxnet_master_develop/python/mxnet/../../build/libmxnet.so(std::_Function_handler<void (mxnet::RunContext, mxnet::engine::CallbackOnComplete), mxnet::resource::ResourceManagerImpl::ResourceParallelRandom<mshadow::gpu>::SeedOne(unsigned long, unsigned int)::{lambda(mxnet::RunContext, mxnet::engine::CallbackOnComplete)#1}>::_M_invoke(std::_Any_data const&, mxnet::RunContext&&, mxnet::engine::CallbackOnComplete&&)+0x1e) [0x7f89d11ac6ce]
   E             [bt] (1) /home/ubuntu/mxnet_master_develop/python/mxnet/../../build/libmxnet.so(mxnet::common::random::RandGenerator<mshadow::gpu, float>::Seed(mshadow::Stream<mshadow::gpu>*, unsigned int)+0x1e9) [0x7f89d1240f55]
   E             [bt] (0) /home/ubuntu/mxnet_master_develop/python/mxnet/../../build/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x7f) [0x7f89cdf9b24f]
   E             File "../src/common/random_generator.cu", line 58
   E           Name: Check failed: err == cudaSuccess (10 vs. 0) : rand_generator_seed_kernel ErrStr:invalid device ordinal
   
   python/mxnet/base.py:246: MXNetError
   ---------------------------- Captured stderr setup -----------------------------
   WARNING:root:Unable to import numpy/mxnet. Skipping seeding for numpy/mxnet.
   ------------------------------ Captured log setup ------------------------------
   WARNING  root:conftest.py:177 Unable to import numpy/mxnet. Skipping seeding for numpy/mxnet.
   ____________________ test_create_sparse_ndarray_gpu_to_cpu _____________________
   
   ```
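
The key line in the trace above is the failed check in `random_generator.cu`, which reports error value 10 with the string `invalid device ordinal`. With this many failing tests it may help to pull that line out of each captured message and confirm they all share the same cause. The helper below is a rough sketch: `parse_check_failure` and `CheckFailure` are hypothetical names, and the regular expression is only assumed from the message format seen in this log (with the wrapped line rejoined), not taken from any MXNet or dmlc specification.

```python
import re
from typing import NamedTuple, Optional


class CheckFailure(NamedTuple):
    condition: str     # e.g. "err == cudaSuccess"
    actual: int        # left-hand value printed by the check
    expected: int      # right-hand value printed by the check
    label: str         # e.g. "rand_generator_seed_kernel"
    error_string: str  # e.g. "invalid device ordinal"


# Pattern assumed from the single message shown in this report.
_CHECK_RE = re.compile(
    r"Check failed: (?P<cond>.+?) "
    r"\((?P<actual>-?\d+) vs\. (?P<expected>-?\d+)\) : "
    r"(?P<label>\S+) ErrStr:(?P<errstr>.+)"
)


def parse_check_failure(message: str) -> Optional[CheckFailure]:
    """Extract the fields of a 'Check failed ... ErrStr:' line, or None."""
    match = _CHECK_RE.search(message)
    if match is None:
        return None
    return CheckFailure(
        condition=match.group("cond"),
        actual=int(match.group("actual")),
        expected=int(match.group("expected")),
        label=match.group("label"),
        error_string=match.group("errstr").strip(),
    )
```

Grouping the parsed results by `error_string` would show whether every failure in the run is the same `invalid device ordinal` error or whether other errors are mixed in.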
   
   ## To Reproduce
   (If you developed your own code, please provide a short script that 
reproduces the error. For existing examples, please provide link.)
   
   ### Steps to reproduce
   (Paste the commands you ran that produced the error.)
   
   1.
   2.
   
   ## What have you tried to solve it?
   
   1.
   2.
   
   ## Environment
   
   We recommend using our script for collecting the diagnostic information. Run the following command and paste the outputs below:
   ```
   curl --retry 10 -s https://raw.githubusercontent.com/dmlc/gluon-nlp/master/tools/diagnose.py | python
   
   # paste outputs here
   ```
   



