szha commented on issue #10453: Bug of CuDNN RNN with variable sequence length
URL: https://github.com/apache/incubator-mxnet/issues/10453#issuecomment-379809938

What I observed is that it doesn't fail consistently on any specific batch. Another team observed the same issue before, and it is likely caused by our backend memory pool holding too much memory, in which case cuRAND doesn't have enough memory left to keep the random number generator states for each streaming multiprocessor.
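
If the memory pool is indeed the cause, one way to test the hypothesis is to make the pool leave more GPU memory free. The sketch below is an illustration, not a confirmed fix for this issue: it sets MXNet's MXNET_GPU_MEM_POOL_RESERVE environment variable, which is the percentage of GPU memory the pool keeps out of its reach. The helper name reserve_gpu_memory and the value 20 are assumptions chosen for the example.

```python
import os


def reserve_gpu_memory(percent=20):
    """Set the share of GPU memory that MXNet's pool should leave unpooled.

    Hypothetical helper for illustration. Call it before MXNet sets up its
    GPU storage (most safely before `import mxnet`), because the variable
    may only be read once at initialization.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    os.environ["MXNET_GPU_MEM_POOL_RESERVE"] = str(percent)
    return os.environ["MXNET_GPU_MEM_POOL_RESERVE"]


# Assumed value: 20 percent is only a starting point for experimentation.
reserve_gpu_memory(20)
```

If the failure becomes rarer or disappears with a larger reserve, that would support the memory-pool explanation; if it does not, the cause likely lies elsewhere.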
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services