mseth10 commented on issue #12314: flaky test: test_operator.test_dropout
URL: 
https://github.com/apache/incubator-mxnet/issues/12314#issuecomment-438501681
 
 
   I investigated this issue. The dropout symbol implementation calls the 
function BernoulliGenerate when the MKL and OpenMP flags are on. This is where 
the flakiness comes from.
   
   BernoulliGenerate uses multithreading and calls the MKL library function 
viRngBernoulli to populate a mask vector. 
https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/dropout-inl.h#L82
   
   When **p**=1.0, it is supposed to populate the mask vector **r** of length 
**n** with all 1s. However, for a few internally generated values of **seed**, 
the mask vector contains n-1 1s and a single 0 (at a different index for each 
flaky seed), which causes the error.
   
   Also, the error only occurs when multithreading is used. It does not 
reproduce when a single thread is used.
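   
   To make the expected behaviour concrete, here is a minimal Python sketch 
(not the MXNet/MKL code) of a chunked, multithreaded Bernoulli mask generator 
and a seed sweep that looks for a stray 0 at p=1.0. The helper names, the 
per-chunk seeding scheme and the use of Python's `random` module are all 
assumptions made for illustration; only the shape of the check mirrors what is 
described above.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def bernoulli_chunk(seed, offset, count, p):
    """Fill one chunk of the mask from its own generator (hypothetical helper)."""
    # Per-chunk stream derived from (seed, offset); this scheme is an assumption.
    rng = random.Random(seed * 1_000_003 + offset)
    # random() lies in [0.0, 1.0), so p = 1.0 always yields 1.
    return offset, [1 if rng.random() < p else 0 for _ in range(count)]


def bernoulli_mask(n, p, seed, num_threads):
    """Build a length-n mask by splitting the work into contiguous chunks."""
    mask = [0] * n
    chunk = max(1, (n + num_threads - 1) // num_threads)  # ceil(n / num_threads)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [
            pool.submit(bernoulli_chunk, seed, off, min(chunk, n - off), p)
            for off in range(0, n, chunk)
        ]
        for fut in futures:
            off, values = fut.result()
            mask[off:off + len(values)] = values
    return mask


def find_bad_seeds(n, num_threads, seeds):
    """Return (seed, index of first 0) for every seed whose p=1.0 mask is not all 1s."""
    bad = []
    for seed in seeds:
        mask = bernoulli_mask(n, 1.0, seed, num_threads)
        if 0 in mask:
            bad.append((seed, mask.index(0)))
    return bad
```

   A correct generator should give an empty list from 
`find_bad_seeds(n, num_threads, seeds)` for any thread count. The same kind of 
sweep against the real operator, with the thread count set to 1 and then to a 
larger value, would be one way to pin down the flaky seeds.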
   
   I suspect that viRngBernoulli is not thread-safe.
   
   @TaoLv @pengzhao-intel @ZhennanQin @xinyu-intel Can you please take a look? 
I would really appreciate your input on this. Thanks!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
