mseth10 commented on issue #12314: flaky test: test_operator.test_dropout
URL: https://github.com/apache/incubator-mxnet/issues/12314#issuecomment-438501681

I investigated this issue. When the MKL and OpenMP flags are on, the dropout symbol implementation calls a function BernoulliGenerate, and this is where the flakiness comes from. BernoulliGenerate is multithreaded and calls the MKL library function viRngBernoulli to populate a mask vector.
https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/dropout-inl.h#L82

When **p**=1.0, it is supposed to populate the mask vector **r** of length **n** with all 1s. But for a few internally generated values of **seed**, the mask vector has n-1 1s and a single 0 (at a different index for each flaky seed), which causes the error.

Also, the error only occurs when multithreading is used; it is not reproduced with a single thread. I suspect that viRngBernoulli is not thread-safe.

@TaoLv @pengzhao-intel @ZhennanQin @xinyu-intel Can you please take a look? I would really like your input on this. Thanks!
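To make the expected behaviour concrete, here is a minimal sketch of the invariant described above: with p=1.0, a mask filled chunk by chunk across several threads should contain no zeros. This is not the MXNet or MKL code. It assumes Python's standard `random.Random` as a stand-in for viRngBernoulli, and the names `bernoulli_mask` and `zero_indices` are hypothetical helpers introduced only for illustration. Each worker gets its own generator, so no random state is shared between threads.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def bernoulli_mask(n, p, seed, num_threads=4):
    """Fill a length-n mask with Bernoulli(p) draws, one chunk per thread.

    Hypothetical stand-in for a multithreaded mask generator; it does not
    call MKL and does not reproduce the reported failure.
    """
    mask = [0] * n
    chunk = (n + num_threads - 1) // num_threads  # ceil(n / num_threads)

    def fill(tid):
        # Each worker owns its generator; no random state is shared.
        rng = random.Random(seed * num_threads + tid)
        start = tid * chunk
        end = min(n, start + chunk)
        for i in range(start, end):
            # random() lies in [0.0, 1.0), so p = 1.0 always yields 1.
            mask[i] = 1 if rng.random() < p else 0

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # list() forces evaluation so worker exceptions are raised here.
        list(pool.map(fill, range(num_threads)))
    return mask


def zero_indices(mask):
    """Return the positions of every 0 in the mask."""
    return [i for i, v in enumerate(mask) if v == 0]


# Sweep seeds the way one might when hunting for a flaky one.
for seed in range(100):
    assert zero_indices(bernoulli_mask(1000, 1.0, seed)) == []
```

A sweep like this, pointed at the real operator instead of the stand-in, could be used to collect the flaky seeds and the index of the stray 0 for each one.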
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
