zheng-da commented on issue #9862: Fix a race condition in converting data
layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#issuecomment-369194351
@cjolivier01 do you have more comments?
@piiswrong do you want to review the code?
The PR should have fixed
zheng-da commented on issue #9862: Fix a race condition in converting data
layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#issuecomment-369142927
@marcoabreu Reorder2Default and MKLDNNDataReorder shouldn't be called
frequently. They are not in the critical path.
zheng-da commented on issue #9862: Fix a race condition in converting data
layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#issuecomment-368912733
@TaoLv I have updated the design doc to explain why we need data layout
conversion.
---
zheng-da commented on issue #9862: Fix a race condition in converting data
layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#issuecomment-368856181
I think I'm done with changes for this PR. I ran test_gluon_model_zoo_gpu.py
1000 times and didn't see a race co
zheng-da commented on issue #9862: Fix a race condition in converting data
layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#issuecomment-368098057
@larroy this is the design doc of mkldnn:
https://cwiki.apache.org/confluence/display/MXNET/The+design+of+MKLDNN+int
zheng-da commented on issue #9862: Fix a race condition in converting data
layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#issuecomment-368098057
@larroy not yet. this is the design doc of mkldnn:
https://cwiki.apache.org/confluence/display/MXNET/The+design+of+M
zheng-da commented on issue #9862: Fix a race condition in converting data
layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#issuecomment-368095680
@cjolivier01 why does the race condition happen more frequently when threads
run on a smaller number of CPU cores? It seems
zheng-da commented on issue #9862: Fix a race condition in converting data
layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#issuecomment-367913772
It seems the current modification still can't get rid of all race conditions
in the code. The reason is that we want
zheng-da commented on issue #9862: Fix a race condition in converting data
layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#issuecomment-367876047
It's very difficult to reproduce a race condition in a deterministic way, if
it's possible at all.
---
zheng-da commented on issue #9862: Fix a race condition in converting data
layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#issuecomment-367869410
I disabled the inference tests because I previously thought the
failure was related to numeric errors and
zheng-da commented on issue #9862: Fix a race condition in converting data
layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#issuecomment-367862145
@marcoabreu enabling the tests can catch the error more easily.
zheng-da commented on issue #9862: Fix a race condition in converting data
layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#issuecomment-367861997
@cjolivier01 The seed is set so we know what the expected result is. It's
easier to tell whether CPU or GPU compute
zheng-da commented on issue #9862: Fix a race condition in converting data
layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#issuecomment-367845835
@marcoabreu previously, I disabled the inference test. Now I have enabled all
tests. I also added some prints to clearly