zixuanweeei edited a comment on issue #15741: MKL-DNN LBR-GRU Inference Integration (FP32 LBR-GRU)
URL: https://github.com/apache/incubator-mxnet/pull/15741#issuecomment-519739904

@DickJC123 Thanks for your patience. FYI, the possibly flaky tests seem to surface with the edited unit tests for the RNN variants. I tried modifying the code following the instructions in https://github.com/apache/incubator-mxnet/pull/14476#pullrequestreview-225789440. Specifically:

+ `temp_space`, renamed to `workspace_`, is allocated along with `reserve_space_` in `Init()` at [#L1485-L1490](https://github.com/zixuanweeei/incubator-mxnet/blob/2b07a5f0c58a8f96621e1e8d1308513c726e09c4/src/operator/rnn-inl.h#L1485-L1490);
+ `host_workspace`, renamed to `seq_len_space_`, is allocated in an `if (!init_cudnn_) {...}` branch at [#L622-L631](https://github.com/zixuanweeei/incubator-mxnet/blob/2b07a5f0c58a8f96621e1e8d1308513c726e09c4/src/operator/rnn-inl.h#L622-L631).

All of the spaces above are allocated only once, using `ctx.requested[rnn_enum::kTempSpace]` instead of `Storage`. However, this didn't work on *NIX systems: CI passed on windows-gpu but failed on *NIX-gpu with `test_gluon_gpu.test_layer_bidirectional_varseqlength`. Although these modifications are **not included in this PR**, the full source is available at [this link](https://github.com/zixuanweeei/incubator-mxnet/blob/2b07a5f0c58a8f96621e1e8d1308513c726e09c4/src/operator/rnn-inl.h). I am not sure whether `temp_space` and `host_workspace` should be re-initialized on every iteration. I'd appreciate your help, since I'm not familiar with the GPU side :)
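To make the intent concrete, here is a minimal standalone sketch (not MXNet's actual code; `RNNOpModel`, `WorkspaceSize`, and `alloc_count_` are hypothetical names) of the allocate-once pattern described above: the workspace buffers are requested a single time inside an initialization guard and reused on every subsequent `Forward()` call, rather than re-allocated per iteration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of the allocation strategy: workspace_ and
// seq_len_space_ are allocated exactly once inside an `if (!initialized_)`
// guard (mirroring the `if (!init_cudnn_)` branch and the once-only
// `ctx.requested[rnn_enum::kTempSpace]` request described above).
class RNNOpModel {
 public:
  void Forward(std::size_t seq_len) {
    if (!initialized_) {
      // One-time allocation; later Forward() calls reuse the same buffers.
      workspace_.resize(WorkspaceSize(seq_len));
      seq_len_space_.resize(seq_len);
      ++alloc_count_;
      initialized_ = true;
    }
    // ... run the RNN kernels using workspace_ / seq_len_space_ ...
  }

  int alloc_count() const { return alloc_count_; }

 private:
  // Placeholder sizing rule; the real operator computes this from the
  // RNN configuration (layers, directions, hidden size, etc.).
  static std::size_t WorkspaceSize(std::size_t seq_len) {
    return seq_len * 128;
  }

  bool initialized_ = false;
  int alloc_count_ = 0;
  std::vector<float> workspace_;
  std::vector<int> seq_len_space_;
};
```

The open question in the comment is exactly whether this once-only scheme is safe on the GPU path, or whether the buffers must be refreshed each iteration (which the *NIX-gpu failure may be hinting at).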
