Hi developers,

As you may know, mxnet currently has no fused RNN operator for CPU, which 
prevents users from migrating or deploying their models to CPU if those models 
are built with mxnet's fused RNN cell APIs. This feature disparity also makes 
the mxnet code harder to maintain and unit tests for this feature harder to develop.

We are trying to fill this gap with self-implemented RNN operators or with 
MKL-DNN primitives. PR #10104 
(https://github.com/apache/incubator-mxnet/pull/10104) and PR #10311 
(https://github.com/apache/incubator-mxnet/pull/10311) have been submitted for 
the fused LSTM and GRU operators and are ready for review. Both inference and 
training are implemented for these two RNN variants, and we see a >2x 
performance improvement compared with LSTMCell and GRUCell. Recently, Intel 
released RNN primitives as an experimental feature in MKL-DNN, and we are also 
planning to integrate MKL-DNN RNN primitives into mxnet.
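
For anyone less familiar with the API difference being compared above, here is 
a minimal, illustrative sketch (not taken from the PRs) of the two code paths: 
mx.rnn.FusedRNNCell, which maps the whole sequence onto the single fused 
sym.RNN operator that these PRs provide a CPU implementation for, versus the 
unfused mx.rnn.LSTMCell, which is unrolled step by step into many small 
operators. Shapes, hyper-parameters, and prefixes below are arbitrary.

import mxnet as mx

seq_len, batch_size, input_size, num_hidden = 50, 32, 128, 256
# assumed input layout 'TNC': (seq_len, batch_size, input_size)
data = mx.sym.Variable('data')

# Fused path: the whole multi-step LSTM becomes one sym.RNN node.
fused_cell = mx.rnn.FusedRNNCell(num_hidden, num_layers=1, mode='lstm',
                                 prefix='lstm_fused_')
fused_out, _ = fused_cell.unroll(seq_len, inputs=data, layout='TNC',
                                 merge_outputs=True)

# Unfused path: LSTMCell is unrolled into seq_len small steps in the graph.
unfused_cell = mx.rnn.LSTMCell(num_hidden, prefix='lstm_')
unfused_out, _ = unfused_cell.unroll(seq_len, inputs=data, layout='TNC',
                                     merge_outputs=True)

print(fused_out.list_arguments())
print(unfused_out.list_arguments())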

A proposal has been drafted describing what we have done and what we plan to 
do in the near future. Please find it at the links below and feel free to 
leave any comments:
Mxnet wiki: 
https://cwiki.apache.org/confluence/display/MXNET/Fused+RNN+Operators+for+CPU
Google doc: 
https://docs.google.com/document/d/1XC_PmbSc7q6px22LIW3vwhbA_wmX8wRGLRnet3pMJrs/edit?usp=sharing

BTW, we are trying to enable more RNN-related models to verify the performance 
improvement and accuracy on real workloads. We would greatly appreciate it if 
anyone could point us to open-source models that use mxnet's fused RNN 
operators. It seems that Sockeye from AWS Labs and the DS2 example in the 
mxnet examples folder do not use them.

Thanks in advance.

*we: the Intel team, cc'ed
-------------------------------
Best Regards,
LvTao
