fhieber opened a new issue #12629: Conflicting documentation for RNN dropout
URL: https://github.com/apache/incubator-mxnet/issues/12629
 
 
   The current documentation on dropout in RNN layers / the FusedRNN operator is a bit confusing to me:
   - The mx.gluon.rnn.{LSTM,RNN,GRU} docstrings (for example: https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/rnn/rnn_layer.py#L359) state `If non-zero, introduces a dropout layer **on the outputs** of each RNN layer except the last layer.` This is consistent with the `unfuse()` method, which interleaves LSTMCells with DropoutCells but skips the dropout on the last layer: https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/rnn/rnn_layer.py#L144.
   - However, the fused forward_kernel() uses mx.symbol.RNN, which documents dropout differently: `p (float, optional, default=0) – Dropout probability, **fraction of the input** that gets dropped out at training time.`
   
   I assume equivalence between fused and unfused RNN layers has been tested, so the implementation is likely fine, but one of the docstrings (probably mx.symbol.RNN's) should be updated for clarity.
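   
   For reference, the placement described in the gluon docstring ("dropout on the outputs of each layer except the last") can be sketched in plain NumPy. The `dummy_rnn_layer` below is a hypothetical stand-in for an LSTM/GRU layer, not mxnet code; it only serves to show where the dropout mask would be applied:
   
   ```python
   import numpy as np
   
   rng = np.random.default_rng(0)
   
   def dummy_rnn_layer(x, seed):
       # Stand-in for an LSTM/GRU layer: any deterministic transform
       # suffices to illustrate where dropout is applied.
       layer_rng = np.random.default_rng(seed)
       w = layer_rng.standard_normal((x.shape[-1], x.shape[-1]))
       return np.tanh(x @ w)
   
   def stacked_rnn(x, num_layers=3, p=0.5, training=True):
       """Apply dropout to the *outputs* of each layer except the last,
       matching the gluon.rnn.{LSTM,RNN,GRU} docstring and unfuse()."""
       for layer in range(num_layers):
           x = dummy_rnn_layer(x, seed=layer)
           is_last = (layer == num_layers - 1)
           if training and p > 0 and not is_last:
               # Inverted dropout: zero out a fraction p, rescale the rest.
               mask = (rng.random(x.shape) >= p) / (1 - p)
               x = x * mask
       return x
   
   x = np.ones((2, 4))          # (batch, features); time dim omitted for brevity
   out = stacked_rnn(x, p=0.5)
   print(out.shape)             # (2, 4): the last layer's output is never masked
   ```
   
   Under the mx.symbol.RNN wording ("fraction of the input"), the same mask would instead be applied before each layer's matrix multiply, which for the interior layers is numerically the same placement; only the docstrings disagree on how to describe it.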