stephenrawls opened a new issue #14019: SequenceLast undefined behavior
URL: https://github.com/apache/incubator-mxnet/issues/14019
 
 
   SequenceLast currently causes undefined behavior if `use_sequence_length` is 
true and any of the values in the `sequence_length` array are 0.
   
   Proof: Run this program with cuda-memcheck
   ```
   import mxnet as mx
   
   data = mx.nd.array([[0,0],[0,0]], ctx=mx.gpu())
   seq_lens = mx.nd.array([0, 0], ctx=mx.gpu())
   
   res = mx.nd.SequenceLast(data, sequence_length=seq_lens, 
use_sequence_length=True, axis=0)
   
   print("res = %s" % str(res.as_in_context(mx.cpu())))
   ```
   
   Line in code where sequence_length index is used w/o checking if it is valid:
   
https://github.com/apache/incubator-mxnet/blob/master/src/operator/sequence_last-inl.h#L73-L75
   
   On the one hand, the documentation does clearly state that: 
   ```
   Parameter sequence_length is used to handle variable-length sequences. 
sequence_length should be an input array of positive ints of dimension 
[batch_size]
   ```
   And 0 is not a positive integer so it fails this pre-condition.
   
   On the other hand, both of SequenceReverse and SequenceMask handle 0-valued 
sequence_length's safely. Also, the Take operator, which SequenceLast even 
mentions as an almost equivalent alternative, also handles out-of-band indices 
safely.
   
   The argument I have for why you would want to pass in a 0 sequence length 
(or perhaps even an invalid sequence length) -- In a bucketing executor 
approach where the user only supports a fixed number of shapes, it is entirely 
likely that my batch size is, say 16, but I only have 10 actual elements to 
fill into my input array. The other 6 elements are all padding, and my code 
currently just zeros out all padding without regard to if it is an input 
element that will be used as a "sequence_length" parameter.
   
   In the interest of not having undefined behavior (which led to 
non-deterministic crashes in some of my code), I would like to propose making 
SequenceLast safe when any of the sequence_lengths is 0.
   
   If there is interest in this I can put in a PR. Please let me know.
   
   Thanks,
   Stephen

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

Reply via email to