Hello MXNet community,
It seems that there is currently no agreed-upon principle for handling `nan` values in operators. This has led to inconsistencies between operators and across releases. Some operators ignore `nan` values (e.g. argmax), others treat them as the maximum (e.g. topk up to MXNet v1.2), and others simply return “undefined” output (e.g. topk starting with MXNet v1.3).

The change in topk was initially reported as a bug (https://github.com/apache/incubator-mxnet/issues/8510) because some users relied on the old behavior. However, @asmushetzel, who contributed the improved topk operator for MXNet v1.3, rightfully pointed out that the change did not break any documented behavior.

To move forward, please share your opinion on how MXNet should handle `nan` values. Should we continue to treat the behavior as undefined, so that it may silently change between releases? Should we define a reasonable standard (e.g. follow numpy) and treat operators that deviate from it as buggy? Should we just document how operators currently behave and warn if the behavior changes? Something else?

Please make your opinion known so that the above issue can be resolved or closed and general guidelines can be defined for future contributions, following whatever consensus emerges.

Thanks!

Leonard
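P.S. To make the three behaviors concrete, here is a minimal pure-Python sketch. The function names are hypothetical and this is not how any MXNet operator is implemented; it only illustrates the policies on a plain list, assuming a non-empty input. The "naive" variant shows why leaving the behavior undefined can give position-dependent results: every comparison against `nan` is false, so the outcome depends on where the `nan` sits.

```python
import math

def argmax_ignore_nan(values):
    """Hypothetical policy 1: skip NaN entries entirely.

    Returns the index of the largest non-NaN value, or None if all are NaN.
    """
    best = None
    for i, v in enumerate(values):
        if math.isnan(v):
            continue
        if best is None or v > values[best]:
            best = i
    return best

def argmax_nan_as_max(values):
    """Hypothetical policy 2: treat NaN as larger than every number.

    Returns the index of the first NaN if there is one, otherwise the
    index of the largest value.
    """
    for i, v in enumerate(values):
        if math.isnan(v):
            return i
    return max(range(len(values)), key=values.__getitem__)

def argmax_naive(values):
    """No explicit policy: a plain comparison loop.

    Any comparison with NaN is False, so the result depends on the
    position of the NaN in the input.
    """
    best = 0
    for i in range(1, len(values)):
        if values[i] > values[best]:
            best = i
    return best

nan = math.nan
print(argmax_ignore_nan([1.0, nan, 3.0]))  # 2
print(argmax_nan_as_max([1.0, nan, 3.0]))  # 1
print(argmax_naive([1.0, nan, 3.0]))       # 2
print(argmax_naive([nan, 1.0, 3.0]))       # 0
```

Whichever option we pick, the first two policies give the same answer regardless of element order, while the naive loop does not. For comparison, numpy exposes both `argmax` and a separate NaN-ignoring `nanargmax`.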
