Thanks to everyone who shared their opinion. So far, the consensus is that any nan handling in MXNet should not affect performance, at least not by default.
This still leaves open the question of whether we should document the behavior of MXNet operators in the presence of nan values. For example, should we add a sentence to the argmax and topk documentation? Should the 1.3 release notes mention the changed behavior of topk? So far this has not been done. Instead, any change in operator behavior with respect to nan values is treated as an implementation change that is not worth mentioning to the user. Since this can hurt the user experience, I advocate documenting the current behavior and any possible future changes. If there are no objections, is there a way to edit the changelog for the upcoming release?
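
To illustrate why I think this is worth documenting, here is a minimal sketch in plain Python. It is not MXNet code and makes no claim about how MXNet implements argmax or topk; the two functions `argmax_gt` and `argmax_not_le` are hypothetical names made up for this example. It only shows that two argmax implementations that agree on ordinary inputs can disagree once a nan is present, because every ordered comparison with nan evaluates to false.

```python
nan = float("nan")

def argmax_gt(values):
    """Hypothetical argmax: replace the current best only if a candidate is strictly greater."""
    best = 0
    for i in range(1, len(values)):
        if values[i] > values[best]:
            best = i
    return best

def argmax_not_le(values):
    """Hypothetical argmax: replace the current best whenever the candidate is not <= it."""
    best = 0
    for i in range(1, len(values)):
        if not (values[i] <= values[best]):
            best = i
    return best

# Without nan, both variants agree.
print(argmax_gt([1.0, 3.0, 2.0]), argmax_not_le([1.0, 3.0, 2.0]))  # 1 1

# With a trailing nan, they disagree: the first ignores it, the second selects it.
print(argmax_gt([1.0, 3.0, nan]), argmax_not_le([1.0, 3.0, nan]))  # 1 2

# With a leading nan, the first variant gets stuck on the nan.
print(argmax_gt([nan, 1.0, 3.0]), argmax_not_le([nan, 1.0, 3.0]))  # 0 2
```

Both variants look like equally reasonable implementation choices, so a refactoring that swaps one for the other would silently change results for users whose data contains nan. A single sentence in the operator documentation would make that visible.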