Hi
Could you be specific about the bugs? While we could use this to debug
particular errors like the ones you describe, I would think that in the general
case you would want to rely on unit testing and conditional checks for very
small denominators wherever a NaN is not acceptable. I think we should first
collect some examples and study them carefully, as fp arithmetic is tricky. I
also think it is not common practice, and not portable, to use signals and fp
exceptions, as you mentioned.
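
To make the conditional-check idea concrete, here is a minimal sketch in C++.
The names safe_divide and division_raises_divbyzero, and the default eps, are
hypothetical and not taken from MXNet or from the PR. The first helper guards
the denominator; the second polls the standard <cfenv> status flags instead of
trapping a signal, which could be one way to assert on divide-by-zero in a unit
test without platform-specific code.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical helper (not an MXNet API): clamp the magnitude of the
// denominator to at least `eps` so we never divide by zero or by a
// vanishingly small number. A very large numerator can still overflow to inf.
inline float safe_divide(float num, float den, float eps = 1e-12f) {
  assert(eps > 0.0f);  // precondition: eps must be a positive threshold
  if (std::fabs(den) < eps) {
    den = std::signbit(den) ? -eps : eps;  // keep the sign of the denominator
  }
  return num / den;
}

// Hypothetical helper: perform one division and report whether the
// FE_DIVBYZERO status flag was raised. This only reads flags through the
// standard <cfenv> interface; no signal handler is installed.
// Note that 0/0 raises FE_INVALID, not FE_DIVBYZERO.
inline bool division_raises_divbyzero(float num, float den) {
  volatile float n = num;  // volatile keeps the compiler from folding
  volatile float d = den;  // the division away at compile time
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile float q = n / d;
  (void)q;
  return std::fetestexcept(FE_DIVBYZERO) != 0;
}
```

A unit test could then assert that division_raises_divbyzero(1.0f, 0.0f) is
true and that safe_divide(1.0f, 0.0f) is finite. Whether a fixed eps is the
right guard depends on the operator, so treat the threshold as an assumption.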
Pedro
> On 9. Nov 2018, at 00:30, Lin Yuan wrote:
>
> Dear MXNet Community,
>
> I recently found that NaN errors can sometimes be caused by floating-point
> divide-by-zero bugs in the engine backend. By default, however, no
> exception is thrown in that case. I added a signal trap to catch this
> error (https://github.com/apache/incubator-mxnet/pull/13190) and caught a
> few exceptions when running the Python unit tests. But this only works on
> Linux.
>
> I would like to get more feedback on best practices for catching such bugs
> in the code and on whether we should enforce such checks in CI. Any
> comments are appreciated.
>
> Best Regards,
>
> Lin