BiranLi commented on issue #8373: distribute training in fp16
URL: https://github.com/apache/incubator-mxnet/pull/8373#issuecomment-368188462
@rahul003
Because of FP16's limited representable range, gradients can vanish (underflow to zero) during back-propagation. The easiest way to handle this is to scale the gradients.
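A minimal NumPy sketch of the underflow problem and the scaling remedy being discussed (not MXNet's actual implementation; the scale factor 1024 is an arbitrary illustrative choice):

```python
import numpy as np

# A gradient below FP16's smallest subnormal (~6e-8) underflows to zero.
tiny_grad = np.float16(1e-8)
print(tiny_grad)                 # 0.0 -- the gradient is lost

# Remedy: scale the loss (and hence the gradients) up before the FP16
# backward pass, then unscale in FP32 when applying the weight update.
scale = 1024.0
scaled_grad = np.float16(1e-8 * scale)       # now representable in FP16
recovered = np.float32(scaled_grad) / scale  # unscale in FP32 for the update
print(scaled_grad > 0)           # True -- the information survives
```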
BiranLi commented on issue #8373: distribute training in fp16
URL: https://github.com/apache/incubator-mxnet/pull/8373#issuecomment-368186682
@solin319
Would it be possible to account for gradient vanishing in the computation,
for example by adding a grad_scale processing interface?