haojin2 opened a new pull request #10255: [MXNET-142] Enhance test for 
LeakyReLU operator
URL: https://github.com/apache/incubator-mxnet/pull/10255
 
 
   ## Description ##
   Enhancement of test_leaky_relu and test_prelu
   
   ## Checklist ##
   ### Essentials ###
   - [x] The PR title starts with [MXNET-142]
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [x] All changes have test coverage
   - [x] Code is well-documented
   - [x] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [x] Improve test_leaky_relu to provide coverage for all float types
   - [x] Improve test_prelu to provide coverage for all float types
   
   ## Comments ##
   - This PR addresses an issue with the previous version of test_leaky_relu caused by the limited precision of the finite difference method on 16-bit floating point (fp16) data.
   - Through some experiments I discovered that the finite difference method may not be suitable for checking numeric gradients of 16-bit floating point inputs. Here's an example (all numbers used are represented as 16-bit floating point values):
   ```
   x = [[-0.9, -0.8, -0.7],
        [-0.6, -0.5, -0.4],
        [-0.3, -0.2, -0.1],
        [ 0.1,  0.2,  0.3],
        [ 0.4,  0.5,  0.6],
        [ 0.7,  0.8,  0.9]]
   act_type: leaky_relu
   slope: 0.25
   ```
   Analytical derivative:
   ```
   [[0.25, 0.25, 0.25],
    [0.25, 0.25, 0.25],
    [0.25, 0.25, 0.25],
    [1.0,  1.0,  1.0 ],
    [1.0,  1.0,  1.0 ],
    [1.0,  1.0,  1.0 ]]
   ```
   Numeric derivative from the finite difference method with epsilon = 1e-4:
   ```
   [[ 0.61035156  0.61035156  0.61035156]
    [ 0.61035156  0.30517578  0.30517578]
    [ 0.30517578  0.15258789  0.22888184]
    [ 0.91552734  0.61035156  1.22070312]
    [ 1.22070312  1.22070312  2.44140625]
    [ 2.44140625  2.44140625  2.44140625]]
   ```
   Now if we divide all values in x by 256 (shrinking their absolute values) and apply the numeric method with the same epsilon = 1e-4, we get a new set of derivatives:
   ```
   [[ 0.25268555  0.25268555  0.25268555]
    [ 0.25268555  0.25024414  0.25024414]
    [ 0.25024414  0.24914551  0.24975586]
    [ 0.99902344  0.99658203  1.00097656]
    [ 1.00097656  1.00097656  1.01074219]
    [ 1.01074219  1.01074219  1.01074219]]
   ```
   We can see that the gap between the numeric and analytical derivatives grows much larger as the absolute value of the input x increases. As a result, we need to draw the random inputs from a smaller range if we want to verify the gradients numerically for 16-bit floating point numbers; a minimal sketch reproducing this effect follows below.
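   To make the failure mode concrete, here is a minimal plain-NumPy sketch (my own approximation of a central difference, not MXNet's check_numeric_gradient, so the exact values differ from the tables above) showing how an fp16 finite difference breaks down for larger |x| and recovers once the inputs are scaled down:
   ```python
   import numpy as np

   SLOPE = np.float16(0.25)

   def leaky_relu(x):
       # Leaky ReLU evaluated in float16: slope * x for negative inputs.
       return np.where(x < 0, SLOPE * x, x)

   def numeric_grad(x, eps=1e-4):
       # Central finite difference computed entirely in float16, so the
       # perturbation is subject to fp16 quantization just like the inputs.
       eps = np.float16(eps)
       return (leaky_relu(x + eps) - leaky_relu(x - eps)) / (np.float16(2) * eps)

   x = np.array([-0.9, -0.5, -0.1, 0.1, 0.5, 0.9], dtype=np.float16)

   print(numeric_grad(x))                    # far from [0.25 0.25 0.25 1. 1. 1.]
   print(numeric_grad(x / np.float16(256)))  # close to [0.25 0.25 0.25 1. 1. 1.]
   ```
   For the larger inputs the epsilon perturbation is smaller than one fp16 ULP of x, so it is either swallowed entirely or rounded to a whole ULP, which is exactly why the error grows with |x|.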
   - The seeds for both tests are fixed because the check_numeric_gradient check could still be a bit flaky: most randomized runs on my local machine passed, and the failing cases all exceeded the tolerance by only a small margin. To reduce this occasional flakiness I chose to fix the seed; alternatively, we could drop check_numeric_gradient and only check the analytical gradients. A hedged sketch of the seed-fixing approach follows below.
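   As an illustration of that approach (the seed value, input range, and tolerances below are my own assumptions, not the exact ones used in this PR), the test can pin the RNGs and keep the fp16 inputs small before invoking check_numeric_gradient:
   ```python
   import numpy as np
   import mxnet as mx
   from mxnet.test_utils import check_numeric_gradient

   # Fix the seeds so the randomized numeric check is reproducible.
   np.random.seed(1234)  # illustrative seed value, not the one from the PR
   mx.random.seed(1234)

   # Draw fp16 inputs from a small range so the finite difference stays accurate.
   x = np.random.uniform(-1e-2, 1e-2, size=(3, 4)).astype(np.float16)

   data = mx.sym.Variable('data')
   sym = mx.sym.LeakyReLU(data=data, act_type='leaky', slope=0.25)
   # Tolerances here are illustrative; fp16 checks need looser bounds.
   check_numeric_gradient(sym, [x], numeric_eps=1e-4, rtol=1e-2, atol=1e-3)
   ```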
