D-Roberts opened a new pull request #18757:
URL: https://github.com/apache/incubator-mxnet/pull/18757


   As titled. This is a resubmit of [#18197](https://github.com/apache/incubator-mxnet/pull/18197). In addition, the tests were re-verified for robustness with `MXNET_TEST_COUNT=10000`:
   ```
   MXNET_TEST_COUNT=10000 pytest -v ~/workspace/incubator-mxnet/tests/python/unittest/test_numpy_op.py::test_np_linalg_qr
   
   incubator-mxnet/tests/python/unittest/test_numpy_op.py::test_np_linalg_qr PASSED  [100%]
   ```                                                   
   For a given input, the computed gradient matches the one produced by TensorFlow up to float32 rounding. The implementation follows a method similar to the one in TensorFlow's [linalg_grad.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/linalg_grad.py).
   
   Here are cross-checked examples:
   ```
   # MXNet
   import mxnet as mx
   import mxnet.numpy as np
   import numpy as _np

   _np.random.seed(42)
   data_np = _np.random.uniform(-1, 1, (3, 5)).astype(_np.float32)
   data = np.array(data_np, dtype='float32')
   data.attach_grad()
   with mx.autograd.record():
       ret = np.linalg.qr(data)  # ret is the pair (q, r)
   mx.autograd.backward(ret)     # backpropagate from both q and r
   print(data.grad)
   # Output:
   # [[ 0.7569422   0.5140486  -0.48962986 -0.48962957 -0.48962957]
   #  [-2.414882   -1.2380642  -1.6602778  -1.6602782  -1.6602782 ]
   #  [ 0.2763981  -0.5044659   0.06115001  0.06115055  0.06115055]]

   # TensorFlow, same seed and input
   import tensorflow as tf
   import numpy as _np

   _np.random.seed(42)
   data_np = _np.random.uniform(-1, 1, (3, 5)).astype(_np.float32)
   data = tf.convert_to_tensor(data_np)
   with tf.GradientTape() as g:
       g.watch(data)
       ret = tf.linalg.qr(data)  # ret holds (q, r)
   print(g.gradient(ret, data))
   # Output:
   # tf.Tensor(
   # [[ 0.75694233  0.51404876 -0.48962957 -0.48962957 -0.48962957]
   #  [-2.414882   -1.2380638  -1.6602784  -1.6602781  -1.6602781 ]
   #  [ 0.276398   -0.50446594  0.0611502   0.06115052  0.06115052]], shape=(3, 5), dtype=float32)
   ```
   
   At a high level, the method for a wide input A (m x n with m < n) is: split A column-wise into X (the first m columns) and Y (the remaining n - m columns), and split R from the decomposition A = QR the same way into U and V. Then X = QU and Y = QV. X_grad is obtained by applying the gradient derivation for the square case (m = n) to X = QU with an adjusted Q_grad, which accounts for the dependence of V on Q; Y_grad is computed separately. A_grad is the concatenation of X_grad and Y_grad.
   
   ### Changes ###
   - [ ] QR backward pass for wide inputs (m < n)
   - [ ] Tests
   

