@samskalicky Thanks! Since the diag operator essentially copies selected elements of the input directly to the output, the backward pass is straightforward: it copies the gradients from the output back to the corresponding elements of the input. Both passes share one implementation, with a template parameter that indicates forward or backward and controls the direction of the assignment in the copy.
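To illustrate the idea, here is a minimal sketch, not the actual kernel from the PR. It assumes a square, row-major matrix and only the main diagonal, and the names `diag_copy`, `diag_forward`, and `diag_backward` are hypothetical, made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not MXNet's real kernel): copies between the main
// diagonal of a row-major n x n matrix and a length-n vector.
//   back == false (forward):  diag[i]           = mat[i * (n + 1)]
//   back == true  (backward): mat[i * (n + 1)]  = diag[i]
template <bool back>
void diag_copy(std::vector<float>& mat, std::vector<float>& diag, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    const std::size_t j = i * (n + 1);  // flat index of element (i, i)
    if (back) {
      mat[j] = diag[i];   // scatter the output gradient onto the diagonal
    } else {
      diag[i] = mat[j];   // gather the diagonal into the output
    }
  }
}

// Forward: extract the main diagonal of the input.
std::vector<float> diag_forward(std::vector<float> input, std::size_t n) {
  std::vector<float> out(n);
  diag_copy<false>(input, out, n);
  return out;
}

// Backward: the input gradient is zero everywhere except on the diagonal,
// where it equals the output gradient.
std::vector<float> diag_backward(std::vector<float> out_grad, std::size_t n) {
  std::vector<float> in_grad(n * n, 0.0f);
  diag_copy<true>(in_grad, out_grad, n);
  return in_grad;
}
```

Because `back` is a compile-time constant, the branch costs nothing at run time, and the index computation is written once for both directions.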
[ Full content available at: https://github.com/apache/incubator-mxnet/pull/12430 ]
