t-vi commented on pull request #5959:
URL: https://github.com/apache/incubator-tvm/pull/5959#issuecomment-651298087


   This pull request is only about the second commit; the first is #5946.
   I noticed that my gradient had many more O(n^3) operations (matmul etc.) 
than it should have, and tracked this down to how gradients are computed when a 
value is used several times in the computation.
   Graphs become really big and unwieldy if the computation is not purely 
sequential. Also, the duplication cannot be eliminated by CSE because it is the 
"output part" that is duplicated rather than the input (one could, in theory, 
commute add with all the gradient ops).
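
   To illustrate the duplication, here is a minimal sketch in plain Python. It 
is a toy and not TVM code: `matmul`, `add`, `transpose` and `grad_wrt_a` are 
hypothetical helpers made up for this example. It assumes `y = a @ b` is consumed 
twice, so two gradients `g1` and `g2` flow back into `y`. Because the backward of 
matmul is linear in the incoming gradient, adding the gradients first gives the 
same result with one matmul instead of two.

```python
# Toy illustration only; not TVM code. All names here are hypothetical.
matmul_calls = 0


def matmul(a, b):
    """Multiply two matrices given as lists of rows, counting each call."""
    global matmul_calls
    matmul_calls += 1
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Elementwise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def grad_wrt_a(grad_y, b):
    """Backward of y = a @ b with respect to a, i.e. grad_y @ b^T."""
    return matmul(grad_y, transpose(b))


b = [[1.0, 2.0], [3.0, 4.0]]
# y = a @ b is used twice, so two gradients arrive at y.
g1 = [[1.0, 0.0], [0.0, 1.0]]
g2 = [[0.5, 0.5], [2.0, -1.0]]

# Per-use backward: one matmul per consumer, results added afterwards.
matmul_calls = 0
per_use = add(grad_wrt_a(g1, b), grad_wrt_a(g2, b))
per_use_calls = matmul_calls

# Accumulate first: add the incoming gradients, then do a single backward.
matmul_calls = 0
accumulated = grad_wrt_a(add(g1, g2), b)
accumulated_calls = matmul_calls
```

   In the per-use version the two matmuls have different inputs (`g1` vs. `g2`), 
which is why CSE has nothing to merge; the savings only appear once the add is 
moved in front of the gradient op.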
   While this change doesn't fix anything, it might also mitigate other effects 
that people see when working with first order gradients (e.g. #4534).



