comaniac commented on issue #8991:
URL: https://github.com/apache/tvm/issues/8991#issuecomment-918376858


   You need to pass `head` explicitly when calculating the gradient:
   
   ```
   import tvm
   from tvm import topi

   x = tvm.te.placeholder((1, 128, 7, 7), name="x")
   w1 = tvm.te.placeholder((128, 128, 3, 3), name="w1")
   z1 = topi.nn.conv2d_nchw(x, w1, (1, 1), (1, 1), dilation=1, out_dtype="float32")
   # dy is the adjoint of z1, so it has the same shape as the output
   dy = tvm.te.placeholder(z1.shape, name="dy")
   [dw1] = tvm.te.gradient(z1, [w1], head=dy)
   ```
   
   According to
   https://github.com/apache/tvm/blob/main/python/tvm/te/autodiff.py#L33,
   if `head` is None, the identity tensor of shape
   `output.shape + output.shape` is used as the head. That implicit head,
   and the full Jacobian computed from it, are very large and unlikely to
   fit into GPU memory.
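
To get a rough sense of the sizes involved, here is a back-of-the-envelope sketch in plain Python (no TVM needed). It assumes float32 data and assumes that with the default identity head the result has shape `output.shape + w1.shape`, following the documented `prefix + input.shape` convention. The variable names are only for illustration.

```python
from math import prod

out_shape = (1, 128, 7, 7)    # shape of z1 (3x3 conv, stride 1, padding 1 keeps 7x7)
w_shape = (128, 128, 3, 3)    # shape of w1
bytes_per_elem = 4            # assumption: float32

out_elems = prod(out_shape)
w_elems = prod(w_shape)

# head=None: identity head of shape output.shape + output.shape
identity_head_elems = out_elems * out_elems
# assumption: the result is then the full Jacobian, output.shape + w1.shape
jacobian_elems = out_elems * w_elems
# head=dy with dy.shape == z1.shape: the result has just w1's shape
dw_elems = w_elems

print("identity head elements:", identity_head_elems)
print("full Jacobian bytes:   ", jacobian_elems * bytes_per_elem)
print("dw1 bytes with head=dy:", dw_elems * bytes_per_elem)
```

Under these assumptions the full Jacobian comes to about 3.7 GB, while `dw1` computed with an explicit `dy` is under 1 MB.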

