comaniac commented on issue #8991: URL: https://github.com/apache/tvm/issues/8991#issuecomment-918376858
You need to specify `head` when calculating the gradient:

```
import tvm
from tvm import topi

x = tvm.te.placeholder((1, 128, 7, 7), name="x")
w1 = tvm.te.placeholder((128, 128, 3, 3), name="w1")
z1 = topi.nn.conv2d_nchw(x, w1, (1, 1), (1, 1), dilation=1, out_dtype="float32")
# dy is the upstream gradient (adjoint) of z1, so it has the same shape as z1
dy = tvm.te.placeholder(z1.shape, name="dy")
[dw1] = tvm.te.gradient(z1, [w1], head=dy)
```

According to https://github.com/apache/tvm/blob/main/python/tvm/te/autodiff.py#L33, if `head` is `None`, the identity tensor of shape `output.shape + output.shape` is used as the head. That implicit head is far larger than an explicit `dy` and is unlikely to fit into GPU memory.
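For a rough sense of scale, here is a back-of-the-envelope sketch in plain Python (no TVM required). It assumes float32 elements and an output shape of `(1, 128, 7, 7)`, which is what a stride-1, padding-1, 3x3 convolution yields for the `x` and `w1` in the snippet above. The helper `head_sizes` is a hypothetical name used only for illustration, not part of TVM.

```python
from math import prod


def head_sizes(output_shape, bytes_per_elem=4):
    """Return (explicit_head_bytes, identity_head_bytes) for an output shape.

    Hypothetical helper for illustration; bytes_per_elem=4 assumes float32.
    """
    n = prod(output_shape)             # elements in the output (and in an explicit dy)
    explicit = n * bytes_per_elem      # head=dy has shape output.shape
    identity = n * n * bytes_per_elem  # head=None: shape output.shape + output.shape
    return explicit, identity


# Assumed conv output shape for the placeholders above.
out_shape = (1, 128, 7, 7)
explicit, identity = head_sizes(out_shape)

print(f"explicit dy:   {explicit / 2**20:.3f} MiB")
print(f"identity head: {identity / 2**20:.3f} MiB")
print(f"ratio:         {identity // explicit}x")
```

The ratio equals the number of output elements, so the implicit head grows quadratically with the output size. This counts only the head tensor itself; it does not account for intermediate buffers or for the gradient computed against that head.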
