zhuwenxi opened a new pull request #8415:
URL: https://github.com/apache/tvm/pull/8415
# Problem Statement
`scatter_nd` crashes on the CUDA backend when the input data shape is
slightly larger than usual.
# Code to reproduce
<pre>
import tvm
import numpy as np
import tvm.relay as relay
dev = tvm.cuda()
target = tvm.target.Target("cuda")
# input data:
data_np = np.zeros((32, 128, 128, 256)).astype(np.float32)
indices_np = np.random.uniform(1,5,(32, 600, 3)).astype(np.int64)
updates_np = np.random.rand(32, 600, 256).astype(np.float32)
# Construct relay input nodes:
data = relay.var("data", shape=data_np.shape, dtype=str(data_np.dtype))
indices = relay.var("indices", shape=indices_np.shape,
dtype=str(indices_np.dtype))
updates = relay.var("updates", shape=updates_np.shape,
dtype=str(updates_np.dtype))
# Compute indices:
indices_dim = len(indices_np.shape)
axes = list(range(indices_dim))
indices_t = relay.transpose(indices, axes[-1:] + axes[:-1])
# Construct relay scatter_nd op:
out = relay.op.scatter_nd(data, indices_t, updates, "update")
func = relay.Function([data, indices, updates], out)
# Execute scatter_nd:
intrp = relay.create_executor("debug", device=dev, target=target)
op_res = intrp.evaluate(func)(data_np, indices_np, updates_np)
</pre>
# Error Message

# Root Cause

We can see the problem more clearly in the generated CUDA code. The TIR
implementation of scatter_nd causes an int32 overflow when "i" is large,
so the if statement always evaluates to true and performs an invalid memory
access.