malixian opened a new issue, #15987:
URL: https://github.com/apache/tvm/issues/15987
### Expected behavior
I tried to use MetaSchedule to tune a matmul where the matrix dimensions are
m=8192, n=14336, k=8192.
When n=8192 everything works, but once m or n equals 14336, the following error occurs:
```
RuntimeError: parallel_for_dynamic error with [02:23:57]
/home/malixian/repos/tensorir/tvm/src/ir/expr.cc:88: InternalError: Check
failed: value < 1LL << (dtype.bits() - 1) (8589934591 vs. 2147483648) :
ValueError: Literal value 8589934591 exceeds maximum of int32
```
BTW, it is fine when k equals 14336.
Following the error message, I commented out the `ICHECK` in the `IntImm` constructor in expr.cc, and tuning then completed normally.
I think the `DataType` used by TIR should be widened to handle this case.
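For reference, the check that fires is a signed-range bound on integer literals; a minimal plain-Python sketch of the equivalent test (the helper name is mine, not TVM's) shows the offending value fits in int64 but not int32:

```python
# The ICHECK in IntImm (src/ir/expr.cc) requires the literal to fit in the
# signed range of its dtype: value < 1 << (bits - 1).
def fits_signed(value: int, bits: int) -> bool:
    """Mimic the IntImm range check for a signed integer literal."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

literal = 8589934591  # the offending value from the error; equals 2**33 - 1

print(fits_signed(literal, 32))  # False: exceeds the int32 bound 2147483648
print(fits_signed(literal, 64))  # True: fits comfortably in int64
```

So the literal itself is representable if the affected expressions are built with an int64 dtype, which is why removing the check lets tuning proceed.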
### Actual behavior
```
RuntimeError: parallel_for_dynamic error with [02:23:57]
/home/malixian/repos/tensorir/tvm/src/ir/expr.cc:88: InternalError: Check
failed: value < 1LL << (dtype.bits() - 1) (8589934591 vs. 2147483648) :
ValueError: Literal value 8589934591 exceeds maximum of int32
```
### Environment
TVM version is '0.15.dev0'
### Steps to reproduce
```python
import tempfile

import tvm
from tvm import meta_schedule as ms
from tvm import te
from tvm.meta_schedule.builder import LocalBuilder
from tvm.target import Target


def matmul_fp16(M: int, N: int, K: int, in_dtype: str, out_dtype: str):
    x = te.placeholder((M, K), name="X", dtype=in_dtype)
    y = te.placeholder((K, N), name="Y", dtype=in_dtype)
    k = te.reduce_axis((0, K), name="k")
    c = te.compute(  # pylint: disable=invalid-name
        (M, N),
        lambda i, j: te.sum(
            x[i][k].astype(out_dtype) * y[k][j].astype(out_dtype), axis=[k]
        ),
        name="C",
    )
    return (x, y, c)


def tune(in_dtype, out_dtype):
    target = Target("nvidia/nvidia-a100")
    M, N, K = 8192, 14336, 8192
    func = te.create_prim_func(
        matmul_fp16(M=M, N=N, K=K, in_dtype=in_dtype, out_dtype=out_dtype)
    ).with_attr({"global_symbol": "main"})
    space = ms.space_generator.PostOrderApply(
        sch_rules="cuda-tensorcore",
        postprocs="cuda-tensorcore",
        mutator_probs="cuda-tensorcore",
    )
    mod = tvm.IRModule({"main": func})
    with tempfile.TemporaryDirectory() as work_dir:
        db = ms.tir_integration.tune_tir(
            mod=mod,
            target=target,
            work_dir=work_dir,
            max_trials_global=32,
            builder=LocalBuilder(
                f_build="meta_schedule.builder.async_build",
                initializer=initializer,  # defined elsewhere in my script
            ),
            space=space,
        )
        sch = db.query_schedule(mod, target=target, workload_name="main")
    with tvm.transform.PassContext(config={"tir.use_async_copy": 1}):
        rt_mod = tvm.build(sch.mod, target=target)
```