funnyApe opened a new issue #7599:
URL: https://github.com/apache/tvm/issues/7599
I encountered this bug while trying to use the auto_scheduler to optimize a conv3d op. I referred to the script `/tvm/tutorials/auto_scheduler/tune_conv2d_layer_cuda.py` and made a small modification to optimize conv3d with layout "NDHWC" by calling `topi.nn.conv3d_ndhwc()`. The input is a tensor with layout (N, D, H, W, CI), and the kernel has layout (KD, KH, KW, CI, CO). When I called `topi.nn.conv3d_ndhwc()` with input shape (1, 128, 64, 32, 32), kernel shape (16, 5, 5, 32, 3), stride (1, 1, 1), dilation 1, and "SAME" padding, I expected an output of shape (1, 128, 64, 32, 3), but the actual output shape was (1, 117, 75, 32, 3).
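The observed shape is exactly what you get if the depth and height padding are swapped. A quick sketch in plain Python (assuming the stride-1 output rule `out = in - k + 1 + pad_total`, which is standard for convolution):

```python
# Output extent of a stride-1 convolution along one axis.
def out_size(in_size, k, pad_total):
    return in_size - k + 1 + pad_total

D, H, W = 128, 64, 32
KD, KH, KW = 16, 5, 5

# Correct "SAME" padding: each axis is padded by its own kernel extent - 1.
print(out_size(D, KD, KD - 1), out_size(H, KH, KH - 1), out_size(W, KW, KW - 1))
# -> 128 64 32  (the expected shape)

# With pad_d and pad_h interchanged (pad_d = KW - 1 = 4, pad_h = KD - 1 = 15):
print(out_size(D, KD, KW - 1), out_size(H, KH, KD - 1), out_size(W, KW, KH - 1))
# -> 117 75 32  (the observed shape)
```

This reproduces both the 117 along D and the 75 along H reported above.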
**Environment**
- OS: Ubuntu 16.04 LTS
- TVM: 0.8.dev0
- CUDA: 10.1
**Code to generate the result**
```python
import os
import numpy as np
import tvm
from tvm import te, auto_scheduler, topi

######################################################################
# Define the computation
# ^^^^^^^^^^^^^^^^^^^^^^
# To begin with, let us define the computation of a convolution layer.
# The function should return the list of input/output tensors.
# From these tensors, the auto-scheduler can get the whole computational graph.

@auto_scheduler.register_workload
def conv3d_layer(N, D, H, W, CO, CI, KD, KH, KW, stride, padding):
    data = te.placeholder((N, D, H, W, CI), name="data")
    kernel = te.placeholder((KD, KH, KW, CI, CO), name="kernel")
    out = topi.nn.conv3d_ndhwc(data, kernel, stride, padding, dilation=1, out_dtype="float32")
    print(out.shape)
    return [data, kernel, out]

######################################################################
# Create the search task
target = tvm.target.Target("cuda")
N, D, H, W, CO, CI, KD, KH, KW, strides, padding = 1, 128, 64, 32, 3, 32, 16, 5, 5, (1, 1, 1), "SAME"
task = auto_scheduler.SearchTask(
    func=conv3d_layer, args=(N, D, H, W, CO, CI, KD, KH, KW, strides, padding), target=target
)

# Inspect the computational graph
print("Computational DAG:")
print(task.compute_dag)

log_file = "conv3d.json"
measure_ctx = auto_scheduler.LocalRPCMeasureContext(min_repeat_ms=300)
tune_option = auto_scheduler.TuningOptions(
    num_measure_trials=10,  # change this to 1000 to achieve the best performance
    runner=measure_ctx.runner,
    measure_callbacks=[auto_scheduler.RecordToFile(log_file)],
    verbose=2,
)

# Run auto-tuning (search)
task.tune(tune_option)
# Apply the best schedule
sch, args = task.apply_best(log_file)
# Kill the measurement process
del measure_ctx

print("Lowered TIR:")
print(tvm.lower(sch, args, simple_mode=True))
func = tvm.build(sch, args, target)
```
I checked TVM's source code and found that `topi.nn.conv3d_ndhwc()` and `topi.nn.conv3d_ncdhw()` produce incorrect results when calling `get_pad_tuple3d()`: when `padding == "SAME"`, `pad_h` and `pad_d` are interchanged. The following excerpts are from TVM's source code.
```python
def get_pad_tuple3d(padding, kernel):
    ...
    elif padding == "SAME":
        # BUG: both callers below pass kernel = (d, h, w), but these reads
        # assume (h, w, d), so pad_h and pad_d are interchanged.
        pad_h = kernel[0] - 1
        pad_w = kernel[1] - 1
        pad_d = kernel[2] - 1


def conv3d_ncdhw(Input, Filter, stride, padding, dilation, out_dtype=None):
    ...
    pad_front, pad_top, pad_left, pad_back, pad_down, pad_right = get_pad_tuple3d(
        padding, (dilated_kernel_d, dilated_kernel_h, dilated_kernel_w)
    )


def conv3d_ndhwc(
    Input,
    Filter,
    stride,
    padding,
    dilation,
    out_dtype="float32",
    auto_scheduler_rewritten_layout="",
):
    ...
    pad_front, pad_top, pad_left, pad_back, pad_down, pad_right = get_pad_tuple3d(
        padding, (dilated_kernel_d, dilated_kernel_h, dilated_kernel_w)
    )
```
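Based on the excerpt above, one possible fix is to reorder the reads in the `"SAME"` branch so each axis uses its own kernel extent, matching the `(d, h, w)` order the callers pass. A minimal sketch (the front/top/left split via floor division is an assumption here; the rest of `get_pad_tuple3d` is elided in the excerpt):

```python
# Sketch of a corrected "SAME" branch, assuming kernel = (kernel_d, kernel_h, kernel_w)
# as passed by conv3d_ncdhw / conv3d_ndhwc.
def get_pad_tuple3d_same(kernel):
    pad_d = kernel[0] - 1  # was kernel[2] - 1
    pad_h = kernel[1] - 1  # was kernel[0] - 1
    pad_w = kernel[2] - 1  # was kernel[1] - 1
    pad_front = pad_d // 2
    pad_top = pad_h // 2
    pad_left = pad_w // 2
    return (pad_front, pad_top, pad_left,
            pad_d - pad_front, pad_h - pad_top, pad_w - pad_left)

# For the kernel in this report, (KD, KH, KW) = (16, 5, 5):
print(get_pad_tuple3d_same((16, 5, 5)))
# -> (7, 2, 2, 8, 2, 2), i.e. 15 total along D, 4 along H, 4 along W,
# which gives the expected (1, 128, 64, 32, 3) output.
```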