bhahn221 opened a new issue #8929: URL: https://github.com/apache/tvm/issues/8929
Thanks for participating in the TVM community! We use https://discuss.tvm.ai for any general usage questions and discussions. The issue tracker is used for actionable items such as feature proposals discussion, roadmaps, and bug tracking. You are always welcomed to post on the forum first :smile_cat: Issues that are inactive for a period of time may get closed. We adopt this policy so that we won't lose track of actionable issues that may fall at the bottom of the pile. Feel free to reopen a new one if you feel there is an additional problem that needs attention when an old one gets closed. ### Expected behavior I am tuning a model (based on MobileNet-V2) that was trained on PyTorch. I tried tuning the model on my Raspberry Pi 3B which has a "arm_cpu" using the following command: `tvmc tune --rpc-key "rasp3b" --rpc-tracker IP:PORT --target "llvm -device=arm_cpu -mtriple=aarch64-linux-gnu" --target-host "llvm -device=arm_cpu -mtriple=aarch64-linux-gnu" --output tuning.log --input-shape "input:[1,3,224,224]" --model-format pytorch model.pt` I attach the model.pt as a compressed file: [model.zip](https://github.com/apache/tvm/files/7109685/model.zip) ### Actual behavior While I expected all the tuning tasks to be "*.arm_cpu" I found many "*.x86" among the tasks when I printed all the tasks: > [Task(func_name=dense_nopack.x86, args=(('TENSOR', (1, 1280), 'float32'), ('TENSOR', (2, 1280), 'float32'), None, 'float32'), kwargs={}, workload=('dense_nopack.x86', ('TENSOR', (1, 1280), 'float32'), ('TENSOR', (2, 1280), 'float32'), None, 'float32')), Task(func_name=dense_pack.x86, args=(('TENSOR', (1, 1280), 'float32'), ('TENSOR', (2, 1280), 'float32'), None, 'float32'), kwargs={}, workload=('dense_pack.x86', ('TENSOR', (1, 1280), 'float32'), ('TENSOR', (2, 1280), 'float32'), None, 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 320, 7, 7), 'float32'), ('TENSOR', (1280, 320, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 320, 7, 7), 'float32'), ('TENSOR', (1280, 320, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 320, 7, 7), 'float32'), ('TENSOR', (1280, 320, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 320, 7, 7), 'float32'), ('TENSOR', (1280, 320, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 960, 7, 7), 'float32'), ('TENSOR', (320, 960, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 960, 7, 7), 'float32'), ('TENSOR', (320, 960, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 960, 7, 7), 'float32'), ('TENSOR', (320, 960, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 960, 7, 7), 'float32'), ('TENSOR', (320, 960, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=depthwise_conv2d_nchw.arm_cpu, args=(('T ENSOR', (1, 960, 7, 7), 'float32'), ('TENSOR', (960, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('depthwise_conv2d_nchw.arm_cpu', ('TENSOR', (1, 960, 7, 7), 'float32'), ('TENSOR', (960, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32')), Task(func_name=depthwise_conv2d_NCHWc.x86, args=(('TENSOR', (1, 960, 7, 7), 'float32'), ('TENSOR', (960, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('depthwise_conv2d_NCHWc.x86', ('TENSOR', (1, 960, 7, 7), 'float32'), ('TENSOR', (960, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 160, 7, 7), 'float32'), ('TENSOR', (960, 160, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 160, 7, 7), 'float32'), ('TENSOR', (960, 160, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 160, 7, 7), 'float32'), ('TENSOR', (960, 160, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 160, 7, 7), 'float32'), ('TENSOR', (960, 160, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 960, 7, 7), 'float32'), ('TENSOR', (160, 960, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 960, 7, 7), 'float32'), ('TENSOR', (160, 960, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 960, 7, 7), 'float32'), ('TENSOR', (160, 960, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 960, 7, 7), 'float32'), ( 'TENSOR', (160, 960, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 576, 7, 7), 'float32'), ('TENSOR', (160, 576, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 576, 7, 7), 'float32'), ('TENSOR', (160, 576, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 576, 7, 7), 'float32'), ('TENSOR', (160, 576, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 576, 7, 7), 'float32'), ('TENSOR', (160, 576, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=depthwise_conv2d_nchw.arm_cpu, args=(('TENSOR', (1, 576, 14, 14), 'float32'), ('TENSOR', (576, 1, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('depthwise_conv2d_nchw.arm_cpu', ('TENSOR', (1, 576, 14, 14), 'float32'), ('TENSOR', (576, 1, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'float32')), Task(func_name=depthwise_conv2d_NCHWc.x86, args=(('TENSOR', (1, 576, 14, 14), 'float32'), ('TENSOR', (576, 1, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('depthwise_conv2d_NCHWc.x86', ('TENSOR', (1, 576, 14, 14), 'float32'), ('TENSOR', (576, 1, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 96, 14, 14), 'float32'), ('TENSOR', (576, 96, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 96, 14, 14), 'float32'), ('TENSOR', (576, 96, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 96, 14, 14), 'float32'), ('TENSOR', (576, 96, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 96, 14, 14), 'float32'), ('TENSOR', (576, 96, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 576, 14, 14), 'float32'), ('TENSOR', (96, 576, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 576, 14, 14), 'float32'), ('TENSOR', (96, 576, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 576, 14, 14), 'float32'), ('TENSOR', (96, 576, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 576, 14, 14), 'float32'), ('TENSOR', (96, 576, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=dep thwise_conv2d_nchw.arm_cpu, args=(('TENSOR', (1, 576, 14, 14), 'float32'), ('TENSOR', (576, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('depthwise_conv2d_nchw.arm_cpu', ('TENSOR', (1, 576, 14, 14), 'float32'), ('TENSOR', (576, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32')), Task(func_name=depthwise_conv2d_NCHWc.x86, args=(('TENSOR', (1, 576, 14, 14), 'float32'), ('TENSOR', (576, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('depthwise_conv2d_NCHWc.x86', ('TENSOR', (1, 576, 14, 14), 'float32'), ('TENSOR', (576, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 384, 14, 14), 'float32'), ('TENSOR', (96, 384, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 384, 14, 14), 'float32'), ('TENSOR', (96, 384, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 384, 14, 14), 'float32'), ('TENSOR', (96, 384, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 384, 14, 14), 'float32'), ('TENSOR', (96, 384, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=depthwise_conv2d_nchw.arm_cpu, args=(('TENSOR', (1, 384, 14, 14), 'float32'), ('TENSOR', (384, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('depthwise_conv2d_nchw.arm_cpu', ('TENSOR', (1, 384, 14, 14), 'float32'), ('TENSOR', (384, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32')), Task(func_name=depthwise_conv2d_NCHWc.x86, args=(('TENSOR', (1, 384, 14, 14), 'float32'), ('TENSOR', (384, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('depthw ise_conv2d_NCHWc.x86', ('TENSOR', (1, 384, 14, 14), 'float32'), ('TENSOR', (384, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 64, 14, 14), 'float32'), ('TENSOR', (384, 64, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 64, 14, 14), 'float32'), ('TENSOR', (384, 64, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 64, 14, 14), 'float32'), ('TENSOR', (384, 64, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 64, 14, 14), 'float32'), ('TENSOR', (384, 64, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 384, 14, 14), 'float32'), ('TENSOR', (64, 384, 1, 1 ), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 384, 14, 14), 'float32'), ('TENSOR', (64, 384, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 384, 14, 14), 'float32'), ('TENSOR', (64, 384, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 384, 14, 14), 'float32'), ('TENSOR', (64, 384, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 192, 14, 14), 'float32'), ('TENSOR', (64, 192, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 192, 14, 14), 'float32'), ('TENSOR', (64, 192, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=((' TENSOR', (1, 192, 14, 14), 'float32'), ('TENSOR', (64, 192, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 192, 14, 14), 'float32'), ('TENSOR', (64, 192, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=depthwise_conv2d_nchw.arm_cpu, args=(('TENSOR', (1, 192, 28, 28), 'float32'), ('TENSOR', (192, 1, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('depthwise_conv2d_nchw.arm_cpu', ('TENSOR', (1, 192, 28, 28), 'float32'), ('TENSOR', (192, 1, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'float32')), Task(func_name=depthwise_conv2d_NCHWc.x86, args=(('TENSOR', (1, 192, 28, 28), 'float32'), ('TENSOR', (192, 1, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('depthwise_conv2d_NCHWc.x86', ('TENSOR', (1, 192, 28, 28), 'float32'), ('TENSOR', (192, 1, 3, 3), 'float32'), (2, 2), (1 , 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 32, 28, 28), 'float32'), ('TENSOR', (192, 32, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 32, 28, 28), 'float32'), ('TENSOR', (192, 32, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 32, 28, 28), 'float32'), ('TENSOR', (192, 32, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 32, 28, 28), 'float32'), ('TENSOR', (192, 32, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 192, 28, 28), 'float32'), ('TENSOR', (32, 192, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 192, 28, 28), 'float32'), ('TENSOR', (32, 192, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 192, 28, 28), 'float32'), ('TENSOR', (32, 192, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 192, 28, 28), 'float32'), ('TENSOR', (32, 192, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=depthwise_conv2d_nchw.arm_cpu, args=(('TENSOR', (1, 192, 28, 28), 'float32'), ('TENSOR', (192, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('depthwise_conv2d_nchw.arm_cpu', ('TENSOR', (1, 192, 28, 28), 'float32'), ('TENSOR', (192, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32')), Task(func_name=depthwise_conv2d_NCHWc.x86, args=(('TENSOR', (1, 192, 28, 28), 'float32'), ('TENSOR', (192, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NC HW', 'NCHW', 'float32'), kwargs={}, workload=('depthwise_conv2d_NCHWc.x86', ('TENSOR', (1, 192, 28, 28), 'float32'), ('TENSOR', (192, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 144, 28, 28), 'float32'), ('TENSOR', (32, 144, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 144, 28, 28), 'float32'), ('TENSOR', (32, 144, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 144, 28, 28), 'float32'), ('TENSOR', (32, 144, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 144, 28, 28), 'float32'), ('TENSOR', (32, 144, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=depthwise_conv2d_nchw.arm_cpu, args=(('TENSOR', ( 1, 144, 56, 56), 'float32'), ('TENSOR', (144, 1, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('depthwise_conv2d_nchw.arm_cpu', ('TENSOR', (1, 144, 56, 56), 'float32'), ('TENSOR', (144, 1, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'float32')), Task(func_name=depthwise_conv2d_NCHWc.x86, args=(('TENSOR', (1, 144, 56, 56), 'float32'), ('TENSOR', (144, 1, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('depthwise_conv2d_NCHWc.x86', ('TENSOR', (1, 144, 56, 56), 'float32'), ('TENSOR', (144, 1, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 24, 56, 56), 'float32'), ('TENSOR', (144, 24, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 24, 56, 56), 'float32'), ('TENSOR', (144, 24, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0 ), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 24, 56, 56), 'float32'), ('TENSOR', (144, 24, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 24, 56, 56), 'float32'), ('TENSOR', (144, 24, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 144, 56, 56), 'float32'), ('TENSOR', (24, 144, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 144, 56, 56), 'float32'), ('TENSOR', (24, 144, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 144, 56, 56), 'float32'), ('TENSOR', (24, 144, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 144, 56, 56), 'float32' ), ('TENSOR', (24, 144, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=depthwise_conv2d_nchw.arm_cpu, args=(('TENSOR', (1, 144, 56, 56), 'float32'), ('TENSOR', (144, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('depthwise_conv2d_nchw.arm_cpu', ('TENSOR', (1, 144, 56, 56), 'float32'), ('TENSOR', (144, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32')), Task(func_name=depthwise_conv2d_NCHWc.x86, args=(('TENSOR', (1, 144, 56, 56), 'float32'), ('TENSOR', (144, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('depthwise_conv2d_NCHWc.x86', ('TENSOR', (1, 144, 56, 56), 'float32'), ('TENSOR', (144, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 96, 56, 56), 'float32'), ('TENSOR', (24, 96, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'fl oat32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 96, 56, 56), 'float32'), ('TENSOR', (24, 96, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 96, 56, 56), 'float32'), ('TENSOR', (24, 96, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 96, 56, 56), 'float32'), ('TENSOR', (24, 96, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=depthwise_conv2d_nchw.arm_cpu, args=(('TENSOR', (1, 96, 112, 112), 'float32'), ('TENSOR', (96, 1, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('depthwise_conv2d_nchw.arm_cpu', ('TENSOR', (1, 96, 112, 112), 'float32'), ('TENSOR', (96, 1, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'float32')), Task(func_name=depthwise_conv2d_NCHWc.x86, args=(('TENSOR', (1, 96, 112, 112), 'float32'), ('TENSOR', ( 96, 1, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('depthwise_conv2d_NCHWc.x86', ('TENSOR', (1, 96, 112, 112), 'float32'), ('TENSOR', (96, 1, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 16, 112, 112), 'float32'), ('TENSOR', (96, 16, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 16, 112, 112), 'float32'), ('TENSOR', (96, 16, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 16, 112, 112), 'float32'), ('TENSOR', (96, 16, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 16, 112, 112), 'float32'), ('TENSOR', (96, 16, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task( func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 32, 112, 112), 'float32'), ('TENSOR', (16, 32, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 32, 112, 112), 'float32'), ('TENSOR', (16, 32, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 32, 112, 112), 'float32'), ('TENSOR', (16, 32, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 32, 112, 112), 'float32'), ('TENSOR', (16, 32, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=depthwise_conv2d_nchw.arm_cpu, args=(('TENSOR', (1, 32, 112, 112), 'float32'), ('TENSOR', (32, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('depthwise_conv2d_nchw.arm_cpu', ('TENSOR', (1, 32, 112, 112), 'float32'), ('TENSOR', (32 , 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32')), Task(func_name=depthwise_conv2d_NCHWc.x86, args=(('TENSOR', (1, 32, 112, 112), 'float32'), ('TENSOR', (32, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('depthwise_conv2d_NCHWc.x86', ('TENSOR', (1, 32, 112, 112), 'float32'), ('TENSOR', (32, 1, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32')), Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 3, 224, 224), 'float32'), ('TENSOR', (32, 3, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 3, 224, 224), 'float32'), ('TENSOR', (32, 3, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'float32')), Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 3, 224, 224), 'float32'), ('TENSOR', (32, 3, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=(' conv2d_NCHWc.x86', ('TENSOR', (1, 3, 224, 224), 'float32'), ('TENSOR', (32, 3, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'))] Is this expected? ### Environment My host machine has Ubuntu 20.04, uses AMD CPU My RPC server is a Rapsberry Pi 3B ### Steps to reproduce `tvmc tune --rpc-key "rasp3b" --rpc-tracker IP:PORT --target "llvm -device=arm_cpu -mtriple=aarch64-linux-gnu" --target-host "llvm -device=arm_cpu -mtriple=aarch64-linux-gnu" --output tuning.log --input-shape "input:[1,3,224,224]" --model-format pytorch model.pt` I attach the model.pt as a compressed file: [model.zip](https://github.com/apache/tvm/files/7109685/model.zip) * One can also reproduce this using the script in https://tvm.apache.org/docs/tutorials/autotvm/tune_relay_arm.html#sphx-glr-tutorials-autotvm-tune-relay-arm-py (but changing the part where it loads the network) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
