gmeeker opened a new issue, #17279: URL: https://github.com/apache/tvm/issues/17279
### Expected behavior

Tune retina-face-resnet50-fixed.onnx from this repo: https://github.com/gmeeker/RetinaFace

This is a fixed size input version of this: https://github.com/discipleofhamilton/RetinaFace

### Actual behavior

[Task 30/37] Current/Best: 0.00/ 0.00 GFLOPS | Progress: (27/27) | 552.19 sWARNING:root:Could not find any valid schedule for task Task(func_name=conv2d_nchw_winograd.cuda, args=(('TENSOR', (1, 256, 64, 64), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_winograd.cuda', ('TENSOR', (1, 256, 64, 64), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32')). A file containing the errors has been written to /var/folders/bd/rc6mzcg1423fzylm2vd6qsd00000gn/T/tvm_tuning_errors_h172s9it.log.

In the log:

RPCError: Error caught from RPC call: [21:48:30] [...]/src/runtime/metal/metal_module.mm:130: InternalError: Check failed: (state != nil) is false: cannot get state: for function default_function_kernelThread group memory requested is more than MAX allowed

Also, this issue is very frequent on Intel Macs, to the point where Metal targets are slower than CPU. TVM 0.17.0's Metal timer may have made this more prevalent, but I believe that's irrelevant and earlier versions were just not tuning properly.

### Environment

* macOS 14.6.1
* M1 2020 Mac Mini
* Intel Mac: 2019 MacBook Pro, AMD 5500M
* TVM 0.17.0

### Steps to reproduce

```
tvmc tune --target metal --output retina-face-resnet50-autotuner_records.json retina-face-resnet50-fixed.onnx
```

### Triage

* needs-triage
* backend:metal
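To gauge how many failed trials hit this particular error rather than something else, a minimal sketch like the following could tally the tuning error log. It assumes the log is plain text with one error message per line, which has not been verified against the actual file. `classify_errors` and `count_in_file` are hypothetical helper names for illustration, not part of TVM.

```python
from collections import Counter

# Substring copied from the error message quoted in this report.
THREADGROUP_ERROR = "Thread group memory requested is more than MAX allowed"


def classify_errors(lines):
    """Tally lines mentioning the threadgroup-memory failure versus other errors.

    Assumption: each error occupies a single line and contains the word "Error".
    """
    counts = Counter()
    for line in lines:
        if THREADGROUP_ERROR in line:
            counts["threadgroup_memory"] += 1
        elif "Error" in line:
            counts["other_error"] += 1
    return counts


def count_in_file(path):
    """Apply classify_errors to a log file such as the one named by the tuner."""
    with open(path, encoding="utf-8", errors="replace") as log_file:
        return classify_errors(log_file)
```

Calling `count_in_file` with the path printed by the tuner (the `tvm_tuning_errors_*.log` file) would show whether the threadgroup-memory failure accounts for most invalid schedules on a given machine.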
