Lunderberg commented on pull request #8069: URL: https://github.com/apache/tvm/pull/8069#issuecomment-864221346
> Later I can test it on CUDA (tensorcore) and OpenCL (intel), and hopefully @Lunderberg for vulkan.

Currently, I can run all the tests in `test_to_mixed_precision.py` with the LLVM target/device, but both the cuda and vulkan backends throw an exception at `TVMFuncCall` in `c_runtime_api.cc` if I edit the `run_module` function to use a different target.

On the cuda side, it fails a check that requires 16-bit floats to be used in pairs:

```
Check failed: lanes % 2 == 0 (1 vs. 0) : only support even lane for half type
```

On the vulkan side, it's something similar: the validation checks fail an alignment rule.

```
Check failed: res == SPV_SUCCESS (-10 vs. 0) : index=27 error:Structure id 12 decorated as Block for variable in StorageBuffer storage class must follow standard storage buffer layout rules: member 0 contains an array with stride 6 not satisfying alignment to 8
  %_struct_12 = OpTypeStruct %_runtimearr_v3half
```

I don't think either of these is a reason not to merge, and I've added the vulkan errors to my todo list for the ongoing `float16` work.
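For context on the vulkan failure: under the standard storage buffer layout rules the SPIR-V validator enforces (std430-style), a three-component vector takes the base alignment of a four-component vector, so an array of `half3` elements naturally occupies 6 bytes per element but needs an 8-byte stride. A minimal sketch of that arithmetic (the helper name is illustrative, not TVM code):

```python
def base_alignment(components, scalar_size):
    """Base alignment of a vector under std430-style layout rules:
    2-component vectors align to 2x the scalar size;
    3- and 4-component vectors align to 4x the scalar size."""
    if components == 2:
        return 2 * scalar_size
    if components in (3, 4):
        return 4 * scalar_size
    return scalar_size

# half3: three 16-bit (2-byte) scalars
natural_stride = 3 * 2             # 6 bytes -- the stride the generated SPIR-V uses
required_align = base_alignment(3, 2)  # 8 bytes -- what the validator demands
print(natural_stride, required_align, natural_stride % required_align == 0)
```

This is exactly the "stride 6 not satisfying alignment to 8" message above: the fix is either padding the element stride to 8 or avoiding 3-lane half vectors in buffer types.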
