Lunderberg commented on pull request #8069:
URL: https://github.com/apache/tvm/pull/8069#issuecomment-864221346


   > Later I can test it on CUDA (tensorcore) and OpenCL (intel), and hopefully 
@Lunderberg for vulkan.
   
   Currently, I can run all the tests in `test_to_mixed_precision.py` with the 
LLVM target/device, but both cuda and vulkan backends throw an exception at 
`TVMFuncCall` in `c_runtime_api.cc` if I edit the `run_module` function to use 
a different target.
   
   On the cuda side, it's failing a check that requires 16-bit floats to be 
used in pairs.
   
   ```
   Check failed: lanes % 2 == 0 (1 vs. 0) : only support even lane for half type
   ```
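   
   For context, a small sketch (plain Python arithmetic, hypothetical helper names, not TVM code) of what that check is enforcing: CUDA's native `half2` type packs two 16-bit floats into a single 32-bit register, so the codegen wants float16 lane counts divisible by two, and a scalar (1-lane) value trips the check.
   
   ```python
   # Sketch of the even-lane requirement for float16 on CUDA.
   # `lanes_ok` / `padded_lanes` are hypothetical helpers mirroring the
   # check, not TVM APIs.

   def lanes_ok(lanes: int) -> bool:
       """Mirror of the codegen check: only even lane counts for half."""
       return lanes % 2 == 0

   def padded_lanes(lanes: int) -> int:
       """Round an odd lane count up to the next multiple of two."""
       return lanes + (lanes % 2)

   # A 1-lane float16 value fails, matching the "(1 vs. 0)" in the error.
   assert not lanes_ok(1)
   assert lanes_ok(padded_lanes(1))  # padding to 2 lanes would satisfy it
   ```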
   
   On the vulkan side, it's a similar issue: SPIR-V validation fails an 
alignment rule on the buffer layout.
   
   ```
   Check failed: res == SPV_SUCCESS (-10 vs. 0) :  index=27 error:Structure id 
12 decorated as Block for variable in StorageBuffer storage class must follow 
standard storage buffer layout rules: member 0 contains an array with stride 6 
not satisfying alignment to 8
   %_struct_12 = OpTypeStruct %_runtimearr_v3half  
   ```
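   
   For reference, a sketch (plain Python arithmetic, not TVM code) of the layout rule being violated: under standard storage buffer (std430-style) layout, a 3-component vector takes the base alignment of a 4-component one, so `v3half` aligns to 4 × 2 = 8 bytes, while the packed stride being emitted is 3 × 2 = 6 bytes.
   
   ```python
   # Sketch of the std430-style rule the SPIR-V validator is enforcing.
   # Note: 3-component vectors get the alignment of 4-component ones.

   HALF_SIZE = 2  # bytes in a 16-bit float

   def vec_base_alignment(components: int, scalar_size: int) -> int:
       """Base alignment for a vector under standard buffer layout:
       2-component vectors align to 2x the scalar size, 3- and
       4-component vectors to 4x."""
       if components == 2:
           return 2 * scalar_size
       if components in (3, 4):
           return 4 * scalar_size
       return scalar_size

   packed_stride = 3 * HALF_SIZE                          # 6 bytes
   required_alignment = vec_base_alignment(3, HALF_SIZE)  # 8 bytes

   # Stride 6 is not a multiple of 8, hence the validator error above.
   assert packed_stride % required_alignment != 0
   ```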
   
   I don't think either of these is a reason to hold up the merge, and I've 
added the vulkan errors to my todo list for the ongoing `float16` work.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
