masahi commented on issue #8295:
URL: https://github.com/apache/tvm/issues/8295#issuecomment-884813492


   I can confirm that TF2 SSD MobileNet v2 can be converted to fp16 and runs on Vulkan (AMD) and OpenCL (Intel Ice Lake), provided I disable fp16 vectorization at https://github.com/apache/tvm/blob/main/python/tvm/topi/cuda/injective.py#L54-L55 (cc @Lunderberg).
   
   But the fp16 output is a bit off compared to fp32 (on both Vulkan and OpenCL):
   ```
   fp32
   Mean Squared Error of output 0 and shape (1, 100, 4) is 9.562732618023824e-15
   Mean Squared Error of output 1 and shape (1, 100) is 0.0
   Mean Squared Error of output 2 and shape (1, 100) is 4.539840725570343e-13
   Mean Squared Error of output 3 and shape (1,) is 0.0
   Mean Squared Error of output 4 and shape (1, 12804, 4) is 3.1784283863710294e-13
   Mean Squared Error of output 5 and shape (1, 12804, 91) is 2.194374375133825
   
   fp16
   Mean Squared Error of output 0 and shape (1, 100, 4) is 0.01756046526134014
   Mean Squared Error of output 1 and shape (1, 100) is 8.5600004196167
   Mean Squared Error of output 2 and shape (1, 100) is 5.59057809823571e-07
   Mean Squared Error of output 3 and shape (1,) is 0.0
   Mean Squared Error of output 4 and shape (1, 12804, 4) is 5.098227120470256e-07
   Mean Squared Error of output 5 and shape (1, 12804, 91) is 2.664001463870136e-09
   ``` 
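
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a per-output MSE report in the same format as the log above. It is not the script that produced those numbers: the names `mean_squared_error` and `report` are hypothetical, and it assumes the reference outputs and the TVM outputs are already available as NumPy arrays in matching order.

```python
import numpy as np


def mean_squared_error(reference, candidate):
    """Return the MSE between two same-shaped arrays, computed in float64."""
    # Upcast first so that fp16 inputs are not squared and summed in half precision.
    ref = np.asarray(reference, dtype=np.float64)
    cand = np.asarray(candidate, dtype=np.float64)
    if ref.shape != cand.shape:
        raise ValueError(f"shape mismatch: {ref.shape} vs {cand.shape}")
    return float(np.mean((ref - cand) ** 2))


def report(reference_outputs, candidate_outputs):
    """Build one log line per output, mirroring the format used above."""
    lines = []
    for i, (ref, cand) in enumerate(zip(reference_outputs, candidate_outputs)):
        mse = mean_squared_error(ref, cand)
        lines.append(
            f"Mean Squared Error of output {i} and shape {np.shape(ref)} is {mse}"
        )
    return lines


if __name__ == "__main__":
    # Synthetic stand-in data (an assumption, not real model outputs):
    # round-tripping through float16 mimics the precision lost in an fp16 run.
    rng = np.random.default_rng(0)
    ref = rng.standard_normal((1, 100, 4)).astype(np.float32)
    fp16_like = ref.astype(np.float16).astype(np.float32)
    for line in report([ref], [fp16_like]):
        print(line)
```

Upcasting to float64 before taking the difference keeps the metric itself from adding rounding error on top of whatever the fp16 model introduced.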


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
