masahi commented on issue #8295: URL: https://github.com/apache/tvm/issues/8295#issuecomment-884813492
I can confirm that TF2 ssd mobilenet v2 can be converted to fp16 and runs on vulkan (AMD) and opencl (Intel Ice lake), if I disable vectorization on fp16 at https://github.com/apache/tvm/blob/main/python/tvm/topi/cuda/injective.py#L54-L55 (cc @Lunderberg).

But the output from fp16 is a bit off compared to fp32 (on both vk and ocl):

```
fp32
Mean Squared Error of output 0 and shape (1, 100, 4) is 9.562732618023824e-15
Mean Squared Error of output 1 and shape (1, 100) is 0.0
Mean Squared Error of output 2 and shape (1, 100) is 4.539840725570343e-13
Mean Squared Error of output 3 and shape (1,) is 0.0
Mean Squared Error of output 4 and shape (1, 12804, 4) is 3.1784283863710294e-13
Mean Squared Error of output 5 and shape (1, 12804, 91) is 2.194374375133825

fp16
Mean Squared Error of output 0 and shape (1, 100, 4) is 0.01756046526134014
Mean Squared Error of output 1 and shape (1, 100) is 8.5600004196167
Mean Squared Error of output 2 and shape (1, 100) is 5.59057809823571e-07
Mean Squared Error of output 3 and shape (1,) is 0.0
Mean Squared Error of output 4 and shape (1, 12804, 4) is 5.098227120470256e-07
Mean Squared Error of output 5 and shape (1, 12804, 91) is 2.664001463870136e-09
```
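
For anyone reproducing the comparison, a per-output report in the same format as the log above could be produced roughly as follows. This is only a sketch, not the script used for the numbers in this comment: `report_mse` is a hypothetical helper, the comment does not say which outputs served as the reference, and the random tensors are stand-ins for real model outputs.

```python
import numpy as np


def report_mse(reference_outputs, test_outputs):
    """Print and return the mean squared error for each pair of outputs.

    Hypothetical helper; both arguments are sequences of array-likes
    with matching shapes (for example reference outputs vs. TVM outputs).
    """
    errors = []
    for i, (ref, out) in enumerate(zip(reference_outputs, test_outputs)):
        # Compare in float64 so the metric itself adds no fp16 rounding error.
        ref64 = np.asarray(ref, dtype=np.float64)
        out64 = np.asarray(out, dtype=np.float64)
        mse = float(np.mean((ref64 - out64) ** 2))
        print(f"Mean Squared Error of output {i} and shape {ref64.shape} is {mse}")
        errors.append(mse)
    return errors


# Stand-in data (assumption): random fp32 tensors and the same values rounded to fp16.
rng = np.random.default_rng(0)
reference = [
    rng.standard_normal((1, 100, 4)).astype(np.float32),
    rng.standard_normal((1, 100)).astype(np.float32),
]
rounded = [r.astype(np.float16) for r in reference]
errors = report_mse(reference, rounded)
```

With real data, `reference` would hold the baseline outputs and `rounded` would be replaced by the outputs of the fp32 or fp16 TVM build. Pure fp16 rounding of values of order one produces only a very small MSE, which gives a rough baseline for judging the larger fp16 errors reported above.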
