LeiWang1999 commented on PR #14329:
URL: https://github.com/apache/tvm/pull/14329#issuecomment-1478927917
## Microbenchmark
- Test device: A100 (async copy works best on devices with high memory bandwidth, such as the A100 or H100).
- CUDA version: 12.0
- Tested diffusion Conv2d shapes.
- Tuner: none (scheduled by hand to isolate the performance impact; the schedule is not optimal).
- Diffusion Conv2d benchmark of vectorized `if_then_else` async copy (nhwc_nhwc layout, fp16 precision, Tensor Cores enabled):
| | N | C | H | W | CO | K | S | D | P |
| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| C8 | 2 | 640 | 64 | 64 | 640 | 3 | 1 | 1 | 1 |
| C11 | 2 | 960 | 32 | 32 | 640 | 3 | 1 | 1 | 1 |
| C13 | 2 | 1280 | 32 | 32 | 1280 | 3 | 1 | 1 | 1 |
Performance (the weight load is vectorized and asynchronous in both cases):
| | without vectorized if_then_else async (ms) | with vectorized if_then_else async (ms) |
| ---- | ------------------------------------------ | --------------------------------------- |
| C8 | 0.779605 | 0.543061 |
| C11 | 0.544085 | 0.356011 |
| C13 | 0.267264 | 0.218794 |
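To make the mechanism concrete, here is a hedged sketch (not the PR's generated code; names, tile sizes, and the padding logic are illustrative) of why vectorizing the `if_then_else` helps: the boundary predicate is hoisted from per-element to per-vector, so an in-bounds 8×fp16 chunk can be issued as a single 16-byte `cp.async` instead of eight predicated scalar loads.

```cuda
#include <cuda_fp16.h>

// Scalar if_then_else: one predicate and one element per copy, so the
// load can never be issued as a single 16-byte async transaction.
__device__ void load_scalar(half* smem, const half* gmem,
                            int w, int W /* row width */, int pad) {
  for (int v = 0; v < 8; ++v) {
    int x = w + v - pad;
    smem[v] = (0 <= x && x < W) ? gmem[x] : __float2half(0.f);
  }
}

// Vectorized if_then_else: the predicate is evaluated once for the whole
// 8-element vector. In-bounds chunks (the common case) become one 16-byte
// cp.async; only boundary chunks fall back to the scalar path.
__device__ void load_vectorized(half* smem, const half* gmem,
                                int w, int W, int pad) {
  int x0 = w - pad;
  if (0 <= x0 && x0 + 8 <= W) {
    unsigned dst = (unsigned)__cvta_generic_to_shared(smem);
    asm volatile("cp.async.cg.shared.global [%0], [%1], 16;\n"
                 :: "r"(dst), "l"(gmem + x0));
  } else {
    load_scalar(smem, gmem, w, W, pad);  // boundary fallback, stays scalar
  }
}
```

For the diffusion Conv2d shapes above, padding only affects the outermost rows/columns, so almost every chunk takes the fast single-copy path.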
- GEMM benchmark to see the `:l2` cache intrinsic's influence (float16-float16, row-col, Tensor Cores):
| | M | N | K | without `:l2` (ms) | with `:l2` (ms) |
| ------ | ----- | ----- | ----- | ------------------ | --------------- |
| GEMM-0 | 256 | 256 | 256 | 0.020821 | 0.020821 |
| GEMM-1 | 16384 | 16384 | 16384 | 45.1103 | 45.1103 |
The `:l2` results were as expected: in my previous tests I did not see any real performance impact either. I leverage the L2 cache in a different way, and I will wait until that work is ready before submitting another pull request. Even though the `:l2` feature has no effect in these tests, I still think it is worth adding, because the kernels of SOTA libraries like CUTLASS/cuBLAS enable it.
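For reference, my understanding is that the `:l2` annotation corresponds to the `.L2::128B` prefetch qualifier on `cp.async` in the PTX ISA (the helper names below are hypothetical; only the PTX instructions are from the ISA spec). A hedged sketch of the two forms:

```cuda
// Both issue a 16-byte asynchronous global->shared copy; dst_smem /
// src_gmem are illustrative names, not from the PR.

// Without :l2 - plain cp.async.
__device__ void cp_async_16(void* dst_smem, const void* src_gmem) {
  unsigned dst = (unsigned)__cvta_generic_to_shared(dst_smem);
  asm volatile("cp.async.cg.shared.global [%0], [%1], 16;\n"
               :: "r"(dst), "l"(src_gmem));
}

// With :l2 - same copy, but additionally asks the hardware to prefetch
// the surrounding 128-byte L2 sector of the source address.
__device__ void cp_async_16_l2(void* dst_smem, const void* src_gmem) {
  unsigned dst = (unsigned)__cvta_generic_to_shared(dst_smem);
  asm volatile("cp.async.cg.shared.global.L2::128B [%0], [%1], 16;\n"
               :: "r"(dst), "l"(src_gmem));
}
```

Since the prefetch is only a hint to the L2, a neutral result on bandwidth-bound GEMMs like the ones above is plausible; the hint mainly matters when later accesses reuse the prefetched sector.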

This may need more discussion.