altanh commented on PR #11341:
URL: https://github.com/apache/tvm/pull/11341#issuecomment-1159227969

   > I prepared the environment for performance testing and expect to provide 
latency/throughput curves for different configurations tomorrow. ![res_DLRM 
100G](https://user-images.githubusercontent.com/88086617/170212554-c571240a-23c3-40b3-a632-aba98de621f3.png)
   > The picture above shows the concatenation performance improvement for the DLRM 
model. The max throughput / min latency values are as follows:
   > 
   > - Main branch: max throughput 4481.75 (i:95, t:1), min latency 1.15 (i:1, t:20)
   > - New concat branch, concatenation inlined: max throughput 5471.46 (i:95, t:1), min latency 1.1 (i:1, t:17)
   > - New concat branch, concatenation opaque: max throughput 5681.72 (i:92, t:1), min latency 1.0 (i:1, t:12)
   > 
   > I'm trying to identify the root cause of the performance drop for inlined 
concatenation, and it looks like the problem is connected with the reshape layer. 
In the inlined version this layer is not removed from the pipeline, which leads 
to the performance drop.
   
   Is the performance testing script available? I think this concat change 
may have caused regressions on some vision models, so I wanted to see 
whether I can replicate the results locally.
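
   To make the replication attempt concrete: the quoted results sweep over 
(iterations `i`, threads `t`) configurations. A minimal, library-agnostic sketch 
of such a latency/throughput harness is below; `run_inference` is a hypothetical 
stand-in for one model invocation (e.g. a compiled module's `run()`), not the 
actual testing script from this PR:

   ```python
   import time
   from concurrent.futures import ThreadPoolExecutor

   def run_inference():
       # Hypothetical stand-in for one model invocation; here just a
       # small CPU-bound workload so the sketch is self-contained.
       return sum(i * i for i in range(10_000))

   def benchmark(iterations: int, threads: int) -> dict:
       """Measure mean latency (ms) and throughput (requests/s)
       for a given (iterations, threads) configuration."""
       latencies = []

       def timed_call(_):
           start = time.perf_counter()
           run_inference()
           latencies.append(time.perf_counter() - start)

       wall_start = time.perf_counter()
       with ThreadPoolExecutor(max_workers=threads) as pool:
           # Drain the iterator so all calls complete before timing stops.
           list(pool.map(timed_call, range(iterations)))
       wall = time.perf_counter() - wall_start

       return {
           "mean_latency_ms": 1000 * sum(latencies) / len(latencies),
           "throughput_rps": iterations / wall,
       }

   # Sweep a few (i, t) points, analogous to the quoted configurations.
   for i, t in [(1, 1), (8, 2), (16, 4)]:
       stats = benchmark(i, t)
       print(f"i={i} t={t} "
             f"latency={stats['mean_latency_ms']:.2f}ms "
             f"throughput={stats['throughput_rps']:.1f}rps")
   ```

   Sweeping a grid of such points and taking the max throughput / min latency 
over all configurations would reproduce the shape of the numbers quoted above.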

