altanh commented on PR #11341: URL: https://github.com/apache/tvm/pull/11341#issuecomment-1159227969
> I prepared the environment for performance testing and expect to provide latency/throughput curves for different configurations tomorrow.
> The picture above shows the performance improvements in concatenation for the DLRM model. The max throughput/min latency values are as follows:
> Main branch:
> max throughput 4481.75, i:95, t:1
> min latency 1.15, i:1, t:20
> New concat branch, concatenation is inlined:
> max throughput 5471.46, i:95, t:1
> min latency 1.1, i:1, t:17
> New concat branch, concatenation is opaque:
> max throughput 5681.72, i:92, t:1
> min latency 1.0, i:1, t:12
>
> I'm trying to identify the root cause of the performance drop for inlined concatenation, and it looks like the problem is connected with the reshape layer. In the inlined version this layer is not removed from the pipeline, which leads to the performance drop.

Is the performance testing script available? I think this concat change might have caused some regressions on some vision models, so I just wanted to see if I can replicate the results locally.
