tkonolige opened a new pull request, #11066: URL: https://github.com/apache/tvm/pull/11066
`tvm.analysis.roofline_analysis` adds estimated roofline performance to a profiling report. The roofline model measures how close an operator gets to the best possible memory bandwidth or FLOP/s, depending on whether it is memory bound or compute bound. The computation uses the operator's runtime along with two numbers extracted from the TIR code: the bytes of memory touched and the number of floating-point operations. Because these numbers are extracted from TIR, they may not be 100% accurate. The best possible memory bandwidth and FLOP/s are measured by running small programs that are memory bound and compute bound, respectively. For now, this function only works with LLVM CPU targets, but it should be possible to extend it to GPU targets.

@AndrewZhaoLuo @mbrookhart @masahi
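
For readers unfamiliar with the roofline model, the arithmetic can be sketched in plain Python. This is only an illustration of the general model, not the code in this PR: the function names, parameters, and example figures below are all hypothetical, and the peak values are assumed to have been measured separately.

```python
def roofline_bound(flops, bytes_touched, peak_flops_per_s, peak_bytes_per_s):
    """Return (attainable FLOP/s, "memory" or "compute") for one operator.

    Hypothetical helper for illustration; not part of TVM.
    """
    if flops <= 0 or bytes_touched <= 0:
        raise ValueError("flops and bytes_touched must be positive")
    # Arithmetic intensity: floating-point operations per byte of memory touched.
    intensity = flops / bytes_touched
    # Highest FLOP/s the memory system can sustain at this intensity.
    memory_ceiling = intensity * peak_bytes_per_s
    if memory_ceiling < peak_flops_per_s:
        return memory_ceiling, "memory"
    return peak_flops_per_s, "compute"


def percent_of_roofline(flops, bytes_touched, runtime_s,
                        peak_flops_per_s, peak_bytes_per_s):
    """Return (percent of the attainable roofline, which resource bounds it)."""
    attainable, bound = roofline_bound(
        flops, bytes_touched, peak_flops_per_s, peak_bytes_per_s
    )
    achieved = flops / runtime_s  # FLOP/s the operator actually reached
    return 100.0 * achieved / attainable, bound


# Made-up numbers: 2 GFLOP and 4 GB touched in 0.125 s on a machine with
# an assumed peak of 100 GFLOP/s and 50 GB/s.
percent, bound = percent_of_roofline(2e9, 4e9, 0.125, 1e11, 5e10)
print(f"{percent:.1f}% of roofline ({bound} bound)")
```

With these example figures the arithmetic intensity is 0.5 FLOP/byte, so the memory ceiling (25 GFLOP/s) sits below the compute peak and the operator counts as memory bound.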
