srkreddy1238 opened a new pull request, #13843: URL: https://github.com/apache/tvm/pull/13843
Tuning cache bin is serialized through DMLC::Stream to support multiple CLML sub graphs with in a tvm module. Individual tuning cache blobs are saved to same output file. New API on OpenCLWorkspace to enable or disable profiling on command queue rather doing this only when Timer is invoked. This is required to perform CLML operator tuning. CLML layer profiling now uses OpenCL Timer interface. This PR also fix avoiding pad operator offloading at the very first layer (to be specific before at least one convolution layer) due to the limitation of CLML pad operator is concerned about layout. Please refer to CLML SDK documentation for more details. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
