srkreddy1238 opened a new pull request, #13843:
URL: https://github.com/apache/tvm/pull/13843

   The tuning cache binary is now serialized through dmlc::Stream to support 
multiple CLML subgraphs within a single TVM module. The individual tuning cache 
blobs are all saved to the same output file.
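
   As a rough illustration of how several per-subgraph blobs can share one 
output file, here is a minimal sketch using length-prefixed records. It uses 
standard C++ streams as a stand-in for dmlc::Stream, and every name in it 
(WriteBytes, ReadBytes, SaveTuningCaches, LoadTuningCaches, the subgraph keys) 
is hypothetical. It is not the code or the on-disk format from this PR.

```cpp
#include <cstdint>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: write a byte string as <uint64 size><raw bytes>.
// The size is written in host byte order, so this sketch is not portable
// across machines with different endianness.
void WriteBytes(std::ostream& os, const std::string& bytes) {
  const std::uint64_t size = bytes.size();
  os.write(reinterpret_cast<const char*>(&size), sizeof(size));
  os.write(bytes.data(), static_cast<std::streamsize>(size));
}

// Hypothetical helper: read one record written by WriteBytes.
// Returns false at end of stream or on a truncated record.
bool ReadBytes(std::istream& is, std::string* out) {
  std::uint64_t size = 0;
  if (!is.read(reinterpret_cast<char*>(&size), sizeof(size))) return false;
  out->resize(size);
  if (size == 0) return true;
  return static_cast<bool>(
      is.read(&(*out)[0], static_cast<std::streamsize>(size)));
}

// Append every (subgraph name, tuning blob) pair to the same stream.
void SaveTuningCaches(std::ostream& os,
                      const std::map<std::string, std::string>& caches) {
  for (const auto& kv : caches) {
    WriteBytes(os, kv.first);   // assumed key: the subgraph name
    WriteBytes(os, kv.second);  // opaque tuning cache blob
  }
}

// Read (name, blob) pairs until the stream is exhausted.
std::map<std::string, std::string> LoadTuningCaches(std::istream& is) {
  std::map<std::string, std::string> caches;
  std::string name, blob;
  while (ReadBytes(is, &name) && ReadBytes(is, &blob)) {
    caches[name] = blob;
  }
  return caches;
}
```

   Keying each blob by a subgraph name lets each subgraph look up its own 
cache after loading, without depending on the order in which the blobs were 
written.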
   
   A new API on OpenCLWorkspace enables or disables profiling on the command 
queue, rather than doing this only when Timer is invoked. This is required to 
perform CLML operator tuning.
   
   CLML layer profiling now uses the OpenCL Timer interface.
   
   This PR also avoids offloading the pad operator at the very first layer 
(specifically, before at least one convolution layer), because the CLML pad 
operator has a layout-related limitation. Please refer to the CLML SDK 
documentation for more details.

