argrento opened a new pull request, #11180:
URL: https://github.com/apache/tvm/pull/11180

   Profiling in TVM is enabled or disabled at compile time by the `USE_PROFILER` 
switch. This means that if we enable profiling in `config.cmake` but do not use 
any profiling features in the app, the OpenCL runtime is still forced to collect 
`cl_event` objects.
   
   To reproduce, build TVM with `set(USE_PROFILER ON)`.
   Consider a simple app that creates a module from a `.so` file:
   ```cpp
   // `ctx` is the target DLDevice (an OpenCL device in this scenario), defined elsewhere.
   // Load the compiled model and instantiate the graph executor on that device.
   tvm::runtime::Module mod_factory = tvm::runtime::Module::LoadFromFile("model.so");
   tvm::runtime::Module gmod = mod_factory.GetFunction("default")(ctx);
   tvm::runtime::PackedFunc set_input = gmod.GetFunction("set_input");
   tvm::runtime::PackedFunc get_input = gmod.GetFunction("get_input");
   tvm::runtime::PackedFunc get_output = gmod.GetFunction("get_output");
   tvm::runtime::PackedFunc run = gmod.GetFunction("run");
   
   // set inputs and outputs
   
   // Run inference repeatedly; no profiling API is called anywhere.
   size_t niterations = 5000;
   for (size_t i = 0; i < niterations; i++) {
     run();
   }
   ```
   
   Then we collect memory usage info with Valgrind.
   ```
       MB
   818.5^                                                                       #
        |                                                                    @@@#
        |                                                                @@@@@@@#
        |                                                            @@@@@@@@@@@#
        |                                                         @@@@@@@@@@@@@@#
        |                                                     ::::@@@@@@@@@@@@@@#
        |                                                  @:::: :@@@@@@@@@@@@@@#
        |                                             @@@:@@:::: :@@@@@@@@@@@@@@#
        |                                           @@@ @:@@:::: :@@@@@@@@@@@@@@#
        |                                      :@@@@@@@ @:@@:::: :@@@@@@@@@@@@@@#
        |                                  :@@::@@@ @@@ @:@@:::: :@@@@@@@@@@@@@@#
        |                               ::::@ ::@@@ @@@ @:@@:::: :@@@@@@@@@@@@@@#
        |                           :@@@: ::@ ::@@@ @@@ @:@@:::: :@@@@@@@@@@@@@@#
        |                       :::@:@ @: ::@ ::@@@ @@@ @:@@:::: :@@@@@@@@@@@@@@#
        |                   ::@@: :@:@ @: ::@ ::@@@ @@@ @:@@:::: :@@@@@@@@@@@@@@#
        |                @@@: @@: :@:@ @: ::@ ::@@@ @@@ @:@@:::: :@@@@@@@@@@@@@@#
        |           :::::@@ : @@: :@:@ @: ::@ ::@@@ @@@ @:@@:::: :@@@@@@@@@@@@@@#
        | @@:::::::@:: ::@@ : @@: :@:@ @: ::@ ::@@@ @@@ @:@@:::: :@@@@@@@@@@@@@@#
        | @ :::::::@:: ::@@ : @@: :@:@ @: ::@ ::@@@ @@@ @:@@:::: :@@@@@@@@@@@@@@#
        | @ :::::::@:: ::@@ : @@: :@:@ @: ::@ ::@@@ @@@ @:@@:::: :@@@@@@@@@@@@@@#
      0 +----------------------------------------------------------------------->Gi
        0                                                                   75.95
   ```
   
   We do not use any profiling info, but it is collected implicitly because of 
the compile-time switches:
   1. https://github.com/apache/tvm/blob/6babb89cbb9fc5ab718f8b996c7ce60bf5ebbefd/src/runtime/opencl/opencl_device_api.cc#L431
   2. https://github.com/apache/tvm/blob/6babb89cbb9fc5ab718f8b996c7ce60bf5ebbefd/src/runtime/opencl/opencl_module.cc#L84
   
   
   
   With the proposed modifications this behavior changes: the OpenCL command 
queue (`cl_command_queue`) is created in normal mode by default and is recreated 
with profiling capabilities only when the user explicitly invokes the profiler. 
When the profiling session finishes, the queue is recreated in normal mode 
again, which allows mixing `profile()` calls and `run()` calls.
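   
   The queue lifecycle described above can be pictured with a small, 
self-contained model. This is only a sketch under stated assumptions, not the 
code in this PR: `MockWorkspace`, `MockQueue`, `MockEvent`, 
`EnableQueueProfiling` and `PeakEvents` are hypothetical names, no OpenCL calls 
are made, and clearing the retained events when a session ends is an assumption 
of the model.
   
   ```cpp
   #include <cassert>
   #include <cstddef>
   #include <vector>

   // Hypothetical stand-ins for OpenCL objects; none of these names come from TVM.
   struct MockEvent {};                   // plays the role of cl_event
   struct MockQueue { bool profiling; };  // plays the role of cl_command_queue

   class MockWorkspace {
    public:
     // The queue starts in normal (non-profiling) mode.
     MockWorkspace() : queue_{false} {}

     // Recreate the queue only when the requested mode differs from the current one.
     void EnableQueueProfiling(bool enable) {
       if (queue_.profiling == enable) return;
       // A real implementation would have to drain and release the old queue here.
       queue_ = MockQueue{enable};
       ++recreations_;
       if (!enable) events_.clear();  // assumption: events are dropped after a session
     }

     // Launch a kernel; an event is retained only while profiling is on.
     void Launch() {
       if (queue_.profiling) events_.push_back(MockEvent{});
     }

     std::size_t retained_events() const { return events_.size(); }
     int recreations() const { return recreations_; }
     bool profiling() const { return queue_.profiling; }

    private:
     MockQueue queue_;
     std::vector<MockEvent> events_;
     int recreations_ = 0;
   };

   // Run `iterations` launches, optionally inside a profiling session, and
   // return the largest number of events held at any point during the run.
   std::size_t PeakEvents(MockWorkspace& ws, std::size_t iterations, bool profile) {
     ws.EnableQueueProfiling(profile);
     std::size_t peak = 0;
     for (std::size_t i = 0; i < iterations; ++i) {
       ws.Launch();
       if (ws.retained_events() > peak) peak = ws.retained_events();
     }
     ws.EnableQueueProfiling(false);  // always return to normal mode
     assert(!ws.profiling());
     return peak;
   }
   ```
   
   In this model, calling `PeakEvents(ws, 5000, false)` retains no events and 
never recreates the queue, while a profiled run retains one event per launch 
and recreates the queue twice (once on entry, once on exit).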
   
   With the proposed changes, Valgrind shows no abnormal memory usage for the 
example above.
   
   ```
       MB
   148.9^#                                                                      
 
        
|#::::::::::::::::::::::::::@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
        |#: ::::::::::::::: ::::::: 
@:::::::::::::@:::@:::::@::::::@:::::@:::::@:
      0 
+----------------------------------------------------------------------->Gi
        0                                                                   
83.07
   ```

