masahi opened a new pull request, #14147:
URL: https://github.com/apache/tvm/pull/14147

   https://github.com/apache/tvm/pull/13877 lowered memory usage for VM runtime 
load / save. The same issue observed by @AndrewZhaoLuo is still present in the 
graph runtime counterparts, `tvm.runtime.load_param_dict` and 
`tvm.runtime.save_param_dict`. In addition, `save_param_dict` needlessly makes 
another copy of each `ndarray`. 
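
To illustrate why one extra copy per array matters, here is a minimal, TVM-independent sketch. It is not the code touched by this PR: `serialize_with_copy`, `serialize_without_copy`, `CountingSink`, and `peak_bytes` are hypothetical names, and plain `bytearray`s stand in for `ndarray`s. It only shows, under those assumptions, how a per-buffer copy during serialization raises peak allocation compared with writing through a zero-copy view.

```python
import tracemalloc


class CountingSink:
    """Stand-in for an output stream: counts bytes and discards them."""

    def __init__(self):
        self.nbytes = 0

    def write(self, data):
        self.nbytes += len(data)


def serialize_with_copy(buffers, sink):
    # Hypothetical "careless" path: bytes(buf) duplicates the buffer
    # before it is handed to the sink.
    for buf in buffers:
        sink.write(bytes(buf))


def serialize_without_copy(buffers, sink):
    # memoryview exposes the existing memory without duplicating it.
    for buf in buffers:
        sink.write(memoryview(buf))


def peak_bytes(fn, *args):
    """Peak bytes allocated (as traced by tracemalloc) while fn(*args) runs."""
    tracemalloc.start()
    try:
        fn(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


if __name__ == "__main__":
    # Buffers are allocated before tracing starts, so only the
    # serialization overhead is measured.
    buffers = [bytearray(1 << 20) for _ in range(4)]
    print("with copy:   ", peak_bytes(serialize_with_copy, buffers, CountingSink()))
    print("without copy:", peak_bytes(serialize_without_copy, buffers, CountingSink()))
```

In this sketch the copying path peaks at roughly the size of one buffer on top of the data already held, while the view-based path allocates almost nothing extra.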
   
   All of this leads to an absurd RAM requirement when loading / serializing 
the Stable Diffusion UNet. The total size of its parameters is < 3.5G, but loading 
or saving these params takes more than 16G of RAM (!!):
   ```
   $ cat test_params.py
   import tvm
   with open("unet.params", "rb") as fi:
       params_ref = tvm.runtime.load_param_dict(fi.read())
   
   with open("unet.params", "wb") as fo:
       fo.write(tvm.runtime.save_param_dict(params_ref))
   
   $ /usr/bin/time -v python test_params.py 
           ...
           System time (seconds): 9.57
           ....
           Maximum resident set size (kbytes): 16921216
           ...
   ``` 
   
   After this PR:
   ```
   $ /usr/bin/time -v python test_params.py 
           System time (seconds): 3.06
           Maximum resident set size (kbytes): 3492820
   ```
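
The same peak-memory figure can also be read from inside the script, which may be handy where `/usr/bin/time -v` is unavailable. This is only a sketch using the standard-library `resource` module (Unix only); `peak_rss_kib` is a hypothetical helper name, not part of TVM. Unlike `/usr/bin/time`, it reports on the current process rather than on a child process.

```python
import resource
import sys


def peak_rss_kib():
    """Return this process's peak resident set size in KiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    if sys.platform == "darwin":
        peak //= 1024
    return peak


if __name__ == "__main__":
    # Call this at the end of a script such as test_params.py,
    # after the load / save round trip has finished.
    print(f"Peak RSS: {peak_rss_kib()} KiB")
```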


