masahi opened a new pull request, #14147: URL: https://github.com/apache/tvm/pull/14147
https://github.com/apache/tvm/pull/13877 lowered memory usage for VM runtime load / save. The same issue observed by @AndrewZhaoLuo is still present in the graph runtime counterparts, `tvm.runtime.load_param_dict` and `tvm.runtime.save_param_dict`. In addition, we are carelessly making another copy of the `ndarray`s in `save_param_dict`.

All of this leads to an absurd RAM requirement when loading / serializing the stable diffusion UNet. The total size of its parameters is < 3.5G, but loading or saving these params takes more than 16G of RAM (!!):

```
$ cat test_params.py
import tvm

with open("unet.params", "rb") as fi:
    params_ref = tvm.runtime.load_param_dict(fi.read())

with open("unet.params", "wb") as fo:
    fo.write(tvm.runtime.save_param_dict(params_ref))

$ /usr/bin/time -v python test_params.py
...
System time (seconds): 9.57
....
Maximum resident set size (kbytes): 16921216
...
```

After this PR:

```
$ /usr/bin/time -v python test_params.py
System time (seconds): 3.06
Maximum resident set size (kbytes): 3492820
```
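For anyone who wants to check the peak memory from inside Python instead of through `/usr/bin/time -v`, here is a minimal sketch using the standard `resource` module (Unix only). The helper names `peak_rss_kib` and `measure` are hypothetical and not part of TVM, and the workload is a stand-in buffer copy rather than the actual UNet params.

```python
import resource
import sys


def peak_rss_kib() -> int:
    """Return this process's peak resident set size in KiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    return peak // 1024 if sys.platform == "darwin" else peak


def measure(label, fn):
    """Run fn(), print the peak RSS before and after, and return the growth."""
    before = peak_rss_kib()
    result = fn()
    after = peak_rss_kib()
    print(f"{label}: peak RSS {before} KiB -> {after} KiB")
    return result, after - before


# Stand-in workload (an assumption, not the real params file):
# a 64 MiB buffer, then an explicit copy of it.
blob, _ = measure("allocate", lambda: bytearray(64 * 1024 * 1024))
copy, growth = measure("copy", lambda: bytes(blob))
```

The same `measure` wrapper could be put around the `tvm.runtime.load_param_dict(...)` and `tvm.runtime.save_param_dict(...)` calls in `test_params.py`. Since the peak only ever grows over the life of a process, the per-call growth is most meaningful when each call is measured in a fresh process.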
