[GitHub] [incubator-mxnet] bgawrych opened a new pull request #20925: [v1.x] Reduce after quantization memory usage

GitBox Wed, 02 Mar 2022 02:23:22 -0800


bgawrych opened a new pull request #20925:
URL: https://github.com/apache/incubator-mxnet/pull/20925



   ## Description ##
   Port of https://github.com/apache/incubator-mxnet/pull/20894
   
   Script used to measure process memory (RSS, in MB) at each stage:
   ```
   import mxnet as mx
   from mxnet.gluon.model_zoo import vision
   import psutil
   import os
   
   def get_process_memory():
       process = psutil.Process(os.getpid())
       mem_info = process.memory_info()
       return mem_info.rss * 1e-6
   
   
   batch_shape = (1, 3, 224, 224)
   data = mx.nd.random.normal(shape=batch_shape)
   
   print("memory before loading model: ", get_process_memory())
   net = vision.resnet50_v1(pretrained=True)
   print("memory after loading model: ", get_process_memory()) 
   out = net(data)
   out.wait_to_read()
   print("memory after fp32 forward pass", get_process_memory())
   
   indata = {'data':data}
   label = {'label':mx.nd.zeros(shape=(1,))}
   dataiter = mx.io.NDArrayIter(indata, label, 3, True,
                                last_batch_handle='discard')
   net_quantized = mx.contrib.quant.quantize_net(net, quantized_dtype='auto',
                                                   quantize_mode="smart",
                                                   calib_mode='naive',
                                                   calib_data=dataiter,
                                                   num_calib_examples=1,
                                                   ctx=mx.current_context())
   
   print("memory after quantization: ", get_process_memory())
   
   outputs = net_quantized(data)
   outputs.wait_to_read()
   print("memory after int8 forward pass: ", get_process_memory())
   ```
   
   Output before this change (RSS in MB):
   ```
   memory before loading model:  201.936896
   memory after loading model:  433.41004799999996
   memory after fp32 forward pass 523.698176
   memory after quantization:  1308.803072
   memory after int8 forward pass:  1313.349632
   ```
   
   Output after this change (RSS in MB):
   ```
   memory before loading model:  202.502144
   memory after loading model:  434.184192
   memory after fp32 forward pass 520.986624
   memory after quantization:  1136.570368
   memory after int8 forward pass:  1141.485568
   ```
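
To make the comparison easier to read, the two outputs above can be reduced to a few differences. The snippet below is only a sketch: the dictionary keys and the `quantization_cost` helper are hypothetical names introduced here, the numbers are copied from the single runs shown above, and RSS readings vary between runs, so the result should be taken as approximate.

```python
# RSS readings in MB, copied from the "before" and "after" outputs above.
before = {
    "after fp32 forward pass": 523.698176,
    "after quantization": 1308.803072,
    "after int8 forward pass": 1313.349632,
}
after = {
    "after fp32 forward pass": 520.986624,
    "after quantization": 1136.570368,
    "after int8 forward pass": 1141.485568,
}


def quantization_cost(run):
    """Memory added by quantize_net: RSS after quantization minus RSS after the fp32 pass."""
    return run["after quantization"] - run["after fp32 forward pass"]


# Absolute drop in RSS measured right after quantization.
saved = before["after quantization"] - after["after quantization"]
# Memory attributable to the quantization step itself, for each run.
cost_before = quantization_cost(before)
cost_after = quantization_cost(after)

print(f"RSS after quantization dropped by {saved:.1f} MB "
      f"({100 * saved / before['after quantization']:.1f}%)")
print(f"Quantization step cost: {cost_before:.1f} MB before, {cost_after:.1f} MB after")
```

On these numbers, RSS after quantization is about 172 MB lower (roughly 13%), and the memory added by the quantization step itself falls from about 785 MB to about 616 MB.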

