dvhg opened a new issue #8233:
URL: https://github.com/apache/tvm/issues/8233
I'm trying to run PyTorch MaskRCNN on GPU and have been running into GPU
memory issues. I get errors when running repeated inferences with different
inputs. The error messages vary somewhat, but this is the most common:
```
terminate called after throwing an instance of 'dmlc::Error'
what(): [20:11:56] /home/ubuntu/tvm/include/tvm/runtime/device_api.h:260:
unknown type =0
```
When looking at GPU memory usage (using `nvidia-smi`), I see memory usage
increases over time until the test crashes once it nears the maximum. I'm
running this on Ubuntu 18.04 and a T4 GPU with 16GB of GPU memory.
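For anyone reproducing this, a quick way to watch the leak is to poll `nvidia-smi` while the script runs. Here is a rough helper I use (the names `parse_used_mib` and `query_gpu_memory_used` are just mine, and it of course requires `nvidia-smi` on the machine):

```python
import subprocess

def parse_used_mib(smi_output):
    # Parse the plain-number lines produced by:
    #   nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits
    # One integer (MiB) per visible GPU.
    return [int(line.strip()) for line in smi_output.strip().splitlines()]

def query_gpu_memory_used():
    # Requires nvidia-smi on PATH; returns used memory in MiB per GPU.
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_used_mib(out)
```

Calling `query_gpu_memory_used()` once per iteration of the inference loop makes the steady growth easy to log.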
Following the form of the unit test from `test_tensorrt.py`, the script below
should reproduce the problem I'm seeing. It differs from the unit test
in two ways:
1. The VM is run on GPU instead of CPU:
```
ctx = tvm.gpu(0)
vm = VirtualMachine(vm_exec, ctx)
```
2. Inference is run on many different inputs (from the COCO dataset) rather
than a single input.
@masahi, I heard you've been working on PyTorch MaskRCNN. Have you seen this
issue in your testing, or is there a problem in my script? Thank you!
```
import tvm
from tvm import relay
from tvm.runtime.vm import VirtualMachine
from tvm.contrib.download import download
import numpy as np
import cv2
import torch
import torchvision
in_size = 300
input_shape = (1, 3, in_size, in_size)
def do_trace(model, inp):
    model_trace = torch.jit.trace(model, inp)
    model_trace.eval()
    return model_trace
def dict_to_tuple(out_dict):
    if "masks" in out_dict.keys():
        return out_dict["boxes"], out_dict["scores"], out_dict["labels"], out_dict["masks"]
    return out_dict["boxes"], out_dict["scores"], out_dict["labels"]
class TraceWrapper(torch.nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, inp):
        out = self.model(inp)
        return dict_to_tuple(out[0])
model_func = torchvision.models.detection.maskrcnn_resnet50_fpn
model = TraceWrapper(model_func(pretrained=True))
model.eval()
inp = torch.Tensor(np.random.uniform(0.0, 250.0, size=(1, 3, in_size, in_size)))
with torch.no_grad():
    out = model(inp)
    script_module = do_trace(model, inp)
input_name = "input0"
shape_list = [(input_name, input_shape)]
mod, params = relay.frontend.from_pytorch(script_module, shape_list)
target = "cuda"
with tvm.transform.PassContext(opt_level=3, disabled_pass=["FoldScaleAxis"]):
    vm_exec = relay.vm.compile(mod, target=target, params=params)
ctx = tvm.gpu(0)
vm = VirtualMachine(vm_exec, ctx)
img_dirpath = 'data/COCO_2017/subset/val2017/'
i = 0
import os
for root, dirs, files in os.walk(img_dirpath):
    for f in files:
        print(i)
        i += 1
        imgname = os.path.join(root, f)
        img = cv2.imread(imgname)
        img = cv2.resize(img, (in_size, in_size))
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = np.transpose(img / 255.0, [2, 0, 1])
        img = np.expand_dims(img, axis=0).astype('float32')
        vm.set_input("main", **{input_name: img})
        tvm_res = vm.run()
```
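One cheap experiment to rule out Python-side reference retention would be forcing a garbage collection between inferences; a sketch (the `run_one` callable here is hypothetical and stands in for the preprocessing-plus-`vm.run()` body of the loop above):

```python
import gc

def run_with_gc(run_one, inputs):
    # run_one: hypothetical callable wrapping the per-image preprocessing
    # and vm.run() call from the loop above; inputs: iterable of images.
    results = []
    for x in inputs:
        results.append(run_one(x))
        # Drop Python-side references promptly; if GPU memory still grows,
        # the leak is likely below the Python layer.
        gc.collect()
    return results
```

If GPU memory keeps climbing even with explicit collection, that would point at the runtime's allocator rather than something my script holds on to.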