Elonaever opened a new issue #7749:
URL: https://github.com/apache/tvm/issues/7749


   I have tried to deploy my own TensorFlow model with TVM, and because of the 
dynamic_rnn layer, I have to use the VM runtime.
   Though I can successfully run inference with TVM, the performance is very 
poor. While TensorFlow and ONNX Runtime each take about 200 ms per inference, 
TVM takes almost 1400 ms.
   I wonder whether this gap is expected, and what causes it.
   My platform is an x86-64 CPU under WSL, and I deploy through the Python API.
   
   Here is some of my code (graph_def, layout, shape_dict, output_name, and 
target are defined earlier and not shown):
   
   import tvm
   from tvm import relay
   from tvm.runtime.vm import VirtualMachine
   
   # Import the TensorFlow graph into Relay.
   mod, params = relay.frontend.from_tensorflow(
       graph_def, layout=layout, shape=shape_dict, outputs=output_name)
   print("Tensorflow protobuf imported to relay frontend.")
   
   with tvm.transform.PassContext(opt_level=3, disabled_pass=None):
       print(mod["main"])
       mod = relay.transform.InferType()(mod)
       # Compile for the VM runtime, which is required for dynamic_rnn.
       vm_exec = relay.vm.compile(mod, target=target, params=params)
   
   # Serialize the executable: VM bytecode plus the compiled kernel library.
   code, lib = vm_exec.save()
   path_lib = "/home/ceuiou/tvm-tf/lib.so"
   lib.export_library(path_lib)
   with open("/home/ceuiou/tvm-tf/code.ro", "wb") as fo:
       fo.write(code)
   
   # Reload the artifacts and reconstruct the VM.
   loaded_lib = tvm.runtime.load_module(path_lib)
   loaded_code = bytearray(open("/home/ceuiou/tvm-tf/code.ro", "rb").read())
   vm_exec = tvm.runtime.vm.Executable.load_exec(loaded_code, loaded_lib)
   vm = VirtualMachine(vm_exec, tvm.cpu())
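   
   For reference, this is roughly how I time one inference. It is a minimal 
sketch; the input shape and dtype below are placeholders, not my real model's:
   
   import time
   import numpy as np
   
   # Dummy input; my real model feeds preprocessed data with its own shape.
   data = tvm.nd.array(np.random.rand(1, 100, 64).astype("float32"), tvm.cpu())
   
   vm.run(data)  # warm-up run, so one-time setup cost is not timed
   start = time.time()
   result = vm.run(data)
   print("one inference: %.1f ms" % ((time.time() - start) * 1000))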

