Maximilianxu opened a new issue #7758:
URL: https://github.com/apache/tvm/issues/7758


   
   Hi,
   I want to deploy a BERT-base model on an Android phone. One of its parameters has shape (30522, 768) with dtype float32, and the RPC connection is reset every time I allocate this array on the remote device.
   
   ```
   for pk, pv in params.items():
       print(pv.shape, pv.dtype)
       weights[pk] = tvm.nd.array(np.random.uniform(size=pv.shape).astype(pv.dtype), ctx=ctx)
   ```
   The error message:
   ```
   Traceback (most recent call last):
     File "tune_network_x86.py", line 483, in <module>
       tune_network()
     File "tune_network_x86.py", line 423, in tune_network
       weights[pk] = tvm.nd.array((np.random.uniform(size=pv.shape)).astype(pv.dtype), ctx=ctx)
     File "/home/zyx/workspaces/python/tvm0.8_v2/python/tvm/runtime/ndarray.py", line 516, in array
       return empty(arr.shape, arr.dtype, ctx).copyfrom(arr)
     File "/home/zyx/workspaces/python/tvm0.8_v2/python/tvm/runtime/ndarray.py", line 154, in copyfrom
       check_call(_LIB.TVMArrayCopyFromBytes(self.handle, data, nbytes))
     File "/home/zyx/workspaces/python/tvm0.8_v2/python/tvm/_ffi/base.py", line 344, in check_call
       raise get_last_ffi_error()
   tvm._ffi.base.TVMError: Traceback (most recent call last):
     [bt] (6) /home/zyx/workspaces/python/tvm0.8_v2/build/libtvm.so(TVMArrayCopyFromBytes+0xe) [0x7f097dcf53ae]
     [bt] (5) /home/zyx/workspaces/python/tvm0.8_v2/build/libtvm.so(tvm::runtime::ArrayCopyFromBytes(DLTensor*, void const*, unsigned long)+0x2c9) [0x7f097dcf52e9]
     [bt] (4) /home/zyx/workspaces/python/tvm0.8_v2/build/libtvm.so(tvm::runtime::RPCDeviceAPI::CopyDataFromTo(void const*, unsigned long, void*, unsigned long, unsigned long, DLContext, DLContext, DLDataType, void*)+0x346) [0x7f097dd265b6]
     [bt] (3) /home/zyx/workspaces/python/tvm0.8_v2/build/libtvm.so(tvm::runtime::RPCEndpoint::CopyToRemote(void*, unsigned long, void*, unsigned long, unsigned long, DLContext, DLDataType)+0x75d) [0x7f097dd2a4cd]
     [bt] (2) /home/zyx/workspaces/python/tvm0.8_v2/build/libtvm.so(tvm::runtime::RPCEndpoint::HandleUntilReturnEvent(bool, std::function<void (tvm::runtime::TVMArgs)>)+0x1a5) [0x7f097dd28955]
     [bt] (1) /home/zyx/workspaces/python/tvm0.8_v2/build/libtvm.so(tvm::runtime::SockChannel::Send(void const*, unsigned long)+0xb8) [0x7f097dd490b8]
     [bt] (0) /home/zyx/workspaces/python/tvm0.8_v2/build/libtvm.so(+0x1bc2838) [0x7f097dd44838]
     File "/home/zyx/workspaces/python/tvm0.8_v2/src/runtime/rpc/../../support/socket.h", line 360
   TVMError: Socket SockChannel::Send Error: Connection reset by peer
   ```
   
   The BERT model was imported from PyTorch:
   ```
   model_class = transformers.BertModel
   tokenizer_class = transformers.BertTokenizer

   # Better to download these manually:
   #   https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin
   #   https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt
   #   https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json
   # Then rename them to pytorch_model.bin, vocab.txt & config.json.
   # weight = 'path to downloaded model dir'
   weight = '/home/zyx/.torch/hub/bert-base-uncased'
   model = model_class.from_pretrained(weight)
   model = ModelWrapper(model)
   model.eval()

   # tokenizer = tokenizer_class.from_pretrained(weight)
   # A = torch.tensor([tokenizer.encode("Here is some text to encode", add_special_tokens=True)])
   # There are 30522 words in bert-base-uncased's vocabulary.
   input_shape = [batch_size, 128]
   input_name = 'input_ids'
   input_dtype = 'int64'
   A = torch.randint(30000, input_shape)
   scripted_model = torch.jit.trace(model, [A])
   shape_list = [('input_ids', input_shape)]
   mod, params = relay.frontend.from_pytorch(scripted_model, shape_list)
   mod = optimize_bert(mod, params)
   ```
   
   The ```optimize_bert``` function has the following passes:
   ```
   new_mod = FastSoftmax(mod)
   new_mod = ShapeConstDedup(new_mod)
   new_mod = tvm.relay.transform.EliminateCommonSubexpr()(new_mod)
   BindPass = tvm.relay.transform.function_pass(
       lambda fn, new_mod, ctx: tvm.relay.build_module.bind_params_by_name(fn, params),
       opt_level=1)
   new_mod = BindPass(new_mod)
   new_mod = tvm.relay.transform.FoldConstant()(new_mod)
   new_mod = tvm.relay.transform.CombineParallelBatchMatmul()(new_mod)
   # new_mod = tvm.relay.transform._ffi_api.BatchMatmulWeightTranspose()(new_mod)
   new_mod = tvm.relay.transform.FoldConstant()(new_mod)
   ret_list.append(new_mod)
   ```
   
   I also tried the fix from https://github.com/apache/tvm/pull/5516 for ```ring_buffer.h```, but it didn't work.
   
   It seems the copy fails once the total allocated space exceeds roughly 400 MB.
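For scale, a back-of-the-envelope calculation (assuming float32 weights and the commonly cited ~110M parameter count for BERT-base, which is an assumption here, not measured from this model):

```python
# Rough sizes in MiB. 110M is the commonly cited BERT-base parameter
# count (assumed, not taken from this run).
embedding_mib = 30522 * 768 * 4 / 2**20   # the single tensor from the report
total_mib = 110_000_000 * 4 / 2**20       # all float32 parameters together
print(round(embedding_mib, 1), round(total_mib, 1))  # → 89.4 419.6
```

The single embedding tensor is only about 89 MB, but the full parameter set lands right around the observed ~400 MB threshold, so the cumulative copies over the session, not just this one tensor, may be what triggers the reset.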

