vvchernov opened a new pull request, #12231:
URL: https://github.com/apache/tvm/pull/12231

   The main idea is to build a handshake mechanism that helps check that a TVM 
executor obtained from pre-compiled libraries and other files corresponds to a 
given model.
   
   This is a partial implementation of the idea: only ONNX models and the 
Virtual Machine executor are supported.
   
   The feature is enabled by the `get_hash` argument of the `from_onnx` 
method, which is `False` by default. The pipeline is as follows: 1. A hash 
(SHA-256) is computed from the ONNX model. 2. The hash is saved inside the VM 
lib. 3. The hash can be extracted from the VM lib, or written to a file 
alongside the other exported files (see below).
   
   One important point: nothing is known about the ONNX model when the VM is 
compiled, so the hash cannot be generated directly inside the compilation 
method. For this reason, in the current implementation the hash is generated 
on line №1 (see below) and saved in the IRModule. On line №2 it is then stored 
on the VM side. 
   There is, of course, another way to work around this problem: make the 
client responsible for generating the hash and setting it on the VM. That 
looks less friendly and less convenient, though.
   The pipeline for VM compilation and export is:
   ```
   import tvm
   import tvm.contrib.utils
   from tvm import relay
   mod, params = relay.frontend.from_onnx(
       onnx_model, freeze_params=True, get_hash=True
   )  # №1
   vm_exec = relay.vm.compile(mod, target)  # №2
   code, lib = vm_exec.save()
   # Save shared lib file.
   tmp = tvm.contrib.utils.tempdir()
   path_lib = tmp.relpath("lib.so")
   lib.export_library(path_lib)
   # Save ro-file
   with open(tmp.relpath("code.ro"), "wb") as fo:
       fo.write(code)
   # Save weights
   vm_exec.move_late_bound_consts(consts_path, byte_limit=256)
   # SAVE MODEL HASH in hash.txt file
   vm_exec.save_hash(tmp)
   ```
   There are two ways to obtain the hash:
   ```
   # GET MODEL HASH FROM TXT FILE
   # tmp is the TempDirectory from the export step, so build the path with relpath()
   hash_path = tmp.relpath("hash.txt")
   with open(hash_path, "r") as f:
       model_hash = f.read()
   # Load VM exec.
   loaded_lib = tvm.runtime.load_module(path_lib)
   with open(tmp.relpath("code.ro"), "rb") as f:
       loaded_code = bytearray(f.read())
   des_exec = tvm.runtime.vm.Executable.load_exec(loaded_code, loaded_lib)
   # GET MODEL HASH FROM VM LIB
   model_hash = des_exec.get_hash()
   
   des_vm = tvm.runtime.vm.VirtualMachine(des_exec, dev)
   ...
   ```
   The hash is calculated as follows:
   ```
   import hashlib
   onnx_model_hash = hashlib.sha256(onnx_model.SerializeToString()).hexdigest()
   ```
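
The check on the client side could then look roughly like the sketch below. It is not part of this PR: the helper names `compute_model_hash` and `hash_matches` are hypothetical, and the sketch only assumes that the stored hash is the SHA-256 hex digest of the serialized ONNX model, as described above.

```python
import hashlib


def compute_model_hash(model_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of the serialized model bytes."""
    return hashlib.sha256(model_bytes).hexdigest()


def hash_matches(model_bytes: bytes, stored_hash: str) -> bool:
    """Compare a stored hash with the hash of the given model bytes.

    Hypothetical helper: stored_hash may come from hash.txt or from the
    executable, so surrounding whitespace and letter case are normalized.
    """
    return compute_model_hash(model_bytes) == stored_hash.strip().lower()
```

With an ONNX model, `model_bytes` would be `onnx_model.SerializeToString()`, and `stored_hash` would be either the content of `hash.txt` or the value returned by `des_exec.get_hash()`.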
   
   Important points that could be developed in the future:
   1. It is implemented for VirtualMachine only. It can easily be extended to 
GraphExecutor.
   2. It is assumed that an ONNX model was used for the pre-compiled library. 
Extracting model-format-independent information (instead of, or together with, 
the hash) for further checks requires a more careful and considered approach.
   3. It looks like the TVM design did not initially anticipate such a feature. 
Because of this, the hash is saved in the IRModule, which is assumed to be 
immutable. I think that if this is developed further, updates to the TVM design 
should also be considered in detail.
   

