vvchernov opened a new pull request, #12231:
URL: https://github.com/apache/tvm/pull/12231
The main idea is to provide a handshake mechanism that helps verify that a TVM executor obtained from pre-compiled libraries and other exported files corresponds to a given model.
This is a partial implementation of the idea: only ONNX models and the Virtual Machine are supported.
The feature is enabled by the `get_hash` argument of the `from_onnx` method (False by default). The pipeline is the following:
1. Calculate a hash (sha256) of the ONNX model file.
2. Save it inside the VM lib.
3. Extract the hash from the VM lib, or write it to a file alongside the other exported files (see below).
One important thing should be highlighted: nothing is known about the ONNX model at the time the VM is compiled, so the hash cannot be generated directly inside the compilation method. Because of this, in the current implementation the hash is generated on line №1 (see below) and saved in the IRModule; afterwards, on line №2, it is transferred to the VM side.
Of course, there is another way to work around this problem: transfer the responsibility for generating the hash and setting it on the VM to the client side. However, that looks neither friendly nor convenient.
The pipeline of VM compilation and exporting:
```
import tvm
import tvm.contrib.utils
from tvm import relay

mod, params = relay.frontend.from_onnx(onnx_model, freeze_params=True, get_hash=True)  # №1
target = "llvm"  # example target; not defined in the original snippet
vm_exec = relay.vm.compile(mod, target)
# №2
code, lib = vm_exec.save()
# Save shared lib file.
tmp = tvm.contrib.utils.tempdir()
path_lib = tmp.relpath("lib.so")
lib.export_library(path_lib)
# Save ro-file
with open(tmp.relpath("code.ro"), "wb") as fo:
    fo.write(code)
# Save weights (consts_path was not defined in the original snippet; an example path is used)
consts_path = tmp.relpath("consts")
vm_exec.move_late_bound_consts(consts_path, byte_limit=256)
# SAVE MODEL HASH in hash.txt file
vm_exec.save_hash(tmp)
```
There are two scenarios for retrieving the hash:
```
import tvm

# GET MODEL HASH FROM TXT FILE
hash_path = tmp.relpath("hash.txt")
with open(hash_path, "r") as f:
    model_hash = f.read()
# Load VM exec.
loaded_lib = tvm.runtime.load_module(path_lib)
loaded_code = bytearray(open(tmp.relpath("code.ro"), "rb").read())
des_exec = tvm.runtime.vm.Executable.load_exec(loaded_code, loaded_lib)
# GET MODEL HASH FROM VM LIB
model_hash = des_exec.get_hash()
dev = tvm.cpu()  # example device; not defined in the original snippet
des_vm = tvm.runtime.vm.VirtualMachine(des_exec, dev)
...
```
The hash is calculated as follows:
```
import hashlib
onnx_model_hash = hashlib.sha256(onnx_model.SerializeToString()).hexdigest()
```
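To close the loop, the stored hash can be compared with a hash recomputed from the serialized model at load time. Below is a minimal sketch of that check using only `hashlib`; the `verify_model_hash` helper is hypothetical and not part of this PR, and the byte string stands in for `onnx_model.SerializeToString()`:

```python
import hashlib

def verify_model_hash(model_bytes: bytes, stored_hash: str) -> bool:
    """Recompute sha256 over the serialized model and compare it with
    the hash stored alongside (or inside) the VM artifacts."""
    return hashlib.sha256(model_bytes).hexdigest() == stored_hash

# Stand-in for onnx_model.SerializeToString()
model_bytes = b"serialized-onnx-model"
stored_hash = hashlib.sha256(model_bytes).hexdigest()

matches = verify_model_hash(model_bytes, stored_hash)       # True: same model
mismatch = verify_model_hash(b"another-model", stored_hash)  # False: wrong model
```

In practice `stored_hash` would come from either `hash.txt` or `des_exec.get_hash()`, and a mismatch would indicate that the pre-compiled library does not correspond to the given model.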
Important points which could be developed in the future:
1. It is done for VirtualMachine only. It can easily be extended to GraphExecutor.
2. Only ONNX models are assumed for the pre-compiled library. A more careful, considered approach is needed to extract model-format-independent information (instead of, or together with, the hash) for further checks.
3. It looks like the TVM design did not initially anticipate such a feature. Because of this, the hash is stored in the IRModule, which is assumed to be immutable. If this is developed further, corresponding updates to the TVM design should also be considered in detail.