[GitHub] [incubator-tvm] FrozenGene commented on a change in pull request #4564: [Doc] Introduction to module serialization

GitBox Wed, 15 Jan 2020 18:37:38 -0800

FrozenGene commented on a change in pull request #4564: [Doc] Introduction to module serialization
URL: https://github.com/apache/incubator-tvm/pull/4564#discussion_r367205537


 ##########
 File path: docs/dev/introduction_to_module_serialization.rst
 ##########
 @@ -0,0 +1,227 @@
+..  Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+..    http://www.apache.org/licenses/LICENSE-2.0
+
+..  Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Introduction to Module Serialization
+====================================
+
+When deploying a TVM runtime module, whether for CPU or GPU, TVM needs only one single dynamic shared library.
+The key is our unified module serialization mechanism. This document introduces the TVM module
+serialization format standard and its implementation details.
+
+*********************
+Module Export Example
+*********************
+
+Let us build one ResNet-18 workload for GPU as an example first.
+
+.. code:: python
+
+   from tvm import relay
+   from tvm.relay import testing
+   from tvm.contrib import util
+   import tvm
+
+   # Resnet18 workload
+   resnet18_mod, resnet18_params = relay.testing.resnet.get_workload(num_layers=18)
+
+   # build
+   with relay.build_config(opt_level=3):
+       _, resnet18_lib, _ = relay.build_module.build(resnet18_mod, "cuda", params=resnet18_params)
+
+   # create one temporary directory
+   temp = util.tempdir()
+
+   # path lib
+   file_name = "deploy.so"
+   path_lib = temp.relpath(file_name)
+
+   # export library
+   resnet18_lib.export_library(path_lib)
+
+   # load it back
+   loaded_lib = tvm.module.load(path_lib)
+   assert loaded_lib.type_key == "library"
+   assert loaded_lib.imported_modules[0].type_key == "cuda"
+
+*************
+Serialization
+*************
+
+The entry API is ``export_library`` of ``tvm.module.Module``.
+Inside this function, we perform the following steps:
+
+1. Collect all DSO modules (LLVM modules and C modules)
+
+2. Once we have DSO modules, we call the ``save`` function to save them into files.
+
+3. Next, we check whether there are imported modules, such as CUDA,
+   OpenCL or anything else. We don't restrict the module type here.
+   If there are imported modules, we create one file named ``dev.cc``
+   (so that the binary blob data of the imported modules can be embedded into one dynamic shared library),
+   then call ``_PackImportsToLLVM`` or ``_PackImportsToC`` to do module serialization.
+
+4. Finally, we call ``fcompile``, which invokes ``_cc.create_shared`` to produce
+   the dynamic shared library.
+
+.. note::
+    1. C source modules are compiled and linked together with the DSO module.
+
+    2. Whether ``_PackImportsToLLVM`` or ``_PackImportsToC`` is used depends on whether LLVM is enabled in TVM.
+       Both achieve the same goal.
+
+***************************************************
+Under the Hood of Serialization and Format Standard
+***************************************************
+
+As mentioned above, the serialization work is done in ``_PackImportsToLLVM`` or ``_PackImportsToC``.
+Both call ``SerializeModule`` to serialize the runtime module. In the ``SerializeModule``
+function, we first construct a helper class, ``ModuleSerializer``. It takes ``module`` and does some
+initialization work, such as marking the module index. Then we can use its ``SerializeModule`` to serialize the module.
+
+For better understanding, let us dig a little deeper into the implementation of this class.
+
+The following code is used to construct ``ModuleSerializer``:
+
+.. code:: c++
+
+   explicit ModuleSerializer(runtime::Module mod) : mod_(mod) {
+     Init();
+   }
+   private:
+   void Init() {
+     CreateModuleIndex();
+     CreateImportTree();
+   }
+
+In ``CreateModuleIndex()``, we inspect the module import relationship
+using DFS and assign an index to each module. Note that the root module is fixed at
+location 0. In our example, the module relationship looks like this:
+
+.. code:: c++
+
+  llvm_mod:imported_modules
+    - cuda_mod
+
+So the LLVM module has index 0 and the CUDA module has index 1.
+
+After constructing the module index, we construct the import tree (``CreateImportTree()``),
+which is used to restore the module import relationship when we load
+the exported library back. In our design, we use the CSR format to store the
+import tree: each row corresponds to a parent index, and the child indices are the indices of its children.
+In code, we use ``import_tree_row_ptr_`` and
+``import_tree_child_indices_`` to represent them.
+
+After initialization, we can serialize the module using the ``SerializeModule`` function.
+Its logic assumes a serialization format like this:
+
+.. code:: c++
+
+   binary_blob_size
+   binary_blob_type_key
+   binary_blob_logic
+   binary_blob_type_key
+   binary_blob_logic
+   ...
+   _import_tree
+   _import_tree_logic
+
+``binary_blob_size`` is the number of blobs in this
+serialization step. In our example, the number is 3: one for the
+LLVM module, one for the CUDA module, and one for ``_import_tree``.
+
+Then we write the ``binary_blob_type_key``. For an LLVM module or a C
+module, the blob type key is ``_lib``. For a CUDA module, it is
+``cuda``, which can be obtained via ``module->type_key()``.
+
+Next we will do the ``binary_blob_logic``. Normally, we will call
 
 Review comment:
   Not completely correct. What I mean is that each binary blob type (CUDA,
LLVM, import tree, ...) gets its own handling logic. For CUDA, the
``binary_blob_logic`` serializes the binary blob using ``SaveToBinary``. For
LLVM, however, we just write `_lib` to indicate that this is a DSO module. For
`_import_tree`, we write `_import_tree` followed by one CSR data structure
recording the module import relationship. That is why I wrote "we will do the
``binary_blob_logic``" and the sentences after it: most of the time we call
``SaveToBinary`` to serialize the binary blob, but not in every case.
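
As a rough illustration of the per-blob handling described in this comment, here is a minimal Python sketch. It is not TVM's implementation: the helper names (`build_import_tree`, `write_str`, `write_u64_list`, `serialize`), the little-endian length-prefixed byte layout, and the placeholder CUDA payload are all assumptions made for illustration. Only the `_lib` and `_import_tree` keys, the blob count, and the CSR idea come from the document under review.

```python
import struct


def build_import_tree(imports):
    """Build a CSR import tree; imports[i] lists the child indices of module i."""
    row_ptr, child_indices = [0], []
    for children in imports:
        child_indices.extend(children)
        row_ptr.append(len(child_indices))
    return row_ptr, child_indices


def write_str(buf, data):
    """Append a length-prefixed byte string (hypothetical layout)."""
    if isinstance(data, str):
        data = data.encode("utf-8")
    buf.extend(struct.pack("<Q", len(data)))
    buf.extend(data)


def write_u64_list(buf, values):
    """Append a length-prefixed list of unsigned 64-bit integers."""
    buf.extend(struct.pack("<Q", len(values)))
    buf.extend(struct.pack("<%dQ" % len(values), *values))


def serialize(modules, imports):
    """modules: (type_key, payload) pairs in index order; payload is None for DSO modules."""
    buf = bytearray()
    # binary_blob_size: one blob per module plus one for the import tree
    buf.extend(struct.pack("<Q", len(modules) + 1))
    for type_key, payload in modules:
        if payload is None:
            # DSO module (LLVM / C): only the "_lib" marker is written
            write_str(buf, "_lib")
        else:
            # Device module: type key, then the bytes a SaveToBinary call would emit
            write_str(buf, type_key)
            write_str(buf, payload)
    # Import tree blob: key followed by the two CSR arrays
    row_ptr, child_indices = build_import_tree(imports)
    write_str(buf, "_import_tree")
    write_u64_list(buf, row_ptr)
    write_u64_list(buf, child_indices)
    return bytes(buf)


# llvm_mod (index 0) imports cuda_mod (index 1); the CUDA payload is a placeholder
blob = serialize([("llvm", None), ("cuda", b"fake-cuda-binary")], [[1], []])
```

For this two-module example the CSR arrays are `row_ptr = [0, 1, 1]` and `child_indices = [1]`: row 0 (the LLVM module) has one child, module 1, and row 1 (the CUDA module) has none.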
