Hi,

The current TensorRT implementation tweaks simple_bind: you pass the 
parameters through the function's shared_params argument.

The problems with the current implementation:
 - You have to use the symbol API (there is currently no solution for the 
module API).
 - You have to set an environment variable and use a specific binding 
function, which is not very user friendly.

Some TensorRT constraints:
 - We have to go through an ONNX representation to use onnx-tensorrt (there 
is currently no NNVM-to-TensorRT implementation).
 - The ONNX model must contain some attributes and information such as 
shapes, dtypes, context, and weight values in order to instantiate the 
TensorRT engine properly (this is a TensorRT engine requirement, not an ONNX 
requirement).
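To make the second constraint concrete, here is a rough sketch of the 
per-parameter information that has to be available before the engine can be 
instantiated. All names here are illustrative only, not actual MXNet or ONNX 
data structures:

```python
# Hypothetical sketch: the pieces of information the TensorRT engine
# needs at instantiation time, per weight. Names are made up for this
# illustration; they are not real MXNet/ONNX structures.

def collect_engine_inputs(params):
    """Gather shape, dtype, context and raw values for each weight."""
    records = {}
    for name, (values, dtype, ctx) in params.items():
        records[name] = {
            "shape": (len(values),),   # known only after shape inference at bind time
            "dtype": dtype,            # e.g. "float32"
            "context": ctx,            # e.g. "gpu(0)"
            "values": list(values),    # weight values embedded in the model
        }
    return records

# Example: a single 1-D weight tensor
info = collect_engine_inputs({"fc_weight": ([0.1, 0.2, 0.3], "float32", "gpu(0)")})
```

The point is that shapes and dtypes only become known at bind time, which is 
what drives the proposal below.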

Here is a proposal using the Subgraph API:
Since most attribute inference is done at binding time, we cannot create the 
NNVM graph and instantiate the TensorRT engine at the Subgraph API level.

What we can do is the same kind of solution we have for CuDNNConv, which 
calls CuDNN find only once (see: 
https://github.com/apache/incubator-mxnet/blob/d22b323df5cfd2d330a321a3daf6880e108eb90c/src/operator/nn/convolution.cu#L39): 
create the NNVM graph and instantiate the TensorRT engine during the first 
forward pass, then simply reuse the existing TensorRT engine for all 
subsequent forward passes.
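The build-once-on-first-forward pattern can be sketched in a few lines. This 
is a pure-Python illustration of the caching behavior only; `build_fn` is a 
stand-in for the real NNVM -> ONNX -> TensorRT engine construction:

```python
class TRTEngineCache:
    """Sketch of the CuDNNConv find-once pattern: build the engine on
    the first forward pass, keyed by the input signature, and reuse it
    on every subsequent call. build_fn stands in for the real engine
    construction (names here are illustrative, not MXNet internals)."""

    def __init__(self, build_fn):
        self._build_fn = build_fn
        self._engines = {}
        self.build_count = 0

    def forward(self, shape, dtype, data):
        key = (shape, dtype)
        if key not in self._engines:           # first forward pass only
            self._engines[key] = self._build_fn(shape, dtype)
            self.build_count += 1
        return self._engines[key](data)        # reuse the cached engine

# Toy build function: the "engine" just doubles its input.
cache = TRTEngineCache(lambda shape, dtype: (lambda x: [2 * v for v in x]))
out1 = cache.forward((3,), "float32", [1, 2, 3])
out2 = cache.forward((3,), "float32", [4, 5, 6])  # no rebuild here
```

As with the CuDNN case, the cost of construction is paid exactly once per 
input signature.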

This doesn't require any change to the Subgraph API, and I believe that if 
it follows the same behavior as CuDNNConv it should be a valid approach, but 
I'm looking for your approval.

One problem arises, on which I hope to start a discussion here:
Some variable nodes will be partitioned away from the main graph and will be 
contained in the subgraphs (since the weights have to be contained inside 
the ONNX model / TensorRT engine).
So we need to find a way to load the weights inside the TensorRT node, if 
possible without adding any new function, while still relying on whatever 
functions users currently call to load weights.
I'm thinking about a way to interact directly with nodes inside a subgraph 
(so we would have to modify the getters) and embed the weight values 
directly inside a node attribute (the same way we embed the subgraph), which 
may or may not be used by the node.
Ideally the solution could be used by other future users of the Subgraph API 
if they need the weight values inside a node for whatever reason.
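To make the attribute-embedding idea concrete, here is one possible shape 
for it: serialize the weight values into a string attribute on the node, 
mirroring how the subgraph itself is stored as a JSON attribute. This is a 
pure-Python sketch with made-up names ('__weights__' is invented for this 
illustration), not the actual NNVM node structure:

```python
import base64
import struct

def embed_weights(node_attrs, weights):
    """Serialize float32 weight values into string node attributes,
    the same way the subgraph is embedded as a string. The attribute
    name prefix '__weights__' is hypothetical."""
    for name, values in weights.items():
        raw = struct.pack("%df" % len(values), *values)
        node_attrs["__weights__" + name] = base64.b64encode(raw).decode("ascii")
    return node_attrs

def extract_weights(node_attrs, name, count):
    """Getter-side counterpart: decode the values back out of the node,
    e.g. when instantiating the TensorRT engine."""
    raw = base64.b64decode(node_attrs["__weights__" + name])
    return list(struct.unpack("%df" % count, raw))

attrs = embed_weights({"subgraph": "<json>"}, {"fc_weight": [0.5, 1.5]})
restored = extract_weights(attrs, "fc_weight", 2)
```

The modified getters would then let users keep calling their usual 
weight-loading functions, with the values routed into the subgraph node's 
attributes transparently.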


Let me know your thoughts on it,

Clement Fuji Tsang


