Zha0q1 opened a new issue #19747:
URL: https://github.com/apache/incubator-mxnet/issues/19747
I am trying to enable the path MXNet/GluonNLP --> ONNX --> TensorRT.
There is a bug: if I use a pretrained BERT model, running inference
with TensorRT in fp16 mode produces `nan`s.
Using pretrained weights:
```
import gluonnlp as nlp

# model_name, ctx (a GPU context), and dataset are defined in the full
# repro script (see the PR linked below)
bert, _ = nlp.model.get_model(
    name=model_name,
    ctx=ctx,
    dataset_name=dataset,
    pretrained=True,
    use_pooler=True,
    use_decoder=False,
    num_layers=3,  # hardcode 3 layers since this is what the customer uses
    use_classifier=False,
    hparam_allow_override=True)
model = bert
```
Not using pretrained weights:
```
bert, _ = nlp.model.get_model(
    name=model_name,
    ctx=ctx,
    dataset_name=dataset,
    pretrained=False,
    use_pooler=True,
    use_decoder=False,
    num_layers=3,  # hardcode 3 layers since this is what the customer uses
    use_classifier=False,
    hparam_allow_override=True)
model = bert
model.initialize(ctx=ctx)
```
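For reference, the export step in the path above goes through a Gluon export plus the MXNet-to-ONNX converter. A minimal sketch, assuming `model` and `ctx` from the snippets above; the dummy shapes and the `bert`/`bert.onnx` file names are placeholders, and BERT operator coverage depends on the ONNX export changes in the PR linked below:
```
import numpy as np
import mxnet as mx
from mxnet.contrib import onnx as onnx_mxnet

batch_size, seq_length = 1, 32
# Dummy inputs matching the BERTModel call signature:
# (words, token_types, valid_length)
inputs = mx.nd.ones((batch_size, seq_length), ctx=ctx)
token_types = mx.nd.zeros((batch_size, seq_length), ctx=ctx)
valid_length = mx.nd.array([seq_length], ctx=ctx)

model.hybridize(static_alloc=True)
model(inputs, token_types, valid_length)  # one forward pass to trace the graph
model.export('bert')                      # writes bert-symbol.json / bert-0000.params

# Convert the exported symbol + params to ONNX
onnx_file = onnx_mxnet.export_model(
    'bert-symbol.json', 'bert-0000.params',
    [(batch_size, seq_length), (batch_size, seq_length), (batch_size,)],
    np.float32, 'bert.onnx')
```
The resulting `bert.onnx` can then be fed to TensorRT, e.g. via `trtexec --onnx=bert.onnx --fp16`, which is the fp16 mode where the `nan`s show up.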
More specifically, WITHOUT pretrained weights, TensorRT produces
reasonable outputs in both fp16 mode and regular fp32 mode. However, WITH
pretrained weights, TensorRT produces `nan` outputs in fp16 mode, while fp32
mode seems to work fine. Furthermore, the `nan` issue appears to be triggered
by the size of `seq_length`: when `seq_length <= 16`, even fp16 mode produces
reasonable outputs; when `seq_length >= 17`, fp16 mode starts to produce
`nan`s. Batch size does not seem to affect the `nan` behavior.
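One way to narrow down whether the pretrained weights themselves overflow in fp16, independent of TensorRT, would be a quick sweep over `seq_length` in plain MXNet fp16. A minimal sketch, assuming `model` and `ctx` from the snippets above and an arbitrary vocab size of 30522:
```
import numpy as np
import mxnet as mx

model.cast('float16')  # run the whole network in fp16
batch_size = 1
for seq_length in (8, 16, 17, 32, 64):
    inputs = mx.nd.random.randint(
        0, 30522, shape=(batch_size, seq_length), ctx=ctx).astype('float16')
    token_types = mx.nd.zeros((batch_size, seq_length), dtype='float16', ctx=ctx)
    valid_length = mx.nd.array([seq_length] * batch_size, dtype='float16', ctx=ctx)
    seq_encoding, pooled = model(inputs, token_types, valid_length)
    # Flag the first seq_length at which nans show up
    print(seq_length, 'nan' if np.isnan(seq_encoding.asnumpy()).any() else 'ok')
```
If this sweep stays clean while TensorRT still produces `nan`s, that would point at the TensorRT fp16 kernels rather than the weights themselves.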
Reproducible code and steps can be found in
https://github.com/apache/incubator-mxnet/pull/19746. Because we have a
customer requesting this feature, it would be great if friends at Nvidia
could help look into this issue. Please let me know how I can provide further
info/help.
@sandeep-krishnamurthy @MoisesHer @Kh4L