cdljn2011 opened a new issue, #14526:
URL: https://github.com/apache/tvm/issues/14526
arm_cpu compilation problem!
When an AI model is compiled for the arm_cpu target, the compiled model's
operators are scattered across multiple subgraphs instead of being compiled into
one complete graph (each operator is compiled as its own subgraph). This
seriously affects inference speed.
Could you tell me how to solve this problem?
How can arm_cpu compile all operators into one graph instead of multiple
subgraphs?
For example, when the mobilenetv2.onnx model is compiled for arm_cpu, the
resulting graph JSON is as follows. The model's operators are divided into many
subgraphs, resulting in slow running speed:
{
"nodes": [
{
"op": "null",
"name": "input",
"inputs": []
},
{
"op": "null",
"name": "p0",
"inputs": []
},
{
"op": "null",
"name": "p1",
"inputs": []
},
{
"op": "tvm_op",
"name": "tvmgen_default_fused_nn_conv2d_add_clip",
"attrs": {
"num_outputs": "1",
"num_inputs": "3",
"flatten_data": "0",
"func_name": "tvmgen_default_fused_nn_conv2d_add_clip",
"out_layout": "",
"kernel_layout": "OIHW",
"data_layout": "NCHW",
"hash": "2714b8eaaab95a1c"
},
"inputs": [
[
0,
0,
0
],
[
1,
0,
0
],
[
2,
0,
0
]
]
},
{
"op": "null",
"name": "p2",
"inputs": []
},
{
"op": "null",
"name": "p3",
"inputs": []
},
{
"op": "tvm_op",
"name": "tvmgen_default_fused_nn_conv2d_add_clip_1",
"attrs": {
"num_outputs": "1",
"num_inputs": "3",
"flatten_data": "0",
"func_name": "tvmgen_default_fused_nn_conv2d_add_clip_1",
"out_layout": "",
"kernel_layout": "OIHW",
"data_layout": "NCHW",
"hash": "1c02e3eebef4748c"
},
"inputs": [
[
3,
0,
0
],
[
4,
0,
0
],
[
5,
0,
0
]
]
},
{
"op": "null",
"name": "p4",
"inputs": []
},
{
"op": "null",
"name": "p5",
"inputs": []
},
{
"op": "tvm_op",
"name": "tvmgen_default_fused_nn_conv2d_add",
"attrs": {
"num_outputs": "1",
"num_inputs": "3",
"flatten_data": "0",
"func_name": "tvmgen_default_fused_nn_conv2d_add",
"out_layout": "",
"kernel_layout": "OIHW",
"data_layout": "NCHW",
"hash": "0a3ab8dc5430c9ad"
},
"inputs": [
[
6,
0,
0
],
[
7,
0,
0
],
[
8,
0,
0
]
]
},
{
"op": "null",
"name": "p6",
"inputs": []
},
{
"op": "null",
"name": "p7",
"inputs": []
},
{
"op": "tvm_op",
"name": "tvmgen_default_fused_nn_conv2d_add_clip_2",
"attrs": {
"num_outputs": "1",
"num_inputs": "3",
"flatten_data": "0",
"func_name": "tvmgen_default_fused_nn_conv2d_add_clip_2",
"out_layout": "",
"kernel_layout": "OIHW",
"data_layout": "NCHW",
"hash": "738375e0ba07c89e"
},
........
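
To make the dump above easier to inspect, here is a minimal sketch that separates the "null" nodes (the input and the parameters p0, p1, ...) from the "tvm_op" nodes (the compiled functions named by "func_name"). It uses only the Python standard library. The helper name `summarize_graph` and the `SAMPLE` graph are hypothetical: `SAMPLE` is a shortened stand-in with the same shape as the first nodes of the dump, since the dump itself is truncated and cannot be parsed as-is.

```python
import json


def summarize_graph(graph_json):
    """Return (placeholder_names, kernel_names) for a graph JSON string.

    Assumes each node has "op" and "name" keys, and that "tvm_op" nodes
    carry an "attrs" dict with a "func_name" entry, as in the dump above.
    """
    graph = json.loads(graph_json)
    placeholders = [n["name"] for n in graph["nodes"] if n["op"] == "null"]
    kernels = [
        n["attrs"]["func_name"] for n in graph["nodes"] if n["op"] == "tvm_op"
    ]
    return placeholders, kernels


# Hypothetical, shortened sample mirroring the first four nodes of the dump.
SAMPLE = json.dumps({
    "nodes": [
        {"op": "null", "name": "input", "inputs": []},
        {"op": "null", "name": "p0", "inputs": []},
        {"op": "null", "name": "p1", "inputs": []},
        {
            "op": "tvm_op",
            "name": "tvmgen_default_fused_nn_conv2d_add_clip",
            "attrs": {
                "num_outputs": "1",
                "num_inputs": "3",
                "func_name": "tvmgen_default_fused_nn_conv2d_add_clip",
            },
            # Each entry is [node index, output index, version].
            "inputs": [[0, 0, 0], [1, 0, 0], [2, 0, 0]],
        },
    ]
})

if __name__ == "__main__":
    placeholders, kernels = summarize_graph(SAMPLE)
    print("placeholder nodes:", placeholders)
    print("compiled functions:", kernels)
```

Running the same helper on the full, untruncated graph JSON would list every compiled function in the model, which gives a quick count of how many separate pieces the model was split into.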