shingjan commented on PR #12141:
URL: https://github.com/apache/tvm/pull/12141#issuecomment-1198304639

   mobilenetv2 on cuda
   ```
    ID |                                   Name |     FLOP | Weight | Speed (GFLOPS) | Latency (us) | Weighted Latency (us) | Trials | Terminated
   -----------------------------------------------------------------------------------------------------------------------------------------------
     0 |                 fused_layout_transform |        1 |      1 |         0.0004 |       2.2798 |                2.2798 |      6 |
     1 |               fused_nn_conv2d_add_clip | 22880256 |      1 |      3151.3187 |       7.2605 |                7.2605 |     32 |          Y
     2 |             fused_nn_conv2d_add_clip_1 |  8429568 |      1 |      1285.2570 |       6.5587 |                6.5587 |     32 |          Y
     3 |                    fused_nn_conv2d_add | 13045760 |      1 |      2104.8376 |       6.1980 |                6.1980 |     32 |          Y
     4 |             fused_nn_conv2d_add_clip_2 | 42147840 |      1 |      2994.8494 |      14.0734 |               14.0734 |     32 |          Y
     5 |             fused_nn_conv2d_add_clip_3 |  6322176 |      1 |       682.4610 |       9.2638 |                9.2638 |     32 |          Y
     6 |                  fused_nn_conv2d_add_1 | 14525952 |      1 |      1936.7547 |       7.5002 |                7.5002 |     32 |          Y
     7 |             fused_nn_conv2d_add_clip_4 |  9483264 |      1 |      1537.1009 |       6.1696 |                6.1696 |     32 |          Y
     8 |                fused_nn_conv2d_add_add | 21826560 |      1 |      2005.0549 |      10.8858 |               10.8858 |     32 |          Y
     9 |             fused_nn_conv2d_add_clip_5 | 23030784 |      2 |      1914.2627 |      12.0312 |               24.0623 |     32 |          Y
    10 |             fused_nn_conv2d_add_clip_6 |  2370816 |      1 |       393.0634 |       6.0316 |                6.0316 |     32 |          Y
    11 |                  fused_nn_conv2d_add_2 |  7250432 |      1 |       917.3106 |       7.9040 |                7.9040 |     32 |          Y
    12 |             fused_nn_conv2d_add_clip_7 |  3161088 |      2 |       262.2023 |      12.0559 |               24.1118 |     32 |          Y
    13 |              fused_nn_conv2d_add_add_1 |  9683968 |      2 |      1061.2357 |       9.1252 |               18.2504 |     32 |          Y
    14 |             fused_nn_conv2d_add_clip_8 | 10085376 |      3 |       737.1134 |      13.6823 |               41.0468 |     32 |          Y
    15 |             fused_nn_conv2d_add_clip_9 |   790272 |      1 |       170.2160 |       4.6428 |                4.6428 |     32 |          Y
    16 |                  fused_nn_conv2d_add_3 |  4829440 |      1 |       957.4766 |       5.0439 |                5.0439 |     32 |          Y
    17 |              fused_nn_conv2d_add_add_2 |  9658880 |      3 |       919.5057 |      10.5044 |               31.5133 |     32 |          Y
    18 |            fused_nn_conv2d_add_clip_10 |  9859584 |      4 |      1410.7424 |       6.9889 |               27.9557 |     32 |          Y
    19 |            fused_nn_conv2d_add_clip_11 |  1580544 |      4 |       361.8447 |       4.3680 |               17.4721 |     32 |          Y
    20 |                  fused_nn_conv2d_add_4 | 14469504 |      1 |       739.5858 |      19.5643 |               19.5643 |     32 |          Y
    21 |            fused_nn_conv2d_add_clip_12 |  2370816 |      2 |       503.2051 |       4.7114 |                9.4229 |     32 |          Y
    22 |              fused_nn_conv2d_add_add_3 | 21713664 |      2 |      1405.4021 |      15.4501 |               30.9003 |     32 |          Y
    23 |            fused_nn_conv2d_add_clip_13 | 22014720 |      3 |      2486.7910 |       8.8527 |               26.5580 |     32 |          Y
    24 |            fused_nn_conv2d_add_clip_14 |   592704 |      1 |       125.2444 |       4.7324 |                4.7324 |     32 |          Y
    25 |                  fused_nn_conv2d_add_5 |  9039520 |      1 |       410.3605 |      22.0282 |               22.0282 |     32 |          Y
    26 |              fused_nn_conv2d_add_add_4 | 15068480 |      2 |       411.3220 |      36.6343 |               73.2685 |     32 |          Y
    27 |            fused_nn_conv2d_add_clip_15 | 15193920 |      3 |      1503.2292 |      10.1075 |               30.3226 |     32 |          Y
    28 |            fused_nn_conv2d_add_clip_16 |   987840 |      3 |       224.3443 |       4.4032 |               13.2097 |     32 |          Y
    29 |                  fused_nn_conv2d_add_6 | 30121280 |      1 |      1749.6604 |      17.2155 |               17.2155 |     32 |          Y
    30 |            fused_nn_conv2d_add_clip_17 | 40328960 |      1 |      2609.1046 |      15.4570 |               15.4570 |     32 |          Y
    31 |           fused_nn_adaptive_avg_pool2d |    64000 |      1 |        16.9965 |       3.7655 |                3.7655 |     32 |          Y
    32 | fused_layout_transform_reshape_squeeze |        1 |      1 |         0.0002 |       4.3296 |                4.3296 |      6 |          Y
    33 |                     fused_nn_dense_add |  2561000 |      1 |        66.7119 |      38.3890 |               38.3890 |     32 |          Y
   -----------------------------------------------------------------------------------------------------------------------------------------------
   ```
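
For anyone cross-checking the numbers: the derived columns in the tuning table look consistent with `Speed = FLOP / Latency` and `Weighted Latency = Weight * Latency`. That relationship is inferred from the reported values, not taken from the TVM source, so treat the snippet below as a rough sanity check. The helper names `gflops` and `weighted_latency_us` are made up for illustration, and the three rows are copied from the table.

```python
# Sanity-check sketch for the derived columns of the tuning table.
# Assumption (inferred from the data, not from TVM code):
#   Speed (GFLOPS)        = FLOP / latency
#   Weighted Latency (us) = Weight * Latency (us)

def gflops(flop: float, latency_us: float) -> float:
    """Throughput in GFLOPS for a task that takes latency_us microseconds."""
    seconds = latency_us * 1e-6
    return flop / seconds / 1e9

def weighted_latency_us(weight: int, latency_us: float) -> float:
    """Latency scaled by how many times the task occurs in the model."""
    return weight * latency_us

# (name, FLOP, weight, latency in us) copied from the table
rows = [
    ("fused_nn_conv2d_add_clip", 22880256, 1, 7.2605),
    ("fused_nn_conv2d_add_clip_5", 23030784, 2, 12.0312),
    ("fused_nn_dense_add", 2561000, 1, 38.3890),
]

for name, flop, weight, lat in rows:
    print(f"{name}: {gflops(flop, lat):.2f} GFLOPS, "
          f"{weighted_latency_us(weight, lat):.4f} us weighted")
```

Small differences in the last digit against the table are expected, since the table's latencies are already rounded to four decimals.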
   profiler table
   ```
    ID |                                        Name | Time (min) | Percentage 
   ----------------------------------------------------------------------------
       |                                       Total |    82.0160 |   100.0000 
     1 |                   EvoSearch/Evolve/Mutation |    42.1468 |    51.3885 
     2 |                               SendToBuilder |    15.1365 |    18.4556 
     3 |              EvoSearch/SampleInitPopulation |     7.9139 |     9.6492 
     4 |                                SendToRunner |     6.4504 |     7.8648 
     5 |     EvoSearch/Evolve/PredictNormalizedScore |     2.7957 |     3.4087 
     6 |             MeasureCallback/UpdateCostModel |     2.7350 |     3.3348 
     7 |                       EvoSearch/Evolve/Misc |     2.6240 |     3.1994 
     8 |                            ApplyHistoryBest |     1.1672 |     1.4232 
     9 |                              TaskExtraction |     0.4503 |     0.5490 
    10 |                              InitializeTask |     0.0250 |     0.0304 
    11 |               MeasureCallback/AddToDatabase |     0.0198 |     0.0241 
    12 |                 EvoSearch/PickWithEpsGreedy |     0.0100 |     0.0122 
    13 |         MeasureCallback/RemoveBuildArtifact |     0.0034 |     0.0041 
    14 |              EvoSearch/PickBestFromDatabase |     0.0032 |     0.0039 
    15 |              MeasureCallback/EchoStatistics |     0.0027 |     0.0033 
    16 |                           JoinRunnerFutures |     0.0013 |     0.0016 
    17 | EvoSearch/Evolve/Misc/CopyMeasuredWorkloads |     0.0000 |     0.0000 
   ----------------------------------------------------------------------------
   ```
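
The `Percentage` column in the profiler table appears to be each scope's time divided by the `Total` row. A minimal sketch to reproduce it, assuming that definition; the helper `percentage` is hypothetical and the timings are copied from the top rows of the table:

```python
# Sketch: reproduce the Percentage column of the profiler table.
# Assumption: Percentage = scope time / Total time * 100.

TOTAL_MIN = 82.0160  # "Total" row, in minutes

def percentage(time_min: float, total_min: float = TOTAL_MIN) -> float:
    """Share of the total tuning time spent in one profiler scope."""
    return time_min / total_min * 100.0

# (scope, time in minutes) copied from the table
scopes = [
    ("EvoSearch/Evolve/Mutation", 42.1468),
    ("SendToBuilder", 15.1365),
    ("EvoSearch/SampleInitPopulation", 7.9139),
    ("SendToRunner", 6.4504),
]

for name, minutes in scopes:
    print(f"{name}: {percentage(minutes):.4f}%")

# Combined share of the four largest scopes
top_share = sum(percentage(m) for _, m in scopes)
print(f"top 4 scopes: {top_share:.2f}% of total")
```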

