shingjan commented on PR #12141:
URL: https://github.com/apache/tvm/pull/12141#issuecomment-1197830152

   bert base llvm:
   
   ```
    ID |                                                              Name |      FLOP | Weight | Speed (GFLOPS) | Latency (us) | Weighted Latency (us) | Trials | Terminated
   ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     0 |                                                        fused_take |         1 |      1 |         0.0002 |       5.2482 |                5.2482 |      1 |
     1 |                                      fused_nn_dense_add_fast_tanh |   1204224 |      1 |       114.1531 |      10.5492 |               10.5492 |     32 |          Y
     2 |                       fused_reshape_add_reshape_transpose_reshape |     49152 |     12 |         1.7456 |      28.1570 |              337.8840 |      1 |          Y
     3 |                                                    fused_variance |    147520 |     25 |        17.0011 |       8.6771 |              216.9272 |     32 |          Y
     4 |                                                        fused_mean |     49216 |     25 |         6.1650 |       7.9831 |              199.5783 |     32 |          Y
     5 |                                               fused_cast_take_add |     49152 |      1 |         2.5341 |      19.3960 |               19.3960 |      2 |          Y
     6 |                     fused_reshape_add_reshape_transpose_reshape_1 |     49152 |     24 |         4.7436 |      10.3617 |              248.6808 |      1 |          Y
     7 |                                          fused_reshape_divide_add |     98304 |     12 |        12.9576 |       7.5866 |               91.0392 |      2 |          Y
     8 |                                             fused_nn_fast_softmax |   4374528 |     12 |        89.9165 |      48.6510 |              583.8123 |     32 |          Y
     9 |                                                     fused_reshape |         1 |     12 |         0.0000 |     117.9034 |             1414.8410 |      1 |          Y
    10 |                                             fused_nn_batch_matmul |   6291456 |     24 |       141.6181 |      44.4255 |             1066.2123 |     32 |          Y
    11 |                                   fused_reshape_transpose_reshape |         1 |     12 |         0.0000 |      29.6311 |              355.5735 |      1 |          Y
    12 |                                                    fused_nn_dense |  75497472 |     48 |       191.4222 |     394.4030 |            18931.3435 |     32 |          Y
    13 |                                                   fused_reshape_1 |         1 |     24 |         0.0001 |      10.0435 |              241.0438 |      1 |          Y
    14 |                                                  fused_nn_dense_1 | 301989888 |     12 |       178.7608 |    1689.3522 |            20272.2265 |     32 |          Y
    15 | fused_reshape_add_multiply_fast_erf_multiply_add_multiply_reshape |  15532032 |     12 |         4.0672 |    3818.8959 |            45826.7502 |      1 |          Y
    16 |                                                  fused_nn_dense_2 | 301989888 |     12 |       240.5859 |    1255.2267 |            15062.7200 |     32 |          Y
    17 |                                             fused_reshape_add_add |     98304 |     24 |        12.5405 |       7.8389 |              188.1338 |      2 |          Y
    18 |                       fused_subtract_add_sqrt_divide_multiply_add |    196672 |     25 |        12.7443 |      15.4322 |              385.8043 |      2 |          Y
   ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   
   ```
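For readers checking the numbers, here is a minimal sketch of how the derived columns appear to relate to the raw ones. It assumes that Weighted Latency is Weight times Latency and that Speed is FLOP divided by latency; the rows in the task table match both relations up to rounding, but this is an inference from the data, not a statement of how the tuner computes them. The helper names `weighted_latency_us` and `speed_gflops` are hypothetical.

```python
def weighted_latency_us(weight: int, latency_us: float) -> float:
    """Assumed relation: one task's latency scaled by its weight."""
    return weight * latency_us


def speed_gflops(flop: int, latency_us: float) -> float:
    """Assumed relation: FLOP per microsecond is MFLOP/s; divide by 1e3 for GFLOP/s."""
    return flop / latency_us / 1e3


# (name, FLOP, weight, latency in us) copied from three rows of the task table
rows = [
    ("fused_nn_dense", 75497472, 48, 394.4030),
    ("fused_nn_dense_1", 301989888, 12, 1689.3522),
    ("fused_nn_dense_2", 301989888, 12, 1255.2267),
]

for name, flop, weight, latency_us in rows:
    print(
        name,
        round(speed_gflops(flop, latency_us), 4),
        round(weighted_latency_us(weight, latency_us), 4),
    )
```

Because the printed latencies are already rounded to four decimals, the recomputed values can differ from the table in the last digit.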
   profiler table:
   ```
    ID |                                        Name | Time (min) | Percentage 
   ----------------------------------------------------------------------------
       |                                       Total |    15.5728 |   100.0000 
     1 |             MeasureCallback/UpdateCostModel |     5.2182 |    33.5082 
     2 |     EvoSearch/Evolve/PredictNormalizedScore |     2.4700 |    15.8609 
     3 |                   EvoSearch/Evolve/Mutation |     2.1395 |    13.7387 
     4 |                                SendToRunner |     1.5999 |    10.2737 
     5 |                       EvoSearch/Evolve/Misc |     1.5694 |    10.0778 
     6 |                               SendToBuilder |     0.9345 |     6.0006 
     7 |              EvoSearch/SampleInitPopulation |     0.7653 |     4.9146 
     8 |                            ApplyHistoryBest |     0.5334 |     3.4250 
     9 |                              TaskExtraction |     0.1634 |     1.0490 
    10 |                              InitializeTask |     0.0280 |     0.1798 
    11 |                 EvoSearch/PickWithEpsGreedy |     0.0047 |     0.0304 
    12 |              EvoSearch/PickBestFromDatabase |     0.0036 |     0.0232 
    13 |               MeasureCallback/AddToDatabase |     0.0019 |     0.0125 
    14 |         MeasureCallback/RemoveBuildArtifact |     0.0004 |     0.0028 
    15 |              MeasureCallback/EchoStatistics |     0.0002 |     0.0010 
    16 |                           JoinRunnerFutures |     0.0002 |     0.0010 
    17 | EvoSearch/Evolve/Misc/CopyMeasuredWorkloads |     0.0000 |     0.0000 
   ----------------------------------------------------------------------------
   ```
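The Percentage column in the profiler table looks like each entry's time divided by the Total row. A small sketch under that assumption, with a hypothetical helper `percentage` and three entries copied from the table:

```python
def percentage(part_min: float, total_min: float) -> float:
    """Assumed relation: share of the total tuning time, in percent."""
    return 100.0 * part_min / total_min


total_min = 15.5728  # "Total" row of the profiler table

# (name, time in minutes) copied from the three largest entries
entries = [
    ("MeasureCallback/UpdateCostModel", 5.2182),
    ("EvoSearch/Evolve/PredictNormalizedScore", 2.4700),
    ("EvoSearch/Evolve/Mutation", 2.1395),
]

for name, minutes in entries:
    print(name, round(percentage(minutes, total_min), 4))
```

The recomputed values agree with the table only up to rounding of the printed times.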


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.