ReneEnjilian commented on PR #2271:
URL: https://github.com/apache/systemds/pull/2271#issuecomment-2994473184

   Thanks @phaniarnab for providing the resnet script. It executed correctly 
and the CuLibraries init time is now also much lower than before. Here are the 
SystemDS statistics: 
   ```
   SystemDS Statistics:
   Total elapsed time:          5.130 sec.
   Total compilation time:              1.452 sec.
   Total execution time:                3.678 sec.
   CUDA/CuLibraries init time:  0.394/0.121 sec.
   Number of executed GPU inst: 258.
   GPU mem alloc time  (alloc(success/fail) / dealloc / set0):  0.012(0.012/0.000) / 0.000 / 0.010 sec.
   GPU mem alloc count (alloc(success/fail/reuse) / dealloc / set0):    99(99/0/635) / 4 / 734.
   GPU mem tx time  (toDev(d2f/s2d) / fromDev(f2d/s2h) / evict(d2s/size)):      2.140(0.000/0.000) / 0.001(0.000/0.000) / 0.000(0.000/0.000) sec.
   GPU mem tx count (toDev(d2f/s2d) / fromDev(f2d/s2h) / evict(d2s/size)):      88(0/0) / 4(0/0) / 0(0/0).
   GPU conversion time  (sparseConv / sp2dense / dense2sp):     0.000 / 0.096 / 0.000 sec.
   GPU conversion count (sparseConv / sp2dense / dense2sp):     0 / 74 / 0.
   Cache hits (Mem/Li/WB/FS/HDFS):      170/0/0/0/0.
   Cache writes (Li/WB/FS/HDFS):        23/0/0/0.
   Cache times (ACQr/m, RLS, EXP):      0.004/0.002/0.004/0.000 sec.
   HOP DAGs recompiled (PRED, SB):      0/241.
   HOP DAGs recompile time:     0.274 sec.
   Functions recompiled:                1.
   Functions recompile time:    0.073 sec.
   Total JIT compile time:              10.022 sec.
   Total JVM GC count:          2.
   Total JVM GC time:           0.02 sec.
   Heavy hitter instructions:
     #  Instruction          Time(s)  Count
     1  resnet18_forward       3.220      1
     2  basic_block            2.347     24
     3  bn2d_forward           2.241     60
     4  gpu_batch_norm2d       2.158     60
     5  rand                   0.587    162
     6  gpu_conv2d_bias_add    0.214     61
     7  gpu_ba+*               0.068      7
     8  gpu_max                0.059     51
     9  gpu_softmax            0.040      3
    10  gpu_*                  0.028     26
   ```
   The init times are now reported as
`CUDA/CuLibraries init time: 0.394/0.121 sec.`. With your consent, since this
is your script, I would like to add it as a unit test for the GPU backend for
future debugging purposes. I will also add further tests to ensure good
coverage. In the next phase, we can run the perf test to check whether the
changes degrade performance.
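
A minimal sketch of how a later test or perf check could read the two init times from captured statistics text. The helper name `parse_init_times` and the regular expression are assumptions for illustration only, not part of SystemDS. The pattern is based solely on the line format shown in the statistics above.

```python
import re

# Matches the init-time line as printed in the statistics output above.
# \s* also tolerates a line break between the label and the numbers.
INIT_RE = re.compile(
    r"CUDA/CuLibraries init time:\s*([0-9.]+)/([0-9.]+)\s*sec\."
)


def parse_init_times(stats_text):
    """Return (cuda_init_sec, culibraries_init_sec), or None if the line is absent."""
    match = INIT_RE.search(stats_text)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


# Sample line copied from the statistics in this comment.
sample = "CUDA/CuLibraries init time:  0.394/0.121 sec."
cuda_sec, culib_sec = parse_init_times(sample)
print(cuda_sec, culib_sec)
```

A test could then compare the returned values against a threshold chosen by the test author. No threshold is proposed here.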


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.