[GitHub] [systemds] Baunsgaard commented on pull request #1127: [SYSTEMDS-2760] Transpose micro benchmark

2020-12-18 Thread GitBox


Baunsgaard commented on pull request #1127:
URL: https://github.com/apache/systemds/pull/1127#issuecomment-748005848


   Comparing before and after (I tested this by dropping the transpose commit 
from the history), it looks like I may have done something wrong in the 
initial tests. The changes do not appear to have had any impact, but the 
comparison did make me notice that the run-to-run difference on the wide 
transpose is large: sometimes it takes 5 seconds, sometimes 2.5. I'm guessing 
it has to do with the two NUMA nodes?
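
One way to put a number on that run-to-run spread is the coefficient of variation of the `r'` timings. The sketch below is only an illustration: the values are copied by hand from the transpose-wide-0.1 and transpose-wide-1.0 entries in the log that follows, and the helper name `spread` is made up for this example.

```python
from statistics import mean, stdev

# r' times in seconds, copied by hand from the transpose-wide-0.1 runs in the log
wide_sparse = [5.376, 4.871, 2.815, 2.592, 3.721]
# r' times in seconds from the transpose-wide-1.0 runs, for comparison
wide_dense = [2.439, 2.418, 2.393, 2.343, 2.453]


def spread(times):
    """Return (mean, sample standard deviation, coefficient of variation)."""
    m = mean(times)
    s = stdev(times)
    return m, s, s / m


for name, times in [("wide-0.1", wide_sparse), ("wide-1.0", wide_dense)]:
    m, s, cv = spread(times)
    print(f"{name}: mean={m:.3f}s stdev={s:.3f}s cv={cv:.1%}")
```

With only five runs per configuration this is a rough indicator, not a statistical test, and it says nothing about whether NUMA placement is the cause.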
   
   The Full transpose micro benchmark:
   
   After the change, on Alpha:
   ```code
   scripts/perftest/results/transpose-skinny-1.0.log
   Total elapsed time:  5.177 sec.
1  r' 2.567  5
   Total elapsed time:  5.592 sec.
1  r' 2.487  5
   Total elapsed time:  5.394 sec.
2  r' 2.393  5
   Total elapsed time:  5.607 sec.
1  r' 2.496  5
   Total elapsed time:  5.361 sec.
1  r' 2.531  5
   195735.81 msec  task-clock     #   31.188 CPUs utilized    ( +-  3.50% )
   595845281584    cycles         #    3.044 GHz              ( +-  3.34% )  (30.75%)
   67405027834     instructions   #    0.11  insn per cycle   ( +-  2.26% )  (38.51%)
   scripts/perftest/results/transpose-wide-1.0.log
   Total elapsed time:  4.870 sec.
1  r' 2.439  5
   Total elapsed time:  5.466 sec.
1  r' 2.418  5
   Total elapsed time:  5.381 sec.
1  r' 2.393  5
   Total elapsed time:  5.257 sec.
1  r' 2.343  5
   Total elapsed time:  4.880 sec.
1  r' 2.453  5
   197370.59 msec  task-clock     #   32.701 CPUs utilized    ( +-  6.74% )
   598434626116    cycles         #    3.032 GHz              ( +-  6.70% )  (30.76%)
   70128163005     instructions   #    0.12  insn per cycle   ( +-  1.65% )  (38.51%)
   scripts/perftest/results/transpose-full-1.0.log
   Total elapsed time:  3.736 sec.
2  r' 1.343  5
   Total elapsed time:  3.858 sec.
2  r' 1.326  5
   Total elapsed time:  3.500 sec.
2  r' 1.299  5
   Total elapsed time:  3.894 sec.
2  r' 1.305  5
   Total elapsed time:  3.526 sec.
2  r' 1.304  5
   104490.76 msec  task-clock     #   22.819 CPUs utilized    ( +-  1.56% )
   320478636150    cycles         #    3.067 GHz              ( +-  1.69% )  (30.80%)
   62146562879     instructions   #    0.19  insn per cycle   ( +-  1.59% )  (38.55%)
   scripts/perftest/results/transpose-skinny-0.1.log
   Total elapsed time:  2.701 sec.
1  r' 1.437  5
   Total elapsed time:  2.659 sec.
1  r' 1.141  5
   Total elapsed time:  3.174 sec.
1  r' 1.761  5
   Total elapsed time:  2.705 sec.
1  r' 1.103  5
   Total elapsed time:  3.112 sec.
1  r' 1.472  5
   152922.25 msec  task-clock     #   43.917 CPUs utilized    ( +-  5.32% )
   473697710114    cycles         #    3.098 GHz              ( +-  5.32% )  (31.11%)
   75871932728     instructions   #    0.16  insn per cycle   ( +-  2.13% )  (38.92%)
   scripts/perftest/results/transpose-wide-0.1.log
   Total elapsed time:  7.215 sec.
1  r' 5.376  5
   Total elapsed time:  6.703 sec.
1  r' 4.871  5
   Total elapsed time:  4.625 sec.
1  r' 2.815  5
   Total elapsed time:  4.400 sec.
1  r' 2.592  5
   Total elapsed time:  5.506 sec.
1  r' 3.721  5
   214645.79 msec  task-clock     #   33.943 CPUs utilized    ( +- 18.68% )
   658068071617    cycles         #    3.066 GHz              ( +- 18.75% )  (30.71%)
   78768925872     instructions   #    0.12  insn per cycle   ( +- 21.76% )  (38.42%)
   scripts/perftest/results/transpose-full-0.1.log
   Total elapsed time:  1.368 sec.
1  r' 0.583  5
   Total elapsed time:  1.365 sec.
1  r' 0.574  5
   Total elapsed time:  1.724 sec.
1  r' 0.835  5
   Total elapsed time:  1.564 sec.
1  r' 0.708  5
   Total elapsed time:  1.404 sec.
1  r' 0.522  5
 79268.38 msec task-clock#  

[GitHub] [systemds] Baunsgaard commented on pull request #1127: [SYSTEMDS-2760] Transpose micro benchmark

2020-12-17 Thread GitBox


Baunsgaard commented on pull request #1127:
URL: https://github.com/apache/systemds/pull/1127#issuecomment-747426098


   > I'll have a look tonight and see what we can do. Airline was dense, right?
   
   Yes, airline is dense, and I don't seem to be able to reproduce the bad 
performance when calling transpose in a script.
   
   The dimensions of airline are:
   14.5 mil rows, 29 cols, 2200 mil nnz
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [systemds] Baunsgaard commented on pull request #1127: [SYSTEMDS-2760] Transpose micro benchmark

2020-12-17 Thread GitBox


Baunsgaard commented on pull request #1127:
URL: https://github.com/apache/systemds/pull/1127#issuecomment-747424955


   The large 15 mil case shows little to no difference.
   
   But there is still a bug somewhere.
   
   XPS:
   ```bash
   scripts/perftest/results/transpose-large.log
   Total elapsed time: 7.377 sec.
1  r' 4.352  1
   Total elapsed time: 7.835 sec.
1  r' 4.649  1
   Total elapsed time: 7.659 sec.
1  r' 4.398  1
   Total elapsed time: 7.903 sec.
1  r' 4.677  1
   Total elapsed time: 7.723 sec.
1  r' 4.445  1
   36.435,71 msec   task-clock     #    4,264 CPUs utilized    ( +-  1,27% )
   134.881.449.707  cycles         #    3,702 GHz              ( +-  0,43% )  (30,65%)
   119.303.817.112  instructions   #    0,88  insn per cycle   ( +-  0,37% )  (38,39%)
   ```
   
   Alpha:
   ```bash
   scripts/perftest/results/transpose-large.log
   Total elapsed time:  8.531 sec.
1  r' 5.459  1
   Total elapsed time:  8.366 sec.
1  r' 5.412  1
   Total elapsed time:  10.413 sec.
1  r' 7.507  1
   Total elapsed time:  8.373 sec.
1  r' 5.420  1
   Total elapsed time:  8.254 sec.
1  r' 5.394  1
   100414.75 msec  task-clock     #   10.271 CPUs utilized    ( +-  5.07% )
   314073685855    cycles         #    3.128 GHz              ( +-  4.82% )  (30.86%)
   127951221368    instructions   #    0.41  insn per cycle   ( +-  3.10% )  (38.62%)
   ```
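
To make the XPS and Alpha summaries easier to compare, the instructions-per-cycle figures can be recomputed from the raw counters. This is a minimal sketch with the counter values copied by hand from the two logs above; the dictionary layout and the `ipc` helper are made up for this example. The trailing percentages in the perf output suggest the counters were multiplexed, so the values are scaled estimates.

```python
# Counter values copied by hand from the two perf summaries above
runs = {
    "XPS": {"cycles": 134_881_449_707, "instructions": 119_303_817_112},
    "Alpha": {"cycles": 314_073_685_855, "instructions": 127_951_221_368},
}


def ipc(counters):
    """Instructions retired per CPU cycle."""
    return counters["instructions"] / counters["cycles"]


for name, counters in runs.items():
    print(f"{name}: {ipc(counters):.2f} insn per cycle")

# How many more cycles and instructions Alpha needs for the same script
cycle_ratio = runs["Alpha"]["cycles"] / runs["XPS"]["cycles"]
insn_ratio = runs["Alpha"]["instructions"] / runs["XPS"]["instructions"]
print(f"Alpha/XPS cycles: {cycle_ratio:.2f}x, instructions: {insn_ratio:.2f}x")
```

Both machines retire a similar number of instructions, but Alpha spends more than twice the cycles doing so. That would be consistent with stalls such as memory or NUMA effects, but the counters alone do not establish the cause.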
   


