Baunsgaard opened a new pull request, #1886:
URL: https://github.com/apache/systemds/pull/1886

   This PR parallelizes our normal local filesystem write.
   Previously, the parallelization thresholds depended on the Hadoop block size.
   This PR adds parallelization thresholds based on the 4k blocks of the local
file system; they activate if the path being written to is a local filesystem
path.
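
To illustrate the idea of a path-dependent threshold, here is a minimal sketch. It is not the code in this PR: the class name, method names, the 128 MiB HDFS block size, and the "one writer per block, capped by the thread count" rule are all assumptions made for illustration. Only the 4k local block size comes from the description above.

```java
import java.net.URI;

public class WriteParallelismSketch {
    // 4 KiB local block size is taken from the PR description.
    static final long LOCAL_BLOCK_BYTES = 4L * 1024;
    // 128 MiB is a common HDFS default; assumed here, not stated in the PR.
    static final long HDFS_BLOCK_BYTES = 128L * 1024 * 1024;

    // Hypothetical helper: treat paths without a scheme, or with "file", as local.
    // URI.create rejects paths with spaces or backslashes; good enough for a sketch.
    public static boolean isLocalPath(String path) {
        String scheme = URI.create(path).getScheme();
        return scheme == null || scheme.equals("file");
    }

    // Hypothetical rule: one writer per block, at least 1, at most maxThreads.
    public static int numWriters(String path, long estimatedBytes, int maxThreads) {
        long blockBytes = isLocalPath(path) ? LOCAL_BLOCK_BYTES : HDFS_BLOCK_BYTES;
        long blocks = (estimatedBytes + blockBytes - 1) / blockBytes; // ceiling division
        return (int) Math.max(1, Math.min(maxThreads, blocks));
    }

    public static void main(String[] args) {
        long bytes = 10_000L * 100 * 8; // dense 10000 x 100 matrix of doubles
        System.out.println(numWriters("temp/out.bin", bytes, 16));
        System.out.println(numWriters("hdfs://namenode/out.bin", bytes, 16));
    }
}
```

With these assumed constants, an 8 MB matrix spans many 4k blocks, so a local write would use all 16 threads. The same matrix fits inside a single Hadoop-sized block and would stay single-threaded, which is the behavior the old thresholds produced for small local writes.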
   
   Measured speed: 490 MB/s before, 1366 MB/s now, uncompressed, for a
10k by 100 matrix.
   This means the IO time for such a matrix goes from ~17 ms to ~6 ms.
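
As a rough consistency check on the StandardDisk timings, assume the benchmark matrix (reported in the log header as Rows:10000, Cols:100) is stored as dense 8-byte doubles. That storage assumption is mine and is not stated in the PR.

```latex
\begin{aligned}
\text{size} &= 10000 \times 100 \times 8\,\mathrm{B} = 8 \times 10^{6}\,\mathrm{B} \\
\text{before:}\quad \frac{8 \times 10^{6}\,\mathrm{B}}{16.738 \times 10^{-3}\,\mathrm{s}} &\approx 4.78 \times 10^{8}\,\mathrm{B/s} \\
\text{after:}\quad \frac{8 \times 10^{6}\,\mathrm{B}}{6.052 \times 10^{-3}\,\mathrm{s}} &\approx 1.32 \times 10^{9}\,\mathrm{B/s} \\
\text{speedup} &= \frac{16.738}{6.052} \approx 2.77
\end{aligned}
```

These values are close to the first Byte/s column of the StandardDisk rows in the logs.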
   
   Before:
   
   ```txt
   Me:~/github/systemds$ java -jar -agentpath:$HOME/Programs/profiler/lib/libasyncProfiler.so=start,event=cpu,file=temp/log.html target/systemds-3.2.0-SNAPSHOT-perf.jar 12 10000 100 4 1.0 16 1000 -1
   Profiling started
              Serialize  Repetitions: 1000 ConstMatrix ( Rows:10000, Cols:100, Spar:1.0, Unique: 4) threads: 16
                             Serialize,    3.887+-  0.174 ms,   2058320386+-    89821429 Byte/s,   2058283594+-    89819824 Byte/s
                          StandardDisk,   16.738+-  0.687 ms,    477966146+-    19348274 Byte/s,    490246835+-    19845401 Byte/s
                       Compress Normal,    6.042+-  0.410 ms,   1324003460+-    87110231 Byte/s,    117328309+-     7719388 Byte/s
                   Compress StandardIO,    9.044+-  0.681 ms,    884579067+-    62485721 Byte/s,     77942529+-     5401477 Byte/s
             Update&Apply Scheme Fused,    2.167+-  0.060 ms,   3691386088+-   102744524 Byte/s,    211763414+-     5894136 Byte/s
              Update&Apply Standard IO,    3.776+-  0.214 ms,   2118758727+-   111043936 Byte/s,    117327287+-     7215174 Byte/s
   ```
   
   After:
   
   ```txt
   Me:~/github/systemds$ java -jar -agentpath:$HOME/Programs/profiler/lib/libasyncProfiler.so=start,event=cpu,file=temp/log.html target/systemds-3.2.0-SNAPSHOT-perf.jar 12 10000 100 4 1.0 16 1000 -1
   Profiling started
              Serialize  Repetitions: 1000 ConstMatrix ( Rows:10000, Cols:100, Spar:1.0, Unique: 4) threads: 16
                             Serialize,    4.468+-  0.156 ms,   1790403527+-    59113889 Byte/s,   1790371524+-    59112832 Byte/s
                          StandardDisk,    6.052+-  1.052 ms,   1321840798+-   186699910 Byte/s,   1366304529+-   192980072 Byte/s
                       Compress Normal,    6.493+-  0.458 ms,   1232126491+-    85824210 Byte/s,    109186511+-     7605425 Byte/s
                   Compress StandardIO,    9.391+-  0.551 ms,    851896160+-    46919041 Byte/s,     74963537+-     4126143 Byte/s
             Update&Apply Scheme Fused,    2.341+-  0.087 ms,   3417727924+-   129617142 Byte/s,    196064490+-     7435735 Byte/s
              Update&Apply Standard IO,    4.253+-  0.235 ms,   1880890778+-    95144067 Byte/s,    104397826+-     6055021 Byte/s
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

