[PR] [SYSTEMDS-3535] Scalable Linear Algebra Benchmark (SLAB) [systemds]

via GitHub Fri, 26 Jul 2024 15:42:23 -0700


ReneEnjilian opened a new pull request, #2055:
URL: https://github.com/apache/systemds/pull/2055


   This pull request implements the Scalable Linear Algebra Benchmark (SLAB). 
The benchmark divides its workloads into three segments: (1) matrix operators, 
(2) pipelines and decompositions, and (3) bulk LA-based ML algorithms. For each 
workload, parameters such as the number of rows and the sparsity are varied. In 
the original paper, the authors also varied parameters such as the number of 
nodes in their cluster. These parallel experiments are executed via Apache 
Spark. Because my setup has only a single CPU, I ran these experiments via 
Spark on that one CPU. Running them on multiple CPUs would require setting up a 
Spark cluster manually. I left the experiment output files in this pull request 
to make reviewing easier. The output folders can be deleted before merging.
   
   ### Potential Bug
   I noticed that the `tsmm` operator produces error messages in some 
experiments, as can be seen in the output directories 
operators/distributed_sparse and mlAlgorithms/distributed. For the former, the 
relevant file is slabGramMatrixSparse_stats.txt; for the latter, it is 
slabHeteroscedasticityRobustStandardErrorsDistr_stats.txt. So far I have not 
been able to determine why the `ArrayIndexOutOfBoundsException` occurs there.
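
   For readers unfamiliar with the operation involved: `tsmm` is the 
transpose-self matrix multiplication, i.e. the Gram matrix t(X) %*% X. The 
sketch below is only an illustration in plain Python of what that workload 
computes while rows and sparsity are varied. It is not the benchmark code from 
this pull request and does not reproduce the exception. The helper names, the 
matrix sizes, and the reading of "sparsity" as the fraction of nonzero cells 
are assumptions made for the example.

```python
import random


def random_sparse_rows(n_rows, n_cols, sparsity, seed=0):
    """Hypothetical helper: build rows as {column: value} dicts.

    Assumption: `sparsity` is the probability that a cell is nonzero.
    """
    rng = random.Random(seed)
    return [
        {j: rng.random() for j in range(n_cols) if rng.random() < sparsity}
        for _ in range(n_rows)
    ]


def gram_matrix(rows, n_cols):
    """Compute G = t(X) %*% X as a dense n_cols x n_cols list of lists.

    Each row contributes the outer product of its nonzero entries,
    so zero cells are never touched.
    """
    gram = [[0.0] * n_cols for _ in range(n_cols)]
    for row in rows:
        items = list(row.items())
        for j, vj in items:
            for k, vk in items:
                gram[j][k] += vj * vk
    return gram


if __name__ == "__main__":
    n_cols = 8  # illustrative size, not taken from the benchmark
    for n_rows in (100, 1000):
        for sparsity in (0.01, 0.1):
            X = random_sparse_rows(n_rows, n_cols, sparsity)
            G = gram_matrix(X, n_cols)
            trace = sum(G[j][j] for j in range(n_cols))
            print(n_rows, sparsity, trace)
```

   The result is always an n_cols x n_cols symmetric matrix regardless of the 
number of rows, which is why this operator is a common building block in the 
ML algorithm workloads.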


