ramesesz opened a new pull request, #1959:
URL: https://github.com/apache/systemds/pull/1959

   This patch improves the builtin dist function by removing the outer product 
operator. For 100 function calls on an arbitrary matrix with 4000 rows and 800 
columns, the new dist function reduces the total runtime from 66.541s to 
60.268s (about 9%).
   
   The following experiment was run with varying numbers of rows and columns 
(the script shows the 4000 x 800 configuration):
   ```
   X = rand(rows=4000, cols=800, min=-1, max=1, seed=42)
   
   for (i in 1:100){
       Y = new_distance_matrix(X)
   }
   
   print( sum(Y) )
   
   
   new_distance_matrix = function(matrix[double] X)
     return (matrix[double] out)
   {
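     # Squared row norms of X, one entry per row (n x 1 column vector)
     s = rowSums(X * X)
     # s_i + s_j - 2 * (x_i . x_j), with s and t(s) broadcast over the n x n
     # product instead of an explicit outer product; no sqrt is applied here
     out = -2 * X %*% t(X) + s + t(s)
     # Replace any NaN entries with 0
     out = replace(target=out, pattern=NaN, replacement=0)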
   }
   ```
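   
   For reference, the identity the script relies on can be written out as 
follows. This is a sketch of the standard expansion, not text from the patch; 
the symbols x_i (the i-th row of X) and s_i (the i-th entry of 
`rowSums(X * X)`) are notation introduced here for illustration:
   ```latex
   \begin{aligned}
   \lVert x_i - x_j \rVert^2
     &= (x_i - x_j)^\top (x_i - x_j) \\
     &= \lVert x_i \rVert^2 + \lVert x_j \rVert^2 - 2\, x_i^\top x_j \\
     &= s_i + s_j - 2\, (X X^\top)_{ij},
     \qquad s_i = \sum_k X_{ik}^2 .
   \end{aligned}
   ```
   
   Adding the column vector s and the row vector t(s) to `-2*X %*% t(X)` 
therefore yields the matrix of squared pairwise distances without forming 
the outer sum of s with itself explicitly.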
   
   Terminal output from the experiments, collected with the _time_ prefix and 
the _-stats_ argument of the systemds CLI, is available 
[here](https://glossy-flame-9af.notion.site/Distance-function-improvement-751d04bfe5c9458e9cf51a6d99c5d4c6).
   


