ramesesz opened a new pull request, #1959: URL: https://github.com/apache/systemds/pull/1959
This patch improves the builtin dist function by removing the outer product operator. For 100 function calls on a random matrix with 4000 rows and 800 columns, the new dist function reduces the runtime from 66.541s to 60.268s (roughly 9%).

The experiment was run with varying numbers of rows and columns, using the following script:

```
X = rand(rows=4000, cols=800, min=-1, max=1, seed=42)

# call the function 100 times to measure the accumulated runtime
for (i in 1:100) {
  Y = new_distance_matrix(X)
}
print(sum(Y))

new_distance_matrix = function(matrix[double] X) return (matrix[double] out) {
  n = nrow(X)
  # squared norm of every row, as a column vector
  s = rowSums(X * X)
  # s and t(s) are broadcast over the rows and columns of -2 * X %*% t(X),
  # so no outer product is needed
  out = -2 * X %*% t(X) + s + t(s)
  out = replace(target=out, pattern=NaN, replacement=0)
}
```

Terminal outputs from the experiments, collected with the _time_ prefix and the _-stats_ argument of the systemds CLI, can be seen [here](https://glossy-flame-9af.notion.site/Distance-function-improvement-751d04bfe5c9458e9cf51a6d99c5d4c6).
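
For reviewers, the identity the script relies on can be written out explicitly. This is a sketch of the reasoning, not text from the patch: the symbols $x_i$ (row $i$ of `X`), $D$ (the matrix `out`) and $\mathbf{1}$ (a column vector of ones) are notation introduced here for illustration.

```latex
% Squared Euclidean distance between rows x_i and x_j of X
D_{ij} = \lVert x_i - x_j \rVert^2
       = \lVert x_i \rVert^2 + \lVert x_j \rVert^2 - 2\, x_i^\top x_j

% With s = rowSums(X * X), i.e. s_i = \lVert x_i \rVert^2, in matrix form:
D = s\,\mathbf{1}^\top + \mathbf{1}\,s^\top - 2\, X X^\top
```

In the script, `s + t(s)` supplies the first two terms through broadcasting of the column vector `s` and the row vector `t(s)`, which is why the sum of the two norm vectors does not have to be materialized with a separate outer product. As posted, the script returns these squared distances; an element-wise square root would be needed to obtain Euclidean distances.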