e-strauss opened a new pull request, #2051: URL: https://github.com/apache/systemds/pull/2051
This patch adds a DML script for the bidirectional LSTM layer. The script reuses the existing standard LSTM layer to compute one pass in the normal direction and one in the reversed direction; the reversed pass is obtained by reversing the order of the input tokens along the temporal dimension.

I tried two approaches for the reversal, ran a small benchmark locally, and chose the better-performing one.

Approach 1 (the faster one, used in this patch) combines the rev() operator with t() (transpose), since rev() only reverses the order of rows. Reversing a whole row also flips the feature order within each timestep, so the input weights have to be reversed as well:

`X_reverse = t(rev(t(X))) # reverse the elements inside a row`

`W_reverse[1:D,] = rev(W_reverse[1:D,]) # have to reverse the input weights as well`

Approach 2 uses a for loop over the temporal axis together with slicing:

`X_reverse = matrix(0, rows=nrow(X), cols=ncol(X))`

`for (i in 1:T){X_reverse[,(T - i)*D+1:(T - i + 1)*D] = X[,(i-1)*D+1:i*D]}`

I added test cases that compare the results against CSV files with expected outputs computed using PyTorch's Bi-LSTM implementation.

The backward pass is not included in this commit yet, because I still have to fix an error that I found while testing. I'll add another commit or open another PR for the backward pass today or tomorrow, once it is fixed.
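To make the relationship between the two reversal approaches concrete, here is a minimal pure-Python sketch (not SystemDS code and not part of the patch). It assumes each row of X holds T timesteps of D features laid out consecutively, and it models only the per-timestep input projection with a hypothetical D x M weight matrix `W`; the helper names `reverse_columns`, `reverse_time_blocks`, and `project` are made up for illustration. Under those assumptions, it shows that reversing every column plus reversing the rows of the input weights gives the same projections as reversing only the time blocks.

```python
# Layout assumption: X is N x (T*D); row r holds timestep 1..T, D features each.

def reverse_columns(X):
    """Analogue of approach 1: reverse each row completely, like t(rev(t(X)))."""
    return [row[::-1] for row in X]

def reverse_time_blocks(X, T, D):
    """Analogue of approach 2: move block i to block T-i+1, keeping feature order."""
    out = [[0] * (T * D) for _ in X]
    for r, row in enumerate(X):
        for i in range(1, T + 1):  # 1-based, mirroring the DML loop
            out[r][(T - i) * D:(T - i + 1) * D] = row[(i - 1) * D:i * D]
    return out

def project(X, W, T, D):
    """Per-timestep input projection x_t @ W, with W of shape D x M."""
    M = len(W[0])
    return [
        [[sum(row[t * D + d] * W[d][m] for d in range(D)) for m in range(M)]
         for t in range(T)]
        for row in X
    ]

T, D = 3, 2
X = [[1, 2, 3, 4, 5, 6],
     [7, 8, 9, 10, 11, 12]]
W = [[1, 0, 2],
     [0, 1, 3]]  # hypothetical D x M input weights

X_rev1 = reverse_columns(X)            # features inside each timestep are flipped too
X_rev2 = reverse_time_blocks(X, T, D)  # only the order of timesteps changes
W_rev = W[::-1]                        # reverse the weight rows, like rev(W[1:D,])

proj1 = project(X_rev1, W_rev, T, D)
proj2 = project(X_rev2, W, T, D)
```

The two inputs differ (`X_rev1` flips the features inside each timestep, `X_rev2` does not), but `proj1` and `proj2` are equal, which is why approach 1 needs the extra weight reversal. The sketch ignores the recurrent weights and the gates of the real LSTM layer.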