MaximilianSchreff commented on PR #1941: URL: https://github.com/apache/systemds/pull/1941#issuecomment-1796412639
Yes, exactly. The forward pass performs a matrix multiplication and adds a bias, and these two operations take up 70% of the runtime. You can see this in the forward pass function in [scripts/nn/layers/graph_conv.dml](https://github.com/apache/systemds/pull/1941/files#diff-cf550d72402ce14bdc5b38206244cbb3152f6c02c5d42e2f066c5d5c0f91a661): only two lines use most of the runtime, and these two lines cannot be optimized by my layer.

As for the rest of the forward pass, I already took numerous steps to optimize it, reducing the runtime of the whole layer from an initial 780 seconds to 340 seconds. This includes:

- Merging the different functions into one large function to combine for-loops, which improves the gain from parallelization
- Optimizing the addition of self loops
- Caching normalization weights (removed again, since it wasn't faster)
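To illustrate where the time goes, here is a minimal sketch of a graph convolution forward pass in plain Python. It is not the code from graph_conv.dml: the function names, the dense adjacency representation, and the symmetric normalization `D^-1/2 (A + I) D^-1/2` are assumptions chosen for illustration. The last two statements of `gcn_forward` correspond to the matrix multiplication and bias addition mentioned above.

```python
import math


def matmul(A, B):
    """Dense matrix product of two list-of-lists matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def gcn_forward(X, edges, W, b):
    """Hypothetical forward pass: out = D^-1/2 (A + I) D^-1/2 X W + b.

    X: n x f node features, edges: list of (src, dst) pairs,
    W: f x k weights, b: bias vector of length k.
    """
    n = len(X)
    # Start from the identity so self loops are added in the same step (A + I).
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for src, dst in edges:
        A[dst][src] = 1.0
    # Normalization weights from node degrees (assumed symmetric normalization).
    deg = [sum(row) for row in A]
    A_hat = [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]
    # Aggregate neighbor features.
    H = matmul(A_hat, X)
    # The two dominant operations: matrix multiplication and bias addition.
    out = matmul(H, W)
    return [[v + bj for v, bj in zip(row, b)] for row in out]


if __name__ == "__main__":
    # Two nodes connected in both directions, identity weights.
    X = [[1.0, 2.0], [3.0, 4.0]]
    print(gcn_forward(X, [(0, 1), (1, 0)], [[1.0, 0.0], [0.0, 1.0]], [1.0, -1.0]))
```

In this sketch, everything before the last two statements is graph preprocessing that a layer implementation can restructure, while the product with `W` and the bias addition depend on the underlying matrix operations.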