Baunsgaard commented on PR #1941: URL: https://github.com/apache/systemds/pull/1941#issuecomment-1796189142
> About the performance: My machine showed performance issues when testing against PyTorch for very, very big inputs.
>
> Stress test: SystemDS: 340 seconds - PyTorch: 32 seconds
>
> The stress test consisted of about 300 forward passes with about 10,000 x 10,000 matrices. This is likely a problem with my setup and not my implementation, since the affine layer with the same inputs took 220 seconds. The GCL consists of a simple affine part and a convolutional part, with the convolutional part being a lot more complex. So, the implementation is likely quite fast, because the complex convolution part makes up only about a third of the runtime.

What is a realistic input size? In ILSVRC2012 (ImageNet), an example size could be 500 x 375 pixels.

Also, good progress! If you could post the '-stats' output, that would be great!
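For reference, the arithmetic behind the quoted numbers can be sketched as below. This is only a back-of-the-envelope check, not a benchmark: it assumes the affine part inside the GCL costs as much as the standalone affine layer (220 seconds), and it counts a 500 x 375 image as a single channel. The variable names are made up for illustration.

```python
# Timings quoted in the comment above, in seconds.
systemds_total = 340.0   # GCL stress test in SystemDS
pytorch_total = 32.0     # same stress test in PyTorch
affine_only = 220.0      # standalone affine layer, same inputs

# Assumption: the affine part of the GCL costs the same as the standalone
# affine layer, so the remainder is attributed to the convolutional part.
conv_estimate = systemds_total - affine_only
conv_share = conv_estimate / systemds_total
slowdown = systemds_total / pytorch_total

# Input sizes: stress-test matrix versus one example ImageNet-sized image
# (assumption: single channel, i.e. one cell per pixel).
stress_cells = 10_000 * 10_000
image_pixels = 500 * 375
size_ratio = stress_cells / image_pixels

print(f"estimated convolution time: {conv_estimate:.0f} s ({conv_share:.1%} of total)")
print(f"SystemDS / PyTorch runtime ratio: {slowdown:.1f}x")
print(f"stress-test matrix cells: {stress_cells:,}")
print(f"example image pixels: {image_pixels:,} (matrix is ~{size_ratio:.0f}x larger)")
```

Under these assumptions the convolutional part comes out at roughly 120 of the 340 seconds, and a single stress-test matrix holds several hundred times more cells than one example image, which is why a smaller, more realistic input size is worth testing.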