phaniarnab commented on PR #2050: URL: https://github.com/apache/systemds/pull/2050#issuecomment-2241068566
@WDRshadow, thanks for posting the numbers here. Are the execution times averaged over 3 runs? If not, please do that to reduce the impact of JIT compilation and GC overheads. I assume the numbers in this table measure only the total inference time and not the training time. The speedup from 2 GPUs is much lower than I expected. Can you explain why the speedup is not consistently 2x? If you are scoring n images, each GPU gets n/2 images, which should lead to a 2x speedup. I do not anticipate any additional overhead from two GPUs for this use case.
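
To make the request concrete, here is a minimal sketch of the measurement protocol: average the wall-clock time over 3 runs, then compare the single-GPU and two-GPU times against the ideal 2x. The helper names (`time_runs`, `speedup`, `efficiency`) are hypothetical and not part of SystemDS, and the untimed warm-up call is an assumption beyond what was asked above. For warm-up to absorb JIT costs, the timed function would also have to execute inside the same JVM.

```python
import statistics
import time


def time_runs(fn, warmup=1, runs=3):
    """Return the mean wall-clock time of `fn` over `runs` measured calls.

    The first `warmup` calls are executed but not timed (an assumption of
    this sketch), so one-off startup costs do not skew the average.
    """
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples)


def speedup(t_single, t_multi):
    """Speedup of the multi-GPU run relative to the single-GPU run."""
    return t_single / t_multi


def efficiency(t_single, t_multi, num_devices):
    """Fraction of the ideal linear speedup achieved (1.0 means ideal)."""
    return speedup(t_single, t_multi) / num_devices
```

For example, with purely illustrative timings of 10.0 s on one GPU and 5.0 s on two, `speedup(10.0, 5.0)` is 2.0 and `efficiency(10.0, 5.0, 2)` is 1.0. An efficiency well below 1.0 would point to overhead that does not shrink when the n images are split across devices.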