Re: [PR] [SYSTEMDS-2951] Multi-GPU Support for End-to-End ML Pipelines [systemds]

via GitHub Sat, 20 Jul 2024 03:31:43 -0700


phaniarnab commented on PR #2050:
URL: https://github.com/apache/systemds/pull/2050#issuecomment-2241079330


   > > @WDRshadow, thanks for putting the numbers here. Did you take an average 
of 3 runs to capture the execution time? If not, please do that to avoid the 
JIT compilation and GC overheads. And I assume the numbers reported in this 
table only measure the total inference time and not the training time.
   > > The speedup from 2 GPUs is much lower than I expected. Can you explain 
why the speedup is not consistently 2x? If you are scoring n images, then each 
GPU gets n/2 images, which should lead to 2x speedup. I do not anticipate any 
additional overhead for two GPUs for this use case.
   > 
   > Thanks. Your assumptions are inaccurate. This time is the total execution 
time, which includes exactly the same training process before the execution 
of the `parfor` loop. This is one reason. I am not familiar with `.dml` files 
and have no time to learn them, so I don't know how to store and read a trained 
model.
   
   Okay. In that case, try one of these two options: (1) Write the model to 
disk and create a separate DML script for inference that reads the model and 
immediately starts the `parfor` loop. You can find plenty of read/write 
examples in the test scripts and the reproducibility scripts I shared with you. 
(2) Call `time()` before and after the `parfor` loop and report only the 
inference time. You can find an example of using `time()` here: 
https://github.com/damslab/reproducibility/blob/master/vldb2022-UPLIFT-p2528/FTBench/systemds/T1.dml
   
   For either option, make sure the intermediates are already materialized 
before the loop starts. The SystemDS compiler sometimes delays operations until 
their results are used. You can print the sum of a matrix to force 
materialization.
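   
   A minimal sketch of how the two options could be combined in one inference 
script, not a definitive implementation. The file paths, the batch size, and 
the matrix multiply standing in for the real scoring call are all hypothetical, 
and the conversion to milliseconds assumes `time()` returns nanoseconds.

   ```
   # Hypothetical paths: W is the model written by a separate training script,
   # X holds the images to score, one per row.
   W = read("model/W.csv", format="csv");
   X = read("data/X_test.csv", format="csv");

   # Force materialization of the inputs before the timed region.
   print("checksum W: " + sum(W) + ", X: " + sum(X));

   n = nrow(X);
   batch = 128;                          # hypothetical batch size
   nb = ceil(n / batch);
   P = matrix(0, rows=n, cols=ncol(W));  # predictions, filled by the loop

   t0 = time();
   parfor (i in 1:nb) {
     beg = (i - 1) * batch + 1;
     fin = min(i * batch, n);
     # Placeholder for the real scoring function.
     P[beg:fin, ] = X[beg:fin, ] %*% W;
   }
   s = sum(P);                           # force the results before stopping the clock
   t1 = time();

   print("checksum P: " + s);
   # Assumes time() is in nanoseconds.
   print("Inference time: " + floor((t1 - t0) / 1000000) + " ms");
   ```

   Running this script three times for each GPU configuration and averaging the 
reported inference time would keep training, JIT compilation, and GC overheads 
out of the comparison.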

