Thanks. Could you please send the log for the 24-processor run with the GPU? 

   Note the final column of the table gives you the percentage of flops (not 
rates, actual operations) done on the GPU. For your biggest run, it is 18 percent 
for the MatMult and 23 percent for the KSP solve. I think this is much too low; 
we'd like to see well over 90 percent of the flops on the GPU, ideally 95 or 
more. Is this because you are forced to put very large matrices only on the CPU? 

   For the MatMult, if we assume the flop rate for the GPU is 25 times as fast 
as the CPU and 18 percent of the flops are done on the GPU, then the time for 
the GPU run should be 82.7 percent of the time for the CPU run 
(0.82 + 0.18/25 = 0.827), but it is 0.90. So where is the extra time? It seems 
like too much to be just the communication. 
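
   The estimate above can be written out as a small sketch. It assumes the 
simplest possible model: run time is proportional to flops divided by flop 
rate, with no communication or CPU/GPU copy cost. The function names are 
hypothetical and only for illustration; the inputs (18 percent, 25x, 0.90) 
are the numbers quoted above.

```python
def time_ratio(gpu_flop_fraction, gpu_speedup):
    """Predicted (GPU-run time) / (CPU-only time) if only flop rate matters."""
    cpu_part = 1.0 - gpu_flop_fraction          # flops still done on the CPU
    gpu_part = gpu_flop_fraction / gpu_speedup  # offloaded flops, done faster
    return cpu_part + gpu_part


def implied_speedup(gpu_flop_fraction, observed_ratio):
    """GPU speedup that would reproduce an observed ratio under the same model."""
    gpu_part = observed_ratio - (1.0 - gpu_flop_fraction)
    if gpu_part <= 0.0:
        raise ValueError("observed ratio is below what the CPU share alone allows")
    return gpu_flop_fraction / gpu_part


# Numbers from the MatMult discussion: 18% of flops on the GPU, assumed 25x rate.
predicted = time_ratio(0.18, 25.0)
effective = implied_speedup(0.18, 0.90)

print(f"predicted ratio: {predicted:.4f}")
print(f"observed ratio: 0.90, unexplained: {0.90 - predicted:.4f}")
print(f"effective GPU speedup implied by 0.90: {effective:.2f}x")
print(f"predicted ratio with 95% of flops on the GPU: {time_ratio(0.95, 25.0):.3f}")
```

   Under this model the predicted ratio is 0.8272, and an observed ratio of 
0.90 corresponds to an effective speedup of only about 2.25x on the offloaded 
flops. The difference is the extra time being asked about.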

   There is so much information and so much happening in the final stage that 
it is hard to discern what is killing the performance in the GPU case for the 
KSP solve. Is there any way you can add a stage at the end with just several 
KSP solves and nothing else? 

   Barry


> On Jul 29, 2019, at 5:26 PM, Mark Adams <[email protected]> wrote:
> 
> 
> 
> On Mon, Jul 29, 2019 at 5:31 PM Smith, Barry F. <[email protected]> wrote:
> 
>   I don't understand the notation in the legend on the second page
> 
> 12,288 cpus and no GPUs ?
> 
> Yes
>  
> 
> 24 GPUs?  or 6 GPUs
> 
> 24 virtual, 6 real GPUs per node. The first case is one node, 24 cores/vGPUs
>  
> 
> 192 GPUs?
> 
> 1536 GPUs?
> 
> 12,288 GPUs?  or 12288/4 = 3072  GPUs?
> 
> All "GPUs" are one core/process/vGPU. So 12288 virtual GPUs and 3072 physical 
> GPUs.
> 
> Maybe I should add 'virtual GPUs' and put (4 processes/SUMMIT GPU)
>  
> 
> So on the largest run using GPUs or not takes pretty much exactly the same 
> amount of  time?
> 
> yes. The raw Mat-vec is about 3x faster with ~95K equations/process. I've 
> attached the data.
>  
> 
> What about 6 GPUs vs 24 CPUs ? Same equal amount of time. 
> 
> Can you send some log summaries
> 
> <out_cpu_012288><out_cuda_000024><out_cuda_001536><out_cuda_000192><out_cuda_012288>
