VecAssemblyBegin() acts as a barrier unless you set the vector option 
VEC_IGNORE_OFF_PROC_ENTRIES, so I am not surprised that it "appears" to take a 
lot of time. BUT the balance between the fastest and slowest process listed in 
your table below is 1.0, which is very surprising: it indicates that every 
process supposedly spent the same amount of time within VecAssemblyBegin(). 
Note that for VecAssemblyEnd() the balance is 2.3, which is what I would 
commonly expect. Please send me ALL the output from -log_summary for these 
cases. The version of PETSc shouldn't matter for this issue.
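
To make the point about the balance concrete, here is a toy sketch in plain Python (not PETSc code). It assumes the balance column is the maximum time over the minimum time across processes, and it models a barrier as releasing every rank once the last one arrives. The function names, arrival times, and overhead value are all made up for illustration.

```python
# Toy model only: these names and numbers are hypothetical, not PETSc internals.

def time_in_barrier(arrival_times, overhead=0.001):
    """Time each rank spends inside a barrier that releases when the last rank arrives."""
    release = max(arrival_times)
    # Early ranks wait longest; the last rank only pays the fixed overhead.
    return [release - t + overhead for t in arrival_times]


def balance_ratio(times):
    """Max-over-min ratio of per-rank times, as in the balance column of the log."""
    fastest = min(times)
    if fastest <= 0.0:
        return float("inf")
    return max(times) / fastest


# Ranks that reach the barrier at different moments (made-up values, in seconds).
arrivals = [0.0, 0.5, 1.0, 2.0]
waits = time_in_barrier(arrivals)

# With staggered arrivals the ratio is far above 1.0.
print("staggered arrivals:", balance_ratio(waits))

# A ratio of exactly 1.0 needs every rank to spend the same time inside.
print("identical times:   ", balance_ratio([3.0, 3.0, 3.0]))
```

In this model a ratio of 1.0 only appears when every rank enters at the same moment, which is why the 1.0 reported for VecAssemblyBegin() looks odd next to the 2.3 for VecAssemblyEnd().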

> On May 28, 2015, at 4:59 PM, Mark Adams <[email protected]> wrote:
> 
> We are seeing some large times spent in VecAssemblyBegin:
> 
> VecAssemblyBegin     242 1.0 7.9796e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 7.3e+02 12  0  0  0  5  76  0  0  0 10     0
> VecAssemblyEnd       242 1.0 5.6624e-04 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> 
> This is with 64K cores on Edison.  On 128K cores (weak speedup) we see:
> 
> VecAssemblyBegin     248 1.0 2.3615e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 7.4e+02 17  0  0  0  4  87  0  0  0 10     0
> VecAssemblyEnd       248 1.0 6.8855e-04 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> 
> We are working on using older versions of PETSc to make sure this is a PETSc 
> issue but does anyone have any thoughts on this?
> 
> Mark
