> On Jan 13, 2016, at 10:24 PM, Justin Chang <[email protected]> wrote:
> 
> Thanks Barry,
> 
> 1) So for block matrices, the ja array is smaller. But what's the "hardware" 
> explanation for this performance improvement? Does it have to do with spatial 
> locality where you are more likely to reuse data in that ja array, or does it 
> have to do with the fact that loading/storing smaller arrays is less likely 
> to incur a cache miss, thus reducing the required memory bandwidth? 

There are two distinct reasons for the improvement:

1) For 5 by 5 blocks the ja array is 1/25th the size. The "hardware" saving is 
simply that there is much less index data to load from memory than before. 
Cache/spatial locality has nothing to do with this particular improvement.
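
For a rough feel of that saving, here is a tiny stand-alone illustration (the 
nonzero count and the 4-byte index size are made-up assumptions for the example, 
not PETSc specifics):

    #include <stdio.h>

    int main(void)
    {
      /* Assume a matrix with 1,000,000 nonzero scalar entries and 4-byte column indices. */
      long nnz        = 1000000;
      long aij_bytes  = nnz * 4;        /* AIJ:  one column index per nonzero entry */
      long baij_bytes = (nnz / 25) * 4; /* BAIJ: one column index per 5x5 block     */
      printf("AIJ  ja storage: %ld bytes\n", aij_bytes);
      printf("BAIJ ja storage: %ld bytes\n", baij_bytes);
      return 0;
    }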

2) The other improvement comes from reusing each x[j] value: it is multiplied by 
5 values (a column) of the little block. The hardware explanation is that x[j] 
can be kept in a register for the 5 multiplies (otherwise it would have to come 
from cache to a register 5 times, and sometimes it might even have been flushed 
from the cache and would have to come from memory). This is why we have
code like

    for (j=0; j<n; j++) {
      xb    = x + 5*(*idx++);   /* the 5 x values for this block column */
      x1    = xb[0]; x2 = xb[1]; x3 = xb[2]; x4 = xb[3]; x5 = xb[4];
      sum1 += v[0]*x1 + v[5]*x2 + v[10]*x3  + v[15]*x4 + v[20]*x5;
      sum2 += v[1]*x1 + v[6]*x2 + v[11]*x3  + v[16]*x4 + v[21]*x5;
      sum3 += v[2]*x1 + v[7]*x2 + v[12]*x3  + v[17]*x4 + v[22]*x5;
      sum4 += v[3]*x1 + v[8]*x2 + v[13]*x3  + v[18]*x4 + v[23]*x5;
      sum5 += v[4]*x1 + v[9]*x2 + v[14]*x3  + v[19]*x4 + v[24]*x5;
      v    += 25;               /* advance to the next 5x5 block */
    }

to do the block multiply.
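
For contrast, a rough sketch of the corresponding scalar (AIJ-style) inner loop, 
where each x value is fetched again for every nonzero that references it (this 
is an illustrative sketch, not the exact PETSc source):

    for (j=0; j<n; j++) {
      sum += v[j]*x[idx[j]];   /* x[idx[j]] loaded once per nonzero, no register reuse */
    }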

> 
> 2) So if one wants to assemble a monolithic matrix (i.e., aggregation of more 
> than one dof per point) then using the BAIJ format is highly advisable. But 
> if I want to form a nested matrix, say I am solving Stokes equation, then 
> each "submatrix" is of AIJ format? Can these sub matrices also be BAIJ?

   Sure, but if you have separated all the variables of pressure, velocity_x, 
velocity_y, etc. into their own regions of the vector, then the block size for 
the submatrices would be 1, so BAIJ does not help. 
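
For that separated case, a rough sketch of putting the pieces into one nested 
operator (the submatrix names A00, A01, A10, A11 are placeholders created 
elsewhere as ordinary AIJ matrices; this is a fragment, not a complete program):

    /* A00 = velocity-velocity block, A01 = gradient, A10 = divergence, A11 = pressure block;
       passing NULL for the index sets lets PETSc derive the layouts from the submatrices. */
    Mat subs[4] = {A00, A01, A10, A11};
    Mat S;
    ierr = MatCreateNest(PETSC_COMM_WORLD, 2, NULL, 2, NULL, subs, &S);CHKERRQ(ierr);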

   There are Stokes solvers that use Vanka smoothing; these keep the variables 
interlaced and hence would use BAIJ and NOT use fieldsplit. 
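
For the interlaced case, a minimal sketch of creating and filling a BAIJ matrix 
block-by-block (the block size 2, the matrix sizes, and the values are made up 
purely for illustration):

    #include <petscmat.h>

    int main(int argc, char **argv)
    {
      Mat            A;
      PetscInt       row = 0, col = 0;
      PetscScalar    vals[4] = {1.0, 2.0, 3.0, 4.0}; /* one 2x2 block, row-oriented by default */
      PetscErrorCode ierr;

      ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
      /* 4x4 matrix stored as 2x2 blocks (e.g. u,v interlaced per grid point),
         preallocating roughly 2 nonzero blocks per block row */
      ierr = MatCreateSeqBAIJ(PETSC_COMM_SELF, 2, 4, 4, 2, NULL, &A);CHKERRQ(ierr);
      /* insert one block at block row 0, block column 0 */
      ierr = MatSetValuesBlocked(A, 1, &row, 1, &col, vals, INSERT_VALUES);CHKERRQ(ierr);
      ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatDestroy(&A);CHKERRQ(ierr);
      ierr = PetscFinalize();
      return ierr;
    }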


> 
> Thanks,
> Justin 
> 
> On Wed, Jan 13, 2016 at 9:12 PM, Barry Smith <[email protected]> wrote:
> 
> > On Jan 13, 2016, at 9:57 PM, Justin Chang <[email protected]> wrote:
> >
> > Hi all,
> >
> > 1) I am guessing MATMPIBAIJ could theoretically have better performance 
> > than simply using MATMPIAIJ. Why is that? Is it similar to the reasoning 
> > that block (dense) matrix-vector multiply is "faster" than simple 
> > matrix-vector?
> 
>   See for example table 1 in 
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.38.7668&rep=rep1&type=pdf
> 
> >
> > 2) I am looking through the manual and online documentation and it seems 
> > the term "block" is used everywhere. In the section on "block matrices" (3.1.3 
> > of the manual), it refers to field splitting, where you could either have a 
> > monolithic matrix or a nested matrix. Does that concept have anything to do 
> > with MATMPIBAIJ?
> 
>    Unfortunately the numerical analysis literature uses the term block in 
> multiple ways: for small dense blocks (sometimes called "point-blocks", which 
> is what BAIJ handles) and for very large blocks, where the blocks are 
> themselves sparse. In PETSc I used the term fieldsplit for the big sparse 
> blocks to try to avoid confusion.
> >
> > It makes sense to me that one could create a BAIJ where if you have 5 dofs 
> > of the same type of physics (e.g., five different primary species of a 
> > geochemical reaction) per grid point, you could create a block size of 5. 
> > And if you have different physics (e.g., velocity and pressure) you would 
> > ideally want to separate them out (i.e., nested matrices) for better 
> > preconditioning.
> 
>    Sometimes you put them together with BAIJ and sometimes you keep them 
> separate with nested matrices.
> 
> >
> > Thanks,
> > Justin
> 
> 
