Thanks so much, Barry,

On Tue, Jun 9, 2020 at 3:08 PM Barry Smith <[email protected]> wrote:
>
>    You might look at the notes about MPI binding. It might give you a bit
> better performance.
> https://www.mcs.anl.gov/petsc/documentation/faq.html#computers
>

I am using mvapich2 and am still looking for the binding command-line options I can use.

>
>    The streams is exactly the DAXPY operation so this is the speed up you
> should expect for VecAXPY() which has 2 loads and 1 store per 1 multiply and
> 1 add
>
>    VecDot() has 2 loads per 1 multiply and 1 add but also a global
> reduction
>
>    Sparse multiply with AIJ has an integer load, 2 double loads plus 1
> store per row with 1 multiply and 1 add plus communication needed for
> off-process portion
>
>    Function evaluations often have higher arithmetic intensity so should
> give a bit higher speedup
>
>    Jacobian evaluations often have higher arithmetic intensity but they
> may have MatSetValues() which is slow because no arithmetic intensity just
> memory motion
>

Got it.

Thanks,

Fande,

>
>   Barry
>
>
> On Jun 9, 2020, at 3:43 PM, Fande Kong <[email protected]> wrote:
>
> Hi All,
>
> I am trying to interpret the results from "make stream" on two compute
> nodes, where each node has 48 cores.
>
> If my calculations are memory bandwidth limited, such as AMG, MatVec,
> GMRES, etc.
> The best speedup I could get is 16.6938 if I start from one core? The
> speedup for function evaluations and Jacobian evaluations can be better
> than 16.6938?
>
> Thanks,
>
> Fande,
>
>
> Running streams with 'mpiexec ' using 'NPMAX=96'
> 1 19412.4570 Rate (MB/s)
> 2 29457.3988 Rate (MB/s) 1.51744
> 3 40483.9318 Rate (MB/s) 2.08546
> 4 51429.3431 Rate (MB/s) 2.64929
> 5 59849.5168 Rate (MB/s) 3.08304
> 6 66124.3461 Rate (MB/s) 3.40628
> 7 70888.1170 Rate (MB/s) 3.65167
> 8 73436.2374 Rate (MB/s) 3.78294
> 9 77441.7622 Rate (MB/s) 3.98927
> 10 78115.3114 Rate (MB/s) 4.02397
> 11 81449.3315 Rate (MB/s) 4.19572
> 12 82812.3471 Rate (MB/s) 4.26593
> 13 81442.2114 Rate (MB/s) 4.19535
> 14 83404.1657 Rate (MB/s) 4.29642
> 15 84165.8536 Rate (MB/s) 4.33565
> 16 83739.2910 Rate (MB/s) 4.31368
> 17 83724.8109 Rate (MB/s) 4.31293
> 18 83225.0743 Rate (MB/s) 4.28719
> 19 81668.2002 Rate (MB/s) 4.20699
> 20 83678.8007 Rate (MB/s) 4.31056
> 21 81400.4590 Rate (MB/s) 4.1932
> 22 81944.8975 Rate (MB/s) 4.22124
> 23 81359.8615 Rate (MB/s) 4.19111
> 24 80674.5064 Rate (MB/s) 4.1558
> 25 83761.3316 Rate (MB/s) 4.31481
> 26 87567.4876 Rate (MB/s) 4.51088
> 27 89605.4435 Rate (MB/s) 4.61586
> 28 94984.9755 Rate (MB/s) 4.89298
> 29 98260.5283 Rate (MB/s) 5.06171
> 30 99852.8790 Rate (MB/s) 5.14374
> 31 102736.3576 Rate (MB/s) 5.29228
> 32 108638.7488 Rate (MB/s) 5.59633
> 33 110431.2938 Rate (MB/s) 5.68867
> 34 112824.2031 Rate (MB/s) 5.81194
> 35 116908.3009 Rate (MB/s) 6.02232
> 36 121312.6574 Rate (MB/s) 6.2492
> 37 122507.3172 Rate (MB/s) 6.31074
> 38 127456.2504 Rate (MB/s) 6.56568
> 39 130098.7056 Rate (MB/s) 6.7018
> 40 134956.4461 Rate (MB/s) 6.95204
> 41 138309.2465 Rate (MB/s) 7.12475
> 42 141779.7997 Rate (MB/s) 7.30353
> 43 145653.3687 Rate (MB/s) 7.50307
> 44 149131.2087 Rate (MB/s) 7.68223
> 45 151611.6104 Rate (MB/s) 7.81
> 46 155554.6394 Rate (MB/s) 8.01312
> 47 159033.1938 Rate (MB/s) 8.19231
> 48 162216.5600 Rate (MB/s) 8.35629
> 49 165034.8116 Rate (MB/s) 8.50147
> 50 168001.4823 Rate (MB/s) 8.65429
> 51 170899.9045 Rate (MB/s) 8.8036
> 52 175687.8033 Rate (MB/s) 9.05024
> 53 178203.9203 Rate (MB/s) 9.17985
> 54 179973.3914 Rate (MB/s) 9.27101
> 55 182207.3495 Rate (MB/s) 9.38608
> 56 185712.9643 Rate (MB/s) 9.56667
> 57 188805.5696 Rate (MB/s) 9.72598
> 58 193360.9158 Rate (MB/s) 9.96064
> 59 198160.8016 Rate (MB/s) 10.2079
> 60 201297.0129 Rate (MB/s) 10.3695
> 61 203618.7672 Rate (MB/s) 10.4891
> 62 209599.2783 Rate (MB/s) 10.7971
> 63 211651.1587 Rate (MB/s) 10.9028
> 64 210254.5035 Rate (MB/s) 10.8309
> 65 218576.4938 Rate (MB/s) 11.2596
> 66 220280.0853 Rate (MB/s) 11.3473
> 67 221281.1867 Rate (MB/s) 11.3989
> 68 228941.1872 Rate (MB/s) 11.7935
> 69 232206.2708 Rate (MB/s) 11.9617
> 70 233569.5866 Rate (MB/s) 12.0319
> 71 238293.6355 Rate (MB/s) 12.2753
> 72 238987.0729 Rate (MB/s) 12.311
> 73 246013.4684 Rate (MB/s) 12.6729
> 74 248850.8942 Rate (MB/s) 12.8191
> 75 249355.6899 Rate (MB/s) 12.8451
> 76 252515.6110 Rate (MB/s) 13.0079
> 77 257489.4268 Rate (MB/s) 13.2641
> 78 260884.2771 Rate (MB/s) 13.439
> 79 264341.8661 Rate (MB/s) 13.6171
> 80 269329.1376 Rate (MB/s) 13.874
> 81 272286.4070 Rate (MB/s) 14.0263
> 82 273325.7822 Rate (MB/s) 14.0799
> 83 277334.6699 Rate (MB/s) 14.2864
> 84 280254.7286 Rate (MB/s) 14.4368
> 85 282219.8194 Rate (MB/s) 14.538
> 86 289039.2677 Rate (MB/s) 14.8893
> 87 291234.4715 Rate (MB/s) 15.0024
> 88 295941.1159 Rate (MB/s) 15.2449
> 89 298136.3163 Rate (MB/s) 15.358
> 90 302820.9080 Rate (MB/s) 15.5993
> 91 306387.5008 Rate (MB/s) 15.783
> 92 310127.0223 Rate (MB/s) 15.9756
> 93 310219.3643 Rate (MB/s) 15.9804
> 94 317089.5971 Rate (MB/s) 16.3343
> 95 315457.0938 Rate (MB/s) 16.2502
> 96 324068.8172 Rate (MB/s) 16.6938
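
For anyone reading along, here is a rough sketch of how the numbers above translate into expected kernel performance. It is not part of PETSc. The rates are copied from the streams output in this thread, and the per-kernel byte counts follow Barry's load/store description. The 4-byte index size (a 32-bit integer build) is an assumption, and the helper names `speedup` and `bandwidth_bound_mflops` are made up for illustration. The estimate ignores caching, the per-row store in the sparse multiply, and all communication, so treat the results as approximate ceilings only.

```python
# Rates in MB/s copied from the "make stream" output above (selected rows).
stream_rate_mb_s = {1: 19412.4570, 24: 80674.5064, 48: 162216.5600, 96: 324068.8172}

DOUBLE_BYTES = 8
INT_BYTES = 4  # assumption: 32-bit indices; a 64-bit index build would use 8

# (bytes moved, flops) per inner iteration, following the counts in Barry's reply.
kernels = {
    "VecAXPY": (3 * DOUBLE_BYTES, 2),  # 2 loads + 1 store, 1 multiply + 1 add
    "VecDot": (2 * DOUBLE_BYTES, 2),   # 2 loads, 1 multiply + 1 add
    # 1 integer load + 2 double loads per nonzero; per-row store ignored
    "MatMult (AIJ, per nonzero)": (INT_BYTES + 2 * DOUBLE_BYTES, 2),
}


def speedup(nprocs, base=1):
    """Ratio of aggregate memory bandwidth at nprocs to that at base."""
    return stream_rate_mb_s[nprocs] / stream_rate_mb_s[base]


def bandwidth_bound_mflops(kernel, nprocs):
    """Flop rate (Mflop/s) if the kernel runs exactly at the streams rate."""
    bytes_moved, flops = kernels[kernel]
    return stream_rate_mb_s[nprocs] * flops / bytes_moved


for n in sorted(stream_rate_mb_s):
    print(f"{n:3d} procs: speedup {speedup(n):.2f}, efficiency {speedup(n) / n:.2f}")
    for name in kernels:
        print(f"    {name}: about {bandwidth_bound_mflops(name, n):.0f} Mflop/s")
```

With these rates, `speedup(96)` comes out close to the 16.69 reported in the last row, which is a parallel efficiency of roughly 0.17 on 96 processes. Kernels with higher arithmetic intensity (more flops per byte moved) are not held to this ceiling, which is why function evaluations may scale somewhat better.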
