Hi,

I sent this question a while ago, but it was not answered. If the question below is not that relevant, maybe just one simple question: should the ex2f code with about 1000000 unknowns scale properly, maybe up to 8 or 16 processors, if the server distributes the MPI load properly?

Thank you very much. Here's the previous email:

I'm thinking of using the example code ex2f to determine my school server's MPI performance. Since my own code uses a 1200x1080 grid, I set m and n in ex2f.F to 1200 and 1080 respectively. I ran the code on 4 processors and found that it took a very long time to complete (> 0.5 hr). I then used Hypre as the preconditioner and it completed in a few minutes. I decided to use this setup to gauge my server's scaling performance. My own code also uses Hypre.

However, to increase the runtime and make the timing less prone to random timing errors, I added

      do k=1,100
        call KSPSolve(ksp,b,x,ierr)
      end do

to repeat the solve 100 times. Is this fine (will there be any cache-related problems)? If not, is there a better way? I need to keep the grid size fixed so that I can compare with my own code, which uses the same number of grid points.

Anyway, I ran the code and got this result:

processors   ksp_solve time (from log_summary)   wall time (min/sec)
 1           4.6e2                               7/50
 2           2.4e2                               4/9
 4           1.3e2                               2/20
 8           1.1e2                               2/23
12           7.7e1                               1/24

Does this mean that my school server scales poorly for > 4 processors? I repeated the run on 8 processors and the new timing is even longer.

Thank you very much.
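
To make the scaling question easier to judge, here is a minimal sketch that turns the ksp_solve times above into speedup and parallel efficiency relative to the 1-processor run. It is not part of ex2f or PETSc; the names `ksp_solve_time` and `scaling_table` are made up for this illustration, and the times are assumed to be in seconds as printed by log_summary.

```python
# KSPSolve times copied from the table above: processors -> time.
# Assumption: the values are seconds, as reported by log_summary.
ksp_solve_time = {1: 4.6e2, 2: 2.4e2, 4: 1.3e2, 8: 1.1e2, 12: 7.7e1}


def scaling_table(times):
    """Return (processors, speedup, efficiency) rows.

    Speedup and efficiency are measured relative to the run with the
    fewest processors (here the 1-processor run).
    """
    base_procs = min(times)
    base_time = times[base_procs]
    rows = []
    for procs in sorted(times):
        speedup = base_time / times[procs]
        # Efficiency is speedup divided by the factor by which the
        # processor count grew; 1.0 means perfect scaling.
        efficiency = speedup * base_procs / procs
        rows.append((procs, speedup, efficiency))
    return rows


if __name__ == "__main__":
    for procs, speedup, efficiency in scaling_table(ksp_solve_time):
        print(f"{procs:>2} procs: speedup {speedup:5.2f}, "
              f"efficiency {efficiency:6.1%}")
```

With these numbers the efficiency stays near 90% up to 4 processors and falls to roughly 50% at 8 and 12, which is the drop the question is about.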
