Hi All, I'm a new user of PyFR. I started by running the 2D benchmark cases provided with the source files and then moved up to the supplementary materials of journal publications.
I'm currently running the 3D sd7003 case; the files are attached. First, I use p1 to initialize the domain. Then, starting at t=25s, I switch to p3 with the multi-p method until t=32s. Lastly, statistics are gathered until t=45s.

A colleague ran the same case using 2 V100 GPUs and the full simulation took him about 4 days. I am using 9 nodes of 2x Intel Xeon 6540 (18 cores per CPU, 2 CPUs per node, 324 cores total) and am therefore running with the OpenMP backend. From previous posts, I've read that PyFR runs best with one MPI rank per CPU, so I'm using OMP_NUM_THREADS=18.

The p1 case (sd7003_1.ini) took about 24 hours to run. However, after 36 hours of running the first p3 case (sd7003_2.ini), the estimated time is about 10 days. Extrapolating, this would mean the entire run would take about 28 days.

I don't have experience running with GPUs, but I was expecting each GPU to match the performance of 3-4 CPUs. The above indicates a ratio of about 60 CPUs per GPU (28 days on 18 CPUs versus 4 days on 2 GPUs), which does not seem right.

- I know PyFR runs very well on GPUs, but are these results to be expected?
- I've been running a case that was meant to run on GPUs. Perhaps other numerical methods are better suited to CPUs / the OpenMP backend?

Thanks,
Solal
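For reference, the ratio of about 60 can be reproduced with a quick back-of-the-envelope calculation. This is only a rough sketch using the figures quoted in this post; it assumes that throughput scales linearly with the number of devices, which may not hold in practice, and the variable names are purely illustrative.

```python
# Figures quoted in the post; linear scaling with device count is an assumption.
cpu_days = 28   # extrapolated wall time of the full run on the CPU cluster
n_cpus = 18     # 9 nodes x 2 CPUs per node
gpu_days = 4    # colleague's wall time for the full run on GPUs
n_gpus = 2      # V100 GPUs

# Total "device-days" spent on the same simulation by each setup.
cpu_device_days = cpu_days * n_cpus
gpu_device_days = gpu_days * n_gpus

# Number of CPUs needed to match one GPU under these assumptions.
cpus_per_gpu = cpu_device_days / gpu_device_days
print(f"One GPU is doing the work of roughly {cpus_per_gpu:.0f} CPUs")
```

With these numbers the ratio comes out at 63, consistent with the "about 60" figure.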
Attachments: sd7003_1.ini, sd7003_2.ini, sd7003_3.ini
