Hi All,

I'm a new user of PyFR. I started by running the 2D benchmark cases 
provided with the source files and then moved up to the supplementary 
materials of journal publications.

I'm currently running the 3D sd7003 case (files attached). I first use p1 
to initialize the domain. Then, starting at t=25s, I switch to p3 with the 
multi-p method until t=32s. Lastly, statistics are gathered until t=45s. 
A colleague ran the same case using 2 V100 GPUs and it took him about 
4 days for the full simulation.

I am using 9 nodes of 2x Intel Xeon 6540 (18 cores per CPU, 2 CPUs per node, 
324 cores total) and am therefore running with the OpenMP backend. From 
previous posts, I've read that PyFR runs best with one MPI rank per CPU, so 
I'm using OMP_NUM_THREADS=18.
The p1 case (sd7003_1.ini) took about 24 hours to run. However, after 36 
hours of running the first p3 case (sd7003_2.ini), the estimated time is 
about 10 days. Extrapolating, this would mean the entire run would take 
about 28 days. 

I don't have experience running with GPUs, but I was expecting each GPU to 
match the performance of 3-4 CPUs. The above indicates a ratio of about 60 
CPUs per GPU, which does not seem right.
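
For reference, the ratio of about 60 can be reproduced with the rough 
arithmetic below. This is only a back-of-the-envelope sketch: it assumes both 
runs are the same simulation and that performance scales ideally with the 
number of devices. The variable names are illustrative.

```python
# Figures quoted in this message; the ideal-scaling assumption is a simplification.
gpu_count = 2        # V100 GPUs used by the colleague
gpu_days = 4.0       # wall time for the full simulation on the GPUs
cpu_count = 9 * 2    # 9 nodes with 2 CPUs per node
cpu_days = 28.0      # extrapolated wall time for the full run on the CPUs

# Device-days needed to complete the same simulation on each platform.
gpu_device_days = gpu_count * gpu_days
cpu_device_days = cpu_count * cpu_days

# Number of CPUs that together match one GPU under ideal scaling.
cpus_per_gpu = cpu_device_days / gpu_device_days
print(f"One V100 is equivalent to about {cpus_per_gpu:.0f} CPUs")
```

With the numbers above this gives 63 CPUs per GPU, compared to the 3-4 I 
was expecting.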

   - I know PyFR runs very well on GPUs but are these results to be 
   expected?
   - I've been running a case that was meant to run on GPUs. Perhaps other 
   numerical methods are better suited to CPUs and the OpenMP backend?

Thanks,

Solal

Attachments: sd7003_1.ini, sd7003_2.ini, sd7003_3.ini
