Hi Nigel,

No problem. I've really just redone what the PyFR developers put together for their paper; it's great when people publish reproducible workflows. After this Wednesday I will be back to HPC work and will extend the benchmark with the Taylor-Green case and any updates for the new version of PyFR.
Also, I am interested in doing some visualisation for the SD7003 case. Is there any way I could get hold of the original geometry so that I can show, for instance, the Q-criterion together with the wing shape? The default .vtu output only contains the internal fluid field, and I can't select components such as boundaries or surfaces. (I have sketched the Q-criterion calculation I have in mind at the end of this message.)

Best wishes,
Robert

--
Dr Robert Sawko
Research Staff Member, IBM
Daresbury Laboratory
Keckwick Lane, Warrington
WA4 4AD, United Kingdom
--
Email (IBM): [email protected]
Email (STFC): [email protected]
Phone (office): +44 (0) 1925 60 3301
Phone (mobile): +44 778 830 8522
Profile page: http://researcher.watson.ibm.com/researcher/view.php?person=uk-RSawko
--

[email protected] wrote: -----
To: PyFR Mailing List <[email protected]>
From: nnunn
Sent by: [email protected]
Date: 03/28/2017 03:47PM
Cc: [email protected], [email protected]
Subject: Re: [pyfrmailinglist] Re: Multi-GPU per node runs with PyFR

Hi Robert - thanks for setting up the SD7003 example. I just tried a short MPI run (METIS + Windows 7) on my 4 Kepler Titans. It runs fine, keeping all four GPUs steady at over 95% utilization.

To the PyFR team: could you post the Gmsh geometry file used for extruding the SD7003 profile? I'd like to try running a 2D version over that profile.

Many thanks for making all this available,
Nigel

On Monday, March 13, 2017 at 8:51:48 PM UTC+11, Robert Sawko wrote:

Peter and Brian,

I am just forwarding this message as I thought it should have gone to the list. Thanks; I think you have answered all my initial questions about PyFR for now. I also finally sat down and compiled the SD7003 results from the two GPU clusters to which I have access, and I thought I'd share them. Let me summarise a few things:

* It seems to me that binding to sockets matters on IBM processors. Initially I was binding everything to socket 1 and I was getting either freezing behaviour or no scaling.
* With binding to cores and two ranks per socket I seem to be getting stable behaviour and scaling up to 16 nodes.
* I wasn't able to run with the `cuda-aware` switch, despite compiling Open MPI with CUDA support. This is something I am still working on.

There may still be issues with the clusters or the software stack, as both are in pretty early days in terms of operation, but the preliminary results look good. Let me know what you think. This is just a dry run of your code to prove that it can work in principle; I'm interested in looking into some of the details and pushing the development further.

This is the repository I created:
https://github.com/robertsawko/PyFR-bench

Best wishes,
Robert

--
Quantum sappers
http://en.wikipedia.org/wiki/Elitzur–Vaidman_bomb_tester
https://www.youtube.com/watch?v=lCu8OwNMjME
--
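
P.S. Since I mentioned the Q-criterion above: this is roughly the calculation I have in mind once I have the velocity field out of the .vtu file. It is only a sketch under my own assumptions: the velocity components are assumed to have been resampled onto a uniform grid (e.g. with ParaView's resampling filter) before taking gradients, and none of the names below come from PyFR itself.

```python
import numpy as np

def q_criterion(u, v, w, dx, dy, dz):
    """Q-criterion on a uniform grid: Q = 0.5 * (||Omega||^2 - ||S||^2),
    where S and Omega are the symmetric and antisymmetric parts of the
    velocity gradient tensor. u, v, w are 3D arrays of the velocity
    components; dx, dy, dz are the grid spacings along axes 0, 1, 2."""
    # Velocity gradient components via central differences
    dudx, dudy, dudz = np.gradient(u, dx, dy, dz)
    dvdx, dvdy, dvdz = np.gradient(v, dx, dy, dz)
    dwdx, dwdy, dwdz = np.gradient(w, dx, dy, dz)

    # Full gradient tensor, shape (3, 3, nx, ny, nz)
    grad = np.array([[dudx, dudy, dudz],
                     [dvdx, dvdy, dvdz],
                     [dwdx, dwdy, dwdz]])

    s = 0.5 * (grad + grad.transpose(1, 0, 2, 3, 4))      # strain-rate tensor
    omega = 0.5 * (grad - grad.transpose(1, 0, 2, 3, 4))  # rotation tensor

    # Contract over the two tensor indices at every grid point
    return 0.5 * ((omega**2).sum(axis=(0, 1)) - (s**2).sum(axis=(0, 1)))
```

I believe ParaView's gradient filter can compute Q directly as well, which is probably what I will end up using; the function above is just to make the definition explicit.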
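P.P.S. On the `cuda-aware` switch from my earlier message quoted above: a quick way to sanity-check whether the Open MPI build reports CUDA support at all is the ompi_info query suggested in the Open MPI FAQ. The small wrapper below is just my own convenience script, so treat it as a sketch; it assumes ompi_info from the build I launch PyFR with is on the PATH.

```python
import subprocess

def openmpi_has_cuda():
    """Return True if the local Open MPI build reports CUDA support.

    Runs `ompi_info --parsable --all` and looks for the
    mpi_built_with_cuda_support parameter, as suggested in the
    Open MPI FAQ."""
    out = subprocess.check_output(['ompi_info', '--parsable', '--all'],
                                  universal_newlines=True)
    for line in out.splitlines():
        if 'mpi_built_with_cuda_support:value' in line:
            return line.strip().endswith('true')
    return False

if __name__ == '__main__':
    print('CUDA-aware Open MPI:', openmpi_has_cuda())
```

Of course this only tells me that the library was built with CUDA support; it says nothing about whether device pointers are actually passed correctly at run time, which is where I suspect my problem lies.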
