Hi Nigel,

No problem. I've really just redone what the PyFR developers put together
for their paper; it's great when people publish reproducible workflows.
After this Wednesday I will be back to HPC work and will extend the benchmark
with the Taylor-Green case and any updates for the new version of PyFR.

Also, I am interested in doing some visualisation for the SD7003 case. Is there
any way I could get hold of the original geometry, so that I can show, for
instance, Q-criterion isosurfaces together with the wing shape? The default
.vtu output only contains the internal fluid field, and I can't select
components such as boundary conditions or surfaces.
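
Roughly, this pvpython sketch is what I have in mind for the Q-criterion
part; a minimal sketch assuming the export is called out.vtu and carries a
point array named "Velocity" (both names are assumptions on my part):

    # pvpython sketch: derive the Q-criterion from a PyFR .vtu export and
    # contour it. "out.vtu" and the "Velocity" array name are assumptions;
    # adjust both to match the actual export.
    from paraview.simple import *

    reader = OpenDataFile('out.vtu')

    # Gradient filter on the velocity field, asked to emit the Q-criterion.
    grad = GradientOfUnstructuredDataSet(Input=reader)
    grad.ScalarArray = ['POINTS', 'Velocity']
    grad.ComputeQCriterion = 1
    grad.QCriterionArrayName = 'Q'

    # Isosurface of Q; the iso-value is case-dependent.
    contour = Contour(Input=grad)
    contour.ContourBy = ['POINTS', 'Q']
    contour.Isosurfaces = [500.0]

    Show(contour)
    Render()

With the original geometry I could then overlay the wing surface on the
same scene.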

Best wishes,
Robert
--
Dr Robert Sawko
Research Staff Member, IBM
Daresbury Laboratory
Keckwick Lane, Warrington
WA4 4AD
United Kingdom
--
Email (IBM): [email protected]
Email (STFC): [email protected]
Phone (office): +44 (0) 1925 60 3301
Phone (mobile): +44 778 830 8522
Profile page:
http://researcher.watson.ibm.com/researcher/view.php?person=uk-RSawko
--

[email protected] wrote: -----
To: PyFR Mailing List <[email protected]>
From: nnunn 
Sent by: [email protected]
Date: 03/28/2017 03:47PM
Cc: [email protected], [email protected]
Subject: Re: [pyfrmailinglist] Re: Multi-GPU per node runs with PyFR

Hi Robert - thanks for setting up the SD7003 example. I just tried a short MPI 
run (METIS + Windows 7) on my four Kepler Titans. It runs fine, keeping all 
four GPUs steady at over 95% utilization.

To the PyFR team: can you post the Gmsh geometry file used for extruding the 
SD7003 profile? I'd like to try running a 2D version over that profile.

Many thanks for making all this available,
Nigel


On Monday, March 13, 2017 at 8:51:48 PM UTC+11, Robert Sawko wrote:

Peter and Brian, 
 
I am just forwarding this message as I thought it should have gone to 
the list. 
 
Thanks. I think you have answered all my initial questions about PyFR for now. 
Also, I finally sat down and compiled the SD7003 results from the two GPU 
clusters to which I have access, and I thought I'd share them. Let me 
summarise a few things. 
 
 * It seems that process binding matters on IBM processors. Initially I was 
 binding everything to socket 1 and got either freezing behaviour or no 
 scaling. 
 * With binding to core and two ranks per socket I get stable behaviour and 
 scaling up to 16 nodes (see the launch sketch after this list). 
 * I wasn't able to run with the `cuda-aware` option despite compiling 
 Open MPI with CUDA support. This is something I am still working on. 
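
For reference, this is roughly the launch I mean, wrapped in Python for
clarity; a minimal sketch assuming Open MPI's mpirun, PyFR on the PATH, and
placeholder mesh and config file names:

    # Minimal launch sketch: assumes Open MPI's mpirun and PyFR on PATH;
    # sd7003.pyfrm and sd7003.ini are placeholder file names.
    import subprocess

    # Check whether the MPI library was built with CUDA support; this
    # prints "...mpi_built_with_cuda_support:value:true" if it was.
    subprocess.run(
        'ompi_info --parsable --all | grep mpi_built_with_cuda_support:value',
        shell=True, check=False)

    subprocess.run([
        'mpirun',
        '--map-by', 'ppr:2:socket',  # two MPI ranks per socket
        '--bind-to', 'core',         # pin each rank to its own core
        '--report-bindings',         # log the realised bindings
        'pyfr', 'run', '-b', 'cuda',
        'sd7003.pyfrm', 'sd7003.ini',
    ], check=True)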
 
There may still be issues with the clusters or the software stack, as both are 
in the early days of operation, but the preliminary results look good. Let me 
know what you think. 
 
This is just a dry run of your code to prove that it can work in principle. 
I'm interested in looking into some of the details and pushing the 
development further. 
 
This is the repository I created: 
https://github.com/robertsawko/PyFR-bench 
 
Best wishes, 
Robert 
--  
Quantum sappers 
http://en.wikipedia.org/wiki/Elitzur–Vaidman_bomb_tester 
https://www.youtube.com/watch?v=lCu8OwNMjME 
  
 
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
