Monday, 3 November 2014


Hi Freddie,

I’m still looking into the OpenCL backend, but I think I was finally able to 
get the OpenMP backend up and running under OS X. I’ve collected a few relative 
performance benchmark results that some in the community might be interested 
in. Ideally, I would like to run the same test against the CUDA and OpenCL 
backends for comparison.

My test system was a modest 2.4 GHz (i7-3630QM) quad-core Intel Core i7 Ivy 
Bridge processor with 16GB 1600 MHz DDR3 RAM running OS X v10.10. I modified 
the compiler flags in ${PYFR_ROOT}/pyfr/backends/openmp/compiler.py, replacing 
the -march=native option with -mtune=native. This change isn’t necessary for 
the clang-omp compiler, but was necessary for the gcc-4.9 compiler I tried. To 
keep things consistent, I left that option changed across compilers. I then 
took the fastest run in my test matrix (i.e. the very last case), re-ran it 
with -mtune=native reverted to -march=native, and observed no change in 
runtime.

I ran the couette_flow_2d example case using a single partition, invoking 
pyfr-sim as follows:

pyfr-sim -p -n 100 -b openmp run couette_flow_2d.pyfrm couette_flow_2d.ini

The results for the openmp backend tests follow:

Backend                          Compiler   Environment        Time
-------------------------------  ---------  -----------------  -------
cblas-mt = Accelerate Framework  gcc-4.9    OMP_NUM_THREADS=4  07m 48s
cblas-st = Accelerate Framework  gcc-4.9    OMP_NUM_THREADS=4  10m 52s
cblas-mt = Accelerate Framework  gcc-4.9    OMP_NUM_THREADS=8  12m 20s
cblas-st = OpenBLAS 0.2.12       gcc-4.9    OMP_NUM_THREADS=4  11m 01s
cblas-mt = OpenBLAS 0.2.12       gcc-4.9    OMP_NUM_THREADS=4  07m 46s
cblas-mt = Accelerate Framework  clang-omp  OMP_NUM_THREADS=4  04m 24s
cblas-st = Accelerate Framework  clang-omp  OMP_NUM_THREADS=4  04m 23s
cblas-mt = OpenBLAS 0.2.12       clang-omp  OMP_NUM_THREADS=4  04m 12s
cblas-st = OpenBLAS 0.2.12       clang-omp  OMP_NUM_THREADS=4  04m 10s

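For anyone who wants to compare these numbers as ratios rather than raw times, 
here is a minimal Python 3 sketch that converts the times in the table above to 
seconds and expresses each run relative to the fastest one. The names RESULTS, 
to_seconds and relative_to_fastest are my own for illustration and are not part 
of PyFR.

```python
import re

# Wall-clock times copied from the table above:
# (BLAS threading, BLAS library, compiler, OMP_NUM_THREADS, time)
RESULTS = [
    ("cblas-mt", "Accelerate Framework", "gcc-4.9", 4, "07m 48s"),
    ("cblas-st", "Accelerate Framework", "gcc-4.9", 4, "10m 52s"),
    ("cblas-mt", "Accelerate Framework", "gcc-4.9", 8, "12m 20s"),
    ("cblas-st", "OpenBLAS 0.2.12", "gcc-4.9", 4, "11m 01s"),
    ("cblas-mt", "OpenBLAS 0.2.12", "gcc-4.9", 4, "07m 46s"),
    ("cblas-mt", "Accelerate Framework", "clang-omp", 4, "04m 24s"),
    ("cblas-st", "Accelerate Framework", "clang-omp", 4, "04m 23s"),
    ("cblas-mt", "OpenBLAS 0.2.12", "clang-omp", 4, "04m 12s"),
    ("cblas-st", "OpenBLAS 0.2.12", "clang-omp", 4, "04m 10s"),
]


def to_seconds(text):
    """Convert a string such as '07m 48s' to a number of seconds."""
    match = re.fullmatch(r"(\d+)m (\d+)s", text)
    if match is None:
        raise ValueError("unrecognised time: %r" % text)
    minutes, seconds = map(int, match.groups())
    return 60 * minutes + seconds


def relative_to_fastest(results):
    """Return (row, seconds, seconds / fastest) for every row."""
    times = [to_seconds(row[-1]) for row in results]
    fastest = min(times)
    return [(row, secs, secs / fastest) for row, secs in zip(results, times)]


if __name__ == "__main__":
    for row, secs, ratio in relative_to_fastest(RESULTS):
        threading, blas, compiler, nthreads, _ = row
        print("%-8s %-22s %-9s threads=%d %4ds  x%.2f"
              % (threading, blas, compiler, nthreads, secs, ratio))
```

Normalising against the fastest run keeps every ratio at or above 1, which 
makes the gap between the compilers easy to see at a glance.
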
The third case shows that hyperthreading is a no-no, as I’m sure you’re already 
aware: going from 4 to 8 threads raised the runtime from 07m 48s to 12m 20s. I 
was actually surprised that Apple’s Accelerate Framework was less performant 
than OpenBLAS, and I’ve convinced myself that gcc-4.9 (v4.9.2) is garbage.

To install an OpenMP version of clang I used homebrew and this brew recipe 
<https://github.com/Homebrew/homebrew/pull/33278>. I also had to compile and 
install Intel’s OpenMP Runtime Library 
<https://www.openmprtl.org/download#stable-releases>. I downloaded the version 
listed at the top of the table (Version 20140926), unpacked it, and invoked 
make with make compiler=clang. Next, I moved the *.dylib and *.h files to 
their respective lib and include directories under /usr/local. Lastly, I set 
C_INCLUDE_PATH and CPLUS_INCLUDE_PATH to include /usr/local/include, and 
DYLD_LIBRARY_PATH to include /usr/local/lib.
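
In case it helps anyone repeating these steps, here is a small Python sketch 
for confirming that the three variables contain the directories mentioned 
above before launching pyfr-sim. The helper names path_var_contains and 
check_openmp_environment are hypothetical, and the check only inspects the 
environment; it does not verify that the library or headers are present.

```python
import os


def path_var_contains(var, directory, environ=None):
    """Return True if `directory` is an entry of the path-like variable `var`."""
    environ = os.environ if environ is None else environ
    wanted = os.path.normpath(directory)
    entries = environ.get(var, "").split(os.pathsep)
    # Skip empty entries and ignore differences such as a trailing slash.
    return any(e and os.path.normpath(e) == wanted for e in entries)


def check_openmp_environment(environ=None):
    """Report whether each variable mentioned above holds the expected path."""
    expected = {
        "C_INCLUDE_PATH": "/usr/local/include",
        "CPLUS_INCLUDE_PATH": "/usr/local/include",
        "DYLD_LIBRARY_PATH": "/usr/local/lib",
    }
    return {var: path_var_contains(var, directory, environ)
            for var, directory in expected.items()}


if __name__ == "__main__":
    for var, ok in check_openmp_environment().items():
        print("%-20s %s" % (var, "ok" if ok else "missing"))
```

Passing a plain dict as environ makes it possible to try the check without 
touching the real shell environment.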

Now something has recently changed with either pycuda under OS X or PyFR, 
because initiating a similar test using the cuda backend results in the 
following traceback: 

pyfr-sim -p -n 100 -b cuda run couette_flow_2d.pyfrm couette_flow_2d.ini

Traceback (most recent call last):
  File "/Users/zdavis/Applications/PyFR/pyfr/scripts/pyfr-sim", line 112, in <module>
    main()
  File "/usr/local/lib/python2.7/site-packages/mpmath/ctx_mp.py", line 1301, in g
    return f(*args, **kwargs)
  File "/Users/zdavis/Applications/PyFR/pyfr/scripts/pyfr-sim", line 82, in main
    backend = get_backend(args.backend, cfg)
  File "/Users/zdavis/Applications/PyFR/pyfr/backends/__init__.py", line 11, in get_backend
    return subclass_where(BaseBackend, name=name.lower())(cfg)
  File "/Users/zdavis/Applications/PyFR/pyfr/backends/cuda/base.py", line 33, in __init__
    from pycuda.autoinit import context
  File "/usr/local/lib/python2.7/site-packages/pycuda/autoinit.py", line 4, in <module>
    cuda.init()
pycuda._driver.RuntimeError: cuInit failed: no device

I remember when first installing and running PyFR (~v0.2) this worked just fine 
using the default backend.  I’m curious what has changed.
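
One way to narrow this down might be to call pycuda directly, outside of PyFR. 
The sketch below is only a diagnostic idea, not something taken from PyFR: it 
repeats the cuda.init() call from the traceback above and then asks pycuda how 
many devices it can see. The function name count_cuda_devices is hypothetical. 
If this also fails, the problem presumably lies with pycuda or the CUDA driver 
rather than with PyFR.

```python
def count_cuda_devices():
    """Return the number of CUDA devices pycuda reports, or None on failure."""
    try:
        import pycuda.driver as cuda
    except ImportError as exc:
        print("pycuda could not be imported:", exc)
        return None

    try:
        # The same call that fails inside pycuda.autoinit in the traceback.
        cuda.init()
    except Exception as exc:
        print("cuda.init() failed:", exc)
        return None

    return cuda.Device.count()


if __name__ == "__main__":
    print("CUDA devices visible to pycuda:", count_cuda_devices())
```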

Best Regards,


Zach

> On Nov 3, 2014, at 2:06 PM, Freddie Witherden <[email protected]> wrote:
> 
> On 03/11/14 21:57, Zach Davis wrote:
>> It appears that didn’t work—PyFR complains about being unable to find
>> the OpenMP header file.  Looking at the compiler.py file in
>> ${PYFR_ROOT}/pyfr/backends/openmp/ on line 55 you are using the value
>> of cc to get the path of the compiler to be used.  Unfortunately, on
>> OS X this is a symbolic link to clang.
>> 
>> Is there an environment variable that PyFR supports that will allow
>> you to change which c compiler is used?  Setting the shell
>> environment variable CC is ignored, so I was hoping there might be an
>> alternative way to explicitly specify the compiler PyFR uses.  I have
>> both gcc-4.9 (4.9.2) and have built an OpenMP compatible version of
>> clang which I’ve named clang-omp to test; however, I can’t figure out
>> how to direct PyFR to use those, rather than the cc symbolic link
>> (which points to clang bundled with Apple’s XCode command line
>> tools).
>> 
>> Note, you also outlined an example of getting the copy of gcc-4.8
>> installed on your mac to compile a simple Hello World example. I
>> believe if you replaced the -march=native option with something like
>> -msse4.2 or -mtune=native, then the code snippet compiles without the
>> error.  Although, I’m not certain that is relevant to what you were
>> pointing out.
> 
> The compiler used by PyFR can be changed in the configuration file.  For
> example on my Linux system I have:
> 
> [backend-openmp]
> cc = gcc-4.8.3
> 
> if you are going to be experimenting you might want to put
> 
> [backend-openmp]
> cc = ${CC}
> 
> and then you can simply export CC in your shell to be your desired
> compiler.  We do not currently support expansions such as:
> 
> [backend-openmp]
> cc = gcc -fsomething
> 
> the 'cc' field must be an executable.  Similarly, we do not -- currently
> -- permit one to append arguments to the compiler invocation.  (Although
> this can be trivially accomplished with a one-line shell script
> should any user require this feature.)
> 
> With regards to GCC on the Mac, yes it is -march=native that is causing
> the trouble.  I would, however, rather that compilers not try and emit
> assembly instructions which they know can not be assembled on the
> current system!
> 
> Regards, Freddie.
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "PyFR Mailing List" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To post to this group, send an email to [email protected].
> Visit this group at http://groups.google.com/group/pyfrmailinglist.
> For more options, visit https://groups.google.com/d/optout.