Hi Zach,

Many thanks for all this, and your continued interest in the project!

A few general points:

1.) If you are interested in comparisons between different backends, you may 
want to check this out: http://arxiv.org/abs/1409.0405

2.) When looking at the absolute (and even relative) performance of different 
backends, the very small 2D example test cases are somewhat pathological: the 
matrices that end up being repeatedly multiplied together are not very big, and 
hence are unlikely to get a good fraction of peak out of dgemm.
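To make this concrete, one can compare the effective GFLOP/s of a tiny dgemm 
against a large one. A rough sketch using NumPy (assuming a BLAS-backed build; 
the sizes and repetition counts here are arbitrary, not taken from PyFR):

```python
import time
import numpy as np

def gflops(n, reps):
    """Measure the effective GFLOP/s of an n x n double-precision matmul."""
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    a @ b  # warm up (thread pool start-up, page faults, etc.)
    start = time.perf_counter()
    for _ in range(reps):
        a @ b
    elapsed = time.perf_counter() - start
    # A dense n x n matmul performs ~2*n^3 floating point operations
    return 2.0 * n**3 * reps / elapsed / 1e9

small = gflops(8, 5000)   # tiny operator: per-call overhead dominates
large = gflops(512, 10)   # large operator: approaches the peak dgemm rate
print('8x8 matmul:   %.2f GFLOP/s' % small)
print('512x512 matmul: %.2f GFLOP/s' % large)
```

The small case typically lands at a small fraction of what the large case 
achieves on the same machine, which is why tiny 2D test cases say little about 
a backend's peak throughput.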

3.) Regarding the failure of the CUDA backend: what GPU and version of CUDA do 
you have on the Mac?

Cheers

Peter



On 4 Nov 2014, at 02:34, Zach Davis <[email protected]> wrote:

Monday, 3 November 2014



Hi Freddie,


I’m still looking into the OpenCL backend, but I think I was finally able to 
get the OpenMP backend up and running under OS X. I’ve collected a few relative 
performance benchmark results that some in the community might find 
interesting. Ideally, I would like to run the same test to obtain comparable 
results for the CUDA and OpenCL backends.

My test system was a modest 2.4 GHz quad-core Intel Core i7 Ivy Bridge 
(i7-3630QM) processor with 16 GB of 1600 MHz DDR3 RAM running OS X v10.10. I 
modified the compiler flags in ${PYFR_ROOT}/pyfr/backends/openmp/compiler.py, 
replacing the -march=native option with -mtune=native. This change isn’t 
necessary for the clang-omp compiler, but was necessary for the gcc-4.9 
compiler I tried. To keep things consistent, I left that option changed across 
compilers. I took the fastest run in my test matrix (i.e. the very last case), 
re-ran it with -mtune=native reverted back to -march=native, and observed no 
change in runtime.

I ran the couette_flow_2d example case using a single partition initiating 
pyfr-sim as follows:

pyfr-sim -p -n 100 -b openmp run couette_flow_2d.pyfrm couette_flow_2d.ini


The results for the openmp backend tests follow:

Backend                          Compiler   Environment         Time
cblas-mt = Accelerate Framework  gcc-4.9    OMP_NUM_THREADS=4   07m 48s
cblas-st = Accelerate Framework  gcc-4.9    OMP_NUM_THREADS=4   10m 52s
cblas-mt = Accelerate Framework  gcc-4.9    OMP_NUM_THREADS=8   12m 20s
cblas-st = OpenBLAS 0.2.12       gcc-4.9    OMP_NUM_THREADS=4   11m 01s
cblas-mt = OpenBLAS 0.2.12       gcc-4.9    OMP_NUM_THREADS=4   07m 46s
cblas-mt = Accelerate Framework  clang-omp  OMP_NUM_THREADS=4   04m 24s
cblas-st = Accelerate Framework  clang-omp  OMP_NUM_THREADS=4   04m 23s
cblas-mt = OpenBLAS 0.2.12       clang-omp  OMP_NUM_THREADS=4   04m 12s
cblas-st = OpenBLAS 0.2.12       clang-omp  OMP_NUM_THREADS=4   04m 10s


The third case shows that hyperthreading is a no-no, as I’m sure you’re 
already aware. I was actually surprised that Apple’s Accelerate Framework was 
less performant than OpenBLAS, and I’ve convinced myself that gcc-4.9 (v4.9.2) 
is garbage. To install an OpenMP version of clang I used Homebrew and this brew 
recipe (https://github.com/Homebrew/homebrew/pull/33278). I also had to compile 
and install Intel’s OpenMP Runtime Library 
(https://www.openmprtl.org/download#stable-releases). I downloaded the version 
listed at the top of the table (Version 20140926), unpacked it, and invoked 
make with make compiler=clang. Next, I moved the *.dylib and *.h files to their 
respective lib and include directories under /usr/local. Lastly, I set 
C_INCLUDE_PATH and CPLUS_INCLUDE_PATH to include /usr/local/include, and 
DYLD_LIBRARY_PATH to include /usr/local/lib.

Now something has recently changed with either pycuda under OS X or PyFR, 
because initiating a similar test using the cuda backend results in the 
following traceback:

pyfr-sim -p -n 100 -b cuda run couette_flow_2d.pyfrm couette_flow_2d.ini

Traceback (most recent call last):
  File "/Users/zdavis/Applications/PyFR/pyfr/scripts/pyfr-sim", line 112, in <module>
    main()
  File "/usr/local/lib/python2.7/site-packages/mpmath/ctx_mp.py", line 1301, in g
    return f(*args, **kwargs)
  File "/Users/zdavis/Applications/PyFR/pyfr/scripts/pyfr-sim", line 82, in main
    backend = get_backend(args.backend, cfg)
  File "/Users/zdavis/Applications/PyFR/pyfr/backends/__init__.py", line 11, in get_backend
    return subclass_where(BaseBackend, name=name.lower())(cfg)
  File "/Users/zdavis/Applications/PyFR/pyfr/backends/cuda/base.py", line 33, in __init__
    from pycuda.autoinit import context
  File "/usr/local/lib/python2.7/site-packages/pycuda/autoinit.py", line 4, in <module>
    cuda.init()
pycuda._driver.RuntimeError: cuInit failed: no device

I remember that when I first installed and ran PyFR (~v0.2) this worked just 
fine using the default backend. I’m curious what has changed.

Best Regards,


Zach

On Nov 3, 2014, at 2:06 PM, Freddie Witherden <[email protected]> wrote:

On 03/11/14 21:57, Zach Davis wrote:
It appears that didn’t work; PyFR complains about being unable to find
the OpenMP header file.  Looking at compiler.py in
${PYFR_ROOT}/pyfr/backends/openmp/, on line 55 you use the value
of cc to get the path of the compiler to be used.  Unfortunately, on
OS X this is a symbolic link to clang.

Is there an environment variable that PyFR supports that will allow
you to change which C compiler is used?  Setting the shell
environment variable CC is ignored, so I was hoping there might be an
alternative way to explicitly specify the compiler PyFR uses.  I have
both gcc-4.9 (4.9.2) and an OpenMP-compatible build of clang, which
I’ve named clang-omp, to test; however, I can’t figure out how to
direct PyFR to use those rather than the cc symbolic link (which
points to the clang bundled with Apple’s Xcode command line tools).

Note, you also outlined an example of getting the copy of gcc-4.8
installed on your Mac to compile a simple Hello World example.  I
believe that if you replace the -march=native option with something
like -msse4.2 or -mtune=native, the code snippet compiles without the
error, though I’m not certain that is relevant to what you were
pointing out.

The compiler used by PyFR can be changed in the configuration file.  For
example on my Linux system I have:

[backend-openmp]
cc = gcc-4.8.3

if you are going to be experimenting you might want to put

[backend-openmp]
cc = ${CC}

and then you can simply export CC in your shell to be your desired
compiler.  We do not currently support expansions such as:

[backend-openmp]
cc = gcc -fsomething

the 'cc' field must be an executable.  Similarly, we do not -- currently
-- permit one to append arguments to the compiler invocation.  (Although
this can be trivially accomplished with a one-line shell script should
any user require this feature.)
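As an aside, the cc = ${CC} form relies on the value being expanded from the
environment.  A minimal sketch of how such expansion can be done with Python's
standard configparser and os.path.expandvars (hypothetical: PyFR's actual
config-handling code may differ):

```python
import configparser
import os

# Hypothetical sketch: a '[backend-openmp] cc = ${CC}' entry expanded from
# the environment.  PyFR's real config code may do this differently.
ini = """
[backend-openmp]
cc = ${CC}
"""

os.environ['CC'] = 'clang-omp'  # set in the shell to the desired compiler

cfg = configparser.ConfigParser()
cfg.read_string(ini)

# The default (basic) interpolation leaves ${CC} untouched, so it can be
# substituted afterwards with os.path.expandvars
cc = os.path.expandvars(cfg.get('backend-openmp', 'cc'))
print(cc)  # -> clang-omp
```

Exporting CC in the shell before launching pyfr-sim then selects the compiler
without editing the configuration file.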

With regards to GCC on the Mac, yes, it is -march=native that is causing
the trouble.  I would, however, rather that compilers not try to emit
assembly instructions which they know cannot be assembled on the
current system!

Regards, Freddie.

--
You received this message because you are subscribed to the Google Groups "PyFR 
Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send an email to [email protected].
Visit this group at http://groups.google.com/group/pyfrmailinglist.
For more options, visit https://groups.google.com/d/optout.

