Hi Freddie,

You’re right; it was user error. The CUDA installation I had was 6.0.48, and clicking the “Check For Updates” button in the CUDA System Preferences pane reported that no updates were available. Looking at NVIDIA’s website, however, I noticed that version 6.5 had been released. Installing it and reinstalling PyCUDA resolved the issue. I’ll work on a larger model to make comparisons while investigating the OpenCL issue further. Thanks again for all of your expertise and feedback. I appreciate all the great work you and your team are doing.
Best Regards,

Zach

> On Nov 4, 2014, at 1:52 PM, Freddie Witherden <[email protected]> wrote:
>
> Hi Zach,
>
> On 04/11/14 21:16, Zach Davis wrote:
>> Thanks for your input. It’s these nuances regarding PyFR’s use that
>> make it all the more familiar as I uncover them, so thanks for
>> highlighting where my experiment may have gone astray. Peter, I took a
>> look at the paper you provided when it was first announced, so I’m
>> familiar with what sort of comparable performance to expect—I guess I
>> was hoping to realize these same trends myself as a means to get more
>> acquainted with PyFR. The input and experience from both of you has
>> helped in that regard.
>>
>> With regards to PyCUDA on OS X 10.10, importing the pycuda.autoinit
>> module gives me the following stack trace:
>>
>> Python 2.7.8 (default, Oct 17 2014, 18:21:39)
>> [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.51)] on darwin
>> Type "help", "copyright", "credits" or "license" for more information.
>>>>> import pycuda.autoinit
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "/usr/local/lib/python2.7/site-packages/pycuda/autoinit.py", line 4, in <module>
>>     cuda.init()
>> pycuda._driver.RuntimeError: cuInit failed: no device
>
> This suggests that either PyCUDA or CUDA is not set up correctly. If
> regular CUDA applications written in C/C++ work without issue then you
> should try recompiling PyCUDA.
>
>> This is with a meager NVIDIA GT 650m (1024 MB VRAM) card, which is why
>> I resorted to the simple 2D examples. I’ve got a Tesla K5000 in a
>> workstation right next to me, so perhaps I’ll use it for some more
>> testing; though, I was specifically interested in setting up and
>> running PyFR under OS X, which I realize is probably a very niche use
>> case.
>> I was particularly interested in the OpenMP backend, because I can
>> imagine someone may have a model that wouldn’t fit in the available
>> memory of the GPUs we currently provide, so better understanding the
>> performance trade-off of running on a GPU cluster as opposed to a more
>> traditional cluster of CPUs was worthwhile to me.
>
> If you switch from double to single precision the card should be quite
> capable. The 1024 MiB of memory behaves more like 2048 MiB and the
> FLOP/S increase by a factor of ~24 or so. I expect this is enough to
> run some interesting 3D cases (even if they're not scale resolving).
>
> Regards, Freddie.
>
> --
> You received this message because you are subscribed to the Google Groups
> "PyFR Mailing List" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send an email to [email protected].
> Visit this group at http://groups.google.com/group/pyfrmailinglist.
> For more options, visit https://groups.google.com/d/optout.
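
For anyone finding this thread in the archives: the double-to-single precision switch Freddie suggests is made in the [backend] section of the PyFR .ini configuration file. A minimal sketch of the relevant fragment (option names per the PyFR documentation; the mesh, solver, and time-integration sections are omitted here):

```ini
; Run the simulation in single rather than double precision,
; roughly halving memory use per value and, on consumer GPUs
; like the GT 650m, greatly increasing available FLOP/S.
[backend]
precision = single

; CUDA-specific settings; device-id selects which GPU to use
; (an integer index, or a policy such as round-robin).
[backend-cuda]
device-id = 0
```

The same `precision` option applies regardless of which backend (CUDA, OpenCL, or OpenMP) is selected on the command line.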
