Sunday, 2 November 2014

Hi Freddie,

I'm giving your tuning idea a go.  It appears to be a somewhat slow and
compute-intensive process.  One question I have about this tune
executable is whether it operates on the installed files or on those
compiled in the build directory (i.e. do I need to run make install
again after it has completed)?
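
For reference, here is a sketch of the workflow as I understand it.
The clBLAS build path is hypothetical, and my reading that the tuner's
output is a kernel database consumed at run time through the same
environment variable is an assumption on my part:

```shell
# Hypothetical clBLAS build tree -- substitute your own path.
STAGING="$HOME/src/clBLAS/build/staging"

# The tuner writes its kernel database into the directory named by
# CLBLAS_STORAGE_PATH; assuming clBLAS also reads this variable at run
# time, it must be exported in the shell that later launches PyFR too.
export CLBLAS_STORAGE_PATH="$STAGING"
echo "$CLBLAS_STORAGE_PATH"

# Then, from inside the staging directory (not executed here):
#   ./tune --gemm --float --double
```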

Another question I have is what to provide for the OpenMP backend on
systems running OS X.  I set the cblas-mt parameter to
/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework
in one of the example case input files; however, PyFR gives a
RuntimeError: Unable to load cblas.  It was so much easier with CUDA!
Thanks for your time today.
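
In case it clarifies the question, this is the shape of config section
I have been trying.  The section name and the libBLAS.dylib location
inside Accelerate are guesses on my part; the broader assumption is
that cblas-mt must name an actual shared-library file rather than the
vecLib.framework directory itself:

```ini
; Hypothetical [backend-openmp] section; the libBLAS.dylib path below
; is an assumed location inside the Accelerate framework, not verified.
[backend-openmp]
cblas-mt = /System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Versions/Current/libBLAS.dylib
```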

Best Regards,


Zach

On Sun, Nov 2, 2014 at 3:48 PM, Freddie Witherden <[email protected]>
wrote:

> Hi Zach,
>
> On 02/11/14 23:38, Zach Davis wrote:
> > Hi Freddie,
> >
> > Setting the PYFR_LIBRARY_PATH environment variable now gets me to the
> > point where clBLAS is found.  I now get a status code of -54.  Stack
> > trace follows:
>
> Now we're getting somewhere!  That error corresponds to -54 =
> CL_INVALID_WORK_GROUP_SIZE.  It is almost certainly an issue with clBLAS
> rather than PyFR (we do not control the workgroup size in clBLAS
> functions).
>
> However, there is something you can try.  In the "staging" directory
> where clBLAS was built there is a flaky utility called "tune".  Running:
>
>   export CLBLAS_STORAGE_PATH=`pwd`
>   ./tune --gemm --float --double
>
> should attempt to auto-tune clBLAS on your platform.  This includes
> finding kernels and work group sizes that work on your hardware.  The
> results from auto-tuning vary considerably across hardware platforms:
> some benefit very little while others show substantial improvements.
> If this does not help then you'll probably want to file a bug report,
> although the clBLAS project is not as active as one would hope.
>
> Longer term, I am looking to support ViennaCL as an alternative
> matrix-multiplication provider for the PyFR OpenCL backend.
> Unfortunately, the current API exposed by ViennaCL is not quite flexible
> enough to make this possible.
>
> It is highly disappointing that the wider OpenCL community has not
> gotten behind clBLAS.  As a consequence it only really works well on AMD
> GPUs -- and even then it isn't great.  Indeed, as a broader point, I find
> it unlikely that OpenCL will achieve acceptance within the scientific
> community until BLAS-like functionality is integrated as an optional
> part of the standard.  Writing decent and portable level 3 BLAS kernels
> is just too difficult for one project.
>
> Regards, Freddie.
>
> --
> You received this message because you are subscribed to the Google Groups
> "PyFR Mailing List" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send an email to [email protected].
> Visit this group at http://groups.google.com/group/pyfrmailinglist.
> For more options, visit https://groups.google.com/d/optout.
>
