Hello, I am wondering if someone can provide a bit more description of these parameters to help optimize performance.
As far as I know, when using multiple GPUs I had to select local-rank for device-id and cuda-aware for mpi-type. When exactly should I use round-robin versus local-rank? And when should I use standard versus cuda-aware? How should the GiMMiK cutoff be selected, and how does it affect accuracy and performance? I believe block-1d and block-2d are determined by the GPU's specification, but I am not very familiar with CUDA, so could someone please elaborate a bit? For example, if I am running PyFR with two Tesla K80s in parallel, what block sizes should I use for the 1D and 2D pointwise kernels?

For reference, the documentation parameterises the CUDA backend with:

1. device-id — method for selecting which device(s) to run on: *int* | round-robin | local-rank
2. gimmik-max-nnz — cutoff for GiMMiK in terms of the number of non-zero entries in a constant matrix: *int*
3. mpi-type — type of MPI library that is being used: standard | cuda-aware
4. block-1d — block size for one dimensional pointwise kernels: *int*
5. block-2d — block size for two dimensional pointwise kernels: *int*, *int*

Thanks a lot!

Junting Chen

--
You received this message because you are subscribed to the Google Groups "PyFR Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web, visit https://groups.google.com/d/msgid/pyfrmailinglist/cbe65aa4-1765-4fcf-a6ff-a641ef378e9d%40googlegroups.com.
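For context, here is a minimal sketch of where these options live in a PyFR `.ini` file for the setup described above (one MPI rank per GPU). The specific numeric values are illustrative assumptions only, not tuned recommendations, and cuda-aware should only be set if the MPI library was actually built with CUDA support:

```ini
[backend-cuda]
; one MPI rank per GPU on each node, so let local-rank pick the device
device-id = local-rank
; assumption: the MPI library is CUDA-aware; otherwise use "standard"
mpi-type = cuda-aware
; assumption: use GiMMiK for constant matrices with up to 512 non-zeros
gimmik-max-nnz = 512
; assumed block sizes for the 1D and 2D pointwise kernels (placeholders)
block-1d = 64
block-2d = 128, 1
```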
