OK, so the main point is to keep backward compatibility with
POCL_DEVICE_*.
As a first step, I will implement probe functions for pthread and basic
using the POCL_DEVICE environment variables.
Each device driver maintainer will then be able to implement the probe as
desired.
In short: there will be no noticeable change for legacy usage, but new
devices will be able to use dynamic detection easily.
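
To make the legacy path concrete, here is a rough sketch of what such a probe could look like. It assumes the variable is POCL_DEVICES and holds a space-separated list of driver names (e.g. "pthread pthread basic"); the function names are made up for illustration and are not existing pocl code.

```c
#include <stdlib.h>
#include <string.h>

/* Count how many times `name` appears as a whole space-separated token
   in `list`. Returns 0 when `list` is NULL or empty. */
unsigned count_tokens(const char *list, const char *name)
{
    unsigned count = 0;
    size_t name_len = strlen(name);

    if (list == NULL || name_len == 0)
        return 0;

    while (*list != '\0') {
        /* Skip separators, then measure the next token. */
        list += strspn(list, " ");
        size_t tok_len = strcspn(list, " ");
        if (tok_len == name_len && strncmp(list, name, tok_len) == 0)
            ++count;
        list += tok_len;
    }
    return count;
}

/* Hypothetical probe for the pthread driver: in the legacy case the
   device count comes from the environment only (assumed variable name). */
unsigned pthread_probe_sketch(void)
{
    return count_tokens(getenv("POCL_DEVICES"), "pthread");
}
```

A driver that supports real detection would replace the body of its probe with actual hardware discovery and keep the same return convention.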
On 04/04/2014 12:50 PM, Pekka Jääskeläinen wrote:
Hi Vincent,
On 04/04/2014 01:15 PM, Vincent Danjean wrote:
I think that, by default, the pthread driver should return the
number of usable cores in the current cpuset.
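
For reference, one possible way to obtain that number on Linux/glibc is to count the CPUs in the process affinity mask. This is only a sketch: the function name is hypothetical, and it relies on the GNU-specific sched_getaffinity and CPU_COUNT.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* needed for cpu_set_t and sched_getaffinity */
#endif
#include <sched.h>

/* Number of CPUs the calling process is allowed to run on, or -1 on
   error. Linux/glibc specific; the name is illustrative only. */
int usable_cpu_count(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}
```

Started under something like `taskset -c 0-3`, this should report 4 rather than the total number of cores in the machine.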
This is not a good idea in general as it generates one OpenCL
device per core/HW thread. Then the OpenCL app is required to use
multiple command queues to exploit all the HW threads in the CPU.
The multithreading of pthread is currently done at the granularity of
work-groups inside one device instance.
Then, an environment variable can override this default. It would
be good if:
- like all other pocl envvars, this one also starts with POCL_
- its content is similar to that of envvars with the same goal in
other environments. In particular, I am thinking of:
* OMP_NUM_THREADS to define the number of threads to use
* KMP_AFFINITY (Intel extension) that defines how logical thread
numbers are mapped onto physical cores. I think the pocl pthread
driver will need something similar (i.e. define which groups of
threads (on which physical cores) will form a workgroup, ...)
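
A minimal sketch of such an override, restricted to the simple "one positive integer" form of OMP_NUM_THREADS. The name POCL_NUM_THREADS is purely hypothetical here, as are the function names; an affinity setting in the spirit of KMP_AFFINITY is not covered.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a positive thread count from `value`. Returns `fallback` when
   the value is missing, malformed, or out of range. */
int parse_thread_count(const char *value, int fallback)
{
    char *end;
    long n;

    if (value == NULL || *value == '\0')
        return fallback;

    errno = 0;
    n = strtol(value, &end, 10);
    if (errno != 0 || *end != '\0' || n < 1 || n > INT_MAX)
        return fallback;
    return (int)n;
}

/* POCL_NUM_THREADS is a hypothetical name modelled on OMP_NUM_THREADS;
   `detected_cores` is the default used when it is unset or invalid. */
int pthread_thread_count_sketch(int detected_cores)
{
    return parse_thread_count(getenv("POCL_NUM_THREADS"), detected_cores);
}
```

Falling back to the detected default on any malformed value keeps a typo in the environment from disabling the device.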
pocl does not run a single work-item per thread in the pthread device
(although that would be easy to do).
That approach quickly becomes very expensive with larger WGs and does not
exploit all parallel resources efficiently.
Currently, work-items inside a single WG are statically parallelized by the
kernel compiler onto finer-granularity parallel HW (multi-issue HW and/or
SIMD instructions). Thread/task level parallelism is exploited at the
WG/kernel level.
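
As a simplified illustration of that split (not the actual pocl implementation): each thread below is handed a range of work-groups, and the work-items of one work-group run as a plain loop that a compiler is free to unroll or vectorize. All names, the fixed WG_SIZE, and the toy kernel (out = 2 * in) are assumptions for the sketch.

```c
#include <pthread.h>
#include <stddef.h>

#define WG_SIZE 4       /* work-items per work-group (illustrative) */
#define MAX_THREADS 16  /* upper bound for this sketch only */

struct wg_task {
    const float *in;
    float *out;
    size_t first_wg;  /* first work-group handled by this thread */
    size_t last_wg;   /* one past the last work-group */
};

/* All work-items of one work-group run as a plain loop: no thread per
   work-item, and the loop body is a candidate for SIMD code. */
static void run_work_group(const float *in, float *out, size_t wg)
{
    for (size_t lid = 0; lid < WG_SIZE; ++lid) {
        size_t gid = wg * WG_SIZE + lid;
        out[gid] = 2.0f * in[gid];
    }
}

/* Thread entry point: executes a contiguous range of work-groups. */
static void *worker(void *arg)
{
    struct wg_task *t = arg;
    for (size_t wg = t->first_wg; wg < t->last_wg; ++wg)
        run_work_group(t->in, t->out, wg);
    return NULL;
}

/* Splits num_wgs work-groups across num_threads threads.
   Returns 0 on success, -1 on error. */
int run_kernel_sketch(const float *in, float *out,
                      size_t num_wgs, size_t num_threads)
{
    pthread_t threads[MAX_THREADS];
    struct wg_task tasks[MAX_THREADS];
    size_t started = 0;
    int status = 0;

    if (num_threads == 0 || num_threads > MAX_THREADS)
        return -1;

    for (size_t i = 0; i < num_threads; ++i) {
        tasks[i].in = in;
        tasks[i].out = out;
        tasks[i].first_wg = num_wgs * i / num_threads;
        tasks[i].last_wg = num_wgs * (i + 1) / num_threads;
        if (pthread_create(&threads[i], NULL, worker, &tasks[i]) != 0) {
            status = -1;
            break;
        }
        ++started;
    }
    /* Wait only for the threads that were actually created. */
    for (size_t i = 0; i < started; ++i)
        pthread_join(threads[i], NULL);
    return status;
}
```

With this layout a single OpenCL device can keep every core busy from one command queue, as long as the kernel is launched with enough work-groups.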
=> if possible, using the same syntax and keywords (as much as
possible) would be great for users
Yes, if there are earlier conventions on naming the envs, it is
a good idea to try to mimic them.
Finally, the device init operation would also be called only on
clGetDeviceIDs/clCreateContext when a specific device is requested, in
order to speed up initialization and use fewer resources.
There can be an envvar to limit the platforms/devices scanned by the
enumerating functions (clIcdGetPlatformIDsKHR, clGetDeviceIDs, ...)
This could work well enough for backwards compatibility.
I.e., use POCL_DEVICES only for _limiting_ the set of probed
devices.
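
Roughly, the limiting behaviour could look like the sketch below: with no filter every driver is probed, otherwise only the drivers named in the filter. The struct, the function names, and the stand-in probes are hypothetical; the filter string is assumed to be the space-separated value of POCL_DEVICES.

```c
#include <stddef.h>
#include <string.h>

struct driver_sketch {
    const char *name;
    unsigned (*probe)(void);  /* returns the number of devices found */
};

/* Returns 1 if `name` is a whole space-separated token of `list`. */
static int list_contains(const char *list, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = list;

    if (name_len == 0)
        return 0;

    while ((p = strstr(p, name)) != NULL) {
        int starts = (p == list) || (p[-1] == ' ');
        int ends = (p[name_len] == '\0') || (p[name_len] == ' ');
        if (starts && ends)
            return 1;
        p += 1;
    }
    return 0;
}

/* Probes every driver when `filter` is NULL, otherwise only the drivers
   named in `filter`. Returns the total number of devices found. */
unsigned probe_drivers(const struct driver_sketch *drivers, size_t n,
                       const char *filter)
{
    unsigned total = 0;

    for (size_t i = 0; i < n; ++i) {
        if (filter != NULL && !list_contains(filter, drivers[i].name))
            continue;
        total += drivers[i].probe();
    }
    return total;
}

/* Stand-in probes for illustration only. */
static unsigned fake_pthread_probe(void) { return 1; }
static unsigned fake_basic_probe(void) { return 1; }

static const struct driver_sketch example_drivers[] = {
    {"pthread", fake_pthread_probe},
    {"basic", fake_basic_probe},
};
```

Passing `getenv("POCL_DEVICES")` as the filter would give the behaviour described above: an unset variable probes everything, a set one restricts probing to the listed drivers.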
--
Clément
------------------------------------------------------------------------------
_______________________________________________
pocl-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pocl-devel