Hey,
Under the hood, ViennaCL uses a database, which leads to device-specific
kernels for optimal performance :-) If the device is not in the database, a
fallback is used. Unfortunately, the fallback's requirement in terms
of local work group size may be beyond your hardware capability... whi
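
To illustrate the local work group size problem mentioned above, here is a minimal sketch of clamping a requested local size to what a device reports. This is not ViennaCL's actual code: `LaunchConfig` and `clamp_launch_config` are hypothetical names, and the device limit is assumed to have been queried beforehand, e.g. via `clGetDeviceInfo` with `CL_DEVICE_MAX_WORK_GROUP_SIZE` or `clGetKernelWorkGroupInfo` with `CL_KERNEL_WORK_GROUP_SIZE`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper type, not part of ViennaCL.
struct LaunchConfig {
  std::size_t local_size;   // work items per work group
  std::size_t global_size;  // total work items, a multiple of local_size
};

// Clamp the requested local size to the device limit and round the
// global size up to the next multiple of the resulting local size.
// device_max_local is assumed to come from an OpenCL query made elsewhere.
LaunchConfig clamp_launch_config(std::size_t requested_local,
                                 std::size_t device_max_local,
                                 std::size_t num_work_items) {
  // Treat a reported limit of 0 as 1 so that the division below is safe.
  const std::size_t max_local = std::max<std::size_t>(device_max_local, 1);
  const std::size_t local =
      std::min(std::max<std::size_t>(requested_local, 1), max_local);
  assert(local >= 1 && local <= max_local);

  // Launch at least one work group, even for an empty problem.
  const std::size_t groups =
      std::max<std::size_t>((num_work_items + local - 1) / local, 1);
  return LaunchConfig{local, groups * local};
}
```

With a requested local size of 128 and a device limit of 1, this yields a local size of 1. Note that the kernel itself then has to guard against work item ids beyond the problem size, and a kernel written for a fixed local size (e.g. a reduction in local memory) would still need adapting.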
Hi Samy,
> Are there any suggestions for APU programming? Especially with shared
> memory HSA support? Can the ViennaCL library handle this type of
> device?
you can run everything in ViennaCL on APUs just like for GPUs. However,
we do not yet exploit all the features of APUs, so there are
Hi Philippe,
hmm, so it seems like this is the new default behavior on Mac OS:
restricting the local work size to 1. This shows up in the nightly tests
on a pretty dated Mac OS 10.6, for which I incorrectly assumed this to
be an issue with old hardware. Apparently, this also shows up on current