Hi hi,

>         Yesterday, I've realized that Toby and I didn't have the same
>         vendor id
>         for our Intel integrated GPU.
>         I'm not sure of what the vendor id is supposed to represent, then.
>
>
>     Looks like a unique identifier within each SDK. Not portable across
>     SDKs...
>
>
>
> Unfortunately, this doesn't even seem to be the case! Toby and I both
> use Beignet, but end up with a different Vendor ID...!   I could also
> point to this stackoverflow question:
> http://stackoverflow.com/questions/8146056/how-to-programmatically-discover-specific-gpu-on-platform-with-multiple-gpus-op
> According to what that person gets, we can't assume a one-to-one mapping
> between vendor ids and platforms.
>
> ... Device ATI Radeon HD 5770[AMD]: vendorId[1021b00] ...
> ... Device ATI Radeon HD 5770[AMD]: vendorId[2021b00] ...

Grml, so the only conclusion is that the vendor ID is totally useless?


>         Having
>         (CL_DEVICE_TYPE_GPU, haswell) on beignet or window's sdk
>         wouldn't make
>         any difference from our point of view, and having
>         (CL_DEVICE_TYPE_CPU,
>         haswell) on Intel's SDK or AMD's wouldn't make any difference
>         either.
>
>
>     Why should we treat them equally? I can well imagine that different
>     compiler backends work differently, so I'd actually *expect* that
>     the best performance on these SDKs is obtained with different
>     configurations. For example, a LLVM-backend vs. a non-LLVM-backend
>     is unlikely to behave similarly with such an SDK.
>
>
> Well, I would also expect that... but I would expect the worse
> performance coming from missing optimizations (no automatic loop
> unrolling, no auto-vectorization, etc...).

Not necessarily, as different compilers might take different approaches
to vectorization. One compiler may decide to vectorize only the vector
data types (double2, etc.).

> My insight is that
> auto-tuning the same device for two different SDKs is conceptually
> similar to auto-tuning the same device for two versions of the same SDK.

Agreed.


> Now, do we also want to store the platform version in the builtin
> database? We could, but it will certainly involve a pretty complicated
> fallback mechanism, which will be practically always used because of the
> fragmentation of the SDK versions.

Not for 1.6.0. What I have in mind is that we shouldn't make this
'database' so restrictive that we suffer from it later. In other
words, we should be reasonably prepared for regular user benchmark logs
coming in, particularly once the benchmark GUI is released. It would
be a waste if we couldn't make use of such valuable data ;-)



>         This would also prevent some headache when populating the database!
>
>
>     Which headaches? I think if we treat all OpenCL SDKs equally, we
>     will later have to refactor this because we will find differences
>     among the SDKs...
>
>
> Well, populating the database will be much longer if we consider the
> variations in the compiler (platform versions, sdks...). Just like we
> stick to auto-tuning a routine for the latest SDK version of a vendor...
> Well, the vendor_id key could still be replaced by a platform enum that
> could be obtained by parsing the platform name + version...

For the upcoming 1.6.0 it's certainly reasonable to ignore the platform 
and SDK versions and only tune on the latest software stack (i.e. SDK 
plus driver).


> This also means : do we want to run all our tuning procedures for Apple,
> since it has its own SDK? If not, should we use a fallback for Apple.
> Which one for the CPU? AMD? Intel? Which one for the iGPU? Beignet,
> Intel? Why?

Well, ultimately we have to (at least for verification purposes), even
though the Apple SDK primarily uses vendor components under the hood.


Can we agree on a device matching process? I suggest that we use the
following iterative matching procedure:

1.) Device Type
2.) Device Vendor
3.) Hardware Generation (from Device Name?)
4.) Device Name
(further checks like driver SDK version, etc. can be added later)

The matching proceeds as far as possible: If we only match the device 
type, we use defaults for that. If we can also match the vendor, we have 
better defaults for that. If we can map the device name to a vendor 
architecture, we can use an improved configuration for that. Ultimately, 
if we even match the full name, we have a full hardware-aware kernel 
parameter set. I'm confident that with some effort we can manage to get 
the first three points to match for most hardware out there.
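
To make the fallback idea concrete, here is a minimal sketch in Python
(chosen only for brevity). The nested layout, the function name
match_device, the parameter names (local_size, unroll) and all values
are hypothetical placeholders, not the actual ViennaCL database or
measured results. The device name string is also just illustrative.

```python
# Hypothetical hierarchical database: every level may provide "params"
# (the defaults for that level) and "children" (the next, more specific level).
DATABASE = {
    "GPU": {
        "params": {"local_size": 64},                     # 1.) device type
        "children": {
            "Intel": {
                "params": {"local_size": 32},             # 2.) device vendor
                "children": {
                    "Haswell": {
                        "params": {"local_size": 16},     # 3.) hardware generation
                        "children": {
                            "Intel(R) HD Graphics 4600": {  # 4.) device name
                                "params": {"local_size": 16, "unroll": 4},
                            },
                        },
                    },
                },
            },
        },
    },
    "CPU": {"params": {"local_size": 1}},
}


def match_device(database, device_type, vendor, generation, name):
    """Match as far as possible and return (params, number of levels matched).

    The parameters of the deepest matching level win; if not even the
    device type is known, params is None and the depth is 0.
    """
    level = database
    best = None
    depth = 0
    for key in (device_type, vendor, generation, name):
        node = level.get(key)
        if node is None:
            break  # no entry at this level: keep the defaults found so far
        best = node.get("params", best)
        depth += 1
        level = node.get("children", {})
    return best, depth
```

With this layout, an unknown Intel GPU generation would still get the
Intel GPU defaults, and an unknown GPU vendor would fall back to the
generic GPU defaults. Further keys (driver or SDK version) could be
appended to the tuple of keys later without changing the lookup logic.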

Best regards,
Karli


