Hi hi hi,

On 07/21/2014 03:59 PM, Philippe Tillet wrote:
> Hi hi,
>
> Yes, the vendor id looks useless for our purpose. I think it's attached
> to a given machine though, contrary to the device id that changes across
> executions. There are some opencl extensions (provided by apple) to find
> out which opencl device is driving the display, using vendor ids, if I
> remember correctly.
>
> Let's agree on a device matching process. Here is how it works for now
> (including the improvements I made yesterday):
>
> 1)  Device vendor (should be deduced from the hardware generation, since
> it looks like we've agreed on an SDK-independent mechanism)
> 2)  Device Type
> 3)  Hardware generation
> 4)  Device name
>
> 1) and 2) should be swapped, indeed.

If everything else fails, at least the device type is a reliable piece
of information. :-)



> I don't really use vendor defaults,
> here is how fallbacks are handled:
>
> -> There are global defaults for each device type, for both double and
> float.
> -> When the generation is not found in the database, we look for the
> closest generation for this vendor.
> Example. Generations are stored as:
> enum architecture_generation
> {
>   tesla,
>   fermi,
>   kepler,
>   maxwell,
>
>   evergreen,
>   northern_islands,
>   southern_islands
> };
>
> If the user has the combination (nvidia, kepler) and there is no kepler
> in the database, then the lookup will search the database for (nvidia,
> $architecture) and will select the $architecture which is closest to
> kepler (in terms of difference in the enum).
>
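To make sure we mean the same thing, here is a minimal sketch of the closest-generation lookup as I understand it. Only the enumerator order is taken from your mail; vendor_id, vendor_of() and closest_generation() are hypothetical names, and the vendor boundary at maxwell is an assumption. Note the explicit vendor filter: maxwell and evergreen are neighbours in the enum, so a plain distance would happily cross vendors.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

enum vendor_id { nvidia, amd };          // hypothetical, for illustration only

enum architecture_generation
{
  tesla, fermi, kepler, maxwell,                   // NVIDIA
  evergreen, northern_islands, southern_islands    // AMD
};

// Assumption: everything up to and including maxwell belongs to NVIDIA.
inline vendor_id vendor_of(architecture_generation g)
{
  return (g <= maxwell) ? nvidia : amd;
}

// Writes the tuned generation closest to 'wanted' (same vendor only) to
// 'result'. Returns false if the database has no entry for that vendor.
// On a tie, the entry that appears first in 'tuned' wins.
inline bool closest_generation(architecture_generation wanted,
                               std::vector<architecture_generation> const & tuned,
                               architecture_generation & result)
{
  bool found = false;
  for (std::size_t i = 0; i < tuned.size(); ++i)
  {
    if (vendor_of(tuned[i]) != vendor_of(wanted))
      continue;                                    // never cross vendors
    if (!found || std::abs(tuned[i] - wanted) < std::abs(result - wanted))
    {
      result = tuned[i];
      found = true;
    }
  }
  assert(!found || vendor_of(result) == vendor_of(wanted));
  return found;
}
```

With only fermi and evergreen in the database, a maxwell device would get the fermi profile here, even though evergreen is just one step away in the enum.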
> -> If the device name is not found, the first device found for this
> architecture is selected. We should change this to have a similar
> mechanism as above, so that similar devices are closer in an enum.
>
> enum device_name
> {
>   ...
>   gtx560,
>   gtx570,
>   gtx580,
>   gtx590,
>   ...
> };
>
> So that if there is a profile for the gtx520 and one for the gtx580,
> then the gtx530 will pick the former and the gtx570 will pick the
> latter. There is a pitfall with this approach, though, since it does
> not capture how similar devices from different generations are (e.g. a
> gtx470 may be similar to a gtx550?)

What if 'closer' always picks the weaker GPU?
GPU-rebranding is indeed annoying here, but I think it can be addressed 
in the same way vendors relabel it: We just take the tuning profile for 
device 'a' and rebrand it with name 'b'. ;-)
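A rough sketch of both ideas, assuming the enum is ordered from weakest to strongest device. The enumerators below are just an illustrative subset, and profile, weaker_or_equal_profile() and rebrand() are hypothetical names, not existing ViennaCL code:

```cpp
#include <cassert>
#include <map>

// Illustrative subset; the weakest-first ordering is an assumption.
enum device_name { gtx520, gtx530, gtx560, gtx570, gtx580, gtx590 };

// Hypothetical placeholder for the actual tuning parameters.
struct profile { unsigned int local_size; };

typedef std::map<device_name, profile> profile_map;

// Picks the profile of the strongest tuned device that is not stronger
// than 'wanted'. Returns false if every tuned device is stronger.
inline bool weaker_or_equal_profile(profile_map const & tuned,
                                    device_name wanted,
                                    profile & result)
{
  profile_map::const_iterator it = tuned.upper_bound(wanted); // first key > wanted
  if (it == tuned.begin())
    return false;
  --it;                                                       // last key <= wanted
  result = it->second;
  return true;
}

// Rebranding: register the profile of device 'from' under the name 'to'.
inline bool rebrand(profile_map & tuned, device_name from, device_name to)
{
  assert(from != to);
  profile_map::const_iterator it = tuned.find(from);
  if (it == tuned.end())
    return false;
  profile const copy = it->second;
  tuned[to] = copy;
  return true;
}
```

With profiles for the gtx520 and the gtx580 only, the gtx570 would now pick the gtx520 profile rather than the gtx580 one. That is slower than necessary, but less likely to exceed what the device can handle.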


> -> Finally, even with all these fallbacks we cannot ensure correctness
> for an unknown device. The kernel might require too many resources, for
> example. For now, an exception would be thrown at
> template_base::generate(), which is not acceptable. How should we handle
> this?
> My idea is that we should check for invalidity when constructing the
> template, and if the profile is not valid for this device, then fall
> back on the defaults.

Does this affect kernels other than GEMM? Either way, this needs some
kind of hierarchical fallback, which, in its simplest form, is just
falling back to a super-conservative profile.
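Something along these lines, as a sketch only: device_limits, kernel_profile and first_valid_profile() are hypothetical, and I am assuming that work group size and local memory are the resources that matter. In OpenCL the limits would come from clGetDeviceInfo() with CL_DEVICE_MAX_WORK_GROUP_SIZE and CL_DEVICE_LOCAL_MEM_SIZE.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical subset of the device limits relevant for validity checks.
struct device_limits
{
  std::size_t max_work_group_size;
  std::size_t local_mem_bytes;
};

// Hypothetical resource footprint of a kernel generated from a profile.
struct kernel_profile
{
  std::size_t work_group_size;
  std::size_t local_mem_bytes;
};

// A profile is valid if the device can provide everything it asks for.
inline bool is_valid(kernel_profile const & p, device_limits const & d)
{
  return p.work_group_size <= d.max_work_group_size
      && p.local_mem_bytes <= d.local_mem_bytes;
}

// 'candidates' is ordered from most specific (tuned for this device) to
// most conservative. Returns the index of the first valid candidate, or
// candidates.size() if not even the last one fits.
inline std::size_t first_valid_profile(std::vector<kernel_profile> const & candidates,
                                       device_limits const & device)
{
  assert(!candidates.empty());
  for (std::size_t i = 0; i < candidates.size(); ++i)
    if (is_valid(candidates[i], device))
      return i;
  return candidates.size();
}
```

The check runs when the template is constructed, so generate() only ever sees a profile that fits. If the last entry uses no local memory and a tiny work group, the "nothing fits" case should be practically unreachable.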


> As a general rule, when the slow default profile is
> used, should we output a warning?

We should provide a diagnostics function to the user, yes. We should not 
dump anything to stdout without the user explicitly asking for it.
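As a sketch of what I mean (profile_source and profile_log are hypothetical names, nothing like this exists yet): the library only records where each profile came from, and the user decides whether to query or print it.

```cpp
#include <cassert>
#include <map>
#include <string>

// Where the profile for a kernel came from, from best to worst case.
enum profile_source
{
  exact_match,
  closest_device_fallback,
  closest_generation_fallback,
  device_type_default
};

// Records the profile source per kernel; never writes to stdout itself.
class profile_log
{
public:
  void record(std::string const & kernel, profile_source source)
  {
    assert(!kernel.empty());
    entries_[kernel] = source;
  }

  // True if the slow device-type default was used for 'kernel'.
  bool used_default(std::string const & kernel) const
  {
    std::map<std::string, profile_source>::const_iterator it = entries_.find(kernel);
    return it != entries_.end() && it->second == device_type_default;
  }

  // Full record, so the user can print or inspect it on request.
  std::map<std::string, profile_source> const & entries() const { return entries_; }

private:
  std::map<std::string, profile_source> entries_;
};
```

A user who cares can then call used_default("gemm") after setup and print a warning themselves.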

Best regards,
Karli


_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel
