Hi hi hi,

On 07/21/2014 03:59 PM, Philippe Tillet wrote:
> Hi hi,
>
> Yes, the vendor id looks useless for our purpose. I think it's attached
> to a given machine though, contrary to the device id that changes across
> executions. There are some OpenCL extensions (provided by Apple) to find
> out which OpenCL device is driving the display, using vendor ids, if I
> remember correctly.
>
> Let's agree on a device matching process. Here is how it works for now
> (including the improvements I made yesterday):
>
> 1) Device vendor (should be from the hardware generation, since it
> looks like we've agreed on an SDK-independent mechanism)
> 2) Device type
> 3) Hardware generation
> 4) Device name
>

1) and 2) should be swapped, indeed.
If everything else fails, at least the device type is reliable
information. :-)

> I don't really use vendor defaults,
> here is how fallbacks are handled:
>
> -> There are global defaults for each device type, for both double and
> float.
> -> When the generation is not found in the database, we look for the
> closest generation for this vendor.
> Example. Generations are stored as:
> enum architecture_generation
> {
>   tesla,
>   fermi,
>   kepler,
>   maxwell,
>
>   evergreen,
>   northern_islands,
>   southern_islands
> };
>
> If the user has the combination (nvidia, kepler) and there is no kepler
> in the database, then the lookup will search the database for (nvidia,
> $architecture), and will select the $architecture which is closest to
> kepler (in terms of difference in the enum).
>
> -> If the device name is not found, the first device found for this
> architecture is selected. We should change this to have a similar
> mechanism as above, so that similar devices are closer in an enum.
>
> enum device_name
> {
>   ...
>   gtx560,
>   gtx570,
>   gtx580,
>   gtx590,
>   ...
> };
>
> So that if there is a profile for the gtx520 and one for the gtx580,
> then the gtx530 will pick the former and the gtx570 will pick the
> latter. There is a pitfall with this approach, though, since it won't
> handle how close devices from different generations are. (I.e. a gtx470
> may be similar to a gtx550?)

What if 'closer' always picks the weaker GPU? GPU rebranding is indeed
annoying here, but I think it can be addressed in the same way vendors
relabel it: We just take the tuning profile for device 'a' and rebrand
it with name 'b'. ;-)

> -> Finally, even with all these fallbacks we cannot ensure correctness
> for an unknown device. The kernel might require too many resources, for
> example. For now, an exception would be thrown at
> template_base::generate(), which is not acceptable. How should we handle
> this?
> My idea is that we should check for invalidity when constructing the
> template, and if the profile is not valid for this device, then fall back
> on the defaults.

Does this affect kernels other than GEMM? Either way, this needs some
kind of hierarchical fallback, which, in the simplest terms, is just
falling back to a super-conservative profile.

> As a general rule, when the slow default profile is
> used, should we output a warning?

We should provide a diagnostics function to the user, yes. We should not
dump anything to stdout without the user explicitly asking for it.

Best regards,
Karli

_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel
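
For reference, the closest-generation fallback discussed in this thread
could be sketched roughly as follows. This is only an illustration under
stated assumptions, not the actual ViennaCL code: the names vendor,
profile, generation_db, closest_generation and select_profile are
hypothetical, and only the architecture_generation enum is taken from the
thread. The sketch assumes C++17 and that ties are resolved in favor of
the older generation, which is one possible answer to the "weaker GPU"
question.

```cpp
#include <cstdlib>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical vendor tag; the thread only mentions matching on the vendor.
enum class vendor { nvidia, amd };

// Enum as quoted in the thread.
enum architecture_generation {
  tesla, fermi, kepler, maxwell,
  evergreen, northern_islands, southern_islands
};

// Hypothetical stand-in for a tuning profile.
struct profile { std::string label; };

// Hypothetical database layout: one profile per (vendor, generation).
using generation_db =
    std::map<std::pair<vendor, architecture_generation>, profile>;

// Returns the generation stored for vendor v whose enum value is closest
// to `wanted`, or std::nullopt if the vendor has no entries at all.
// Entries are visited in ascending enum order and only a strictly smaller
// distance replaces the current best, so ties go to the older generation.
std::optional<architecture_generation>
closest_generation(const generation_db& db, vendor v,
                   architecture_generation wanted)
{
  std::optional<architecture_generation> best;
  int best_distance = 0;
  for (const auto& entry : db) {
    if (entry.first.first != v) continue;  // never cross vendors
    const int distance = std::abs(static_cast<int>(entry.first.second) -
                                  static_cast<int>(wanted));
    if (!best || distance < best_distance) {
      best = entry.first.second;
      best_distance = distance;
    }
  }
  return best;
}

// Hierarchical fallback: closest generation of the same vendor, else the
// conservative per-device-type default supplied by the caller.
const profile& select_profile(const generation_db& db, vendor v,
                              architecture_generation wanted,
                              const profile& device_type_default)
{
  const auto g = closest_generation(db, v, wanted);
  return g ? db.at({v, *g}) : device_type_default;
}
```

Restricting the search to one vendor matters because the enum mixes
NVIDIA and AMD generations: without the vendor check, maxwell and
evergreen would look like neighbors. A caller would still need to verify
that the selected profile is valid for the device before using it.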