Hi,

2013/8/14 Karl Rupp <r...@iue.tuwien.ac.at>

> Hey,
>
> > I've pushed the changes. Does it solve the GTX285 case?
>
> thanks, it does!
>
>
Cool!

>
>
>> The policy is:
>>
>> - One global GPU fallback (very conservative)
>> - One global CPU fallback (very conservative)
>> - One global Accelerator fallback (very conservative)
>> - One fallback per architecture family
>> --------
>> If the vendor is not in the database, return the global fallback profile.
>> If the vendor is in the database but the architecture fallback isn't,
>> return the global fallback.
>> If the vendor and architecture are in the database but not the name, test
>> the architecture fallback. If the profile is invalid (work group size not
>> compatible, too much local memory), return the global fallback. Otherwise,
>> return the architecture fallback.
>> If everything is fine, return the specific device profile.
>>
>
> Looks good to me.
>
> Do we want to keep the full device name in the profiles map? With vendor
> and arch determined, we know pretty much everything we need to know. If we
> need to match the name 1:1, there may be too many devices which we miss
> even though the 'faster' profile should work?
>
>
Hmm, when there is no exact match, e.g. a GTX460 instead of a GTX470, the
lookup falls back on the vendor-arch profile, so performance should still
be good. On my laptop, my GeForce GT540M picks up the Fermi profile
obtained on a GTX470, and it performs well. On AMD the problem doesn't
really exist, since the DEVICE_NAME returned by the SDK is already the
codename, which is much smarter. Ideally we would do the same for NVIDIA
and stop at the codename rather than at the full device name, but that is
tedious: it requires a manual 1-to-1 mapping against the NVIDIA product
list...
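
To make the lookup order concrete, here is a rough sketch of the policy
quoted above. It is not the actual code in the repository: the types
(profile, device_info, arch_entry), the function names and the validity
check are all hypothetical, and the caller is assumed to pass in the global
fallback that matches the device type (GPU, CPU or Accelerator).

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical profile: only the two fields used by the validity check.
struct profile {
  std::size_t work_group_size;
  std::size_t local_mem_bytes;
};

// Hypothetical description of the device being queried.
struct device_info {
  std::string vendor, arch, name;
  std::size_t max_work_group_size;
  std::size_t local_mem_bytes;
};

typedef std::map<std::string, profile> name_map;

// One fallback per architecture family, plus the device-specific profiles.
struct arch_entry {
  profile fallback;
  name_map by_name;
};

typedef std::map<std::string, arch_entry> arch_map;
typedef std::map<std::string, arch_map> vendor_map;

// A profile is usable if the device can run its work group size
// and provides enough local memory.
bool is_valid(const profile& p, const device_info& d) {
  return p.work_group_size <= d.max_work_group_size
      && p.local_mem_bytes <= d.local_mem_bytes;
}

profile select_profile(const vendor_map& db,
                       const profile& global_fallback,
                       const device_info& d) {
  // Unknown vendor: global fallback.
  vendor_map::const_iterator v = db.find(d.vendor);
  if (v == db.end()) return global_fallback;

  // Known vendor, unknown architecture: global fallback.
  arch_map::const_iterator a = v->second.find(d.arch);
  if (a == v->second.end()) return global_fallback;

  // Exact device name known: specific profile.
  name_map::const_iterator n = a->second.by_name.find(d.name);
  if (n != a->second.by_name.end()) return n->second;

  // Otherwise use the architecture fallback, but only if it is valid here.
  return is_valid(a->second.fallback, d) ? a->second.fallback
                                         : global_fallback;
}
```

With this layout, a GT540M that is not listed by name but is filed under the
same vendor and architecture as the GTX470 ends up with the architecture
fallback, which is the behaviour described above.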



> Oh, and btw: I don't think the profiles map is valgrind-clean...
>
>
Haha, I know. Well, this is a static std::map, so the pointers in there
live until the end of the program. Using new directly or using shared_ptr
should then be equivalent, since the destructor will not be called until
the end of the program anyway, right?
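
A quick way to check this would be a small experiment like the one below.
It is only a sketch: counted_profile is a hypothetical stand-in for the
real profile type, std::shared_ptr assumes C++11, and the map is local
rather than static so that the effect can be observed before the program
exits (a static map is destroyed in the same way at exit). Destroying a map
of raw pointers does not delete the pointees, whereas a map of shared_ptr
releases them, which is presumably the difference valgrind sees.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a profile; counts how many instances are alive.
struct counted_profile {
  static int alive;
  counted_profile() { ++alive; }
  counted_profile(const counted_profile&) { ++alive; }
  ~counted_profile() { --alive; }
};
int counted_profile::alive = 0;

// Number of live objects after a map of raw pointers has been destroyed.
int alive_after_raw_map() {
  counted_profile* p = new counted_profile();
  {
    std::map<std::string, counted_profile*> m;
    m["GTX470"] = p;
  } // the map is destroyed here, but *p is not deleted
  int still_alive = counted_profile::alive;
  delete p; // clean up by hand so the experiment itself does not leak
  return still_alive;
}

// Number of live objects after a map of shared_ptr has been destroyed.
int alive_after_shared_map() {
  {
    std::map<std::string, std::shared_ptr<counted_profile> > m;
    m["GTX470"] = std::make_shared<counted_profile>();
  } // the map is destroyed here and the shared_ptr deletes the object
  return counted_profile::alive;
}
```

If the two functions return different counts, the two approaches are not
equivalent as far as a leak checker is concerned, even though the memory is
handed back to the OS at exit either way.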

Best regards,
Philippe



> Best regards,
> Karli
>
>
_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel
