Re: [ViennaCL-devel] Fwd: BLAS3, range, slice, compilation time...

Philippe Tillet Tue, 13 Aug 2013 09:52:07 -0700

Hey,


2013/8/13 Karl Rupp <r...@iue.tuwien.ac.at>

> Hi,
>
>
>>     We can directly query the available local device memory (which is the
>>     reason why I added all this buffering to the device class). Am I
>>     missing something?
>>
>>
>> Yes, we could. But having the combination {vendor, local memory} seems a
>> bit weird to me, I think {vendor, generation} makes more sense, don't
>> you think?
>>
>
> {vendor, generation} is the natural format for handling the profile
> internally, yes. This will presumably involve string parsing of the device
> name, yes :-(
>

I'll do that :) Should I add a "generation" method to the ocl::device
class? I think it is best suited there. We know, however, that vendors offer
revisions of the current generation. Should GTX 480 and GTX 580 both be
parsed as "Fermi"? Since this would only be used for a fallback, I think
that having both GeForce 4xx and GeForce 5xx parsed as Fermi would be a
good solution.
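
A minimal sketch of such a name-based mapping, assuming the device name
arrives as a plain string like "GeForce GTX 480". The function name
guess_nvidia_generation is hypothetical and not part of ViennaCL; the sketch
only recognizes three-digit GTX 4xx/5xx model numbers and reports everything
else as "unknown":

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not a ViennaCL API): guesses the architecture
// generation from a device name such as "GeForce GTX 480".
// Only three-digit GTX 4xx and 5xx model numbers are mapped to "Fermi".
inline std::string guess_nvidia_generation(const std::string& device_name)
{
  const std::string prefix = "GTX ";
  std::string::size_type pos = device_name.find(prefix);
  if (pos == std::string::npos)
    return "unknown";
  pos += prefix.size();

  // Require exactly three digits after the prefix.
  if (pos + 3 > device_name.size())
    return "unknown";
  for (std::string::size_type i = 0; i < 3; ++i)
    if (!std::isdigit(static_cast<unsigned char>(device_name[pos + i])))
      return "unknown";
  if (pos + 3 < device_name.size() &&
      std::isdigit(static_cast<unsigned char>(device_name[pos + 3])))
    return "unknown";

  // The leading digit selects the series: 4xx and 5xx are both Fermi.
  const char series = device_name[pos];
  if (series == '4' || series == '5')
    return "Fermi";
  return "unknown";
}
```

Anything that comes back as "unknown" would simply take the conservative
vendor default.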

>
> However, the local memory available might vary between devices of the same
> generation (think of desktop vs. mobile), so probably we extend it to:
>   {vendor, generation, min_local_mem_required}
> If the device does not have enough local memory even though it is mapped
> correctly to a certain generation, we simply fall back to a legacy profile
> which stays within the 16kB.
>
>
Right. In this case we probably have to deal with a pretty low-end GPU, and
we should just fall back to the conservative vendor default... It's probably
too error-prone to maintain multiple fallback profiles for a given
generation :P
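
The selection rule could be sketched as follows. The struct and function
names are hypothetical, not existing ViennaCL code, and the available local
memory is assumed to have been queried beforehand (e.g. via clGetDeviceInfo
with CL_DEVICE_LOCAL_MEM_SIZE):

```cpp
#include <cstddef>
#include <string>

// Hypothetical profile record; the field names are illustrative only.
struct kernel_profile
{
  std::string name;
  std::size_t min_local_mem_required;  // in bytes
};

// Picks the generation-specific profile if the device offers enough local
// memory, otherwise the conservative vendor default (a single fallback
// per vendor, assumed to fit within 16kB).
inline kernel_profile select_profile(const kernel_profile& generation_profile,
                                     const kernel_profile& vendor_default,
                                     std::size_t device_local_mem)
{
  if (device_local_mem >= generation_profile.min_local_mem_required)
    return generation_profile;
  return vendor_default;
}
```

This keeps exactly one fallback per vendor rather than several per
generation.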

I have pushed the modifications to the vendor fallbacks so that they stay
below 16kB.
It may not solve the invalid work group size problem... Seems we'll have to
be conservative on both the local size and the work group sizes...

Best regards,
Philippe

> Best regards,
> Karli
>
>

_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel