On Sun, Dec 20, 2015 at 5:51 PM, West, Nathan <[email protected]>
wrote:

> Hi Stefan,
>
> First of all I'm really happy to see this done. Using OpenCL in VOLK has
> come up once in a while, and the general consensus was that the transport
> overhead and granularity of work would not make it worth doing in VOLK, but
> we never knew for sure.
>
> Another wrinkle is where the tradeoff between GPU and CPU work lies for
> each pair of processor and GPU, which is impossible to know without some
> kind of benchmark/wisdom generation. VOLK doesn't have any mechanism for
> doing that and recording it. It's interesting work if that's what you're
> looking to do.
>
> From a VOLK perspective, if we can build up that wisdom ability and add
> it to a dispatcher, it's probably going to be useful, especially
> for people who are developing some OpenCL code in a workflow but don't
> know for sure where code should run. I think the best way to develop this
> might be in a VOLK OOT, unless you're fine working off a long-lived branch
> looking at this stuff.
>
> I'm happy to continue discussing this, especially on the list.
>
> Nathan
>


Good points, Nathan. This seems like an interesting direction for VOLK, at
least under these circumstances. A wisdom concept might work in general for
different sizes of vectors. This could be an add-on to the volk_profile
utility to do a full benchmarking.
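
As a rough illustration of the size-dependent wisdom idea, here is a minimal
sketch in Python. It is not VOLK code and does not use any VOLK API: the names
build_wisdom and dispatch, the (size, name) table layout, and the rule of
picking the largest benchmarked size not above the input length are all
assumptions made for the example.

```python
import bisect
import time


def build_wisdom(impls, sizes, make_input, repeats=3):
    """Time every implementation at every size; keep the fastest name per size.

    impls maps a (hypothetical) implementation name to a callable.
    Returns a list of (size, fastest_name) sorted by size.
    """
    wisdom = []
    for n in sorted(sizes):
        data = make_input(n)
        best_name, best_time = None, float("inf")
        for name, func in impls.items():
            start = time.perf_counter()
            for _ in range(repeats):
                func(data)
            elapsed = time.perf_counter() - start
            if elapsed < best_time:
                best_name, best_time = name, elapsed
        wisdom.append((n, best_name))
    return wisdom


def dispatch(wisdom, impls, data):
    """Run the implementation recorded for the largest benchmarked size
    that does not exceed len(data); fall back to the smallest entry."""
    sizes = [n for n, _ in wisdom]
    idx = max(bisect.bisect_right(sizes, len(data)) - 1, 0)
    return impls[wisdom[idx][1]](data)
```

A profiling tool would call build_wisdom once per machine and store the table;
each later call would go through dispatch, so small inputs could stay on a CPU
implementation while large ones go to an offloaded one.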

But I definitely don't want to drop Stefan's work here. Let's figure out
the best way to make it available so we don't lose track of it.

Tom




> On Thu, Dec 17, 2015 at 8:53 PM, Douglas Geiger <
> [email protected]> wrote:
>
>> Stefan,
>>  First off I definitely want to encourage investigations of this sort: so
>> even though I have some thoughts similar to Sylvain's/Tom's about whether
>> VOLK is the right place to do this, I definitely want to encourage *trying*
>> this, since you never know - we could be entirely wrong about whether or
>> not this will work. The only way to know for sure is to try it.
>>
>>  That said: I do think there are ways *within* VOLK to deal with the issue
>> of the input size (i.e. vector size) having a large impact on performance -
>> namely the custom dispatcher. This is a concept that exists in VOLK, but
>> has largely gone unnoticed because by and large the default dispatcher
>> does a good (or at least, good-enough) job at selecting the proper
>> proto-kernel. For off-loading concepts such as utilizing GPUs via OpenCL,
>> a custom dispatcher *could* select the appropriate proto-kernel (including
>> directing the OpenCL implementation to select a CPU vs. GPU-based
>> implementation, if multiple OpenCL implementations are available) on a
>> per-'work()' call from the GNURadio scheduler. In other words, instead of
>> relying on volk_profile to select the best proto-kernel for all calls to
>> that particular volk kernel, the dispatcher could have something more akin
>> to the FFTW 'wisdom' where for different sizes of matrices/vectors,
>> different proto-kernels are called (including the CPU SIMDized call,
>> instead of the OpenCL call for smaller input sizes, etc.).
>>
>>  Anyways - I definitely think this is something that should be looked
>> into more, and if you are interested in pursuing this - either as a GSoC
>> project or otherwise, I would definitely encourage it, as well as offer
>> assistance/advice where I can.
>>
>>  Doug
>>
>>
>> On Thu, Dec 17, 2015 at 7:58 PM, Stefan Wunsch <
>> [email protected]> wrote:
>>
>>>
>>>
>>> On 12/18/2015 12:30 AM, Tom Rondeau wrote:
>>> > On Thu, Dec 17, 2015 at 1:14 PM, Sylvain Munaut <[email protected]> wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >>> RUN_VOLK_TESTS: volk_32f_x2_matrix_nxn_multiply_puppet_32f(1000000,10)
>>> >>> generic completed in 28482ms
>>> >>> a_opencl completed in 13364.3ms
>>> >>
>>> >> Question is how does that number change for smaller problem sizes?
>>> >> And what would be the average problem size encountered in a real
>>> >> environment?
>>> >>
>>> >> For SIMD optimization the result of "who's the fastest" doesn't vary
>>> >> too much depending on problem size because they don't have much setup
>>> >> / teardown cost.
>>> >> For OpenCL I very much doubt that would be the case, and you could end
>>> >> up with an app making a lot of "smallish" calls (given the default
>>> >> buffer size of GR, I feel the calls to volk aren't processing millions
>>> >> of samples at a time in a single call).
>>> >>
>>> >>
>>> >> Cheers,
>>> >>
>>> >>     Sylvain
>>> >>
>>> >
>>> >
>>> > Stefan,
>>> >
>>> > This is a great start. But Sylvain makes good points about the data
>>> > transfer issue. That's definitely a problem we have to think about. It's
>>> > why we have avoided pursuing GPU support in VOLK in the past. Now, if
>>> > heterogeneous processor technologies change, so might this problem.
>>> >
>>> > On the other hand, Doug Geiger has made progress on building OpenCL
>>> > support into the buffer structure of the scheduler. What you've done here
>>> > might work better as a block designed around this concept.
>>> >
>>> > Tom
>>> >
>>>
>>> Hi,
>>>
>>> I just wondered why it has not been done yet, but I see the problems now
>>> (Sylvain made the point).
>>> If proper device selection and initialization were integrated into VOLK,
>>> the same processing could probably be used for the scheduler (e.g.,
>>> with a generic fallback). But then again, I don't think I know enough
>>> about all of this ;)
>>>
>>> Greetings
>>> Stefan
>>>
>>> _______________________________________________
>>> Discuss-gnuradio mailing list
>>> [email protected]
>>> https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
>>>
>>
>>
>>
>> --
>> Doug Geiger
>> [email protected]
>>