On Sun, Dec 20, 2015 at 5:51 PM, West, Nathan <[email protected]> wrote:
> Hi Stefan,
>
> First of all I'm really happy to see this done. Using opencl in VOLK has
> come up once in a while and the general consensus was that transport /
> granularity of work in VOLK would not make it worth doing in VOLK, but we
> never knew for sure.
>
> Another wrinkle is where that tradeoff between GPUs and processor work is
> for each pair of processor and GPU which is impossible to know without some
> kind of benchmark/wisdom generation. VOLK doesn't have any mechanism for
> doing that and recording it. It's interesting work if that's what you're
> looking to do.
>
> From a VOLK perspective if we can build up that wisdom ability and add
> that ability to a dispatcher it's probably going to be useful, especially
> for people that are developing some opencl code in a workflow, but don't
> know for sure where code should run. I think the best way to develop this
> might be in a VOLK OOT unless you're fine working off a long-lived branch
> looking at this stuff.
>
> I'm happy to continue discussing this, especially on the list
>
> Nathan
>

Good points, Nathan. This seems like an interesting direction for VOLK,
at least under these circumstances. A wisdom concept might work in
general for different sizes of vectors. This could be an add-on to the
volk_profile utility to do full benchmarking.

But I definitely don't want to drop Stefan's work here. Let's figure out
the best way to make it available so we don't lose track of it.

Tom

> On Thu, Dec 17, 2015 at 8:53 PM, Douglas Geiger <[email protected]> wrote:
>
>> Stefan,
>> First off I definitely want to encourage investigations of this sort: so
>> even though I have some thoughts similar to Sylvain's/Tom's about whether
>> VOLK is the right place to do this, I definitely want to encourage *trying*
>> this, since you never know - we could be entirely wrong about whether or
>> not this will work. The only way to know for sure is to try it.
>>
>> That said: I do think there are ways *within* VOLK to deal with the issue
>> of the input size (i.e. vector size) having a large impact on performance -
>> namely the custom dispatcher. This is a concept that exists in VOLK, but
>> has largely gone unnoticed because by and large the default dispatcher
>> does a good (or at least, good-enough) job at selecting the proper
>> proto-kernel. For off-loading concepts such as utilizing GPUs via OpenCL,
>> a custom dispatcher *could* select the appropriate proto-kernel (including
>> directing the OpenCL implementation to select a CPU vs. GPU-based
>> implementation, if multiple OpenCL implementations are available) on a
>> per-'work()' call from the GNURadio scheduler. In other words, instead of
>> relying on volk_profile to select the best proto-kernel for all calls to
>> that particular volk kernel, the dispatcher could have something more akin
>> to the FFTW 'wisdom' where for different sizes of matrices/vectors,
>> different proto-kernels are called (including the CPU SIMDized call,
>> instead of the OpenCL call for smaller input sizes, etc.).
>>
>> Anyways - I definitely think this is something that should be looked
>> into more, and if you are interested in pursuing this, either as a GSoC
>> project or otherwise, I would definitely encourage it, as well as offer
>> assistance/advice where I can.
>>
>> Doug
>>
>>
>> On Thu, Dec 17, 2015 at 7:58 PM, Stefan Wunsch <[email protected]> wrote:
>>
>>>
>>>
>>> On 12/18/2015 12:30 AM, Tom Rondeau wrote:
>>> > On Thu, Dec 17, 2015 at 1:14 PM, Sylvain Munaut <[email protected]> wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >>> RUN_VOLK_TESTS: volk_32f_x2_matrix_nxn_multiply_puppet_32f(1000000,10)
>>> >>> generic completed in 28482ms
>>> >>> a_opencl completed in 13364.3ms
>>> >>
>>> >> Question is how does that number change for smaller problem sizes ?
>>> >> And what would be the average problem size encountered in real env.
>>> >>
>>> >> For SIMD optimization the result of "who's the fastest" doesn't vary
>>> >> too much depending on problem size because they don't have much setup
>>> >> / teardown size.
>>> >> For OpenCL I very much doubt that would be the case and if you end up
>>> >> with an app making a lot of "smallish" (and given the default buffer
>>> >> size of GR, I feel the calls to volk aren't processing millions of
>>> >> samples at a time in a single call)
>>> >>
>>> >>
>>> >> Cheers,
>>> >>
>>> >> Sylvain
>>> >>
>>> >
>>> >
>>> > Stefan,
>>> >
>>> > This is a great start. But Sylvain makes good points about the data
>>> > transfer issue. That's definitely a problem we have to think about. It's
>>> > why we have avoided pursuing GPU support in VOLK in the past. Now, if
>>> > heterogeneous processor technologies change, so might this problem.
>>> >
>>> > On the other hand, Doug Geiger has made progress on building OpenCL support
>>> > into the buffer structure of the scheduler. What you've done here might
>>> > work better as a block designed around this concept.
>>> >
>>> > Tom
>>> >
>>>
>>> Hi,
>>>
>>> I just wondered why it has not been done yet, but I see the problems now
>>> (Sylvain made the point).
>>> If a proper device selection and initialization is integrated into VOLK,
>>> probably the same processings could be used for the scheduler (e.g.,
>>> with a generic fallback).
>>> But as well, I think that I don't know enough
>>> about all of this ;)
>>>
>>> Greetings
>>> Stefan
>>>
>>> _______________________________________________
>>> Discuss-gnuradio mailing list
>>> [email protected]
>>> https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
>>>
>>
>>
>>
>> --
>> Doug Geiger
>> [email protected]
>>
>
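The size-dependent "wisdom" idea raised in the thread (choose a proto-kernel per input size rather than once per kernel) can be sketched roughly as below. This is only an illustration in Python, not VOLK's dispatcher or the volk_profile format: build_wisdom, dispatch, the benchmark sizes and the cost model are all hypothetical, and the cost numbers are invented to mimic a fixed transfer/setup overhead for the offloaded path. Only the implementation names "generic" and "a_opencl" are taken from the test output quoted above.

```python
import bisect


def build_wisdom(sizes, cost_fns):
    """Record, for each benchmarked size, the name of the cheapest implementation.

    cost_fns maps an implementation name to a callable size -> cost.
    A real profiler would time the kernel here; this sketch accepts any callable.
    """
    wisdom = []
    for n in sorted(sizes):
        best = min(cost_fns, key=lambda name: cost_fns[name](n))
        wisdom.append((n, best))
    return wisdom


def dispatch(wisdom, n):
    """Return the implementation recorded for the largest benchmarked size <= n.

    Sizes below the smallest benchmarked size use the first entry.
    """
    keys = [size for size, _ in wisdom]
    i = bisect.bisect_right(keys, n) - 1
    return wisdom[max(i, 0)][1]


# Hypothetical cost model (made-up numbers): the offloaded path pays a fixed
# setup/transfer overhead but has a lower per-item cost.
cost_model = {
    "generic": lambda n: 1.0 * n,
    "a_opencl": lambda n: 5000.0 + 0.5 * n,
}

wisdom = build_wisdom([256, 4096, 8192, 65536, 1000000], cost_model)

for n in (100, 8192, 65536, 2000000):
    print(n, dispatch(wisdom, n))
```

With this made-up model the table picks "generic" for the small sizes and "a_opencl" for the large ones, which is the shape of tradeoff Sylvain and Doug describe. Where the crossover actually sits on a given CPU/GPU pair is exactly what a benchmark run would have to measure.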
