My AMD APU has an integrated GPU. The CPU part supports f64; the GPU part does not.
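What an f32-only device means for checking results can be sketched outside J and ArrayFire. The Python below is only an illustration using the standard library: `to_f32` and `matmul` are hypothetical helper names (nothing from Jfire or ArrayFire), the matrices are made up, and the tolerances are arbitrary choices. It emulates single precision by rounding after every multiply and add, then compares against the double-precision product.

```python
import math
import struct

def to_f32(x):
    """Round a Python float (an f64) to the nearest f32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def matmul(a, b, rnd=lambda x: x):
    """Naive matrix product; rnd is applied to inputs and after each multiply/add."""
    inner, cols = len(b), len(b[0])
    out = []
    for row_a in a:
        row = []
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                acc = rnd(acc + rnd(rnd(row_a[k]) * rnd(b[k][j])))
            row.append(acc)
        out.append(row)
    return out

# Small made-up test matrices (illustrative values only).
a = [[0.1 * (i + j + 1) for j in range(4)] for i in range(4)]
b = [[0.3 * (i - j) + 0.7 for j in range(4)] for i in range(4)]

exact = matmul(a, b)                # f64 throughout
single = matmul(a, b, rnd=to_f32)   # emulated f32 throughout

pairs = [(s, e) for rs, re in zip(single, exact) for s, e in zip(rs, re)]
max_rel_err = max(abs(s - e) / abs(e) for s, e in pairs)
# The tolerances here are assumptions picked for the sketch, not values used by Jfire.
agree = all(math.isclose(s, e, rel_tol=1e-5, abs_tol=1e-6) for s, e in pairs)

print("max relative error:", max_rel_err)
print("agree within tolerance:", agree)
```

The two products are not bit-identical, so a float result from the GPU can only be checked against a double result up to a tolerance.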
If your i7 has integrated graphics, its GPU may not support OpenCL at all
(though Intel GPUs above the HD 4000 should):
https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units

To count OpenCL cores, take the config triplet (for the earlier HD 4000,
128 16 2), divide it elementwise by 64 4 1, and take the minimum:

   64 4 1&(<./@:%~) 128 16 2

which gives 2 OpenCL cores.

OpenCL 2 features are not used by ArrayFire (yet).

I'd expect the afcpu driver to always outperform the OpenCL CPU driver
because of its full-featured assembly, but this is a guess.

----- Original Message -----
From: bill lam <[email protected]>
To: 'Pascal Jasmin' via Programming <[email protected]>
Cc:
Sent: Monday, September 28, 2015 11:28 AM
Subject: Re: [Jprogramming] Arrayfire bindings

Does your GPU support fp64? OpenCL should already support fp64. On my
Intel i7, which has only one device, it shows extensions with cl_khr_fp64.

Mon, 28 Sep 2015, jprogramming wrote:
> I am cheating on the benchmark, actually. I'm doing double precision in J
> and float in ArrayFire for the 250x speed claim on the GPU. It does verify
> that they are tolerantly equal to J's results, and gets a 1 on my system.
>
> The functions are matmulsFonly (for floats) and multimatmulsF (for an
> array size(s) parameter).
>
> For doubles, there is also an included test that you can use to test the
> CPU (still a speed boost) and OpenCL on the CPU. The functions are called
> matmuls and multimatmuls. Running matmuls on a device that doesn't support
> doubles crashes gracefully.
>
> As a testing scaffold, assuming 2 OpenCL devices with 1 (on CPU/APU)
> supporting doubles:
>
> C =. P =. ('afcpu';0) conew 'afdevice'
> O =. ('afopencl';1) conew 'afdevice'
> C =. G =. ('afopencl';0) conew 'afdevice'
> 256 128 100 multimatmuls P, setme__O O
>
> (The relevant code is matmuls.)
>
> On my system, array size 100 is faster in J, while at 128 J is slower than
> both the CPU (OpenBLAS library) and OpenCL (CPU). I think power-of-2 sizes
> get special optimizations.
> What is superfast in all libraries is returning (async/lazy) from calls.
> Accessing a result transparently syncs it (waits until the calculation is
> complete). At size 128, BLAS and OpenCL (APU) performance is about even.
>
> At size 256, BLAS on the CPU is 2x faster than OpenCL on the APU. Another
> advantage of BLAS is that running a specific array size the first time is
> about the same speed as the second time. OpenCL has a JIT delay, or just
> high variation, for each new size.
>
> At 512 on my system, OpenBLAS is 4x+ faster than J, but OpenCL on the APU
> is 2x slower. I can't explain why the trend from 256 breaks down.
>
> OpenCL is already cross-platform; what is the advantage of using ArrayFire
> instead of OpenCL?
>
> On specific hardware, CUDA might be faster. The double-precision issue
> above means you may not be able to use an available GPU to get the exact
> result you want. You can use the CPU and GPU at the same time, with the
> same code, and it provides async/lazy execution.
>
> It is/was relatively easy to bind to a single array
> creation/dereferencing API and use all of the supported functions as a
> backend (on the CPU they come from different integrated libraries).
>
> It's also straightforward to create an afdevice-compatible class (see
> coclass 'afJ' in arrayfire.ijs) that has mostly no-ops for array
> creation/dereferencing/memory management. It is useful for 32-bit systems,
> for those who don't want to install ArrayFire, and for the many situations
> where J is actually faster than ArrayFire, which you find out either by
> benchmarking the same code or by dynamically switching implementation
> based on input size.
>
> ----- Original Message -----
> From: bill lam <[email protected]>
> To: 'Pascal Jasmin' via Programming <[email protected]>
> Cc:
> Sent: Monday, September 28, 2015 1:22 AM
> Subject: Re: [Jprogramming] Arrayfire bindings
>
> The speed improvement is impressive. By the way, did you also verify that
> the calculation results agreed with each other?
>
> OpenCL is already cross-platform; what is the advantage of using ArrayFire
> instead of OpenCL?
>
> Mon, 28 Sep 2015, jprogramming wrote:
> > It provides a unified API to OpenCL, CUDA (GPU-centered computing) and
> > various parallel (multithreaded) open-source CPU libraries for the same
> > functions.
> >
> > The function categories it covers:
> > http://www.arrayfire.com/docs/group__func__categories.htm
> >
> > Something not in those categories is a nearest-neighbour algorithm.
> >
> > ----- Original Message -----
> > From: 'Bo Jacoby' via Programming <[email protected]>
> > To: "[email protected]" <[email protected]>
> > Cc:
> > Sent: Monday, September 28, 2015 12:52 AM
> > Subject: Re: [Jprogramming] Arrayfire bindings
> >
> > What is Arrayfire all about? -- Bo
> >
> > On Monday, 28 September 2015 at 6:29, 'Pascal Jasmin' via Programming
> > <[email protected]> wrote:
> >
> > Updated the repo (bug fixes) and included a wiki page describing the
> > advantages/disadvantages of ArrayFire. https://github.com/Pascal-J/Jfire
> >
> > On my low-end GPU, I get a 250x speed improvement over J for
> > floating-point matrix multiplication on an array size of 1024x1024
> > (including setup and results going back to J).
> >
> > ----- Original Message -----
> > From: Alex Shroyer <[email protected]>
> > To: [email protected]
> > Cc:
> > Sent: Saturday, September 26, 2015 10:07 PM
> > Subject: Re: [Jprogramming] Arrayfire bindings
> >
> > Very cool. I did an audio processing thing in J a while back which I
> > think would have benefited from something like this. Perhaps now I'll
> > re-implement it using Jfire.
> > Thanks for sharing!
> >
> > Alex
> >
> > -----Original Message-----
> > From: "'Pascal Jasmin' via Programming" <[email protected]>
> > Sent: 9/26/2015 7:03 PM
> > To: "Programming Forum" <[email protected]>
> > Subject: [Jprogramming] Arrayfire bindings
> >
> > https://github.com/Pascal-J/Jfire
> >
> > There are probably some helpful applications for it, mostly with floating
> > point. (A lot of work for something I don't use much :P)
> >
> > Hope you like the interface for it. It's fairly J-like.
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
>
> --
> regards,
> ====================================================
> GPG key 1024D/4434BAB3 2008-08-24
> gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
> gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3
