I doubt it; the inner product should be 4 to 20 times faster with avx support. Check your JVERSION and whether your OS supports switching the ymm registers.
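For the OS side, a rough check could look like the Python sketch below. It only reads the CPU feature list that the OS reports: /proc/cpuinfo on Linux and `sysctl -n machdep.cpu.features` on macOS. The helper names `flags_include_avx` and `read_cpu_flags` are made up for this illustration, and treating a reported avx flag as evidence that the OS handles the ymm state is an assumption, not something verified here. It says nothing about whether the J binary itself contains avx code; JVERSION is the place to look for that.

```python
import platform
import subprocess


def flags_include_avx(flag_text: str) -> bool:
    """Return True if a whitespace-separated CPU flag listing names AVX."""
    tokens = {t.lower() for t in flag_text.replace(":", " ").split()}
    # Linux lists "avx"; macOS lists "AVX1.0" in machdep.cpu.features.
    # "avx2" alone is deliberately not counted as a match.
    return "avx" in tokens or "avx1.0" in tokens


def read_cpu_flags() -> str:
    """Best-effort read of the CPU flag listing; '' if it is unavailable."""
    system = platform.system()
    try:
        if system == "Linux":
            with open("/proc/cpuinfo") as f:
                return " ".join(line for line in f if line.startswith("flags"))
        if system == "Darwin":
            result = subprocess.run(
                ["sysctl", "-n", "machdep.cpu.features"],
                capture_output=True, text=True, check=False,
            )
            return result.stdout
    except OSError:
        # Missing file or missing sysctl binary: report nothing.
        pass
    return ""


if __name__ == "__main__":
    print("AVX reported by OS:", flags_include_avx(read_cpu_flags()))
```

If this prints False on a cpu that should have avx, the OS or a virtual machine layer may be hiding the feature, which would be worth ruling out before looking at compiler flags.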

On 12 Apr, 2017 9:39 am, "Xiao-Yong Jin" wrote:

> > On Apr 11, 2017, at 7:56 PM, bill lam wrote:
> >
> > Without -DC_AVX=1, it will build a non-avx version, and your benchmark
> > result was expected. Ignore those 0.4 since the actual execution times
> > were very small and unstable for comparison.
>
> I didn't write that in my message, but I did have those.
> Considering those mmint/float/complex reached comparable performance, I
> believe my binary should have the proper avx code compiled.
>
> > If your cpu supports avx, try again with cflags -DC_AVX=1 -mavx which are
> > used in the j64avx target.
> >
> > On 12 Apr, 2017 8:33 am, "Xiao-Yong Jin" wrote:
> >
> > How do you guys compile the jengine?
> > I tried to compile the source code from github, and got the following
> > benchmark.
> > I used gcc6.3 from macports. I also tried the clang from xcode, but didn't
> > see much difference.
> > I tried a few optimization levels, O1, O2, O3, Ofast, and turning off some
> > aggressive optimizations to pass the tsu.ijs.*
> > The following is from -march=native -Ofast -fno-finite-math-only
> > -fno-tree-loop-vectorize -fwrapv -fno-strict-aliasing
> > It seems that the intsr and intbr are much worse in the binary I compiled.
> > What could be the possible reason for my result here?
> >
> > 2017 4 11 19 2
> > Intel(R) Core(TM) i5-2557M CPU @ 1.70GHz
> >
> > j806/j64avx/darwin/beta-3/commercial/www.jsoftware.com/2017-04-10T18:24:03
> > j806/j64avx/darwin/beta/GPL3/jxy/2017-04-11T14:38:44
> > intsr (small range) special code avoids hash - intbr (big range)
> > float0 tests use !.0 where appropriate
> > N in tables below indicate avx JE runs N times faster than 805
> >
> > 'type' set 1e7 1e3
> >  intsr  intbr   char  float float0  test
> >    1.0    1.0    1.2    1.0    1.1  a i. a
> >    1.2    1.0    1.3    1.0    1.0  a i. b
> >    0.7    1.0    1.3    1.0    1.0  b i. a
> >    0.4    0.9    1.2    1.0    1.0  a e. b
> >    1.1    0.9    1.4    1.1    1.0  b e. a
> >    0.4    0.9    1.4    1.0    1.1  a (+/@:e.) b
> >    0.4    1.0    1.3    0.8    1.0  a (e. i. 1:) b
> >    0.9    0.9    1.2    1.0    1.0  ~.a
> >    0.7    0.9    1.3    1.0    1.0  ~:a
> >    1.0    1.0    1.3    1.0    1.0  /:a
> >    1.0    1.0    1.0    1.0    0.9  /:~a
> >
> > 'type' set 1e5 1e3
> >  intsr  intbr   char  float float0  test
> >    0.9    1.0    1.0    0.9    1.0  a i. a
> >    1.2    0.9    1.2    1.0    1.0  a i. b
> >    0.5    0.9    1.2    1.0    1.1  b i. a
> >    0.4    0.8    1.2    1.0    1.0  a e. b
> >    1.0    0.9    1.2    1.0    1.0  b e. a
> >    0.4    0.9    1.2    1.0    1.1  a (+/@:e.) b
> >    0.8    1.0    1.3    1.0    1.0  a (e. i. 1:) b
> >    0.9    0.9    1.0    1.0    1.0  ~.a
> >    0.8    0.9    1.1    0.9    0.9  ~:a
> >    1.0    1.0    1.1    1.0    1.0  /:a
> >    1.0    1.0    1.0    1.0    1.0  /:~a
> >
> > 'type' set '1e3'
> >     mmint   mmfloat mmcomplex  test
> >       1.0       0.9       1.0  a +/ . * b
> >
> >> On Apr 11, 2017, at 5:48 PM, Eric Iverson wrote:
> >>
> >> It is not possible to upgrade through pacman. The only way is to install
> >> the zip release.
> >>
> >> The previous release included avx and non-avx binaries. The new version
> >> includes only the avx version. This reflects what will probably be in the
> >> final release.
> >>
> >> On Tue, Apr 11, 2017 at 6:16 PM, 'Pascal Jasmin' via Programming wrote:
> >>
> >>> The original beta included a load script, and verb execution that IIRC
> >>> switched j.dll to "avx.dll"
> >>>
> >>> Those instructions could not be found in the web instructions.
> >>>
> >>> Also, is it possible to upgrade beta through pacman?
> >>>
> >>> ________________________________
> >>> From: Eric Iverson
> >>> To: Programming forum
> >>> Sent: Tuesday, April 11, 2017 3:18 PM
> >>> Subject: [Jprogramming] 806 beta-3 available
> >>>
> >>> 806 beta-3 available.
> >>>
> >>> Comments from original announcement are repeated here for emphasis.
> >>>
> >>> 806 will be primarily a performance release. This is the first J release
> >>> where hardware features are directly used for performance. Previous
> >>> releases depended on excellent code and smart algorithms. With Advanced
> >>> Vector Extensions (AVX) Intel finally (first hardware released in 2011) has
> >>> hardware that seems to have J, at least partly, in mind.
> >>>
> >>> A rough benchmark report is at the end of this message. It has been a long
> >>> time since we've been able to brag of a factor of 10 speedup in a
> >>> primitive.
> >>>
> >>> Improvements in i. and related areas are important in J, but faster
> >>> crunching is usually overwhelmed by all the housekeeping in an application.
> >>> Some things run 10 times faster, but your application won't.
> >>>
> >>> Please get involved in the beta program, it helps make a better product for
> >>> everyone.
> >>>
> >>> And give big thanks to Henry Rich for this core JE development!
> >>>
> >>> ***
> >>>
> >>> Follow web site download links. There have been changes. Please follow the
> >>> directions and report any problems.
> >>>
> >>> These releases are only for windows/osx/linux intel/amd 64 and include only
> >>> an avx binary.
> >>>
> >>> The J Engine load will fail if the hardware/OS does not support avx.
> >>>
> >>> *** benchmark report - ~addons/ide/jhs/misc/avx.ijs
> >>>
> >>> 2017 4 11 15 1
> >>> Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
> >>> j805/j64/linux/release-a/commercial/www.jsoftware.com/2017-02-26T16:47:20
> >>> j806/j64avx/linux/beta-3/commercial/www.jsoftware.com/2017-04-10T17:51:14
> >>> intsr (small range) special code avoids hash - intbr (big range)
> >>> float0 tests use !.0 where appropriate
> >>> N in tables below indicate avx JE runs N times faster than 805
> >>>
> >>> 'type' set 1e7 1e3
> >>>  intsr  intbr   char  float float0  test
> >>>    1.3    2.0    4.1    2.1    3.6  a i. a
> >>>   12.9   10.7   25.8    2.1   20.5  a i. b
> >>>    3.4    7.3    8.6    6.7   10.6  b i. a
> >>>    5.3    8.2    9.1    7.2   13.2  a e. b
> >>>    6.4   10.6   25.9    2.1   20.4  b e. a
> >>>    5.3    8.8    9.5    7.0   13.2  a (+/@:e.) b
> >>>    4.4    6.5    9.6   53.4   13.2  a (e. i. 1:) b
> >>>    1.7    1.9    3.8    1.7    1.7  ~.a
> >>>    1.6    2.1    4.1    2.2    2.0  ~:a
> >>>    1.1    1.3    1.3    1.2    1.2  /:a
> >>>    1.3    2.1    2.2    1.9    1.9  /:~a
> >>>
> >>> 'type' set 1e5 1e3
> >>>  intsr  intbr   char  float float0  test
> >>>    2.0    3.5    5.3    2.4    4.7  a i. a
> >>>    4.2    4.7    9.2    3.2    6.9  a i. b
> >>>    5.4    8.0    8.9    5.2   11.7  b i. a
> >>>    5.8    7.7    9.3    6.7   12.7  a e. b
> >>>    4.1    4.7    9.2    3.2    6.9  b e. a
> >>>    5.7    8.3    9.7    6.5   12.5  a (+/@:e.) b
> >>>    2.0    4.0    7.8   21.0   12.6  a (e. i. 1:) b
> >>>    1.5    3.3    4.5    1.9    1.9  ~.a
> >>>    1.3    3.4    5.2    2.9    2.9  ~:a
> >>>    2.1    2.1    1.3    1.9    1.9  /:a
> >>>    1.7    2.3    1.4    2.1    2.1  /:~a
> >>>
> >>> 'type' set '1e3'
> >>>     mmint   mmfloat mmcomplex  test
> >>>      27.6      23.7      17.9  a +/ . * b
> >>>
> >>> ----------------------------------------------------------------------
> >>> For information about J forums see http://www.jsoftware.com/forums.htm
