The beta binaries should have been built from exactly the same sources and cflags that are in git, with no tweaking of the -O? flags. Timing results are variable, so you may also want to look at the actual time taken to decide whether a comparison can be taken seriously.
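To illustrate the point about very short timings, here is a rough sketch in Python (not J, and not part of avx.ijs). The function name `speedup` and the 10 ms cut-off are my own assumptions, chosen only for illustration: it computes the same kind of "N times faster" ratio as the report, but also flags ratios whose underlying timings are too short to trust.

```python
def speedup(t_ref, t_new, min_seconds=0.01):
    """Return (ratio, trustworthy) for two wall-clock timings in seconds.

    ratio       -- t_ref / t_new, i.e. how many times faster the new build ran
    trustworthy -- False when either timing is below min_seconds, because such
                   short runs are dominated by noise (min_seconds is an
                   assumed threshold, not a value taken from the J sources)
    """
    if t_ref <= 0 or t_new <= 0:
        raise ValueError("timings must be positive")
    ratio = t_ref / t_new
    trustworthy = min(t_ref, t_new) >= min_seconds
    return ratio, trustworthy


if __name__ == "__main__":
    # A long-running test: the ratio is meaningful.
    print(speedup(2.0, 0.5))
    # A sub-millisecond test: the ratio is computed but flagged as unreliable.
    print(speedup(0.0004, 0.001))
```

A ratio such as 0.4 that comes from sub-millisecond timings would be flagged here, and should be rerun on a larger input before drawing conclusions.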
On 12 Apr, 2017 9:49 am, "Xiao-Yong Jin" <[email protected]> wrote:

>
> > On Apr 11, 2017, at 8:45 PM, bill lam <[email protected]> wrote:
> >
> > I doubt it, inner product should be 4 to 20 times faster with avx support.
> > check your JVERSION and os support for switching ymm registers.
> I'm comparing my compiled binary with beta-3, NOT with j805.
> From the report:
>
> j806/j64avx/darwin/beta-3/commercial/www.jsoftware.com/2017-04-10T18:24:03
> j806/j64avx/darwin/beta/GPL3/jxy/2017-04-11T14:38:44
>
> >
> > On 12 Apr, 2017 9:39 am, "Xiao-Yong Jin" <[email protected]> wrote:
> >
> >>
> >>> On Apr 11, 2017, at 7:56 PM, bill lam <[email protected]> wrote:
> >>>
> >>> without -DC_AVX=1 , it will build a non-avx version, and your benchmark
> >>> result was expected. Ignore those 0.4 since the actual execution times
> >>> were very small and unstable for comparison.
> >> I didn't write that in my message, but I did have those.
> >> Considering those mmint/float/complex reached comparable performance,
> >> I believe my binary should have the proper avx code compiled.
> >>
> >>>
> >>> If your cpu supports avx, try again with cflags -DC_AVX=1 -mavx which
> >>> are used in the j64avx target.
> >>>
> >>>
> >>> On 12 Apr, 2017 8:33 am, "Xiao-Yong Jin" <[email protected]> wrote:
> >>>
> >>> How do you guys compile the jengine?
> >>> I tried to compile the source code from github, and got the following
> >>> benchmark.
> >>> I used gcc6.3 from macports. I also tried the clang from xcode, but
> >>> didn't see much difference.
> >>> I tried a few optimization levels, O1, O2, O3, Ofast, and turning off
> >>> some aggressive optimizations to pass the tsu.ijs.*
> >>> The following is from -march=native -Ofast -fno-finite-math-only
> >>> -fno-tree-loop-vectorize -fwrapv -fno-strict-aliasing
> >>> It seems that the intsr and intbr are much worse in the binary I
> >>> compiled.
> >>> What could be the possible reason for my result here?
> >>>
> >>> 2017 4 11 19 2
> >>> Intel(R) Core(TM) i5-2557M CPU @ 1.70GHz
> >>>
> >>> j806/j64avx/darwin/beta-3/commercial/www.jsoftware.com/2017-04-10T18:24:03
> >>> j806/j64avx/darwin/beta/GPL3/jxy/2017-04-11T14:38:44
> >>> intsr (small range) special code avoids hash - intbr (big range)
> >>> float0 tests use !.0 where appropriate
> >>> N in tables below indicate avx JE runs N times faster than 805
> >>>
> >>> 'type' set 1e7 1e3
> >>> intsr  intbr   char  float  float0  test
> >>>   1.0    1.0    1.2    1.0     1.1  a i. a
> >>>   1.2    1.0    1.3    1.0     1.0  a i. b
> >>>   0.7    1.0    1.3    1.0     1.0  b i. a
> >>>   0.4    0.9    1.2    1.0     1.0  a e. b
> >>>   1.1    0.9    1.4    1.1     1.0  b e. a
> >>>   0.4    0.9    1.4    1.0     1.1  a (+/@:e.) b
> >>>   0.4    1.0    1.3    0.8     1.0  a (e. i. 1:) b
> >>>   0.9    0.9    1.2    1.0     1.0  ~.a
> >>>   0.7    0.9    1.3    1.0     1.0  ~:a
> >>>   1.0    1.0    1.3    1.0     1.0  /:a
> >>>   1.0    1.0    1.0    1.0     0.9  /:~a
> >>>
> >>> 'type' set 1e5 1e3
> >>> intsr  intbr   char  float  float0  test
> >>>   0.9    1.0    1.0    0.9     1.0  a i. a
> >>>   1.2    0.9    1.2    1.0     1.0  a i. b
> >>>   0.5    0.9    1.2    1.0     1.1  b i. a
> >>>   0.4    0.8    1.2    1.0     1.0  a e. b
> >>>   1.0    0.9    1.2    1.0     1.0  b e. a
> >>>   0.4    0.9    1.2    1.0     1.1  a (+/@:e.) b
> >>>   0.8    1.0    1.3    1.0     1.0  a (e. i. 1:) b
> >>>   0.9    0.9    1.0    1.0     1.0  ~.a
> >>>   0.8    0.9    1.1    0.9     0.9  ~:a
> >>>   1.0    1.0    1.1    1.0     1.0  /:a
> >>>   1.0    1.0    1.0    1.0     1.0  /:~a
> >>>
> >>> 'type' set '1e3'
> >>> mmint  mmfloat  mmcomplex  test
> >>>   1.0      0.9        1.0  a +/ . * b
> >>>
> >>>> On Apr 11, 2017, at 5:48 PM, Eric Iverson <[email protected]> wrote:
> >>>>
> >>>> It is not possible to upgrade through pacman. The only way is to
> >>>> install the zip release.
> >>>>
> >>>> The previous release included avx and non-avx binaries. The new
> >>>> version includes only the avx version. This reflects what will
> >>>> probably be in the final release.
> >>>>
> >>>> On Tue, Apr 11, 2017 at 6:16 PM, 'Pascal Jasmin' via Programming <
> >>>> [email protected]> wrote:
> >>>>
> >>>>> The original beta included a load script, and verb execution that
> >>>>> IIRC switched j.dll to "avx.dll"
> >>>>>
> >>>>> Those instructions could not be found in the web instructions.
> >>>>>
> >>>>> Also, is it possible to upgrade beta through pacman?
> >>>>>
> >>>>> ________________________________
> >>>>> From: Eric Iverson <[email protected]>
> >>>>> To: Programming forum <[email protected]>
> >>>>> Sent: Tuesday, April 11, 2017 3:18 PM
> >>>>> Subject: [Jprogramming] 806 beta-3 available
> >>>>>
> >>>>> 806 beta-3 available.
> >>>>>
> >>>>> Comments from original announcement are repeated here for emphasis.
> >>>>>
> >>>>> 806 will be primarily a performance release. This is the first J
> >>>>> release where hardware features are directly used for performance.
> >>>>> Previous releases depended on excellent code and smart algorithms.
> >>>>> With Advanced Vector Extensions (AVX) Intel finally (first hardware
> >>>>> released in 2011) has hardware that seems to have J, at least
> >>>>> partly, in mind.
> >>>>>
> >>>>> A rough benchmark report is at the end of this message. It has been
> >>>>> a long time since we've been able to brag of a factor of 10 speedup
> >>>>> in a primitive.
> >>>>>
> >>>>> Improvements in i. and related areas are important in J, but faster
> >>>>> crunching is usually overwhelmed by all the housekeeping in an
> >>>>> application. Some things run 10 times faster, but your application
> >>>>> won't.
> >>>>>
> >>>>> Please get involved in the beta program, it helps make a better
> >>>>> product for everyone.
> >>>>>
> >>>>> And give big thanks to Henry Rich for this core JE development!
> >>>>>
> >>>>> ***
> >>>>>
> >>>>> Follow web site download links. There have been changes. Please
> >>>>> follow the directions and report any problems.
> >>>>>
> >>>>> These releases are only for windows/osx/linux intel/amd 64 and
> >>>>> include only an avx binary.
> >>>>>
> >>>>> The J Engine load will fail if the hardware/OS does not support avx.
> >>>>>
> >>>>> *** benchmark report - ~addons/ide/jhs/misc/avx.ijs
> >>>>>
> >>>>> 2017 4 11 15 1
> >>>>> Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
> >>>>> j805/j64/linux/release-a/commercial/www.jsoftware.com/2017-02-26T16:47:20
> >>>>> j806/j64avx/linux/beta-3/commercial/www.jsoftware.com/2017-04-10T17:51:14
> >>>>> intsr (small range) special code avoids hash - intbr (big range)
> >>>>> float0 tests use !.0 where appropriate
> >>>>> N in tables below indicate avx JE runs N times faster than 805
> >>>>>
> >>>>> 'type' set 1e7 1e3
> >>>>> intsr  intbr   char  float  float0  test
> >>>>>   1.3    2.0    4.1    2.1     3.6  a i. a
> >>>>>  12.9   10.7   25.8    2.1    20.5  a i. b
> >>>>>   3.4    7.3    8.6    6.7    10.6  b i. a
> >>>>>   5.3    8.2    9.1    7.2    13.2  a e. b
> >>>>>   6.4   10.6   25.9    2.1    20.4  b e. a
> >>>>>   5.3    8.8    9.5    7.0    13.2  a (+/@:e.) b
> >>>>>   4.4    6.5    9.6   53.4    13.2  a (e. i. 1:) b
> >>>>>   1.7    1.9    3.8    1.7     1.7  ~.a
> >>>>>   1.6    2.1    4.1    2.2     2.0  ~:a
> >>>>>   1.1    1.3    1.3    1.2     1.2  /:a
> >>>>>   1.3    2.1    2.2    1.9     1.9  /:~a
> >>>>>
> >>>>> 'type' set 1e5 1e3
> >>>>> intsr  intbr   char  float  float0  test
> >>>>>   2.0    3.5    5.3    2.4     4.7  a i. a
> >>>>>   4.2    4.7    9.2    3.2     6.9  a i. b
> >>>>>   5.4    8.0    8.9    5.2    11.7  b i. a
> >>>>>   5.8    7.7    9.3    6.7    12.7  a e. b
> >>>>>   4.1    4.7    9.2    3.2     6.9  b e. a
> >>>>>   5.7    8.3    9.7    6.5    12.5  a (+/@:e.) b
> >>>>>   2.0    4.0    7.8   21.0    12.6  a (e. i. 1:) b
> >>>>>   1.5    3.3    4.5    1.9     1.9  ~.a
> >>>>>   1.3    3.4    5.2    2.9     2.9  ~:a
> >>>>>   2.1    2.1    1.3    1.9     1.9  /:a
> >>>>>   1.7    2.3    1.4    2.1     2.1  /:~a
> >>>>>
> >>>>> 'type' set '1e3'
> >>>>> mmint  mmfloat  mmcomplex  test
> >>>>>  27.6     23.7       17.9  a +/ . * b
> >>>>>
> >>>>> ----------------------------------------------------------------------
> >>>>> For information about J forums see http://www.jsoftware.com/forums.htm
