> On Apr 11, 2017, at 7:56 PM, bill lam <[email protected]> wrote:
>
> without -DC_AVX=1, it will build a non-avx version, and your benchmark
> result was expected. Ignore those 0.4 since the actual execution times were
> very small and unstable for comparison.

I didn't mention it in my message, but I did build with those flags. Since mmint/mmfloat/mmcomplex reached comparable performance, I believe my binary has the proper avx code compiled in.
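To make the point about the 0.4 entries concrete, here is a minimal sketch in Python of how such a ratio can be computed and screened. It assumes N is simply the baseline time divided by the candidate time, taking the fastest of several runs. This is not taken from avx.ijs, and the function names and the 10 ms threshold are hypothetical choices for illustration.

```python
def speedup(baseline_seconds, candidate_seconds):
    """Return how many times faster the candidate ran than the baseline.

    Both arguments are lists of repeated wall-clock timings in seconds.
    The fastest run of each is used, since it is the least disturbed by noise.
    """
    return min(baseline_seconds) / min(candidate_seconds)


def is_stable(samples, min_seconds=0.01):
    """Accept a measurement only if even its fastest run is long enough.

    min_seconds is an assumed threshold (10 ms), not a value from the thread.
    """
    return min(samples) >= min_seconds


if __name__ == "__main__":
    # Hypothetical timings: a long-running test and a very short one.
    long_old, long_new = [2.0, 2.5], [0.5, 1.0]
    short_old, short_new = [0.00002, 0.00005], [0.00005, 0.00004]

    for name, old, new in [("long", long_old, long_new),
                           ("short", short_old, short_new)]:
        ratio = speedup(old, new)
        trusted = is_stable(old) and is_stable(new)
        print(f"{name}: ratio {ratio:.1f}, trusted: {trusted}")
```

With timings in the microsecond range, a ratio well below 1 can come from timer noise alone, which is why a screen like is_stable is worth applying before reading anything into a single table entry.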
>
> If your cpu supports avx, try again with cflags -DC_AVX=1 -mavx which are
> used in the j64avx target.
>
>
> On 12 Apr, 2017 8:33 am, "Xiao-Yong Jin" <[email protected]> wrote:
>
> How do you guys compile the jengine?
> I tried to compile the source code from github, and got the following
> benchmark.
> I used gcc6.3 from macports. I also tried the clang from xcode, but didn't
> see much difference.
> I tried a few optimization levels, O1, O2, O3, Ofast, and turning off some
> aggressive optimizations to pass the tsu.ijs.*
> The following is from -march=native -Ofast -fno-finite-math-only
> -fno-tree-loop-vectorize -fwrapv -fno-strict-aliasing
> It seems that the intsr and intbr are much worse in the binary I compiled.
> What could be the possible reason for my result here?
>
>
> 2017 4 11 19 2
> Intel(R) Core(TM) i5-2557M CPU @ 1.70GHz
>
> j806/j64avx/darwin/beta-3/commercial/www.jsoftware.com/2017-04-10T18:24:03
> j806/j64avx/darwin/beta/GPL3/jxy/2017-04-11T14:38:44
> intsr (small range) special code avoids hash - intbr (big range)
> float0 tests use !.0 where appropriate
> N in tables below indicate avx JE runs N times faster than 805
>
> 'type' set 1e7 1e3
> intsr intbr  char float float0 test
>   1.0   1.0   1.2   1.0    1.1 a i. a
>   1.2   1.0   1.3   1.0    1.0 a i. b
>   0.7   1.0   1.3   1.0    1.0 b i. a
>   0.4   0.9   1.2   1.0    1.0 a e. b
>   1.1   0.9   1.4   1.1    1.0 b e. a
>   0.4   0.9   1.4   1.0    1.1 a (+/@:e.) b
>   0.4   1.0   1.3   0.8    1.0 a (e. i. 1:) b
>   0.9   0.9   1.2   1.0    1.0 ~.a
>   0.7   0.9   1.3   1.0    1.0 ~:a
>   1.0   1.0   1.3   1.0    1.0 /:a
>   1.0   1.0   1.0   1.0    0.9 /:~a
>
> 'type' set 1e5 1e3
> intsr intbr  char float float0 test
>   0.9   1.0   1.0   0.9    1.0 a i. a
>   1.2   0.9   1.2   1.0    1.0 a i. b
>   0.5   0.9   1.2   1.0    1.1 b i. a
>   0.4   0.8   1.2   1.0    1.0 a e. b
>   1.0   0.9   1.2   1.0    1.0 b e. a
>   0.4   0.9   1.2   1.0    1.1 a (+/@:e.) b
>   0.8   1.0   1.3   1.0    1.0 a (e. i. 1:) b
>   0.9   0.9   1.0   1.0    1.0 ~.a
>   0.8   0.9   1.1   0.9    0.9 ~:a
>   1.0   1.0   1.1   1.0    1.0 /:a
>   1.0   1.0   1.0   1.0    1.0 /:~a
>
> 'type' set '1e3'
> mmint mmfloat mmcomplex test
>   1.0     0.9       1.0 a +/ . * b
>
>> On Apr 11, 2017, at 5:48 PM, Eric Iverson <[email protected]> wrote:
>>
>> It is not possible to upgrade through pacman. The only way is to install
>> the zip release.
>>
>> The previous release included avx and non-avx binaries. The new version
>> includes only the avx version. This reflects what will probably be in the
>> final release.
>>
>> On Tue, Apr 11, 2017 at 6:16 PM, 'Pascal Jasmin' via Programming <
>> [email protected]> wrote:
>>
>>> The original beta included a load script, and verb execution that IIRC
>>> switched j.dll to "avx.dll"
>>>
>>> Those instructions could not be found in the web instructions.
>>>
>>> Also, is it possible to upgrade beta through pacman?
>>>
>>> ________________________________
>>> From: Eric Iverson <[email protected]>
>>> To: Programming forum <[email protected]>
>>> Sent: Tuesday, April 11, 2017 3:18 PM
>>> Subject: [Jprogramming] 806 beta-3 available
>>>
>>> 806 beta-3 available.
>>>
>>> Comments from original announcement are repeated here for emphasis.
>>>
>>> 806 will be primarily a performance release. This is the first J release
>>> where hardware features are directly used for performance. Previous
>>> releases depended on excellent code and smart algorithms. With Advanced
>>> Vector Extensions (AVX) Intel finally (first hardware released in 2011) has
>>> hardware that seems to have J, at least partly, in mind.
>>>
>>> A rough benchmark report is at the end of this message. It has been a long
>>> time since we've been able to brag of a factor of 10 speedup in a
>>> primitive.
>>>
>>> Improvements in i. and related areas are important in J, but faster
>>> crunching is usually overwhelmed by all the housekeeping in an application.
>>> Some things run 10 times faster, but your application won't.
>>>
>>> Please get involved in the beta program, it helps make a better product for
>>> everyone.
>>>
>>> And give big thanks to Henry Rich for this core JE development!
>>>
>>> ***
>>>
>>> Follow web site download links. There have been changes. Please follow the
>>> directions and report any problems.
>>>
>>> These releases are only for windows/osx/linux intel/amd 64 and include only
>>> an avx binary.
>>>
>>> The J Engine load will fail if the hardware/OS does not support avx.
>>>
>>> *** benchmark report - ~addons/ide/jhs/misc/avx.ijs
>>>
>>> 2017 4 11 15 1
>>> Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
>>> j805/j64/linux/release-a/commercial/www.jsoftware.com/2017-02-26T16:47:20
>>> j806/j64avx/linux/beta-3/commercial/www.jsoftware.com/2017-04-10T17:51:14
>>> intsr (small range) special code avoids hash - intbr (big range)
>>> float0 tests use !.0 where appropriate
>>> N in tables below indicate avx JE runs N times faster than 805
>>>
>>> 'type' set 1e7 1e3
>>> intsr intbr  char float float0 test
>>>   1.3   2.0   4.1   2.1    3.6 a i. a
>>>  12.9  10.7  25.8   2.1   20.5 a i. b
>>>   3.4   7.3   8.6   6.7   10.6 b i. a
>>>   5.3   8.2   9.1   7.2   13.2 a e. b
>>>   6.4  10.6  25.9   2.1   20.4 b e. a
>>>   5.3   8.8   9.5   7.0   13.2 a (+/@:e.) b
>>>   4.4   6.5   9.6  53.4   13.2 a (e. i. 1:) b
>>>   1.7   1.9   3.8   1.7    1.7 ~.a
>>>   1.6   2.1   4.1   2.2    2.0 ~:a
>>>   1.1   1.3   1.3   1.2    1.2 /:a
>>>   1.3   2.1   2.2   1.9    1.9 /:~a
>>>
>>> 'type' set 1e5 1e3
>>> intsr intbr  char float float0 test
>>>   2.0   3.5   5.3   2.4    4.7 a i. a
>>>   4.2   4.7   9.2   3.2    6.9 a i. b
>>>   5.4   8.0   8.9   5.2   11.7 b i. a
>>>   5.8   7.7   9.3   6.7   12.7 a e. b
>>>   4.1   4.7   9.2   3.2    6.9 b e. a
>>>   5.7   8.3   9.7   6.5   12.5 a (+/@:e.) b
>>>   2.0   4.0   7.8  21.0   12.6 a (e. i. 1:) b
>>>   1.5   3.3   4.5   1.9    1.9 ~.a
>>>   1.3   3.4   5.2   2.9    2.9 ~:a
>>>   2.1   2.1   1.3   1.9    1.9 /:a
>>>   1.7   2.3   1.4   2.1    2.1 /:~a
>>>
>>> 'type' set '1e3'
>>> mmint mmfloat mmcomplex test
>>>  27.6    23.7      17.9 a +/ . * b
>>>
>>> ----------------------------------------------------------------------
>>> For information about J forums see http://www.jsoftware.com/forums.htm
