Thanks. It's interesting to see how far the avx has come along. And the |: takes more than 20% of the time? I guess this is something that could be improved.
> On Apr 21, 2017, at 7:42 AM, bill lam <[email protected]> wrote: > > Opp, output from dgemm should be transposed to row major. > > dgemm=: 'liblapack.so.3 dgemm_ > n *c *c *i *i *i *d *d *i *d *i *d *d *i'&cd > > mm=: 4 : 0 > k=. ,{.$x > c=. (k,k)$1.5-1.5 > dgemm (,'T');(,'T');k;k;k;(,2.5-1.5);x;k;y;k;(,1.5-1.5);c;k > |:c > ) > > 'A B'=:0?@$~2,,~4096 > echo timespacex'c1=: A+/ .*B' > echo timespacex'c2=: A mm B' > echo c1-:c2 > > NB. avx > load'dgemm.ijs' > 19.4683 2.68438e8 > 1.11488 5.36873e8 > 1 > > NB. j602 > 167.99789 2.684384e8 > 1.224063 5.369056e8 > 1 > > j806 version is already quite good. > > Пт, 21 апр 2017, bill lam написал(а): >> I tested with J calling lapack for matrix multiplication with the >> following script, >> >> NB. extern dgemm_(char * transa, char * transb, int * m, int * n, int * k, >> NB. double * alpha, double * A, int * lda, >> NB. double * B, int * ldb, double * beta, >> NB. double * C, int * ldc); >> >> dgemm=: 'liblapack.so.3 dgemm_ > n *c *c *i *i *i *d *d *i *d *i *d *d *i'&cd >> >> mm=: 4 : 0 >> k=. ,{.$x >> c=. (k,k)$1.5-1.5 >> dgemm (,'T');(,'T');k;k;k;(,2.5-1.5);x;k;y;k;(,1.5-1.5);c;k >> c >> ) >> >> 'A B'=:0?@$~2,,~4096 >> echo timespacex'A+/ .*B' >> echo timespacex'A mm B' >> >> result was, >> 19.3608 2.68437e8 >> 0.886447 2.68442e8 >> >> Note it need to use an optimized version of blas, not the >> reference blas. >> >> Apparently the blas used in julia is sub-optimal. >> >> Вт, 18 апр 2017, bill lam написал(а): >>> I think julia just calls blas. >>> >>> Пн, 17 апр 2017, Xiao-Yong Jin написал(а): >>>> >>>>> On Apr 17, 2017, at 9:26 PM, Henry Rich <[email protected]> wrote: >>>>> >>>>> If you have an implementation of +/ . * on double-precision floats that's >>>>> faster than J 8.06, I would be obliged if you'd send me a copy of the >>>>> source code. >>>> >>>> I'm sure your code is much faster than naive c loops. But some how the >>>> matrix-matrix multiplication is much slower (10x) than that in julia >>>> (tested with a 3-year old version). >>>> >>>> % julia >>>> _ >>>> _ _ _(_)_ | A fresh approach to technical computing >>>> (_) | (_) (_) | Documentation: http://docs.julialang.org >>>> _ _ _| |_ __ _ | Type "help()" to list help topics >>>> | | | | | | |/ _` | | >>>> | | |_| | | | (_| | | Version 0.2.1 (2014-02-11 06:30 UTC) >>>> _/ |\__'_|_|_|\__'_| | >>>> |__/ | x86_64-linux-gnu >>>> >>>> julia> A=rand(4096,4096); B=rand(4096,4096); >>>> >>>> julia> @time A*B; >>>> elapsed time: 2.260157127 seconds (149184640 bytes allocated) >>>> >>>> julia> >>>> % jconsole >>>> JVERSION >>>> Engine: j806/j64avx/linux >>>> Beta-3: commercial/2017-04-10T17:51:14 >>>> Library: 8.06.02 >>>> Platform: Linux 64 >>>> Installer: J806 install >>>> InstallPath: /nfs2/xjin/pkgs/j64-806 >>>> Contact: www.jsoftware.com >>>> 'A B'=:0?@$~2,,~4096 >>>> timespacex'A+/ .*B' >>>> 23.8976 2.68437e8 >>>> timespacex'A+/ .*B' >>>> >>>> ---------------------------------------------------------------------- >>>> For information about J forums see http://www.jsoftware.com/forums.htm >>> >>> -- >>> regards, >>> ==================================================== >>> GPG key 1024D/4434BAB3 2008-08-24 >>> gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3 >>> gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3 >> >> -- >> regards, >> ==================================================== >> GPG key 1024D/4434BAB3 2008-08-24 >> gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3 >> gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3 > > -- > regards, > ==================================================== > GPG key 1024D/4434BAB3 2008-08-24 > gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3 > gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3 > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
