Opp, output from dgemm should be transposed to row major.
dgemm=: 'liblapack.so.3 dgemm_ > n *c *c *i *i *i *d *d *i *d *i *d *d *i'&cd
mm=: 4 : 0
k=. ,{.$x
c=. (k,k)$1.5-1.5
dgemm (,'T');(,'T');k;k;k;(,2.5-1.5);x;k;y;k;(,1.5-1.5);c;k
|:c
)
'A B'=:0?@$~2,,~4096
echo timespacex'c1=: A+/ .*B'
echo timespacex'c2=: A mm B'
echo c1-:c2
NB. avx
load'dgemm.ijs'
19.4683 2.68438e8
1.11488 5.36873e8
1
NB. j602
167.99789 2.684384e8
1.224063 5.369056e8
1
j806 version is already quite good.
Пт, 21 апр 2017, bill lam написал(а):
> I tested with J calling lapack for matrix multiplication with the
> following script,
>
> NB. extern dgemm_(char * transa, char * transb, int * m, int * n, int * k,
> NB. double * alpha, double * A, int * lda,
> NB. double * B, int * ldb, double * beta,
> NB. double * C, int * ldc);
>
> dgemm=: 'liblapack.so.3 dgemm_ > n *c *c *i *i *i *d *d *i *d *i *d *d *i'&cd
>
> mm=: 4 : 0
> k=. ,{.$x
> c=. (k,k)$1.5-1.5
> dgemm (,'T');(,'T');k;k;k;(,2.5-1.5);x;k;y;k;(,1.5-1.5);c;k
> c
> )
>
> 'A B'=:0?@$~2,,~4096
> echo timespacex'A+/ .*B'
> echo timespacex'A mm B'
>
> result was,
> 19.3608 2.68437e8
> 0.886447 2.68442e8
>
> Note it need to use an optimized version of blas, not the
> reference blas.
>
> Apparently the blas used in julia is sub-optimal.
>
> Вт, 18 апр 2017, bill lam написал(а):
> > I think julia just calls blas.
> >
> > Пн, 17 апр 2017, Xiao-Yong Jin написал(а):
> > >
> > > > On Apr 17, 2017, at 9:26 PM, Henry Rich <[email protected]> wrote:
> > > >
> > > > If you have an implementation of +/ . * on double-precision floats
> > > > that's faster than J 8.06, I would be obliged if you'd send me a copy
> > > > of the source code.
> > >
> > > I'm sure your code is much faster than naive c loops. But some how the
> > > matrix-matrix multiplication is much slower (10x) than that in julia
> > > (tested with a 3-year old version).
> > >
> > > % julia
> > > _
> > > _ _ _(_)_ | A fresh approach to technical computing
> > > (_) | (_) (_) | Documentation: http://docs.julialang.org
> > > _ _ _| |_ __ _ | Type "help()" to list help topics
> > > | | | | | | |/ _` | |
> > > | | |_| | | | (_| | | Version 0.2.1 (2014-02-11 06:30 UTC)
> > > _/ |\__'_|_|_|\__'_| |
> > > |__/ | x86_64-linux-gnu
> > >
> > > julia> A=rand(4096,4096); B=rand(4096,4096);
> > >
> > > julia> @time A*B;
> > > elapsed time: 2.260157127 seconds (149184640 bytes allocated)
> > >
> > > julia>
> > > % jconsole
> > > JVERSION
> > > Engine: j806/j64avx/linux
> > > Beta-3: commercial/2017-04-10T17:51:14
> > > Library: 8.06.02
> > > Platform: Linux 64
> > > Installer: J806 install
> > > InstallPath: /nfs2/xjin/pkgs/j64-806
> > > Contact: www.jsoftware.com
> > > 'A B'=:0?@$~2,,~4096
> > > timespacex'A+/ .*B'
> > > 23.8976 2.68437e8
> > > timespacex'A+/ .*B'
> > >
> > > ----------------------------------------------------------------------
> > > For information about J forums see http://www.jsoftware.com/forums.htm
> >
> > --
> > regards,
> > ====================================================
> > GPG key 1024D/4434BAB3 2008-08-24
> > gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
> > gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3
>
> --
> regards,
> ====================================================
> GPG key 1024D/4434BAB3 2008-08-24
> gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
> gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3
--
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm