Dyadic + - * % ^ +. *. first? As far as I understand (and I am likely 
oversimplifying), these map closely to assembler primitives.

Also, I get roughly a 2x improvement on extended integers for i. , e. and e.~ 
using:


b=: (<.-:#a)+ x: c ?. c=. 1000 [ a=: C ?. C =. 1000000


p. and +//.@:(*/) would be promising.  Though our extended integer format 
allows 18 vector elements (per 256-bit register), and it's my impression that 
AVX allows for custom "integer packing", it may be worth considering an 
"extended binary representation format".

I did some work (which I can share) exploring how representing extended 
integers as polynomials (vectors of 32- or 64-bit numbers) produces faster 
operations (multiplication especially) than the current internal "base 10000" 
operations.  The latter is faster to print, but 64-bit systems could also move 
internally to base 1e9 and quickly convert to base 1e4 for 3!:1, getting 
roughly a 4x improvement on multiply before a potential additional 8x from AVX.
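
As a rough illustration of the polynomial view, here is a minimal sketch in 
Python (not J, not the exploratory work mentioned above, and not J's internal 
code). All function names are hypothetical. It treats an integer as a 
little-endian list of "limbs" in some base, multiplies the two lists as 
polynomials (the same product +//.@:(*/) computes on coefficient lists), and 
then propagates carries. Python integers never overflow, so the sketch ignores 
the question of when a 64-bit accumulator would need an early carry.

```python
def to_limbs(n, base):
    """Split a non-negative integer into little-endian digits in `base`."""
    limbs = []
    while True:
        n, r = divmod(n, base)
        limbs.append(r)
        if n == 0:
            return limbs


def from_limbs(limbs, base):
    """Inverse of to_limbs: evaluate the little-endian digit list."""
    n = 0
    for d in reversed(limbs):
        n = n * base + d
    return n


def poly_mul(a, b):
    """Schoolbook polynomial (convolution) product of two coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def carry(coeffs, base):
    """Propagate carries so that every coefficient is a valid digit again."""
    out, c = [], 0
    for v in coeffs:
        c, d = divmod(v + c, base)
        out.append(d)
    while c:
        c, d = divmod(c, base)
        out.append(d)
    return out


def limb_mul(x, y, base):
    """Multiply two non-negative integers via their limb vectors."""
    return carry(poly_mul(to_limbs(x, base), to_limbs(y, base)), base)


if __name__ == "__main__":
    x = 12345678901234567890123456789
    y = 98765432109876543210
    for base in (10**4, 10**9):
        limbs = limb_mul(x, y, base)
        # Same product either way; only the number of limbs differs.
        print(base, len(to_limbs(x, base)), from_limbs(limbs, base) == x * y)
```

The schoolbook product does one multiply per pair of limbs, so a larger base 
means fewer limbs and fewer multiplies for the same number. How much of that 
shows up as a real speedup depends on the implementation.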



----- Original Message -----
From: Henry Rich <henryhr...@gmail.com>
To: programm...@jsoftware.com
Sent: Sunday, March 12, 2017 9:32 PM
Subject: Re: [Jprogramming] first 806 beta vailable

Your timing includes the inverse, which is comparable to the cost of a 
multiply I think, and is not sped up.

There are many primitives that could benefit from vector instructions.  
The question is which ones to do first.

Henry Rich

On 3/12/2017 8:34 PM, 'Pascal Jasmin' via Programming wrote:
> larger matrix is 2x faster with avx...
>
>   timespacex '( %. + / . * ] ) aa'[ aa =.? 1200 1200 $ 40000
> 3.67906 1.21639e8
>
>
>
> ----- Original Message -----
> From: 'Pascal Jasmin' via Programming <programm...@jsoftware.com>
> To: "programm...@jsoftware.com" <programm...@jsoftware.com>
> Sent: Sunday, March 12, 2017 8:16 PM
> Subject: Re: [Jprogramming] first 806 beta vailable
>
> 10 timespacex '( %. + / . * ] ) aa'[ aa =.? 200 200 $ 40000
> 0.0308774 3.80518e6
>
>
> slightly slower on avx806 vs 805.
>
>
> 0.0286737 3.80518e6
>
> Exact memory match suggests maybe the processor doesn't support a specific 
> avx feature, and the code aborts to "downhandle" the operation?
>
>
>
> ----- Original Message -----
> From: Henry Rich <henryhr...@gmail.com>
> To: programm...@jsoftware.com
> Sent: Sunday, March 12, 2017 7:26 PM
> Subject: Re: [Jprogramming] first 806 beta vailable
>
> Very surprising that there is no improvement in matrix multiply, when
> the processor uses AVX instructions.  This processor has a large L2
> cache.  How large were the matrices?
>
> Henry Rich
>
> On 3/12/2017 7:20 PM, 'Pascal Jasmin' via Programming wrote:
>> On an older AMD A8-5500, most of the benchmark improvements (compared to 
>> 805) are higher (minimally) than Eric reported.  Floating point is a bit 
>> less, but float!.0 a bit more.  No improvement in matrix multiply.
>>
>>
>>
>>
>> ----- Original Message -----
>> From: 'Mike Day' via Programming <programm...@jsoftware.com>
>> To: programm...@jsoftware.com
>> Sent: Sunday, March 12, 2017 1:44 PM
>> Subject: Re: [Jprogramming] first 806 beta vailable
>>
>> I didn't know I had avx available on this machine, an AMD A10-7300,
>> running Windows 10.  Anyway,  J806 JVERSION says I do!
>>
>> I'd have a go at running the benchmarks, but wonder if there's a script
>> available to save doing them "by hand"...
>>
>> JVERSION
>> Engine: j806/j64avx/windows
>> Beta-1: commercial/2017-03-09T09:10:13
>> Library: 8.06.01
>> Qt IDE: 1.5.3/5.6.2
>> Platform: Win 64
>> Installer: J806 install
>> InstallPath: c:/d/j806
>> Contact: www.jsoftware.com
>>
>> Mike
>>
>> On 11/03/2017 22:44, Eric Iverson wrote:
>>> The first 806 beta is available.
>>>
>>> 806 will be primarily a performance release. This is the first J release
>>> where hardware features are directly used for performance. Previous
>>> releases depended on excellent code and smart algorithms. With Advanced
>>> Vector Extensions (AVX) Intel finally (first hardware released in 2011) has
>>> hardware that seems to have J, at least partially, in mind.
>>>
>>> A rough benchmark report is at the end of this message. Some of the results
>>> are already impressive and there may be more to come.
>>>
>>> Improvements in i. and related areas are important in J, but faster
>>> crunching is usually overwhelmed by all the housekeeping in an application.
>>> Some things run 10 times faster, but your application won't.
>>>
>>> It would be a shame to have non-trivial vector capabilities in the hardware
>>> and for J to not take advantage. AVX2 machines have just hit the shelves
>>> and there are more goodies there.
>>>
>>> It has been a long time since we've been able to brag of a factor of 10
>>> speedup in a primitive.
>>>
>>> Please get involved in the beta program, it helps make a better product for
>>> everyone.
>>>
>>> And give big thanks to Henry Rich for this core JE development!
>>>
>>> ***
>>> Follow web site download links to Installation/Beta. Do appropriate
>>> download from j806/install folder and then follow the Archive install
>>> instructions. These are 805 release instructions, so be careful to use 806
>>> as appropriate.
>>>
>>> The install contains a default non-avx JE binary as well as an avx JE
>>> binary. The launch icons will run the non-avx binary. Make sure the install
>>> is stable and when you are ready, switch to the avx binary with the
>>> following steps:
>>>
>>>       load'~addons/ide/jhs/installer.ijs'
>>>       avx'' NB. follow the instructions
>>>
>>> If your hardware/OS supports avx, then the next time you start 806 it will
>>> use the avx binary. Verify this by checking 9!:14'' (you will see avx in
>>> the string).
>>>
>>> *** preliminary benchmark report
>>>
>>> 2017 3 11 16 10
>>> j806/j64avx/linux/beta-1/commercial/www.jsoftware.com/2017-03-09T10:14:43
>>> i7-7700Q
>>>
>>> N in the tables below indicates that the new avx JE runs N times faster than 805
>>> b=: (<.-:#a)+ c ?. c [ a=: C ?. C                 NB. intsr
>>> b=: (c?.#a){a [ a=: C ?@$ <:2^63                  NB. intbr
>>> b=: (c?.#a){a [ a=: >,.~":each <"0 [C ?@$ <:2^63  NB. char
>>> b=: 0.1+(c?.#a){a [ a=: 0.1+C ?@$ <:2^63          NB. float
>>> intsr (small range) special code avoids hash - intbr (big range)
>>> float0 tests use !.0 where appropriate
>>>
>>> 'C c'=: 10000000 1000
>>> intsr intbr char  float float0 test
>>>      1.3   2.0   4.1   1.0   3.5  a i. a
>>>     12.8  10.4  25.5   2.1  20.2  a i. b
>>>      3.4   7.3   8.6   5.1  10.7  b i. a
>>>      5.2   8.0   9.0   5.3  12.9  a e. b
>>>      6.4  10.4  25.3   2.1  20.2  b e. a
>>>      5.3   8.8   9.5   5.1  13.1  a (+/@:e.) b
>>>      4.4   6.4   9.4  38.2  12.9  a (e. i. 1:) b
>>>      1.7   1.9   3.8   1.0   1.0  ~.a
>>>      1.6   2.1   4.1   1.0   1.0  ~:a
>>>      1.1   0.9   1.4   1.1   0.0  /:a
>>>      1.2   0.6   1.3   1.0   0.0  /:~a
>>>
>>> 'C c'=: 100000 1000
>>> intsr intbr char float float0 test
>>>     1.5   3.5   5.3  1.1   4.6   a i. a
>>>     3.4   4.7   9.3  3.1   7.1   a i. b
>>>     3.9   8.1   8.7  5.0  12.1   b i. a
>>>     4.4   7.5   9.1  5.3  12.6   a e. b
>>>     2.6   4.8   9.2  3.1   6.8   b e. a
>>>     4.4   8.4   9.5  5.2  12.9   a (+/@:e.) b
>>>     1.5   4.3   7.8 20.7  12.7   a (e. i. 1:) b
>>>     1.0   3.3   4.7  1.1   1.1   ~.a
>>>     1.3   3.5   5.2  1.2   1.1   ~:a
>>>     1.7   1.3   1.3  1.2   0.0   /:a
>>>     1.6   1.4   1.3  1.2   0.0   /:~a
>>>
>>> matrix multiply
>>>     4.5   a +/ . * b
>>> ----------------------------------------------------------------------
>>> For information about J forums see http://www.jsoftware.com/forums.htm