You are right that the implementation focuses on improving
floating-point performance. Sometimes integer arrays are converted to
floating-point just to get better speed.
Henry Rich
On 9/8/2019 10:09 PM, Don Kelly wrote:
I have used the same approach as Joey for some time now and note that
there have been quite amazing jumps (avx helps) I usually start with
an integer array such as 1+100000?10000 and save it -then repeat with
%. on the floating point result with negligible differences in
timing. Storage time is insignificant and seem to be faster for the
floating point array. Errors in multiplication of the matrix gives a
unit matrix with off diagonal errors of the order of 10^13 or less in
magnitude.
Don
901-beta i
AMD a10-7800 windows 10
On 2019-09-08 1:56 p.m., Henry Rich wrote:
What I meant you couldn't have known is how much work would go into
cache-friendly matrix multiply in J.
Henry Rich
On 9/8/2019 4:37 PM, Roger Hui wrote:
A J model of the QR decomposition can be found in
http://code.jsoftware.com/wiki/Essays/QR_Decomposition
He builded better than he knew: it relies on +/ .* for most of the
numeric work.
(Ahem) actually I knew. It was part of my M.Sc. thesis and is another
lemma which shows that %.x has the same order of complexity as +/ .*
, at
the time O(n*2^.7) (better than the conventional O(n*3)), now down to
O(n*2.3728639) according to Wikipedia. The algorithm is also good
answer
to claims that QR and other matrix decompositions are inherently loopy.
See Hui & Iverson, _A Note on Programming Style_, Vector, volume 12,
number
13, 1996-01.
On Sun, Sep 8, 2019 at 1:14 PM Henry Rich <henryhr...@gmail.com> wrote:
For years Joey Tuttle has used the performance of (%. y) as a simple
measure of the performance of J releases. It has improved pretty
steadily.
This surprised me, because I know where the gaussian-elimination code
is, and that code doesn't do much except C loops that do add and
subtract. I didn't see how improvements to the JE would have any
effect
on (%. y). Not wanting to mess up a good story, I kept my doubts to
myself.
Finally I understand. The code I had been looking at is used only for
rational matrices. The code for general (%. y) is a lovely little
recursive QR decomposition that does lots of memory allocation and
calls
to internal J arithmetic functions. In fact, it's a pretty good
exercise of the computational side of the system. It's quite short
but
uses general utilities so that it works on all fixed-precision numeric
types. Bravo to Roger. He builded better than he knew: it relies
on +/
. * for most of the numeric work, and that code has now been
polished so
well that it is the most efficient code in the system.
Long live Joey's benchmark!
Henry Rich
---
This email has been checked for viruses by AVG.
https://www.avg.com
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm