An algorithm with slightly worse O() can perform better if it can take
advantage of hardware features such as L1/L2 cache, cpu specific new
instructions. That's what BLAS trying to do.

On Mon, Sep 9, 2019, 4:44 AM Roger Hui <[email protected]> wrote:

> Arggh!  Mixed up APL and J.  Conventionally, matrix mult is O(n^3), at the
> time of my M.Sc. O(n^2^.7), now O(n^2.3728639) according to Wikipedia.
>
> A further personal note:  The recursive QR and a couple of similar
> decompositions are how I got to Erdős 2.
>
>
> On Sun, Sep 8, 2019 at 1:37 PM Roger Hui <[email protected]>
> wrote:
>
> > A J model of the QR decomposition can be found in
> > http://code.jsoftware.com/wiki/Essays/QR_Decomposition
> >
> > > He builded better than he knew: it relies on +/ .* for most of the
> > numeric work.
> >
> > (Ahem) actually I knew.  It was part of my M.Sc. thesis and is another
> > lemma which shows that %.x has the same order of complexity as +/ .* , at
> > the time O(n*2^.7) (better than the conventional O(n*3)), now down to
> > O(n*2.3728639) according to Wikipedia.  The algorithm is also good answer
> > to claims that QR and other matrix decompositions are inherently loopy.
> > See Hui & Iverson, _A Note on Programming Style_, Vector, volume 12,
> number
> > 13, 1996-01.
> >
> >
> >
> >
> >
> > On Sun, Sep 8, 2019 at 1:14 PM Henry Rich <[email protected]> wrote:
> >
> >> For years Joey Tuttle has used the performance of (%. y) as a simple
> >> measure of the performance of J releases.  It has improved pretty
> >> steadily.
> >>
> >> This surprised me, because I know where the gaussian-elimination code
> >> is, and that code doesn't do much except C loops that do add and
> >> subtract.  I didn't see how improvements to the JE would have any effect
> >> on (%. y).  Not wanting to mess up a good story, I kept my doubts to
> >> myself.
> >>
> >> Finally I understand.  The code I had been looking at is used only for
> >> rational matrices.  The code for general (%. y) is a lovely little
> >> recursive QR decomposition that does lots of memory allocation and calls
> >> to internal J arithmetic functions.  In fact, it's a pretty good
> >> exercise of the computational side of the system.  It's quite short but
> >> uses general utilities so that it works on all fixed-precision numeric
> >> types.  Bravo to Roger.  He builded better than he knew: it relies on +/
> >> . * for most of the numeric work, and that code has now been polished so
> >> well that it is the most efficient code in the system.
> >>
> >> Long live Joey's benchmark!
> >>
> >> Henry Rich
> >>
> >> ---
> >> This email has been checked for viruses by AVG.
> >> https://www.avg.com
> >>
> >> ----------------------------------------------------------------------
> >> For information about J forums see http://www.jsoftware.com/forums.htm
> >>
> >
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to