1. My initial question raised a simple and very common computational structure which seemed to me to offer a reasonably easily attainable improvement. Since the calculations grow as O(*:n), halving them helps at least a little. It has been considered before; I just wanted it considered again in case its relative position in the wish list might have improved.

2. The problem was not matrix multiplication. If the function f in f"1 1/~ is itself quite slow, a factor of 2 can be very helpful.

3. Henry's point that you then have to use the matrix is germane. In some cases that is very complex. In my problem, it is very simple and fast.

4. I am not in a position to comment on the strategic issues which Henry and Bill raised, but handling multiple cores and taking advantage of changes in the instruction set on the most recent chips sound as if they would bring much value to the J community. I have no idea how big a task that is.

5. Roger has often made the point that the core language should be kept as simple and powerful as possible, and that where the user wants something beyond that there have to be very good reasons for incorporating it. I do not think a factor of 2 on a very significant proportion of matrix operations is something which should be passed up. If it is to be considered further, whether it uses the inner product or the table form might make a big difference to the range of applications. Roger has already woven a lot of his magic on both.
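
To make points 1 and 2 concrete, here is a minimal sketch of the saving I have in mind. It is written in Python rather than J purely for illustration, and it is not a proposal for how the interpreter should do it. The names symmetric_table, full_table, CountingCalls and squared_distance are all hypothetical, and the sketch assumes f is symmetric, i.e. f(a, b) equals f(b, a).

```python
def full_table(f, rows):
    """Evaluate f on every ordered pair of rows: n*n calls."""
    return [[f(a, b) for b in rows] for a in rows]


def symmetric_table(f, rows):
    """Evaluate f only for i <= j and mirror the result: n*(n+1)/2 calls.

    Only valid under the assumption that f(a, b) == f(b, a).
    """
    n = len(rows)
    table = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = f(rows[i], rows[j])
            table[i][j] = value
            table[j][i] = value  # mirror across the diagonal
    return table


class CountingCalls:
    """Wrap a two-argument function and count how often it is called."""

    def __init__(self, f):
        self.f = f
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return self.f(a, b)


def squared_distance(a, b):
    """Stand-in for a slow symmetric function applied to two rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


if __name__ == "__main__":
    rows = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (-2.0, 5.0)]
    counted_full = CountingCalls(squared_distance)
    counted_sym = CountingCalls(squared_distance)
    assert full_table(counted_full, rows) == symmetric_table(counted_sym, rows)
    print(counted_full.calls, counted_sym.calls)
```

For n rows the full table calls f n*n times and the symmetric version n*(n+1)/2 times, so with the four rows above that is 16 calls against 10. The ratio approaches 2 as n grows, which is the factor I was referring to when f itself is slow.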


----- Original Message ----- From: "bill lam" <[EMAIL PROTECTED]>
To: "Programming forum" <[email protected]>
Sent: Saturday, September 02, 2006 4:45 PM
Subject: Re: [Jprogramming] Symmetric inner and outer products


Henry Rich wrote:
For arithmetic operations in general, it would also be
a much better use of coding time to take advantage of
the SSE3 instructions.  With careful coding there could
be a severalfold increase in speed for +/ .* and a
noticeable improvement for other floating-point arithmetics.

I agree with Henry. From the source code of J7, it appeared that J did not make use of the floating-point stack for computation. I am not sure whether this has changed in later releases. Given that SIMD (MMX, SSE, SSE2, 3DNow) is available in almost every Intel/AMD CPU, not making use of it in a computationally intensive program like J seems to under-utilise the CPU's raw power.

--
regards,
bill
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

