Re: [Jprogramming] Performance of exact keyed aggregate

Mike Day Sat, 29 Sep 2007 11:57:00 -0700

Thanks to Roger and Raul for their thoughts.

My original message on 28/9 didn't make clear
that the results of one stage of aggregation
supplied input to the next stage, so the
exact placement of the x: wasn't as crucial
as it appeared from my flawed presentation.


So I'm afraid all the discussion has been on my
misunderstanding of floats and integers and has
not focused on the nub of my enquiry - why does
the version with indirect summation terminate in
finite time while the direct version doesn't
seem to?

This difficulty remains even when I resort to
Roger's preferred placement of the x: which I
now use in the following.

To illustrate the staged or cyclic process,
if we had:

 vj,:nj NB. stage j   - toy data
1  2  1  2  1  2  1  2  1
20 30 40 30 40 20 40 30 40

these would lead to

 [mj =: vj (+//.) nj NB. aggregate nj grouped by vj
180 110

The next stage j1 = j+1 scatters the mj around and
forms new values vj1  (the v's are always small integers
- integer arrays in fact, but scalars do for the example)
so we might have

 vj1,:nj1 NB. stage j+1 NB. new v's, n's from stage j
 3   4   5   3   4   5   3   4   4   5
180 180 180 110 110 110 180 180 110 110

and

  [mj1 =: vj1 (+//.) nj1
470 580 400

All n0 = 1 so it's somewhat immaterial whether I force
integer with
   vj (+//.) x: nj
or with
   x: vj (+//.) nj

In practice, the m's are one or two orders of
magnitude larger than the n's in each stage, so we hit
the integer limits somewhere around stage 10 to 20.
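
As a rough illustration of why the limit matters, here is a minimal sketch. It assumes a 64-bit J, and the values in big are hypothetical, chosen only to sit near the integer limit; they are not my real data. 3!:0 reports the internal type of its argument, so the three lines can be compared directly.

```j
NB. hypothetical values near the 64-bit integer limit (two copies of 2^62)
big =: 4611686018427387904 4611686018427387904
3!:0 big          NB. type code of the raw counts
3!:0 +/ big       NB. type code after plain summation, which overflows the integer range
3!:0 +/ x: big    NB. type code after converting to extended precision before summing
```

If the second line reports a floating-point type, the sum is no longer guaranteed exact, which is consistent with result (a) below.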

However
a) the process does not produce the "correct" result at
the final stage if it is too large for datatype integer.
It is of course approximately correct.
b) the process does not terminate if I force integer with
the recommended
   vj (+//.) x: nj
c) the process does terminate with the "correct" result
if instead I use
   (vj (</.) ij)(+/@:{)every <x:nj
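
For anyone wanting to try the two phrases side by side, a minimal sketch on the stage-j toy data follows. It assumes ij is simply i.#nj (the position of each count), which is a simplification for illustration, and that the standard library's every (&>) is loaded. On data this small both forms should finish at once; the sketch shows only the shape of the two computations, not the slowdown.

```j
vj =: 1 2 1 2 1 2 1 2 1
nj =: 20 30 40 30 40 20 40 30 40
ij =: i. # nj                        NB. assumed: one index per count

direct   =: vj (+//.) x: nj          NB. key the extended counts directly, as in (b)
indirect =: (vj (</.) ij) (+/@:{) every < x: nj   NB. key the indices, then select and sum, as in (c)

direct -: indirect                   NB. do the two results match?
```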

Sorry about the red herring.  I think there is an
interesting problem here despite my distracting use of
x:@+/   - maybe not so entirely unimportant despite my
earlier remarks!

I hadn't noticed the appearance of j602 beta until today
- I'll download it and see if it makes any difference.

Thanks again

Mike

Roger Hui wrote:

Since it's not really important in the scheme of things
and since the slow expression is curious (i.e. odd i.e. strange),
I am not inclined to spend the time to investigate.

Do you understand what x:@+/ does? In x+y if
one argument is float the other argument is converted
to float if necessary before the addition. Therefore,
on floating point a,b,c,d
   x:@+/ a,b,c,d
   a x:@+ b x:@+ c x:@+ d
   x: a + fx x: b + fx x: c + d
where fx is x:^:_1, converting extended precision
to float. That is, the x: gets you nothing. That you
got the right answer using x:@+/ means one of
the following:

a. coincidence; or
b. extended precision is not needed, whence
+/ works just as well; or
c. n is created to be extended precision, whence
+/ works just as well.
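
A minimal sketch of the point above, using hypothetical data (not from the thread) chosen so that floating-point addition would be expected to round: two small values followed by 2^53.

```j
n =: 1 1 , 2 ^ 53     NB. hypothetical data; ^ yields floating point here
3!:0 n                NB. type code of n
x:@+/ n               NB. converts after each floating-point addition
+/ x: n               NB. converts first, then sums in extended precision
+/ n                  NB. plain floating-point sum, for comparison
```

If the explanation above holds, the x:@+/ line should agree with the plain float sum rather than with the exact +/ x: n line.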



----- Original Message -----
From: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Date: Friday, September 28, 2007 11:46
Subject: Re: [Jprogramming] Performance of exact keyed aggregate
To: [email protected], [email protected]

Thanks ...

Well, I did get the right result using
   (v (</.) i)(x:@+/@:{)every <n

while I had to kill
   v (x:@+//.) n
so I'd still like thoughts on why the
latter indexed version performs
much better than the former. Raul
seems to agree that it should work...
I'm not on a J machine right now, so
can't test the variant phrases.

Thanks, Mike

Original Message:
-----------------
From: Roger Hui [EMAIL PROTECTED]
Date: Fri, 28 Sep 2007 10:56:37 -0700
To: [email protected]
Subject: Re: [Jprogramming] Performance of exact keyed aggregate


If you need exact integer results the expression should be

   v +//. x: n

(or just v +//. n with n created to be
extended precision).

v x:@+//. n is a curious phrase and does not achieve the
intent expressed in your msg.




----- Original Message -----
From: Mike Day <[EMAIL PROTECTED]>
Date: Friday, September 28, 2007 3:07
Subject: [Jprogramming] Performance of exact keyed aggregate
To: Programming forum <[email protected]>

Given values v, counts n:
   v,:n
 1  2  1  2  1  2  1  2  1
20 30 40 30 40 20 40 30 40

We can sum the counts keyed by value
   v (+//.) n
180 110
   since
   v (</.) n
+--------------+-----------+
|20 40 40 40 40|30 30 20 30|
+--------------+-----------+

However my actual values for count get very large,
ie up to 2^50 or more, and I needed exact integer results
so used
   v (x:@+//.) n
180 110

but this got very slow.

However once I used the indices i of the counts
   i,v,:n
 0  1  2  1  2  0  2  1  2
 1  2  1  2  1  2  1  2  1
20 30 40 30 40 20 40 30 40

the more indirect derivation
   (v (</.) i)
+---------+-------+
|0 2 2 2 2|1 1 0 1|
+---------+-------+
   (v (</.) i)(x:@+/@:{)every <n
180 110

is much quicker.

Any ideas why the circumlocution is faster than
the more obvious approach?  (I don't know how
much because the slow version never finished.)

Not really important in the scheme of things -
just a day lost in a Mathschallenge solution!
I hope that's not a give-away.

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
