Thanks to Roger and Raul for their thoughts.
My original message on 28/9 didn't make clear
that the results of one stage of aggregation
supplied input to the next stage, so the
exact placement of the x: wasn't as crucial
as it appeared from my flawed presentation.
So I'm afraid all the discussion has been on my
misunderstanding of floats and integers and has
not focused on the nub of my enquiry - why does
the version with indirect summation terminate in
finite time while the direct version doesn't
seem to?
This difficulty remains even when I resort to
Roger's preferred placement of the x: which I
now use in the following.
To illustrate the staged or cyclic process,
if we had:
vj,:nj NB. stage j - toy data
1 2 1 2 1 2 1 2 1
20 30 40 30 40 20 40 30 40
these would lead to
[mj =: vj (+//.) nj NB. aggregate nj grouped by vj
180 110
The next stage j1 = j+1 scatters the mj around and
forms new values vj1 (the v's are always small integers
- integer arrays in fact, but scalars do for the example)
so we might have
vj1,:nj1 NB. stage j+1 NB. new v's, n's from stage j
3 4 5 3 4 5 3 4 4 5
180 180 180 110 110 110 180 180 110 110
and
[mj1 =: vj1 (+//.) nj1
470 580 400
All n0 = 1 so it's somewhat immaterial whether I force integer
with vj(+//.) x: nj
or with x: vj (+//.) nj
In practice, the m's are one or two orders of
magnitude larger than the n's in each stage, so we hit
the integer limits somewhere around stage 10 to 20.
However
a) the process does not produce the "correct" result at
the final stage if it is too large for datatype integer.
It is of course approximately correct.
b) the process does not terminate if I force integer with
the recommended vj(+//.) x: nj
c) the process does terminate with the "correct" result
if instead I use
(vj (</.) ij)(+/@:{)every <x:nj
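For anyone wanting to try the comparison, here is a minimal sketch of forms (b) and (c) on the toy data. It assumes ij is simply i. # nj, which is an assumption made for illustration only, and it relies on the standard library's every. On data this small both forms finish at once, so the sketch only checks that they agree and shows how each could be timed with 6!:2; it does not reproduce the slowdown.

```j
   vj =: 1 2 1 2 1 2 1 2 1
   nj =: 20 30 40 30 40 20 40 30 40
   ij =: i. # nj                                   NB. assumed: one index per count
   direct   =: vj (+//.) x: nj                     NB. form (b)
   indirect =: (vj (</.) ij) (+/@:{) every < x: nj NB. form (c)
   direct -: indirect
1
   6!:2 'vj (+//.) x: nj'                          NB. seconds taken by form (b)
   6!:2 '(vj (</.) ij) (+/@:{) every < x: nj'      NB. seconds taken by form (c)
```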
Sorry about the red herring. I think there is an
interesting problem here despite my distracting use of
x:@+/ - maybe not so entirely unimportant despite my
earlier remarks!
I hadn't noticed the appearance of j602 beta until today
- I'll download it and see if it makes any difference.
Thanks again
Mike
Roger Hui wrote:
Since it's not really important in the scheme of things
and since the slow expression is curious (i.e. odd i.e. strange),
I am not inclined to spend the time to investigate.
Do you understand what x:@+/ does? In x+y if
one argument is float the other argument is converted
to float if necessary before the addition. Therefore,
on floating point a,b,c,d
x:@+/ a,b,c,d
a x:@+ b x:@+ c x:@+ d
x: a + fx x: b + fx x: c + d
where fx is x:^:_1, converting extended precision
to float. That is, the x: gets you nothing. That you
got the right answer using x:@+/ means one of
the following:
a. coincidence; or
b. extended precision is not needed, whence
+/ works just as well; or
c. n is created to be extended precision, whence
+/ works just as well.
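The point above can be sketched on a two-item list. The data is hypothetical: 2^53 is floating point in J, so n is a float list, and 2^53 + 1 is not representable as an IEEE double. Under those assumptions x:@+/ should add in floating point and only then convert, while +/ x: n converts first and adds exactly, so the difference below should be nonzero.

```j
   n =: (2^53) , 1            NB. hypothetical data: a floating-point list
   datatype n                 NB. stdlib verb reporting the type of n
   a =: x:@+/ n               NB. adds in floating point, then converts
   b =: +/ x: n               NB. converts first, then adds exactly
   b - a                      NB. nonzero if the float addition lost the 1
```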
----- Original Message -----
From: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Date: Friday, September 28, 2007 11:46
Subject: Re: [Jprogramming] Performance of exact keyed aggregate
Thanks ...
Well, I did get the right result using
(v (</.) i)(x:@+/@:{)every <n
while I had to kill
v (x:@+//.) n
so I'd still like thoughts on why the
former, indexed version performs
much better than the latter. Raul
seems to agree that it should work...
I'm not on a J machine right now, so
can't test the variant phrases.
Thanks, Mike
Original Message:
-----------------
From: Roger Hui [EMAIL PROTECTED]
Date: Fri, 28 Sep 2007 10:56:37 -0700
Subject: Re: [Jprogramming] Performance of exact keyed aggregate
If you need exact integer results the expression should be
v +//. x: n (or just v +//. n with n created to be
extended precision).
v x:@+//. n is a curious phrase and does not achieve the
intent expressed in your msg.
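As a minimal sketch of the two recommended forms on the toy data from the original message: the name big and its values are hypothetical, chosen only to show counts that are extended precision from the moment they are created.

```j
   v =: 1 2 1 2 1 2 1 2 1
   n =: 20 30 40 30 40 20 40 30 40
   v +//. x: n                    NB. convert the counts first, then sum by key
180 110
   big =: (2x^60) + x: n          NB. hypothetical large counts, created extended
   v +//. big                     NB. plain +//. now sums in extended precision
```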
----- Original Message -----
From: Mike Day <[EMAIL PROTECTED]>
Date: Friday, September 28, 2007 3:07
Subject: [Jprogramming] Performance of exact keyed aggregate
To: Programming forum <[email protected]>
Given values v, counts n:
v,:n
1 2 1 2 1 2 1 2 1
20 30 40 30 40 20 40 30 40
We can sum the counts keyed by value
v (+//.) n
180 110
since
v (</.) n
+--------------+-----------+
|20 40 40 40 40|30 30 20 30|
+--------------+-----------+
However my actual values for count get very large,
ie up to 2^50 or more, and I needed exact integer results
so used
v (x:@+//.) n
180 110
but this got very slow.
However once I used the indices i of the counts
i,v,:n
0 1 2 1 2 0 2 1 2
1 2 1 2 1 2 1 2 1
20 30 40 30 40 20 40 30 40
the more indirect derivation
(v (</.) i)
+---------+-------+
|0 2 2 2 2|1 1 0 1|
+---------+-------+
(v (</.) i)(x:@+/@:{)every <n
180 110
is much quicker.
Any ideas why the circumlocution is faster than
the more obvious approach? (I don't know how
much because the slow version never finished.)
Not really important in the scheme of things -
just a day lost in a Mathschallenge solution!
I hope that's not a give-away.
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm