Here's one way to combine frequency tables such as those I've derived
below:
Append one freq table to another, sort by their tags/indices, accumulate
the count terms by common tags, and pair the new counts with those tags.
It might save some space to do the sort last.
   CombineFreq =: (~.@:{. ,: +//./)@:(/:~@:,&.:|:)
   f1
   5    6     7     8     9    10    11     12     13     14     15     16    17    18
1954 5194 11757 21627 38189 60395 85311 113399 136308 148676 142623 119709 78395 35706
   f2
  3   4    5    6     7     8     9    10    11     12     13     14     15     16    17    18
144 601 1872 5266 11630 21925 38098 60066 84947 113079 136071 148731 143091 120464 78640 35375
   f1 CombineFreq f2
  3   4    5     6     7     8     9     10     11     12     13     14     15     16     17    18
144 601 3826 10460 23387 43552 76287 120461 170258 226478 272379 297407 285714 240173 157035 71081
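A sort-last variant along the lines suggested above might look like the sketch below. The name CombineFreqSortLast and the two small tables a and b are made up for illustration; they are not from the session above. It stitches the two tables together, sums the counts by tag, and only then sorts the (shorter) combined table.

```j
NB. Hypothetical sort-last variant: stitch, accumulate by tag, then sort.
CombineFreqSortLast =: /:~&.|:@(~.@{. ,: +//./)@,.

NB. Made-up example tables: row 0 holds tags, row 1 holds counts.
a =: 5 6 7 ,: 10 20 30
b =: 3 5 7 8 ,: 1 2 3 4

a CombineFreqSortLast b
```

In a J session this should display tags 3 5 6 7 8 over counts 1 12 20 33 4, the same table that CombineFreq gives for a and b.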
Mike
On 21/02/2017 14:24, 'Mike Day' via Programming wrote:
FWIW, here are some results from running one million tests of best3ofN,
compared with a very slightly more J way of doing it.
(Your correspondence with Joe & Pascal while I was drafting this might
render the following redundant.)
   Freq =: ~.,:#/.~   NB. 2-row frequency table
                      NB. slightly harder to make sure all slots (ie 3-18 here) are filled
   Sort =: /:~
   timer 'Freq Sort 5 best3ofN"0] 1000000#6'  NB. ELAPSED time for 1000000 applications of best3ofN
+-----+----------------------------------------------------------------------------------------------+
|3.251|  3   4    5    6     7     8     9    10    11     12     13     14     15     16    17    18|
|     |134 624 1874 5270 11435 21829 37994 60349 86068 113847 135482 148051 142940 120337 78427 35339|
+-----+----------------------------------------------------------------------------------------------+
   xUniform0toy =: ?@$   NB. sample {.x sets of {:x values in range 0 to y - 1
                         NB. slightly misnamed!
   SumBest3 =: +/@:(3{.\:~)"1   NB. as you were doing it, with "1 to apply by rows
   FreqBest3ofXshapeY =: Freq @ Sort @: SumBest3 @ xUniform0toy   NB. put them together
   NB. adding 3 0 at the end rather than 1 at the beginning
   timer '3 0 + 1000000 5 FreqBest3ofXshapeY 6'  NB. ELAPSED time for 1 application with 1000000 rows
+-----+----------------------------------------------------------------------------------------------+
|2.194|  3   4    5    6     7     8     9    10    11     12     13     14     15     16    17    18|
|     |130 620 1890 5152 11610 21919 38019 60349 85426 113379 135519 148380 143211 119976 78658 35762|
+-----+----------------------------------------------------------------------------------------------+
   ts 'Freq Sort 5 best3ofN"0] 1000000#6'   NB. cpu time & space used
3.21335 2.72781e8
   ts '3 0 + 1000000 5 FreqBest3ofXshapeY 6'   NB. somewhat faster, in about 20% more space
1.64252 3.315e8
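On filling all the slots: one possible approach, sketched here on the assumption that the full list of tags is known in advance (3 to 18 in this case), is to prepend that list to the data before counting and then subtract one from each count. FreqAll is a hypothetical name, not something defined earlier in the thread.

```j
NB. Hypothetical helper: x is the complete list of tags (no repeats), y the data.
NB. Prepending x guarantees every tag is seen once; <: removes that extra count.
NB. (#x) {. drops the counts of any values in y that are not in x.
FreqAll =: 4 : 'x ,: (#x) {. <: #/.~ x , y'

(i. 5) FreqAll 3 1 3 3 0
```

This should give tags 0 1 2 3 4 over counts 1 1 0 3 0. For dice totals of 3 to 18 the left argument would be 3 + i. 16, and no separate Sort is needed if x is already in order.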
If you want to sample significantly more sets, or sets much longer than
5 elements, it might be best to accumulate frequency tables of results
for some suitable block size, such as 10^5 or 10^6.
It's not too hard to combine frequency tables; it's somewhat
trickier to combine standard deviations, variances, etc.
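A minimal sketch of that block-wise accumulation follows, assuming a block size of 10^5 and five six-sided dice per roll. BlockFreq and Accumulate are hypothetical names introduced here; Freq, CombineFreq and SumBest3 are repeated from this thread so that the sketch stands alone.

```j
NB. Definitions repeated from this thread so the sketch stands alone.
Freq        =: ~. ,: #/.~
CombineFreq =: (~.@:{. ,: +//./)@:(/:~@:,&.:|:)
SumBest3    =: +/@:(3{.\:~)"1

NB. Hypothetical: frequency table for y rolls of 5 six-sided dice, best 3 summed.
BlockFreq =: 3 : '3 0 + Freq SumBest3 ? (y , 5) $ 6'

NB. Hypothetical: combine y blocks (assumes y >: 1) of 100000 rolls each,
NB. keeping only the running frequency table between blocks.
Accumulate =: 3 : 0
acc =. BlockFreq 100000
for. i. <: y do.
  acc =. acc CombineFreq BlockFreq 100000
end.
acc
)

t =: Accumulate 10   NB. one million rolls in total
+/ {: t              NB. total of the counts
```

Only one block of raw results is held at a time, so the number of blocks can grow without the memory use growing with it. The total of the counts here should be 1000000.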
Any use?
Mike
On 21/02/2017 13:41, Paul Moore wrote:
On 21 February 2017 at 13:23, Joe Bogner <joebog...@gmail.com> wrote:
The key thing here is that I'm going to be collecting millions of
results, and I'll run out of memory if I just retain them all and
count at the end. But I'm not sure how to modify structures in J.
Are you sure you will run out of memory? I can hold a billion ints on my
machine and use 7.8gb of memory.
I did yesterday. I can't recall the details - I may have been running
a billion simulations. I may well also have been using a
memory-inefficient approach, though (if I held 5 numbers for each run,
that would be 40GB of RAM I guess).
My purpose behind this exercise is that I'm trying to demonstrate to a
friend that you don't need to write custom C programs (which is what
he did) to do this type of work, and that off the shelf (albeit
specialised) languages can cope fine (where "fine" means "comparable
to custom C"). So far, I'm finding that's not true - my experiments
with various languages[1] have either been too slow or too memory
hungry to make my case. When a tiny performance issue or over-use of
memory in the implementation of the single experiment gets multiplied
by factors on the order of millions or billions, that doesn't scale
well enough.
It's an interesting exercise that's got me learning a lot about
various languages' strengths and weaknesses, though :-)
Paul
[1] Most of which have been at the "interested beginner" level, as here.
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm