FWIW, here are some results from running one million tests of best3ofN
compared with a very slightly more J way of doing it.
(Your correspondence with Joe & Pascal while I was drafting this might
render the following redundant.)
Freq =: ~.,:#/.~ NB. 2-row frequency table
NB. slightly harder to make sure all slots (ie 3-18 here) are filled
Sort =: /:~
timer'Freq Sort 5 best3ofN"0] 1000000#6' NB. ELAPSED time for 1000000 applications of best3ofN
+-----+----------------------------------------------------------------------------------------------+
|3.251|  3   4    5    6     7     8     9    10    11     12     13     14     15     16    17    18|
|     |134 624 1874 5270 11435 21829 37994 60349 86068 113847 135482 148051 142940 120337 78427 35339|
+-----+----------------------------------------------------------------------------------------------+
xUniform0toy =: ?@$ NB. sample {.x sets of {:x values in range 0 to y - 1
NB. slightly misnamed!
SumBest3 =: +/@:(3{.\:~)"1 NB. as you were doing it, with "1 to apply by rows
FreqBest3ofXshapeY =: Freq @ Sort @: SumBest3 @ xUniform0toy NB. put them together
timer'3 0 + 1000000 5 FreqBest3ofXshapeY 6' NB. ELAPSED time for 1 application with 1000000 rows
NB. adding 3 0 at the end rather than 1 at the beginning
+-----+----------------------------------------------------------------------------------------------+
|2.194|  3   4    5    6     7     8     9    10    11     12     13     14     15     16    17    18|
|     |130 620 1890 5152 11610 21919 38019 60349 85426 113379 135519 148380 143211 119976 78658 35762|
+-----+----------------------------------------------------------------------------------------------+
ts 'Freq Sort 5 best3ofN"0] 1000000#6' NB. cpu time & space used
3.21335 2.72781e8
ts '3 0 + 1000000 5 FreqBest3ofXshapeY 6' NB. somewhat faster, in about 20% more space
1.64252 3.315e8
If you want to sample significantly more sets, or sets much longer than 5
elements, it might be best to accumulate frequency tables of results for
some suitable block size, such as 10^5 or 10^6. It's not too hard to
combine frequency tables; it's somewhat more tricky to combine standard
deviations, variances, etc.
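
A rough, untimed sketch of that block-wise idea follows. The names Slots, BlockCounts, Accumulate and Moments are made up for illustration, and SumBest3 is repeated so the block stands alone. It rolls 1..6 directly rather than adding 3 afterwards, and it prepends every possible total before counting so that all slots 3-18 are filled and the tables from different blocks line up for plain addition.

```j
NB. Sketch only: Slots, BlockCounts, Accumulate and Moments are hypothetical names.
SumBest3    =: +/@:(3 {. \:~)"1
Slots       =: 3 + i. 16                 NB. every possible total, 3..18

NB. Counts for one block of y rows of 5 dice (values 1..6).
NB. Prepending Slots puts each total in the key list once, in order;
NB. <: then removes that extra count.
BlockCounts =: 3 : '<: #/.~ Slots , SumBest3 >: ? (y , 5) $ 6'

NB. x blocks of y rows each; only one block of raw rolls exists at a time.
Accumulate  =: 4 : 0
  acc =. 16 $ 0
  for. i. x do. acc =. acc + BlockCounts y end.
  Slots ,: acc
)

NB. Mean and population variance recovered from a 2-row frequency table.
Moments     =: 3 : 0
  'v c' =. y
  m =. (+/ v * c) % +/ c
  m , (+/ c * *: v - m) % +/ c
)
```

For instance, 10 Accumulate 100000 should give a table whose counts row sums to 1000000, and Moments applied to that table gives the mean and variance without ever holding the individual results. Taking the moments from the combined table is one way to sidestep combining per-block variances directly.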
Any use?
Mike
On 21/02/2017 13:41, Paul Moore wrote:
> On 21 February 2017 at 13:23, Joe Bogner <[email protected]> wrote:
>>> The key thing here is that I'm going to be collecting millions of
>>> results, and I'll run out of memory if I just retain them all and
>>> count at the end. But I'm not sure how to modify structures in J.
>>
>> Are you sure you will run out of memory? I can hold a billion ints on my
>> machine and use 7.8gb of memory.
>
> I did yesterday. I can't recall the details - I may have been running
> a billion simulations. I may well also have been using a
> memory-inefficient approach, though (if I held 5 numbers for each run,
> that would be 40GB of RAM I guess).
>
> My purpose behind this exercise is that I'm trying to demonstrate to a
> friend that you don't need to write custom C programs (which is what
> he did) to do this type of work, and that off the shelf (albeit
> specialised) languages can cope fine (where "fine" means "comparable
> to custom C"). So far, I'm finding that's not true - my experiments
> with various languages[1] have either been too slow or too memory
> hungry to make my case. When a tiny performance issue or over-use of
> memory in the implementation of the single experiment gets multiplied
> by factors on the order of millions or billions, that doesn't scale
> well enough.
>
> It's an interesting exercise that's got me learning a lot about
> various languages' strengths and weaknesses, though :-)
>
> Paul
>
> [1] Most of which have been at the "interested beginner" level, as here.
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm