Re: [Jprogramming] Accumulator for generated results

'Mike Day' via Programming Tue, 21 Feb 2017 06:52:38 -0800

Here's one way to combine frequency tables such as those I've derived below:


Append one freq table to the other, sort by their tags/indices, sum the
counts that share a tag, and pair the new counts with those tags.


It might save some space to do the sort last.

   CombineFreq =: (~.@:{. ,: +//./)@:(/:~@:,&.:|:)

   f1
   5    6     7     8     9    10    11     12     13     14     15     16    17    18
1954 5194 11757 21627 38189 60395 85311 113399 136308 148676 142623 119709 78395 35706

   f2
  3   4    5    6     7     8     9    10    11     12     13     14     15     16    17    18
144 601 1872 5266 11630 21925 38098 60066 84947 113079 136071 148731 143091 120464 78640 35375

   f1 CombineFreq f2
  3   4    5     6     7     8     9     10     11     12     13     14     15     16     17    18
144 601 3826 10460 23387 43552 76287 120461 170258 226478 272379 297407 285714 240173 157035 71081
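
To make the steps described above easier to follow, here is a minimal sketch that runs them one at a time on two tiny tables. The tables a and b and the name CombineFreqSortLast are made up for illustration; the last verb is only one possible reading of "do the sort last", not something tested on large data.

```j
NB. Hypothetical small tables: row 0 holds tags, row 1 holds counts
a =: 2 3 $ 5 6 7 10 20 30
b =: 2 3 $ 4 5 7 1 2 3

NB. Step 1: transpose both, append the rows, sort them by tag, transpose back
s =: a /:~@:,&.:|: b          NB. tags 4 5 5 6 7 7 ; counts 1 2 10 20 3 30

NB. Step 2: unique tags laminated with the counts summed per tag
(~. {. s) ,: ({. s) +//. ({: s)   NB. tags 4 5 6 7 ; counts 1 12 20 33

NB. Assumed variant: stitch the tables, sum by tag, then sort only the
NB. (shorter) combined table by tag
CombineFreqSortLast =: /:~&.|:@:((~.@:{. ,: +//./)@:,.)
a CombineFreqSortLast b           NB. same table as step 2
```

In the variant the sort sees only one column per distinct tag, rather than every column of both inputs.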


Mike





On 21/02/2017 14:24, 'Mike Day' via Programming wrote:

FWIW, here are some results running one million tests of best3ofN compared with a
very slightly more J way of doing it.
(Your correspondence with Joe & Pascal while I was drafting this might render the
following redundant.)

   Freq         =: ~.,:#/.~         NB. 2-row frequency table
NB. slightly harder to make sure all slots (ie 3-18 here) are filled
   Sort         =: /:~
   timer 'Freq Sort 5 best3ofN"0] 1000000#6'   NB. ELAPSED time for 1000000 applications of best3ofN
+-----+----------------------------------------------------------------------------------------------+
|3.251|  3   4    5    6     7     8     9    10    11     12     13     14     15     16    17    18|
|     |134 624 1874 5270 11435 21829 37994 60349 86068 113847 135482 148051 142940 120337 78427 35339|
+-----+----------------------------------------------------------------------------------------------+
   xUniform0toy =: ?@$   NB. sample {.x sets of {:x values in range 0 to y - 1
                         NB. slightly misnamed!
   SumBest3 =: +/@:(3{.\:~)"1   NB. as you were doing it, with "1 to apply by rows
   FreqBest3ofXshapeY =: Freq @ Sort @: SumBest3 @ xUniform0toy   NB. put them together
   timer '3 0 + 1000000 5 FreqBest3ofXshapeY 6'   NB. ELAPSED time for 1 application with 1000000 rows
                         NB. adding 3 0 at the end rather than 1 at the beginning
+-----+----------------------------------------------------------------------------------------------+
|2.194|  3   4    5    6     7     8     9    10    11     12     13     14     15     16    17    18|
|     |130 620 1890 5152 11610 21919 38019 60349 85426 113379 135519 148380 143211 119976 78658 35762|
+-----+----------------------------------------------------------------------------------------------+
   ts 'Freq Sort 5 best3ofN"0] 1000000#6'    NB. cpu time & space used
3.21335 2.72781e8
   ts '3 0 + 1000000 5 FreqBest3ofXshapeY 6' NB. somewhat faster, in about 20% more space
1.64252 3.315e8
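
On making sure all slots (3-18 here) are filled: one possible approach is to prepend the full list of tags to the data before counting and then subtract one from every count. The sketch below assumes a hypothetical verb FreqAll, that the tag list x has no repeats, and that every value in y appears in x.

```j
NB. FreqAll is a made-up name: x is the full list of tags, y is the data
NB. Assumes x has no repeats and every value of y occurs in x
FreqAll =: 4 : 'x ,: <: #/.~ x , y'

(3 + i. 4) FreqAll 4 4 6    NB. tags 3 4 5 6 ; counts 0 2 0 1
```

Because the tags come first, the counts appear in tag order, so no separate sort is needed.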
If you want to sample significantly more sets, or sets much longer than
5 elements, it might be best to accumulate frequency tables of results
for some suitable block size, such as 10^5 or 10^6. It's not too hard to
combine frequency tables, somewhat more tricky to combine standard
deviations, variances, etc.
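
The block-wise accumulation suggested above might be sketched as follows. BlockFreq and Accumulate are hypothetical names, the set length of 5 and the 6-sided die are taken from the runs above, and the definitions of Freq, CombineFreq and SumBest3 are repeated only so that the fragment stands alone.

```j
Freq        =: ~. ,: #/.~
CombineFreq =: (~.@:{. ,: +//./)@:(/:~@:,&.:|:)
SumBest3    =: +/@:(3 {. \:~)"1

NB. Hypothetical helper: sorted frequency table for one block of y sets of 5 dice
BlockFreq =: 3 : '3 0 + Freq /:~ SumBest3 ? (y , 5) $ 6'

NB. Hypothetical driver: x blocks of y sets each, combined as they arrive,
NB. so only one block of raw samples is held at any time
Accumulate =: 4 : 0
  acc =. BlockFreq y
  for. i. <: x do.
    acc =. acc CombineFreq BlockFreq y
  end.
  acc
)

   10 Accumulate 100000      NB. 10 blocks of 10^5 sets each
   +/ {: 10 Accumulate 100000   NB. the counts should total 1000000
```

Only the running table and the current block are kept, so memory use is governed by the block size rather than by the total number of sets.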

Any use?

Mike


On 21/02/2017 13:41, Paul Moore wrote:
On 21 February 2017 at 13:23, Joe Bogner <joebog...@gmail.com> wrote:
The key thing here is that I'm going to be collecting millions of
results, and I'll run out of memory if I just retain them all and
count at the end. But I'm not sure how to modify structures in J.
Are you sure you will run out of memory? I can hold a billion ints on my
machine and use 7.8gb of memory.
I did yesterday. I can't recall the details - I may have been running
a billion simulations. I may well also have been using a
memory-inefficient approach, though (if I held 5 numbers for each run,
that would be 40GB of RAM I guess).

My purpose behind this exercise is that I'm trying to demonstrate to a
friend that you don't need to write custom C programs (which is what
he did) to do this type of work, and that off the shelf (albeit
specialised) languages can cope fine (where "fine" means "comparable
to custom C"). So far, I'm finding that's not true - my experiments
with various languages[1] have either been too slow or too memory
hungry to make my case. When a tiny performance issue or over-use of
memory in the implementation of the single experiment gets multiplied
by factors on the order of millions or billions, that doesn't scale
well enough.

It's an interesting exercise that's got me learning a lot about
various languages' strengths and weaknesses, though :-)

Paul

[1] Most of which have been at the "interested beginner" level, as here.
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus
