Re: [Jprogramming] Moving median - any good methods?

'Jon Hough' via Programming Tue, 14 Mar 2017 16:53:12 -0700

I haven't attempted that Project Euler Question. But I did implement a priority 
queue a while back.
I just put it here: 
https://gist.github.com/jonghough/e3566dcfea95e4dbc5b23f9295388df6


It can handle arbitrary comparators, i.e. any verb returning 0 or 1. However, I 
doubt it is of much use, because it isn't particularly fast.
But may be interesting as a reference. 

Jon

--------------------------------------------
On Wed, 3/15/17, 'Mike Day' via Programming <[email protected]> wrote:

 Subject: [Jprogramming]  Moving median - any good methods?
 To: [email protected]
 Date: Wednesday, March 15, 2017, 4:50 AM
 
 This has arisen in
 solving Euler Project problem 593.
 https://projecteuler.net/problem=593
 It is basically quite easy, especially for the
 examples with given 
 answers,  ie
 for F(100,10) and even for F(100000, 10000) but
 as usual with these 
 problems,
 there's a sting in the tail, in the
 computing requirement for F(10^7, 10^5)
 
 The data consist of what one can regard as a
 random set of integers, 
 which I called
 S2keep,  drawn from a finite set with ~ 20000
 elements.
 
 F(n,k) is the sum
 of medians on each window of k consecutive elements of
 sample of n elements of S2keep.
 
 So F(10,100) is easily
 constructed,  hard-wired here:
  
    10j1":-:+/10 (+/@:(4
 5{/:~))\100{.S2keep
 
 463628.5
 
 
 This is of course inefficient for larger n and
 k.
 
 
 However,  the window size for the stated
 problem is ~5 times larger than 
 the
 universe
 
 of elements,  and
 we know that each successive window differs from the 
 previous one
 
 in at most two elements,  so it's somewhat
 more efficient,  though more 
 tedious, 
 to
 
 maintain a count array
 (cts) of size ~20000. For any window,  this array 
 consists of
 
 effectively random zero or positive entries,
 summing to k.  The indices 
 of these
 entries
 
 are the indices of
 the elements of S2keep in a sorted list of nub 
 elements of S2keep.  So
 
 locating the indices of the (k/2)th members of
 the cumulative sum of cts 
 will reveal
 the
 
 median elements of the
 window.
 
 
 But this is still quite slow.  The answer
 which I submitted had taken 
 about 14
 minutes
 
 with this method.
 (in J805 pre avx!)
 
 
 I thought it should be possible to do better,
 and googled in stack 
 overflow and
 elsewhere.
 
 The preferred
 method appears to require "indexable skip lists",
 said to 
 be better than binary
 
 trees.  I had a look -
 they're not too hard to construct,  but it's 
 difficult to insert and delete
 
 elements in a J-like way.
 
 
 I gave up on
 this tack,  and instead found some improvement in run-time
 
 with using
 
 heaps.  So, instead of maintaining a count
 array with one slot per nub 
 element,  the
 heaps
 
 count several
 contiguous nub elements.  The default of this approach is
 
 to have ~ %: 20000
 
 heaps, ie ~ 140.
 
 
 Best time so far is still
 quite long - about 8 minutes.  (And ~5 1/2 in 
 J806Beta using avx)
 
 
 Question is - has anyone
 achieved efficient solutions in J?    Other 
 languages seem to have
 
 efficient implementations of heaps, binary
 trees,  skip lists and so 
 on,  if only
 one can
 
 understand how to
 use them!
 
 
 Not much use if it's just for solving Euler
 Project problems,  but 
 people do need
 moving
 
 statistics on
 occasion.
 
 
 This is long enough without quoting my source
 code.  Available on request!
 
 
 Thanks,
 
 
 Mike
 
 
 
 
 ---
 This email has been checked
 for viruses by Avast antivirus software.
 https://www.avast.com/antivirus
 ----------------------------------------------------------------------
 For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Re: [Jprogramming] Moving median - any good methods?

Reply via email to