Good idea and very clear summary.

One correction: in the discard step the paper prescribes a linear search
through the queue starting from the end, not a binary search. The
modification I propose is to keep that linear search for the first
log2(w) elements examined and, failing that, switch to a binary search
over the rest of the queue. I believe this makes the overall performance
slightly worse, but it will still be O(n) for n samples, and the worst
case for an individual sample drops from O(w) to O(log(w)), a big
improvement.
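
A rough, untested sketch of that hybrid discard step in C++ (the names
and the std::vector storage are mine, not the paper's; a real
implementation would use the ring buffer Evan describes and track sample
ages as well):

#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Discard every queued candidate that is not larger than x, then append x.
// 'wedge' holds the candidate maxima in decreasing order, oldest at the
// front.  We scan linearly from the back for at most maxLinear (~log2(w))
// steps; if candidates remain to discard, we finish with a binary search,
// so the worst case per sample is O(log w) instead of O(w).
void discardAndPush(std::vector<float>& wedge, float x, std::size_t maxLinear)
{
    std::size_t steps = 0;
    while (!wedge.empty() && wedge.back() <= x && steps < maxLinear) {
        wedge.pop_back();
        ++steps;
    }
    if (!wedge.empty() && wedge.back() <= x) {
        // First element that is <= x in the decreasing-sorted wedge;
        // everything from there on is dominated by x.
        auto it = std::lower_bound(wedge.begin(), wedge.end(), x,
                                   std::greater<float>());
        wedge.erase(it, wedge.end());
    }
    wedge.push_back(x);
}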

As to how the O(n) bound works, I'd suggest just reading that part of the
paper until it clicks. Informally, the gist is this: Robert rightly worries
about the case where we have to search through this whole list of w
elements to figure out how many to discard. This will indeed take some time
for large w. However, the list only grows in size when we're lucky --
specifically when we encounter a sample which is less than everything in
the list. In this case we just add it to the end and move on, and the
discard step is nearly free. So, in order for the list to have gotten this
long in the first place we must have gotten lucky w times in the past. Even
though we have to pay the piper now, the time is amortized over all of
these samples. The average time is thus bounded by a constant, and really
quite a small constant.
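
If it helps to see the bookkeeping in one place, here is a rough,
untested sketch of the whole per-sample update in C++ (my own naming,
and a plain std::deque of index/value pairs instead of the paper's ring
buffer):

#include <cstdint>
#include <deque>

// Running maximum over the last w samples, in the spirit of the Lemire
// monotonic-wedge algorithm (not the paper's exact code).  Each sample is
// pushed onto the wedge exactly once and popped at most once, which is
// where the amortized O(1) cost per sample comes from: a long discard
// loop can only happen after many cheap pushes.
class RunningMax
{
public:
    explicit RunningMax(std::uint64_t w) : w_(w), n_(0) {}

    // Feed one sample; returns the maximum of the last w samples seen.
    float process(float x)
    {
        // Expire the front candidate once it falls out of the window.
        if (!wedge_.empty() && wedge_.front().index + w_ <= n_)
            wedge_.pop_front();

        // Discard every candidate the new sample dominates.  The paper
        // does this with a linear scan from the back; the hybrid
        // linear/binary variant sketched above caps it at O(log w).
        while (!wedge_.empty() && wedge_.back().value <= x)
            wedge_.pop_back();

        wedge_.push_back({n_, x});
        ++n_;
        return wedge_.front().value;   // the front is always the window max
    }

private:
    struct Candidate { std::uint64_t index; float value; };
    std::deque<Candidate> wedge_;
    std::uint64_t w_;
    std::uint64_t n_;                  // samples processed so far
};

The running minimum is the mirror image (flip both comparisons), and as
Evan notes the two can share the delay line.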

(And sorry, Robert, I don't have any real, tested code to share -- the
sketches above are just how I picture it.)

-Ethan

On Wed, Jul 20, 2016 at 12:39 PM, Evan Balster <e...@imitone.com> wrote:

> OK, here's what I gather...  I'm also going to summarize for the benefit
> of people who haven't read the other thread.
>
> There exists an algorithm for computing the minimum and/or maximum sample
> value in a rolling range.  It runs in amortized linear time in proportion
> to the number of samples processed---that is, we can process N samples in
> O(N).  Individual samples have a worst-case time of O(w), or O(log2(w)) if
> we use a binary search.
>
> Daniel Lemire, Streaming Maximum-Minimum Filter Using No More than Three
>> Comparisons per Element, Nordic Journal of Computing, Volume 13, Number 4,
>> pages 328-339, 2006
>
>
>> http://arxiv.org/abs/cs/0610046
>
>
> From what I've read here, the maximum algorithm works thus:
>
>    - Maintain a delay line containing the last w elements.
>
>    - Maintain a "monotonic wedge" queue describing all potential
>    maximums...
>       - The queue contains each sample from the last *w* samples which is
>       larger than all the samples after it.
>       - By definition, the front element in the queue is the maximum in
>       the range.
>       - The queue's maximum size is *w*; this happens when the samples
>       are monotonically descending.
>
>    - For each new input sample:
>       - If the front element in the queue is now *w* or more samples old,
>       discard it.
>       - *Discard* all samples in the queue which are not larger than the
>       latest sample.
>       - Add the latest sample to the back of the queue.
>
>    - To implement the *Discard* step:
>       - Perform a binary search on the queue (which is already sorted) to
>       find the smallest sample larger than the new input sample (cf.
>       http://www.cplusplus.com/reference/algorithm/upper_bound/).
>       - Discard all samples following the found sample.
>
>    - To query the maximum:
>       - Return the front-most element of the queue.
>
> We can derive the min or min-and-max algorithm trivially from these
> steps.  The queue can be efficiently implemented with a ring-buffer of size
> w, which could optionally store delay-line indices instead of samples.  The
> discard step would be as simple as revising the end pointer.
>
> Because the only sample which can be both a minimum and a maximum candidate
> is the latest sample, we might be able to implement the algorithm with
> queue storage of size w.
>
> As Robert observes, this could get very spiky for w larger than processing
> block size.  I suspect that it would be possible to do the same kind of
> magic used in realtime hash-table migration and make that discard step run
> in constant time per sample---but that's a little more tango than I feel
> like dancing today!
>
> – Evan Balster
> creator of imitone <http://imitone.com>
>
> On Wed, Jul 20, 2016 at 11:04 AM, robert bristow-johnson <
> r...@audioimagination.com> wrote:
>
>>
>>
>> ---------------------------- Original Message ----------------------------
>> Subject: Re: [music-dsp] efficient running max algorithm
>> From: "Ethan Fenn" <et...@polyspectral.com>
>> Date: Wed, July 20, 2016 10:27 am
>> To: music-dsp@music.columbia.edu
>> --------------------------------------------------------------------------
>>
>> > Of course, for processing n samples, something that is O(n) is going to
>> > eventually beat something that's O(n*log(w)), for big enough w.
>>
>> i am skeptical of the claim.  at least for worst case.  if w = 2^30, i
>> would think the computational cost O(.) will have an increasing function of
>> w in the expression.
>>
>> > FWIW if it's important to have O(log(w)) worst case per sample, I think
>> you
>> > can adapt the method of the paper to achieve this while keeping the O(1)
>> > average.
>> >
>>
>> for real-time processing (that's how i think of a "running" max) the
>> worst case timing is what one has to worry about.  at least to prevent an
>> unexpected hiccup.
>>
>> now, if the worst case *average* can be contained, then processing
>> samples in blocks can take advantage of that averaging.  so if we process
>> samples in, say, 32 or 64 sample blocks, and the *worst case* timing of
>> this can be guaranteed to beat the O(log2(w)), then the alg has value.  but
>> if it cannot be guaranteed to do that, it does not for a real-time context.
>>
>> > Instead of a linear search back through the maxima, do a linear search
>> for
>> > log2(w) samples and, failing that, switch to binary search. This is
>> > O(log(w)) worst case, but I think it would keep the property of O(1)
>> > comparisons per sample. I believe the upper bound would now be 4
>> > comparisons per sample rather than 3. I'd have to think more about it
>> to be
>> > sure.
>>
>>
>> because i am having trouble understanding *exactly* the alg, i would
>> still appreciate a snippet of code.
>>
>> i will bet, that given a snippet of code, i can think of a signal that
>> will push the computational cost into the worst case which will be worse
>> than O(log2(w)).
>>
>> --
>>
>> r b-j                      r...@audioimagination.com
>>
>> "Imagination is more important than knowledge."
>>
>>
>
>
>
_______________________________________________
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp
