Re: Skiplist improvements

Yann Ylavic Thu, 10 Jul 2014 17:08:25 -0700

On Thu, Jul 10, 2014 at 9:55 PM, Jim Jagielski <[email protected]> wrote:
>
> On Jul 10, 2014, at 1:53 PM, Yann Ylavic <[email protected]> wrote:
>
>>
>> The issue is that when a doublon is insert_compare()d, it will not
>> necessarily (and even unlikely) be at the same height as the existing
>> doublon(s).
>> Hence a "skip path" may be created between any previous value and the
>> new doublon just inserted, and that path may be taken by
>> apr_skiplist_find() to skip the oldest doublon(s).
>>
>> I think this is not a desirable behaviour (the internals are
>> probabilistic but the returned values through the API shouldn't be :p
>> ).
>>
>
> What does the canonical description of skiplists say?
> Is that desirable or expected behavior? What do other
> skiplist implementations do?


I could not find such description about multiple (equals) values in
skiplists, only implementations.

The current APR's implementation is clearly not the only one to have
no garantee about order, but for what I can see (not a very extensive
search though), those don't provide an iterator (node) and/or the
next/previous() functions and/or the find*() functions that
advantageously take an update-able iterator as argument (even if it's
an ouput argument only now, I find the input/ouput choice
interesting).

Some provide an ordering garantee though, eg QMultiMap which (seems
to) use a skip-list structure (and provides upper/lower_bound()
functions), or std::multimap which is (usually) a rb-tree (a
comparable structure).

I saw implementations with elements of type list to handle doublons (I
first thought about having a specific chaining when modifying the
code, but worried about non-pointer iterators requirement), but found
none with the proposed implementation (didn't loot at QMultiMap's
one).

I propose to garantee the ordering presumably at low coding and
performance (if any) cost, so that one can choose between apr_table or
apr_skiplist to do similar things (iterate over all the values of a
key is one such thing, in a deterministic order is an other, both not
feasible with the current skiplist implementation).

The proposal does not change the structure (the new node's link
member, sibling = next_is_doublon ? next : NULL, is itself a doublon),
takes few cycles (only the empty loop while sibling++; to skip the
doublons when needed and to avoid more costly compare() function), and
has no memory overhead.

It also has the advantage to allow a simple replace() implementation
(most/all other implementations have add() and set() semantics, not
insert() as defined in APR), by removing existing element(s) in order
(and in the same loop) before inserting the new one.

Finally, I find the current API quite limited to httpd's event's use,
ie. insert() then pop() oldest values, hence this proposal.

PS: while having a look at other implementations, I found the idea to
configure the min/max height quite interesting, no limit is applied
currently in the APR's implementation.

Re: Skiplist improvements

Reply via email to