On 11/30/15 3:20 PM, H. S. Teoh via Digitalmars-d wrote:
> On Mon, Nov 30, 2015 at 03:13:11PM -0500, Andrei Alexandrescu via
> Digitalmars-d wrote:
>> Okasaki's book is a continuing inspiration for data structures and
>> algorithms.
>>
>> I was thinking along the following lines: typical collections are
>> searchable in linear time. Then we have highly structured collections
>> that feature logarithmic search time. But there seems to be nothing in
>> between. So I was thinking, what would be a data structure that allows
>> O(sqrt n) search times?
>>
>> After a little tinkering, here's what I came up with.

> Interesting indeed.
>
> It leaves me wondering, though, what's the point of having an O(sqrt n)
> collection? What are the advantages? Why would I use this structure
> instead of, say, a traditional array heap with O(log n) search time?

(Heaps offer only linear search time. You may take advantage of the heap structure to skip portions of the array, but on average and in the worst case the search is still O(n). So I assume you meant "sorted array or one of the classic search trees".)
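
A quick illustrative sketch in D of that pruning: the heap property lets you skip any subtree whose root is already too small, but siblings are unordered relative to one another, so the worst case stays O(n).

// Pruned search in an array-backed max-heap. Skips any subtree whose
// root is smaller than the needle; still O(n) in the worst case.
bool heapSearch(T)(const(T)[] heap, T x, size_t i = 0)
{
    if (i >= heap.length || heap[i] < x) return false; // prune subtree
    if (heap[i] == x) return true;
    return heapSearch(heap, x, 2 * i + 1) || heapSearch(heap, x, 2 * i + 2);
}

unittest
{
    int[] heap = [9, 7, 8, 3, 5, 6, 4]; // a valid max-heap
    assert(heapSearch(heap, 5));
    assert(!heapSearch(heap, 2));
}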

The motivation starts with a desire to use arrays as the fundamental layout. Of the many recent trends in computing, two stand out: cache hierarchies are here to stay, and contiguous layout is preferable.

The short of it is, arrays are king. No two ways about it - following pointers is a losing strategy when there's an alternative. A variety of new data structures (Clojure's arrays, heaps with tombstones) avoid classic pointer-based designs in favor of adding structure on top of arrays.

So now consider the question: how do we organize an array for good search performance? There's a spectrum of answers. Generally we also care about insertion and removal.

At one end of the spectrum there's doing nothing at all - that means O(1) build (nothing to do), O(n) search, O(1) insert (just append it), O(n) removal. Not very nice.

At the other end, the absolute best structuring on top of an array for search is full sorting. With sorting you get great O(log n) search performance. But the other operations are not nice - O(n log n) build, O(n) insert, O(n) remove.
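
For illustration, here's what that end of the spectrum looks like in D (a sketch using Phobos; the point is the O(n) shift on insert):

import std.array : insertInPlace;
import std.range : assumeSorted;

// Fully sorted end of the spectrum: O(log n) membership via binary
// search, but insertion must shift elements over, hence O(n).
bool sortedContains(int[] a, int x)
{
    return a.assumeSorted.contains(x); // binary search, O(log n)
}

void sortedInsert(ref int[] a, int x)
{
    auto pos = a.assumeSorted.lowerBound(x).length; // O(log n) to find slot
    a.insertInPlace(pos, x);                        // O(n) to shift
}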

So now consider my square heaps. We have O(n) build time (just a bunch of heapifications) and O(sqrt n) search. Assume for now (again, I haven't worked out the math yet) that insertion and removal are both O(sqrt n) as well. Then you get something less structured than full sorting, yet just good enough to offer the same O(sqrt n) complexity for each of search, insert, and delete. That would be pretty neat.
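
Here is a rough sketch of the layout in D. One big caveat: it establishes the cross-heap invariant by fully sorting, i.e. the O(n log n) build I'm trying to avoid; it only illustrates the memory layout and the O(sqrt n) search, and it assumes distinct elements. (An ascending run happens to be a valid min-heap, so after sorting each chunk already satisfies the heap property.)

import std.algorithm.sorting : sort;
import std.stdio : writeln;

// Sketch: one contiguous array cut into chunks of sizes 1, 2, 3, ...,
// each chunk a min-heap, with every element of chunk i <= every
// element of chunk i+1.
struct SquareHeaps(T)
{
    T[] store;        // contiguous payload
    size_t[] starts;  // start index of each chunk

    this(T[] data)
    {
        store = data;
        // Placeholder build: sorting establishes the invariant, and an
        // ascending run is already a valid min-heap. The open question
        // is doing this in O(n) with heapifications only.
        sort(store);
        for (size_t pos = 0, len = 1; pos < store.length; pos += len, ++len)
            starts ~= pos;
    }

    // O(sqrt n) search: scan the ~sqrt(2n) chunk roots (each root is
    // its chunk's minimum), then linear-search the one chunk of size
    // O(sqrt n) that may contain x.
    bool contains(T x)
    {
        size_t hit = size_t.max;
        foreach (i, s; starts)
        {
            if (store[s] > x) break; // later chunks hold larger values
            hit = i;
        }
        if (hit == size_t.max) return false;
        auto lo = starts[hit];
        auto hi = hit + 1 < starts.length ? starts[hit + 1] : store.length;
        foreach (e; store[lo .. hi])
            if (e == x) return true;
        return false;
    }
}

void main()
{
    auto h = SquareHeaps!int([9, 4, 1, 7, 2, 8, 5, 3, 6, 0]);
    writeln(h.contains(7));  // true
    writeln(h.contains(42)); // false
}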


Andrei
