Implementation

Jason Causey via Digitalmars-d Thu, 10 Dec 2015 15:15:13 -0800

Hello all! I happened by this thread (from Hacker News) and the idea of this data structure got stuck in my head. I did some scribbling on paper and decided that it could be interesting to code...

On Thursday, 3 December 2015 at 22:44:26 UTC, Andrei Alexandrescu wrote:

At this point I need to either get to the bottom of the math or put together an experimental rig that counts comparisons and swaps.

I've built a test implementation (in C++ ... I hope that isn't too distasteful on a D forum, but it's what I'm most comfortable with) here: https://github.com/jcausey-astate/HeapArray

I chose to use a Min-Max heap[1] for the heaps themselves (this buys me O(1) min and max access). This solves the problem of making insert and delete follow the same pattern (to insert, ripple the max value in a full partition to the next one; in a delete, fill in the "hole" with the min value from the previous partition).
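The insert "ripple" described above can be sketched roughly as follows. This is a minimal illustration, not the code from the HeapArray repository. It makes two assumptions that are not stated in this post: each partition is stood in for by a std::multiset (which also gives cheap min and max access) instead of a Min-Max heap, and partition k is assumed to hold up to (k+1)^2 values. The names Partitions and capacity are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical stand-in for the structure: a sequence of partitions where
// every value in partition k is <= every value in partition k+1.
// ASSUMPTION: std::multiset replaces the Min-Max heap used in the post.
struct Partitions {
    std::vector<std::multiset<int>> parts;

    // ASSUMPTION: partition k may hold up to (k+1)^2 values (1, 4, 9, ...).
    static std::size_t capacity(std::size_t k) { return (k + 1) * (k + 1); }

    void insert(int value) {
        // Find the first partition whose max is >= value; fall back to the
        // last partition. Non-last partitions are always full, so rbegin()
        // is valid for every partition inspected here.
        std::size_t k = 0;
        while (k + 1 < parts.size() && *parts[k].rbegin() < value) ++k;

        // Insert, then ripple the max of each overfull partition forward.
        int carry = value;
        for (;; ++k) {
            if (k == parts.size()) parts.emplace_back();
            parts[k].insert(carry);
            if (parts[k].size() <= capacity(k)) break;
            auto last = std::prev(parts[k].end());
            assert(last != parts[k].end());
            carry = *last;          // max of the overfull partition
            parts[k].erase(last);   // it moves on to the next partition
        }
    }
};
```

Each ripple step moves exactly one value (the current maximum) into the next partition, so the ordering between neighbouring partitions is preserved without touching the rest of the data.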

I wasn't able to come up with anything better than pre-sorting the whole thing, then running the Floyd "make heap" on each partition. So, the whole thing costs O(n*lg(n) + n) to make a new structure on top of an existing array. This is still faster than doing a top-down (add every element) make-heap though. I agree with Timon's analysis[2] on that.
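A rough sketch of that "sort, then heapify each partition" build step is below. It is only an illustration under stated assumptions: std::make_heap (a linear-time bottom-up build of an ordinary binary max-heap) is swapped in for the Min-Max heap construction, partition sizes are assumed to grow as 1, 4, 9, ..., and the function name build_heap_array is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Rearranges `a` in place and returns the partition boundaries:
// partition k occupies [bounds[k], bounds[k+1]).
// ASSUMPTION: partition sizes are 1, 4, 9, ... (the last may be short).
// ASSUMPTION: a plain binary max-heap stands in for the Min-Max heap.
std::vector<std::size_t> build_heap_array(std::vector<int>& a) {
    // Step 1: O(n lg n) sort, so every partition's values are <= the next's.
    std::sort(a.begin(), a.end());

    // Step 2: linear-time heap construction on each partition, O(n) overall.
    std::vector<std::size_t> bounds{0};
    std::size_t start = 0;
    for (std::size_t k = 1; start < a.size(); ++k) {
        const std::size_t len = std::min(k * k, a.size() - start);
        auto first = a.begin() + static_cast<std::ptrdiff_t>(start);
        auto last = first + static_cast<std::ptrdiff_t>(len);
        std::make_heap(first, last);
        start += len;
        bounds.push_back(start);
    }
    return bounds;
}
```

After the call, the largest value of each partition sits at that partition's first index, which is what a search over the partitions would compare against.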

I also agree with Andrei that I have a "gut feeling" that there could be a faster setup algorithm, since we really just need to shuffle values into the right partitions, not actually fully sort them... But that would require knowing exactly what the partition "pivot" values were in advance, which would require searching, and 'round we go.

My code is still rough, only works for ints, and needs some documentation and cleanup, but it does show that our hunches were more or less correct: This thing is much faster than linear for searching, but not as fast as logarithmic. It also performs OK when adding new values and performing lots of searches (when compared with a plain old array or vector where you have to linear search every time).
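To illustrate why a search lands between linear and logarithmic, here is a hedged sketch of a lookup: skip whole partitions by looking only at each partition's maximum, then scan the single partition that could hold the value. It assumes the same simplified layout as a max-heap-per-partition array (the maximum at each partition's first index) described by a list of boundaries; the name heap_array_contains and the bounds parameter are hypothetical and not taken from the repository.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns true if `value` is stored in `a`.
// ASSUMPTION: partition k is [bounds[k], bounds[k+1]), is laid out as a
// binary max-heap (max at its first index), and all of its values are
// <= every value in partition k+1.
bool heap_array_contains(const std::vector<int>& a,
                         const std::vector<std::size_t>& bounds,
                         int value) {
    for (std::size_t k = 0; k + 1 < bounds.size(); ++k) {
        auto first = a.begin() + static_cast<std::ptrdiff_t>(bounds[k]);
        auto last = a.begin() + static_cast<std::ptrdiff_t>(bounds[k + 1]);
        if (first == last) continue;   // empty partition, nothing to check
        if (*first < value) continue;  // partition max too small: skip it
        // First partition whose max is >= value: the value is here or nowhere.
        return std::find(first, last, value) != last;
    }
    return false;
}
```

Only one comparison is spent on each skipped partition, and only one partition is ever scanned in full, so the work is well below a full linear pass but still more than a binary search over a sorted array.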



My Github README has some charts, but I'll link a couple here:
Search times (vector VS this structure VS multiset)
https://plot.ly/~jcausey-astate/7/search-timing-vector-vs-heaparray-vs-multiset/

Time to add N values starting with an empty structure (vector VS this structure VS multiset)

https://plot.ly/~jcausey-astate/18/fill-container-dynamically-heaparray-vs-vector-and-multiset/

Time to initially build the structure given an already-populated array (vector VS this structure VS multiset):

https://plot.ly/~jcausey-astate/35/build-data-structure-on-existing-array-all-at-once/

[1]: http://www.akira.ruc.dk/~keld/teaching/algoritmedesign_f03/Artikler/02../Atkinson86.pdf


[2]: http://forum.dlang.org/post/[email protected]
