Re: Adapting TimSort into AsterixDB/Hyracks

Mike Carey Sat, 28 Oct 2017 12:06:06 -0700

So the word on the web seems maybe to be that Quicksort is generallysuperior and cache-friendly (cache oblivious). Wondering if we shouldjust get our Quicksort code under control?


On 10/28/17 11:36 AM, Chen Luo wrote:

Not exactly sure about this right now... But since TimSort is essentially a
combination of insertion sort and merge sort, it's cache-friendliness won't
be worse than our merge sort.

This TimSort could be served as a short-term plugin-and-play improvement of
our sorting algorithm. It is still stable, the same as our current merge
sort, but faster, especially on partially ordered dataset.

Best regards,
Chen Luo

On Sat, Oct 28, 2017 at 10:58 AM, Mike Carey <[email protected]> wrote:

How is it on cache-friendliness?



On 10/27/17 11:38 PM, abdullah alamoudi wrote:

While I have no answer to the question of legality, this sounds great.

~Abdullah.

On Oct 27, 2017, at 9:20 PM, Chen Luo <[email protected]> wrote:

Hi devs,

I have adapted the TimSort algorithm used in JDK (java.util.TimSort) into
Hyracks, which gives 10-20% performance improvements on random data. It
will be more useful if the input data is partially sorted, e.g., primary
keys fetched from secondary index scan, which I haven't got time to
experiment with.

*Before going any further, is it legal to adapt some algorithm
implementation from JDK into our codebase? *I saw the JDK implementation
itself is adopted from
http://svn.python.org/projects/python/trunk/Objects/listsort.txt as
well.

Best regards,
Chen Luo

Re: Adapting TimSort into AsterixDB/Hyracks

Reply via email to