Re: [Python-Dev] Tunning binary insertion sort algorithm in Timsort.

Neil Girdhar Mon, 09 Mar 2015 13:45:12 -0700

It may be that the comparison that you do is between two elements that are
almost always in the same cache line whereas the binary search might often
incur a cache miss.


On Mon, Mar 9, 2015 at 2:49 PM, nha pham <phq...@gmail.com> wrote:

> I do not know exactly, one thing I can imagine is: it turns the worst case
> of binary insertion sort to best case.
> With sorted array in range of 32 or 64 items, built from zero element. The
> new element you put into the sorted list has a high chance of being the
> smallest or the the highest of the sorted list (or nearly highest or nearly
> smallest)
>
> If that case happen, the old binary insertion sort will have the
> investigate all the list, while with my idea, it just have to compare more
> 1-2 times.
> I will try to run more test an more thinking to make sure though.
>
> On Mon, Mar 9, 2015 at 11:48 AM, nha pham <phq...@gmail.com> wrote:
>
>> I do not know exactly, one thing I can imagine is: it turns the worst
>> case of binary insertion sort to best case.
>> With sorted array in range of 32 or 64 items, built from zero element.
>> The new element you put into the sorted list has a high chance of being the
>> smallest or the the highest of the sorted list (or nearly highest or nearly
>> smallest)
>>
>> If that case happen, the old binary insertion sort will have the
>> investigate all the list, while with my idea, it just have to compare more
>> 1-2 times.
>> I will try to run more test an more thinking to make sure though.
>>
>>
>>
>> On Mon, Mar 9, 2015 at 10:39 AM, Isaac Schwabacher <ischwabac...@wisc.edu
>> > wrote:
>>
>>> On 15-03-08, nha pham
>>>  wrote:
>>> >
>>> > We can optimize the TimSort algorithm by optimizing its binary
>>> insertion sort.
>>> >
>>> > The current version of binary insertion sort use this idea:
>>> >
>>> > Use binary search to find a final position in sorted list for a new
>>> element X. Then insert X to that location.
>>> >
>>> > I suggest another idea:
>>> >
>>> > Use binary search to find a final postion in sorted list for a new
>>> element X. Before insert X to that location, compare X with its next
>>> element.
>>> >
>>> > For the next element, we already know if it is lower or bigger than X,
>>> so we can reduce the search area to the left side or on the right side of X
>>> in the sorted list.
>>>
>>> I don't understand how this is an improvement, since with binary search
>>> the idea is that each comparison cuts the remaining list to search in half;
>>> i.e., each comparison yields one bit of information. Here, you're spending
>>> a comparison to cut the list to search at the element you just inserted,
>>> which is probably not right in the middle. If you miss the middle, you're
>>> getting on average less than a full bit of information from your
>>> comparison, so you're not reducing the remaining search space by as much as
>>> you would be if you just compared to the element in the middle of the list.
>>>
>>> > I have applied my idea on java.util. ComparableTimSort.sort() and
>>> testing. The execute time is reduced by 2%-6% with array of random integer.
>>>
>>> For all that, though, experiment trumps theory...
>>>
>>> > Here is detail about algorithm and testing:
>>> https://github.com/nhapq/Optimize_binary_insertion_sort
>>> >
>>> > Sincerely.
>>> >
>>> > phqnha
>>>
>>
>>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/mistersheik%40gmail.com
>
>

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Tunning binary insertion sort algorithm in Timsort.

Reply via email to