[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: On 26.11.2021 10:56, Ruben Vorderman wrote: > > $ python -m timeit -c "from bytes_sort import bytes_sort" "bytes_sort(b'')" > 50 loops, best of 5: 495 nsec per loop Shouldn't this read: $ python -m timeit -s "from bytes_sort import bytes_sort"

[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-28 Thread Tim Peters
Tim Peters added the comment: I agree with closing this - I don't know of real use cases either. Serhiy, essentially all LSD radix sorts are stable, and rely on that for their own correctness. MSD radix sorts vary. -- ___ Python tracker

[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-27 Thread Gregory P. Smith
Gregory P. Smith added the comment: General consensus: There isn't a common need for this. -- nosy: +gregory.p.smith resolution: -> rejected stage: -> resolved status: open -> closed ___ Python tracker

[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-27 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I concur with Raymond. It is difficult to find any use case for sorting bytes objects (I cannot find any). As for using radix sort in list.sort() in special case of small integer keys, it is difficult to implement, because we should preserve the initial

[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-27 Thread Raymond Hettinger
Raymond Hettinger added the comment: I’m -1 on this. Given that use cases are rare, there is no need to burden the code base with an optimization of something we can already do in other ways. Also, I don’t like that the APIs for list.sort(), bytes.sort(), and bytearray.sort() wouldn’t match.

[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-26 Thread Brett Cannon
Change by Brett Cannon : -- nosy: -brett.cannon ___ Python tracker ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-26 Thread STINNER Victor
STINNER Victor added the comment: There are other byte-like objects like memoryview or array.array (for some array types). I would suggest writing a function rather than a method. -- ___ Python tracker

[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-26 Thread Eric V. Smith
Eric V. Smith added the comment: Given that the normal sort() machinery wouldn't use this code, I don't think there's any advantage to adding .sort() methods to bytes and bytesarray. The downside to adding these methods is the increased complexity in the stdlib. I think the better approach

[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-26 Thread Mark Dickinson
Mark Dickinson added the comment: > If there are enough use cases for it. Well, that's the question. :-) *I* can't think of any good use cases, but maybe others can. But if we can't come up with some use cases, then this feels like a solution looking for a problem, and that makes it hard to

[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-26 Thread Ruben Vorderman
Ruben Vorderman added the comment: I used it for the median calculation of FASTQ quality scores (https://en.wikipedia.org/wiki/FASTQ_format). But in the end I used the frequency table to calculate the median more quickly. So as you say, the frequency table turned out to be more useful.

[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-26 Thread Mark Dickinson
Mark Dickinson added the comment: (Changing the issue type: as I understand it, this is a proposal for a new feature, namely new methods bytes.sort and bytearray.sort, rather than a performance improvement to an existing feature.) -- type: performance -> enhancement

[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-26 Thread Mark Dickinson
Mark Dickinson added the comment: Can you give example use-cases for sorting a bytes or bytearray object? I see value in the intermediate object - the frequency table, but the reconstructed sorted bytes object just seems like an inefficient representation of the frequency table, and I'm not

[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-26 Thread Ruben Vorderman
Ruben Vorderman added the comment: Sorry for the spam. I see I made a typo in the timeit script. Next time I will be more dilligent when making these kinds of reports and triple checking it before hand, and sending it once. I used -c instead of -s and now all the setup time is also

[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-26 Thread Ruben Vorderman
Ruben Vorderman added the comment: I changed the cython script a bit to use a more naive implementation without memset. Now it is always significantly faster than bytes(sorted(my_bytes)). $ python -m timeit -c "from bytes_sort import bytes_sort" "bytes_sort(b'')" 50 loops, best of 5: 495

[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-26 Thread Alex Waygood
Change by Alex Waygood : -- nosy: +Mark.Shannon, brett.cannon, rhettinger, serhiy.storchaka, tim.peters, vstinner, yselivanov ___ Python tracker ___

[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-26 Thread Ruben Vorderman
Ruben Vorderman added the comment: Also I didn't know if this should be in Component C-API or Interpreter Core. But I guess this will be implemented as C-API calls PyBytes_Sort and PyByteArray_SortInplace so I figured C-API is the correct component here. --

[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-26 Thread Ruben Vorderman
New submission from Ruben Vorderman : Python now uses the excellent timsort for most (all?) of its sorting. But this is not the fastest sort available for one particular use case. If the number of possible values in the array is limited, it is possible to perform a counting sort: