On Wed, Aug 19, 2015 at 1:10 PM, Feng Yu <rainwood...@gmail.com> wrote:
> Dear list, > > This is forwarded from issue 6217 > https://github.com/numpy/numpy/issues/6217 > > "What is the way to implement DESC ordering in the sorting routines of > numpy?" > > (I am borrowing DESC/ASC from the SQL notation) > > For a stable DESC ordering sort, one can not revert the sorted array via > argsort()[::-1] . > > I propose the following API change to argsorts/sort. (haven't thought > about lexsort yet) I will use argsort as an example. > > Currently, argsort supports sorting by keys ('order') and by 'axis'. > These two somewhat orthonal interfaces need to be treated differently. > > 1. by axis. > > Since there is just one sorting key, a single 'reversed' keyword > argument is sufficient: > > a.argsort(axis=0, kind='merge', reversed=True) > > Jaime suggested this can be implemented efficiently as a > post-processing step. > (https://github.com/numpy/numpy/issues/6217#issuecomment-132604920) Is > there a reference to the algorithm? > My thinking was that, for native types, you can stably reverse a sorted permutation in-place by first reversing item-by-item, then reversing every chunk of repeated entries. Sort of the way you would reverse the words in a sentence in-place: first reverse every letter, then reverse everything bounded by spaces: TURN ME AROUND -> DNUORA EM NRUT -> AROUND EM NRUT -> AROUND ME NRUT -> AROUND ME TURN We could even add a type-specific function to do this for each of the native types in the npy_sort library. As I mentioned in Yu's very nice PR <https://github.com/numpy/numpy/pull/6222>, probably it is best to leave the signature of the function alone, and have something like order='desc' be the trigger for the proposed reversed=True. Jaime > > Because all of the sorting algorithms for 'atomic' dtypes are using > _LT functions, a post processing step seems to be the only viable > solution, if possible. > > 2. by fields, ('order' kwarg) > > A single 'reversed' keyword argument will not work, because some keys > are ASC but others are DESC, for example, sorting my LastName ASC, > then Salary DESC. > > a.argsort(kind='merge', order=[('LastName', ('FirstName', 'asc'), > ('Salary', 'desc'))]) > > The parsing rule of order is: > > - if an item is tuple, the first item is the fieldname, the second > item is DESC/ASC > - if an item is scalar, the fieldname is the item, the ordering is ASC. > > This part of the code already goes to VOID_compare, which walks a > temporary copy of a.dtype to call the comparison functions. > > If I understood the purpose of c_metadata (numpy 1.7+) correctly, the > ASC/DESC flags, offsets, and comparison functions can all be > pre-compiled and passed into VOID_compare via c_metadata of the > temporary type-descriptor. > > By just looking this will actually make VOID_compare faster by > avoiding calling several Python C-API functions. negating the return > value of cmp seems to be a negligable overhead in such a complex > function. > 3. If both reversed and order is given, the ASC/DESC fields in 'order' > are effectively reversed. > > Any comments? > > Best, > > Yu > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion