Re: [Numpy-discussion] Proposal to remove the Bento build.

2015-08-19 Thread David Cournapeau
On Wed, Aug 19, 2015 at 1:22 AM, Nathaniel Smith n...@pobox.com wrote:

 On Tue, Aug 18, 2015 at 4:15 PM, David Cournapeau courn...@gmail.com
 wrote:
  If everybody wants to remove bento, we should remove it.

 FWIW, I don't really have an opinion either way on bento versus
 distutils, I just feel that we shouldn't maintain two build systems
 unless we're actively planning to get rid of one of them, and for
 several years now we haven't really been learning anything by keeping
 the bento build working, nor has there been any movement towards
 switching to bento as the one-and-only build system, or even a clear
 consensus that this would be a good thing. (Obviously distutils and
 numpy.distutils are junk, so that's a point in bento's favor, but it
 isn't *totally* cut and dried -- we know numpy.distutils works and we
 have to maintain it regardless for backcompat, while bento doesn't
 seem to have any activity upstream or any other users...).

 So I'd be totally in favor of adding bento back later if/when such a
 plan materializes; I just don't think it makes sense to keep
 continuously investing effort into it just in case such a plan
 materializes later.

  Regarding single file builds, why would it help for static builds ? I
  understand it would make things slightly easier to have one .o per
  extension, but it does not change the fundamental process as the exported
  symbols are the same in the end ?

 IIUC they aren't: with the multi-file build we control exported
 symbols using __attribute__((visibility(hidden)) or equivalent,
 which hides symbols from the shared object export table, but not from
 other translation units that are statically linked. So if you want to
 statically link cpython and numpy, you need some other way to let
 numpy .o files see each others's symbols without exposing them to
 cpython's .o files,


It is less a problem than in shared linking because you can detect the
conflicts at linking time (instead of loading time).


and the single-file build provides one mechanism
 to do that: make the numpy symbols 'static' and then combine them all
 into a single translation unit.

 I would love to be wrong about this though. The single file build is
 pretty klugey :-).


I know, it took me a while to split the files to go out of single file
build in the first place :)

David



 -n

 --
 Nathaniel J. Smith -- http://vorpus.org
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Fwd: Reverse(DESC)-ordered sorting

2015-08-19 Thread Feng Yu
Dear list,

This is forwarded from issue 6217 https://github.com/numpy/numpy/issues/6217

What is the way to implement DESC ordering in the sorting routines of numpy?

(I am borrowing DESC/ASC from the SQL notation)

For a stable DESC ordering sort, one can not  revert the sorted array via
argsort()[::-1] .

I propose the following API change to argsorts/sort. (haven't thought
about lexsort yet) I will use argsort as an example.

Currently, argsort supports sorting by keys ('order') and by 'axis'.
These two somewhat orthonal interfaces need to be treated differently.

1. by axis.

Since there is just one sorting key, a single 'reversed' keyword
argument is sufficient:

a.argsort(axis=0, kind='merge', reversed=True)

Jaime suggested this can be implemented efficiently as a
post-processing step.
(https://github.com/numpy/numpy/issues/6217#issuecomment-132604920) Is
there a reference to the algorithm?

Because all of the sorting algorithms for 'atomic' dtypes are using
_LT functions, a post processing step seems to be the only viable
solution, if possible.


2. by fields, ('order' kwarg)

A single 'reversed' keyword argument will not work, because some keys
are ASC but others are DESC, for example, sorting my LastName ASC,
then Salary DESC.

a.argsort(kind='merge', order=[('LastName', ('FirstName', 'asc'),
('Salary', 'desc'))])

The parsing rule of order is:

- if an item is tuple, the first item is the fieldname, the second
item is DESC/ASC
- if an item is scalar, the fieldname is the item, the ordering is ASC.

This part of the code already goes to VOID_compare, which walks a
temporary copy of a.dtype to call the comparison functions.

If I understood the purpose of c_metadata (numpy 1.7+) correctly, the
ASC/DESC flags, offsets, and comparison functions can all be
pre-compiled and passed into VOID_compare via c_metadata of the
temporary type-descriptor.

By just looking this will actually make VOID_compare faster by
avoiding calling several Python C-API functions. negating the return
value of cmp seems to be a negligable overhead in such a complex
function.

3. If both reversed and order is given, the ASC/DESC fields in 'order'
are effectively reversed.

Any comments?

Best,

Yu
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fwd: Reverse(DESC)-ordered sorting

2015-08-19 Thread Jaime Fernández del Río
On Wed, Aug 19, 2015 at 1:10 PM, Feng Yu rainwood...@gmail.com wrote:

 Dear list,

 This is forwarded from issue 6217
 https://github.com/numpy/numpy/issues/6217

 What is the way to implement DESC ordering in the sorting routines of
 numpy?

 (I am borrowing DESC/ASC from the SQL notation)

 For a stable DESC ordering sort, one can not  revert the sorted array via
 argsort()[::-1] .

 I propose the following API change to argsorts/sort. (haven't thought
 about lexsort yet) I will use argsort as an example.

 Currently, argsort supports sorting by keys ('order') and by 'axis'.
 These two somewhat orthonal interfaces need to be treated differently.

 1. by axis.

 Since there is just one sorting key, a single 'reversed' keyword
 argument is sufficient:

 a.argsort(axis=0, kind='merge', reversed=True)

 Jaime suggested this can be implemented efficiently as a
 post-processing step.
 (https://github.com/numpy/numpy/issues/6217#issuecomment-132604920) Is
 there a reference to the algorithm?


My thinking was that, for native types, you can stably reverse a sorted
permutation in-place by first reversing item-by-item, then reversing every
chunk of repeated entries. Sort of the way you would reverse the words in a
sentence in-place: first reverse every letter, then reverse everything
bounded by spaces:

TURN ME AROUND - DNUORA EM NRUT - AROUND EM NRUT - AROUND ME NRUT -
AROUND ME TURN

We could even add a type-specific function to do this for each of the
native types in the npy_sort library.

As I mentioned in Yu's very nice PR
https://github.com/numpy/numpy/pull/6222, probably it is best to leave
the signature of the function alone, and have something like order='desc'
be the trigger for the proposed reversed=True.

Jaime



 Because all of the sorting algorithms for 'atomic' dtypes are using
 _LT functions, a post processing step seems to be the only viable
 solution, if possible.



 2. by fields, ('order' kwarg)

 A single 'reversed' keyword argument will not work, because some keys
 are ASC but others are DESC, for example, sorting my LastName ASC,
 then Salary DESC.

 a.argsort(kind='merge', order=[('LastName', ('FirstName', 'asc'),
 ('Salary', 'desc'))])

 The parsing rule of order is:

 - if an item is tuple, the first item is the fieldname, the second
 item is DESC/ASC
 - if an item is scalar, the fieldname is the item, the ordering is ASC.

 This part of the code already goes to VOID_compare, which walks a
 temporary copy of a.dtype to call the comparison functions.

 If I understood the purpose of c_metadata (numpy 1.7+) correctly, the
 ASC/DESC flags, offsets, and comparison functions can all be
 pre-compiled and passed into VOID_compare via c_metadata of the
 temporary type-descriptor.

 By just looking this will actually make VOID_compare faster by
 avoiding calling several Python C-API functions. negating the return
 value of cmp seems to be a negligable overhead in such a complex
 function.


 3. If both reversed and order is given, the ASC/DESC fields in 'order'
 are effectively reversed.

 Any comments?

 Best,

 Yu
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion




-- 
(\__/)
( O.o)
(  ) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
de dominación mundial.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion