[Numpy-discussion] 1.10.x is branched
Hi All, Numpy 1.10.x is branched. There is still some cleanup to do before the alpha release, but that should be coming in a couple of days. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Change default order to Fortran order
Kang,

Thank you for explaining your motivation. It's clear from your last note, as you said, that your desire for column-first indexing has nothing to do with in-memory data layout. That being the case, I strongly urge you to just use bare numpy and not use the fortran_zeros function I recommended before. Changing the in-memory layout via the order keyword in numpy.zeros will not change the way indexing works at all. You gain absolutely nothing by changing the in-memory order unless you are writing some C or Fortran code which will interact with the data in memory. To see what I mean, consider the following examples:

    x = np.array([[1, 2, 3], [4, 5, 6]])
    x.shape   # (2, 3)

and

    x = np.array([[1, 2, 3], [4, 5, 6]], order='F')
    x.shape   # (2, 3)

You see that changing the in-memory order has nothing whatsoever to do with the array's shape or how you access it.

> You will see run time error. Depending on environment, you may get useful error message (i.e. index out of range), but sometimes you just get bad image results.

Could you give a very simple example of what you mean? I can't think of how this could ever happen, and your fear here makes me think there's a fundamental misunderstanding about how array operations in numpy and other programming languages work. As an example, iteration in numpy goes through the first index:

    x = np.array([[1, 2, 3], [4, 5, 6]])
    for foo in x:
        ...

Inside the for loop, foo takes on the values [1, 2, 3] on the first iteration and [4, 5, 6] on the second. If you want to iterate through the columns just do this instead:

    x = np.array([[1, 2, 3], [4, 5, 6]])
    for foo in x.T:
        ...

If your complaint is that you want np.array([[1, 2, 3], [4, 5, 6]]) to produce an array with shape (3, 2), then you should own up to the fact that the array constructor expects it the other way around and do this instead:

    x = np.array([[1, 2, 3], [4, 5, 6]]).T

This is infinity times better than trying to write a shim function or patch numpy, because with .T you're using (fast) built-in functionality which other people reading your code will understand.

The real message here is that whether the first index runs over rows or columns is actually meaningless. The only places the row versus column issue has any meaning are when doing input/output (in which case you should use the transpose if you actually need it) and when doing iteration. One thing that would make sense, if you're reading from a binary file format which uses column-major format, would be to write your own reader function:

    def read_fortran_style_binary_file(file, shape, dtype=float):
        # np.fromfile returns a flat array, so reshape it in column-major order
        return np.fromfile(file, dtype=dtype).reshape(shape, order='F')

Note that if you do this then you already have a column-major array in numpy and you don't have to worry about any other transposes (except, again, when doing more I/O or passing to something like a plotting function).

On Sun, Aug 2, 2015 at 7:16 PM, Kang Wang kwan...@wisc.edu wrote:
> Thank you all for replying and providing useful insights and suggestions. The reasons I really want to use column-major are:
> [...]
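The point above, that the order keyword changes only the memory layout and not the shape or the indexing, can be checked with a short sketch. The variable names here (c_arr, f_arr, rows, cols) are illustrative and not from the thread.

```python
import numpy as np

# The same nested list stored with two different in-memory layouts.
c_arr = np.array([[1, 2, 3], [4, 5, 6]], order='C')
f_arr = np.array([[1, 2, 3], [4, 5, 6]], order='F')

# Shape and element access are identical for both layouts.
same_shape = c_arr.shape == f_arr.shape
same_values = bool((c_arr == f_arr).all())

# Only the strides (byte steps through memory) differ.
different_strides = c_arr.strides != f_arr.strides

# Iteration always runs over the first index, whatever the layout.
rows = [row.tolist() for row in f_arr]
# Iterating over the transpose walks the columns instead.
cols = [col.tolist() for col in f_arr.T]

print(same_shape, same_values, different_strides)
print(rows)
print(cols)
```

Both arrays answer f_arr[1, 2] and c_arr[1, 2] with the same element; the layout matters only to code that reads the raw buffer.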
Re: [Numpy-discussion] Change default order to Fortran order
Hi Kang,

Feel free to come chat about your application on the scikit-image list [1]! I'll note that we've been through the array order discussion many times there and even have a doc page about it [2].

The short version is that you'll save yourself a lot of pain by starting to think of your images as (plane, row, column) instead of (x, y, z). The syntax actually becomes friendlier too. For example, to do something to each slice of data, you do:

    for plane in image:
        plane += foo

instead of

    for z in range(image.shape[2]):
        image[:, :, z] += foo

for example.

Juan.

[1] scikit-im...@googlegroups.com
[2] http://scikit-image.org/docs/dev/user_guide/numpy_images.html#coordinate-conventions

PS: As to the renamed Fortran-ordered numpy, may I suggest funpy. The F is for Fortran and the fun is for all the fun you'll have maintaining it. =P

On Mon, 3 Aug 2015 at 6:28 am Daniel Sank sank.dan...@gmail.com wrote:
> Kang, Thank you for explaining your motivation. It's clear from your last note, as you said, that your desire for column-first indexing has nothing to do with in-memory data layout.
> [...]
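As a rough illustration of the (plane, row, column) convention suggested above, the sketch below applies a per-slice offset in both conventions and checks that the results agree. The array sizes and the names image, image_xyz and offsets are made up for the example.

```python
import numpy as np

n_planes, n_rows, n_cols = 4, 3, 5
offsets = np.arange(n_planes, dtype=float)  # one offset per slice

# (plane, row, column): iterating over the array yields one slice at a time.
image = np.zeros((n_planes, n_rows, n_cols))
for plane, offset in zip(image, offsets):
    plane += offset  # each plane is a view, so this modifies image in place

# (x, y, z): the slice axis is last, so it has to be indexed explicitly.
image_xyz = np.zeros((n_cols, n_rows, n_planes))
for z in range(image_xyz.shape[2]):
    image_xyz[:, :, z] += offsets[z]

# The two arrays hold the same data with the axes reversed.
print(np.array_equal(image, image_xyz.T))
```

The in-place update works because iterating over an ndarray yields views of the original data rather than copies.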
Re: [Numpy-discussion] mailmap update
Hi Chuck,

Tom Poole <t.b.poole at gmail.com> tpoole <t.b.poole at gmail.com>

Tom

On 2 Aug 2015, at 06:08, Charles R Harris charlesr.har...@gmail.com wrote:
> Hi All,
>
> I'm trying to update the .mailmap file on github and could use some help. The current version seems common to both numpy and scipy, hence the crosspost. Here is what I've got so far.
>
> Alex Griffing <argriffi at ncsu.edu> alex <argriffi at ncsu.edu>
> Alex Griffing <argriffi at ncsu.edu> argriffing <argriffi at ncsu.edu>
> Alex Griffing <argriffi at ncsu.edu> argriffing <argriffing at users.noreply.github.com>
> Behzad Nouri <behzadnouri at gmail.com> behzad nouri <behzadnouri at gmail.com>
> Carl Kleffner <cmkleffner at gmail.com> carlkl <cmkleffner at gmail.com>
> Christoph Gohlke <cgohlke at uci.edu> Christolph Gohlke <cgohlke at uci.edu>
> Christoph Gohlke <cgohlke at uci.edu> cgholke <? at ?>
> Christoph Gohlke <cgohlke at uci.edu> cgohlke <cgohlke at uci.edu>
> Han Genuit <hangenuit at gmail.com> Han <hangenuit at gmail.com>
> Jaime Fernandez <jaime.frio at gmail.com> Jaime <jaime.frio at gmail.com>
> Jaime Fernandez <jaime.frio at gmail.com> jaimefrio <jaime.frio at gmail.com>
> Mark Wiebe <mwwiebe at gmail.com> Mark <mwwiebe at gmail.com>
> Mark Wiebe <mwwiebe at gmail.com> Mark Wiebe <mwiebe at enthought.com>
> Mark Wiebe <mwwiebe at gmail.com> Mark Wiebe <mwiebe at georg.(none)>
> Nathaniel J. Smith <njs at pobox.com> njsmith <njs at pobox.com>
> Ondřej Čertík <ondrej.certik at gmail.com> Ondrej Certik <ondrej.certik at gmail.com>
> Ralf Gommers <ralf.gommers at googlemail.com> rgommers <ralf.gommers at googlemail.com>
> Saullo Giovani <saullogiovani at gmail.com> saullogiovani <saullogiovani at gmail.com>
> Sebastian Berg <sebastian at sipsolutions.net> seberg <sebastian at sipsolutions.net>
> Anon <a...@gmail.com> abdulmuneer <abdulmuneer at gmail.com>
> Anon <a...@gmail.com> amir <ladsgroup at gmail.com>
> Anon <a...@gmail.com> cel <cel.gentoo at gmail.com>
> Anon <a...@gmail.com> chebee7i <chebee7i at gmail.com>
> Anon <a...@gmail.com> empeeu <empeeu at yahoo.com>
> Anon <a...@gmail.com> endolith <endolith at gmail.com>
> Anon <a...@gmail.com> hannaro <hroehling at gmx.net>
> Anon <a...@gmail.com> hpaulj <hpj3 at myuw.net>
> Anon <a...@gmail.com> immerrr <immerrr at gmail.com>
> Anon <a...@gmail.com> jmrosen155 <Jordan at Jordans-MacBook-Pro.local>
> Anon <a...@gmail.com> jnothman <jnothman at student.usyd.edu.au>
> Anon <a...@gmail.com> kanhua <kanhwa at gmail.com>
> Anon <a...@gmail.com> mamikony <ernest.mamikonyan at sig.com>
> Anon <a...@gmail.com> mbyt <random.seed at web.de>
> Anon <a...@gmail.com> mlai <mlai at begws92.beg.utexas.edu>
> Anon <a...@gmail.com> ryanblak <rbtnet at gmail.com>
> Anon <a...@gmail.com> styr <styr.py at gmail.com>
> Anon <a...@gmail.com> tdihp <tdihp at hotmail.com>
> Anon <a...@gmail.com> tpoole <t.b.poole at gmail.com>
> Anon <a...@gmail.com> wim glenn <wim.glenn at melbourneit.com.au>
>
> The Anon author is just a stand-in for an unknown author. I can make a guess at some of those, but would prefer it if the people in question could supply their proper name and address.
>
> TIA,
>
> Chuck
[Numpy-discussion] Change default order to Fortran order
Hi,

I am an imaging researcher, and a new Python user. My first Python project is to somehow modify NumPy source code such that everything is Fortran column-major by default.

I read about the information in the link below, but for us, the fact is that we absolutely want to use Fortran column major, and we want to make it default. Explicitly writing order = 'F' all over the place is not acceptable to us.

http://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues

I tried searching in this email list, as well as google search in general. However, I have not found anything useful. This must be a common request/need, I believe.

Can anyone provide any insight/help?

Thank you very much,
Kang

--
Kang Wang, Ph.D.
Highland Ave., Room 1113
Madison, WI 53705-2275
Re: [Numpy-discussion] Change default order to Fortran order
On 02/08/15 15:55, Kang Wang wrote:
> Can anyone provide any insight/help?

There is no default order. There was before, but now all operators control the order of their return arrays from the order of their input array. The only thing that makes C order default is the keyword argument to np.empty, np.ones and np.zeros. Just monkey patch those functions and it should be fine.

Sturla
Re: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.?
On 31/07/15 09:38, Julian Taylor wrote:
> A long is only machine word wide on posix, in windows its not.

Actually it is the opposite. A pointer is 64 bit on AMD64, but the native integer and pointer offset is only 32 bit. But it does not matter because it is int that should be machine word sized, not long, which it is on both platforms.

Sturla
Re: [Numpy-discussion] Change default order to Fortran order
Well, numpy has a tendency to prefer C order. There is nothing you can do about that really. But you just cannot be sure what you get in some cases. Often you need something specific for interfacing with other code. But in that case quite often you also do not need to fear the copy.

- Sebastian

On Sun Aug 2 16:27:08 2015 GMT+0200, Sturla Molden wrote:
> On 02/08/15 15:55, Kang Wang wrote:
>> Can anyone provide any insight/help?
>
> There is no default order. There was before, but now all operators control the order of their return arrays from the order of their input array.
> [...]
Re: [Numpy-discussion] Change default order to Fortran order
On 02/08/15 22:28, Bryan Van de Ven wrote:
> And to eliminate the order kwarg, use functools.partial to patch the zeros function (or any others, as needed):

This will probably break code that depends on NumPy, like SciPy and scikit-image. But if NumPy is all that matters, sure go ahead and monkey patch. Otherwise keep the patched functions in another namespace. :-)

Sturla
Re: [Numpy-discussion] Change default order to Fortran order
Could you please explain why you need 'F' ordering? It's pretty unlikely that you actually care about the internal memory layout, and you'll get better advice if you explain why you think you do care.

> My first Python project is to somehow modify NumPy source code such that everything is Fortran column-major by default.

This is the road to pain. You'll have to maintain your own fork and will probably inject bugs when trying to rewrite. Nobody will want to help fix them because everyone else just uses numpy as is.

> And to eliminate the order kwarg, use functools.partial to patch the zeros function (or any others, as needed):

Instead of monkey patching, why not just define your own shims:

    fortran_zeros = partial(np.zeros, order='F')

Seems like this would lead to a lot less confusion (although until you tell us why you care about the in-memory layout I don't know the point of doing this at all).
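A minimal sketch of the shim approach, assuming the goal is only to avoid typing order='F' at each call site. The name fortran_zeros comes from the thread; fortran_empty and fortran_ones are hypothetical companions added for illustration. Nothing in the numpy namespace is modified, so other libraries keep seeing the standard functions.

```python
from functools import partial

import numpy as np

# Shims with order='F' pre-filled; np.zeros, np.empty and np.ones stay untouched.
fortran_zeros = partial(np.zeros, order='F')
fortran_empty = partial(np.empty, order='F')  # hypothetical companion shim
fortran_ones = partial(np.ones, order='F')    # hypothetical companion shim

x = fortran_zeros((2, 3), dtype=np.int32)
print(x.flags.f_contiguous, x.strides)

# The stock function still defaults to C order for everyone else.
print(np.zeros((2, 3)).flags.c_contiguous)
```

Note that functools.partial takes the function and the keyword separately, as in partial(np.zeros, order='F'); calling np.zeros(order='F') inside partial would raise a TypeError because the shape argument is missing.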
Re: [Numpy-discussion] Change default order to Fortran order
On Aug 2, 2015 7:30 AM, Sturla Molden sturla.mol...@gmail.com wrote:
> On 02/08/15 15:55, Kang Wang wrote:
>> Can anyone provide any insight/help?
>
> There is no default order. There was before, but now all operators control the order of their return arrays from the order of their input array.

This is... overoptimistic. I would not rely on this in code that I wrote.

It's true that many numpy operations do preserve the input order. But there are also many operations that don't. And which are which often changes between releases. (Not on purpose usually, but it's an easy bug to introduce. And sometimes it is done intentionally, e.g. to make functions faster. It sucks to have to make a function slower for everyone because someone somewhere is depending on memory layout default details.) And there are operations where it's not even clear what preserving order means (indexing a C array with a Fortran array, add(C, fortran), ...), and even lots of operations that intrinsically break contiguity/ordering (transpose, diagonal, slicing, swapaxes, ...), so you will end up with mixed orderings one way or another in any non-trivial program.

Instead, it's better to explicitly specify order= just at the places where you care. That way your code is *locally* correct (you can check it will work by just reading the one function). The alternative is to try and enforce a *global* property on your program (everyone everywhere is very careful to only use contiguity-preserving operations, where everyone includes third party libraries like numpy and others). In software design, local invariants are always better than global invariants -- the most well known example is local variables versus global variables, but the principle is much broader.

-n
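One way to read the "local invariant" advice is to enforce the layout only at the boundary where it matters. The sketch below assumes a hypothetical function, column_sums_for_fortran_lib, standing in for code that hands a buffer to a Fortran routine; it uses np.asfortranarray, which copies only when the input is not already Fortran-contiguous.

```python
import numpy as np

def column_sums_for_fortran_lib(a):
    """Hypothetical boundary function that needs a Fortran-contiguous buffer."""
    a = np.asfortranarray(a)  # copies only if a is not already F-contiguous
    # From here on, the layout is guaranteed locally, whatever the caller passed.
    return a.sum(axis=0)

x = np.zeros((4, 6), order='F')

# Ordinary operations can silently change or break contiguity.
sliced = x[::2, :]   # strided view: neither C- nor F-contiguous
transposed = x.T     # view that is C-contiguous, not F-contiguous

print(sliced.flags.f_contiguous, transposed.flags.f_contiguous)

# The boundary function works for all of them without global assumptions.
for arr in (x, sliced, transposed):
    print(column_sums_for_fortran_lib(arr).shape)
```

The caller never has to reason about which earlier operations preserved ordering; reading the one function is enough to see that the requirement holds.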
Re: [Numpy-discussion] Change default order to Fortran order
On 02/08/15 22:14, Kang Wang wrote:
> Thank you all for replying!
>
> I did a quick test, using python 2.6.6, and the original numpy package on my Linux computer without any change.
>
>     x = np.zeros((2,3),dtype=np.int32,order='F')
>     print "x.strides =",
>     print x.strides
>     y = x + 1
>     print "y.strides =",
>     print y.strides
>
> Output:
>
>     x.strides = (4, 8)
>     y.strides = (12, 4)

Update NumPy. This is the behavior I talked about that has changed. Now NumPy does this:

    In [21]: x = np.zeros((2,3),dtype=np.int32,order='F')
    In [22]: y = x + 1
    In [24]: x.strides
    Out[24]: (4, 8)
    In [25]: y.strides
    Out[25]: (4, 8)

Sturla
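For readers repeating the check above on Python 3, a small sketch follows. It assumes a reasonably recent NumPy, where element-wise ufuncs such as addition keep the layout of a Fortran-ordered input; as noted elsewhere in the thread, this is not guaranteed for every operation.

```python
import numpy as np

x = np.zeros((2, 3), dtype=np.int32, order='F')
y = x + 1  # element-wise ufunc; the result follows the input layout

print("x.strides =", x.strides)
print("y.strides =", y.strides)
print("y is Fortran-contiguous:", y.flags.f_contiguous)
```

Checking flags.f_contiguous is usually more readable than comparing stride tuples by hand, since strides depend on the item size of the dtype.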
Re: [Numpy-discussion] Change default order to Fortran order
Thank you all for replying!

I did a quick test, using python 2.6.6, and the original numpy package on my Linux computer without any change.

    x = np.zeros((2,3),dtype=np.int32,order='F')
    print "x.strides =",
    print x.strides
    y = x + 1
    print "y.strides =",
    print y.strides

Output:

    x.strides = (4, 8)
    y.strides = (12, 4)

So, basically, x is Fortran-style column-major (because I explicitly write order='F'), but y is C-style row-major. This is going to be very annoying. What I really want is:

- I do not have to write order='F' explicitly when declaring x
- both x and y are Fortran-style column-major

Which file should I modify to achieve this goal? Right now, I am just trying to get some basic stuff working with all arrays default to Fortran-style, and I can worry about interfacing with other code/libraries later.

Thanks,
Kang

On 08/02/15, Sebastian Berg sebast...@sipsolutions.net wrote:
> Well, numpy has a tendency to prefer C order. There is nothing you can do about that really.
> [...]

--
Kang Wang, Ph.D.
Highland Ave., Room 1113
Madison, WI 53705-2275
TEL 608-263-0066
http://www.medphysics.wisc.edu/~kang/
Re: [Numpy-discussion] Change default order to Fortran order
And to eliminate the order kwarg, use functools.partial to patch the zeros function (or any others, as needed):

    In [26]: import numpy as np
    In [27]: from functools import partial
    In [28]: np.zeros = partial(np.zeros, order='F')
    In [29]: x = np.zeros((2,3), dtype=np.int32)
    In [30]: y = x + 1
    In [31]: x.strides
    Out[31]: (4, 8)
    In [32]: y.strides
    Out[32]: (4, 8)
    In [33]: np.__version__
    Out[33]: '1.9.2'

Bryan

On Aug 2, 2015, at 3:22 PM, Sturla Molden sturla.mol...@gmail.com wrote:
> Update NumPy. This is the behavior I talked about that has changed.
> [...]
Re: [Numpy-discussion] Change default order to Fortran order
On Aug 2, 2015 6:59 AM, Kang Wang kwan...@wisc.edu wrote:
> Hi, I am an imaging researcher, and a new Python user. My first Python project is to somehow modify NumPy source code such that everything is Fortran column-major by default.
> [...]
> This must be a common request/need, I believe.

It isn't, I'm afraid.

Basically what you're signing up for is to maintain your own copy of numpy all by yourself. You're totally within your rights to do this, but it isn't something I would really recommend as a first python project (to put it mildly).

And unfortunately, there are plenty of libraries out there that use numpy and assume they will get C order by default, so your version of numpy will create lots of obscure errors, segfaults, etc. as you start using it with other packages. Obviously this will be a problem for you -- basically you may find yourself having to maintain your own copy of lots of libraries. Less obviously, this would also create a big problem for us, because your users will start filing bug reports on numpy, or on these random third party packages, and it will be massively confusing and a big waste of time because the problem will be with your package, not with any of our code.

So if you do do this, please either (a) change the name of your package somehow ('import numpyfortran' or similar) so that everyone using it is clear that it's a non-standard product, or else (b) make sure that you only use it within your own team, don't allow anyone else to use it, and make a rule that no one is allowed to file bug reports, or ask or answer questions on mailing lists or stackoverflow, unless they have first double checked *every* time that what they're saying is also valid when using regular numpy.

Again, I strongly recommend you not do this. There are literally millions of users who are using numpy as it currently is, and able to get stuff done. I don't know your specific situation, but maybe if you describe a bit more what it is you're doing and why you think you need all-Fortran-all-the-time, then people will be able to suggest strategies to work around things on your end, or find smaller tweaks to numpy that could go into the standard version.

-n
Re: [Numpy-discussion] Change default order to Fortran order
On Aug 2, 2015 1:17 PM, Kang Wang kwan...@wisc.edu wrote:
> Thank you all for replying!
>
> I did a quick test, using python 2.6.6,

There's pretty much no good reason these days to be using python 2.6 (which was released in *2008*). I assume you're using it because you're using redhat or some redhat derivative, and that's what they ship by default? Even redhat engineers officially recommend that users *not* use the default python -- it's basically only intended for use by their own built-in system management scripts.

If you're just getting started with python, then at this point I'd recommend starting with python 3.4. Some easy ways to get this installed:

- Anaconda: the most popular scientific python distribution -- you pretty much just download one file and get a full, up to date setup of python and all the main scientific packages, in your home directory. Supported on all popular platforms. Trivial to use and requires no special permissions. http://continuum.io/downloads#py34
- One of Anaconda's competitors: http://www.scipy.org/install.html
- Software collections: redhat's official way to do things like this: https://www.softwarecollections.org/en/

-n
Re: [Numpy-discussion] Change default order to Fortran order
Just chiming in with my 2 cents, in direct response to your points... - Image oriented processing is most typically done with row-major storage layout. From hardware to general software implementations. - Well really think of it as [slice,] row, column (logical)... you don't actually have to be concerned about the layout unless you want higher performance - in which case for a better access pattern you process a fundamental image-line at a time. I also find it helps me avoid bugs with xyz semantics by working with rows and columns only and remembering x=col, y = row. - I'm most familiar with having slice first like the above. - ITK is stored as row-major actually, but it's index type has dimensions specified as column,row, slice . Matlab does alot of things column order and thus acts different from implementations which can result in different outputs, but matlab seems perfectly happy living on an island where it's the only implementation providing a specific answer given a specific input. - Numpy is 0 based...? Good luck keeping it all sane though, -Jason On Sun, Aug 2, 2015 at 7:16 PM, Kang Wang kwan...@wisc.edu wrote: Thank you all for replying and providing useful insights and suggestions. The reasons I really want to use column-major are: - I am image-oriented user (not matrix-oriented, as explained in http://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues ) - I am so used to read/write I(x, y, z) in textbook and code, and it is very likely that if the environment (row-major environment) forces me to write I(z, y, x), I will write a bug if I am not 100% focused. When this happens, it is difficult to debug, because everything compile and build fine. You will see run time error. Depending on environment, you may get useful error message (i.e. index out of range), but sometimes you just get bad image results. - It actually has not too much to do with the actual data layout in memory. 
In imaging processing, especially medical imaging where I am working in, if you have a 3D image, everyone will agree that in memory, the X index is the fasted changing index, and the Z dimension (we often call it the slice dimension) has the largest stride in memory. So, if data layout is like this in memory, and image-oriented users are so used to read/write I(x,y,z), the only storage order that makes sense is column-major - I also write code in MATLAB and C/C++. In MATLAB, matrix is column-major array. In C/C++, we often use ITK, which is also column-major ( http://www.itk.org/Doxygen/html/classitk_1_1Image.html). I really prefer always read/write column-major code to minimize coding bugs related to storage order. - I also prefer index to be 0-based; however, there is nothing I can do about it for MATLAB (which is 1-based). I can see that my original thought about modifying NumPy source and re-compile is probably a bad idea. The suggestions about using fortran_zeros = partial(np.zeros(order='F')) is probably the best way so far, in my opinion, and I am going to give it a try. Again, thank you all for replying. Kang On 08/02/15, *Nathaniel Smith * n...@pobox.com wrote: On Aug 2, 2015 7:30 AM, Sturla Molden sturla.mol...@gmail.com wrote: On 02/08/15 15:55, Kang Wang wrote: Can anyone provide any insight/help? There is no default order. There was before, but now all operators control the order of their return arrays from the order of their input array. This is... overoptimistic. I would not rely on this in code that I wrote. It's true that many numpy operations do preserve the input order. But there are also many operations that don't. And which are which often changes between releases. (Not on purpose usually, but it's an easy bug to introduce. And sometimes it is done intentionally, e.g. to make functions faster. It sucks to have to make a function slower for everyone because someone somewhere is depending on memory layout default details.) 
And there are operations where it's not even clear what preserving order means (indexing a C array with a Fortran array, add(C, fortran), ...), and even lots of operations that intrinsically break contiguity/ordering (transpose, diagonal, slicing, swapaxes, ...), so you will end up with mixed orderings one way or another in any non-trivial program.

Instead, it's better to explicitly specify order= just at the places where you care. That way your code is *locally* correct (you can check it will work by just reading the one function). The alternative is to try and enforce a *global* property on your program (everyone everywhere is very careful to only use contiguity-preserving operations, where everyone includes third party libraries like numpy and others). In software design, local invariants are always better than global invariants -- the most well known example is local variables versus global variables, but the principle is much broader.

-n
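The layout question discussed above can be checked directly from an array's strides. The following is only a minimal sketch: the volume dimensions and the names nz, ny, nx, vol, vol_f and view_xyz are made up for illustration. It suggests that a default (C-order) array indexed as [z, y, x] already has x as the fastest-changing index in memory, and that its transpose gives I[x, y, z]-style indexing over the same buffer.

```python
import numpy as np

# Hypothetical volume size: 4 slices (z), 3 rows (y), 5 columns (x).
nz, ny, nx = 4, 3, 5

# Default C (row-major) order, indexed as vol[z, y, x].
vol = np.zeros((nz, ny, nx), dtype=np.float32)

# Strides are bytes per step along each axis, here (z, y, x).
# The last axis (x) has the smallest stride, so x changes fastest in memory.
print(vol.strides)  # (60, 20, 4)

# A Fortran-order array indexed as vol_f[x, y, z] has the same physical layout:
# x is again the fastest-changing index and z has the largest stride.
vol_f = np.zeros((nx, ny, nz), dtype=np.float32, order='F')
print(vol_f.strides)  # (4, 20, 60)

# Transposing the C-order array yields an [x, y, z] view without copying.
view_xyz = vol.T
view_xyz[2, 1, 3] = 7.0      # x=2, y=1, z=3
print(vol[3, 1, 2])          # 7.0, the same element seen as [z, y, x]
```

Under these assumptions, the choice between writing I[z, y, x] and I[x, y, z] is a matter of which view you index, not of how the bytes are stored.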
Re: [Numpy-discussion] Change default order to Fortran order
Thank you all for replying and providing useful insights and suggestions. The reasons I really want to use column-major are:

- I am an image-oriented user (not matrix-oriented, as explained in http://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues)
- I am so used to reading/writing I(x, y, z) in textbooks and code that if the environment (a row-major environment) forces me to write I(z, y, x), I will very likely write a bug if I am not 100% focused. When this happens, it is difficult to debug, because everything compiles and builds fine. You will see a run-time error. Depending on the environment, you may get a useful error message (e.g. index out of range), but sometimes you just get bad image results.
- It actually has not too much to do with the actual data layout in memory. In image processing, especially medical imaging, where I work, if you have a 3D image, everyone will agree that in memory the X index is the fastest-changing index and the Z dimension (we often call it the slice dimension) has the largest stride. So, if the data is laid out like this in memory, and image-oriented users are so used to reading/writing I(x,y,z), the only storage order that makes sense is column-major.
- I also write code in MATLAB and C/C++. In MATLAB, a matrix is a column-major array. In C/C++, we often use ITK, which is also column-major (http://www.itk.org/Doxygen/html/classitk_1_1Image.html). I really prefer to always read/write column-major code to minimize coding bugs related to storage order.
- I also prefer indices to be 0-based; however, there is nothing I can do about it for MATLAB (which is 1-based).

I can see that my original thought about modifying the NumPy source and recompiling is probably a bad idea. The suggestion of using fortran_zeros = partial(np.zeros, order='F') is probably the best way so far, in my opinion, and I am going to give it a try.

Again, thank you all for replying.
Kang

On 08/02/15, Nathaniel Smith n...@pobox.com wrote:

On Aug 2, 2015 7:30 AM, Sturla Molden sturla.mol...@gmail.com wrote:

On 02/08/15 15:55, Kang Wang wrote:

Can anyone provide any insight/help?

There is no default order. There was before, but now all operators control the order of their return arrays from the order of their input array.

This is... overoptimistic. I would not rely on this in code that I wrote. It's true that many numpy operations do preserve the input order. But there are also many operations that don't. And which are which often changes between releases. (Not on purpose usually, but it's an easy bug to introduce. And sometimes it is done intentionally, e.g. to make functions faster. It sucks to have to make a function slower for everyone because someone somewhere is depending on memory layout default details.)

And there are operations where it's not even clear what preserving order means (indexing a C array with a Fortran array, add(C, fortran), ...), and even lots of operations that intrinsically break contiguity/ordering (transpose, diagonal, slicing, swapaxes, ...), so you will end up with mixed orderings one way or another in any non-trivial program.

Instead, it's better to explicitly specify order= just at the places where you care. That way your code is *locally* correct (you can check it will work by just reading the one function). The alternative is to try and enforce a *global* property on your program (everyone everywhere is very careful to only use contiguity-preserving operations, where everyone includes third party libraries like numpy and others). In software design, local invariants are always better than global invariants -- the most well known example is local variables versus global variables, but the principle is much broader.

-n

--
Kang Wang, Ph.D.
Highland Ave., Room 1113
Madison, WI 53705-2275
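The advice quoted above, to specify order= only at the places where layout matters, might look roughly like the sketch below. It is an illustration, not code from the thread: as_column_major is a hypothetical helper name, and the array shapes are arbitrary. Note that functools.partial takes the callable and its keyword arguments separately.

```python
from functools import partial

import numpy as np

# partial() wraps the function itself; order='F' is passed as a keyword.
fortran_zeros = partial(np.zeros, order='F')

a = fortran_zeros((3, 4))
print(a.shape)                   # (3, 4): the shape and indexing are unchanged
print(a.flags['F_CONTIGUOUS'])   # True: only the memory layout differs

# Ordinary operations can break the layout: a row slice of an F-ordered
# 2-D array is not contiguous in either sense.
row = a[1, :]
print(row.flags['C_CONTIGUOUS'], row.flags['F_CONTIGUOUS'])  # False False


def as_column_major(arr):
    """Hypothetical boundary helper: return arr in Fortran order.

    np.asfortranarray copies only when the input is not already
    Fortran-contiguous.
    """
    return np.asfortranarray(arr)


# a.T is C-contiguous, so the layout is enforced here, at the one place
# where some external column-major routine would need it.
out = as_column_major(a.T)
print(out.shape, out.flags['F_CONTIGUOUS'])  # (4, 3) True
```

Enforcing the layout at the boundary keeps the check local to one function, rather than relying on every intermediate operation to preserve ordering.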