[Numpy-discussion] 1.10.x is branched

2015-08-02 Thread Charles R Harris
Hi All,

Numpy 1.10.x is branched. There is still some cleanup to do before the
alpha release, but that should be coming in a couple of days.

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Change default order to Fortran order

2015-08-02 Thread Daniel Sank
Kang,

Thank you for explaining your motivation. It's clear from your last note,
as you said, that your desire for column-first indexing has nothing to do
with in-memory data layout. That being the case, I strongly urge you to
just use bare numpy and do not use the fortran_zeros function I
recommended before. Changing the in-memory layout via the order keyword
in numpy.zeros will not change the way indexing works at all. You gain
absolutely nothing by changing the in-memory order unless you are writing
some C or Fortran code which will interact with the data in memory.

To see what I mean, consider the following examples:

x = np.array([[1, 2, 3], [4, 5, 6]])
x.shape
 (2, 3)

and

x = np.array([[1, 2, 3], [4, 5, 6]], order='F')
x.shape
 (2, 3)

You see that changing the in-memory order has nothing whatsoever to do with
the array's shape or how you access it.
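The same point as a small runnable sketch, assuming a recent NumPy under Python 3 (the variable names are only illustrative). The two arrays differ in strides, i.e. in memory layout, but not in shape or in what an index expression returns:

```python
import numpy as np

data = [[1, 2, 3], [4, 5, 6]]
c_arr = np.array(data, dtype=np.int32)             # row-major (C) layout
f_arr = np.array(data, dtype=np.int32, order='F')  # column-major (Fortran) layout

# Shape and indexing are identical; only the strides (memory layout) differ.
same_shape = c_arr.shape == f_arr.shape
same_values = bool((c_arr == f_arr).all())
print(same_shape, same_values, c_arr[0, 2], f_arr[0, 2])
print(c_arr.strides, f_arr.strides)
```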

 You will see run time error. Depending on environment, you may get useful
 error message (i.e. index out of range), but sometimes you just get bad
 image results.

Could you give a very simple example of what you mean? I can't think of how
this could ever happen and your fear here makes me think there's a
fundamental misunderstanding about how array operations in numpy and other
programming languages work. As an example, iteration in numpy goes through
the first index:

x = np.array([[1, 2, 3], [4, 5, 6]])
for foo in x:
    ...

Inside the for loop, foo takes on the values [1, 2, 3] on the first
iteration and [4, 5, 6] on the second. If you want to iterate through the
columns just do this instead

x = np.array([[1, 2, 3], [4, 5, 6]])
for foo in x.T:
    ...

If your complaint is that you want np.array([[1, 2, 3], [4, 5, 6]]) to
produce an array with shape (3, 2) then you should own up to the fact that
the array constructor expects it the other way around and do this

x = np.array([[1, 2, 3], [4, 5, 6]]).T

instead. This is infinity times better than trying to write a shim function
or patch numpy because with .T you're using (fast) built-in functionality
which other people reading your code will understand.

The real message here is that whether the first index runs over rows or
columns is actually meaningless. The only places the row versus column
issue has any meaning is when doing input/output (in which case you should
use the transpose if you actually need it), or when doing iteration. One
thing that would make sense if you're reading from a binary file format
which uses column-major format would be to write your own reader function:

def read_fortran_style_binary_file(file):
    return np.fromfile(file).T

Note that if you do this then you already have a column major array in
numpy and you don't have to worry about any other transposes (except,
again, when doing more I/O or passing to something like a plotting
function).
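One caveat when trying this out: np.fromfile returns a flat 1-D array, so a reader has to be told the intended shape before a transpose means anything. Below is a rough sketch under some assumptions that are not in the message above: a raw file with no header, float64 data, and a hypothetical helper name read_column_major with a shape argument.

```python
import os
import tempfile
import numpy as np

def read_column_major(path, shape, dtype=np.float64):
    """Hypothetical helper: read a raw column-major file into an array of `shape`."""
    flat = np.fromfile(path, dtype=dtype)
    # Reshape with the axes reversed (C order), then transpose so that the
    # first index is the fastest-varying one, as in the file.
    return flat.reshape(shape[::-1]).T

# Round trip through a temporary file written in column-major order.
original = np.arange(6, dtype=np.float64).reshape(2, 3)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "image.raw")
    original.ravel(order='F').tofile(path)
    restored = read_column_major(path, original.shape)
```

The result is a transposed view of the data that was read, so no extra copy is made and the array comes out Fortran-contiguous.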




On Sun, Aug 2, 2015 at 7:16 PM, Kang Wang kwan...@wisc.edu wrote:

 Thank you all for replying and providing useful insights and suggestions.

 The reasons I really want to use column-major are:

- I am an image-oriented user (not matrix-oriented, as explained in
http://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues).
- I am so used to reading/writing I(x, y, z) in textbooks and code that,
if the environment (a row-major environment) forces me to write
I(z, y, x), I will very likely write a bug whenever I am not 100% focused.
When this happens, it is difficult to debug, because everything compiles
and builds fine. You will see a run-time error. Depending on the
environment, you may get a useful error message (i.e. index out of range),
but sometimes you just get bad image results.
- It actually does not have much to do with the actual data layout in
memory. In image processing, especially medical imaging where I work, if
you have a 3D image, everyone will agree that in memory the X index is the
fastest-changing index, and the Z dimension (we often call it the slice
dimension) has the largest stride. So, if the data layout is like this in
memory, and image-oriented users are so used to reading/writing I(x,y,z),
the only storage order that makes sense is column-major.
- I also write code in MATLAB and C/C++. In MATLAB, a matrix is a
column-major array. In C/C++, we often use ITK, which is also column-major
(http://www.itk.org/Doxygen/html/classitk_1_1Image.html). I really prefer
to always read/write column-major code to minimize coding bugs related to
storage order.
- I also prefer indices to be 0-based; however, there is nothing I can do
about that in MATLAB (which is 1-based).

 I can see that my original thought about modifying the NumPy source and
 recompiling is probably a bad idea. The suggestion to use
 fortran_zeros = partial(np.zeros, order='F') is probably the best way so
 far, in my opinion, and I am going to give it a try.

 Again, thank you all for replying.

 Kang

 On 08/02/15, *Nathaniel 

Re: [Numpy-discussion] Change default order to Fortran order

2015-08-02 Thread Juan Nunez-Iglesias
Hi Kang,

Feel free to come chat about your application on the scikit-image list [1]!
I'll note that we've been through the array order discussion many times
there and even have a doc page about it [2].

The short version is that you'll save yourself a lot of pain by starting to
think of your images as (plane, row, column) instead of (x, y, z). The
syntax actually becomes friendlier too. For example, to do something to
each slice of data, you do:

for plane in image:
    plane += foo

instead of

for z in range(image.shape[2]):
    image[:, :, z] += foo

for example.
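A small self-contained version of the first loop, with a made-up array size and per-plane offsets standing in for foo (both are assumptions for illustration only):

```python
import numpy as np

# Illustrative volume with axes (plane, row, column).
image = np.zeros((4, 3, 2))
offsets = np.arange(4.0)

# Iterating over the first axis yields one 2-D plane (a view) at a time,
# so in-place updates modify `image` itself.
for plane, offset in zip(image, offsets):
    plane += offset
```

Because each plane is a view rather than a copy, the in-place += is what makes the loop write back into image.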

Juan.

[1] scikit-im...@googlegroups.com
[2]
http://scikit-image.org/docs/dev/user_guide/numpy_images.html#coordinate-conventions

PS: As to the renamed Fortran-ordered numpy, may I suggest funpy. The F
is for Fortran and the fun is for all the fun you'll have maintaining it. =P


Re: [Numpy-discussion] mailmap update

2015-08-02 Thread Tom Poole
Hi Chuck,

Tom Poole <t.b.poole at gmail.com> tpoole <t.b.poole at gmail.com>

Tom

 On 2 Aug 2015, at 06:08, Charles R Harris charlesr.har...@gmail.com wrote:
 
 Hi All,
 
 I'm trying to update the .mailmap file on github and could use some help. The 
 current version seems common to both numpy and scipy, hence the crosspost.  
 Here is what I've got so far.
 
 Alex Griffing <argriffi at ncsu.edu> alex <argriffi at ncsu.edu>
 Alex Griffing <argriffi at ncsu.edu> argriffing <argriffi at ncsu.edu>
 Alex Griffing <argriffi at ncsu.edu> argriffing <argriffing at users.noreply.github.com>
 Behzad Nouri <behzadnouri at gmail.com> behzad nouri <behzadnouri at gmail.com>
 Carl Kleffner <cmkleffner at gmail.com> carlkl <cmkleffner at gmail.com>
 Christoph Gohlke <cgohlke at uci.edu> Christolph Gohlke <cgohlke at uci.edu>
 Christoph Gohlke <cgohlke at uci.edu> cgholke <? at ?>
 Christoph Gohlke <cgohlke at uci.edu> cgohlke <cgohlke at uci.edu>
 Han Genuit <hangenuit at gmail.com> Han <hangenuit at gmail.com>
 Jaime Fernandez <jaime.frio at gmail.com> Jaime <jaime.frio at gmail.com>
 Jaime Fernandez <jaime.frio at gmail.com> jaimefrio <jaime.frio at gmail.com>
 Mark Wiebe <mwwiebe at gmail.com> Mark <mwwiebe at gmail.com>
 Mark Wiebe <mwwiebe at gmail.com> Mark Wiebe <mwiebe at enthought.com>
 Mark Wiebe <mwwiebe at gmail.com> Mark Wiebe <mwiebe at georg.(none)>
 Nathaniel J. Smith <njs at pobox.com> njsmith <njs at pobox.com>
 Ondřej Čertík <ondrej.certik at gmail.com> Ondrej Certik <ondrej.certik at gmail.com>
 Ralf Gommers <ralf.gommers at googlemail.com> rgommers <ralf.gommers at googlemail.com>
 Saullo Giovani <saullogiovani at gmail.com> saullogiovani <saullogiovani at gmail.com>
 Sebastian Berg <sebastian at sipsolutions.net> seberg <sebastian at sipsolutions.net>
 
 Anon <a...@gmail.com> abdulmuneer <abdulmuneer at gmail.com>
 Anon <a...@gmail.com> amir <ladsgroup at gmail.com>
 Anon <a...@gmail.com> cel <cel.gentoo at gmail.com>
 Anon <a...@gmail.com> chebee7i <chebee7i at gmail.com>
 Anon <a...@gmail.com> empeeu <empeeu at yahoo.com>
 Anon <a...@gmail.com> endolith <endolith at gmail.com>
 Anon <a...@gmail.com> hannaro <hroehling at gmx.net>
 Anon <a...@gmail.com> hpaulj <hpj3 at myuw.net>
 Anon <a...@gmail.com> immerrr <immerrr at gmail.com>
 Anon <a...@gmail.com> jmrosen155 <Jordan at Jordans-MacBook-Pro.local>
 Anon <a...@gmail.com> jnothman <jnothman at student.usyd.edu.au>
 Anon <a...@gmail.com> kanhua <kanhwa at gmail.com>
 Anon <a...@gmail.com> mamikony <ernest.mamikonyan at sig.com>
 Anon <a...@gmail.com> mbyt <random.seed at web.de>
 Anon <a...@gmail.com> mlai <mlai at begws92.beg.utexas.edu>
 Anon <a...@gmail.com> ryanblak <rbtnet at gmail.com>
 Anon <a...@gmail.com> styr <styr.py at gmail.com>
 Anon <a...@gmail.com> tdihp <tdihp at hotmail.com>
 Anon <a...@gmail.com> tpoole <t.b.poole at gmail.com>
 Anon <a...@gmail.com> wim glenn <wim.glenn at melbourneit.com.au>
 
 The Anon author is just a stand-in for an unknown author. I can make a guess 
 at some of those, but would prefer it if the people in question could supply 
 their proper name and address.
 
 TIA,
 
 Chuck
 


[Numpy-discussion] Change default order to Fortran order

2015-08-02 Thread Kang Wang
Hi,

I am an imaging researcher, and a new Python user. My first Python project is 
to somehow modify NumPy source code such that everything is Fortran 
column-major by default.


I read about the information in the link below, but for us, the fact is that we 
absolutely want to use Fortran column major, and we want to make it default. 
Explicitly writing  order = 'F'  all over the place is not acceptable to us.
http://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues


I tried searching in this email list, as well as google search in general. 
However, I have not found anything useful. This must be a common request/need, 
I believe.


Can anyone provide any insight/help?


Thank you very much,


Kang

--
Kang Wang, Ph.D.
 Highland Ave., Room 1113
Madison, WI 53705-2275



Re: [Numpy-discussion] Change default order to Fortran order

2015-08-02 Thread Sturla Molden
On 02/08/15 15:55, Kang Wang wrote:

 Can anyone provide any insight/help?

There is no default order. There was before, but now all operators 
control the order of their return arrays from the order of their input 
array. The only thing that makes C order default is the keyword 
argument to np.empty, np.ones and np.zeros. Just monkey patch those 
functions and it should be fine.

Sturla



Re: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.?

2015-08-02 Thread Sturla Molden
On 31/07/15 09:38, Julian Taylor wrote:

 A long is only machine word wide on posix, in windows its not.

Actually it is the opposite. A pointer is 64 bit on AMD64, but the 
native integer and pointer offset is only 32 bit. But it does not matter 
because it is int that should be machine word sized, not long, which it 
is on both platforms.
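For anyone who wants to check what their own platform does, here is a quick sketch using the standard-library ctypes module. The dictionary name is arbitrary, and the printed values depend on the platform and the Python build, so none are claimed here:

```python
import ctypes

# Sizes in bytes of C types as seen by this Python build.
sizes = {
    "int": ctypes.sizeof(ctypes.c_int),
    "long": ctypes.sizeof(ctypes.c_long),
    "void*": ctypes.sizeof(ctypes.c_void_p),
}
print(sizes)
```

Comparing the output on 64-bit Linux and 64-bit Windows is an easy way to see where the two platforms disagree about long.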

Sturla



Re: [Numpy-discussion] Change default order to Fortran order

2015-08-02 Thread Sebastian Berg
Well, numpy has a tendency to prefer C order. There is nothing you can do about 
that really. But you just cannot be sure what you get in some cases. 
Often you need something specific for interfacing with other code. But in that 
case quite often you also do not need to fear the copy.

- Sebastian 


On Sun Aug 2 16:27:08 2015 GMT+0200, Sturla Molden wrote:
 On 02/08/15 15:55, Kang Wang wrote:
 
  Can anyone provide any insight/help?
 
 There is no default order. There was before, but now all operators 
 control the order of their return arrays from the order of their input 
 array. The only thing that makes C order default is the keyword 
 argument to np.empty, np.ones and np.zeros. Just monkey patch those 
 functions and it should be fine.
 
 Sturla
 


Re: [Numpy-discussion] Change default order to Fortran order

2015-08-02 Thread Sturla Molden
On 02/08/15 22:28, Bryan Van de Ven wrote:
 And to eliminate the order kwarg, use functools.partial to patch the zeros 
 function (or any others, as needed):

This will probably break code that depends on NumPy, like SciPy and 
scikit-image. But if NumPy is all that matters, sure go ahead and monkey 
patch. Otherwise keep the patched functions in another namespace.

:-)

Sturla




Re: [Numpy-discussion] Change default order to Fortran order

2015-08-02 Thread Daniel Sank
Could you please explain why you need 'F' ordering? It's pretty unlikely
that you actually care about the internal memory layout, and you'll get
better advice if you explain why you think you do care.

 My first Python project is to somehow modify NumPy source
 code such that everything is Fortran column-major by default.

This is the road to pain. You'll have to maintain your own fork and will
probably inject bugs when trying to rewrite. Nobody will want to help fix
them because everyone else just uses numpy as is.

 And to eliminate the order kwarg, use functools.partial to patch the
 zeros function (or any others, as needed):

Instead of monkey patching, why not just define your own shims:

fortran_zeros = partial(np.zeros, order='F')

Seems like this would lead to a lot less confusion (although until you tell
us why you care about the in-memory layout I don't know the point of doing
this at all).
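A runnable sketch of the shim idea, assuming Python 3 and a recent NumPy. functools.partial takes the function and the keyword argument separately; the fortran_* names are hypothetical, and np.zeros, np.ones and np.empty themselves are left untouched:

```python
from functools import partial
import numpy as np

# Hypothetical shim names; np.zeros, np.ones and np.empty are left untouched.
fortran_zeros = partial(np.zeros, order='F')
fortran_ones = partial(np.ones, order='F')
fortran_empty = partial(np.empty, order='F')

a = fortran_zeros((2, 3), dtype=np.int32)
b = np.zeros((2, 3), dtype=np.int32)
```

Code that imports numpy elsewhere still gets the stock constructors, which is the main advantage over patching them in place.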


Re: [Numpy-discussion] Change default order to Fortran order

2015-08-02 Thread Nathaniel Smith
On Aug 2, 2015 7:30 AM, Sturla Molden sturla.mol...@gmail.com wrote:

 On 02/08/15 15:55, Kang Wang wrote:

  Can anyone provide any insight/help?

 There is no default order. There was before, but now all operators
 control the order of their return arrays from the order of their input
 array.

This is... overoptimistic. I would not rely on this in code that I wrote.

It's true that many numpy operations do preserve the input order. But there
are also many operations that don't. And which are which often changes
between releases. (Not on purpose usually, but it's an easy bug to
introduce. And sometimes it is done intentionally, e.g. to make functions
faster. It sucks to have to make a function slower for everyone because
someone somewhere is depending on memory layout default details.) And there
are operations where it's not even clear what preserving order means
(indexing a C array with a Fortran array, add(C, fortran), ...), and even
lots of operations that intrinsically break contiguity/ordering (transpose,
diagonal, slicing, swapaxes, ...), so you will end up with mixed orderings
one way or another in any non-trivial program.

Instead, it's better to explicitly specify order= just at the places where
you care. That way your code is *locally* correct (you can check it will
work by just reading the one function). The alternative is to try and
enforce a *global* property on your program (everyone everywhere is very
careful to only use contiguity-preserving operations, where everyone
includes third party libraries like numpy and others). In software design,
local invariants are always better than global invariants -- the
most well known example is local variables versus global variables, but the
principle is much broader.
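One possible shape for such a local check, as a sketch only: the function name and the omitted external call are hypothetical, while np.asfortranarray is the stock NumPy conversion and copies only when the input is not already Fortran-contiguous.

```python
import numpy as np

def call_column_major_routine(arr):
    """Hypothetical boundary function for code that needs Fortran layout."""
    # Convert (copying only if necessary) exactly where the layout matters.
    arr_f = np.asfortranarray(arr, dtype=np.float64)
    assert arr_f.flags.f_contiguous
    # ... pass arr_f to the external routine here (omitted) ...
    return arr_f

x = np.arange(6.0).reshape(2, 3)   # C order
sliced = x[:, ::2]                 # neither C- nor F-contiguous
result = call_column_major_routine(sliced)
```

The rest of the program can then pass in arrays of any layout, and the function can be checked for correctness by reading it alone.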

-n


Re: [Numpy-discussion] Change default order to Fortran order

2015-08-02 Thread Sturla Molden
On 02/08/15 22:14, Kang Wang wrote:
 Thank you all for replying!

 I did a quick test, using python 2.6.6, and the original numpy package
 on my Linux computer without any change.
 ==
 x = np.zeros((2,3),dtype=np.int32,order='F')
 print "x.strides ="
 print x.strides

 y = x + 1
 print "y.strides ="
 print y.strides
 ==

 Output:
 
 x.strides =
 (4, 8)
 y.strides =
 (12, 4)
 

Update NumPy. This is the behavior I talked about that has changed.

Now NumPy does this:


In [21]: x = np.zeros((2,3),dtype=np.int32,order='F')

In [22]: y = x + 1

In [24]: x.strides
Out[24]: (4, 8)

In [25]: y.strides
Out[25]: (4, 8)



Sturla







Re: [Numpy-discussion] Change default order to Fortran order

2015-08-02 Thread Kang Wang
Thank you all for replying!


I did a quick test, using python 2.6.6, and the original numpy package on my 
Linux computer without any change.
==
x = np.zeros((2,3),dtype=np.int32,order='F')
print "x.strides ="
print x.strides

y = x + 1
print "y.strides ="
print y.strides
==

Output:

x.strides =
(4, 8)
y.strides =
(12, 4)



So, basically, x is Fortran-style column-major (because I explicitly write 
order='F'), but y is C-style row-major. This is going to be very annoying. 
What I really want is:
- I do not have to write order='F' explicitly when declaring x
- both x and y are Fortran-style column-major


Which file should I modify to achieve this goal?


Right now, I am just trying to get some basic stuff working with all arrays 
default to Fortran-style, and I can worry about interfacing with other 
code/libraries later.


Thanks,


Kang


--
Kang Wang, Ph.D.
 Highland Ave., Room 1113
Madison, WI 53705-2275
TEL 608-263-0066
http://www.medphysics.wisc.edu/~kang/ 



Re: [Numpy-discussion] Change default order to Fortran order

2015-08-02 Thread Bryan Van de Ven
And to eliminate the order kwarg, use functools.partial to patch the zeros 
function (or any others, as needed):

In [26]: import numpy as np

In [27]: from functools import partial

In [28]: np.zeros = partial(np.zeros, order='F')

In [29]: x = np.zeros((2,3), dtype=np.int32)

In [30]: y = x + 1

In [31]: x.strides
Out[31]: (4, 8)

In [32]: y.strides
Out[32]: (4, 8)

In [33]: np.__version__ 
Out[33]: '1.9.2' 

Bryan 



Re: [Numpy-discussion] Change default order to Fortran order

2015-08-02 Thread Nathaniel Smith
On Aug 2, 2015 6:59 AM, Kang Wang kwan...@wisc.edu wrote:

 Hi,

 I am an imaging researcher, and a new Python user. My first Python
project is to somehow modify NumPy source code such that everything is
Fortran column-major by default.

 I read about the information in the link below, but for us, the fact is
that we absolutely want to  use Fortran column major, and we want to make
it default. Explicitly writing  order = 'F'  all over the place is not
acceptable to us.

http://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues

 I tried searching in this email list, as well as google search in
general. However, I have not found anything useful. This must be a common
request/need, I believe.

It isn't, I'm afraid. Basically what you're signing up for is to maintain
your own copy of numpy all by yourself. You're totally within your rights
to do this, but it isn't something I would really recommend as a first
python project (to put it mildly).

And unfortunately, there are plenty of libraries out there that use numpy
and assume they will get C order by default, so your version of numpy will
create lots of obscure errors, segfaults, etc. as you start using it with
other packages. Obviously this will be a problem for you -- basically you
may find yourself having to maintain your own copy of lots of libraries.
Less obviously, this would also create a big problem for us, because your
users will start filing bug reports on numpy, or on these random third
party packages, and it will be massively confusing and a big waste of time
because the problem will be with your package, not with any of our code. So
if you do do this, please either (a) change the name of your package
somehow ('import numpyfortran' or similar) so that everyone using it is
clear that it's a non-standard product, or else (b) make sure that you only
use it within your own team, don't allow anyone else to use it, and make a
rule that no one is allowed to file bug reports, or ask or answer questions
on mailing lists or stackoverflow, unless they have first double checked
*every* time that what they're saying is also valid when using regular
numpy.

Again, I strongly recommend you not do this.

There are literally millions of users who are using numpy as it currently
is, and able to get stuff done. I don't know your specific situation, but
maybe if you describe a bit more what it is you're doing and why you think
you need all-Fortran-all-the-time, then people will be able to suggest
strategies to work around things on your end, or find smaller tweaks to
numpy that could go into the standard version.

-n


Re: [Numpy-discussion] Change default order to Fortran order

2015-08-02 Thread Nathaniel Smith
On Aug 2, 2015 1:17 PM, Kang Wang kwan...@wisc.edu wrote:

 Thank you all for replying!

 I did a quick test, using python 2.6.6,

There's pretty much no good reason these days to be using python 2.6 (which
was released in *2008*). I assume you're using it because you're using
redhat or some redhat derivative, and that's what they ship by default?
Even redhat engineers officially recommend that users *not* use the default
python -- it's basically only intended for use by their own built-in system
management scripts.

If you're just getting started with python, then at this point I'd
recommend starting with python 3.4.

Some easy ways to get this installed:
- Anaconda: the most popular scientific python distribution -- you pretty
much just download one file and get a full, up to date setup of python and
all the main scientific packages, in your home directory. Supported on all
popular platforms. Trivial to use and requires no special permissions.
http://continuum.io/downloads#py34
- One of Anaconda's competitors: http://www.scipy.org/install.html
- Software collections: redhat's official way to do things like this:
https://www.softwarecollections.org/en/

-n


Re: [Numpy-discussion] Change default order to Fortran order

2015-08-02 Thread Jason Newton
Just chiming in with my 2 cents, in direct response to your points...

   - Image oriented processing is most typically done with row-major
   storage layout.  From hardware to general software implementations.
   - Well really think of it as [slice,] row, column (logical)... you don't
   actually have to be concerned about the layout unless you want higher
   performance - in which case for a better access pattern you process a
   fundamental image-line at a time.  I also find it helps me avoid bugs with
   xyz semantics by working with rows and columns only and remembering x=col,
   y = row.
   - I'm most familiar with having slice first like the above.
   - ITK is stored as row-major actually, but its index type has
   dimensions specified as column, row, slice. Matlab does a lot of things
   in column order and thus acts differently from other implementations,
   which can result in different outputs, but Matlab seems perfectly happy
   living on an island
   where it's the only implementation providing a specific answer given a
   specific input.
   - Numpy is 0 based...?
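
To make the "[slice,] row, column" convention above concrete, here is a minimal sketch. The volume size and the set_voxel helper are made up for illustration; they are not part of NumPy or ITK.

```python
import numpy as np

# Hypothetical volume: 4 slices, each 3 rows by 5 columns, stored in
# NumPy's default row-major (C) layout.
n_slices, n_rows, n_cols = 4, 3, 5
volume = np.zeros((n_slices, n_rows, n_cols), dtype=np.uint8)


def set_voxel(vol, x, y, z, value):
    """Write one voxel from image-style coordinates.

    Hypothetical helper: x is the column, y is the row, z is the slice.
    """
    vol[z, y, x] = value


set_voxel(volume, x=4, y=1, z=2, value=255)

# One image line (a single row of a single slice) is contiguous in memory,
# so processing line by line gives a friendly access pattern.
line = volume[2, 1]
print(line.flags['C_CONTIGUOUS'])  # True
print(line.tolist())               # [0, 0, 0, 0, 255]
```

Keeping the x/y/z translation in one small helper means the rest of the code only ever talks about slices, rows and columns.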

Good luck keeping it all sane though,

-Jason

On Sun, Aug 2, 2015 at 7:16 PM, Kang Wang kwan...@wisc.edu wrote:

 Thank you all for replying and providing useful insights and suggestions.

 The reasons I really want to use column-major are:

 - I am an image-oriented user (not matrix-oriented, as explained in
   http://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues).
 - I am so used to reading and writing I(x, y, z) in textbooks and code that,
   if a row-major environment forces me to write I(z, y, x), I will very
   likely write a bug whenever I am not 100% focused. When this happens, it is
   difficult to debug, because everything compiles and builds fine and the
   error only shows up at run time. Depending on the environment, you may get
   a useful error message (e.g. index out of range), but sometimes you just
   get bad image results.
 - It actually does not have much to do with the data layout in memory. In
   image processing, especially medical imaging, where I work, everyone will
   agree that for a 3D image the X index is the fastest-changing index in
   memory, and the Z dimension (we often call it the slice dimension) has the
   largest stride. So, if the data is laid out like this in memory, and
   image-oriented users are so used to reading and writing I(x, y, z), the
   only storage order that makes sense is column-major.
 - I also write code in MATLAB and C/C++. In MATLAB, a matrix is a
   column-major array. In C/C++, we often use ITK, which is also column-major
   (http://www.itk.org/Doxygen/html/classitk_1_1Image.html). I really prefer
   to always read and write column-major code, to minimize coding bugs
   related to storage order.
 - I also prefer indices to be 0-based; however, there is nothing I can do
   about that in MATLAB (which is 1-based).

 I can see that my original thought about modifying the NumPy source and
 re-compiling is probably a bad idea. The suggestion to use
 fortran_zeros = partial(np.zeros, order='F') is probably the best way so
 far, in my opinion, and I am going to give it a try.

 Again, thank you all for replying.

 Kang

 On 08/02/15, Nathaniel Smith n...@pobox.com wrote:

 On Aug 2, 2015 7:30 AM, Sturla Molden sturla.mol...@gmail.com wrote:
 
  On 02/08/15 15:55, Kang Wang wrote:
 
   Can anyone provide any insight/help?
 
  There is no default order. There was before, but now all operators
  control the order of their return arrays from the order of their input
  array.

 This is... overoptimistic. I would not rely on this in code that I wrote.

 It's true that many numpy operations do preserve the input order. But
 there are also many operations that don't. And which are which often
 changes between releases. (Not on purpose usually, but it's an easy bug to
 introduce. And sometimes it is done intentionally, e.g. to make functions
 faster. It sucks to have to make a function slower for everyone because
 someone somewhere is depending on memory layout default details.) And there
 are operations where it's not even clear what preserving order means
 (indexing a C array with a Fortran array, add(C, fortran), ...), and even
 lots of operations that intrinsically break contiguity/ordering (transpose,
 diagonal, slicing, swapaxes, ...), so you will end up with mixed orderings
 one way or another in any non-trivial program.

 Instead, it's better to explicitly specify order= just at the places where
 you care. That way your code is *locally* correct (you can check it will
 work by just reading the one function). The alternative is to try and
 enforce a *global* property on your program (everyone everywhere is very
 careful to only use contiguity-preserving operations, where everyone
 includes third party libraries like numpy and others). In software design,
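 local invariants are always better than global invariants -- the most well
 known example is local variables versus global variables, but the principle
 is much broader.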

Re: [Numpy-discussion] Change default order to Fortran order

2015-08-02 Thread Kang Wang
Thank you all for replying and providing useful insights and suggestions.

The reasons I really want to use column-major are:

- I am an image-oriented user (not matrix-oriented, as explained in
  http://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues).
- I am so used to reading and writing I(x, y, z) in textbooks and code that,
  if a row-major environment forces me to write I(z, y, x), I will very
  likely write a bug whenever I am not 100% focused. When this happens, it is
  difficult to debug, because everything compiles and builds fine and the
  error only shows up at run time. Depending on the environment, you may get
  a useful error message (e.g. index out of range), but sometimes you just
  get bad image results.
- It actually does not have much to do with the data layout in memory. In
  image processing, especially medical imaging, where I work, everyone will
  agree that for a 3D image the X index is the fastest-changing index in
  memory, and the Z dimension (we often call it the slice dimension) has the
  largest stride. So, if the data is laid out like this in memory, and
  image-oriented users are so used to reading and writing I(x, y, z), the
  only storage order that makes sense is column-major.
- I also write code in MATLAB and C/C++. In MATLAB, a matrix is a
  column-major array. In C/C++, we often use ITK, which is also column-major
  (http://www.itk.org/Doxygen/html/classitk_1_1Image.html). I really prefer
  to always read and write column-major code, to minimize coding bugs
  related to storage order.
- I also prefer indices to be 0-based; however, there is nothing I can do
  about that in MATLAB (which is 1-based).
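
The memory-layout point in the list above (X fastest-changing, Z with the largest stride) can be checked directly from the strides. This is only a sketch with made-up image dimensions; it compares a Fortran-ordered array indexed as I[x, y, z] with a C-ordered array indexed as I[z, y, x].

```python
import numpy as np

nx, ny, nz = 5, 3, 4  # hypothetical image size, 1 byte per voxel

# Fortran-ordered array indexed as I[x, y, z]: x has the smallest stride.
img_f = np.zeros((nx, ny, nz), dtype=np.uint8, order='F')
print(img_f.strides)  # (1, 5, 15)

# A C-ordered array indexed as I[z, y, x] has the same byte layout:
# x is still the fastest-changing index, z still has the largest stride.
img_c = np.zeros((nz, ny, nx), dtype=np.uint8)
print(img_c.strides)  # (15, 5, 1)

# Its transpose is a view (no copy) that is indexed as I[x, y, z].
img_view = img_c.T
print(img_view.shape, img_view.strides)  # (5, 3, 4) (1, 5, 15)

img_view[4, 1, 2] = 7   # write through the (x, y, z) view
print(img_c[2, 1, 4])   # 7, the same voxel seen as (z, y, x)
```

So the two conventions describe the same bytes; only the order in which the indices are written differs.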

I can see that my original thought about modifying the NumPy source and
re-compiling is probably a bad idea. The suggestion to use fortran_zeros
= partial(np.zeros, order='F') is probably the best way so far, in my opinion,
and I am going to give it a try.
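
For reference, a minimal sketch of that helper, assuming it is meant to be built with functools.partial. Note that partial takes the function and the keyword argument separately, so np.zeros is only called when fortran_zeros is called.

```python
from functools import partial

import numpy as np

# Wrap np.zeros so that every call defaults to Fortran (column-major) order.
fortran_zeros = partial(np.zeros, order='F')

a = fortran_zeros((2, 3))
print(a.flags['F_CONTIGUOUS'])  # True
print(a.shape)                  # (2, 3): the shape and indexing are unchanged
```

Only the in-memory layout changes; a[i, j] still addresses row i, column j.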


Again, thank you all for replying.


Kang

On 08/02/15, Nathaniel Smith  n...@pobox.com wrote:
 
 On Aug 2, 2015 7:30 AM, Sturla Molden sturla.mol...@gmail.com wrote:
 
  On 02/08/15 15:55, Kang Wang wrote:
 
   Can anyone provide any insight/help?
 
  There is no default order. There was before, but now all operators
  control the order of their return arrays from the order of their input
  array.
  
 This is... overoptimistic. I would not rely on this in code that I wrote.
  
 It's true that many numpy operations do preserve the input order. But there 
 are also many operations that don't. And which are which often changes 
 between releases. (Not on purpose usually, but it's an easy bug to introduce. 
 And sometimes it is done intentionally, e.g. to make functions faster. It 
 sucks to have to make a function slower for everyone because someone 
 somewhere is depending on memory layout default details.) And there are 
 operations where it's not even clear what preserving order means (indexing a 
 C array with a Fortran array, add(C, fortran), ...), and even lots of 
 operations that intrinsically break contiguity/ordering (transpose, diagonal, 
 slicing, swapaxes, ...), so you will end up with mixed orderings one way or 
 another in any non-trivial program.
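
A small sketch of how quickly the layout changes under ordinary operations. The layout helper is hypothetical and only reads the standard contiguity flags.

```python
import numpy as np

c = np.zeros((3, 4))             # C-contiguous
f = np.zeros((3, 4), order='F')  # Fortran-contiguous


def layout(a):
    """Return (is C-contiguous, is Fortran-contiguous). Hypothetical helper."""
    return (bool(a.flags['C_CONTIGUOUS']), bool(a.flags['F_CONTIGUOUS']))


print(layout(c))          # (True, False)
print(layout(f))          # (False, True)
print(layout(c.T))        # (False, True): transposing flips the layout
print(layout(c[:, ::2]))  # (False, False): a strided slice is neither
print(layout(f[:2, :]))   # (False, False): so is a partial slice of f
```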
  
 Instead, it's better to explicitly specify order= just at the places where 
 you care. That way your code is *locally* correct (you can check it will work 
 by just reading the one function). The alternative is to try and enforce a 
 *global* property on your program (everyone everywhere is very careful to 
 only use contiguity-preserving operations, where everyone includes third 
 party libraries like numpy and others). In software design, local invariants 
 are always better than global invariants -- the most well known 
 example is local variables versus global variables, but the principle is much 
 broader.
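
One way to read the "locally correct" advice is to enforce the layout at the boundary of the function that needs it. This is a sketch only; column_sums_fortran is a made-up stand-in for a routine that hands its data to Fortran code.

```python
import numpy as np


def column_sums_fortran(a):
    """Hypothetical routine that requires Fortran-ordered input."""
    # Enforce the layout here, where it matters. np.asfortranarray returns
    # the input unchanged if it is already Fortran-contiguous and copies
    # otherwise, so callers may pass arrays of any layout.
    a = np.asfortranarray(a)
    assert a.flags['F_CONTIGUOUS']
    return a.sum(axis=0)


x = np.arange(6.0).reshape(2, 3)        # C-ordered input
print(column_sums_fortran(x).tolist())  # [3.0, 5.0, 7.0]
```

The function can be checked by reading it alone, without knowing how every caller built its arrays.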
  
 -n
  
 
--
Kang Wang, Ph.D.
 Highland Ave., Room 1113
Madison, WI 53705-2275
