Re: [Numpy-discussion] C vs. Fortran order -- misleading documentation?
On Wed, Jun 9, 2010 at 4:16 PM, Francesc Alted fal...@pytables.org wrote:

On Tuesday 08 June 2010 23:34:09 Anne Archibald wrote:

But the issue isn't one of efficiency, it's merely an arbitrarily chosen convention. (Does anyone know the history of the choices for FORTRAN and C, esp. why K&R chose the opposite of what was already in common usage in FORTRAN? Just curious.)

This is speculation, not knowledge, but it's worth pointing out that there are actually two ways to represent a multidimensional array in C: as a block of memory with appropriate type definitions, or as an array of pointers to subarrays. This latter approach is generally not used for numerical work, but is potentially useful for other applications. More relevantly, it already has a natural syntax; a[2][3][5] naturally follows the chain of pointers and gives you what you want; it also forces your last index to change most rapidly as you walk through memory. So it would be very odd if multidimensional arrays defined without pointers but using the same syntax were indexed the other way around. (Let's ignore abominations like 5[3[2[a]]].)

Hey, maybe it is only speculation, but this is the most convincing argument for breaking Fortran convention that I've ever heard.

I think that "arrays are just syntax on pointers" is indeed the key reason for how C works here. Since a[b] really means *(a + b) (which is why 5[a] and a[5] are the same), I don't see how to do it differently.

(Although I'm not sure C was really breaking Fortran convention, as both languages should have been born at more or less the same time, although I'd say that Fortran is a bit older.)

Fortran is the oldest language I am aware of, and certainly the oldest still widely in use. It is even older than Lisp: the first version is from 1956-57, and it was proposed by Backus to IBM in 1953, according to Wikipedia. It was created at a time when many people thought the very idea of a compiler did not make any sense and was impossible.
So yes, Fortran is *much* older than C. cheers, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] C vs. Fortran order -- misleading documentation?
I think that "arrays are just syntax on pointers" is indeed the key reason for how C works here. Since a[b] really means *(a + b) (which is why 5[a] and a[5] are the same), I don't see how to do it differently.

Holy crap! You can do that in C?!
Re: [Numpy-discussion] C vs. Fortran order -- misleading documentation?
On Thu, Jun 10, 2010 at 12:09 AM, Benjamin Root ben.r...@ou.edu wrote:

I think that "arrays are just syntax on pointers" is indeed the key reason for how C works here. Since a[b] really means *(a + b) (which is why 5[a] and a[5] are the same), I don't see how to do it differently.

Holy crap! You can do that in C?!

Yes:

    #include <stdio.h>

    int main(void)
    {
        float a[2] = {1.0f, 2.0f};
        /* a[1], *(a+1) and 1[a] all name the same element */
        printf("%f %f %f\n", a[1], *(a+1), 1[a]);
        return 0;
    }
Re: [Numpy-discussion] C vs. Fortran order -- misleading documentation?
On Wed, Jun 9, 2010 at 9:00 AM, David Cournapeau courn...@gmail.com wrote:

On Thu, Jun 10, 2010 at 12:09 AM, Benjamin Root ben.r...@ou.edu wrote:

I think that "arrays are just syntax on pointers" is indeed the key reason for how C works here. Since a[b] really means *(a + b) (which is why 5[a] and a[5] are the same), I don't see how to do it differently.

Holy crap! You can do that in C?!

Yes:

    #include <stdio.h>

    int main(void)
    {
        float a[2] = {1.0f, 2.0f};
        /* a[1], *(a+1) and 1[a] all name the same element */
        printf("%f %f %f\n", a[1], *(a+1), 1[a]);
        return 0;
    }

This is all _very_ educational (and I mean that sincerely), but can we please get back to the topic at hand ( :-) ). A specific proposal is on the table: we remove discussion of the whole C/Fortran ordering issue from basics.indexing.rst and promote it to a more advanced document TBD.

DG
Re: [Numpy-discussion] C vs. Fortran order -- misleading documentation?
On Mon, Jun 7, 2010 at 4:52 AM, Pavel Bazant maxpla...@seznam.cz wrote:

Correct me if I am wrong, but the paragraph

"Note to those used to IDL or Fortran memory order as it relates to indexing. Numpy uses C-order indexing. That means that the last index usually (see xxx for exceptions) represents the most rapidly changing memory location, unlike Fortran or IDL, where the first index represents the most rapidly changing location in memory. This difference represents a great potential for confusion."

in http://docs.scipy.org/doc/numpy/user/basics.indexing.html is quite misleading, as C-order means that the last index changes rapidly, not the memory location.

Pavel

Sounds correct (your criticism, that is) but I'm no expert, so I'm going to wait another 12 hours or so - to give others a chance to chime in - before correcting it.

DG

--
Mathematician: noun, someone who disavows certainty when their uncertainty set is non-empty, even if that set has measure zero.
Hope: noun, that delusive spirit which escaped Pandora's jar and, with her lies, prevents mankind from committing a general suicide. (As interpreted by Robert Graves)
Re: [Numpy-discussion] C vs. Fortran order -- misleading documentation?
On Mon, Jun 7, 2010 at 5:52 AM, Pavel Bazant maxpla...@seznam.cz wrote:

Correct me if I am wrong, but the paragraph

"Note to those used to IDL or Fortran memory order as it relates to indexing. Numpy uses C-order indexing. That means that the last index usually (see xxx for exceptions) represents the most rapidly changing memory location, unlike Fortran or IDL, where the first index represents the most rapidly changing location in memory. This difference represents a great potential for confusion."

in http://docs.scipy.org/doc/numpy/user/basics.indexing.html is quite misleading, as C-order means that the last index changes rapidly, not the memory location.

Any index can change rapidly, depending on whether it is in an inner loop or not. The important distinction between C and Fortran order is how indices translate to memory locations. The documentation seems correct to me, although it might make more sense to say the last index addresses a contiguous range of memory. Of course, with modern processors, actual physical memory can be mapped all over the place.

Chuck
Re: [Numpy-discussion] C vs. Fortran order -- misleading documentation?
On Tue, Jun 8, 2010 at 9:27 AM, Pavel Bazant maxpla...@seznam.cz wrote:

Correct me if I am wrong, but the paragraph

"Note to those used to IDL or Fortran memory order as it relates to indexing. Numpy uses C-order indexing. That means that the last index usually (see xxx for exceptions) represents the most rapidly changing memory location, unlike Fortran or IDL, where the first index represents the most rapidly changing location in memory. This difference represents a great potential for confusion."

in http://docs.scipy.org/doc/numpy/user/basics.indexing.html is quite misleading, as C-order means that the last index changes rapidly, not the memory location.

Any index can change rapidly, depending on whether it is in an inner loop or not. The important distinction between C and Fortran order is how indices translate to memory locations. The documentation seems correct to me, although it might make more sense to say the last index addresses a contiguous range of memory. Of course, with modern processors, actual physical memory can be mapped all over the place.

Chuck

To me, saying that the last index represents "the most rapidly changing memory location" means that if I change the last index, the memory location changes a lot, which is not true for C-order. So for C-order, supposing one scans the memory linearly (the desired scenario), it is the last *index* that changes most rapidly. The inverted picture looks like this: for C-order, changing the first index leads to the most rapid jump in *memory*.

Good point, I can see that that could be a source of potential confusion. Perhaps something along the lines that 1) the memory is in one contiguous slab, and 2) it is accessed in order by changing the rightmost indices fastest, would be better.

Chuck
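The two points suggested above (one contiguous slab, visited in order when the rightmost index changes fastest) can be sketched roughly as follows. This is only an illustration: the array name `a`, the 2x3 shape and the list `visited` are made up here and do not come from the documentation page under discussion.

```python
import numpy as np

# A small C-ordered array whose values equal their position in memory
# (assumption for illustration: arange followed by reshape, default order).
a = np.arange(6).reshape(2, 3)

# Loop with the rightmost index innermost, i.e. changing fastest.
visited = [int(a[i, j])
           for i in range(a.shape[0])
           for j in range(a.shape[1])]

print(visited)                  # [0, 1, 2, 3, 4, 5]
print(a.flags['C_CONTIGUOUS'])  # True
```

Because each value was chosen to equal its memory position, the increasing sequence shows that this loop nesting walks the slab from start to end.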
Re: [Numpy-discussion] C vs. Fortran order -- misleading documentation?
On Tue, Jun 8, 2010 at 8:27 AM, Pavel Bazant maxpla...@seznam.cz wrote:

Correct me if I am wrong, but the paragraph

"Note to those used to IDL or Fortran memory order as it relates to indexing. Numpy uses C-order indexing. That means that the last index usually (see xxx for exceptions) represents the most rapidly changing memory location, unlike Fortran or IDL, where the first index represents the most rapidly changing location in memory. This difference represents a great potential for confusion."

in http://docs.scipy.org/doc/numpy/user/basics.indexing.html is quite misleading, as C-order means that the last index changes rapidly, not the memory location.

Any index can change rapidly, depending on whether it is in an inner loop or not. The important distinction between C and Fortran order is how indices translate to memory locations. The documentation seems correct to me, although it might make more sense to say the last index addresses a contiguous range of memory. Of course, with modern processors, actual physical memory can be mapped all over the place.

Chuck

To me, saying that the last index represents "the most rapidly changing memory location" means that if I change the last index, the memory location changes a lot, which is not true for C-order. So for C-order, supposing one scans the memory linearly (the desired scenario), it is the last *index* that changes most rapidly. The inverted picture looks like this: for C-order, changing the first index leads to the most rapid jump in *memory*.

Still have the feeling the doc is very misleading on this important issue.

Pavel

The distinction between your two perspectives is that one is using for-loop traversal of indices, the other is using pointer-increment traversal of memory; from each of your perspectives, your conclusions are correct, but my inclination is that the pointer-increment traversal of memory perspective is closer to the spirit of the docstring, no?
DG
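To make the two perspectives concrete, here is a small sketch, offered only as an illustration. The first half steps through memory one element at a time and asks which index tuple each position maps to; the second half asks how far memory moves when a single index is incremented. The shape (2, 3), the variable names, and the use of `int8` (so that byte strides equal element counts) are assumptions made for this example.

```python
import numpy as np

shape = (2, 3)

# Perspective 1: pointer-increment traversal of memory.
# For each linear position n, which indices does it correspond to?
c_indices = [tuple(int(k) for k in np.unravel_index(n, shape, order='C'))
             for n in range(6)]
f_indices = [tuple(int(k) for k in np.unravel_index(n, shape, order='F'))
             for n in range(6)]
print(c_indices)  # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
print(f_indices)  # [(0, 0), (1, 0), (0, 1), (1, 1), (0, 2), (1, 2)]

# Perspective 2: for-loop traversal of indices.
# How many elements does memory jump when one index goes up by one?
# With a one-byte dtype the byte strides are also element counts.
c_jumps = np.zeros(shape, dtype=np.int8, order='C').strides
f_jumps = np.zeros(shape, dtype=np.int8, order='F').strides
print(c_jumps)  # (3, 1)
print(f_jumps)  # (1, 2)
```

In C order the last index changes fastest while walking memory, and the first index produces the largest jump in memory; both statements describe the same layout.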
Re: [Numpy-discussion] C vs. Fortran order -- misleading documentation?
On 06/08/2010 05:50 AM, Charles R Harris wrote:

On Tue, Jun 8, 2010 at 9:39 AM, David Goldsmith d.l.goldsm...@gmail.com wrote:

On Tue, Jun 8, 2010 at 8:27 AM, Pavel Bazant maxpla...@seznam.cz wrote:

Correct me if I am wrong, but the paragraph

"Note to those used to IDL or Fortran memory order as it relates to indexing. Numpy uses C-order indexing. That means that the last index usually (see xxx for exceptions) represents the most rapidly changing memory location, unlike Fortran or IDL, where the first index represents the most rapidly changing location in memory. This difference represents a great potential for confusion."

in http://docs.scipy.org/doc/numpy/user/basics.indexing.html is quite misleading, as C-order means that the last index changes rapidly, not the memory location.

Any index can change rapidly, depending on whether it is in an inner loop or not. The important distinction between C and Fortran order is how indices translate to memory locations. The documentation seems correct to me, although it might make more sense to say the last index addresses a contiguous range of memory. Of course, with modern processors, actual physical memory can be mapped all over the place.

Chuck

To me, saying that the last index represents "the most rapidly changing memory location" means that if I change the last index, the memory location changes a lot, which is not true for C-order. So for C-order, supposing one scans the memory linearly (the desired scenario), it is the last *index* that changes most rapidly. The inverted picture looks like this: for C-order, changing the first index leads to the most rapid jump in *memory*.

Still have the feeling the doc is very misleading on this important issue.
Pavel

The distinction between your two perspectives is that one is using for-loop traversal of indices, the other is using pointer-increment traversal of memory; from each of your perspectives, your conclusions are correct, but my inclination is that the pointer-increment traversal of memory perspective is closer to the spirit of the docstring, no?

I think the confusion is in "most rapidly changing memory location", which is kind of ambiguous because a change in the indices is always a change in memory location if one hasn't used index tricks and such. So from a time perspective it means nothing, while from a memory perspective the largest address changes come from the leftmost indices.

Exactly. Rate of change with respect to what, or as you do what?

I suggest something like the following wording, if you don't mind the verbosity as a means of conjuring up an image (although putting in diagrams would make it even clearer--undoubtedly there are already good illustrations somewhere on the web):

Note to those used to Matlab, IDL, or Fortran memory order as it relates to indexing. Numpy uses C-order indexing by default, although a numpy array can be designated as using Fortran order. [With C-order, sequential memory locations are accessed by incrementing the last index.] For a two-dimensional array, think of it as a table. With C-order indexing the table is stored as a series of rows, so that one is reading from left to right, incrementing the column (last) index, and jumping ahead in memory to the next row by incrementing the row (first) index. With Fortran order, the table is stored as a series of columns, so one reads memory sequentially from top to bottom, incrementing the first index, and jumps ahead in memory to the next column by incrementing the last index. One more difference to be aware of: numpy, like Python and C, uses zero-based indexing; Matlab, [IDL???], and Fortran start from one.
- If you want to keep it short, the key wording is in the sentence in brackets, and you can chop out the table illustration.

Eric
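The table picture in the proposed wording can be tried out directly. The sketch below is only an illustration under stated assumptions: the 2x3 `table` and all variable names are invented here, and `ravel(order='K')` is used as a convenient way to read the elements in the order they sit in memory for these two contiguous arrays.

```python
import numpy as np

# The same logical table, stored once row by row and once column by column.
table = [[1, 2, 3],
         [4, 5, 6]]
c_arr = np.array(table, order='C')
f_arr = np.array(table, order='F')

# Indexing sees identical contents regardless of storage order.
same_contents = bool((c_arr == f_arr).all())

# order='K' reads elements in the order they occur in memory.
c_memory = c_arr.ravel(order='K').tolist()
f_memory = f_arr.ravel(order='K').tolist()

print(same_contents)  # True
print(c_memory)       # [1, 2, 3, 4, 5, 6]  (a series of rows)
print(f_memory)       # [1, 4, 2, 5, 3, 6]  (a series of columns)

# Zero-based indexing: the top-left cell is [0, 0], not [1, 1].
first = int(c_arr[0, 0])
print(first)          # 1
```

The storage order changes only the memory layout; `c_arr[i, j]` and `f_arr[i, j]` refer to the same table cell.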
Re: [Numpy-discussion] C vs. Fortran order -- misleading documentation?
On 06/08/2010 08:16 AM, Eric Firing wrote:

[With C-order, sequential memory locations are accessed by incrementing the last index.]

Maybe change "sequential" to "contiguous".

Eric
Re: [Numpy-discussion] C vs. Fortran order -- misleading documentation?
On 8 June 2010 14:16, Eric Firing efir...@hawaii.edu wrote: On 06/08/2010 05:50 AM, Charles R Harris wrote: On Tue, Jun 8, 2010 at 9:39 AM, David Goldsmith d.l.goldsm...@gmail.com mailto:d.l.goldsm...@gmail.com wrote: On Tue, Jun 8, 2010 at 8:27 AM, Pavel Bazant maxpla...@seznam.cz mailto:maxpla...@seznam.cz wrote: Correct me if I am wrong, but the paragraph Note to those used to IDL or Fortran memory order as it relates to indexing. Numpy uses C-order indexing. That means that the last index usually (see xxx for exceptions) represents the most rapidly changing memory location, unlike Fortran or IDL, where the first index represents the most rapidly changing location in memory. This difference represents a great potential for confusion. in http://docs.scipy.org/doc/numpy/user/basics.indexing.html is quite misleading, as C-order means that the last index changes rapidly, not the memory location. Any index can change rapidly, depending on whether is in an inner loop or not. The important distinction between C and Fortran order is how indices translate to memory locations. The documentation seems correct to me, although it might make more sense to say the last index addresses a contiguous range of memory. Of course, with modern processors, actual physical memory can be mapped all over the place. Chuck To me, saying that the last index represents the most rapidly changing memory location means that if I change the last index, the memory location changes a lot, which is not true for C-order. So for C-order, supposed one scans the memory linearly (the desired scenario), it is the last *index* that changes most rapidly. The inverted picture looks like this: For C-order, changing the first index leads to the most rapid jump in *memory*. Still have the feeling the doc is very misleading at this important issue. 
Pavel The distinction between your two perspectives is that one is using for-loop traversal of indices, the other is using pointer-increment traversal of memory; from each of your perspectives, your conclusions are correct, but my inclination is that the pointer-increment traversal of memory perspective is closer to the spirit of the docstring, no? I think the confusion is in most rapidly changing memory location, which is kind of ambiguous because a change in the indices is always a change in memory location if one hasn't used index tricks and such. So from a time perspective it means nothing, while from a memory perspective the largest address changes come from the leftmost indices. Exactly. Rate of change with respect to what, or as you do what? I suggest something like the following wording, if you don't mind the verbosity as a means of conjuring up an image (although putting in diagrams would make it even clearer--undoubtedly there are already good illustrations somewhere on the web): Note to those used to Matlab, IDL, or Fortran memory order as it relates to indexing. Numpy uses C-order indexing by default, although a numpy array can be designated as using Fortran order. [With C-order, sequential memory locations are accessed by incrementing the last index.] For a two-dimensional array, think if it as a table. With C-order indexing the table is stored as a series of rows, so that one is reading from left to right, incrementing the column (last) index, and jumping ahead in memory to the next row by incrementing the row (first) index. With Fortran order, the table is stored as a series of columns, so one reads memory sequentially from top to bottom, incrementing the first index, and jumps ahead in memory to the next column by incrementing the last index. One more difference to be aware of: numpy, like python and C, uses zero-based indexing; Matlab, [IDL???], and Fortran start from one. 
- If you want to keep it short, the key wording is in the sentence in brackets, and you can chop out the table illustration.

I'd just like to point out a few warnings to keep in mind while rewriting this section:

Numpy arrays can have any configuration of memory strides, including some that are zero; C and Fortran contiguous arrays are simply those that have special arrangements of the strides. The actual stride values are normally almost irrelevant to Python code.

There is a second meaning of C and Fortran order: when you are reshaping an array, you can specify one order or the
Re: [Numpy-discussion] C vs. Fortran order -- misleading documentation?
On Tue, Jun 8, 2010 at 12:05 PM, Anne Archibald aarch...@physics.mcgill.cawrote: On 8 June 2010 14:16, Eric Firing efir...@hawaii.edu wrote: On 06/08/2010 05:50 AM, Charles R Harris wrote: On Tue, Jun 8, 2010 at 9:39 AM, David Goldsmith d.l.goldsm...@gmail.com mailto:d.l.goldsm...@gmail.com wrote: On Tue, Jun 8, 2010 at 8:27 AM, Pavel Bazant maxpla...@seznam.cz mailto:maxpla...@seznam.cz wrote: Correct me if I am wrong, but the paragraph Note to those used to IDL or Fortran memory order as it relates to indexing. Numpy uses C-order indexing. That means that the last index usually (see xxx for exceptions) represents the most rapidly changing memory location, unlike Fortran or IDL, where the first index represents the most rapidly changing location in memory. This difference represents a great potential for confusion. in http://docs.scipy.org/doc/numpy/user/basics.indexing.html is quite misleading, as C-order means that the last index changes rapidly, not the memory location. Any index can change rapidly, depending on whether is in an inner loop or not. The important distinction between C and Fortran order is how indices translate to memory locations. The documentation seems correct to me, although it might make more sense to say the last index addresses a contiguous range of memory. Of course, with modern processors, actual physical memory can be mapped all over the place. Chuck To me, saying that the last index represents the most rapidly changing memory location means that if I change the last index, the memory location changes a lot, which is not true for C-order. So for C-order, supposed one scans the memory linearly (the desired scenario), it is the last *index* that changes most rapidly. The inverted picture looks like this: For C-order, changing the first index leads to the most rapid jump in *memory*. Still have the feeling the doc is very misleading at this important issue. 
Pavel The distinction between your two perspectives is that one is using for-loop traversal of indices, the other is using pointer-increment traversal of memory; from each of your perspectives, your conclusions are correct, but my inclination is that the pointer-increment traversal of memory perspective is closer to the spirit of the docstring, no? I think the confusion is in most rapidly changing memory location, which is kind of ambiguous because a change in the indices is always a change in memory location if one hasn't used index tricks and such. So from a time perspective it means nothing, while from a memory perspective the largest address changes come from the leftmost indices. Exactly. Rate of change with respect to what, or as you do what? I suggest something like the following wording, if you don't mind the verbosity as a means of conjuring up an image (although putting in diagrams would make it even clearer--undoubtedly there are already good illustrations somewhere on the web): Note to those used to Matlab, IDL, or Fortran memory order as it relates to indexing. Numpy uses C-order indexing by default, although a numpy array can be designated as using Fortran order. [With C-order, sequential memory locations are accessed by incrementing the last index.] For a two-dimensional array, think if it as a table. With C-order indexing the table is stored as a series of rows, so that one is reading from left to right, incrementing the column (last) index, and jumping ahead in memory to the next row by incrementing the row (first) index. With Fortran order, the table is stored as a series of columns, so one reads memory sequentially from top to bottom, incrementing the first index, and jumps ahead in memory to the next column by incrementing the last index. One more difference to be aware of: numpy, like python and C, uses zero-based indexing; Matlab, [IDL???], and Fortran start from one. 
- If you want to keep it short, the key wording is in the sentence in brackets, and you can chop out the table illustration. I'd just like to point out a few warnings to keep in mind while rewriting this section: Numpy arrays can have any configuration of memory strides, including some that are zero; C and Fortran contiguous arrays are simply those that have special arrangements of
Re: [Numpy-discussion] C vs. Fortran order -- misleading documentation?
On 8 June 2010 17:17, David Goldsmith d.l.goldsm...@gmail.com wrote: On Tue, Jun 8, 2010 at 1:56 PM, Benjamin Root ben.r...@ou.edu wrote: On Tue, Jun 8, 2010 at 1:36 PM, Eric Firing efir...@hawaii.edu wrote: On 06/08/2010 08:16 AM, Eric Firing wrote: On 06/08/2010 05:50 AM, Charles R Harris wrote: On Tue, Jun 8, 2010 at 9:39 AM, David Goldsmithd.l.goldsm...@gmail.com mailto:d.l.goldsm...@gmail.com wrote: On Tue, Jun 8, 2010 at 8:27 AM, Pavel Bazantmaxpla...@seznam.cz mailto:maxpla...@seznam.cz wrote: Correct me if I am wrong, but the paragraph Note to those used to IDL or Fortran memory order as it relates to indexing. Numpy uses C-order indexing. That means that the last index usually (see xxx for exceptions) represents the most rapidly changing memory location, unlike Fortran or IDL, where the first index represents the most rapidly changing location in memory. This difference represents a great potential for confusion. in http://docs.scipy.org/doc/numpy/user/basics.indexing.html is quite misleading, as C-order means that the last index changes rapidly, not the memory location. Any index can change rapidly, depending on whether is in an inner loop or not. The important distinction between C and Fortran order is how indices translate to memory locations. The documentation seems correct to me, although it might make more sense to say the last index addresses a contiguous range of memory. Of course, with modern processors, actual physical memory can be mapped all over the place. Chuck To me, saying that the last index represents the most rapidly changing memory location means that if I change the last index, the memory location changes a lot, which is not true for C-order. So for C-order, supposed one scans the memory linearly (the desired scenario), it is the last *index* that changes most rapidly. The inverted picture looks like this: For C-order, changing the first index leads to the most rapid jump in *memory*. 
Still have the feeling the doc is very misleading on this important issue.

Pavel

The distinction between your two perspectives is that one is using for-loop traversal of indices, the other is using pointer-increment traversal of memory; from each of your perspectives, your conclusions are correct, but my inclination is that the pointer-increment traversal of memory perspective is closer to the spirit of the docstring, no?

I think the confusion is in "most rapidly changing memory location", which is kind of ambiguous because a change in the indices is always a change in memory location if one hasn't used index tricks and such. So from a time perspective it means nothing, while from a memory perspective the largest address changes come from the leftmost indices.

Exactly. Rate of change with respect to what, or as you do what?

I suggest something like the following wording, if you don't mind the verbosity as a means of conjuring up an image (although putting in diagrams would make it even clearer--undoubtedly there are already good illustrations somewhere on the web):

Note to those used to Matlab, IDL, or Fortran memory order as it relates to indexing. Numpy uses C-order indexing by default, although a numpy array can be designated as using Fortran order. [With C-order, sequential memory locations are accessed by incrementing the last

Maybe change "sequential" to "contiguous".

I was thinking maybe "subsequent" might be a better word. IMV, "contiguous" has more of a physical connotation. (That just isn't valid in Numpy, correct?) So I'd prefer "subsequent" as an alternative to "sequential".

In the end, we need to communicate this clearly. No matter which language, I have always found it difficult to get new programmers to understand the importance of knowing the difference between row-major and column-major. A thick paragraph isn't going to help to get the idea across to a person who doesn't even know that a problem exists.

Maybe a car analogy would be good here...

Maybe if one imagines city streets (where many of the streets are one-way) and needs to drop off mail at each address. Would it be more efficient to go up and back a street or to drop off mail at the
Re: [Numpy-discussion] C vs. Fortran order -- misleading documentation?
2010/6/8 Anne Archibald aarch...@physics.mcgill.ca:

Numpy arrays can have any configuration of memory strides, including some that are zero; C and Fortran contiguous arrays are simply those that have special arrangements of the strides. The actual stride values are normally almost irrelevant to Python code.

First, I don't see why this text made its way to this doc page at all - it's all abstract Python numpy indexing on that page as far as I can see - I don't know why a beginner should worry about strides and how the linear memory is actually organised - from my point of view I never did that.

To resolve the problem, and to avoid the confusion about "fast" and "slow", why not directly use the concept of strides, as Anne pointed out? Simply say that:

When an array is indexed with indices (i1, i2, i3, ..., in) and the array has strides (s1, s2, s3, ..., sn), the memory location addressed is:

    i1 * s1 + i2 * s2 + ... + in * sn

relative to the base point (and up to dtype).

I hope I'm not wrong here. For C order, s1 >= s2 >= ... >= sn >= 1, for Fortran order the other way round.

Friedrich

In fact it holds:

    s(k-1) = sk * Nk    for C order,
    s(k+1) = sk * Nk    for Fortran order,

where Nk is the length of dimension k. If I'm not mistaken, it's late at night.
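The stride relations above can be checked against NumPy with a short sketch. Everything here is illustrative rather than anything proposed in the thread: the helper names `c_strides`, `f_strides`, `byte_offset` and `element_via_strides` are hypothetical, strides are taken in bytes (as NumPy reports them) instead of elements, and the memory lookup assumes a C- or Fortran-contiguous array so that `ravel(order='K')` returns elements in memory order.

```python
import numpy as np

def c_strides(shape, itemsize):
    """Byte strides for C order: last stride is itemsize, s[k-1] = s[k] * N[k]."""
    strides = [itemsize] * len(shape)
    for k in range(len(shape) - 1, 0, -1):
        strides[k - 1] = strides[k] * shape[k]
    return tuple(strides)

def f_strides(shape, itemsize):
    """Byte strides for Fortran order: first stride is itemsize, s[k+1] = s[k] * N[k]."""
    strides = [itemsize] * len(shape)
    for k in range(len(shape) - 1):
        strides[k + 1] = strides[k] * shape[k]
    return tuple(strides)

def byte_offset(index, strides):
    """Offset from the base point: i1*s1 + i2*s2 + ... + in*sn."""
    return sum(i * s for i, s in zip(index, strides))

def element_via_strides(arr, index):
    """Look an element up through its byte offset (contiguous arrays only)."""
    memory = arr.ravel(order='K')  # elements in the order they sit in memory
    return int(memory[byte_offset(index, arr.strides) // arr.itemsize])

shape = (2, 3, 4)
a = np.arange(24, dtype=np.int64).reshape(shape)  # C order
f = np.asfortranarray(a)                          # same values, Fortran order

print(c_strides(shape, 8), a.strides)  # (96, 32, 8) (96, 32, 8)
print(f_strides(shape, 8), f.strides)  # (8, 16, 48) (8, 16, 48)
print(element_via_strides(a, (1, 0, 2)), element_via_strides(f, (1, 0, 2)))  # 14 14

# A zero stride, as mentioned in the quoted warning: a broadcast view
# repeats one row without copying it.
b = np.broadcast_to(np.arange(3, dtype=np.int64), (4, 3))
print(b.strides)  # (0, 8)
```

The same index reaches the same value in both layouts even though the byte offsets differ (112 for `a`, 104 for `f`), which is the sense in which the stride values rarely matter to Python code.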