On Sun, Dec 25, 2016 at 12:53 PM, Mikhail V <mikhail...@gmail.com> wrote:

> On Sat, Dec 24, 2016 at 5:12 PM, Mikhail V <mikhail...@gmail.com> wrote:
>>
>>> Probably there are more criteria here that I am not aware of
>>> and objective arguments to prefer "FORTRAN" order, apart
>>> from having more traditional [x,y] notation?
>>>
>> The argument I think comes from building/slicing matrices out of
>> (column) vectors. You see this a lot in numerical work. If the row is of
>> pointers, you can build sparse systems that reference the underlying vectors
>> without doing any copying (you can do this with row data instead, but then
>> you need row vectors, and that would be morally wrong). This is important
>> since building sparse systems can be very slow if you're not careful.
>>
>> I still avoid FORTRAN order because it's not mathy. E.g., the matrix
>> element "a_{0,2}" should be accessed as "a[0][2]". For an objective
>> argument, I'll note that graphics hardware--in particular VGA/VBE hardware,
>> which influenced later standards, e.g. HDMI--is row-major, top-to-bottom
>> raster order. This has been hugely influential, and is more-or-less
>> expected today by graphics programmers. It explains everything from most
>> windowing systems today having GUI controls at the top and left, to why GL
>> takes padded scanlines as texture input.
>>
>> One way or another, at this point, changing the order in PyGame is
>> probably a bad idea (backwards compatibility and suchlike). At the very
>> least, it would need to be deferred to a major update with breaking API
>> changes.
>>
>
> So you kind of agree that surfarray/pixelcopy should rather deal with C
> order?
>
Definitely.
​


> I am curious whether it is worth proposing adding methods which do so.
> I agree, one should not touch the existing API.
>
> Now I have tested the performance one more time, namely
> comparing 3 variants to copy data from array to surface:
> 1.     buf = Dest.get_buffer()
>         buf.write(Src.tostring(), 0)
> 2.     pygame.pixelcopy.array_to_surface(Dest, Src)
> 3.     pygame.pixelcopy.array_to_surface(Dest, Src.T)
>
> And it turned out that I was wrong about transpose being expensive.
> Actually transpose itself does not add significant overhead. First time
> I was testing it, I did something wrong.
>
> For method 2, if I define order="FORTRAN" for the original array,
> there is no difference in comparison to 3. But if I leave the default (C)
> order, then the performance degrades with bigger arrays
> (ca. 20% slower for an 800x600 8-bit array).
> So it is indeed an important thing.
>
Makes sense. For bigger arrays, caching becomes more important in the
copying, and implicit transposes of the order mean you thrash on reading.
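
A small NumPy sketch of why the transpose itself is nearly free while the copy is not: `.T` only swaps the strides of a view, so the cost appears later, when a copy walks memory in the "wrong" order. The array name and the 800x600 8-bit size follow the test described above; the rest is illustrative and does not reproduce the original benchmark.

```python
import numpy as np

# 800x600 8-bit image in NumPy's default C order: shape is (rows, cols) = (y, x).
src = np.zeros((600, 800), dtype=np.uint8)

# .T returns a view: no pixel data moves, only shape and strides are swapped.
src_t = src.T
print(src.shape, src.strides)      # (600, 800) (800, 1)
print(src_t.shape, src_t.strides)  # (800, 600) (1, 800)
print(np.shares_memory(src, src_t))

# The view is Fortran-contiguous: the same bytes, now indexed as [x, y].
print(src_t.flags["F_CONTIGUOUS"], src_t.flags["C_CONTIGUOUS"])

# Forcing the view into C order is where the work happens: a new buffer is
# allocated and filled by reading the source with a stride of 800 bytes.
src_t_c = np.ascontiguousarray(src_t)
print(np.shares_memory(src, src_t_c))
```

A strided read of this kind is the access pattern that tends to thrash the cache once the array no longer fits in it.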


> Most interesting is that method 1, with the buffer write, seems to be always
> faster than the others, by ca. 5%. Not a big win, but still interesting...
> And if I try it with FORTRAN order, it becomes 2 times slower!
>
I'm not sure I fully parse what you're doing here. As long as it's safe,
copying buffers should be slightly faster since it's 1D--maybe the buffer
API is smart enough to step in larger chunks that might potentially
straddle a scanline, and you also have one fewer loop variable. When you
try it with FORTRAN order, producing a buffer of the same format would
require an allocation and then a copy, so that's probably why it's slower.
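
The reordering cost can be seen from NumPy alone. A minimal sketch, assuming a tiny 2x3 array as a stand-in for the image; `tobytes()` is used because `tostring()` was deprecated in later NumPy releases and removed in NumPy 2.0. It only shows what bytes come out, not how long that takes.

```python
import numpy as np

c_arr = np.arange(6, dtype=np.uint8).reshape(2, 3)  # C order (row-major)
f_arr = np.asfortranarray(c_arr)                    # same values, column-major memory

# tobytes() defaults to order="C", so it always emits row-major bytes.
# For c_arr that matches memory order; for f_arr the elements must be
# gathered out of memory order to build the same byte string.
c_bytes = c_arr.tobytes()
f_bytes = f_arr.tobytes()
print(c_bytes == f_bytes)

# order="A" keeps the in-memory order of a Fortran-contiguous array instead.
f_raw = f_arr.tobytes(order="A")
print(list(c_bytes))
print(list(f_raw))
```

So a buffer write fed from a FORTRAN-order array either pays for the reordering or receives the pixels in column-major order, which a row-major surface buffer would misinterpret.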

The NumPy internals documentation
<https://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues>
has salient things to say on this issue.

> So I would still look forward to having methods dealing with C order,
> just to avoid writing extra transposes and to comply fully
> with default numpy notation.
>
> Any comments or opinions about it?
> It would be good to know first, which of those things
> people use more often and make some use case examples.
>
Personally, I would like C order just because it's "expected" in
graphics*. Under this assumption, I wrote all my code e.g. looping over "y"
first, using the buffer API for GL interop, etc. This is optimal in the C
order every graphics programmer would expect, but in FORTRAN order, it's
*exactly* wrong. I never profiled both options because it's a nearly
fundamental assumption.
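
To make the "looping over y first" point concrete, here is a pure-Python sketch of the flat memory offsets such a loop touches under each layout. The helper names and the 4x3 size are hypothetical, chosen only for illustration.

```python
def row_major_offset(x, y, width, height):
    """Offset of pixel (x, y) when whole rows are stored one after another (C order)."""
    return y * width + x


def col_major_offset(x, y, width, height):
    """Offset of pixel (x, y) when whole columns are stored one after another (FORTRAN order)."""
    return x * height + y


def visit_order(offset, width, height):
    """Offsets touched by the usual 'for y: for x:' scanline loop."""
    return [offset(x, y, width, height)
            for y in range(height)
            for x in range(width)]


width, height = 4, 3
row_major = visit_order(row_major_offset, width, height)
col_major = visit_order(col_major_offset, width, height)
print(row_major)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print(col_major)  # [0, 3, 6, 9, 1, 4, 7, 10, 2, 5, 8, 11]
```

In row-major storage the scanline loop walks memory sequentially; in column-major storage the same loop jumps by `height` elements on every step.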

I mean, it's not terribly important. Python is not a fast language. One
writes stuff in Python because the program running 5x/50x slower is a
non-issue and one wants the expressivity. But free perf is free, so it's a
bit annoying.

*In the interest of fairness, it should be noted that there is an offshoot
of image processing (a subset of graphics) that might disagree. They're
very FORTRAN-y, using langs with 1-based indexing and both array orders.
They also tend to be non-CS/non-math types who work in industry, generating
appalling code.

> Mikhail
>
Ian
