Re: [Numpy-discussion] Looking for a difference between Numpy 1.19.5 and 1.20 explaining a perf regression with Pythran

2021-03-12 Thread Sebastian Berg
On Sat, 2021-03-13 at 00:33 +0100, PIERRE AUGIER wrote:
> Hi,
> 
> I tried to compile Numpy with `pip install numpy==1.20.1 --no-binary
> numpy --force-reinstall` and I can reproduce the regression.
> 
> Good news, I was able to reproduce the difference with only Numpy
> 1.20.1. 
> 
> Arrays prepared with (`df` is a Pandas dataframe)
> 
> arr = df.values.copy()
> 
> or 
> 
> arr = np.ascontiguousarray(df.values)
> 
> lead to "slow" execution while arrays prepared with
> 
> arr = np.copy(df.values)
> 
> lead to faster execution.
> 
> arr.copy() or np.copy(arr) do not give the same result, with arr
> obtained from a Pandas dataframe with arr = df.values. It's strange
> because type(df.values) gives <class 'numpy.ndarray'>, so I would
> expect arr.copy() and np.copy(arr) to give exactly the same result.

The only things that could differ are the array's flags and
`arr.strides`, but they should not have changed.  And there is no
change in NumPy that I can even remotely think of.  Array data is just
allocated with `malloc`.

That is: as I understand it, you are *not* timing `np.copy` or
`np.ascontiguousarray` itself, but just operating on the array returned.
NumPy only ever uses `malloc` for allocating array content.
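
For reference, one way to compare the two arrays directly (a minimal
sketch, assuming `df` is the dataframe loaded in bench.py) is:

    import numpy as np

    fast = np.copy(df.values)
    slow = np.ascontiguousarray(df.values)
    for name, arr in [("np.copy", fast), ("ascontiguousarray", slow)]:
        # flags, strides and data-pointer alignment are about the only
        # properties of the result that could plausibly differ
        print(name, arr.flags["C_CONTIGUOUS"], arr.flags["ALIGNED"],
              arr.strides, arr.ctypes.data % 64)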

> 
> Note that I think I'm doing quite serious and reproducible
> benchmarks. I also checked that this regression is reproducible on
> another computer.

I absolutely trust the benchmark results. I was hoping you might also
be running a profiler (as in analyzing the running program) to find out
where the differences originate on the C side.  That would allow us to
say with certainty either what changed, or that there was no actual
related code change.

E.g. I have seen huge speed differences in the same `memcpy` or similar
calls, for unclear reasons (maybe compiler changes, or address-space
changes... or maybe the former causing the latter, I don't know).

Cheers,

Sebastian


> [...]

Re: [Numpy-discussion] Programmatically contracting multiple tensors

2021-03-12 Thread Andras Deak
On Sat, Mar 13, 2021 at 1:32 AM Eric Wieser  wrote:
>
> Einsum has a secret integer argument format that appears in the Examples
> section of the `np.einsum` docs, but is not mentioned at all in
> the parameter listing.

It's mentioned (albeit somewhat cryptically) earlier, in the Notes:

"einsum also provides an alternative way to provide the subscripts and
operands as einsum(op0, sublist0, op1, sublist1, ..., [sublistout]).
If the output shape is not provided in this format einsum will be
calculated in implicit mode, otherwise it will be performed
explicitly. The examples below have corresponding einsum calls with
the two parameter methods.

New in version 1.10.0."

Not that this helps much, because I definitely wouldn't understand
this API without the examples.
But I'm not sure _where_ this could be highlighted among the
parameters; after all, it is all covered by the *operands parameter.
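
For instance, Michael's handwritten call can be spelled in that format
(a sketch, with the integer labels chosen arbitrarily) as:

    # 'Cc,Ee,abcde->abCdE', with a..e mapped to 0..4, C to 10 and E to 11
    np.einsum(R, [10, 2], R, [11, 4], M, [0, 1, 2, 3, 4], [0, 1, 10, 3, 11])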

András



> [...]


Re: [Numpy-discussion] Programmatically contracting multiple tensors

2021-03-12 Thread Eric Wieser
Einsum has a secret integer argument format that appears in the Examples
section of the `np.einsum` docs, but is not mentioned at all in
the parameter listing.
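
That format also removes the 26-letter limit. A minimal sketch of
Michael's use case built on it (the helper name and its argument order
are made up for illustration):

    import numpy as np

    def rotate_labeled_axes(M, labels, R, target="coord"):
        # Contract the rotation matrix R into every axis of M whose label
        # equals `target`, using einsum's integer-label ("sublist") form.
        in_sub = list(range(M.ndim))
        out_sub = list(in_sub)
        ops = []
        for ax, lab in enumerate(labels):
            if lab == target:
                fresh = M.ndim + ax       # an unused integer label for the rotated axis
                ops += [R, [fresh, ax]]   # R[new, old], like 'Cc' in the string form
                out_sub[ax] = fresh
        return np.einsum(*ops, M, in_sub, out_sub)

    # For M of shape (2, 47, 3, 47, 3) with labels
    # ['spin', 'atom', 'coord', 'atom', 'coord'] and a 3x3 R, this
    # reproduces np.einsum('Cc,Ee,abcde->abCdE', R, R, M).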

Eric

On Sat, 13 Mar 2021 at 00:25, Michael Lamparski 
wrote:

> [...]


[Numpy-discussion] Programmatically contracting multiple tensors

2021-03-12 Thread Michael Lamparski
Greetings,

I have something in my code where I can receive an array M of unknown
dimensionality and a list of "labels" for each axis.  E.g. perhaps I might
get an array of shape (2, 47, 3, 47, 3) with labels ['spin', 'atom',
'coord', 'atom', 'coord'].

For every axis that is labeled "coord", I want to multiply in some rotation
matrix R.  So, for the above example, this could be done with the following
handwritten line:

return np.einsum('Cc,Ee,abcde->abCdE', R, R, M)

But since I want to do this programmatically, I find myself in the awkward
situation of having to construct this string (and e.g. having to
arbitrarily limit the number of axes to 26 or something like that).  Is
there a more idiomatic way to do this that would let me supply integer
labels for summation indices?  Or should I just bite the bullet and start
generating strings?

---
Michael


Re: [Numpy-discussion] Looking for a difference between Numpy 1.19.5 and 1.20 explaining a perf regression with Pythran

2021-03-12 Thread Eric Firing

On 2021/03/12 1:33 PM, PIERRE AUGIER wrote:

arr.copy() or np.copy(arr) do not give the same result, with arr obtained from a 
Pandas dataframe with arr = df.values. It's strange because type(df.values) gives 
<class 'numpy.ndarray'>, so I would expect arr.copy() and np.copy(arr) to give 
exactly the same result.


According to the docstrings for numpy.copy and arr.copy, the function 
and the method have different defaults for the memory layout.  np.copy() 
tries to maintain the order of the original while arr.copy() defaults to 
C order.
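
A minimal illustration of that difference, using a small Fortran-ordered
array as a stand-in for whatever Pandas hands back:

    import numpy as np

    a = np.asfortranarray(np.arange(6.0).reshape(2, 3))
    print(a.copy().flags["C_CONTIGUOUS"])    # True: ndarray.copy() defaults to order='C'
    print(np.copy(a).flags["F_CONTIGUOUS"])  # True: np.copy() defaults to order='K',
                                             # which preserves the source layout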


Eric


Re: [Numpy-discussion] Looking for a difference between Numpy 1.19.5 and 1.20 explaining a perf regression with Pythran

2021-03-12 Thread PIERRE AUGIER
Hi,

I tried to compile Numpy with `pip install numpy==1.20.1 --no-binary numpy 
--force-reinstall` and I can reproduce the regression.

Good news, I was able to reproduce the difference with only Numpy 1.20.1. 

Arrays prepared with (`df` is a Pandas dataframe)

arr = df.values.copy()

or 

arr = np.ascontiguousarray(df.values)

lead to "slow" execution while arrays prepared with

arr = np.copy(df.values)

lead to faster execution.

arr.copy() or np.copy(arr) do not give the same result, with arr obtained from 
a Pandas dataframe with arr = df.values. It's strange because type(df.values) 
gives <class 'numpy.ndarray'>, so I would expect arr.copy() and np.copy(arr) to 
give exactly the same result.

Note that I think I'm doing quite serious and reproducible benchmarks. I also 
checked that this regression is reproducible on another computer.

Cheers,

Pierre

- Mail original -
> [...]


Re: [Numpy-discussion] Looking for a difference between Numpy 1.19.5 and 1.20 explaining a perf regression with Pythran

2021-03-12 Thread Sebastian Berg
On Fri, 2021-03-12 at 21:36 +0100, PIERRE AUGIER wrote:
> [...]

If you want to really dig into this, it would be good to do profiling
to find out where the differences are.

Without that, I don't have much appetite to investigate personally. The
reason is that fluctuations of ~30% (or even much more) when running
the NumPy benchmarks are very common.

I am not aware of an immediate change in NumPy, especially since you
are talking Pythran, where only the memory space or the interface code
should matter.
As to the interface code... I would expect it to be quite a bit faster,
not slower.
There was no change around data allocation, so at best what you are
seeing is a different pattern in how the "small array cache" ends up
being used.


Unfortunately, getting stable benchmarks that reflect code changes
exactly is tough...  Here is a nice blog post from Victor Stinner where
he had to go as far as using "profile guided compilation" to avoid
fluctuations:

https://vstinner.github.io/journey-to-stable-benchmark-deadcode.html

I somewhat hope that this is also the reason for the huge fluctuations
we see in the NumPy benchmarks due to absolutely unrelated code
changes.
But I did not have the energy to try it (and a probably fixed bug in
gcc makes it a bit harder right now).

Cheers,

Sebastian




> Cheers,
> Pierre


[Numpy-discussion] Documentation Team meeting - Monday March 15 (Beware of Daylight Saving Time!)

2021-03-12 Thread Melissa Mendonça
Hi all!

Our next Documentation Team meeting will be on *Monday, March 15* at ***4PM
UTC*** (This has probably changed for you if you have recently gone through
a DST change).

All are welcome - you don't need to already be a contributor to join. If
you have questions or are curious about what we're doing, we'll be happy to
meet you!

If you wish to join on Zoom, use this link:

https://zoom.us/j/96219574921?pwd=VTRNeGwwOUlrYVNYSENpVVBRRjlkZz09#success

Here's the permanent hackmd document with the meeting notes (still being
updated in the next few days!):

https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg


Hope to see you around!

** You can click this link to get the correct time in your timezone:
https://www.timeanddate.com/worldclock/fixedtime.html?msg=NumPy+Documentation+Team+Meeting&iso=20210315T16&p1=1440&ah=1

- Melissa


[Numpy-discussion] Looking for a difference between Numpy 1.19.5 and 1.20 explaining a perf regression with Pythran

2021-03-12 Thread PIERRE AUGIER
Hi,

I'm looking for a difference between Numpy 1.19.5 and 1.20 which could explain 
a performance regression (~15 %) with Pythran.

I observe this regression with the script 
https://github.com/paugier/nbabel/blob/master/py/bench.py

Pythran reimplements Numpy, so this is not about the Numpy code used for 
computation. However, Pythran of course uses the native array contained in a 
Numpy array. I'm quite sure that something has changed between Numpy 1.19.5 
and 1.20 (or between the corresponding wheels?) since I don't get the same 
performance with Numpy 1.20. I checked that the values in the arrays are the 
same and that the flags characterizing the arrays are also the same.

Good news, I'm now able to obtain the performance difference just with Numpy 
1.19.5. In this code, I load the data with Pandas and need to prepare 
contiguous Numpy arrays to give to Pythran. With Numpy 1.19.5, if I use 
np.copy I get better performance than with np.ascontiguousarray. With Numpy 
1.20, both functions create arrays giving the same performance with Pythran 
(again, less good than with Numpy 1.19.5).
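
(A minimal way to expose such an effect, with a trivial stand-in for the
Pythran-compiled kernel, would be something like:

    import numpy as np
    from timeit import timeit

    def kernel(a):  # hypothetical stand-in for the compiled function
        return a.sum()

    fast = np.copy(df.values)
    slow = np.ascontiguousarray(df.values)
    print(timeit(lambda: kernel(fast), number=1000))
    print(timeit(lambda: kernel(slow), number=1000))

where `df` is the dataframe loaded in bench.py.)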

Note that this code is very efficient (more than 100 times faster than using 
Numpy), so I guess that things like alignment or memory location can lead to 
such a difference.

More details in this issue 
https://github.com/serge-sans-paille/pythran/issues/1735

Any help to understand what has changed would be greatly appreciated!

Cheers,
Pierre