Re: [Numpy-discussion] Improving performance of the `numpy.any` function.

2021-04-14 Thread Brock Mendel
FWIW in https://github.com/pandas-dev/pandas/issues/32339 I tried short-circuiting (left == right).all() with a naive Cython implementation. In the cases that _don't_ short-circuit, it was 2x slower than np.array_equal. On Wed, Apr 14, 2021 at 6:54 PM dan_patterson wrote: > a =
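
To illustrate the short-circuiting idea discussed in this message, here is a minimal pure-NumPy sketch. It is not the Cython implementation from the linked pandas issue; the function name `array_equal_chunked` and the `chunk_size` default are hypothetical choices made for this example. The comparison runs block by block and stops at the first block that differs.

```python
import numpy as np


def array_equal_chunked(left, right, chunk_size=65536):
    """Hypothetical helper: compare two arrays block by block.

    Returns False as soon as one block differs, so an early mismatch
    avoids scanning the remainder of the arrays.
    """
    left = np.asarray(left)
    right = np.asarray(right)
    if left.shape != right.shape:
        return False
    # reshape(-1) gives a flat view for contiguous input (it may copy otherwise).
    flat_left = left.reshape(-1)
    flat_right = right.reshape(-1)
    for start in range(0, flat_left.size, chunk_size):
        stop = start + chunk_size
        if not np.array_equal(flat_left[start:stop], flat_right[start:stop]):
            return False
    return True


a = np.zeros(1_000_000)
b = a.copy()
same = array_equal_chunked(a, b)       # scans every block
b[100] = 1
differs = array_equal_chunked(a, b)    # stops after the first block
```

The chunk size is a trade-off: small blocks exit sooner on an early mismatch but add Python-level loop overhead when no mismatch exists, which is the case the message reports as slower.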

Re: [Numpy-discussion] savetxt -> gzip: nondeterministic because of time stamp

2021-04-14 Thread Derek Homeier
On 15 Apr 2021, at 12:39 am, Robert Kern wrote: > > On Wed, Apr 14, 2021 at 6:16 PM Andrew Nelson wrote: > On Thu, 15 Apr 2021 at 07:15, Robert Kern wrote: > On Wed, Apr 14, 2021 at 4:37 PM Joachim Wuttke wrote: > Regarding numpy, I'd propose a bolder measure: > To let savetxt(fname, X, ...)

Re: [Numpy-discussion] savetxt -> gzip: nondeterministic because of time stamp

2021-04-14 Thread Robert Kern
On Wed, Apr 14, 2021 at 6:16 PM Andrew Nelson wrote: > On Thu, 15 Apr 2021 at 07:15, Robert Kern wrote: > >> On Wed, Apr 14, 2021 at 4:37 PM Joachim Wuttke >> wrote: >> >>> Regarding numpy, I'd propose a bolder measure: >>> To let savetxt(fname, X, ...) store exactly the same information in

Re: [Numpy-discussion] savetxt -> gzip: nondeterministic because of time stamp

2021-04-14 Thread Robert Kern
On Wed, Apr 14, 2021 at 6:20 PM Derek Homeier <de...@astro.physik.uni-goettingen.de> wrote: > On 14 Apr 2021, at 11:15 pm, Robert Kern wrote: > > > > On Wed, Apr 14, 2021 at 4:37 PM Joachim Wuttke > wrote: > > Regarding numpy, I'd propose a bolder measure: > > To let savetxt(fname, X, ...)

Re: [Numpy-discussion] savetxt -> gzip: nondeterministic because of time stamp

2021-04-14 Thread Derek Homeier
On 14 Apr 2021, at 11:15 pm, Robert Kern wrote: > > On Wed, Apr 14, 2021 at 4:37 PM Joachim Wuttke wrote: > Regarding numpy, I'd propose a bolder measure: > To let savetxt(fname, X, ...) store exactly the same information in > compressed and uncompressed files, always invoke gzip with mtime =

Re: [Numpy-discussion] savetxt -> gzip: nondeterministic because of time stamp

2021-04-14 Thread Andrew Nelson
On Thu, 15 Apr 2021 at 07:15, Robert Kern wrote: > On Wed, Apr 14, 2021 at 4:37 PM Joachim Wuttke > wrote: > >> Regarding numpy, I'd propose a bolder measure: >> To let savetxt(fname, X, ...) store exactly the same information in >> compressed and uncompressed files, always invoke gzip with

Re: [Numpy-discussion] savetxt -> gzip: nondeterministic because of time stamp

2021-04-14 Thread Robert Kern
On Wed, Apr 14, 2021 at 4:37 PM Joachim Wuttke wrote: > Regarding numpy, I'd propose a bolder measure: > To let savetxt(fname, X, ...) store exactly the same information in > compressed and uncompressed files, always invoke gzip with mtime = 0. > I agree. > I would like to follow up with a

Re: [Numpy-discussion] savetxt -> gzip: nondeterministic because of time stamp

2021-04-14 Thread Andras Deak
On Wed, Apr 14, 2021 at 10:36 PM Joachim Wuttke wrote: > > If argument fname of savetxt(fname, X, ...) ends with ".gz" then > array X is not only converted to text, but also compressed using gzip. > > The format gzip [1] has a timestamp. The Python module gzip.py [2] > sets the timestamp

[Numpy-discussion] savetxt -> gzip: nondeterministic because of time stamp

2021-04-14 Thread Joachim Wuttke
If argument fname of savetxt(fname, X, ...) ends with ".gz" then array X is not only converted to text, but also compressed using gzip. The format gzip [1] has a timestamp. The Python module gzip.py [2] sets the timestamp according to an optional constructor argument "mtime". By default, the
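
The effect of the "mtime" constructor argument described above can be sketched with the standard-library gzip module alone. This is an illustration under stated assumptions, not numpy's savetxt code path: the helper name `gzip_bytes` and the sample payload are made up for the example. Fixing mtime makes the compressed bytes reproducible, while different mtime values change the output even though the payload is identical.

```python
import gzip
import io


def gzip_bytes(payload, mtime=None):
    """Hypothetical helper: gzip-compress payload in memory.

    mtime is passed straight to gzip.GzipFile; with mtime=None the
    module falls back to the current time.
    """
    buffer = io.BytesIO()
    with gzip.GzipFile(fileobj=buffer, mode="wb", mtime=mtime) as gz:
        gz.write(payload)
    return buffer.getvalue()


text = b"1.0 2.0\n3.0 4.0\n"

# Same payload, same fixed timestamp: byte-identical output.
first = gzip_bytes(text, mtime=0)
second = gzip_bytes(text, mtime=0)

# Same payload, different timestamps: the compressed bytes differ.
stamped_one = gzip_bytes(text, mtime=1)
stamped_two = gzip_bytes(text, mtime=2)
```

The timestamp lives in bytes 4 to 7 of the gzip header, so it changes the file without affecting the decompressed data.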

[Numpy-discussion] Improving performance of the `numpy.any` function.

2021-04-14 Thread zoj613
Hi All, I was using numpy's `any` function earlier and realized that it might not be as performant as I assumed. See the code below:
```
In [1]: import numpy as np
In [2]: a = np.zeros(1_000_000)
In [3]: a[100] = 1
In [4]: b = np.zeros(2_000_000)
In [5]: b[100] = 1
In [6]: %timeit np.any(a)
```