[Numpy-discussion] Numpy array slicing
Hello, (also sent to Scipy-User, sorry for duplicates). This is (I think) a rather basic question about numpy slicing. I have the following code:

    In [29]: a.shape
    Out[29]: (3, 4, 12288, 2)

    In [30]: mask.shape
    Out[30]: (3, 12288)

    In [31]: mask.dtype
    Out[31]: dtype('bool')

    In [32]: sum(mask[0])
    Out[32]: 12285

    In [33]: a[[0] + [slice(None)] + [mask[0]] + [slice(None)]].shape
    Out[33]: (12285, 4, 2)

My question is: why is the final shape (12285, 4, 2) instead of (4, 12285, 2)?

Eirik Gjerløw
Re: [Numpy-discussion] Numpy array slicing
Ok, that was an enlightening discussion; I guess I signed up for this list a couple of days too late! Thanks,

Eirik

On 09. feb. 2012 12:55, Olivier Delalleau wrote:

> This was actually discussed very recently (for more details:
> http://mail.scipy.org/pipermail/numpy-discussion/2012-February/060232.html).
>
> It's caused by mixing slicing with advanced indexing. The resulting shape is the concatenation of a first part, obtained by broadcasting the non-slice items (in your case, 0 and mask[0], broadcast to shape (12285,), the number of nonzero elements in mask[0]), followed by a second part obtained from the slice items (in your case, dimensions #1 and #3 of a, i.e. shape (4, 2)). So the final shape is (12285, 4, 2).
>
> -=- Olivier
>
> On 9 February 2012 06:32, Eirik Gjerløw eirik.gjer...@astro.uio.no wrote:
>
> [snip -- original question quoted above]
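To make Olivier's rule concrete, here is a minimal sketch that rebuilds arrays with the shapes from the original post (the contents and the exact mask are made up for illustration, and the index is written directly rather than built as a list):

    import numpy as np

    # Arrays with the shapes from the post; contents are arbitrary.
    a = np.zeros((3, 4, 12288, 2))
    mask = np.ones((3, 12288), dtype=bool)
    mask[0, :3] = False                    # mask[0] now has 12285 True entries

    # Advanced indices (0 and mask[0]) mixed with slices: the broadcast
    # shape of the advanced indices comes first, then the sliced dimensions.
    print(a[0, :, mask[0], :].shape)       # (12285, 4, 2)

    # To put the sliced axis first, move it back afterwards:
    print(np.rollaxis(a[0, :, mask[0], :], 1, 0).shape)   # (4, 12285, 2)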
[Numpy-discussion] Cython question
Hi All,

Does anyone know how to make Cython emit a C macro? I would like to be able to #define NO_DEPRECATED_API and can do so by including a header file or futzing with the generator script, but I was wondering if there was an easy way to do it in Cython.

Chuck
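This thread predates it, but later Cython versions (0.28 and up) grew a verbatim C code feature that covers exactly this case; a minimal sketch, using the macro name from Chuck's message:

    # A sketch relying on Cython's verbatim C code feature (Cython 0.28+):
    # a triple-quoted string directly under `cdef extern from *` is copied
    # into the generated .c file unchanged, so it can carry a #define.
    cdef extern from *:
        """
        #define NO_DEPRECATED_API
        """
        pass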
Re: [Numpy-discussion] just the date part of a datetime64[s]?
Thanks Mark!

John

On Wed, Feb 8, 2012 at 6:48 PM, Mark Wiebe mwwi...@gmail.com wrote:

> Converting between date and datetime requires caution, because it depends on your time zone. Because all datetime64's are internally stored in UTC, simply casting as in your example treats it in UTC. The 'astype' function does not raise an error to tell you that this is problematic, because NumPy's default casting for that function has no error policy (yet). Here's the trouble you can get into:
>
> x = datetime64('2012-02-02 22:00:00', 's')
> x.astype('M8[D]')
> Out[19]: numpy.datetime64('2012-02-03')
>
> The trouble happens the other way too, because a date is represented as midnight UTC. This would also raise an exception, but for the fact that astype does no checking:
>
> x = datetime64('2012-02-02')
> x.astype('M8[m]')
> Out[23]: numpy.datetime64('2012-02-01T16:00-0800')
>
> The intention is to have functions which handle this casting explicitly, called datetime_as_date and date_as_datetime. They would take a timezone parameter, so the code explicitly specifies how the conversion takes place. A crude replacement for now is:
>
> x = datetime64('2012-02-02 22:00:00', 's')
> np.datetime64(np.datetime_as_string(x, timezone='local')[:10])
> Out[21]: numpy.datetime64('2012-02-02')
>
> This is hackish, but it should do what you want.
>
> -Mark
>
> On Wed, Feb 8, 2012 at 9:10 AM, John Salvatier jsalv...@u.washington.edu wrote:
>
> Hello, is there a good way to get just the date part of a datetime64? Frequently datetime datatypes have month(), date(), hour(), etc. functions that pull out part of the datetime, but I didn't see those mentioned in the datetime64 docs. Casting to a 'D' dtype didn't work as I would have hoped:
>
> In [30]: x = datetime64('2012-02-02 09:00:00', 's')
> In [31]: x
> Out[31]: numpy.datetime64('2012-02-02T09:00:00-0800')
> In [32]: x.astype('datetime64[D]').astype('datetime64[s]')
> Out[32]: numpy.datetime64('2012-02-01T16:00:00-0800')
>
> What's the simplest way to do this?
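A small helper wrapping Mark's workaround; this is a sketch against the NumPy 1.7-era API described above (the function name is made up, and later NumPy releases changed the timezone handling, so it is period-specific):

    import numpy as np

    # Render the datetime64 in the local time zone, keep the 'YYYY-MM-DD'
    # prefix of the ISO string, and re-parse it as a date.
    def local_date(dt):
        return np.datetime64(np.datetime_as_string(dt, timezone='local')[:10])

    x = np.datetime64('2012-02-02 22:00:00', 's')
    print(local_date(x))   # numpy.datetime64('2012-02-02') in US Pacific time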
Re: [Numpy-discussion] Moving to gcc 4.* for win32 installers ?
On 07.02.2012 18:38, Sturla Molden wrote:

> One potential problem I just discovered is dependency on a DLL called libpthreadGC2.dll.

This is not correct!!! :-D

Two threading APIs can be used for OpenBLAS/GotoBLAS2, Win32 threads or OpenMP:

    driver/others/blas_server_omp.c
    driver/others/blas_server_win32.c

Simply build without telling OpenBLAS/GotoBLAS2 to use OpenMP (i.e. make without USE_OPENMP=1), and no dependency on libpthreadGC2.dll is ever made. OpenBLAS/GotoBLAS2 is thus a plain BSD-licensed BLAS.

I tried to compile OpenBLAS on my office computer. It did not know about the Sandy Bridge architecture, so I had to tell it to use NEHALEM instead:

    $ make TARGET=NEHALEM

This worked just fine :)

Setup:

- TDM-GCC 4.6.1 for x64 (install before MSYS) with gfortran.
- MSYS (mingw-get-inst-2018.exe). During the MSYS install, deselect "C compiler" and select "MinGW Developer Toolkit" to get Perl. NB! OpenBLAS/GotoBLAS2 will not build without Perl in MSYS; you will get an error that says "couldn't commit memory for cygwin heap".

Never mind that OpenBLAS/GotoBLAS2 says you need Cygwin and Visual Studio; those are not needed.

The DLL that is produced (OpenBLAS.dll) is linked against msvcrt.dll, not msvcr90.dll. Thus, don't use it with Python 2.7, or at least don't share any CRT resources with it. The static library (libopenblas_nehalemp-r0.1alpha2.4.lib) is not linked with msvcrt.dll as far as I can tell, nor with any other library such as libgfortran. (This is the one we need for NumPy anyway, I think; David C. hates DLLs.) We will probably have to build one for all the different AMD and Intel architectures.

If it is of interest for building NumPy, it seems the OpenBLAS DLL is linked with this sequence:

    -lgfortran -lmingw32 -lmoldname -lmingwex -lmsvcrt -lquadmath -lm

I tried to build plain GotoBLAS2 as well, using

    $ make BINARY=64

which resulted in this error:

http://lists.freebsd.org/pipermail/freebsd-ports-bugs/2011-October/220422.html

That is because GotoBLAS2 thinks Sandy Bridge is Prescott, and then does something stupid...

Thus: building OpenBLAS with MinGW works just fine (TDM-GCC with gfortran and MSYS DTK) and requires no configuration. Just type make and specify the CPU architecture; see the text file TargetList.txt.

Sturla
Re: [Numpy-discussion] numpy.arange() error?
On Thu, Feb 9, 2012 at 12:20 PM, Drew Frank drewfr...@gmail.com wrote:

> Eric Firing efiring at hawaii.edu writes:
>
> On 02/08/2012 09:31 PM, teomat wrote:
>
> Hi, Am I wrong, or is the numpy.arange() function not 100% correct? Try this:
>
> In [7]: len(np.arange(3.1, 4.9, 0.1))
> Out[7]: 18
> In [8]: len(np.arange(8.1, 9.9, 0.1))
> Out[8]: 19
>
> I would expect the same result for each command.
>
> Not after more experience with the wonders of floating point! Nice-looking decimal numbers often have long, drawn-out, inexact floating point (base 2) representations. That leads to exactly this sort of problem. numpy.linspace is provided to help get around some of these surprises; or you can use an integer sequence and then scale and shift it.
>
> Eric
>
> All the best
>
> I also found this surprising -- not because I lack experience with floating point, but because I do have experience with MATLAB. In MATLAB, the corresponding operation 3.1:0.1:4.9 has length 19 because of an explicit tolerance parameter used in the implementation (http://www.mathworks.com/support/solutions/en/data/1-4FLI96/index.html?solution=1-4FLI96). Of course, NumPy is not MATLAB :). That said, I prefer the MATLAB behavior in this case -- even though it has a bit of a magic feel to it, I find it hard to imagine code that operates correctly given the Python semantics and incorrectly under MATLAB's. Thoughts?

Matlab didn't have integers, so they did the best they could ;)

Chuck
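Eric's two suggested workarounds, sketched; the printed lengths assume ordinary IEEE-754 doubles, and the arange results are exactly the surprise under discussion:

    import numpy as np

    # 0.1 has no exact binary representation, so arange's endpoint test can
    # land on either side of 4.9 / 9.9 depending on accumulated error.
    print(len(np.arange(3.1, 4.9, 0.1)))   # 18 here
    print(len(np.arange(8.1, 9.9, 0.1)))   # 19 here

    # Workaround 1: linspace with an explicit count (endpoint included).
    print(np.linspace(3.1, 4.8, 18))       # always exactly 18 points

    # Workaround 2: an integer sequence, scaled and shifted.
    print(np.arange(31, 49) * 0.1)         # 18 values: 3.1, 3.2, ..., 4.8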
Re: [Numpy-discussion] numpy.arange() error?
On 02/09/2012 09:20 AM, Drew Frank wrote:

> Eric Firing efiring at hawaii.edu writes:
>
> [snip]
>
> I also found this surprising -- not because I lack experience with floating point, but because I do have experience with MATLAB. In MATLAB, the corresponding operation 3.1:0.1:4.9 has length 19 because of an explicit tolerance parameter used in the implementation (http://www.mathworks.com/support/solutions/en/data/1-4FLI96/index.html?solution=1-4FLI96). Of course, NumPy is not MATLAB :). That said, I prefer the MATLAB behavior in this case -- even though it has a bit of a magic feel to it, I find it hard to imagine code that operates correctly given the Python semantics and incorrectly under MATLAB's. Thoughts?

You raise a good point. Neither arange nor linspace provides a close equivalent to the nice behavior of the Matlab colon, even though that is often what one really wants. Adding this, either via an arange kwarg, a linspace kwarg, or a new function, seems like a good idea.

Eric
Re: [Numpy-discussion] numpy.arange() error?
Hi,

On Thu, Feb 9, 2012 at 9:47 PM, Eric Firing efir...@hawaii.edu wrote:

> [snip]
>
> You raise a good point. Neither arange nor linspace provides a close equivalent to the nice behavior of the Matlab colon, even though that is often what one really wants. Adding this, either via an arange kwarg, a linspace kwarg, or a new function, seems like a good idea.
>
> Eric

Maybe this issue was raised earlier as well, but wouldn't it be more consistent to let arange operate only with integers (like Python's range) and let linspace handle the floats as well?

My 2 cents,
eat
Re: [Numpy-discussion] numpy.arange() error?
On Thursday, February 9, 2012, Sturla Molden stu...@molden.no wrote:

> On 9 Feb 2012, at 22:44, eat e.antero.ta...@gmail.com wrote:
>
> Maybe this issue was raised earlier as well, but wouldn't it be more consistent to let arange operate only with integers (like Python's range) and let linspace handle the floats as well?
>
> Perhaps. Another possibility would be to let arange take decimal arguments, possibly entered as text strings.
>
> Sturla

Personally, I treat arange() to mean "give me a sequence of values from x to y, exclusive, with a specific step size". Nowhere in that statement does it guarantee a particular number of elements. Whereas linspace() means "give me a sequence of evenly spaced numbers from x to y, optionally inclusive, such that there are exactly N elements". They complement each other well.

There are times when I intentionally specify a range where the step size will not fit nicely, e.g. np.arange(1, 7, 3.5). I wouldn't want this to change.

My vote is that if users want matlab-colon-like behavior, we could make a new function -- maybe erange() for "exact range"?

Ben Root
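Ben's distinction, shown in a few lines (a sketch; the printed values assume a standard IEEE-754 build):

    import numpy as np

    # arange fixes the step; the element count falls out of the endpoint test.
    print(np.arange(1, 7, 3.5))    # [ 1.   4.5] -- the step need not divide the range
    # linspace fixes the count; the step falls out, endpoint included by default.
    print(np.linspace(1, 7, 3))    # [ 1.  4.  7.]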
Re: [Numpy-discussion] numpy.arange() error?
On Thu, Feb 9, 2012 at 3:40 PM, Benjamin Root ben.r...@ou.edu wrote:

> [snip]
>
> Personally, I treat arange() to mean "give me a sequence of values from x to y, exclusive, with a specific step size". Nowhere in that statement does it guarantee a particular number of elements. Whereas linspace() means "give me a sequence of evenly spaced numbers from x to y, optionally inclusive, such that there are exactly N elements". They complement each other well.

I agree -- both functions are useful, and I think about them the same way. The unfortunate part is that tiny precision errors in y can make arange appear to be sometimes-exclusive rather than always exclusive. I've always imagined there to be a sort of duality between the two functions, where

    arange(low, high, step) == linspace(low, high - step, round((high - low) / step))

in cases where (high - low)/step is integral, but it turns out this is not the case.

> There are times when I intentionally specify a range where the step size will not fit nicely, e.g. np.arange(1, 7, 3.5). I wouldn't want this to change.

Nor would I. What I meant to express earlier is that I like how Matlab addresses this particular class of floating point precision errors, not that I think arange output should somehow include both endpoints.

> My vote is that if users want matlab-colon-like behavior, we could make a new function -- maybe erange() for "exact range"?
>
> Ben Root

That could work; it would completely replace arange for me in every circumstance I can think of, but I understand we can't just go changing the behavior of core functions.

Drew
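Drew's proposed duality is easy to test; a sketch (the helper name is made up, and the printed results assume IEEE-754 doubles):

    import numpy as np

    # The duality holds for the first pair of endpoints but not the second,
    # which is exactly Drew's point: rounding in arange's endpoint test
    # makes its length unpredictable.
    def duality_holds(low, high, step):
        n = int(round((high - low) / step))
        a = np.arange(low, high, step)
        b = np.linspace(low, high - step, n)
        return len(a) == len(b) and np.allclose(a, b)

    print(duality_holds(3.1, 4.9, 0.1))   # True here
    print(duality_holds(8.1, 9.9, 0.1))   # False here: arange yields 19 values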
[Numpy-discussion] cumsum much slower than simple loop?
Why is numpy.cumsum (along axis=0) so much slower than a simple loop? The same goes for numpy.add.accumulate.

    # cumsumtest.py
    import numpy as np

    def loopcumsum(a):
        csum = np.empty_like(a)
        s = 0.0
        for i in range(len(a)):
            csum[i] = s = s + a[i]
        return csum

    npcumsum = lambda a: np.cumsum(a, axis=0)
    addaccum = lambda a: np.add.accumulate(a)

    shape = (100, 8, 512)
    a = np.arange(np.prod(shape), dtype='f').reshape(shape)

    # check that we get the same results
    print (npcumsum(a)==loopcumsum(a)).all()
    print (addaccum(a)==loopcumsum(a)).all()

ipython session:

    In [1]: from cumsumtest import *
    True
    True
    In [2]: timeit npcumsum(a)
    100 loops, best of 3: 14.7 ms per loop
    In [3]: timeit addaccum(a)
    100 loops, best of 3: 15.4 ms per loop
    In [4]: timeit loopcumsum(a)
    100 loops, best of 3: 2.16 ms per loop

Dave Cook
Re: [Numpy-discussion] cumsum much slower than simple loop?
On Thu, Feb 9, 2012 at 11:39 PM, Dave Cook dav...@gmail.com wrote:

> Why is numpy.cumsum (along axis=0) so much slower than a simple loop? The same goes for numpy.add.accumulate.
>
> [snip -- code quoted above]

Strange (if I didn't make a mistake):

    In [12]: timeit a.cumsum(0)
    100 loops, best of 3: 7.17 ms per loop
    In [13]: timeit a.T.cumsum(-1).T
    1000 loops, best of 3: 1.78 ms per loop
    In [14]: (a.T.cumsum(-1).T == a.cumsum(0)).all()
    Out[14]: True

Josef
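Josef's workaround packaged as a function; a sketch (the helper name is made up, and as the follow-ups below show, whether the transpose trick actually wins depends on the NumPy version and build):

    import numpy as np

    # Map a cumulative sum along axis 0 onto a last-axis cumsum by
    # transposing before and after; the result is identical.
    def cumsum0_via_transpose(a):
        return a.T.cumsum(-1).T

    shape = (100, 8, 512)
    a = np.arange(np.prod(shape), dtype='f').reshape(shape)
    assert (cumsum0_via_transpose(a) == a.cumsum(0)).all()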
Re: [Numpy-discussion] cumsum much slower than simple loop?
On Thu, Feb 9, 2012 at 9:21 PM, josef.p...@gmail.com wrote:

> Strange (if I didn't make a mistake):
>
> In [12]: timeit a.cumsum(0)
> 100 loops, best of 3: 7.17 ms per loop
> In [13]: timeit a.T.cumsum(-1).T
> 1000 loops, best of 3: 1.78 ms per loop
> In [14]: (a.T.cumsum(-1).T == a.cumsum(0)).all()
> Out[14]: True

Interesting. I should have mentioned that I'm using numpy 1.5.1 on 64-bit Ubuntu 10.10. This transpose/compute/transpose trick did not work for me:

    In [27]: timeit a.T.cumsum(-1).T
    10 loops, best of 3: 18.3 ms per loop

Dave Cook
Re: [Numpy-discussion] cumsum much slower than simple loop?
numpy 1.6.1, OSX, Core 2 Duo:

    In [7]: timeit a.cumsum(0)
    100 loops, best of 3: 6.67 ms per loop
    In [8]: timeit a.T.cumsum(-1).T
    100 loops, best of 3: 6.75 ms per loop

-E

On Thu, Feb 9, 2012 at 9:51 PM, Dave Cook dav...@gmail.com wrote:

> On Thu, Feb 9, 2012 at 9:41 PM, Dave Cook dav...@gmail.com wrote:
>
> Interesting. I should have mentioned that I'm using numpy 1.5.1 on 64-bit Ubuntu 10.10. This transpose/compute/transpose trick did not work for me.
>
> Nor does it work under numpy 1.6.1 built with MKL under Windows 7 on a Core i7.
>
> Dave Cook