Re: [Numpy-discussion] how to work with numpy.int8 in c
Francesc Alted wrote:
> A Wednesday 03 March 2010 02:58:31 David Cournapeau escrigué:
>>     PyObject *ret;
>>     PyArray_Descr *typecode;
>>
>>     typecode = PyArray_DescrFromType(PyArray_UINT8);
>>     ret = PyArray_Scalar(NULL, typecode, NULL);
>>     Py_DECREF(typecode);
>>
>> Sorry, this is wrong, this does not work on my machine, but I am not
>> sure I understand why.
>
> Well, at least the next works in Cython:
>
>     cdef npy_int8 next
>     cdef dtype int8
>     cdef object current
>     [...]
>     int8 = PyArray_DescrFromType(NPY_INT8)
>     current = PyArray_Scalar(next, int8, None)

Yes, what does not work is the Py_DECREF on typecode. Maybe I
misunderstand the comment on PyArray_Scalar (typecode is not used but
cannot be NULL).

cheers,

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] multiprocessing shared arrays and numpy
Hi people,

I was wondering about the status of using the standard library
multiprocessing module with numpy. I found a cookbook example last
updated one year ago which states that:

"This page was obsolete as multiprocessing's internals have changed.
More information will come shortly; a link to this page will then be
added back to the Cookbook."

http://www.scipy.org/Cookbook/multiprocessing

I also found the code that used to be on this page in the cookbook but
it does not work any more. So my question is: Is it possible to use
numpy arrays as shared arrays in an application using multiprocessing,
and how do you do it?

Best regards,
Jesper
Re: [Numpy-discussion] Building Numpy Windows Superpack
Greetings,

I sent this last night but the buildout.txt file was too large, so my
email was held pending moderation. When I got up this morning I realized
that there was a good chance the moderation queue is overwhelmed and not
looked at much, so I decided to post links to the output files and
resubmit. I apologize if any of you get this twice...

Okay, I'm about out of ideas. Hopefully someone on here has an idea as
to what might be going on.

1. I am still unable to build the windows superpack using pavement.py,
even after making sure I have all the necessary dependencies. However,
the good news is that pavement.py now recognizes where the Atlas
binaries David provided are located. The output from "paver
bdist_wininst" is located here:
http://patricktmarsh.com/numpy/20100302.paveout.txt

2. Since I couldn't get pavement.py to work, I decided to try to build
Numpy with sse3 support using "python25 setup.py build -c mingw32
bdist_wininst > buildout.txt". (Note, I've also done this for building
with Python 2.6.) This works and I'm able to build the windows installer
for both Python 2.5 and 2.6. (The output of the build from Python 2.5 is
here: http://patricktmarsh.com/numpy/20100302.buildout.txt.) However,
when I try to run the test suite (using "python25 -c 'import numpy;
print numpy.__version__; numpy.test();' > testout.txt"), Python 2.5
runs, with failures and errors, whereas Python 2.6 freezes and
eventually (Python itself) quits. Since I'm unable to generate the test
suite log for Numpy on Python 2.6, I'm working on the (possibly
incorrect) assumption that the freezing when using Python 2.6
corresponds with the failures/errors when using Python 2.5. The Python
2.5 numpy test suite log is located here:
http://patricktmarsh.com/numpy/20100302.testout.txt

Most of the errors with my locally built numpy version come from the
test suite being unable to call matrix. One, of many, examples is
attached below. I don't know where else to look anymore.
I get the same results when building from trunk and building from the
1.4.x tag (which I downloaded this morning). I've gotten these same
results when building on a different laptop (which was a 32bit version
of Windows 7 Professional). I should disclose that I'm currently running
a 64bit version of Windows 7 Professional. I'm running EPDv5.1.1 32bit
for Python 2.5.4 (shipped numpy test suite works fine). I'm running
EPDv6.1.1 32bit for Python 2.6.4 (shipped numpy test suite works fine).
I've gotten the same errors when using a generic 32bit Python install
downloaded from the official Python website. I'm hoping that I've looked
at this for so long that I'm missing something obvious. Any
thoughts/suggestions at this point would be appreciated.

======================================================================
ERROR: Test whether matrix.sum(axis=1) preserves orientation.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python25\lib\site-packages\numpy\core\tests\test_defmatrix.py",
    line 56, in test_sum
    M = matrix([[1,2,0,0],
NameError: global name 'matrix' is not defined
======================================================================

Patrick

--
Patrick Marsh
Ph.D. Student / NSSL Liaison to the HWT
School of Meteorology / University of Oklahoma
Cooperative Institute for Mesoscale Meteorological Studies
National Severe Storms Laboratory
http://www.patricktmarsh.com
Re: [Numpy-discussion] Building Numpy Windows Superpack
On Wed, Mar 3, 2010 at 11:34 PM, Patrick Marsh patrickmars...@gmail.com wrote:

> Okay, I'm about out of ideas. Hopefully someone on here has an idea as
> to what might be going on.
> 1. I am still unable to build the windows superpack using pavement.py,
> even after making sure I have all the necessary dependencies. However,
> the good news is that pavement.py now recognizes where the Atlas
> binaries David provided are located. The output from paver
> bdist_wininst is located here:
> http://patricktmarsh.com/numpy/20100302.paveout.txt

That's a bug in the pavement script - on windows 7, some env variables
are necessary to run python correctly, which were not necessary on
earlier versions of windows. I will fix this.

> 2. Since I couldn't get pavement.py to work, I decided to try to build
> Numpy with sse3 support using "python25 setup.py build -c mingw32
> bdist_wininst > buildout.txt". (Note, I've also done this for building
> with Python 2.6.)

You should make sure that you are testing the numpy you think you are
testing, and always, always remove the previously installed version. The
matrix error is most likely due to some stale files from a previous
install.

David
Re: [Numpy-discussion] Building Numpy Windows Superpack
On Wed, Mar 3, 2010 at 10:34 PM, Patrick Marsh patrickmars...@gmail.com wrote:

> [...] However, when I try to run the test suite (using "python25 -c
> 'import numpy; print numpy.__version__; numpy.test();' > testout.txt"),
> Python 2.5 runs, with failures and errors, whereas Python 2.6 freezes
> and eventually (Python itself) quits. [...] The Python 2.5 numpy test
> suite log is located here:
> http://patricktmarsh.com/numpy/20100302.testout.txt

It fails on importing tempfile, that may be unrelated to numpy. What
happens if you run python in a shell and do "import tempfile"?

Cheers,
Ralf
Re: [Numpy-discussion] how to efficiently build an array of x, y, z points
On 03/02/2010 09:47 PM, David Goldsmith wrote:
> On Tue, Mar 2, 2010 at 6:59 PM, Brennan Williams
> brennan.willi...@visualreservoir.com wrote:
>> David Goldsmith wrote:
>>> On Tue, Mar 2, 2010 at 6:29 PM, Brennan Williams
>>> brennan.willi...@visualreservoir.com wrote:
>>>
>>>> I'm reading a file which contains a grid definition. Each cell in
>>>> the grid, apart from having an i,j,k index, also has 8 x,y,z
>>>> coordinates. I'm reading each set of coordinates into a numpy
>>>> array. I then want to add/append those coordinates to what will be
>>>> my large points array. Due to the orientation/order of the 8
>>>> corners of each hexahedral cell I may have to reorder them before
>>>> adding them to my large points array (not sure about that yet).
>>>> Should I create a numpy array with nothing in it and then .append
>>>> to it? But this is probably expensive, isn't it, as it creates a
>>>> new copy of the array each time? Or should I create a zero or empty
>>>> array of sufficient size and then put each set of 8 coordinates
>>>> into the correct position in that big array? I don't know exactly
>>>> how big the array will be (some cells are inactive and therefore
>>>> don't have a geometry defined) but I do know what its maximum size
>>>> is (ni*nj*nk, 3).
>>>
>>> Someone will correct me if I'm wrong, but this problem - the best
>>> way to build a large array whose size is not known beforehand - came
>>> up in one of the tutorials at SciPyCon '09, and IIRC the answer was,
>>> perhaps surprisingly, build the thing as a Python list (which is
>>> optimized for this kind of indeterminate sequence building) and
>>> convert to a numpy array when you're done. Isn't that what was
>>> recommended, folks?
>>
>> Build a list of floating point values, then convert to an array and
>> shape accordingly? Or build a list of small arrays and then somehow
>> convert that into a big numpy array?
> My guess is that either way will be better than iteratively appending
> to an existing array.

Hi,
Christopher Barker provided some code late last year on appending
ndarrays, eg:
http://mail.scipy.org/pipermail/numpy-discussion/2009-November/046634.html

A lot depends on your final usage of the array, otherwise there are no
suitable suggestions. That is, do you need just to index the array using
i, j, k indices (this gives you an i by j by k array that contains the
x, y, z coordinates), or do you also need to index the x, y, z
coordinates as well (giving you an i by j by k by x by y by z array)?
If it is just plain storage then perhaps a Python list, dict or sqlite
object may be sufficient.

There are also time and memory constraints, as you can spend large
effort just to get the input into a suitable format and memory usage. If
you use secondary storage like a Python list then you need memory to
store the list, the ndarray and all intermediate components and
overheads. If you use scipy then you should look at using sparse arrays,
where space is only added as you need it.

Bruce
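The list-then-convert approach recommended above can be sketched as
follows (a minimal illustration in modern Python; the per-cell corner
data here is made up, not from Brennan's reader):

```python
import numpy as np

corners = []                  # a plain Python list: cheap to append to
for cell in range(5):         # stand-in for reading cells from the file
    # hypothetical 8 corner points (x, y, z) for one hexahedral cell
    pts = np.arange(24, dtype=float).reshape(8, 3) + cell
    corners.append(pts)

# one allocation/copy at the very end instead of one per append
points = np.concatenate(corners)
print(points.shape)           # (40, 3)
```

The win is that list.append is amortized O(1), while np.append copies
the whole array each call, so the loop above stays linear in the number
of cells.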
Re: [Numpy-discussion] Building Numpy Windows Superpack
On Wed, Mar 3, 2010 at 8:48 AM, Ralf Gommers ralf.gomm...@googlemail.com wrote:

> It fails on importing tempfile, that may be unrelated to numpy. What
> happens if you run python in a shell and do "import tempfile"?

Tempfile imports with no errors with both Python 2.5.4 and Python 2.6.4.

--
Patrick Marsh
Ph.D. Student / NSSL Liaison to the HWT
School of Meteorology / University of Oklahoma
Cooperative Institute for Mesoscale Meteorological Studies
National Severe Storms Laboratory
http://www.patricktmarsh.com
Re: [Numpy-discussion] Building Numpy Windows Superpack
On Wed, Mar 3, 2010 at 8:48 AM, David Cournapeau courn...@gmail.com wrote:

> That's a bug in the pavement script - on windows 7, some env variables
> are necessary to run python correctly, which were not necessary on
> earlier versions of windows. I will fix this.

Thanks!

> You should make sure that you are testing the numpy you think you are
> testing, and always, always remove the previously installed version.
> The matrix error is most likely due to some stale files from a
> previous install.

Okay, I had been removing the build and dist directories but didn't
realize I needed to remove the numpy directory in the site-packages
directory. Deleting this last directory fixed the matrix issues and I'm
now left with the two failures below. The latter failure doesn't seem to
really be an issue to me, and the first one is the same error that Ralf
posted earlier - so for Python 2.5, I've got it working. However, Python
2.6.4 still freezes on the test suite. I'll have to look more into this
today, but for reference, has anyone successfully built Numpy from the
1.4.x branch, on Windows 7, using Python 2.6.4? I'm going to attempt to
get my hands on a Windows XP box today and try to build it there, but I
don't know when/if I'll be able to get the XP box.

Thanks for the help.

======================================================================
FAIL: test_special_values (test_umath_complex.TestClog)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\numpy\core\tests\test_umath_complex.py",
    line 179, in test_special_values
    assert_almost_equal(np.log(x), y)
  File "C:\Python25\Lib\site-packages\numpy\testing\utils.py",
    line 437, in assert_almost_equal
    DESIRED: %s\n" % (str(actual), str(desired)))
AssertionError:
Items are not equal:
 ACTUAL: [ NaN+2.35619449j]
 DESIRED: (1.#INF+2.35619449019j)

======================================================================
FAIL: test_doctests (test_polynomial.TestDocs)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\numpy\lib\tests\test_polynomial.py",
    line 90, in test_doctests
    return rundocs()
  File "C:\Python25\Lib\site-packages\numpy\testing\utils.py",
    line 953, in rundocs
    raise AssertionError("Some doctests failed:\n%s" % "\n".join(msg))
AssertionError:
Some doctests failed:
**********************************************************************
File "C:\Python25\lib\site-packages\numpy\lib\tests\test_polynomial.py",
line 20, in test_polynomial
Failed example:
    print poly1d([100e-90, 1.234567e-9j+3, -1234.999e8])
Expected:
           2
    1e-88 x + (3 + 1.235e-09j) x - 1.235e+11
Got:
            2
    1e-088 x + (3 + 1.235e-009j) x - 1.235e+011

----------------------------------------------------------------------
Ran 2334 tests in 10.175s

FAILED (KNOWNFAIL=7, SKIP=1, failures=2)

Patrick

--
Patrick Marsh
Ph.D. Student / NSSL Liaison to the HWT
School of Meteorology / University of Oklahoma
Cooperative Institute for Mesoscale Meteorological Studies
National Severe Storms Laboratory
http://www.patricktmarsh.com
Re: [Numpy-discussion] multiprocessing shared arrays and numpy
A Wednesday 03 March 2010 15:31:29 Jesper Larsen escrigué:

> Hi people,
> I was wondering about the status of using the standard library
> multiprocessing module with numpy. [...] So my question is: Is it
> possible to use numpy arrays as shared arrays in an application using
> multiprocessing, and how do you do it?

Yes, it is pretty easy if your problem can be vectorised. Just split
your arrays in chunks and assign the computation of each chunk to a
different process. I'm attaching a code that does this for computing a
polynomial on a certain range. Here is the output (for a dual-core
processor):

Serial computation...
1000 0
Time elapsed in serial computation: 3.438
333 0
334 1
333 2
Time elapsed in parallel computation: 2.271 with 3 threads
Speed-up: 1.51x

--
Francesc Alted

import numpy
from multiprocessing import Pool
from time import time
import sys

N = 1000*1000*10
expr = "a + b*x + c*x**2 + d*x**3 + e*x**4"
a, b, c, d, e = -1.1, 2.2, -3.3, 4.4, -5.5
xp = numpy.linspace(-1, 1, N)
parallel = True
NT = 3

global counter
counter = 0

def cb(r):
    global counter
    print r, counter
    counter += 1

def compute(nt, i):
    x = xp[i*N/nt:(i+1)*N/nt]
    eval(expr)
    return len(x)

if __name__ == '__main__':
    print "Serial computation..."
    t0 = time()
    result = compute(1, 0)
    print result, 0
    ts = round(time() - t0, 3)
    print "Time elapsed in serial computation:", ts
    if not parallel:
        sys.exit()
    t0 = time()
    po = Pool(processes=NT)
    for i in xrange(NT):
        po.apply_async(compute, (NT, i), callback=cb)
    po.close()
    po.join()
    tp = round(time() - t0, 3)
    print "Time elapsed in parallel computation:", tp, "with %s threads" % NT
    print "Speed-up: %sx" % round(ts/tp, 2)
Re: [Numpy-discussion] Building Numpy Windows Superpack
Numpy 1.4.0 svn rev 8270 builds (with setup.py) and tests OK on Windows
7 using Python 2.6.4. The only test failure is test_special_values
(test_umath_complex.TestClog).

- Christoph

On 3/3/2010 7:22 AM, Patrick Marsh wrote:
> [...] However, Python 2.6.4 still freezes on the test suite. I'll have
> to look more into this today, but for reference, has anyone
> successfully built Numpy from the 1.4.x branch, on Windows 7, using
> Python 2.6.4? [...]
Re: [Numpy-discussion] how to work with numpy.int8 in c
Thanks all for your help, I think I'm on my way again. The catch in the
first place was not being confident that a PyArray_Scalar was the thing
I needed. I grep'd the code for uint8, int8 and so on and could not find
their definitions. On first reading I overlooked the PyArray_Scalar link
in this section:
http://docs.scipy.org/doc/numpy/reference/c-api.types-and-structures.html#scalararraytypes

But now that it has been pointed out, I think the reference docs are
good enough for me to do what I wanted. So the docs are pretty good. I
guess the only thing that I managed to miss was the initial connection
that a numpy.int8(6) is a PyArray_Scalar instance. Seems obvious enough
in hindsight...

On Wed, Mar 3, 2010 at 4:40 AM, David Cournapeau da...@silveregg.co.jp wrote:

> [...] Yes, what does not work is the Py_DECREF on typecode. Maybe I
> misunderstand the comment on PyArray_Scalar (typecode is not used but
> cannot be NULL).

--
http://www-etud.iro.umontreal.ca/~bergstrj
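The "numpy.int8(6) is a PyArray_Scalar instance" connection can also be
seen from pure Python, without touching the C API (a small sketch, not
from the original mails):

```python
import numpy as np

x = np.int8(6)

# Array scalars are ordinary Python objects of the numpy scalar type
# hierarchy (np.generic -> np.number -> np.signedinteger -> np.int8).
print(type(x))
print(isinstance(x, np.signedinteger))

# Like 0-d arrays, scalars carry a dtype, here int8.
print(x.dtype)
```

At the C level each of these types is a PyArray_Scalar instance, which
is why grepping for a standalone "int8" definition turns up nothing.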
Re: [Numpy-discussion] how to efficiently build an array of x, y, z points
Bruce Southey wrote:

> A lot depends on your final usage of the array, otherwise there are no
> suitable suggestions. That is, do you need just to index the array
> using i, j, k indices (this gives you an i by j by k array that
> contains the x, y, z coordinates), or do you also need to index the
> x, y, z coordinates as well (giving you an i by j by k by x by y by z
> array)? If it is just plain storage then perhaps a Python list, dict
> or sqlite object may be sufficient.

Ultimately I'm trying to build a tvtk unstructured grid to view in a
Traits/tvtk/Mayavi app. The grid is ni*nj*nk cells with 8 xyz's per cell
(hexahedral cell with 6 faces). However some cells are inactive and
therefore don't have geometry. Cells also have connectivity to other
cells, usually to adjacent cells (e.g. cell i,j,k connected to cell
i-1,j,k) but not always. I'll post more comments/questions as I go.

Brennan

> There are also time and memory constraints, as you can spend large
> effort just to get the input into a suitable format and memory usage.
> If you use secondary storage like a Python list then you need memory
> to store the list, the ndarray and all intermediate components and
> overheads. If you use scipy then you should look at using sparse
> arrays, where space is only added as you need it.
>
> Bruce
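Since the maximum size (ni*nj*nk cells, 8 corners of 3 coordinates each)
is known up front, the other option mentioned earlier in the thread -
preallocate to the maximum and mask out inactive cells afterwards - can
be sketched like this (dimensions and the "inactive" rule are made up
for illustration):

```python
import numpy as np

ni, nj, nk = 3, 4, 5
max_cells = ni * nj * nk                # upper bound on active cells

points = np.empty((max_cells, 8, 3))    # 8 corners x (x, y, z) per cell
active = np.zeros(max_cells, dtype=bool)

for idx in range(max_cells):
    if idx % 7 == 0:                    # stand-in for "cell is inactive"
        continue
    active[idx] = True
    points[idx] = np.random.rand(8, 3)  # stand-in for corners read from file

points = points[active]                 # keep only cells with geometry
print(points.shape)                     # here: (51, 8, 3); 60 cells, 9 inactive
```

Keeping the boolean `active` array around is also handy later, for
mapping cell (i, j, k) indices to rows of the compacted points array.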
[Numpy-discussion] setting decimal accuracy in array operations (scikits.timeseries)
Hello,

I am using scikits.timeseries to convert an hourly timeseries to a lower
frequency using the appropriate function [1]. When I compare the result
to the values calculated with a pivot table in Excel there is a
difference in the values which reaches quite high values in the total
sum of all monthly values.

I found out that the difference arises from different decimal settings:
in Python the numbers show: 12. whereas in Excel I see: 12.8

The difference due to the different decimals is small for single values
and accumulates to a 2-digit number for the total of all values.

* Why do these differences arise?
* What can I do to achieve comparable values?

Thanks in advance for any hint,
Marco

[1] http://pytseries.sourceforge.net/generated/scikits.timeseries.convert.html

P.S.: Sorry if this is a numpy question, but as I was using the scikit I
thought this was the right forum.
Re: [Numpy-discussion] setting decimal accuracy in array operations (scikits.timeseries)
On Wed, Mar 3, 2010 at 14:09, Marco Tuckner marcotuck...@public-files.de wrote:

> I found out that the difference arises from different decimal settings:
> in Python the numbers show: 12. whereas in Excel I see: 12.8
> The difference due to the different decimals is small for single values
> and accumulates to a 2-digit number for the total of all values.
> * Why do these differences arise?
> * What can I do to achieve comparable values?

We default to printing only eight decimal digits for floating point
values for convenience. There are more under the covers. Use
numpy.set_printoptions(precision=16) to see all of them. If you are
still seeing actual calculation differences, we will need to see a
complete, self-contained example that demonstrates the difference.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth." -- Umberto Eco
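The print-precision point can be demonstrated in a couple of lines (a
small sketch, modern print syntax):

```python
import numpy as np

a = np.array([1.0 / 3.0])

# By default only ~8 significant digits are *printed* ...
print(a)

# ... but the stored double carries ~16; raise the print precision to
# see them.
np.set_printoptions(precision=16)
print(a)

# The underlying value was never truncated by printing:
print(a[0] == 1.0 / 3.0)
```

So a mismatch in displayed digits between numpy and Excel is cosmetic;
only a difference that survives full-precision printing points to an
actual calculation difference.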
Re: [Numpy-discussion] setting decimal accuracy in array operations (scikits.timeseries)
On Wed, 2010-03-03 at 21:09 +0100, Marco Tuckner wrote:

> I am using scikits.timeseries to convert an hourly timeseries to a
> lower frequency using the appropriate function [1]. When I compare the
> result to the values calculated with a pivot table in Excel there is a
> difference in the values which reaches quite high values in the total
> sum of all monthly values. I found out that the difference arises from
> different decimal settings: in Python the numbers show: 12. whereas in
> Excel I see: 12.8

Typically, the internal precision used in Python and Numpy is
significantly more than what is printed. Most likely, your problem has a
different cause. Are you sure Excel is using a high enough accuracy? If
you want more help, it would be useful to post a self-contained code
that demonstrates the error.

--
Pauli Virtanen
Re: [Numpy-discussion] setting decimal accuracy in array operations (scikits.timeseries)
Robert Kern wrote: On Wed, Mar 3, 2010 at 14:09, Marco Tuckner In Python the numbers show: 12. whereas in Excel I see: 12.8 If you are still seeing actual calculation differences, we will need to see a complete, self-contained example that demonstrates the difference. To add a bit more detail -- unless you are explicitly specifying single-precision floats (dtype=float32), both numpy and Excel are using doubles -- so that's not the source of the differences. Even if you are using single precision in numpy, it's pretty rare for that to make a significant difference. Something else is going on. I suspect a different algorithm; you can tell timeseries.convert how you want it to interpolate -- who knows what Excel is doing. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
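Chris's point about single vs. double precision can be quantified directly; this small sketch (mine, not from the thread) shows the representation error that float32 alone introduces for a value like 0.1:

```python
import numpy as np

# float32 keeps ~7 significant decimal digits, float64 ~16.
x64 = np.float64(0.1)
x32 = np.float32(0.1)

# The nearest float32 to 0.1 differs from the nearest float64 by ~1.5e-9:
err = abs(float(x32) - float(x64))
print(err)

# When many such values are aggregated, the dtype of the accumulator matters:
vals = np.full(10_000, 0.1)  # float64 by default
print(vals.sum(), vals.astype(np.float32).sum())
```

Even so, as Chris says, an error on this scale rarely explains visible discrepancies; a different aggregation algorithm usually does.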
Re: [Numpy-discussion] how to efficiently build an array of x, y, z points
Bruce Southey wrote: Christopher Barker provided some code late last year on appending ndarrays, e.g.: http://mail.scipy.org/pipermail/numpy-discussion/2009-November/046634.html Yup, I'd love someone else to pick that up and test/improve it. Anyway, that code only handles 1-d arrays, though those can be structured arrays. I'd like to extend it to handle n-d arrays, though you could only grow them in the first dimension, which may work for your case. As for performance: my numpy code is a bit slower than using python lists if you add elements one at a time and the elements are a standard python data type. It should use less memory, though, if that matters. If you add the data in big enough chunks, my method gets better performance. Ultimately I'm trying to build a tvtk unstructured grid to view in a Traits/tvtk/Mayavi app. I'd love to see that working once you've got it! The grid is ni*nj*nk cells with 8 xyz's per cell (hexahedral cell with 6 faces). However some cells are inactive and therefore don't have geometry. Cells also have connectivity to other cells, usually to adjacent cells (e.g. cell i,j,k connected to cell i-1,j,k) but not always. I'm confused now -- what does the array need to look like in the end? Maybe ni*nj*nk X 8 X 3? How is inactive indicated? Is the connectivity somehow in the same array, or is that stored separately? -Chris
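The append strategy under discussion can be sketched as an amortized-doubling buffer that grows along the first dimension only; this is my own illustration of the idea, not Chris's actual code:

```python
import numpy as np

class GrowableArray:
    """Append rows to a 2-d array with amortized O(1) cost per row."""

    def __init__(self, ncols, dtype=float):
        self._buf = np.empty((16, ncols), dtype=dtype)
        self._n = 0

    def append(self, rows):
        rows = np.atleast_2d(np.asarray(rows, dtype=self._buf.dtype))
        need = self._n + rows.shape[0]
        while need > self._buf.shape[0]:
            # Double the first dimension; the copy cost is amortized
            # over all the appends that fit before the next doubling.
            grown = np.empty((2 * self._buf.shape[0], self._buf.shape[1]),
                             dtype=self._buf.dtype)
            grown[:self._n] = self._buf[:self._n]
            self._buf = grown
        self._buf[self._n:need] = rows
        self._n = need

    @property
    def array(self):
        # View of the filled portion only; no copy.
        return self._buf[:self._n]
```

Appending whole chunks of points at once (rather than one x,y,z triple at a time) keeps the per-element overhead low, which matches Chris's observation above.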
[Numpy-discussion] dtype for a single char
Hello, today I was caught out by trying to use 'a' as a dtype for a single character. A simple example would be: array([('a',1),('b',2),('c',3)], dtype=[('letter', 'a'), ('number', 'i')]) array([('', 1), ('', 2), ('', 3)], dtype=[('letter', '|S0'), ('number', 'i4')]) The fix seems to be using 'a1' instead: array([('a',1),('b',2),('c',3)], dtype=[('letter', 'a1'), ('number', 'i')]) array([('a', 1), ('b', 2), ('c', 3)], dtype=[('letter', '|S1'), ('number', 'i4')]) This seems odd to me, as other types, e.g. 'i' and 'f', can be used on their own. Is there a reason for this? Thanks, Sam
Re: [Numpy-discussion] dtype for a single char
On Wed, Mar 3, 2010 at 15:14, sam tygier samtyg...@yahoo.co.uk wrote: Hello, today I was caught out by trying to use 'a' as a dtype for a single character. A simple example would be: array([('a',1),('b',2),('c',3)], dtype=[('letter', 'a'), ('number', 'i')]) array([('', 1), ('', 2), ('', 3)], dtype=[('letter', '|S0'), ('number', 'i4')]) The fix seems to be using 'a1' instead: array([('a',1),('b',2),('c',3)], dtype=[('letter', 'a1'), ('number', 'i')]) array([('a', 1), ('b', 2), ('c', 3)], dtype=[('letter', '|S1'), ('number', 'i4')]) This seems odd to me, as other types, e.g. 'i' and 'f', can be used on their own. Is there a reason for this? Other types have a sensible default determined by the platform. -- Robert Kern
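The difference Robert describes is visible in the dtype itemsizes; here is a quick sketch of mine using the equivalent 'S' spelling (the 'a' alias means the same thing in this era of numpy):

```python
import numpy as np

# A length-less string code means "zero bytes", so data is silently
# truncated; 'i' instead falls back to the platform default (usually 4 bytes).
print(np.dtype('S').itemsize)   # 0
print(np.dtype('S1').itemsize)  # 1
print(np.dtype('i').itemsize)

rec = np.array([('a', 1), ('b', 2), ('c', 3)],
               dtype=[('letter', 'S1'), ('number', 'i4')])
print(rec['letter'])
```

With 'S1' the letters survive; with bare 'S' each one would collapse to an empty string, exactly as in Sam's first example.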
Re: [Numpy-discussion] setting decimal accuracy in array operations (scikits.timeseries)
Thanks to all who answered. This is really helpful! If you are still seeing actual calculation differences, we will need to see a complete, self-contained example that demonstrates the difference. To add a bit more detail -- unless you are explicitly specifying single-precision floats (dtype=float32), both numpy and Excel are using doubles -- so that's not the source of the differences. Even if you are using single precision in numpy, it's pretty rare for that to make a significant difference. Something else is going on. I suspect a different algorithm; you can tell timeseries.convert how you want it to interpolate -- who knows what Excel is doing. I checked the values row by row, comparing Excel against the Python results. The values of both programs match perfectly at the data points where no repeating decimal occurs: where the aggregated value results in a terminating value (e.g. 12.04), the results were the same. At points where the result was a repeating decimal (e.g. 12.22 ...), the described difference could be observed. I will try to create a self-contained example tomorrow. Thanks a lot and kind regards, Marco
[Numpy-discussion] Recommendation on reference software [Re: setting decimal accuracy ... ]
I suspect a different algorithm; you can tell timeseries.convert how you want it to interpolate -- who knows what Excel is doing. Does this mean that you are questioning Excel or, more neutrally, spreadsheet programs? What software would you recommend as a reference to test Python packages against? R-Project? Best regards, Marco
Re: [Numpy-discussion] setting decimal accuracy in array operations (scikits.timeseries)
On Wed, Mar 3, 2010 at 16:23, Marco Tuckner marcotuck...@public-files.de wrote: Thanks to all who answered. This is really helpful! If you are still seeing actual calculation differences, we will need to see a complete, self-contained example that demonstrates the difference. To add a bit more detail -- unless you are explicitly specifying single-precision floats (dtype=float32), both numpy and Excel are using doubles -- so that's not the source of the differences. Even if you are using single precision in numpy, it's pretty rare for that to make a significant difference. Something else is going on. I suspect a different algorithm; you can tell timeseries.convert how you want it to interpolate -- who knows what Excel is doing. I checked the values row by row, comparing Excel against the Python results. The values of both programs match perfectly at the data points where no repeating decimal occurs: where the aggregated value results in a terminating value (e.g. 12.04), the results were the same. At points where the result was a repeating decimal (e.g. 12.22 ...), the described difference could be observed. I think you are just seeing the effect of the different printing that I described. These are not differences in the actual values. -- Robert Kern
Re: [Numpy-discussion] Recommendation on reference software [Re: setting decimal accuracy ... ]
On Wed, Mar 3, 2010 at 16:26, Marco Tuckner marcotuck...@public-files.de wrote: I suspect a different algorithm; you can tell timeseries.convert how you want it to interpolate -- who knows what Excel is doing. Does this mean that you are questioning Excel or, more neutrally, spreadsheet programs? We are not questioning its accuracy, not yet at least. You just haven't told us exactly what calculation you are trying to make it do. What software would you recommend as a reference to test Python packages against? R-Project? Sometimes. It depends on exactly what you are trying to test. -- Robert Kern
Re: [Numpy-discussion] Building Numpy Windows Superpack
Patrick Marsh wrote: On Wed, Mar 3, 2010 at 8:48 AM, David Cournapeau courn...@gmail.com mailto:courn...@gmail.com wrote: That's a bug in the pavement script - on Windows 7, some env variables are necessary to run python correctly which were not necessary on earlier versions of Windows. I will fix this. This is fixed in both trunk and 1.4.x now. I have not tested it, though. Okay, I had been removing the build and dist directories but didn't realize I needed to remove the numpy directory in the site-packages directory. Deleting this last directory fixed the matrix issues and I'm now left with the two failures. The latter failure doesn't seem to really be an issue to me, and the first one is the same error that Ralf posted earlier - so for Python 2.5, I've got it working. However, Python 2.6.4 still freezes on the test suite. I'll have to look more into this today, but for reference, has anyone successfully built Numpy from the 1.4.x branch, on Windows 7, using Python 2.6.4? This is almost always a problem with the C runtime. Those are a big PITA to debug/understand/fix. You built this with Mingw, right? The first thing to check is whether you have several C runtimes loaded: you can check this with the program depends.exe: http://www.dependencywalker.com I will try to look at this myself - I have only attempted Visual Studio builds on Windows 7 so far. cheers, David
Re: [Numpy-discussion] dtype for a single char
On 3-Mar-10, at 4:56 PM, Robert Kern wrote: Other types have a sensible default determined by the platform. Yes, and the 'S0' type isn't terribly sensible, if only because of this issue: http://projects.scipy.org/numpy/ticket/1239 David
Re: [Numpy-discussion] multiprocessing shared arrays and numpy
There is work by Sturla Molden: look for multiprocessing-tutorial.pdf and sharedmem-feb13-2009.zip. The tutorial includes what was dropped from the cookbook page. I am looking into the same issue and am going to test it today. Nadav On Wed, 2010-03-03 at 15:31 +0100, Jesper Larsen wrote: Hi people, I was wondering about the status of using the standard library multiprocessing module with numpy. I found a cookbook example, last updated one year ago, which states that: This page was obsolete as multiprocessing's internals have changed. More information will come shortly; a link to this page will then be added back to the Cookbook. http://www.scipy.org/Cookbook/multiprocessing I also found the code that used to be on this page in the cookbook, but it does not work any more. So my question is: is it possible to use numpy arrays as shared arrays in an application using multiprocessing, and how do you do it? Best regards, Jesper