Re: [Numpy-discussion] Questions about masked arrays
On Wed, Oct 7, 2009 at 12:47 AM, Pierre GM pgmdevl...@gmail.com wrote: On Oct 7, 2009, at 1:12 AM, Gökhan Sever wrote: One more from me:

I[1]: a = np.arange(5)
I[2]: mask = 999
I[6]: a[3] = 999
I[7]: am = ma.masked_equal(a, mask)
I[8]: am
O[8]: masked_array(data = [0 1 2 -- 4], mask = [False False False True False], fill_value = 999999)

Where does this fill_value come from? To me it is a little confusing to have both a value and a fill_value among the masked array function arguments.

Because the two are unrelated. The `fill_value` is the value used to fill the masked elements (that is, the missing entries). When you create a masked array, you get a `fill_value` whose actual value is defined by default from the dtype of the array: for int it's 999999, for float it's 1e+20, you get the idea. The value you used for masking is different: it's just whatever value you consider invalid. Now, if I follow you, you would expect the value in `masked_equal(array, value)` to become the `fill_value` of the output. That's an idea; would you mind filing an enhancement ticket and assigning it to me, so that I don't forget?

One more example. (I still think the behaviour of fill_value is inconsistent.) See below:

I[6]: f = np.arange(5, dtype=float)
I[7]: mask = 999.
I[8]: f[3] = mask
I[9]: fm = ma.masked_equal(f, mask)
I[10]: fm
O[10]: masked_array(data = [0.0 1.0 2.0 -- 4.0], mask = [False False False True False], fill_value = 1e+20)
I[22]: fm2 = ma.masked_values(f, mask)
I[23]: fm2
O[23]: masked_array(data = [0.0 1.0 2.0 -- 4.0], mask = [False False False True False], fill_value = 999.0)

ma.masked_equal(x, value, copy=True)
ma.masked_values(x, value, rtol=1.0001e-05, atol=1e-08, copy=True, shrink=True)

Similar function signatures, but different fill_values... OK, it is almost 2 AM here, so my understanding might be crawling on the ground. I will probably re-read your comments and file an issue on the trac. You can probably pin-point the error by testing against a 1.3.0 numpy.
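The distinction Pierre draws (the masking value vs. the dtype-derived default fill_value) is easy to check interactively; a minimal sketch:

```python
import numpy as np
import numpy.ma as ma

a = np.arange(5)
a[3] = 999
am = ma.masked_equal(a, 999)  # masks every element equal to 999
assert bool(am.mask[3])

# The default fill_value depends only on the dtype, not on the value
# used for masking.
print(ma.default_fill_value(np.array(0)))    # int default: 999999
print(ma.default_fill_value(np.array(0.0)))  # float default: 1e+20
```

Note that the enhancement requested in this thread (reusing `value` as the result's fill_value in masked_equal) was adopted in later numpy releases, so the exact fill_value shown by masked_equal depends on your numpy version.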
Not too many users of arc functions on masked arrays around, I guess :) Will try, but if it ain't broken, don't fix it... Also, if it is working, don't update (this applies to Fedora updates :) especially if you have an Nvidia display card). assert(np.arccos(ma.masked), ma.masked) would be the simplest (and in fact, it'd be assert(np.arccos(ma.masked) is ma.masked) in this case). Good to know this. The more time I spend with numpy, the more I understand the importance of testing the code automatically. That said, I still find the test-driven-development approach somewhat bizarre: start only by writing test code, and keep implementing your code until all the tests are satisfied. Very interesting... These software engineers... Bah, it's not a rule cast in iron... You can start writing your code, but do write the tests at the same time. It's the best way to make sure you're not breaking something later on. That's what I have been thinking; a more reasonable way. The other is way too much of a reverse way of thinking. Thanks for the long hours of discussion. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Gökhan
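The assertion Pierre suggests exercises numpy.ma's handling of the masked constant; the domain-masking behaviour of the arc functions themselves can be tested the same way. A small sketch (ma.arccos is numpy.ma's domain-aware wrapper around np.arccos):

```python
import numpy.ma as ma

# arccos is only defined on [-1, 1]; ma.arccos masks domain violations
# instead of producing invalid values.
x = ma.array([2.0, 0.5], mask=[False, False])
r = ma.arccos(x)
assert bool(r.mask[0])      # 2.0 is outside the domain -> masked
assert not bool(r.mask[1])  # arccos(0.5) is valid -> stays unmasked
```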
[Numpy-discussion] numpy.linalg.eig memory issue with libatlas?
[I am resending this as the previous attempt seems to have failed] Hello List, I am looking at memory errors when using numpy.linalg.eig(). Short version: I had memory errors in numpy.linalg.eig(), and I have reasons (valgrind) to believe these are due to writes to incorrect memory addresses in the diagonalization routine zgeev, called by numpy.linalg.eig(). I realized that I had recently installed atlas, and so had several lapack-like libraries; I uninstalled atlas, and the issues seemed to go away. My question is: could it be that some lapack/blas/atlas package I use is incompatible with the numpy I use, and if so, is there a method to diagnose this in a more reliable way? Longer version: The system used is an updated debian testing (squeeze), on amd64. My program uses numpy, matplotlib, and a module compiled using cython. I started getting errors from my program this week. Pdb and print-statements tell me that the errors arise around the point where I call numpy.linalg.eig(), but not every time. The type of error varies. Most frequently a segmentation fault, but sometimes a matrix dimension mismatch, and sometimes a message related to the python GC. Valgrind tells me that something impossible happened, and that this is probably due to invalid writes earlier during the program execution. There seem to be two invalid writes after each program crash, and the log looks like this (it only contains two invalid writes): [...]
==6508== Invalid write of size 8
==6508==    at 0x92D2597: zunmhr_ (in /usr/lib/atlas/liblapack.so.3gf.0)
==6508==    by 0x920A42B: zlaqr3_ (in /usr/lib/atlas/liblapack.so.3gf.0)
==6508==    by 0x9205D11: zlaqr0_ (in /usr/lib/atlas/liblapack.so.3gf.0)
==6508==    by 0x91B0C4D: zhseqr_ (in /usr/lib/atlas/liblapack.so.3gf.0)
==6508==    by 0x911CA15: zgeev_ (in /usr/lib/atlas/liblapack.so.3gf.0)
==6508==    by 0x881B81B: lapack_lite_zgeev (lapack_litemodule.c:590)
==6508==    by 0x4911D4: PyEval_EvalFrameEx (ceval.c:3612)
==6508==    by 0x491CE1: PyEval_EvalFrameEx (ceval.c:3698)
==6508==    by 0x4924CC: PyEval_EvalCodeEx (ceval.c:2875)
==6508==    by 0x490F17: PyEval_EvalFrameEx (ceval.c:3708)
==6508==    by 0x4924CC: PyEval_EvalCodeEx (ceval.c:2875)
==6508==    by 0x4DC991: function_call (funcobject.c:517)
==6508== Address 0x67ab118 is not stack'd, malloc'd or (recently) free'd
==6508==
==6508== Invalid write of size 8
==6508==    at 0x92D25A8: zunmhr_ (in /usr/lib/atlas/liblapack.so.3gf.0)
==6508==    by 0x920A42B: zlaqr3_ (in /usr/lib/atlas/liblapack.so.3gf.0)
==6508==    by 0x9205D11: zlaqr0_ (in /usr/lib/atlas/liblapack.so.3gf.0)
==6508==    by 0x91B0C4D: zhseqr_ (in /usr/lib/atlas/liblapack.so.3gf.0)
==6508==    by 0x911CA15: zgeev_ (in /usr/lib/atlas/liblapack.so.3gf.0)
==6508==    by 0x881B81B: lapack_lite_zgeev (lapack_litemodule.c:590)
==6508==    by 0x4911D4: PyEval_EvalFrameEx (ceval.c:3612)
==6508==    by 0x491CE1: PyEval_EvalFrameEx (ceval.c:3698)
==6508==    by 0x4924CC: PyEval_EvalCodeEx (ceval.c:2875)
==6508==    by 0x490F17: PyEval_EvalFrameEx (ceval.c:3708)
==6508==    by 0x4924CC: PyEval_EvalCodeEx (ceval.c:2875)
==6508==    by 0x4DC991: function_call (funcobject.c:517)
==6508== Address 0x67ab110 is not stack'd, malloc'd or (recently) free'd
[...]
valgrind: m_mallocfree.c:248 (get_bszB_as_is): Assertion 'bszB_lo == bszB_hi' failed.
valgrind: Heap block lo/hi size mismatch: lo = 96, hi = 0.
This is probably caused by your program erroneously writing past the end of a heap block and corrupting heap metadata.
If you fix any invalid writes reported by Memcheck, this assertion failure will probably go away. Please try that before reporting this as a bug. [...] Today I looked in my package installation logs to see what had changed recently, and I noticed that I installed atlas (debian package libatlas3gf-common) recently. I uninstalled that package, and now the same program seems to have no memory errors. The packages I removed from the system today were libarpack2 libfltk1.1 libftgl2 libgraphicsmagick++3 libgraphicsmagick3 libibverbs1 libopenmpi1.3 libqrupdate1 octave3.2-common octave3.2-emacsen libatlas3gf-base octave3.2 My interpretation is that I had several packages available containing the diagonalization functionality, but that they differed subtly in their interfaces. My recent installation of atlas made numpy use (the incompatible) atlas instead of its previous choice, and removal of atlas restored the situation to the state of last week. Now for the questions: Is this a reasonable hypothesis? Is it known? Can it be investigated more precisely by comparing versions somehow? Regards / johan
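One way to probe the hypothesis is to check which BLAS/LAPACK libraries numpy actually binds to at runtime. A sketch for Linux; the lapack_lite module is an internal detail of numpy and its name/path may differ across versions:

```shell
# Show the BLAS/LAPACK configuration numpy was built against
python3 -c "import numpy; numpy.show_config()"

# List the shared libraries numpy's linear-algebra extension resolves
# to at runtime; a system ATLAS showing up here would support the
# incompatible-library hypothesis
ldd "$(python3 -c 'from numpy.linalg import lapack_lite; print(lapack_lite.__file__)')"
```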
Re: [Numpy-discussion] Questions about masked arrays
On Oct 7, 2009, at 2:57 AM, Gökhan Sever wrote: One more example. (I still think the behaviour of fill_value is inconsistent.) Well, ma.masked_values uses `value` to define fill_value; ma.masked_equal does not. So yes, there's an inconsistency here. Once again, please file an enhancement request ticket. I should be able to deal with this one quite soon.
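The inconsistency is small enough to show in a couple of lines; a sketch (using 999.0 as the invalid value):

```python
import numpy as np
import numpy.ma as ma

f = np.arange(5, dtype=float)
f[3] = 999.0

# masked_values reuses `value` as the fill_value of its result.
fm2 = ma.masked_values(f, 999.0)
assert fm2.fill_value == 999.0
assert bool(fm2.mask[3])
# In the 1.3-era numpy discussed here, masked_equal instead kept the
# dtype default (1e+20 for float) -- the inconsistency reported above.
```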
Re: [Numpy-discussion] NumPy SVN broken
On Wed, Oct 7, 2009 at 2:31 AM, Charles R Harris charlesr.har...@gmail.com wrote: Looks like a clue ;) Ok, I fixed it here: http://github.com/cournape/numpy/tree/fix_abi But that's an ugly hack. I think we should consider rewriting how we generate the API: instead of automatically growing the API array of fptr, we should explicitly mark which function name has which index, and hardcode it. It would help quite a bit to avoid changing the ABI involuntarily. cheers, David
[Numpy-discussion] SVN + Python 2.5.4 (32b) + MacOS 10.6.1
All, I need to test the numpy SVN on a 10.6.1 mac, but using Python 2.5.4 (32b) instead of the 2.6.1 (64b). The sources get compiled OK (apparently; find the build here: http://pastebin.com/m147a2909 ) but numpy fails to import:

File .../.virtualenvs/default25/lib/python2.5/site-packages/numpy/__init__.py, line 130, in <module>
    import add_newdocs
File .../.virtualenvs/default25/lib/python2.5/site-packages/numpy/add_newdocs.py, line 9, in <module>
    from lib import add_newdoc
File .../.virtualenvs/default25/lib/python2.5/site-packages/numpy/lib/__init__.py, line 4, in <module>
    from type_check import *
File .../.virtualenvs/default25/lib/python2.5/site-packages/numpy/lib/type_check.py, line 8, in <module>
    import numpy.core.numeric as _nx
File .../.virtualenvs/default25/lib/python2.5/site-packages/numpy/core/__init__.py, line 8, in <module>
    import numerictypes as nt
File .../.virtualenvs/default25/lib/python2.5/site-packages/numpy/core/numerictypes.py, line 737, in <module>
    _typestr[key] = empty((1,),key).dtype.str[1:]
ValueError: array is too big.

Obviously, I'm mixing up 32b and 64b somewhere, but I can't figure out where. Any help/hint will be deeply appreciated. Cheers P.

FYI: Python 2.5.4 (r254:67916, Jul 7 2009, 23:51:24) [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
CFLAGS=-arch i386 -arch x86_64
FFLAGS=-arch i386 -arch x86_64
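Before digging into the build flags, it can help to confirm the bitness of the interpreter actually running; a small sketch:

```python
import struct
import platform

# Pointer size of the running interpreter: 4 bytes -> 32-bit, 8 -> 64-bit.
bits = struct.calcsize("P") * 8
print("Interpreter is %d-bit" % bits)
print(platform.machine(), platform.architecture()[0])
```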
Re: [Numpy-discussion] NumPy SVN broken
On Wed, Oct 7, 2009 at 2:06 AM, David Cournapeau courn...@gmail.com wrote: On Wed, Oct 7, 2009 at 2:31 AM, Charles R Harris charlesr.har...@gmail.com wrote: Looks like a clue ;) Ok, I fixed it here: http://github.com/cournape/numpy/tree/fix_abi But that's an ugly hack. I think we should consider rewriting how we generate the API: instead of automatically growing the API array of fptr, we should explicitly mark which function name has which index, and hardcode it. It would help quite a bit to avoid changing the ABI involuntarily. I'm thinking the safest thing to do is to move the new type to the end of the list. I'm not sure what all the ramifications are for compatibility of having it stuck in the middle like that; does it change the type numbers for all the types after? I wonder what the type numbers are internally? No doubt putting it at the end makes the logic for casting more difficult, but that is something that needs fixing anyway. Question - if the new type is simply removed from the list, does anything break? Chuck
Re: [Numpy-discussion] NumPy SVN broken
On Wed, Oct 7, 2009 at 9:31 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Oct 7, 2009 at 2:06 AM, David Cournapeau courn...@gmail.com wrote: On Wed, Oct 7, 2009 at 2:31 AM, Charles R Harris charlesr.har...@gmail.com wrote: Looks like a clue ;) Ok, I fixed it here: http://github.com/cournape/numpy/tree/fix_abi But that's an ugly hack. I think we should consider rewriting how we generate the API: instead of automatically growing the API array of fptr, we should explicitly mark which function name has which index, and hardcode it. It would help quite a bit to avoid changing the ABI involuntarily. I'm thinking the safest thing to do is to move the new type to the end of the list. That's what the above branch does. I'm not sure what all the ramifications are for compatibility of having it stuck in the middle like that; does it change the type numbers for all the types after? Yes, there is no space left between the type declarations and the first functions. Currently, I just put things at the end manually, but that's really error prone. I am a bit lazy to fix this for real (I was thinking about using a python dict with hardcoded indexes as an entry instead of the current .txt files, but this requires several changes in the code generator, which is already not the greatest code to begin with). David
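David's suggestion (an explicit name-to-index table instead of an automatically grown list) can be sketched as follows; the function names and indices here are purely illustrative, not numpy's actual API table:

```python
# Hypothetical input for the code generator: every exported function is
# pinned to an explicit index, so appending an entry can never silently
# renumber the rest of the ABI.
multiarray_api = {
    "PyArray_GetNDArrayCVersion": 0,
    "PyArray_SetNumericOps": 1,
    "PyArray_NewFromDescr": 2,
    # new entries must take a fresh, never-reused index at the end
}

def check_api_table(table):
    """Reject duplicate or negative indices before generating headers."""
    indices = sorted(table.values())
    if len(set(indices)) != len(indices):
        raise ValueError("duplicate ABI index in API table")
    if indices and indices[0] < 0:
        raise ValueError("negative ABI index in API table")
    return indices

check_api_table(multiarray_api)
```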
Re: [Numpy-discussion] NumPy SVN broken
On Wed, Oct 7, 2009 at 6:37 AM, David Cournapeau courn...@gmail.com wrote: On Wed, Oct 7, 2009 at 9:31 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Oct 7, 2009 at 2:06 AM, David Cournapeau courn...@gmail.com wrote: On Wed, Oct 7, 2009 at 2:31 AM, Charles R Harris charlesr.har...@gmail.com wrote: Looks like a clue ;) Ok, I fixed it here: http://github.com/cournape/numpy/tree/fix_abi But that's an ugly hack. I think we should consider rewriting how we generate the API: instead of automatically growing the API array of fptr, we should explicitly mark which function name has which index, and hardcode it. It would help quite a bit to avoid changing the ABI involuntarily. I'm thinking the safest thing to do is to move the new type to the end of the list. That's what the above branch does. I'm not sure what all the ramifications are for compatibility of having it stuck in the middle like that; does it change the type numbers for all the types after? Yes, there is no space left between the type declarations and the first functions. Currently, I just put things at the end manually, but that's really error prone. I am a bit lazy to fix this for real (I was thinking about using a python dict with hardcoded indexes as an entry instead of the current .txt files, but this requires several changes in the code generator, which is already not the greatest code to begin with). What I'm concerned about is that, IIRC, types in the c-code can be referenced by their index in a list of types, and that internal mechanism might be exposed to the outside somewhere. That is, what has happened to the order of the enumerated types? If that has changed, and if external code references a type by a hard-wired number, then there is a problem that goes beyond the code generator. The safe(r) thing to do in that case is add the new type to the end of the enumerated types and fix the promotion code so it doesn't try to rely on a linear order.
I expect Robert can give the fastest answer to that question. Chuck
Re: [Numpy-discussion] NumPy SVN broken
On Wed, Oct 7, 2009 at 6:59 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Oct 7, 2009 at 6:37 AM, David Cournapeau courn...@gmail.com wrote: On Wed, Oct 7, 2009 at 9:31 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Oct 7, 2009 at 2:06 AM, David Cournapeau courn...@gmail.com wrote: On Wed, Oct 7, 2009 at 2:31 AM, Charles R Harris charlesr.har...@gmail.com wrote: Looks like a clue ;) Ok, I fixed it here: http://github.com/cournape/numpy/tree/fix_abi But that's an ugly hack. I think we should consider rewriting how we generate the API: instead of automatically growing the API array of fptr, we should explicitly mark which function name has which index, and hardcode it. It would help quite a bit to avoid changing the ABI involuntarily. I'm thinking the safest thing to do is to move the new type to the end of the list. That's what the above branch does. I'm not sure what all the ramifications are for compatibility of having it stuck in the middle like that; does it change the type numbers for all the types after? Yes, there is no space left between the type declarations and the first functions. Currently, I just put things at the end manually, but that's really error prone. I am a bit lazy to fix this for real (I was thinking about using a python dict with hardcoded indexes as an entry instead of the current .txt files, but this requires several changes in the code generator, which is already not the greatest code to begin with). What I'm concerned about is that, IIRC, types in the c-code can be referenced by their index in a list of types, and that internal mechanism might be exposed to the outside somewhere. That is, what has happened to the order of the enumerated types? If that has changed, and if external code references a type by a hard-wired number, then there is a problem that goes beyond the code generator.
The safe(r) thing to do in that case is add the new type to the end of the enumerated types and fix the promotion code so it doesn't try to rely on a linear order. Here, for instance: "The various character codes indicating certain types are also part of an enumerated list. References to type characters (should they be needed at all) should always use these enumerations. The form of them is NPY_{NAME}LTR where {NAME} ..." So those macros will generate a hard-coded number at compile time, a number that might have changed with the addition of the new types. Chuck
[Numpy-discussion] Building a new copy of NumPy
Hello, I checked out the latest trunk and made a new installation of NumPy. My question: is it a known behaviour that this action results in having to re-build other packages that depend on NumPy? In my case, I had to re-build matplotlib, and now scipy. Here is the error message that I get when I try to import a scipy module:

I[1]: run lab4.py
---
RuntimeError Traceback (most recent call last)
RuntimeError: FATAL: module compiled as little endian, but detected different endianness at runtime
---
ImportError Traceback (most recent call last)
/home/gsever/AtSc450/labs/04_thermals/lab4.py in <module>()
      2
      3 import numpy as np
----> 4 from scipy import stats
      5
      6
/home/gsever/Desktop/python-repo/scipy/scipy/stats/__init__.py in <module>()
      5 from info import __doc__
      6
----> 7 from stats import *
      8 from distributions import *
      9 from rv import *
/home/gsever/Desktop/python-repo/scipy/scipy/stats/stats.py in <module>()
    196 # Scipy imports.
    197 from numpy import array, asarray, dot, ma, zeros, sum
--> 198 import scipy.special as special
    199 import scipy.linalg as linalg
    200 import numpy as np
/home/gsever/Desktop/python-repo/scipy/scipy/special/__init__.py in <module>()
      6 #from special_version import special_version as __version__
      7
----> 8 from basic import *
      9 import specfun
     10 import orthogonal
/home/gsever/Desktop/python-repo/scipy/scipy/special/basic.py in <module>()
      6
      7 from numpy import *
----> 8 from _cephes import *
      9 import types
     10 import specfun
ImportError: numpy.core.multiarray failed to import
WARNING: Failure executing file: lab4.py

-- Gökhan
Re: [Numpy-discussion] Building a new copy of NumPy
On Wed, Oct 7, 2009 at 09:55, Gökhan Sever gokhanse...@gmail.com wrote: Hello, I checked out the latest trunk and made a new installation of NumPy. My question: is it a known behaviour that this action results in having to re-build other packages that depend on NumPy? In my case, I had to re-build matplotlib, and now scipy. Known issue. See the "NumPy SVN broken" thread. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
[Numpy-discussion] byteswapping a complex scalar
I'm noticing an inconsistency in how complex numbers are byteswapped as arrays vs. scalars, and wondering if I'm doing something wrong.

x = np.array([-1j], 'c8')
x.tostring().encode('hex')
'00000000000080bf'
# This is a little-endian representation, in the order (real, imag)

# When I swap the whole array, it swaps each of the (real, imag) parts separately
y = x.byteswap()
y.tostring().encode('hex')
'00000000bf800000'
# and this round-trips fine
z = np.fromstring(y.tostring(), dtype='>c8')
assert z[0] == -1j

# When I swap the scalar, it seems to swap the entire 8 bytes
y = x[0].byteswap()
y.tostring().encode('hex')
'bf80000000000000'
# ...and this doesn't round-trip
z = np.fromstring(y.tostring(), dtype='>c8')
assert z[0] == -1j
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError

Any thoughts? Mike -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA
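The array-side round trip can be written explicitly (in Python 3 syntax, since str.encode('hex') is Python 2 only); a sketch:

```python
import numpy as np

x = np.array([-1j], dtype='<c8')  # explicitly little-endian complex64
y = x.byteswap()                  # swaps the real and imag parts separately

# Reinterpreting the swapped bytes with the opposite byte order
# recovers the original value.
z = np.frombuffer(y.tobytes(), dtype='>c8')
assert z[0] == -1j
```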
Re: [Numpy-discussion] Questions about masked arrays
Added as a comment on the same entry: http://projects.scipy.org/numpy/ticket/1253#comment:1 Guessing that this one should be easy to fix :) On Wed, Oct 7, 2009 at 3:05 AM, Pierre GM pgmdevl...@gmail.com wrote: On Oct 7, 2009, at 2:57 AM, Gökhan Sever wrote: One more example. (I still think the behaviour of fill_value is inconsistent.) Well, ma.masked_values uses `value` to define fill_value; ma.masked_equal does not. So yes, there's an inconsistency here. Once again, please file an enhancement request ticket. I should be able to deal with this one quite soon. -- Gökhan
Re: [Numpy-discussion] tostring() for array rows
josef.p...@gmail.com wrote: I wanted to avoid the python loop and thought creating the view would be faster with large arrays. But for this I need to know the memory length of a row of arbitrary types for the conversion to strings. ndarray.itemsize might do it. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
Re: [Numpy-discussion] Questions about masked arrays
Gökhan Sever wrote: Sorry, too much time spent in ipython -pylab :) Good reflex. Saves you from making extra explanations. But it works with just typing array, why should I type np.array? (Ohh, my namespaces :) Because it shouldn't work that way! I use -pylab, but I've added: o.pylab_import_all = 0 to my ipy_user_conf.py file, so I don't get the namespace pollution. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
Re: [Numpy-discussion] Questions about masked arrays
On Wed, Oct 7, 2009 at 12:21 PM, Christopher Barker chris.bar...@noaa.gov wrote: Gökhan Sever wrote: Sorry, too much time spent in ipython -pylab :) Good reflex. Saves you from making extra explanations. But it works with just typing array, why should I type np.array? (Ohh, my namespaces :) Because it shouldn't work that way! I use -pylab, but I've added: o.pylab_import_all = 0 to my ipy_user_conf.py file, so I don't get the namespace pollution. -Chris Yes, I am aware of this fact. Still, either from laziness or practicality, I prefer typing plot to plt.plot and arange to np.arange, since I have to write them so many times in a day. Do you know what shortcut name is used for the scipy package itself? -- Gökhan
Re: [Numpy-discussion] Questions about masked arrays
On Wed, Oct 7, 2009 at 12:35, Gökhan Sever gokhanse...@gmail.com wrote: Do you know what shortcut name is used for the scipy package itself? I do not recommend using "import scipy" or "import scipy as sp". Import the subpackages directly (e.g. from scipy import linalg). -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
Re: [Numpy-discussion] Building a new copy of NumPy
I have seen that message, but I wasn't sure these errors were directly connected, since he mentions getting segfaults whereas my case only gives import errors. Building a new copy of scipy fixed this error. On Wed, Oct 7, 2009 at 10:10 AM, Robert Kern robert.k...@gmail.com wrote: On Wed, Oct 7, 2009 at 09:55, Gökhan Sever gokhanse...@gmail.com wrote: Hello, I checked out the latest trunk and made a new installation of NumPy. My question: is it a known behaviour that this action results in having to re-build other packages that depend on NumPy? In my case, I had to re-build matplotlib, and now scipy. Known issue. See the "NumPy SVN broken" thread. -- Robert Kern -- Gökhan
Re: [Numpy-discussion] NumPy SVN broken
On Wed, Oct 7, 2009 at 7:07 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Oct 7, 2009 at 6:59 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Oct 7, 2009 at 6:37 AM, David Cournapeau courn...@gmail.com wrote: On Wed, Oct 7, 2009 at 9:31 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Oct 7, 2009 at 2:06 AM, David Cournapeau courn...@gmail.com wrote: On Wed, Oct 7, 2009 at 2:31 AM, Charles R Harris charlesr.har...@gmail.com wrote: Looks like a clue ;) Ok, I fixed it here: http://github.com/cournape/numpy/tree/fix_abi But that's an ugly hack. I think we should consider rewriting how we generate the API: instead of automatically growing the API array of fptr, we should explicitly mark which function name has which index, and hardcode it. It would help quite a bit to avoid changing the ABI involuntarily. I'm thinking the safest thing to do is to move the new type to the end of the list. That's what the above branch does. I'm not sure what all the ramifications are for compatibility of having it stuck in the middle like that; does it change the type numbers for all the types after? Yes, there is no space left between the type declarations and the first functions. Currently, I just put things at the end manually, but that's really error prone. I am a bit lazy to fix this for real (I was thinking about using a python dict with hardcoded indexes as an entry instead of the current .txt files, but this requires several changes in the code generator, which is already not the greatest code to begin with). What I'm concerned about is that, IIRC, types in the c-code can be referenced by their index in a list of types, and that internal mechanism might be exposed to the outside somewhere. That is, what has happened to the order of the enumerated types? If that has changed, and if external code references a type by a hard-wired number, then there is a problem that goes beyond the code generator.
The safe(r) thing to do in that case is add the new type to the end of the enumerated types and fix the promotion code so it doesn't try to rely on a linear order. Here, for instance: "The various character codes indicating certain types are also part of an enumerated list. References to type characters (should they be needed at all) should always use these enumerations. The form of them is NPY_{NAME}LTR where {NAME} ..." So those macros will generate a hard-coded number at compile time, a number that might have changed with the addition of the new types. Nevermind, it looks like the new type number is at the end as it should be.

In [22]: typecodes
Out[22]:
{'All': '?bhilqpBHILQPfdgFDGSUVOMm',
 'AllFloat': 'fdgFDG',
 'AllInteger': 'bBhHiIlLqQpP',
 'Character': 'c',
 'Complex': 'FDG',
 'Datetime': 'Mm',
 'Float': 'fdg',
 'Integer': 'bhilqp',
 'UnsignedInteger': 'BHILQP'}

Chuck
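The enumerated type numbers are visible from Python through dtype.num, which makes it easy to confirm where the datetime types landed relative to the pre-existing ones; a sketch against a current numpy, where they sit at the end of the original enum:

```python
import numpy as np

# Long-standing types keep their historical numbers from the NPY_TYPES enum...
assert np.dtype(np.float64).num == 12  # NPY_DOUBLE
assert np.dtype('V8').num == 20        # NPY_VOID, last pre-datetime entry

# ...and the datetime types were appended after them.
assert np.dtype('datetime64[s]').num == 21   # NPY_DATETIME
assert np.dtype('timedelta64[s]').num == 22  # NPY_TIMEDELTA
```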
Re: [Numpy-discussion] genfromtxt - the return
Pierre GM wrote: On Oct 6, 2009, at 10:08 PM, Bruce Southey wrote: option to merge delimiters - actually in SAS it is the default. Wow! that sure strikes me as a bad choice. Ahah! I get it. Well, I remember that we discussed something like that a few months ago when I started working on np.genfromtxt, and the default of *not* merging whitespace was requested. I'm gonna check whether we can't put this option somewhere now... I'd think you might want to have two options: either whitespace, which would be any type or amount of whitespace, or a specific delimiter: say '\t' or ' ' or '  ' (two spaces), etc. In that case, it would mean one and only one of these. Of course, this would fail in Bruce's example:

A B C D
1 2 3 4
1 4 5

as there is a space for the delimiter, and one for the data! This looks like fixed-format to me. If it were single-space delimited, it would look more like this when the delimiter is whitespace:

A B C D E
1 2 3 4 5
1   4 5

which is the same as:

A, B, C, D, E
1, 2, 3, 4, 5
1, , , 4, 5

If something like SAS actually does merge delimiters, which I interpret to mean that if there are a few empty fields and you call for tab-delimited, you only get one tab, then information has simply been lost -- there is no way to recover it! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
Re: [Numpy-discussion] genfromtxt - the return
On 10/07/2009 02:14 PM, Christopher Barker wrote: Pierre GM wrote: On Oct 6, 2009, at 10:08 PM, Bruce Southey wrote: option to merge delimiters - actually in SAS it is the default. Wow! that sure strikes me as a bad choice. Ahah! I get it. Well, I remember that we discussed something like that a few months ago when I started working on np.genfromtxt, and the default of *not* merging whitespace was requested. I'm gonna check whether we can't put this option somewhere now... I'd think you might want to have two options: either whitespace, which would be any type or amount of whitespace, or a specific delimiter: say '\t' or ' ' or '  ' (two spaces), etc. In that case, it would mean one and only one of these. Of course, this would fail in Bruce's example:

A B C D
1 2 3 4
1 4 5

as there is a space for the delimiter, and one for the data! This looks like fixed-format to me. If it were single-space delimited, it would look more like this when the delimiter is whitespace:

A B C D E
1 2 3 4 5
1   4 5

which is the same as:

A, B, C, D, E
1, 2, 3, 4, 5
1, , , 4, 5

If something like SAS actually does merge delimiters, which I interpret to mean that if there are a few empty fields and you call for tab-delimited, you only get one tab, then information has simply been lost -- there is no way to recover it! -Chris

To use fixed-length fields you really need nicely formatted data, and I usually do not have that. As a default it does not always work for non-whitespace delimiters such as:

A,B,C
,,1
1,2,3

There is an option to override that behavior. But merging is very useful when you have extra whitespace, especially when reading in text strings that have different lengths or different levels of whitespace padding. The following is correct in that Python does merge whitespace delimiters by default. This is also what SAS does by default for any delimiter.
But it is incorrect if each whitespace character is a delimiter:

>>> s = StringIO(''' 1 10 100\r\n 10 1 1000''')
>>> np.genfromtxt(s)
array([[   1.,   10.,  100.],
       [  10.,    1., 1000.]])
>>> np.genfromtxt(s, delimiter=' ')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/site-packages/numpy/lib/io.py", line 1048, in genfromtxt
    raise IOError('End-of-file reached before encountering data.')
IOError: End-of-file reached before encountering data.

Anyhow, I do like what genfromtxt is doing, so merging multiple delimiters of the same type is not really needed.

Bruce
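Bruce's comma-delimited case above can be sketched as follows (a hedged illustration using his sample data; for a float dtype, genfromtxt reports the empty fields as nan rather than merging the delimiters):

```python
import io
import numpy as np

# Bruce's sample: consecutive commas mark empty (missing) fields.
data = io.StringIO("A,B,C\n,,1\n1,2,3")
arr = np.genfromtxt(data, delimiter=",", names=True)

print(arr["A"])   # first row's A is nan (missing), second row's is 1.0
print(arr["C"])   # both rows have a C value: 1.0 and 3.0
```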
Re: [Numpy-discussion] genfromtxt - the return
On Oct 7, 2009, at 3:54 PM, Bruce Southey wrote: Anyhow, I do like what genfromtxt is doing, so merging multiple delimiters of the same type is not really needed. Thinking about it, merging multiple delimiters of the same type can be tricky: how do you distinguish between, say, "AAA\t\tCCC" where you expect 2 fields and "AAA\t\tCCC" where you expect 3 fields but the second one is missing? I think genfromtxt works consistently right now (but of course, as soon as I say that we'll find some counter-examples), so let's not break it. Yet.
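Pierre's ambiguity is easy to demonstrate: once consecutive tabs are merged, a row with a missing middle field becomes indistinguishable from a genuinely shorter row. A minimal sketch:

```python
line = "AAA\t\tCCC"

# Splitting on every tab keeps the empty field, so the reader can tell
# that the second of three fields is missing.
print(line.split("\t"))                      # ['AAA', '', 'CCC']

# Merging consecutive tabs discards the empty field; the result is the
# same as a true two-field row, and the information is lost.
print([f for f in line.split("\t") if f])    # ['AAA', 'CCC']
```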
Re: [Numpy-discussion] Building a new copy of NumPy
You can pull the patches from David's fix_abi branch: http://github.com/cournape/numpy/tree/fix_abi This branch has been hacked to be ABI compatible with previous versions. Cheers Stéfan 2009/10/7 Gökhan Sever gokhanse...@gmail.com: I have seen that message, but I wasn't sure these errors were directly connected, since he mentions getting segfaults whereas my case only gives import errors. Building a new copy of scipy fixed this error.
Re: [Numpy-discussion] NumPy SVN broken
On Oct 7, 2009, at 3:06 AM, David Cournapeau wrote: On Wed, Oct 7, 2009 at 2:31 AM, Charles R Harris charlesr.har...@gmail.com wrote: Looks like a clue ;) Ok, I fixed it here: http://github.com/cournape/numpy/tree/fix_abi But that's an ugly hack. I think we should consider rewriting how we generate the API: instead of automatically growing the API array of fptr, we should explicitly mark which function name has which index, and hardcode it. It would help quite a bit to avoid changing the ABI involuntarily. I apologize for the miscommunication that has occurred here. I did not understand that there was a desire to keep ABI compatibility with NumPy 1.3 when NumPy 1.4 was released. The datetime merge was made under that presumption. I had assumed that people would be fine with recompilation of extension modules that depend on the NumPy C-API. There are several things that needed to be done to merge in new fundamental data-types. Why don't we call the next release NumPy 2.0 if that helps things? Personally, I'd prefer that over hacks to keep ABI compatibility. It feels like we are working very hard to track ABI issues that could also be handled with dependency checking and good package management. -Travis
Re: [Numpy-discussion] NumPy SVN broken
On Thu, Oct 8, 2009 at 11:39 AM, Travis Oliphant oliph...@enthought.com wrote: I apologize for the miscommunication that has occurred here. No problem I did not understand that there was a desire to keep ABI compatibility with NumPy 1.3 when NumPy 1.4 was released. The datetime merge was made under that presumption. I had assumed that people would be fine with recompilation of extension modules that depend on the NumPy C-API. There are several things that needed to be done to merge in new fundamental data-types. Why don't we call the next release NumPy 2.0 if that helps things? Personally, I'd prefer that over hacks to keep ABI compatibility. Keeping ABI compatibility by itself is not a hack - the current workaround is a hack, but that's only because the current way of doing things in the code generator is a bit ugly, and I did not want to spend too much time on it. It is purely an implementation issue; the fundamental idea is straightforward. If you want a cleaner solution, I can work on it. I think the hour or so that it would take is worth it compared to breaking many people's code. It feels like we are working very hard to track ABI issues that could also be handled with dependency checking and good package management. I think ABI issues are mostly orthogonal to versioning - generally, versions are related to API changes (API changes are what should drive ABI changes, at least for projects like numpy). I would prefer passing to numpy 2.0 when we really need to break ABI and API - at that point, I think we should also think hard about changing our structures and all to make them more robust to those changes (using pimp-like strategies in particular). David
Re: [Numpy-discussion] NumPy SVN broken
On Wed, Oct 7, 2009 at 21:55, David Cournapeau courn...@gmail.com wrote: On Thu, Oct 8, 2009 at 11:51 AM, David Cournapeau courn...@gmail.com wrote: I would prefer passing to numpy 2.0 when we really need to break ABI and API - at that point, I think we should also think hard about changing our structures and all to make them more robust to those changes (using pimp-like strategies in particular). Sorry, I mean pimple, not pimp (makes you wonder what goes in my head): Indeed! (And it's pimpl.) :-) -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
Re: [Numpy-discussion] NumPy SVN broken
On 10/7/2009 10:57 PM, Robert Kern wrote: it's pimpl OK: http://en.wikipedia.org/wiki/Opaque_pointer Thanks, Alan Isaac
Re: [Numpy-discussion] robustness strategies
Alan G Isaac wrote: On 10/7/2009 10:51 PM, David Cournapeau wrote: pimp-like strategies Which means ... ? The idea is to put one pointer in your struct instead of all the members - it is a form of encapsulation, and it is enforced at compile time. I think part of the problem with changing API/ABI in numpy is that the headers show way too much information. I would really like to improve this, but this would clearly break the ABI (and API - a lot of macros would have to go). There is a performance cost of one more indirection (if you have a pointer to a struct, you need to dereference both the struct and the d-pointer inside), but for most purposes, that's likely to be negligible, except for a few special cases (like iterators). cheers, David
Re: [Numpy-discussion] NumPy SVN broken
On Wed, Oct 7, 2009 at 8:39 PM, Travis Oliphant oliph...@enthought.com wrote: On Oct 7, 2009, at 3:06 AM, David Cournapeau wrote: On Wed, Oct 7, 2009 at 2:31 AM, Charles R Harris charlesr.har...@gmail.com wrote: Looks like a clue ;) Ok, I fixed it here: http://github.com/cournape/numpy/tree/fix_abi But that's an ugly hack. I think we should consider rewriting how we generate the API: instead of automatically growing the API array of fptr, we should explicitly mark which function name has which index, and hardcode it. It would help quite a bit to avoid changing the ABI involuntarily. I apologize for the miscommunication that has occurred here. I did not understand that there was a desire to keep ABI compatibility with NumPy 1.3 when NumPy 1.4 was released. The datetime merge was made under that presumption. I had assumed that people would be fine with recompilation of extension modules that depend on the NumPy C-API. There are several things that needed to be done to merge in new fundamental data-types. Why don't we call the next release NumPy 2.0 if that helps things? Personally, I'd prefer that over hacks to keep ABI compatibility. It feels like we are working very hard to track ABI issues that could also be handled with dependency checking and good package management. It was that the code generator shifted the API order, because it was inserting the new types after the old types but before the other API functions. It's a code generator problem and doesn't call for a jump in major version. We hope ;) I think David's hack, which looks to have been committed by Stefan, should fix things up. Chuck
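The failure mode Chuck describes can be sketched in a few lines (a toy model, not numpy's actual code generator, and the function names are hypothetical): when the generator numbers the API table automatically, inserting new entries in the middle shifts every later slot, so a module compiled against the old table calls the wrong function.

```python
# Hypothetical API tables: the new entry was inserted before the
# existing functions rather than appended at the end.
old_api = ["PyArray_FuncA", "PyArray_FuncB", "PyArray_FuncC"]
new_api = ["PyArray_FuncA", "PyDatetime_New", "PyArray_FuncB", "PyArray_FuncC"]

# Auto-numbered indices, as a code generator would assign them.
old_index = {name: i for i, name in enumerate(old_api)}
new_index = {name: i for i, name in enumerate(new_api)}

# An extension compiled against the old headers calls slot 1 expecting
# PyArray_FuncB, but in the new table that slot holds PyDatetime_New.
print(old_index["PyArray_FuncB"])            # 1
print(new_index["PyArray_FuncB"])            # 2
print(new_api[old_index["PyArray_FuncB"]])   # PyDatetime_New -- the ABI break
```

Hardcoding each name's index, as David suggests, pins the old entries in place so new functions can only be appended.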