Re: [Numpy-discussion] Tabular data package
On 10/05/2009 06:20 PM, Robert Kern wrote:
> On Mon, Oct 5, 2009 at 18:15, Elaine Angelino <elaine.angel...@gmail.com> wrote:
>>> Well, what other recarray functionality are you using?
>>
>> None, in our code. We also thought that since at least some people like
>> using the attribute reference property, perhaps users of tabarrays might
>> too (though we don't personally in our own work). Recarrays still seemed
>> to be being supported by NumPy, so it seemed to make sense to use them,
>> but the only functional thing in our code are those constructors.
>
> Then I would suggest making tabarrays subclass from ndarray. If you like,
> provide a tabrecarray that subclasses from both recarray and tabarray so
> that people who like attribute access can .view() to their heart's content.
>
>> (Also, is first casting to recarrays and then viewing as ndarrays more
>> expensive than if we went through ndarray directly?) But if NumPy decided
>> to include ndarray versions of the from*() constructors in the
>> distribution, would this be achieved by first using the recarray
>> constructor and then viewing as ndarray? Or would something more direct
>> be done?
>
> We would fix the functions to not do any unnecessary .view()s.

Hi Elaine,

I do want to look more at what you have done, as some of the features are very interesting. This discussion raises the question of what you find missing in numpy that you have included in the tabular package. In particular, is there a particular set of functions that you think could be added to numpy, or that could even form a 'better' recarray class? There are real advantages to having at least the core components in numpy.

Bruce

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] NumPy SVN broken
2009/10/6 Stéfan van der Walt <ste...@sun.ac.za>
> Hi all,
>
> The current SVN HEAD of NumPy is broken and should not be used.
> Extensions compiled against this version may (will) segfault.
>
> Travis, if you could have a look at the side-effects caused by r7050,
> that would be great. I meant to figure out what was wrong, but seeing
> that this is a 3000 line patch, I'm not confident I can find the problem
> easily.
>
> Regards
> Stéfan
>
> P.S. The new functionality is great, but I don't think we're going to be
> able to convince David to release without documenting and testing those
> changes to the C API.

Seeing as the next release process is probably going to start next month and we want things to settle out, it might be advisable to delay any intrusive patches to the release after and subject them to review and discussion first.

Chuck
Re: [Numpy-discussion] Tabular data package
On Mon, Oct 5, 2009 at 5:22 PM, Elaine Angelino elaine.angel...@gmail.com wrote: Hi there, We are writing to announce the release of Tabular, a package of Python modules for working with tabular data. Tabular is a package of Python modules for working with tabular data. Its main object is the tabarray class, a data structure for holding and manipulating tabular data. By putting data into a tabarray object, you’ll get a representation of the data that is more flexible and powerful than a native Python representation. More specifically, tabarray provides: -- ultra-fast filtering, selection, and numerical analysis methods, using convenient Matlab-style matrix operation syntax -- spreadsheet-style operations, including row column operations, 'sort', 'replace', 'aggregate', 'pivot', and 'join' -- flexible load and save methods for a variety of file formats, including delimited text (CSV), binary, and HTML -- helpful inference algorithms for determining formatting parameters and data types of input files -- support for hierarchical groupings of columns, both as data structures and file formats You can download Tabular from PyPI (http://pypi.python.org/pypi/tabular/) or alternatively clone our hg repository from bitbucket (http://bitbucket.org/elaine/tabular/). We also have posted tutorial-style Sphinx documentation (http://www.parsemydata.com/tabular/). The tabarray object is based on the record array object from the Numerical Python package (NumPy), and Tabular is built to interface well with NumPy in general. Our intended audience is two-fold: (1) Python users who, though they may not be familiar with NumPy, are in need of a way to work with tabular data, and (2) NumPy users who would like to do spreadsheet-style operations on top of their more numerical work. We hope that some of you find Tabular useful! Best, Elaine and Dan I briefly looked at the sphinx docs and the code. 
Tabular looks pretty useful and the code can be partially read as recipes for working with recarrays or structured arrays. Thanks for the choice of license (it makes looking at the code legal).

I didn't see any explicit nan handling. Are missing values allowed, e.g. in the constructor?

I looked a bit closer at functions like tabular.fast.recarrayisin since I always have problems with these row operations. Are these functions supposed to work with arbitrary structured arrays? The tests are only for 1d integer arrays. With floats the default string representation doesn't sort correctly. Or am I misreading the function?

    >>> arr = np.array([6,1,2,1e-13,0.5*1e-14,1,2e25,3,0,7]).view([('',float)]*2)
    >>> arr
    array([(6.0, 1.0), (2.0, 1e-013), (5e-015, 1.0), (2.0002e+025, 3.0),
           (0.0, 7.0)],
          dtype=[('f0', 'f8'), ('f1', 'f8')])
    >>> np.sort([str(l) for l in arr])
    array(['(0.0, 7.0)', '(2.0, 1e-013)', '(2.0002e+025, 3.0)',
           '(5e-015, 1.0)', '(6.0, 1.0)'],
          dtype='|S30')

Being able to do a searchsorted on rows of an array would be a useful feature in numpy. Is there a sortable 1d representation of the rows of a 2d float or mixed type array?

Thanks,
Josef
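
On the sorting question above: the following is a minimal sketch, not taken from Tabular's code, of why ordering rows by their string representation disagrees with numeric order, next to two NumPy calls that do order rows numerically. The names `values`, `rows` and `structured` are illustrative only, and the data are the inputs from the example in the message.

```python
import numpy as np

# Input values from the message above, read as five rows of two floats.
values = np.array([6, 1, 2, 1e-13, 5e-15, 1, 2e25, 3, 0, 7], dtype=float)
rows = values.reshape(-1, 2)

# Structured view: np.sort compares records field by field, numerically.
structured = values.view([('f0', float), ('f1', float)])
numeric_order = np.sort(structured)

# Plain 2-D array: np.lexsort treats the LAST key as the primary one.
idx = np.lexsort((rows[:, 1], rows[:, 0]))
lex_sorted = rows[idx]

# String representations sort character by character, so '(2e+25' lands
# between '(2.0' and '(5e-15' even though 2e25 is the largest value.
string_order = sorted(str(tuple(r.tolist())) for r in rows)

print(numeric_order['f0'])
print(string_order)
```

Both numeric approaches give a first column ordered from smallest to largest, while the string ordering places the 2e25 row in the middle.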
Re: [Numpy-discussion] NumPy SVN broken
2009/10/6 Charles R Harris <charlesr.har...@gmail.com>:
> 2009/10/6 Stéfan van der Walt <ste...@sun.ac.za>
>> Hi all,
>>
>> The current SVN HEAD of NumPy is broken and should not be used.
>> Extensions compiled against this version may (will) segfault.
>
> Can you be more specific? I haven't had any problems running current svn
> with scipy.

Both David and I had segfaults when running scipy compiled off the latest numpy. An example from Kiva:

    Program received signal SIGSEGV, Segmentation fault.
    PyArray_INCREF (mp=0x42) at build/scons/numpy/core/src/multiarray/refcount.c:103
    103         if (!PyDataType_REFCHK(mp->descr)) {
    (gdb) bt
    #0  PyArray_INCREF (mp=0x42) at build/scons/numpy/core/src/multiarray/refcount.c:103
    #1  0x00985f67 in agg::pixel_map_as_unowned_array (pix_map=...) at build/src.linux-i686-2.6/enthought/kiva/agg/src/x11/plat_support_wrap.cpp:2909
    #2  0x0098795f in _wrap_pixel_map_as_unowned_array (args=0xb7ed032c) at build/src.linux-i686-2.6/enthought/kiva/agg/src/x11/plat_support_wrap.cpp:3341

Via bisection, the source of the problem has been localised to the merge of the datetime branch.

Cheers
Stéfan
Re: [Numpy-discussion] NumPy SVN broken
On Tue, Oct 6, 2009 at 10:50 AM, David Cournapeau <courn...@gmail.com> wrote:
> On Wed, Oct 7, 2009 at 1:36 AM, Charles R Harris <charlesr.har...@gmail.com> wrote:
>> 2009/10/6 Stéfan van der Walt <ste...@sun.ac.za>
>>> Hi all,
>>>
>>> The current SVN HEAD of NumPy is broken and should not be used.
>>> Extensions compiled against this version may (will) segfault.
>>
>> Can you be more specific? I haven't had any problems running current svn
>> with scipy.
>
> The version itself is fine, but the ABI has been changed in an
> incompatible way: if you have an extension built against say numpy 1.2.1,
> and then use a numpy built from sources after the datetime merge, it will
> segfault right away. It does so for scipy and several custom extensions.
> The abi breakage was found to be the datetime merge.

Ah... That's a fine kettle of fish. Any idea what ABI calls are causing the problem? Maybe the dtype change wasn't made in a compatible way. IIRC, something was added to the dtype?

Chuck
Re: [Numpy-discussion] NumPy SVN broken
On Wed, Oct 7, 2009 at 2:04 AM, Charles R Harris <charlesr.har...@gmail.com> wrote:
> On Tue, Oct 6, 2009 at 10:50 AM, David Cournapeau <courn...@gmail.com> wrote:
>> On Wed, Oct 7, 2009 at 1:36 AM, Charles R Harris <charlesr.har...@gmail.com> wrote:
>>> 2009/10/6 Stéfan van der Walt <ste...@sun.ac.za>
>>>> Hi all,
>>>>
>>>> The current SVN HEAD of NumPy is broken and should not be used.
>>>> Extensions compiled against this version may (will) segfault.
>>>
>>> Can you be more specific? I haven't had any problems running current
>>> svn with scipy.
>>
>> The version itself is fine, but the ABI has been changed in an
>> incompatible way: if you have an extension built against say numpy
>> 1.2.1, and then use a numpy built from sources after the datetime merge,
>> it will segfault right away. It does so for scipy and several custom
>> extensions. The abi breakage was found to be the datetime merge.
>
> Ah... That's a fine kettle of fish. Any idea what ABI calls are causing
> the problem? Maybe the dtype change wasn't made in a compatible way. IIRC,
> something was added to the dtype?

Yes, but that should not cause trouble. Adding members to a structure should be fine.

I quickly looked at the diff, and some changes in the code generators look suspicious, e.g.:

     types = ['Generic','Number','Integer','SignedInteger','UnsignedInteger',
    -         'Inexact',
    +         'Inexact', 'TimeInteger',
              'Floating', 'ComplexFloating', 'Flexible', 'Character',
              'Byte','Short','Int', 'Long', 'LongLong', 'UByte', 'UShort',
              'UInt', 'ULong', 'ULongLong', 'Float', 'Double', 'LongDouble',
              'CFloat', 'CDouble', 'CLongDouble', 'Object', 'String', 'Unicode',
    -         'Void']
    +         'Void', 'Datetime', 'Timedelta']

As the list is used to initialize some values from the API function pointer array, inserts should be avoided. You can see the consequence on the generated files, e.g. part of the __multiarray_api.h diff between datetimemerge and just before:

    #define PyFloatingArrType_Type (*(PyTypeObject *)PyArray_API[16])
    #define PyComplexFloatingArrType_Type (*(PyTypeObject *)PyArray_API[17])
    #define PyFlexibleArrType_Type (*(PyTypeObject *)PyArray_API[18])
    #define PyCharacterArrType_Type (*(PyTypeObject *)PyArray_API[19])
    #define PyByteArrType_Type (*(PyTypeObject *)PyArray_API[20])
    #define PyShortArrType_Type (*(PyTypeObject *)PyArray_API[21])
    #define PyIntArrType_Type (*(PyTypeObject *)PyArray_API[22])
    #define PyLongArrType_Type (*(PyTypeObject *)PyArray_API[23])
    #define PyLongLongArrType_Type (*(PyTypeObject *)PyArray_API[24])
    #define PyUByteArrType_Type (*(PyTypeObject *)PyArray_API[25])
    #define PyUShortArrType_Type (*(PyTypeObject *)PyArray_API[26])
    #define PyUIntArrType_Type (*(PyTypeObject *)PyArray_API[27])
    #define PyULongArrType_Type (*(PyTypeObject *)PyArray_API[28])
    #define PyULongLongArrType_Type (*(PyTypeObject *)PyArray_API[29])
    #define PyFloatArrType_Type (*(PyTypeObject *)PyArray_API[30])
    #define PyDoubleArrType_Type (*(PyTypeObject *)PyArray_API[31])
    #define PyLongDoubleArrType_Type (*(PyTypeObject *)PyArray_API[32])
    #define PyCFloatArrType_Type (*(PyTypeObject *)PyArray_API[33])
    #define PyCDoubleArrType_Type (*(PyTypeObject *)PyArray_API[34])
    #define PyCLongDoubleArrType_Type (*(PyTypeObject *)PyArray_API[35])
    #define PyObjectArrType_Type (*(PyTypeObject *)PyArray_API[36])
    #define PyStringArrType_Type (*(PyTypeObject *)PyArray_API[37])
    #define PyUnicodeArrType_Type (*(PyTypeObject *)PyArray_API[38])
    #define PyVoidArrType_Type (*(PyTypeObject *)PyArray_API[39])
    ---
    #define PyTimeIntegerArrType_Type (*(PyTypeObject *)PyArray_API[16])
    #define PyFloatingArrType_Type (*(PyTypeObject *)PyArray_API[17])
    #define PyComplexFloatingArrType_Type (*(PyTypeObject *)PyArray_API[18])
    #define PyFlexibleArrType_Type (*(PyTypeObject *)PyArray_API[19])
    #define PyCharacterArrType_Type (*(PyTypeObject *)PyArray_API[20])
    #define PyByteArrType_Type (*(PyTypeObject *)PyArray_API[21])
    #define PyShortArrType_Type (*(PyTypeObject *)PyArray_API[22])
    #define PyIntArrType_Type (*(PyTypeObject *)PyArray_API[23])
    #define PyLongArrType_Type (*(PyTypeObject *)PyArray_API[24])
    #define PyLongLongArrType_Type (*(PyTypeObject *)PyArray_API[25])
    #define PyUByteArrType_Type (*(PyTypeObject *)PyArray_API[26])
    #define PyUShortArrType_Type (*(PyTypeObject *)PyArray_API[27])
    #define PyUIntArrType_Type (*(PyTypeObject *)PyArray_API[28])
    #define PyULongArrType_Type (*(PyTypeObject *)PyArray_API[29])
    #define PyULongLongArrType_Type (*(PyTypeObject *)PyArray_API[30])
    #define PyFloatArrType_Type (*(PyTypeObject *)PyArray_API[31])
    #define PyDoubleArrType_Type (*(PyTypeObject *)PyArray_API[32])
    #define PyLongDoubleArrType_Type (*(PyTypeObject *)PyArray_API[33])
    #define PyCFloatArrType_Type (*(PyTypeObject *)PyArray_API[34])
    #define PyCDoubleArrType_Type (*(PyTypeObject *)PyArray_API[35])
    #define PyCLongDoubleArrType_Type (*(PyTypeObject *)PyArray_API[36])
    #define PyObjectArrType_Type (*(PyTypeObject *)PyArray_API[37])
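
The index shift visible in the diff above can be reproduced with a small model. This is only a sketch of the mechanism, not NumPy's actual code generator: `slot_table` and `moved` are hypothetical helpers, the type list is shortened, and the starting slot of 15 is an assumption chosen so that 'Floating' lands on slot 16 as in the pre-merge header.

```python
# Shortened stand-in for the generator's type list before the merge.
old_types = ['Inexact', 'Floating', 'ComplexFloating', 'Void']

# The merge inserted 'TimeInteger' in the middle of the list ...
inserted = ['Inexact', 'TimeInteger', 'Floating', 'ComplexFloating',
            'Void', 'Datetime', 'Timedelta']

# ... whereas appending every new name at the end keeps old slots stable.
appended = old_types + ['TimeInteger', 'Datetime', 'Timedelta']


def slot_table(names, first_slot=15):
    """Map each type name to its position in a pointer table (hypothetical)."""
    return {name: first_slot + i for i, name in enumerate(names)}


def moved(old, new):
    """Names whose slot differs between two tables."""
    return sorted(name for name in old if old[name] != new[name])


old = slot_table(old_types)
print(moved(old, slot_table(inserted)))   # names an old extension now misreads
print(moved(old, slot_table(appended)))   # empty: nothing moved
```

An extension compiled against the old header keeps using the old slot numbers, so after a mid-list insertion it fetches a different type object than it expects. That is consistent with the segfaults reported in this thread.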
Re: [Numpy-discussion] NumPy SVN broken
On 6-Oct-09, at 12:50 PM, David Cournapeau wrote:
> The version itself is fine, but the ABI has been changed in an
> incompatible way: if you have an extension built against say numpy 1.2.1,
> and then use a numpy built from sources after the datetime merge, it will
> segfault right away. It does so for scipy and several custom extensions.
> The abi breakage was found to be the datetime merge.

I experienced something similar recently with both ETS and pytables. Good to know finally what was going on. :)

David
Re: [Numpy-discussion] NumPy SVN broken
On Tue, Oct 6, 2009 at 11:14 AM, David Cournapeau courn...@gmail.comwrote: On Wed, Oct 7, 2009 at 2:04 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Oct 6, 2009 at 10:50 AM, David Cournapeau courn...@gmail.com wrote: On Wed, Oct 7, 2009 at 1:36 AM, Charles R Harris charlesr.har...@gmail.com wrote: 2009/10/6 Stéfan van der Walt ste...@sun.ac.za Hi all, The current SVN HEAD of NumPy is broken and should not be used. Extensions compiled against this version may (will) segfault. Can you be more specific? I haven't had any problems running current svn with scipy. The version itself is fine, but the ABI has been changed in an incompatible way: if you have an extension built against say numpy 1.2.1, and then use a numpy built from sources after the datetime merge, it will segfault right away. It does so for scipy and several custom extensions. The abi breakage was found to be the datetime merge. Ah... That's a fine kettle of fish. Any idea what ABI calls are causing the problem? Maybe the dtype change wasn't made in a compatible way. IIRC, something was added to the dtype? Yes, but that should not cause trouble. Adding members to structure should be fine. I quickly look at the diff, and some changes in the code generators look suspicious, e.g.: types = ['Generic','Number','Integer','SignedInteger','UnsignedInteger', - 'Inexact', + 'Inexact', 'TimeInteger', 'Floating', 'ComplexFloating', 'Flexible', 'Character', 'Byte','Short','Int', 'Long', 'LongLong', 'UByte', 'UShort', 'UInt', 'ULong', 'ULongLong', 'Float', 'Double', 'LongDouble', 'CFloat', 'CDouble', 'CLongDouble', 'Object', 'String', 'Unicode', - 'Void'] + 'Void', 'Datetime', 'Timedelta'] As the list is used to initialize some values from the API function pointer array, inserts should be avoided. You can see the consequence on the generated files, e.g. 
part of __multiarray_api.h diff between datetimemerge and just before:

Looks like a clue ;)

<snip>

Chuck
Re: [Numpy-discussion] Tabular data package
> I didn't see any explicit nan handling. Are missing values allowed e.g.
> in the constructor?

No, this is a valid point. We don't handle this as explicitly as we should. Are you mostly talking about nan handling in loading from delimited text files? (Or are you talking about something more general, like integration of masked arrays?)

In loading from delimited text files, you can use the linefixer and valuefixer arguments, which are for more general purposes, and which will get the job done, but slowly. We should do something more specialized for missing values that would be faster.

> Are these function supposed to work with arbitrary structured arrays?

Well, they're only really tested for working with strings, floats, and ints (though only the int tests are included in the test module; we should expand that). I imagine it's possible they'd work with more sophisticated things, but I'm not sure.

> >>> arr = np.array([6,1,2,1e-13,0.5*1e-14,1,2e25,3,0,7]).view([('',float)]*2)
> >>> arr
> array([(6.0, 1.0), (2.0, 1e-013), (5e-015, 1.0), (2.0002e+025, 3.0),
>        (0.0, 7.0)],
>       dtype=[('f0', 'f8'), ('f1', 'f8')])
> >>> np.sort([str(l) for l in arr])
> array(['(0.0, 7.0)', '(2.0, 1e-013)', '(2.0002e+025, 3.0)',
>        '(5e-015, 1.0)', '(6.0, 1.0)'],
>       dtype='|S30')

Well, on this example (as in the tests that we did), fast.recarrayisin performed as spec'd. ... But definitely write back again if you think it's failing somewhere.

In general, extending a number of the things in Tabular (e.g. loadSV and saveSV) to arbitrary structured dtypes, as opposed to more basic types, would be great.

Dan
Re: [Numpy-discussion] genfromtxt - the return
On 10/05/2009 02:13 PM, Pierre GM wrote:
> All,
> Could you try r7449 ? I introduced some mechanisms to keep track of
> invalid lines (where the number of columns doesn't match what's expected).
> By default, a warning is emitted and these lines are skipped, but an
> optional argument gives the possibility to raise an exception instead.
> Now, I need more tests about wrong converters. I'm trying to optimize the
> upgrade mechanism (there are too many intertwined loops for my taste now),
> I'll keep you posted. Meanwhile, if you could come up with more cases of
> failure, please send them my way.
> Cheers
> P.

Hi,

Excellent, as the changes appear to address an incorrect number of delimiters. I think that the default invalid_raise should be True.

One 'feature' is that there is no way to indicate multiple delimiters when the delimiter is whitespace.

    A B C D
    1 2 3 4
    1 4 5

I consider this a user-beware issue when using whitespace as the delimiter, especially in Python.

Bruce
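
The two behaviours discussed above (skip invalid lines with a warning, or raise) can be tried with the `invalid_raise` argument of `np.genfromtxt`. This is a small sketch against a current NumPy rather than the r7449 revision under discussion; the sample text is made up to mirror the short row in the message, and the variable names are illustrative.

```python
import io
import warnings

import numpy as np

# Second data row has three values where four are expected.
text = "1 2 3 4\n1 4 5\n6 7 8 9\n"

# invalid_raise=False: the short row is skipped and a warning is emitted.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    kept = np.genfromtxt(io.StringIO(text), invalid_raise=False)

# invalid_raise=True: the same input raises instead of silently losing a row.
try:
    np.genfromtxt(io.StringIO(text), invalid_raise=True)
    raised = False
except ValueError:
    raised = True

print(kept)
print(raised)
```

With whitespace as the delimiter the reader cannot tell which column is missing in the short row, which is the 'user beware' point made above.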
Re: [Numpy-discussion] genfromtxt - the return
On Oct 6, 2009, at 2:42 PM, Bruce Southey wrote:
> Hi,
> Excellent as the changes appear to address incorrect number of
> delimiters.

They should also give some extra info if there's a problem w/ the converters.

> I think that the default invalid_raise should be True.

Mmh, OK, that's a +1/) for invalid_raise=true. Anybody else ?

> One 'feature' is that there is no way to indicate multiple delimiters
> when the delimiter is whitespace.
>
>     A B C D
>     1 2 3 4
>     1 4 5

Have you tried using a sequence of integers for the delimiter ? Would you mind sending me some test ?
Re: [Numpy-discussion] tostring() for array rows
josef.p...@gmail.com wrote:
> If I have a structured or a regular array, is the use of strides in the
> following always correct for the length of the row memory?
>
> I would like to do tostring() but on each row, by creating a string view
> of the memory in a 1d array.

Maybe I'm missing what you want, but why not just:

    In [15]: tmp
    Out[15]:
    array([[ 1.07810097, -1.74157351,  0.29740878],
           [-0.16786436,  0.45752272, -0.8038045 ],
           [-0.17195028, -1.16753882,  0.04329128],
           [ 0.45460137, -0.44584955, -0.77140505]])

    In [16]: rows = []

    In [17]: for r in range(tmp.shape[0]):
       ....:     rows.append(tmp[r,:].tostring())
       ....:

    In [19]: rows
    Out[19]:
    ['?\xf1?\xe6\xce\x1f9\xce\xbf\xfb\xdd|.\xc85Z?\xd3\x08\xbe\xd6\xb7\xb6\xe8',
     '\xbf\xc5|\x94Sx\x92\x18?\xddH\r\\T\xfbT\xbf\xe9\xb8\xc45\xff\x92\xdf',
     '\xbf\xc6\x02w\x82\x18i\xaf\xbf\xf2\xae=/\xfe\xff\x0b?\xa6*FD\xae\xd1F',
     '?\xdd\x180Z\xcet\xa5\xbf\xdc\x88\xcc\x8a\x8c\x8b\xe7\xbf\xe8\xafY\xa2\xf8\xac ']

In general, you can let numpy worry about the strides, etc.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR             (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov
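
For readers on a current NumPy, here is a hedged sketch of the same per-row idea. It uses `tobytes()`, the present-day name for `tostring()`, and also shows one way to get the "1d array of row memory" asked about, via a void view. The array contents and the names `tmp`, `rows` and `as_void` are illustrative, and the view approach assumes a C-contiguous 2-D array.

```python
import numpy as np

tmp = np.arange(12, dtype=float).reshape(4, 3)

# One bytes object per row; numpy handles the strides.
rows = [row.tobytes() for row in tmp]

# Alternative: reinterpret each row as a single opaque item.
# This needs contiguous memory, hence ascontiguousarray.
row_bytes = tmp.dtype.itemsize * tmp.shape[1]
as_void = np.ascontiguousarray(tmp).view(np.dtype((np.void, row_bytes))).ravel()

print(len(rows), len(rows[0]))
print(as_void.shape, as_void.dtype)
```

The list comprehension is the direct equivalent of the loop in the message. The void view avoids copying but only makes sense when each row occupies one contiguous block of memory.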
[Numpy-discussion] Questions about masked arrays
Hello,

I have a sample masked array, shown below.

1-) When I list the whole array I see the fill value correctly. However, below that line, when I access the 5th element, fill_value jumps up to 1e+20. What might be wrong here?

    I[5]: c.data['Air_Temp']
    O[5]:
    masked_array(data = [13.1509 13.1309 13.1278 13.1542 -- 13.1539 13.1387 -- -- -- 13.1107
     13.1351 13.2073 13.2562 13.3533 13.3889 13.4067 13.2938 13.1962 13.1248
     13.0411 12.9534 12.8354 12.7392 12.6725],
                 mask = [False False False False True False False True True True False False
     False False False False False False False False False False False False
     False],
           fill_value = 99.)

    I[6]: c.data['Air_Temp'][4]
    O[6]:
    masked_array(data = --,
                 mask = True,
           fill_value = 1e+20)

2-) What is wrong with the arccos calculation? Shouldn't that give the same kind of result as cos(d)?

    I[9]: d = c.data['Air_Temp'][4]

    I[11]: cos(d)
    O[11]:
    masked_array(data = --,
                 mask = True,
           fill_value = 1e+20)

    I[12]: arccos(d)
    O[12]:
    masked_array(data = 1.57079632679,
                 mask = False,
           fill_value = 1e+20)

Any ideas?

--
Gökhan
Re: [Numpy-discussion] genfromtxt - the return
Pierre GM wrote:
>> I think that the default invalid_raise should be True.
>
> Mmh, OK, that's a +1/) for invalid_raise=true. Anybody else ?

yup -- make it +2 -- ignoring errors and losing data by default is a bad idea!

>> One 'feature' is that there is no way to indicate multiple delimiters
>> when the delimiter is whitespace.
>>
>>     A B C D
>>     1 2 3 4
>>     1 4 5

I'd say someone has made a very poor choice of file formats!

Unless this is a fixed width file, in which case it should be processed as such, rather than as a delimited one. I suppose it wouldn't hurt to add that feature to genfromtxt... or is it there already? Perhaps that's what this means:

> Have you tried using a sequence of integers for the delimiter ?

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR             (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov
Re: [Numpy-discussion] Questions about masked arrays
On Oct 6, 2009, at 6:57 PM, Gökhan Sever wrote:
> Seeing a different filling value is causing confusion. Both for myself,
> and when I try to demonstrate the usage of masked array to other people.

Fair enough. I must admit that `fill_value` is a vestige from the previous implementation (talking pre 1.2 here), that is no longer really needed (cf below for more details).

> Also say, if I want to replace that one element back to its original
> state will it use fill_value as 1e+20 or 99.?

What do you mean by 'replace back to its original state' ? Using `filled`, you mean ?

>> 2-) What is wrong with the arccos calculation? Should not that result
>> the same as with cos(d) result?
>
> I first tested on 1.3.0, and later on my laptop using 1.4dev version
> which is about an old month built. Once again the results for each arc...
> function

Er, I assume it's np.arccos ? Anyway, I'm puzzled. Works like a charm here (r7438 for numpy.ma). Could it be that something went wrong with some ufuncs ? I didn't touch ma since 09/08 (thanks, svn history), so I don't think it comes from here... Would you mind trying a more recent svn version ?
Re: [Numpy-discussion] Questions about masked arrays
On Tue, Oct 6, 2009 at 7:38 PM, Pierre GM pgmdevl...@gmail.com wrote: On Oct 6, 2009, at 6:57 PM, Gökhan Sever wrote: Seeing a different filling value is causing confusion. Both for myself, and when I try to demonstrate the usage of masked array to other people. Fair enough. I must admit that `fill_value` is a vestige from the previous implementation (talking pre 1.2 here), that is no longer really needed (cf below for more details). Also say, if I want to replace that one element back to its original state will it use fill_value as 1e+20 or 99.? What do you mean by 'replace back to its original state' ? Using `filled`, you mean ? Yes, in more properly stated fashion filled :) I[14]: c.data['Air_Temp'][4] O[14]: masked_array(data = --, mask = True, fill_value = 1e+20) I[15]: c.data['Air_Temp'][4].filled() O[15]: array(1e+20) Little buggy, isn't it? It properly fill the whole array: I[13]: c.data['Air_Temp'].filled() O[13]: array([ 1.31509000e+01, 1.31309000e+01, 1.31278000e+01, 1.31542000e+01, 1.e+06, 1.31539000e+01, 1.31387000e+01, 1.e+06, 1.e+06, 1.e+06, 1.31107000e+01, 1.31351000e+01, 1.32073000e+01, 1.32562000e+01, 1.33533000e+01, 1.33889000e+01, 1.34067000e+01, 1.32938000e+01, 1.31962000e+01, 1.31248000e+01, 1.30411000e+01, 1.29534000e+01, 1.28354000e+01, 1.27392000e+01, 1.26725000e+01]) 2-) What is wrong with the arccos calculation? Should not that result the same as with cos(d) result? I first tested on 1.3.0, and later on my laptop using 1.4dev version which is about an old month built. Once again the results for each arc... function Er, I assume it's np.arccos ? Sorry too much time spent in ipython -pylab :) I[18]: arccos? Type: ufunc Base Class: type 'numpy.ufunc' String Form: ufunc 'arccos' Namespace:Interactive File: /home/gsever/Desktop/python-repo/numpy/numpy/__init__.py Anyway, I'm puzzled. Works like a charm here (r7438 for numpy.ma). Could it be that something went wrng with some ufuncs ? 
This I don't know :( I didn't touch ma since 09/08 (thanks, svn history), so I don't think it comes from here... Yes, SVN is a very useful invention indeed. I[6]: numpy.__version__ O[6]: '1.4.0.dev' For some reason it doesn't list check-out revision. Doing an ls -l reveals that those are checked-out and installed after August 13 which was a preparation for the SciPy 09 :) Would you mind trying a more recent svn version ? This is the last resort. I will eventually try this if I don't any other options left. I confirmed the same arccos weirdness in Sage Notebook (www.sagenb.org) where Numpy 1.3.0 is installed there. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Gökhan ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Questions about masked arrays
On Oct 6, 2009, at 9:54 PM, Gökhan Sever wrote: Also say, if I want to replace that one element back to its original state will it use fill_value as 1e+20 or 99.? What do you mean by 'replace back to its original state' ? Using `filled`, you mean ? Yes, in more properly stated fashion filled :) I[14]: c.data['Air_Temp'][4] O[14]: masked_array(data = --, mask = True, fill_value = 1e+20) I[15]: c.data['Air_Temp'][4].filled() O[15]: array(1e+20) Little buggy, isn't it? It properly fill the whole array: I[13]: c.data['Air_Temp'].filled() O[13]: array([ 1.31509000e+01, 1.31309000e+01, 1.31278000e+01, 1.31542000e+01, 1.e+06, 1.31539000e+01, 1.31387000e+01, 1.e+06, 1.e+06, 1.e+06, 1.31107000e+01, 1.31351000e+01, 1.32073000e+01, 1.32562000e+01, 1.33533000e+01, 1.33889000e+01, 1.34067000e+01, 1.32938000e+01, 1.31962000e+01, 1.31248000e+01, 1.30411000e+01, 1.29534000e+01, 1.28354000e+01, 1.27392000e+01, 1.26725000e+01]) Once again, when you access your 5th element, you get the special `masked` constant. If you fill this constant, you'll get something which is probably not what you want. And I would need a *REALLY* compelling reason to change this behavior, as it's gonna break a lot of things (the masked constant has been around for a while) 2-) What is wrong with the arccos calculation? Should not that Er, I assume it's np.arccos ? Sorry too much time spent in ipython -pylab :) Well, i use ipython -pylab regularly as well, but still have the reflex of using np. ;) I[18]: arccos? Type: ufunc Base Class: type 'numpy.ufunc' String Form: ufunc 'arccos' Namespace:Interactive File: /home/gsever/Desktop/python-repo/numpy/numpy/ __init__.py Anyway, I'm puzzled. Works like a charm here (r7438 for numpy.ma). Could it be that something went wrng with some ufuncs ? This I don't know :( I didn't touch ma since 09/08 (thanks, svn history), so I don't think it comes from here... Yes, SVN is a very useful invention indeed. 
I[6]: numpy.__version__ O[6]: '1.4.0.dev' For some reason it doesn't list check-out revision. I know, and it's bugging me as well. if you have a build directory somewhere, check numpy/core/__svn_version__.py This is the last resort. I will eventually try this if I don't any other options left. I gonna have difficulties fixing something that I don't see broken... Now, there might be something wrong in my installation. I gonna try to install 1.3.0 somwehere. say, what Python are you using ? ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
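
The behaviour described above, where indexing a single masked element returns the shared `masked` constant rather than a piece of the original array, can be seen in a few lines. This is a sketch against a current NumPy; the small array `temps` and its fill value of 99.0 are stand-ins chosen for illustration, not the poster's data.

```python
import numpy as np

temps = np.ma.masked_array([13.1509, 13.1309, 0.0, 13.1542],
                           mask=[False, False, True, False],
                           fill_value=99.0)

# Scalar indexing of a masked entry returns the global np.ma.masked
# constant, which carries its own fill_value, not the array's.
element = temps[2]
is_masked_constant = element is np.ma.masked

# Filling the whole array, or a slice of it, uses the array's fill_value.
filled_all = temps.filled()
filled_slice = temps[2:3].filled()

print(is_masked_constant)
print(filled_all)
print(filled_slice)
```

Taking a length-one slice such as `temps[2:3]` instead of `temps[2]` keeps the result tied to the original array, so the array's own fill value is used.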
Re: [Numpy-discussion] genfromtxt - the return
On Oct 6, 2009, at 10:08 PM, Bruce Southey wrote:
> No, just seeing what sort of problems I can create. This case is partly
> based on if someone is using tab-delimited then they need to set the
> delimiter='\t' otherwise it gives an error. Also I often parse text files
> so, yes, you have to be careful of the delimiters. It also arises because
> in certain programs like spreadsheets there is the option to merge
> delimiters - actually in SAS it is default (you need to specify the DSD
> option).

Ahah! I get it. Well, I remember that we discussed something like that a few months ago when I started working on np.genfromtxt, and the default of *not* merging whitespaces was requested. I gonna check whether we can't put this option somewhere now...

> Anyhow, I am really impressed on how this function works.

Thx. I hope things haven't been slowed down too much.
Re: [Numpy-discussion] genfromtxt - the return
On Tue, Oct 6, 2009 at 10:08 PM, Bruce Southey bsout...@gmail.com wrote: On Tue, Oct 6, 2009 at 4:04 PM, Pierre GM pgmdevl...@gmail.com wrote: On Oct 6, 2009, at 4:43 PM, Christopher Barker wrote: Pierre GM wrote: I think that the default invalid_raise should be True. Mmh, OK, that's a +1/) for invalid_raise=true. Anybody else ? yup -- make it +2 -- ignoring erreos and losing data by default is a bad idea! OK then, that's enough for me: I'll put invalid_raise as True by default. Note that a warning was emitted no matter what. One 'feature' is that there is no way to indicate multiple delimiters when the delimiter is whitespace. A B C D 1 2 3 4 1 4 5 I'd say someone has made a very poor choice of file formats! No, just seeing what sort of problems I can create. This case is partly based on if someone is using tab-delimited then they need to set the delimiter='\t' otherwise it gives an error. Also I often parse text files so, yes, you have to be careful of the delimiters. It is also arises because certain programs like spreadsheets there is the option to merge delimiters - actually in SAS it is default (you need to specify the DSD option). Unless this s a fixed width file, in which case it should be processes as such, rather than as a delimited one. I suppose it wouldn't hurt to add that feature to genfromtxt.. or is it there already. Perhaps that's what this means: Have you tried using a sequence of integers for the delimiter ? Yes, if you give a sequence of integers as delimiter, it is interpreted as the length of each field. At least, should be. More to learn and test. There's an example on using the fixed-width delimiter here: http://docs.scipy.org/numpy/docs/numpy.lib.io.genfromtxt/ As far as I know, it works fine. Anyhow, I am really impressed on how this function works. Agreed. Genfromtxt and the derived are very useful. Skipper ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
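
As a sketch of the fixed-width reading mentioned above: passing a sequence of integers as `delimiter` makes `np.genfromtxt` treat them as field widths, so an empty field stays in its column instead of collapsing. The helper `fixed_width_line` and the four-character width are assumptions made for this example; `autostrip=True` is used so padded fields are stripped before conversion.

```python
import io

import numpy as np


def fixed_width_line(fields, width=4):
    """Right-justify each field in a column of `width` characters (helper for this sketch)."""
    return "".join(f"{field:>{width}}" for field in fields)


# Second row has an empty second field, as in the whitespace example.
text = "\n".join([
    fixed_width_line(["1", "2", "3", "4"]),
    fixed_width_line(["1", "", "4", "5"]),
]) + "\n"

# A sequence of integers is interpreted as the width of each field.
data = np.genfromtxt(io.StringIO(text), delimiter=(4, 4, 4, 4), autostrip=True)

print(data)
```

Read this way, the second row keeps four columns and the empty field comes back as nan, which a whitespace-delimited read cannot do.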
Re: [Numpy-discussion] Questions about masked arrays
On Tue, Oct 6, 2009 at 9:22 PM, Pierre GM pgmdevl...@gmail.com wrote: On Oct 6, 2009, at 9:54 PM, Gökhan Sever wrote:

Also say, if I want to replace that one element back to its original state will it use fill_value as 1e+20 or 99.?

What do you mean by 'replace back to its original state'? Using `filled`, you mean?

Yes, in more properly stated fashion, filled :)

I[14]: c.data['Air_Temp'][4]
O[14]: masked_array(data = --, mask = True, fill_value = 1e+20)
I[15]: c.data['Air_Temp'][4].filled()
O[15]: array(1e+20)

Little buggy, isn't it? It properly fills the whole array:

I[13]: c.data['Air_Temp'].filled()
O[13]: array([ 1.31509000e+01, 1.31309000e+01, 1.31278000e+01, 1.31542000e+01,
       1.e+06, 1.31539000e+01, 1.31387000e+01, 1.e+06,
       1.e+06, 1.e+06, 1.31107000e+01, 1.31351000e+01,
       1.32073000e+01, 1.32562000e+01, 1.33533000e+01, 1.33889000e+01,
       1.34067000e+01, 1.32938000e+01, 1.31962000e+01, 1.31248000e+01,
       1.30411000e+01, 1.29534000e+01, 1.28354000e+01, 1.27392000e+01,
       1.26725000e+01])

Once again, when you access your 5th element, you get the special `masked` constant. If you fill this constant, you'll get something which is probably not what you want. And I would need a *REALLY* compelling reason to change this behavior, as it's gonna break a lot of things (the masked constant has been around for a while).

I see your points. I don't want to give you extra work, don't worry :) It just seems a bit bizarre:

I[27]: c.data['Air_Temp'].fill_value
O[27]: 99.005
I[28]: c.data['Air_Temp'][4].fill_value
O[28]: 1e+20

As you see, it just returns two different fill_values. I know eventually you will be the one handling this :) it might be good to add this issue to the tracker.

2-) What is wrong with the arccos calculation? Should not that

Er, I assume it's np.arccos?

Sorry, too much time spent in ipython -pylab :)

Well, I use ipython -pylab regularly as well, but still have the reflex of using np. ;)

Good reflex. Saves you from making extra explanations. But it works with just typing array, so why should I type np.array? (Ohh my namespaces :) It is just an IPython magic.

I[18]: arccos?
Type: ufunc
Base Class: type 'numpy.ufunc'
String Form: ufunc 'arccos'
Namespace: Interactive
File: /home/gsever/Desktop/python-repo/numpy/numpy/__init__.py

Anyway, I'm puzzled. Works like a charm here (r7438 for numpy.ma). Could it be that something went wrong with some ufuncs?

This I don't know :(

I didn't touch ma since 09/08 (thanks, svn history), so I don't think it comes from here...

Yes, SVN is a very useful invention indeed.

I[6]: numpy.__version__
O[6]: '1.4.0.dev'

For some reason it doesn't list the check-out revision.

I know, and it's bugging me as well. If you have a build directory somewhere, check numpy/core/__svn_version__.py

There is a build directory but no files that contain svn :( This is the last resort. I will eventually try this if I don't have any other options left.

I'm gonna have difficulties fixing something that I don't see broken... Now, there might be something wrong in my installation. I'm gonna try to install 1.3.0 somewhere. Say, what Python are you using?

OK, I use meld to diff my copy of ma/core.py with the latest trunk version. There are lots of differences :) So there is a possibility that I might have built my local numpy before 09/08. I should renew my copy. Do you know the link of the svn browser for numpy? I don't know how you are making separate installations without overriding other packages. I either use Sage (if I have extra time) or SPD. They are both shipped with numpy 1.3.0. Let's see how it will result with a new build...

-- Gökhan
Re: [Numpy-discussion] genfromtxt - the return
On Tue, Oct 6, 2009 at 10:27 PM, Pierre GM pgmdevl...@gmail.com wrote: snip

Anyhow, I am really impressed by how this function works.

Thx. I hope things haven't been slowed down too much.

In keeping with the "making some work for you" theme, I filed an enhancement ticket for one change that we discussed and another IMO useful addition. http://projects.scipy.org/numpy/ticket/1238

I think it would be nice if we could do

data = np.genfromtxt(SomeFile, dtype=float, names = ['var1', 'var2', 'var3' ...])

so that float is paired with each variable name. Also, the one that came up earlier of

data = np.genfromtxt(SomeFile, dtype=(int, int, float), names = ['var1','var2','var3'])

I'm not completely convinced on this one though, since dtype = 'i8,i8,f8' works. I don't know how much confusion it would add to have the dtype argument accept a non-valid dtype construction.

Skipper

PS. Is it bad form for me to go ahead and assign these kinds of tickets to you if you're going to be working on them, or do you get pinged when any ticket is filed?
Re: [Numpy-discussion] Questions about masked arrays
On Oct 6, 2009, at 10:58 PM, Gökhan Sever wrote:

I see your points. I don't want to give you extra work, don't worry :) It just seems a bit bizarre:

I[27]: c.data['Air_Temp'].fill_value
O[27]: 99.005
I[28]: c.data['Air_Temp'][4].fill_value
O[28]: 1e+20

As you see, it just returns two different fill_values.

I know, but I hope you see the difference: in the first line, you access the `fill_value` of the array. In the second, you access the `fill_value` of the `masked` constant. Each time you access a masked element of an array with __getitem__, you get the masked constant. We could force the constant to inherit the fill_value of the array that calls __getitem__, but it'd be propagated.

I know eventually you will be the one handling this :) it might be good to add this issue to the tracker.

Go for it, but don't expect anything before the release of 1.4.0 (in the next few months).

This is the last resort. I will eventually try this if I don't have any other options left.

I'm gonna have difficulties fixing something that I don't see broken... Now, there might be something wrong in my installation. I'm gonna try to install 1.3.0 somewhere. Say, what Python are you using?

OK, I use meld to diff my copy of ma/core.py with the latest trunk version. There are lots of differences :) So there is a possibility that I might have built my local numpy before 09/08. I should renew my copy. Do you know the link of the svn browser for numpy? I don't know how you are making separate installations without overriding other packages. I either use Sage (if I have extra time) or SPD. They are both shipped with numpy 1.3.0.

Do yourself a favor and install virtualenv and virtualenvwrapper. That way, several versions of the same package can coexist without interference. Oh, and install pip while you're at it:
http://pypi.python.org/pypi/virtualenv
http://www.doughellmann.com/projects/virtualenvwrapper/
http://pypi.python.org/pypi/pip
Re: [Numpy-discussion] genfromtxt - the return
On Oct 6, 2009, at 11:01 PM, Skipper Seabold wrote:

In keeping with the "making some work for you" theme, I filed an enhancement ticket for one change that we discussed and another IMO useful addition. http://projects.scipy.org/numpy/ticket/1238

I think it would be nice if we could do

data = np.genfromtxt(SomeFile, dtype=float, names = ['var1', 'var2', 'var3' ...])

so that float is paired with each variable name. Also, the one that came up earlier of

data = np.genfromtxt(SomeFile, dtype=(int, int, float), names = ['var1','var2','var3'])

I'm not completely convinced on this one though, since dtype = 'i8,i8,f8' works. I don't know how much confusion it would add to have the dtype argument accept a non-valid dtype construction.

Actually, it's rather straightforward. I already have something that supports dtype=(int,int,float) (far easier to handle than 'i4,i4,f8'); I need to tweak a couple of things when the names don't match before posting. Pairing the names with the dtype is pretty neat, and would be quite easy to implement.

PS. Is it bad form for me to go ahead and assign these kinds of tickets to you if you're going to be working on them, or do you get pinged when any ticket is filed?

Go for it. I'm only notified when a ticket is assigned to me directly.
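For readers trying the three spellings discussed in this exchange, here is a minimal sketch against a current NumPy, where all of them are accepted. The input text and the `var1`/`var2`/`var3` names are illustrative only; whether the 2009 trunk accepted each form is exactly what the ticket was about.

```python
from io import StringIO

import numpy as np

text = "1 2 3.5\n4 5 6.5"
names = ["var1", "var2", "var3"]

# A single type paired with every name.
same_type = np.genfromtxt(StringIO(text), dtype=float, names=names)

# A sequence of types, one per column.
mixed = np.genfromtxt(StringIO(text), dtype=(int, int, float), names=names)

# The comma-separated string form mentioned as already working.
as_string = np.genfromtxt(StringIO(text), dtype="i8,i8,f8", names=names)
```

Each call returns a structured array, so a column can be read by name, for example `mixed["var3"]`.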
Re: [Numpy-discussion] Questions about masked arrays
On Tue, Oct 6, 2009 at 10:15 PM, Pierre GM pgmdevl...@gmail.com wrote: On Oct 6, 2009, at 10:58 PM, Gökhan Sever wrote:

I see your points. I don't want to give you extra work, don't worry :) It just seems a bit bizarre:

I[27]: c.data['Air_Temp'].fill_value
O[27]: 99.005
I[28]: c.data['Air_Temp'][4].fill_value
O[28]: 1e+20

As you see, it just returns two different fill_values.

I know, but I hope you see the difference: in the first line, you access the `fill_value` of the array. In the second, you access the `fill_value` of the `masked` constant. Each time you access a masked element of an array with __getitem__, you get the masked constant. We could force the constant to inherit the fill_value of the array that calls __getitem__, but it'd be propagated.

Got these points. Thanks.

It took a while; I had to rebuild matplotlib to use ipython -pylab :) I built numpy again from the trunk source and the arccos (as well as other arc functions) problem has disappeared. It all started with trying to calculate great circle navigation equations using masked arrays, and seeing this range_calc function return some weird results where it was not supposed to, then tracing the error down to arccos.

    def range_calc(lat_r, lat_t, long_r, long_t):
        range = degrees(arccos(sin(radians(lat_r)) * sin(radians(lat_t)) + cos(radians(lat_r)) * cos(radians(lat_t)) * cos(radians(long_t - long_r)))) * F
        azimuth = degrees(arccos((sin(radians(lat_t)) - cos(radians(range / F)) * sin(radians(lat_r))) / (sin(radians(range / F)) * cos(radians(lat_r)))))
        if long_t - long_r < 0:
            azimuth = 360 - azimuth
        return range, azimuth

Happy now ;)

I know eventually you will be the one handling this :) it might be good to add this issue to the tracker.

Go for it, but don't expect anything before the release of 1.4.0 (in the next few months).

I will do this shortly.

This is the last resort. I will eventually try this if I don't have any other options left.

I'm gonna have difficulties fixing something that I don't see broken... Now, there might be something wrong in my installation. I'm gonna try to install 1.3.0 somewhere. Say, what Python are you using?

OK, I use meld to diff my copy of ma/core.py with the latest trunk version. There are lots of differences :) So there is a possibility that I might have built my local numpy before 09/08. I should renew my copy. Do you know the link of the svn browser for numpy? I don't know how you are making separate installations without overriding other packages. I either use Sage (if I have extra time) or SPD. They are both shipped with numpy 1.3.0.

Do yourself a favor and install virtualenv and virtualenvwrapper. That way, several versions of the same package can coexist without interference. Oh, and install pip while you're at it:
http://pypi.python.org/pypi/virtualenv
http://www.doughellmann.com/projects/virtualenvwrapper/
http://pypi.python.org/pypi/pip

This is the first time I am hearing of pip. Will give these tools a try, probably this weekend. Thanks again for your clarifications.

Now, I have to update my advisor's numpy to make his code run correctly. In the first place his code was running properly by using manually created masks for numpy arrays. Using the masked arrays we broke it. Now we know what is causing the error. It feels good :)

-- Gökhan
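As a rough illustration of the calculation described in this message, the sketch below restates the great-circle range and azimuth formulas with explicit numpy.ma calls so that masked inputs stay masked. Several things here are assumptions and not from the thread: `F` is never defined in the original, so 60.0 (nautical miles per degree of arc) is a placeholder; the result is named `rng` to avoid shadowing the built-in `range`; and the scalar `if` is replaced with `ma.where` so the function also works element-wise.

```python
import numpy as np
import numpy.ma as ma

# Assumption: F converts degrees of arc to a distance. The thread uses F
# without defining it; 60.0 nautical miles per degree is a placeholder.
F = 60.0


def range_calc(lat_r, lat_t, long_r, long_t):
    """Great-circle range and azimuth from point r to point t (inputs in degrees)."""
    phi_r = np.radians(lat_r)
    phi_t = np.radians(lat_t)
    dlong = np.radians(long_t - long_r)

    # Angular distance between the two points, in radians.
    arc = ma.arccos(ma.sin(phi_r) * ma.sin(phi_t)
                    + ma.cos(phi_r) * ma.cos(phi_t) * ma.cos(dlong))
    rng = np.degrees(arc) * F

    azimuth = np.degrees(ma.arccos(
        (ma.sin(phi_t) - ma.cos(arc) * ma.sin(phi_r))
        / (ma.sin(arc) * ma.cos(phi_r))))
    # Element-wise version of the scalar test `if long_t - long_r < 0`.
    azimuth = ma.where(dlong < 0, 360.0 - azimuth, azimuth)
    return rng, azimuth


# Hypothetical data: the second target latitude is missing (masked).
lat_t = ma.array([0.0, 10.0], mask=[False, True])
long_t = ma.array([1.0, 5.0])
rng, azimuth = range_calc(0.0, lat_t, 0.0, long_t)
```

The masked input propagates to both outputs, and values outside the domain of arccos (for instance from rounding when the two points coincide) are masked by `ma.arccos` instead of producing NaN.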
Re: [Numpy-discussion] Questions about masked arrays
Created the ticket http://projects.scipy.org/numpy/ticket/1253 Could you tell me briefly what was the source of leak in arccos case? And how do you write a test code for these cases? -- Gökhan
Re: [Numpy-discussion] Questions about masked arrays
On Oct 7, 2009, at 12:10 AM, Gökhan Sever wrote:

Created the ticket http://projects.scipy.org/numpy/ticket/1253

Want even more confusion?

>>> x = ma.array([1,2,3], mask=[0,1,0], dtype=int)
>>> x[0].dtype
dtype('int64')
>>> x[1].dtype
dtype('float64')
>>> x[2].dtype
dtype('int64')

Yet another illustration of the masked constant... The more I think about it, the more I think we should have a specific object (MaskedConstant) that would do nothing but tell us that it is masked.

Could you tell me briefly what was the source of leak in arccos case?

No idea, as I still haven't figured out why you were having the problem in the first place.

And how do you write a test code for these cases?

assert(np.arccos(ma.masked), ma.masked) would be the simplest.
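The dtype surprise shown above follows from indexing a masked element returning the shared `ma.masked` singleton. A short sketch, assuming a current NumPy and using an explicit `np.int64` so the result does not depend on the platform's default integer:

```python
import numpy as np
import numpy.ma as ma

x = ma.array([1, 2, 3], mask=[0, 1, 0], dtype=np.int64)

# Unmasked elements come back as ordinary NumPy scalars of the array's dtype.
first = x[0]

# A masked element comes back as the shared `ma.masked` singleton, which
# carries its own float64 dtype regardless of the parent array.
second = x[1]

# Checking the parent array's mask avoids touching the constant at all.
is_masked = ma.getmaskarray(x)[1]
```

Reading the mask with `ma.getmaskarray` is one way to test for missing entries without depending on the constant's own attributes.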
Re: [Numpy-discussion] Questions about masked arrays
On Tue, Oct 6, 2009 at 11:33 PM, Pierre GM pgmdevl...@gmail.com wrote: On Oct 7, 2009, at 12:10 AM, Gökhan Sever wrote:

Created the ticket http://projects.scipy.org/numpy/ticket/1253

Want even more confusion?

>>> x = ma.array([1,2,3], mask=[0,1,0], dtype=int)
>>> x[0].dtype
dtype('int64')
>>> x[1].dtype
dtype('float64')
>>> x[2].dtype
dtype('int64')

Yet another illustration of the masked constant... The more I think about it, the more I think we should have a specific object (MaskedConstant) that would do nothing but tell us that it is masked.

Confusing indeed. One more from me:

I[1]: a = np.arange(5)
I[2]: mask = 999
I[6]: a[3] = 999
I[7]: am = ma.masked_equal(a, mask)
I[8]: am
O[8]: masked_array(data = [0 1 2 -- 4], mask = [False False False True False], fill_value = 999999)

Where does this fill_value come from? To me it is a little confusing having a value and a fill_value in masked array method arguments.

Could you tell me briefly what was the source of leak in arccos case?

No idea, as I still haven't figured out why you were having the problem in the first place.

Probably you can pin-point the error by testing a 1.3.0 version of numpy. Not too many users of arc functions with masked arrays around, I guess :)

And how do you write a test code for these cases?

assert(np.arccos(ma.masked), ma.masked) would be the simplest.

Good to know this. The more time I spend with numpy, the more I understand the importance of testing the code automatically. This said, I still find the test-driven-development approach somewhat bizarre. Start only by writing test code and keep implementing your code until all the tests are satisfied. Very interesting... These software engineers...

-- Gökhan
Re: [Numpy-discussion] Questions about masked arrays
On Oct 7, 2009, at 1:12 AM, Gökhan Sever wrote:

One more from me:

I[1]: a = np.arange(5)
I[2]: mask = 999
I[6]: a[3] = 999
I[7]: am = ma.masked_equal(a, mask)
I[8]: am
O[8]: masked_array(data = [0 1 2 -- 4], mask = [False False False True False], fill_value = 999999)

Where does this fill_value come from? To me it is a little confusing having a value and a fill_value in masked array method arguments.

Because the two are unrelated. The `fill_value` is the value used to fill the masked elements (that is, the missing entries). When you create a masked array, you get a `fill_value`, whose actual value is defined by default from the dtype of the array: for int, it's 999999, for float, 1e+20, you get the idea. The value you used for masking is different, it's just whatever value you consider invalid. Now, if I follow you, you would expect the value in `masked_equal(array, value)` to be the `fill_value` of the output. That's an idea, would you mind filing a ticket/enhancement and assigning it to me? So that I don't forget.

Probably you can pin-point the error by testing a 1.3.0 version of numpy. Not too many users of arc functions with masked arrays around, I guess :)

Will try, but if it ain't broken, don't fix it...

assert(np.arccos(ma.masked), ma.masked) would be the simplest.

(and in fact, it'd be assert(np.arccos(ma.masked) is ma.masked) in this case).

Good to know this. The more time I spend with numpy, the more I understand the importance of testing the code automatically. This said, I still find the test-driven-development approach somewhat bizarre. Start only by writing test code and keep implementing your code until all the tests are satisfied. Very interesting... These software engineers...

Bah, it's not a rule cast in iron... You can start writing your code but do write the tests at the same time. It's the best way to make sure you're not breaking something later on.
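To make the distinction between the masking value and the `fill_value` concrete, and to show why the corrected form of the test uses `is`, here is a small sketch against a current NumPy. Two choices are mine and not from the thread: the fill value of -1 is arbitrary, and the check uses `ma.arccos` where the message uses `np.arccos`, since the masked-array version is documented to return the constant for masked input.

```python
import numpy as np
import numpy.ma as ma

# Default fill values depend only on the dtype.
int_default = ma.default_fill_value(np.dtype(np.int64))      # 999999
float_default = ma.default_fill_value(np.dtype(np.float64))  # 1e+20

# The value used to build the mask and the fill value are independent:
# here 999 marks invalid data, while -1 is what filled() writes back.
a = np.array([0, 1, 2, 999, 4])
am = ma.masked_equal(a, 999)
am.fill_value = -1
filled = am.filled()

# A non-empty tuple is always truthy, so asserting on (result, expected)
# can never fail; an identity check against the singleton is the real test.
tuple_is_truthy = bool((ma.arccos(ma.masked), ma.masked))
masked_in_masked_out = ma.arccos(ma.masked) is ma.masked
```

Setting `fill_value` explicitly after masking keeps the example independent of whatever default `masked_equal` assigns in a given release.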