Re: [Numpy-discussion] Tabular data package

2009-10-06 Thread Bruce Southey
On 10/05/2009 06:20 PM, Robert Kern wrote:
 On Mon, Oct 5, 2009 at 18:15, Elaine Angelino elaine.angel...@gmail.com wrote:


 Well, what other recarray functionality are you using?

 None, in our code.   We also thought that since at least some people like
 using the attribute reference property, perhaps users of tabarrays might too
 (though we don't personally in our own work)   Recarrays still seemed to be
 being supported by NumPy, so it seemed to make sense to use them.   but the
 only functional thing in our code are those constructors.
  
 Then I would suggest making tabarrays subclass from ndarray. If you
 like, provide a tabrecarray that subclasses from both recarray and
 tabarray so that people who like attribute access can .view() to their
 heart's content.


 (Also, is first casting to recarrays and then viewing as ndarrays more
 expensive than if we went through ndarray directly?)
  

 But if NumPy decided to include ndarray versions of the from*() constructors
 in the distribution, would this be achieved by first using the recarray
 constructor and then viewing as ndarray?  Or would something more direct
 be done?
  
 We would fix the functions to not do any unnecessary .view()s.


Hi Elaine,
I do want to look more at what you have done as some of the features are 
very interesting.

This discussion raises the question of what you find missing in numpy 
that you have included in the tabular package.
In particular, is there a set of functions that you think 
could be added to numpy, or that could even form a 'better' recarray class?
There are real advantages of having at least core components in numpy.

Bruce

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy SVN broken

2009-10-06 Thread Charles R Harris
2009/10/6 Stéfan van der Walt ste...@sun.ac.za

 Hi all,

 The current SVN HEAD of NumPy is broken and should not be used.
 Extensions compiled against this version may (will) segfault.

 Travis, if you could have a look at the side-effects caused by r7050,
 that would be great.  I meant to figure out what was wrong, but seeing
 that this is a 3000 line patch, I'm not confident I can find the
 problem easily.

 Regards
 Stéfan

 P.S. The new functionality is great, but I don't think we're going to
 be able to convince David to release without documenting and testing
 those changes to the C API.


Seeing as the next release process is probably going to start next month and
we want things to settle out, it might be advisable to delay any intrusive
patches to the release after and subject them to review and discussion
first.

Chuck


Re: [Numpy-discussion] Tabular data package

2009-10-06 Thread josef . pktd
On Mon, Oct 5, 2009 at 5:22 PM, Elaine Angelino
elaine.angel...@gmail.com wrote:
 Hi there,

 We are writing to announce the release of Tabular, a package of Python
 modules for working with tabular data.

 Tabular is a package of Python modules for working with tabular data. Its
 main object is the tabarray class, a data structure for holding and
 manipulating tabular data. By putting data into a tabarray object, you’ll
 get a representation of the data that is more flexible and powerful than a
 native Python representation. More specifically, tabarray provides:

 -- ultra-fast filtering, selection, and numerical analysis methods, using
 convenient Matlab-style matrix operation syntax
 -- spreadsheet-style operations, including row and column operations, 'sort',
 'replace', 'aggregate', 'pivot', and 'join'
 -- flexible load and save methods for a variety of file formats, including
 delimited text (CSV), binary, and HTML
 -- helpful inference algorithms for determining formatting parameters and
 data types of input files
 -- support for hierarchical groupings of columns, both as data structures
 and file formats

 You can download Tabular from PyPI (http://pypi.python.org/pypi/tabular/) or
 alternatively clone our hg repository from bitbucket
 (http://bitbucket.org/elaine/tabular/).  We also have posted tutorial-style
 Sphinx documentation (http://www.parsemydata.com/tabular/).

 The tabarray object is based on the record array object from the Numerical
 Python package (NumPy), and Tabular is built to interface well with NumPy in
 general.  Our intended audience is two-fold: (1) Python users who, though
 they may not be familiar with NumPy, are in need of a way to work with
 tabular data, and (2) NumPy users who would like to do spreadsheet-style
 operations on top of their more numerical work.

 We hope that some of you find Tabular useful!

 Best,

 Elaine and Dan

I briefly looked at the sphinx docs and the code. Tabular looks pretty
useful and the code can be partially read as recipes for working with
recarrays or structured arrays. Thanks for the choice of license (it
makes looking at the code legal).

I didn't see any explicit nan handling. Are missing values allowed
e.g. in the constructor?

I looked a bit closer at functions like tabular.fast.recarrayisin since
I always have problems with these row operations.
Are these functions supposed to work with arbitrary structured arrays?
The tests are only for 1d integer arrays.
With floats the default string representation doesn't sort correctly.
Or am I misreading the function?

>>> arr = np.array([6,1,2,1e-13,0.5*1e-14,1,2e25,3,0,7]).view([('',float)]*2)
>>> arr
array([(6.0, 1.0), (2.0, 1e-013), (5e-015, 1.0),
       (2.0002e+025, 3.0), (0.0, 7.0)],
      dtype=[('f0', 'f8'), ('f1', 'f8')])
>>> np.sort([str(l) for l in arr])
array(['(0.0, 7.0)', '(2.0, 1e-013)', '(2.0002e+025, 3.0)',
       '(5e-015, 1.0)', '(6.0, 1.0)'],
      dtype='|S30')

Being able to do a searchsorted on rows of an array would be a useful feature
in numpy. Is there a sortable 1d representation of the rows of a 2d float or
mixed type array?
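
For rows that are all numeric, one possible route is to sort on the
columns themselves with np.lexsort rather than on their string form.
This is only a sketch, not what tabular.fast.recarrayisin does; the sample
values and the variable names are made up for the illustration.

```python
import numpy as np

# Five rows of two floats, similar in spirit to the example above.
rows = np.array([[6.0, 1.0],
                 [2.0, 1e-13],
                 [5e-15, 1.0],
                 [2e25, 3.0],
                 [0.0, 7.0]])

# Sorting the string form compares characters, not magnitudes.
string_order = np.argsort([repr(tuple(r)) for r in rows.tolist()])

# np.lexsort treats the LAST key as the primary one, so the columns are
# passed in reverse order to make column 0 the primary sort key.
numeric_order = np.lexsort(rows.T[::-1])
sorted_rows = rows[numeric_order]
```

The two orderings differ as soon as exponents are involved, which is the
problem described above. This does not by itself give a searchsorted over
rows; it only yields a row order that respects the numeric values.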

Thanks,

Josef




Re: [Numpy-discussion] NumPy SVN broken

2009-10-06 Thread Stéfan van der Walt
2009/10/6 Charles R Harris charlesr.har...@gmail.com:
 2009/10/6 Stéfan van der Walt ste...@sun.ac.za

 Hi all,

 The current SVN HEAD of NumPy is broken and should not be used.
 Extensions compiled against this version may (will) segfault.


 Can you be more specific? I haven't had any problems running current svn
 with scipy.

Both David and I had segfaults when running scipy compiled off
the latest numpy.  An example from Kiva:

Program received signal SIGSEGV, Segmentation fault.
PyArray_INCREF (mp=0x42)
    at build/scons/numpy/core/src/multiarray/refcount.c:103
103     if (!PyDataType_REFCHK(mp->descr)) {
(gdb) bt
#0  PyArray_INCREF (mp=0x42)
    at build/scons/numpy/core/src/multiarray/refcount.c:103
#1  0x00985f67 in agg::pixel_map_as_unowned_array (pix_map=...)
    at build/src.linux-i686-2.6/enthought/kiva/agg/src/x11/plat_support_wrap.cpp:2909
#2  0x0098795f in _wrap_pixel_map_as_unowned_array (args=0xb7ed032c)
    at build/src.linux-i686-2.6/enthought/kiva/agg/src/x11/plat_support_wrap.cpp:3341

Via bisection, the source of the problem has been localised to the
merge of the datetime branch.

Cheers
Stéfan


Re: [Numpy-discussion] NumPy SVN broken

2009-10-06 Thread Charles R Harris
On Tue, Oct 6, 2009 at 10:50 AM, David Cournapeau courn...@gmail.com wrote:

 On Wed, Oct 7, 2009 at 1:36 AM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  2009/10/6 Stéfan van der Walt ste...@sun.ac.za
 
  Hi all,
 
  The current SVN HEAD of NumPy is broken and should not be used.
  Extensions compiled against this version may (will) segfault.
 
 
  Can you be more specific? I haven't had any problems running current svn
  with scipy.

 The version itself is fine, but the ABI has been changed in an
 incompatible way: if you have an extension built against say numpy
 1.2.1, and then use a numpy built from sources after the datetime
 merge, it will segfault right away. It does so for scipy and several
 custom extensions. The abi breakage was found to be the datetime
 merge.


Ah... That's a fine kettle of fish. Any idea what ABI calls are causing the
problem? Maybe the dtype change wasn't made in a compatible way. IIRC,
something was added to the dtype?

Chuck


Re: [Numpy-discussion] NumPy SVN broken

2009-10-06 Thread David Cournapeau
On Wed, Oct 7, 2009 at 2:04 AM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Tue, Oct 6, 2009 at 10:50 AM, David Cournapeau courn...@gmail.com
 wrote:

 On Wed, Oct 7, 2009 at 1:36 AM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  2009/10/6 Stéfan van der Walt ste...@sun.ac.za
 
  Hi all,
 
  The current SVN HEAD of NumPy is broken and should not be used.
  Extensions compiled against this version may (will) segfault.
 
 
  Can you be more specific? I haven't had any problems running current svn
  with scipy.

 The version itself is fine, but the ABI has been changed in an
 incompatible way: if you have an extension built against say numpy
 1.2.1, and then use a numpy built from sources after the datetime
 merge, it will segfault right away. It does so for scipy and several
 custom extensions. The abi breakage was found to be the datetime
 merge.


 Ah... That's a fine kettle of fish. Any idea what ABI calls are causing the
 problem? Maybe the dtype change wasn't made in a compatible way. IIRC,
 something was added to the dtype?

Yes, but that should not cause trouble. Adding members to a structure
should be fine.

I quickly looked at the diff, and some changes in the code generators
look suspicious, e.g.:

 types = ['Generic','Number','Integer','SignedInteger','UnsignedInteger',
- 'Inexact',
+ 'Inexact', 'TimeInteger',
  'Floating', 'ComplexFloating', 'Flexible', 'Character',
  'Byte','Short','Int', 'Long', 'LongLong', 'UByte', 'UShort',
  'UInt', 'ULong', 'ULongLong', 'Float', 'Double', 'LongDouble',
  'CFloat', 'CDouble', 'CLongDouble', 'Object', 'String', 'Unicode',
- 'Void']
+ 'Void', 'Datetime', 'Timedelta']

As the list is used to initialize some values from the API function
pointer array, inserts in the middle of the list should be avoided, since
they shift the index of every later entry. You can see the consequence
on the generated files, e.g. part of the __multiarray_api.h diff between
the datetime merge and the revision just before:

 #define PyFloatingArrType_Type (*(PyTypeObject *)PyArray_API[16])
 #define PyComplexFloatingArrType_Type (*(PyTypeObject *)PyArray_API[17])
 #define PyFlexibleArrType_Type (*(PyTypeObject *)PyArray_API[18])
 #define PyCharacterArrType_Type (*(PyTypeObject *)PyArray_API[19])
 #define PyByteArrType_Type (*(PyTypeObject *)PyArray_API[20])
 #define PyShortArrType_Type (*(PyTypeObject *)PyArray_API[21])
 #define PyIntArrType_Type (*(PyTypeObject *)PyArray_API[22])
 #define PyLongArrType_Type (*(PyTypeObject *)PyArray_API[23])
 #define PyLongLongArrType_Type (*(PyTypeObject *)PyArray_API[24])
 #define PyUByteArrType_Type (*(PyTypeObject *)PyArray_API[25])
 #define PyUShortArrType_Type (*(PyTypeObject *)PyArray_API[26])
 #define PyUIntArrType_Type (*(PyTypeObject *)PyArray_API[27])
 #define PyULongArrType_Type (*(PyTypeObject *)PyArray_API[28])
 #define PyULongLongArrType_Type (*(PyTypeObject *)PyArray_API[29])
 #define PyFloatArrType_Type (*(PyTypeObject *)PyArray_API[30])
 #define PyDoubleArrType_Type (*(PyTypeObject *)PyArray_API[31])
 #define PyLongDoubleArrType_Type (*(PyTypeObject *)PyArray_API[32])
 #define PyCFloatArrType_Type (*(PyTypeObject *)PyArray_API[33])
 #define PyCDoubleArrType_Type (*(PyTypeObject *)PyArray_API[34])
 #define PyCLongDoubleArrType_Type (*(PyTypeObject *)PyArray_API[35])
 #define PyObjectArrType_Type (*(PyTypeObject *)PyArray_API[36])
 #define PyStringArrType_Type (*(PyTypeObject *)PyArray_API[37])
 #define PyUnicodeArrType_Type (*(PyTypeObject *)PyArray_API[38])
 #define PyVoidArrType_Type (*(PyTypeObject *)PyArray_API[39])
---
 #define PyTimeIntegerArrType_Type (*(PyTypeObject *)PyArray_API[16])
 #define PyFloatingArrType_Type (*(PyTypeObject *)PyArray_API[17])
 #define PyComplexFloatingArrType_Type (*(PyTypeObject *)PyArray_API[18])
 #define PyFlexibleArrType_Type (*(PyTypeObject *)PyArray_API[19])
 #define PyCharacterArrType_Type (*(PyTypeObject *)PyArray_API[20])
 #define PyByteArrType_Type (*(PyTypeObject *)PyArray_API[21])
 #define PyShortArrType_Type (*(PyTypeObject *)PyArray_API[22])
 #define PyIntArrType_Type (*(PyTypeObject *)PyArray_API[23])
 #define PyLongArrType_Type (*(PyTypeObject *)PyArray_API[24])
 #define PyLongLongArrType_Type (*(PyTypeObject *)PyArray_API[25])
 #define PyUByteArrType_Type (*(PyTypeObject *)PyArray_API[26])
 #define PyUShortArrType_Type (*(PyTypeObject *)PyArray_API[27])
 #define PyUIntArrType_Type (*(PyTypeObject *)PyArray_API[28])
 #define PyULongArrType_Type (*(PyTypeObject *)PyArray_API[29])
 #define PyULongLongArrType_Type (*(PyTypeObject *)PyArray_API[30])
 #define PyFloatArrType_Type (*(PyTypeObject *)PyArray_API[31])
 #define PyDoubleArrType_Type (*(PyTypeObject *)PyArray_API[32])
 #define PyLongDoubleArrType_Type (*(PyTypeObject *)PyArray_API[33])
 #define PyCFloatArrType_Type (*(PyTypeObject *)PyArray_API[34])
 #define PyCDoubleArrType_Type (*(PyTypeObject *)PyArray_API[35])
 #define PyCLongDoubleArrType_Type (*(PyTypeObject *)PyArray_API[36])
 #define PyObjectArrType_Type (*(PyTypeObject *)PyArray_API[37])
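
The effect can be mimicked in a few lines of Python. This is only a toy
model, assuming the generator hands out PyArray_API slots in list order;
the starting offset of 10 is read off the listing above (Floating sits at
list position 6 and slot 16), and the helper name assign_slots is made up
for the illustration.

```python
# First few entries of the type list, as in the diff above.
old_types = ['Generic', 'Number', 'Integer', 'SignedInteger',
             'UnsignedInteger', 'Inexact', 'Floating', 'ComplexFloating']


def assign_slots(types, first_slot=10):
    """Hypothetical stand-in for the generator: one slot per name, in list order."""
    return {name: first_slot + i for i, name in enumerate(types)}


def slot_table(types):
    """Invert the mapping: which name lives at which slot."""
    return {slot: name for name, slot in assign_slots(types).items()}


# Slot numbers baked into an extension compiled against the old header.
old_slots = assign_slots(old_types)

# Variant 1: the new name is inserted mid-list, as in the datetime merge.
table_inserted = slot_table(old_types[:6] + ['TimeInteger'] + old_types[6:])

# Variant 2: the new name is appended at the end instead.
table_appended = slot_table(old_types + ['TimeInteger'])

# What the old extension finds when it asks for "Floating".
seen_after_insert = table_inserted[old_slots['Floating']]
seen_after_append = table_appended[old_slots['Floating']]
```

With the mid-list insert the old extension picks up the TimeInteger entry
where it expects Floating; with the append every old slot keeps its meaning.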

Re: [Numpy-discussion] NumPy SVN broken

2009-10-06 Thread David Warde-Farley
On 6-Oct-09, at 12:50 PM, David Cournapeau wrote:

 The version itself is fine, but the ABI has been changed in an
 incompatible way: if you have an extension built against say numpy
 1.2.1, and then use a numpy built from sources after the datetime
 merge, it will segfault right away. It does so for scipy and several
 custom extensions. The abi breakage was found to be the datetime
 merge.

I experienced something similar recently with both ETS and pytables.  
Good to know finally what was going on. :)

David


Re: [Numpy-discussion] NumPy SVN broken

2009-10-06 Thread Charles R Harris
On Tue, Oct 6, 2009 at 11:14 AM, David Cournapeau courn...@gmail.com wrote:

 On Wed, Oct 7, 2009 at 2:04 AM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  On Tue, Oct 6, 2009 at 10:50 AM, David Cournapeau courn...@gmail.com
  wrote:
 
  On Wed, Oct 7, 2009 at 1:36 AM, Charles R Harris
  charlesr.har...@gmail.com wrote:
  
  
   2009/10/6 Stéfan van der Walt ste...@sun.ac.za
  
   Hi all,
  
   The current SVN HEAD of NumPy is broken and should not be used.
   Extensions compiled against this version may (will) segfault.
  
  
   Can you be more specific? I haven't had any problems running current
 svn
   with scipy.
 
  The version itself is fine, but the ABI has been changed in an
  incompatible way: if you have an extension built against say numpy
  1.2.1, and then use a numpy built from sources after the datetime
  merge, it will segfault right away. It does so for scipy and several
  custom extensions. The abi breakage was found to be the datetime
  merge.
 
 
  Ah... That's a fine kettle of fish. Any idea what ABI calls are causing
 the
  problem? Maybe the dtype change wasn't made in a compatible way. IIRC,
  something was added to the dtype?

 Yes, but that should not cause trouble. Adding members to structure
 should be fine.

 I quickly look at the diff, and some changes in the code generators
 look suspicious, e.g.:

  types = ['Generic','Number','Integer','SignedInteger','UnsignedInteger',
 - 'Inexact',
 + 'Inexact', 'TimeInteger',
  'Floating', 'ComplexFloating', 'Flexible', 'Character',
  'Byte','Short','Int', 'Long', 'LongLong', 'UByte', 'UShort',
  'UInt', 'ULong', 'ULongLong', 'Float', 'Double', 'LongDouble',
  'CFloat', 'CDouble', 'CLongDouble', 'Object', 'String', 'Unicode',
 - 'Void']
 + 'Void', 'Datetime', 'Timedelta']

 As the list is used to initialize some values from the API function
 pointer array, inserts  should be avoided. You can see the consequence
 on the generated files, e.g. part of __multiarray_api.h diff between
 datetimemerge and just before:


Looks like a clue ;)

snip

Chuck


Re: [Numpy-discussion] Tabular data package

2009-10-06 Thread Dan Yamins

 I didn't see any explicit nan handling. Are missing values allowed
 e.g. in the constructor?


No, this is a valid point.  We don't handle this as explicitly as we
should.   Are you mostly talking about nan handling in loading from
delimited text files?  (Or are you talking about something more general,
like integration of masked arrays?)   In loading from delimited text files,
you can use the linefixer and valuefixer arguments, which are for more
general purposes, and which will get the job done, but slowly.  We should do
something more specialized for missing values that would be faster.
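
For comparison, this is roughly how NumPy's own np.genfromtxt deals with
missing entries in delimited text. It is not Tabular's loader, and the
sample text and the 'NA' marker are made up for the illustration.

```python
import io

import numpy as np

# Two data rows: one empty field and one field marked 'NA'.
text = "a,b,c\n1.0,,3.0\n4.0,5.0,NA\n"

table = np.genfromtxt(
    io.StringIO(text),
    delimiter=",",
    names=True,              # take the field names from the first line
    missing_values="NA",     # extra marker to treat as missing
    filling_values=np.nan,   # value stored in place of a missing entry
)

# Boolean masks showing where each column ended up as NaN.
missing_b = np.isnan(table["b"])
missing_c = np.isnan(table["c"])
```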



 Are these function supposed to work with arbitrary structured arrays?


Well, they're only really tested for working with strings, floats, and ints
(tho only the int tests are included in the test module, we should expand
that).   I imagine it's possible they'd work with more sophisticated things
but I'm not sure.



  arr =
 np.array([6,1,2,1e-13,0.5*1e-14,1,2e25,3,0,7]).view([('',float)]*2)
  arr
 array([(6.0, 1.0), (2.0, 1e-013), (5e-015, 1.0),
   (2.0002e+025, 3.0), (0.0, 7.0)],
  dtype=[('f0', 'f8'), ('f1', 'f8')])
  np.sort([str(l) for l in arr])
 array(['(0.0, 7.0)', '(2.0, 1e-013)', '(2.0002e+025, 3.0)',
   '(5e-015, 1.0)', '(6.0, 1.0)'],
  dtype='|S30')

 Well on this example (as in tests that we did), fast.recarrayisin performed
as spec'd.   ...  But definitely write back again if you think it's failing
somewhere.

In general, extending a number of the things in Tabular (e.g. the loadSV and
saveSV) to arbitrary structured dtypes as opposed to more basic types would
be great.

Dan


Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Bruce Southey
On 10/05/2009 02:13 PM, Pierre GM wrote:
 All,
 Could you try r7449 ? I introduced some mechanisms to keep track of
 invalid lines (where the number of columns don't match what's
 expected). By default, a warning is emitted and these lines are
 skipped, but an optional argument gives the possibility to raise an
 exception instead.
 Now, I need more tests about wrong converters. I'm trying to optimize
 the upgrade mechanism (there are too many intertwined loops for my
 taste now), I'll keep you posted.
 Meanwhile, if you could come with more cases of failure, please send
 them my way.
 Cheers
 P.

Hi,
Excellent as the changes appear to address incorrect number of delimiters.

I think that the default invalid_raise should be True.

One 'feature' is that there is no way to indicate multiple delimiters 
when the delimiter is whitespace.
A B C D
1 2 3 4
1     4 5

Which I consider a user beware issue when using whitespace as the 
delimiter especially in Python.


Bruce



Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Pierre GM

On Oct 6, 2009, at 2:42 PM, Bruce Southey wrote:

 Hi,
 Excellent as the changes appear to address incorrect number of  
 delimiters.

They should also give some extra info if there's a problem w/ the  
converters.

 I think that the default invalid_raise should be True.

Mmh, OK, that's a +1/) for invalid_raise=true. Anybody else ?



 One 'feature' is that there is no way to indicate multiple delimiters
 when the delimiter is whitespace.
 A B C D
 1 2 3 4
 1     4 5

Have you tried using a sequence of integers for the delimiter ? Would  
you mind sending me some test ?



Re: [Numpy-discussion] tostring() for array rows

2009-10-06 Thread Christopher Barker
josef.p...@gmail.com wrote:
 If I have a structured or a regular array, is the use of strides in
 the following always correct for the length of the row memory?
 
 I would like to do tostring() but on each row, by creating a string
 view of the memory in a 1d array.

Maybe I'm missing what you want, but why not just:

In [15]: tmp
Out[15]:
array([[ 1.07810097, -1.74157351,  0.29740878],
[-0.16786436,  0.45752272, -0.8038045 ],
[-0.17195028, -1.16753882,  0.04329128],
[ 0.45460137, -0.44584955, -0.77140505]])

In [16]: rows = []

In [17]: for r in range(tmp.shape[0]):
   ....:     rows.append(tmp[r,:].tostring())
   ....:

In [19]: rows
Out[19]:
['?\xf1?\xe6\xce\x1f9\xce\xbf\xfb\xdd|.\xc85Z?\xd3\x08\xbe\xd6\xb7\xb6\xe8',
  '\xbf\xc5|\x94Sx\x92\x18?\xddH\r\\T\xfbT\xbf\xe9\xb8\xc45\xff\x92\xdf',
  '\xbf\xc6\x02w\x82\x18i\xaf\xbf\xf2\xae=/\xfe\xff\x0b?\xa6*FD\xae\xd1F',
 
'?\xdd\x180Z\xcet\xa5\xbf\xdc\x88\xcc\x8a\x8c\x8b\xe7\xbf\xe8\xafY\xa2\xf8\xac 
']


in general, you can let numpy worry about the strides, etc.
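
A sketch of the same idea without the explicit loop, assuming a plain 2-d
float array. Newer NumPy spells tostring() as tobytes(); the void-dtype
view is one possible way to get a 1-d "one item per row" view of the
memory, and the names row_bytes and row_view are only illustrative.

```python
import numpy as np

# Stand-in for the array above: 4 rows of 3 float64 values.
tmp = np.arange(12, dtype=np.float64).reshape(4, 3)

# One bytes object per row; numpy takes care of the strides.
row_bytes = [row.tobytes() for row in tmp]

# For a C-contiguous array, each row can also be viewed as a single
# opaque item of itemsize * ncols bytes, without copying the data.
row_dtype = np.dtype((np.void, tmp.dtype.itemsize * tmp.shape[1]))
row_view = np.ascontiguousarray(tmp).view(row_dtype).ravel()
```

The view only makes sense when each row is contiguous in memory, which is
why the sketch goes through np.ascontiguousarray first.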

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


[Numpy-discussion] Questions about masked arrays

2009-10-06 Thread Gökhan Sever
Hello,

I have a sample masked array data as shown below.

1-) When I list the whole array I see the fill value correctly. However,
below that, when I access the 5th element, fill_value jumps up to
1e+20. What might be wrong here?

I[5]: c.data['Air_Temp']
O[5]:
masked_array(data = [13.1509 13.1309 13.1278 13.1542 -- 13.1539 13.1387 --
-- -- 13.1107
 13.1351 13.2073 13.2562 13.3533 13.3889 13.4067 13.2938 13.1962 13.1248
 13.0411 12.9534 12.8354 12.7392 12.6725],
 mask = [False False False False  True False False  True  True
True False False
 False False False False False False False False False False False False
 False],
   fill_value = 99.)


I[6]: c.data['Air_Temp'][4]
O[6]:
masked_array(data = --,
 mask = True,
   fill_value = 1e+20)




2-) What is wrong with the arccos calculation? Shouldn't it give the
same masked result as cos(d)?



I[9]: d = c.data['Air_Temp'][4]


I[11]: cos(d)
O[11]:
masked_array(data = --,
 mask = True,
   fill_value = 1e+20)


I[12]: arccos(d)
O[12]:
masked_array(data = 1.57079632679,
 mask = False,
   fill_value = 1e+20)


Any ideas?

-- 
Gökhan


Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Christopher Barker
Pierre GM wrote:
 I think that the default invalid_raise should be True.
 
 Mmh, OK, that's a +1/) for invalid_raise=true. Anybody else ?

yup -- make it +2 -- ignoring errors and losing data by default is a 
bad idea!

 One 'feature' is that there is no way to indicate multiple delimiters
 when the delimiter is whitespace.
 A B C D
 1 2 3 4
 1     4 5

I'd say someone has made a very poor choice of file formats!

Unless this is a fixed-width file, in which case it should be processed 
as such, rather than as a delimited one. I suppose it wouldn't hurt to 
add that feature to genfromtxt... or is it there already? Perhaps that's 
what this means:

 Have you tried using a sequence of integers for the delimiter ?
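
A small sketch of the difference, assuming a file whose fields are each
two characters wide (the last one a single character). The sample text is
made up for the illustration and has no header line.

```python
import io
import warnings

import numpy as np

# The second row has a blank second field.
text = "1 2 3 4\n1   3 4\n"

# Whitespace-delimited: runs of blanks collapse, so the second row has
# only three fields. With invalid_raise=False it is skipped with a warning.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    by_whitespace = np.genfromtxt(io.StringIO(text), invalid_raise=False)

# Fixed-width: a sequence of integers gives the width of each field,
# so the blank field is kept and read as a missing value (nan).
by_width = np.genfromtxt(io.StringIO(text), delimiter=(2, 2, 2, 1))
```

Read by whitespace, only the first row survives; read by field width, both
rows are kept and the blank field comes back as nan.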

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Questions about masked arrays

2009-10-06 Thread Pierre GM

On Oct 6, 2009, at 6:57 PM, Gökhan Sever wrote:
 Seeing a different filling value is causing confusion. Both for  
 myself, and when I try to demonstrate the usage of masked array to  
 other people.

Fair enough. I must admit that `fill_value` is a vestige from the  
previous implementation (talking pre 1.2 here), that is no longer  
really needed (cf below for more details).

 Also say, if I want to replace that one element back to its original  
 state will it use fill_value as 1e+20 or 99.?

What do you mean by 'replace back to its original state' ? Using  
`filled`, you mean ?

  2-) What is wrong with the arccos calculation? Should not that
  result the same as with cos(d) result?

 I first tested on 1.3.0, and later on my laptop using 1.4dev version  
 which is about an old month built.

 Once again the results for each arc... function

Er, I assume it's np.arccos ?
Anyway, I'm puzzled. Works like a charm here (r7438 for numpy.ma).  
Could it be that something went wrong with some ufuncs ? I didn't touch  
ma since 09/08 (thanks, svn history), so I don't think it comes from  
here... Would you mind trying a more recent svn version ?




Re: [Numpy-discussion] Questions about masked arrays

2009-10-06 Thread Gökhan Sever
On Tue, Oct 6, 2009 at 7:38 PM, Pierre GM pgmdevl...@gmail.com wrote:


 On Oct 6, 2009, at 6:57 PM, Gökhan Sever wrote:
  Seeing a different filling value is causing confusion. Both for
  myself, and when I try to demonstrate the usage of masked array to
  other people.

 Fair enough. I must admit that `fill_value` is a vestige from the
 previous implementation (talking pre 1.2 here), that is no longer
 really needed (cf below for more details).

  Also say, if I want to replace that one element back to its original
  state will it use fill_value as 1e+20 or 99.?

 What do you mean by 'replace back to its original state' ? Using
 `filled`, you mean ?


Yes, in more properly stated fashion filled :)


I[14]: c.data['Air_Temp'][4]
O[14]:
masked_array(data = --,
 mask = True,
   fill_value = 1e+20)


I[15]: c.data['Air_Temp'][4].filled()
O[15]: array(1e+20)

A little buggy, isn't it? It properly fills the whole array:

I[13]: c.data['Air_Temp'].filled()
O[13]:
array([  1.31509000e+01,   1.31309000e+01,   1.31278000e+01,
 1.31542000e+01,   1.e+06,   1.31539000e+01,
 1.31387000e+01,   1.e+06,   1.e+06,
 1.e+06,   1.31107000e+01,   1.31351000e+01,
 1.32073000e+01,   1.32562000e+01,   1.33533000e+01,
 1.33889000e+01,   1.34067000e+01,   1.32938000e+01,
 1.31962000e+01,   1.31248000e+01,   1.30411000e+01,
 1.29534000e+01,   1.28354000e+01,   1.27392000e+01,
 1.26725000e+01])




   2-) What is wrong with the arccos calculation? Should not that
   result the same as with cos(d) result?
 
  I first tested on 1.3.0, and later on my laptop using 1.4dev version
  which is about an old month built.
 
  Once again the results for each arc... function

 Er, I assume it's np.arccos ?


Sorry too much time spent in ipython -pylab :)

I[18]: arccos?
Type: ufunc
Base Class:   type 'numpy.ufunc'
String Form:   ufunc 'arccos'
Namespace:Interactive
File: /home/gsever/Desktop/python-repo/numpy/numpy/__init__.py



 Anyway, I'm puzzled. Works like a charm here (r7438 for numpy.ma).
 Could it be that something went wrng with some ufuncs ?


This I don't know :(


 I didn't touch
 ma since 09/08 (thanks, svn history), so I don't think it comes from
 here...


Yes, SVN is a very useful invention indeed.

I[6]: numpy.__version__
O[6]: '1.4.0.dev'

For some reason it doesn't list check-out revision.

Doing an ls -l reveals that those are checked-out and installed after August
13 which was a preparation for the SciPy 09 :)


Would you mind trying a more recent svn version ?


This is the last resort. I will eventually try this if I don't have any other
options left.

I confirmed the same arccos weirdness in Sage Notebook (www.sagenb.org),
where Numpy 1.3.0 is installed.








-- 
Gökhan


Re: [Numpy-discussion] Questions about masked arrays

2009-10-06 Thread Pierre GM

On Oct 6, 2009, at 9:54 PM, Gökhan Sever wrote:

  Also say, if I want to replace that one element back to its original
  state will it use fill_value as 1e+20 or 99.?

 What do you mean by 'replace back to its original state' ? Using
 `filled`, you mean ?

 Yes, in more properly stated fashion filled :)

 I[14]: c.data['Air_Temp'][4]
 O[14]:
 masked_array(data = --,
  mask = True,
fill_value = 1e+20)


 I[15]: c.data['Air_Temp'][4].filled()
 O[15]: array(1e+20)

 Little buggy, isn't it? It properly fill the whole array:

 I[13]: c.data['Air_Temp'].filled()
 O[13]:
 array([  1.31509000e+01,   1.31309000e+01,   1.31278000e+01,
  1.31542000e+01,   1.e+06,   1.31539000e+01,
  1.31387000e+01,   1.e+06,   1.e+06,
  1.e+06,   1.31107000e+01,   1.31351000e+01,
  1.32073000e+01,   1.32562000e+01,   1.33533000e+01,
  1.33889000e+01,   1.34067000e+01,   1.32938000e+01,
  1.31962000e+01,   1.31248000e+01,   1.30411000e+01,
  1.29534000e+01,   1.28354000e+01,   1.27392000e+01,
  1.26725000e+01])

Once again, when you access your 5th element, you get the special  
`masked` constant. If you fill this constant, you'll get something  
which is probably not what you want. And I would need a *REALLY*  
compelling reason to change this behavior, as it's gonna break a lot  
of things (the masked constant has been around for a while)
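
A minimal sketch of that behaviour, with a made-up four-element array
standing in for c.data['Air_Temp']. Indexing a masked element hands back
the shared `masked` constant rather than a piece of the array, so the
array's own fill_value is not carried along; filling the array first, or
taking a one-element slice, keeps it.

```python
import numpy as np

# Hypothetical stand-in for the temperature column, with one masked entry.
temps = np.ma.masked_array([13.15, 13.13, 0.0, 13.15],
                           mask=[False, False, True, False],
                           fill_value=999999.0)

# Scalar indexing of a masked entry returns the np.ma.masked constant.
elem = temps[2]
is_masked_constant = elem is np.ma.masked

# Fill the whole array, then index: the array's fill_value is used.
filled_value = temps.filled()[2]

# A one-element slice is still a masked array and keeps the fill_value.
one_elem = temps[2:3]
```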

   2-) What is wrong with the arccos calculation? Should not that

 Er, I assume it's np.arccos ?

 Sorry too much time spent in ipython -pylab :)

Well, I use ipython -pylab regularly as well, but still have the  
reflex of using np. ;)



 I[18]: arccos?
 Type: ufunc
 Base Class:   type 'numpy.ufunc'
 String Form:   ufunc 'arccos'
 Namespace:Interactive
 File: /home/gsever/Desktop/python-repo/numpy/numpy/ 
 __init__.py


 Anyway, I'm puzzled. Works like a charm here (r7438 for numpy.ma).
 Could it be that something went wrng with some ufuncs ?

 This I don't know :(

 I didn't touch
 ma since 09/08 (thanks, svn history), so I don't think it comes from
 here...

 Yes, SVN is a very useful invention indeed.

 I[6]: numpy.__version__
 O[6]: '1.4.0.dev'

 For some reason it doesn't list check-out revision.

I know, and it's bugging me as well. if you have a build directory  
somewhere, check numpy/core/__svn_version__.py

 This is the last resort. I will eventually try this if I don't any  
 other options left.

I'm gonna have difficulties fixing something that I don't see broken...  
Now, there might be something wrong in my installation. I'm gonna try to  
install 1.3.0 somewhere. Say, what Python are you using ?


Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Pierre GM

On Oct 6, 2009, at 10:08 PM, Bruce Southey wrote:
 No, just seeing what sort of problems I can create. This case is
 partly based on the fact that someone using a tab-delimited file needs
 to set delimiter='\t', otherwise it gives an error. Also, I often parse
 text files so, yes, you have to be careful with the delimiters. It
 also arises because certain programs, like spreadsheets, have an
 option to merge delimiters - actually in SAS it is the default (you need
 to specify the DSD option).

Ahah! I get it. Well, I remember that we discussed something like that a
few months ago when I started working on np.genfromtxt, and the
default of *not* merging whitespaces was requested. I'm gonna check
whether we can't put this option somewhere now...
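
For the tab-delimited case mentioned in the quoted message, a small sketch of passing the delimiter explicitly. The table contents and the column names A, B, C are made up for illustration:

```python
import io
import numpy as np

# Hypothetical tab-delimited table; the second data row has an empty field B.
text = "A\tB\tC\n1\t2\t3\n4\t\t6\n"

# An explicit tab delimiter keeps the empty field as its own column.
data = np.genfromtxt(io.StringIO(text), delimiter="\t", names=True)

print(data["A"])
print(data["B"])  # the empty field comes back as nan
```

With the delimiter given, two consecutive tabs are read as an empty (missing) field rather than being merged into one separator.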

 Anyhow, I am really impressed on how this function works.

Thx. I hope things haven't been slowed down too much.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Skipper Seabold
On Tue, Oct 6, 2009 at 10:08 PM, Bruce Southey bsout...@gmail.com wrote:
 On Tue, Oct 6, 2009 at 4:04 PM, Pierre GM pgmdevl...@gmail.com wrote:

 On Oct 6, 2009, at 4:43 PM, Christopher Barker wrote:

 Pierre GM wrote:
 I think that the default invalid_raise should be True.

 Mmh, OK, that's a +1/) for invalid_raise=true. Anybody else ?

 yup -- make it +2 -- ignoring errors and losing data by default is a
 bad idea!

 OK then, that's enough for me: I'll put invalid_raise as True by
 default. Note that a warning was emitted no matter what.



 One 'feature' is that there is no way to indicate multiple
 delimiters
 when the delimiter is whitespace.
 A B C D
 1 2 3 4
 1     4 5

 I'd say someone has made a very poor choice of file formats!

 No, just seeing what sort of problems I can create. This case is
 partly based on the fact that someone using a tab-delimited file needs
 to set delimiter='\t', otherwise it gives an error. Also, I often parse
 text files so, yes, you have to be careful with the delimiters. It
 also arises because certain programs, like spreadsheets, have an
 option to merge delimiters - actually in SAS it is the default (you need
 to specify the DSD option).


 Unless this is a fixed width file, in which case it should be processed
 as such, rather than as a delimited one. I suppose it wouldn't hurt to
 add that feature to genfromtxt... or is it there already? Perhaps that's
 what this means:

 Have you tried using a sequence of integers for the delimiter ?

 Yes, if you give a sequence of integers as delimiter, it is
 interpreted as the length of each field. At least, should be.

 More to learn and test.


There's an example on using the fixed-width delimiter here:
http://docs.scipy.org/numpy/docs/numpy.lib.io.genfromtxt/

As far as I know, it works fine.
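
A short sketch of the fixed-width form, for readers without the link at hand. The sample text is illustrative; the only feature relied on is that a sequence of integers passed as `delimiter` is read as field widths:

```python
import io
import numpy as np

# Three fields of width 4, 3 and 2 characters, with no separator between them.
text = "123456789\n   4  7 9\n   4567 9"

data = np.genfromtxt(io.StringIO(text), delimiter=(4, 3, 2))
print(data)
```

Each line is cut at character offsets 4 and 7, so the first row splits into 1234, 567 and 89.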

 Anyhow, I am really impressed on how this function works.


Agreed.  Genfromtxt and the derived are very useful.

Skipper
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Questions about masked arrays

2009-10-06 Thread Gökhan Sever
On Tue, Oct 6, 2009 at 9:22 PM, Pierre GM pgmdevl...@gmail.com wrote:


 On Oct 6, 2009, at 9:54 PM, Gökhan Sever wrote:
 
   Also say, if I want to replace that one element back to its original
   state will it use fill_value as 1e+20 or 99.?
 
  What do you mean by 'replace back to its original state' ? Using
  `filled`, you mean ?
 
  Yes, in more properly stated fashion filled :)

  I[14]: c.data['Air_Temp'][4]
  O[14]:
  masked_array(data = --,
   mask = True,
 fill_value = 1e+20)
 
 
  I[15]: c.data['Air_Temp'][4].filled()
  O[15]: array(1e+20)
 
  Little buggy, isn't it? It properly fills the whole array:
 
  I[13]: c.data['Air_Temp'].filled()
  O[13]:
  array([  1.31509000e+01,   1.31309000e+01,   1.31278000e+01,
   1.31542000e+01,   1.e+06,   1.31539000e+01,
   1.31387000e+01,   1.e+06,   1.e+06,
   1.e+06,   1.31107000e+01,   1.31351000e+01,
   1.32073000e+01,   1.32562000e+01,   1.33533000e+01,
   1.33889000e+01,   1.34067000e+01,   1.32938000e+01,
   1.31962000e+01,   1.31248000e+01,   1.30411000e+01,
   1.29534000e+01,   1.28354000e+01,   1.27392000e+01,
   1.26725000e+01])

 Once again, when you access your 5th element, you get the special
 `masked` constant. If you fill this constant, you'll get something
 which is probably not what you want. And I would need a *REALLY*
 compelling reason to change this behavior, as it's gonna break a lot
 of things (the masked constant has been around for a while)


I see your points. I don't want to give you extra work, don't worry :) It
just seems a bit bizarre:

I[27]: c.data['Air_Temp'].fill_value
O[27]: 99.005

I[28]: c.data['Air_Temp'][4].fill_value
O[28]: 1e+20

As you see, it just returns two different fill_values. I know eventually you
will be the one handling this :) it might be good to add this issue to the
tracker.




2-) What is wrong with the arccos calculation? Should not that
 
  Er, I assume it's np.arccos ?
 
  Sorry too much time spent in ipython -pylab :)

 Well, i use ipython -pylab regularly as well, but still have the
 reflex of using np. ;)



Good reflex. Saves you from making extra explanations. But it works with
just typing array, so why should I type np.array? (Ohh, my namespaces :)

It is just an IPython magic.



 
  I[18]: arccos?
  Type: ufunc
  Base Class:   type 'numpy.ufunc'
  String Form:   ufunc 'arccos'
  Namespace:Interactive
  File: /home/gsever/Desktop/python-repo/numpy/numpy/
  __init__.py
 
 
  Anyway, I'm puzzled. Works like a charm here (r7438 for numpy.ma).
  Could it be that something went wrong with some ufuncs ?
 
  This I don't know :(
 
  I didn't touch
  ma since 09/08 (thanks, svn history), so I don't think it comes from
  here...
 
  Yes, SVN is a very useful invention indeed.
 
  I[6]: numpy.__version__
  O[6]: '1.4.0.dev'
 
  For some reason it doesn't list check-out revision.

 I know, and it's bugging me as well. if you have a build directory
 somewhere, check numpy/core/__svn_version__.py


There is a build directory, but no files that contain svn :(


  This is the last resort. I will eventually try this if I don't have any
  other options left.

 I'm gonna have difficulties fixing something that I don't see broken...
 Now, there might be something wrong in my installation. I'm gonna try to
 install 1.3.0 somewhere. Say, what Python are you using ?


OK, I used meld to diff my copy of ma/core.py with the latest trunk version.
There are lots of differences :) So there is a possibility that I might have
built my local numpy before 09/08. I should renew my copy. Do you know the
link to the svn browser for numpy? Also, how are you making separate
installations without overriding other packages? I use either Sage (if I have
extra time) or SPD. They both ship with numpy 1.3.0.

Let's see how it turns out with a new build...







-- 
Gökhan


Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Skipper Seabold
On Tue, Oct 6, 2009 at 10:27 PM, Pierre GM pgmdevl...@gmail.com wrote:
snip
 Anyhow, I am really impressed on how this function works.

 Thx. I hope things haven't been slowed down too much.

In keeping with the "making some work for you" theme, I filed an
enhancement ticket for one change that we discussed and another IMO
useful addition.  http://projects.scipy.org/numpy/ticket/1238

I think it would be nice if we could do

data = np.genfromtxt(SomeFile, dtype=float, names=['var1', 'var2',
'var3' ...])

So that float is paired with each variable name.  Also, the one that
came up earlier of

data = np.genfromtxt(SomeFile, dtype=(int, int, float), names=
['var1','var2','var3'])

I'm not completely convinced on this one though, since dtype =
"i8,i8,f8" works.  I don't know how much confusion it would add
to have the dtype argument accept a non-valid dtype construction.
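
For reference, a small sketch of the string form that the message says already works. The sample data is made up, and the field names are simply the `var1`..`var3` placeholders used above:

```python
import io
import numpy as np

# Hypothetical whitespace-delimited data: two integer columns and one float.
text = "1 2 2.5\n3 4 4.5\n"

# The comma-separated dtype string gives one type per column;
# `names` then labels the three fields.
data = np.genfromtxt(io.StringIO(text), dtype="i8,i8,f8",
                     names=["var1", "var2", "var3"])

print(data.dtype.names)
print(data["var1"], data["var3"])
```

The result is a structured array, so each column can be pulled out by name.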

Skipper

PS.  Is it bad form for me to go ahead and assign these kinds of
tickets to you if you're going to be working on them, or do you get
pinged when any ticket is filed?


Re: [Numpy-discussion] Questions about masked arrays

2009-10-06 Thread Pierre GM

On Oct 6, 2009, at 10:58 PM, Gökhan Sever wrote:

 I see your points. I don't want to give you extra work, don't
 worry :) It just seems a bit bizarre:

 I[27]: c.data['Air_Temp'].fill_value
 O[27]: 99.005

 I[28]: c.data['Air_Temp'][4].fill_value
 O[28]: 1e+20

 As you see, it just returns two different fill_values.

I know, but I hope you see the difference: in the first line, you
access the `fill_value` of the array. In the second, you access the
`fill_value` of the `masked` constant. Each time you access a masked
element of an array with __getitem__, you get the masked constant. We
could force the constant to inherit the fill_value of the array that
calls __getitem__, but as `masked` is a single shared constant, the change
would be propagated everywhere it is used.

 I know eventually you will be the one handling this :) it might be  
 good to add this issue to the tracker.

Go for it, but don't expect anything before the release of 1.4.0 (in  
the next few months)


  This is the last resort. I will eventually try this if I don't have any
  other options left.

 I'm gonna have difficulties fixing something that I don't see broken...
 Now, there might be something wrong in my installation. I'm gonna try to
 install 1.3.0 somewhere. Say, what Python are you using ?

 OK, I use meld to diff my copy of ma/core.py with the latest trunk  
 version. There are lots of differences :) So there is a possibility  
 that I might have built my local numpy before 09/08. I should renew  
 my copy. Do you know the link of svn browser for the numpy? I don't  
 know how you are making separate installations without overriding  
 other package? I either use Sage (if I have extra time) or SPD. They  
 are both shipped with numpy 1.3.0.

Do yourself a favor and install virtualenv and virtualenvwrapper.
That way, several versions of the same package can coexist without
interference. Oh, and install pip while you're at it:

http://pypi.python.org/pypi/virtualenv
http://www.doughellmann.com/projects/virtualenvwrapper/
http://pypi.python.org/pypi/pip







Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Pierre GM

On Oct 6, 2009, at 11:01 PM, Skipper Seabold wrote:

 In keeping with the "making some work for you" theme, I filed an
 enhancement ticket for one change that we discussed and another IMO
 useful addition.  http://projects.scipy.org/numpy/ticket/1238

 I think it would be nice if we could do

 data = np.genfromtxt(SomeFile, dtype=float, names=['var1', 'var2',
 'var3' ...])

 So that float is paired with each variable name.  Also, the one that
 came up earlier of

 data = np.genfromtxt(SomeFile, dtype=(int, int, float), names=
 ['var1','var2','var3'])

 I'm not completely convinced on this one though, since dtype =
 "i8,i8,f8" works.  I don't know how much confusion it would add
 to have the dtype argument accept a non-valid dtype construction.

Actually, it's rather straightforward. I already have something that
supports dtype=(int,int,float) (far easier to handle than i4,i4,f8);
I need to tweak a couple of things when the names don't match before
posting. Pairing the names with the dtype is pretty neat, and would
be quite easy to implement.



 PS.  Is it bad form for me to go ahead and assign these kinds of
 tickets to you if you're going to be working on them, or do you get
 pinged when any ticket is filed?

Go for it. I'm only notified when a ticket is assigned to me directly.



Re: [Numpy-discussion] Questions about masked arrays

2009-10-06 Thread Gökhan Sever
On Tue, Oct 6, 2009 at 10:15 PM, Pierre GM pgmdevl...@gmail.com wrote:


 On Oct 6, 2009, at 10:58 PM, Gökhan Sever wrote:
 
  I see your points. I don't want to give you extra work, don't
  worry :) It just seems a bit bizarre:
 
  I[27]: c.data['Air_Temp'].fill_value
  O[27]: 99.005
 
  I[28]: c.data['Air_Temp'][4].fill_value
  O[28]: 1e+20
 
  As you see, it just returns two different fill_values.

 I know, but I hope you see the difference : in the first line, you
 access the `fill_value` of the array. In the second, you access the
 `fill_value` of the `masked` constant. Each time you access a masked
 element of an array with __getitem__, you get the masked constant. We
 could force the constant to inherit the fill_value of the array that
 calls __getitem__, but it'd be propagated.


Got these points. Thanks

It took a while; I had to rebuild matplotlib to use ipython -pylab :)

I built numpy again from the trunk source, and the arccos problem (as well as
the one with the other arc functions) has disappeared. It all started with
trying to calculate great circle navigation equations using masked arrays, and
seeing this range_calc function return some weird results where it was not
supposed to. I then traced the error down to arccos.

def range_calc(lat_r, lat_t, long_r, long_t):
    range = degrees(arccos(sin(radians(lat_r)) * sin(radians(lat_t)) +
        cos(radians(lat_r)) * cos(radians(lat_t)) *
        cos(radians(long_t - long_r)))) * F
    azimuth = degrees(arccos((sin(radians(lat_t)) - cos(radians(range / F))
        * sin(radians(lat_r))) / (sin(radians(range / F)) *
        cos(radians(lat_r)))))

    if long_t - long_r < 0:
        azimuth = 360 - azimuth

    return range, azimuth

Happy now ;)
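
For anyone wanting to try this on masked arrays, here is a self-contained sketch of the same formulas. It is not the original code: the name `range_calc_ma` is made up, the scalar `if` is swapped for `np.ma.where` so that it works element-wise, and `F` is not defined anywhere in the thread, so the value below is only a placeholder assumption.

```python
import numpy as np

# Assumption: the thread never defines F. 60.0 (nautical miles per degree of
# arc) is used here purely as a placeholder.
F = 60.0


def range_calc_ma(lat_r, lat_t, long_r, long_t):
    """Range and azimuth from a reference point (r) to targets (t); inputs in degrees."""
    lat_r = np.radians(lat_r)
    lat_t = np.radians(lat_t)
    dlong = np.ma.asarray(long_t) - long_r

    # Angular distance along the great circle, in radians.
    arc = np.ma.arccos(np.sin(lat_r) * np.sin(lat_t) +
                       np.cos(lat_r) * np.cos(lat_t) * np.cos(np.radians(dlong)))

    azimuth = np.degrees(np.ma.arccos(
        (np.sin(lat_t) - np.cos(arc) * np.sin(lat_r)) /
        (np.sin(arc) * np.cos(lat_r))))

    # Element-wise version of the scalar `if`: targets west of the reference.
    azimuth = np.ma.where(dlong < 0, 360.0 - azimuth, azimuth)
    return np.degrees(arc) * F, azimuth


# Made-up targets: due east, due west, and one masked latitude.
lat_t = np.ma.array([0.0, 0.0, 10.0], mask=[False, False, True])
long_t = np.ma.array([1.0, -1.0, 5.0])
rng, az = range_calc_ma(0.0, lat_t, 0.0, long_t)
print(rng)
print(az)
```

The masked latitude stays masked in both outputs, which is the behaviour the manual masks were emulating.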




  I know eventually you will be the one handling this :) it might be
  good to add this issue to the tracker.

 Go for it, but don't expect anything before the release of 1.4.0 (in
 the next few months)


I will do this shortly.



 
   This is the last resort. I will eventually try this if I don't have any
   other options left.
 
  I'm gonna have difficulties fixing something that I don't see broken...
  Now, there might be something wrong in my installation. I'm gonna try to
  install 1.3.0 somewhere. Say, what Python are you using ?
 
  OK, I use meld to diff my copy of ma/core.py with the latest trunk
  version. There are lots of differences :) So there is a possibility
  that I might have built my local numpy before 09/08. I should renew
  my copy. Do you know the link of svn browser for the numpy? I don't
  know how you are making separate installations without overriding
  other package? I either use Sage (if I have extra time) or SPD. They
  are both shipped with numpy 1.3.0.

 Do yourself a favor and install virtualenv and virtualenvwrapper.
 That way, several versions of the same package can coexist without
 interference. Oh, and install pip while you're at it:

 http://pypi.python.org/pypi/virtualenv
 http://www.doughellmann.com/projects/virtualenvwrapper/
 http://pypi.python.org/pypi/pip



This is the first time I am hearing of pip. I will give these tools a try,
probably this weekend.

Thanks again for your clarifications.

Now, I have to update my advisor's numpy to make his code run correctly.
In the first place, his code was running properly using manually created
masks for numpy arrays. Using masked arrays, we broke it. Now we know
what is causing the error. It feels good :)









-- 
Gökhan


Re: [Numpy-discussion] Questions about masked arrays

2009-10-06 Thread Gökhan Sever
Created the ticket http://projects.scipy.org/numpy/ticket/1253

Could you tell me briefly what was the source of the leak in the arccos case?

And how do you write test code for these cases?


-- 
Gökhan


Re: [Numpy-discussion] Questions about masked arrays

2009-10-06 Thread Pierre GM

On Oct 7, 2009, at 12:10 AM, Gökhan Sever wrote:

 Created the ticket http://projects.scipy.org/numpy/ticket/1253

Want even more confusion ?
>>> x = ma.array([1,2,3],mask=[0,1,0], dtype=int)
>>> x[0].dtype
dtype('int64')
>>> x[1].dtype
dtype('float64')
>>> x[2].dtype
dtype('int64')

Yet another illustration of the masked constant... The more I think  
about it, the more I think we should have a specific object  
(MaskedConstant) that would do nothing but tell us that it is masked.


 Could you tell me briefly what was the source of leak in arccos case?

No idea, as I still haven't figured out why you were having the problem in
the first place.

 And how do you write a test code for these cases?

assert(np.arccos(ma.masked), ma.masked) would be the simplest.





Re: [Numpy-discussion] Questions about masked arrays

2009-10-06 Thread Gökhan Sever
On Tue, Oct 6, 2009 at 11:33 PM, Pierre GM pgmdevl...@gmail.com wrote:


 On Oct 7, 2009, at 12:10 AM, Gökhan Sever wrote:

  Created the ticket http://projects.scipy.org/numpy/ticket/1253

 Want even more confusion ?
 >>> x = ma.array([1,2,3],mask=[0,1,0], dtype=int)
 >>> x[0].dtype
 dtype('int64')
 >>> x[1].dtype
 dtype('float64')
 >>> x[2].dtype
 dtype('int64')

 Yet another illustration of the masked constant... The more I think
 about it, the more I think we should have a specific object
 (MaskedConstant) that would do nothing but tell us that it is masked.


Confusing indeed.

One more from me:

I[1]: a = np.arange(5)

I[2]: mask = 999

I[6]: a[3] = 999

I[7]: am = ma.masked_equal(a, mask)

I[8]: am
O[8]:
masked_array(data = [0 1 2 -- 4],
             mask = [False False False  True False],
       fill_value = 999999)

Where does this fill_value come from? To me it is a little confusing having a
value and a fill_value in the masked array method arguments.




  Could you tell me briefly what was the source of leak in arccos case?

 No idea, as I still haven't figured why you were having the problem in
 the first place


You can probably pinpoint the error by testing a 1.3.0 version of numpy. I
guess there are not too many users of arc functions with masked arrays around :)



  And how do you write a test code for these cases?

 assert(np.arccos(ma.masked), ma.masked) would be the simplest.


Good to know this. The more I spend time with numpy the more I understand
the importance of testing the code automatically. This said, I still find
the test-driven-development approach somewhat bizarre. Start only by writing
test code and keep implementing your code until all the tests are satisfied.
Very interesting...These software engineers...









-- 
Gökhan


Re: [Numpy-discussion] Questions about masked arrays

2009-10-06 Thread Pierre GM

On Oct 7, 2009, at 1:12 AM, Gökhan Sever wrote:
 One more from me:
 I[1]: a = np.arange(5)
 I[2]: mask = 999
 I[6]: a[3] = 999
 I[7]: am = ma.masked_equal(a, mask)

 I[8]: am
 O[8]:
 masked_array(data = [0 1 2 -- 4],
              mask = [False False False  True False],
        fill_value = 999999)

 Where does this fill_value come from? To me it is little confusing  
 having a value and fill_value in masked array method arguments.

Because the two are unrelated. The `fill_value` is the value used to
fill the masked elements (that is, the missing entries).
When you create a masked array, you get a `fill_value`, whose actual
value is defined by default from the dtype of the array: for int, it's
999999, for float, 1e+20, you get the idea.
The value you used for masking is different, it's just whatever value
you consider invalid. Now, if I follow you, you would expect the value
in `masked_equal(array, value)` to be the `fill_value` of the output.
That's an idea, would you mind filing a ticket/enhancement and
assigning it to me? So that I don't forget.
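
A short sketch separating the two roles. The `fill_value` reported right after `masked_equal` may differ between NumPy versions, so here it is set explicitly; the -1 is an arbitrary choice for illustration:

```python
import numpy as np

a = np.arange(5)
a[3] = 999                        # 999 plays the role of the "invalid" marker

am = np.ma.masked_equal(a, 999)   # the value used for masking
am.fill_value = -1                # the value used for filling, set explicitly

print(am.mask)
print(am.filled())                # masked entries replaced by am.fill_value
print(am.filled(0))               # or by a one-off value

# Defaults derived from the type when no fill_value is given.
print(np.ma.default_fill_value(1))
print(np.ma.default_fill_value(1.0))
```

The masking value only decides which entries are hidden; the `fill_value` only matters once `filled()` is called.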


 Probably you can pin-point the error by testing a 1.3.0 version  
 numpy. Not too many arc function with masked array users around I  
 guess :)

Will try, but if it ain't broken, don't fix it...

 assert(np.arccos(ma.masked), ma.masked) would be the simplest.

(and in fact, it'd be assert(np.arccos(ma.masked) is ma.masked) in  
this case).
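
As a sketch of what such a test could look like as a plain function: the function name is made up, and it calls `np.ma.arccos` rather than `np.arccos`, which is a substitution made here and not what the thread used.

```python
import numpy as np


def test_arccos_masked():
    # The masked constant goes in, the same constant should come out.
    assert np.ma.arccos(np.ma.masked) is np.ma.masked

    # Masked and out-of-domain entries (|x| > 1) should both end up masked.
    x = np.ma.array([0.5, 2.0, 1.0], mask=[False, False, True])
    result = np.ma.arccos(x)
    assert np.ma.getmaskarray(result).tolist() == [False, True, True]
    assert np.isclose(result[0], np.arccos(0.5))


test_arccos_masked()
```

Note the `is` comparison: `assert(a, b)` with two arguments tests a non-empty tuple, which is always true.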


 Good to know this. The more I spend time with numpy the more I  
 understand the importance of testing the code automatically. This  
 said, I still find the test-driven-development approach somewhat  
 bizarre. Start only by writing test code and keep implementing your  
 code until all the tests are satisfied. Very interesting...These  
 software engineers...

Bah, it's not a rule cast in iron... You can start writing your code  
but do write the tests at the same time. It's the best way to make  
sure you're not breaking something later on.

