Hi -- We are subclassing np.rec.recarray and are confused about how some of its methods relate to (and differ from) the analogous methods of its parent, np.ndarray. Below are specific questions about the __eq__, __getitem__ and view methods. We'd appreciate answers to these questions and/or more general points that we may not be understanding about subclassing np.ndarray (and np.rec.recarray).
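For concreteness, the subclasses mentioned in the questions below (NewArray, NewRecarray) attach an 'info' attribute roughly as follows. This is only a minimal sketch, assuming the usual __new__ / __array_finalize__ recipe for ndarray subclasses; the class body, the info=None default and the sample data are illustrative, not our exact code.

```python
import numpy as np


class NewArray(np.ndarray):
    """Hypothetical ndarray subclass carrying an extra 'info' attribute."""

    def __new__(cls, data, info=None):
        # View the input data as an instance of this subclass.
        obj = np.asarray(data).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        # obj is None only when the array is built via ndarray.__new__
        # directly; for views and slices it is the source array.
        if obj is None:
            return
        self.info = getattr(obj, "info", None)


x = NewArray([1, 2, 3], info="some metadata")
sliced = x[1:]                # still a NewArray, 'info' carried over
plain = x.view(np.ndarray)    # base-class view of the same memory
```

The last line is the call that question 4 is about: the result is a plain ndarray over the same data, without the 'info' attribute.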
---

1) Suppose I have a recarray object, x. How come np.ndarray.__getitem__(x, 'column_name') returns a recarray object rather than an ndarray? e.g.,

    In [230]: x = np.rec.fromrecords([(1,'dd'), (2,'cc')], names=['a','b'])

    In [231]: np.ndarray.__getitem__(x, 'a')
    Out[231]: rec.array([1, 2])

    In [232]: np.ndarray.__getitem__(x, 'a').dtype
    Out[232]: dtype('int32')

The returned object is a recarray, but it does not have a structured dtype. This generally seems to be the case when passing an instance of a subclass of np.ndarray (such as an np.rec.recarray object) to np.ndarray.__getitem__.

---

2)a) When I use the __getitem__ method of recarray to get an individual column, the returned object is an ndarray when the column is a numeric type, but it is a recarray when the column is a string type. Why doesn't __getitem__ always return an ndarray for an individual column? e.g.,

    In [175]: x = np.rec.fromrecords([(1,'dd'), (2,'cc')], names=['a','b'])

    In [176]: x['a']
    Out[176]: array([1, 2])

    In [177]: x['b']
    Out[177]: rec.array(['dd', 'cc'], dtype='|S2')

2)b) Suppose I have a subclass of recarray, NewRecarray, that attaches some new attribute, e.g. 'info'.

    x = NewRecarray(data, names = ['a','b'], formats = '<i4, |S2')

Now say I want to use recarray's __getitem__ method to get an individual column. Then

    x['a'] is an ndarray
    x['b'] is a NewRecarray and x['b'].info == x.info

Is this the expected / proper behavior? Is there something wrong with the way I've subclassed recarray?

---

3)a) If I have two recarrays with the same len and column headers, the __eq__ method returns the rich comparison. Why is the result a recarray rather than an ndarray?

    In [162]: x = np.rec.fromrecords([(1,'dd'), (2,'cc')], names=['a','b'])

    In [163]: y = np.rec.fromrecords([(1,'dd'), (2,'cc')], names=['a','b'])

    In [164]: x == y
    Out[164]: rec.array([ True, True], dtype=bool)

3)b) Suppose I have a subclass of recarray, NewRecarray, that attaches some new attribute, e.g. 'info'.

    x = NewRecarray(data)
    y = NewRecarray(data)
    z = x == y

Then z is a NewRecarray object and z.info == x.info. Is this the expected / proper behavior? Is there something wrong with the way I've subclassed recarray? [Dan Yamins asked this a couple days ago]

---

4) Suppose I have a subclass of np.ndarray, NewArray, that attaches some new attribute, e.g. 'info'. When I view a NewArray object as an ndarray, the result has no 'info' attribute. Is the memory corresponding to the 'info' attribute garbage collected? What happens to it?

    x = NewArray(data)
    x.view(np.ndarray)    # has no 'info' attribute

---

Thanks for any help! (And thanks for reading if you read any or all of this!)

Elaine
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion