On Tue, May 24, 2011 at 6:39 PM, Craig Yoshioka <[email protected]> wrote:
> Hi all,
> I've read some discussions about adding labeled axes, and even ticks, to
> numpy arrays (such as in Luis' dataarray).
> I have recently found that the ability to label axes would be very helpful
> to me, but I'd like to keep the implementation as lightweight as possible.
> The reason I would find this useful is because I am writing a ndarray
> subclass that loads image/volume file formats into numpy arrays. Some of
> these files might have multiple images/volumes, I'll call them channels, and
> also may have an additional dimension for vectors associated with each
> pixel/voxel, like color. The max dims of the array would then be 5.
> Example: data = ndarray([1023,128,128,128,3]) might mean
> (channels,z,y,x,rgb) for one array. Now I want to keep as much of the fancy
> indexing capabilities of numpy as I can, but I am finding it difficult to
> track the removal of axes that can occur from indexing. For example
> data[2,2] would return an array of shape (128,128,3), or the third slice
> through the third volume in the dataset, but the returned array has lost the
> meaning associated with its axes, so saving it back out would require manual
> relabeling of the axes. I'd like to be able to track the axes as metadata
> and retain all the fancy numpy indexing.
> There are two ways I could accomplish this with minimal code on the python
> side:
> One would be if indexing of the array always returned an array of the same
> dimensionality, that is data[2,2] returned an array of shape
> (1,1,128,128,3). I could then delete the degenerate axes labels from the
> metadata, and return the compressed array, resulting in the same output:
>
> class Data(np.ndarray):
>     def __getitem__(self,indices):
>         data = np.ndarray.__getitem__(self,indices,donotcompress=True) # as an example
>         data.axeslabels = [label for label,dim in zip(self.axeslabels,data.shape) if dim > 1]
>         return data.compress()
>     def __getslice__(self,s1,s2,step):
>         # trivial case
>
> Another approach would be if there is some function in the numpy internals
> that I could use to get the needed information before calling the ndarray's
> __getitem__ function:
>
> class Data(np.ndarray):
>     def __getitem__(self,indices):
>         unique = np.uniqueIndicesPerDimension(indices)
>         data = np.ndarray.__getitem__(self,indices)
>         data.axeslabels = [label for label,dim in zip(self.axeslabels, unique) if dim > 1]
>         return data
>
> Finally, I could implement my own parser for the passed indices to figure
> this out myself. This would be bad since I'd have to recreate a lot of the
> same code that must go on inside numpy, and it would be slower, error-prone,
> etc. :
>
> class Data(np.ndarray):
>     def __getitem__(self,indices):
>         indices = self.uniqueDimensionIndices(indices)
>         data = np.ndarray.__getitem__(self,indices)
>         data.axeslabels = [label for label,dim in zip(self.axeslabels,indices) if dim > 1]
>         return data
>     def uniqueDimensionIndices(self,indices):
>         if isinstance(indices,int):
>             indices = (indices,)
>         if isinstance(indices,tuple):
>             ....
>         elif isinstance(indices,list):
>             ...
>
> Is there anything in the numpy internals already that would allow me to do
> #1 or #2?, I don't think #3 is a very good option.
> Thanks!
>
> _______________________________________________
> NumPy-Discussion mailing list
> [email protected]
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
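
On option #3: for what it's worth, here is a rough sketch of how small the index parsing can stay if it is restricted to basic indexing (integers, slices, Ellipsis and np.newaxis). This is not taken from datarray or from the numpy internals, and the names LabeledArray and surviving_labels are made up for illustration. It decides which labels survive from the type of each index rather than from a "dim > 1" test, so an axis that genuinely has length 1 keeps its label. Fancy (list or array) indexing is not handled; the sketch just resets the labels to None in that case.

```python
import numpy as np


def surviving_labels(index, labels):
    """Labels of the axes left after basic indexing (hypothetical helper)."""
    if not isinstance(index, tuple):
        index = (index,)
    # Expand an Ellipsis into the full slices it stands for.
    n_explicit = sum(1 for i in index if i is not Ellipsis and i is not None)
    expanded = []
    for i in index:
        if i is Ellipsis:
            expanded.extend([slice(None)] * (len(labels) - n_explicit))
        else:
            expanded.append(i)
    # Trailing axes that are not mentioned are implicitly sliced in full.
    n_consumed = sum(1 for i in expanded if i is not None)
    expanded.extend([slice(None)] * (len(labels) - n_consumed))

    remaining = iter(labels)
    out = []
    for i in expanded:
        if i is None:                      # np.newaxis: new, unlabeled axis
            out.append(None)
        elif isinstance(i, slice):         # a slice keeps its axis
            out.append(next(remaining))
        elif isinstance(i, (int, np.integer)) and not isinstance(i, bool):
            next(remaining)                # an integer removes its axis
        else:
            raise TypeError("only basic indexing is handled in this sketch")
    return tuple(out)


class LabeledArray(np.ndarray):
    """ndarray subclass with one label per axis (illustrative only)."""

    def __new__(cls, data, axeslabels):
        obj = np.asarray(data).view(cls)
        if len(axeslabels) != obj.ndim:
            raise ValueError("need exactly one label per axis")
        obj.axeslabels = tuple(axeslabels)
        return obj

    def __array_finalize__(self, obj):
        # Inherit labels only when they still match the number of axes.
        labels = getattr(obj, "axeslabels", None)
        ok = labels is not None and len(labels) == self.ndim
        self.axeslabels = labels if ok else None

    def __getitem__(self, index):
        result = super().__getitem__(index)
        if not isinstance(result, LabeledArray) or self.axeslabels is None:
            return result                  # scalar result or no labels to track
        try:
            result.axeslabels = surviving_labels(index, self.axeslabels)
        except TypeError:
            result.axeslabels = None       # fancy indexing: labels unknown here
        return result


data = LabeledArray(np.zeros((4, 5, 6, 7, 3)),
                    ("channels", "z", "y", "x", "rgb"))
print(data[2, 2].axeslabels)    # ('y', 'x', 'rgb')
print(data[..., 0].axeslabels)  # ('channels', 'z', 'y', 'x')
```

Operations that change the number of axes without going through indexing (sum, reshape and so on) simply drop the labels in this sketch, which is one of the things a fuller project has to deal with properly.
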

I would recommend joining, or at least following, the datarray project. I
believe it will do what you want, and we are actively working to build out
this functionality. Here are some links:

Datarray on GitHub:
https://github.com/fperez/datarray

Docs:
http://fperez.github.com/datarray-doc/

A podcast we did about a recent meeting on datarray:
http://inscight.org/2011/05/18/episode_13/

Other projects to consider using: larry and pandas. These support data
alignment, which you may not care about. In pandas I'm only concerned with
data with ndim <= 3, which is a bit specific to
statistics/econometrics/finance applications.

- Wes
