On Thu, Jul 8, 2010 at 3:13 AM, Lluís <xscr...@gmx.net> wrote:
> Rob Speer writes:
>
> >>>>> arr.country.named('Netherlands').year.named(2010)
> >>>>> arr.country.named('Spain').year.named(slice(1994, 2010))
> >>>>> arr.year.named(2006).country[0:2]
>
> This looks too verbose to me.
>
> As axes always have a total order, I'd go for the most compact representation
> (assuming 'country' is the first axis, and 'year' the second one):
>
>   arr['Netherlands','2010']
>   arr['Spain','1994':'2010']
>   arr[0:2,'2006']
>
> This is my current implementation, which also allows for slices with mixed
> integers and names everywhere.
>
> I understand this might not be the desired default behaviour, as it requires
> looking into the types of every item in '__getitem__', and this might be a
> performance issue (although my current implementation tries to optimize for
> the case of integer indexes).
>
> Thus, we can use something in the middle:
>
>   arr[0,1]
>   arr.names['Netherlands',2010]  # I'd rather go for 'names' instead of 'ticks'
>   arr.country['Spain'].year[1994:2010]
>
> The default '__getitem__' still has full speed, but accessing the 'names'
> attribute allows for indexing along the lines of my previous example, while
> still allowing access through the axis name without requiring an explicit
> 'slice'.
>
> Although this is not my preferred syntax, I think it is a good compromise,
> and I could always subclass this to redirect the default '__getitem__' into
> 'names.__getitem__'.
>
> Btw, I store the name-to-index translations in an ordered dict (indexed by
> name), such that I can also provide an 'arr.iteritems' method that returns
> tuples with the 'name/tick' and the array contents at that index. In the above
> syntax, this would probably be 'arr.<axisname>.iteritems'.
>
> Another feature I like is being able to translate back and forth from
> names/ticks to integers, which I do through my 'Dimension.__getitem__' method
> (Dimension is the equivalent of datarray's 'Axis').
>
> PS: I also have a separation between axes and their naming, meaning that I can
> have a single axis with both 'country' and 'year', such that I would index
> with 'Netherlands-2010' (other examples do make more sense), but still be able
> to access them separately (this reduces the size of the full ndarray, as there
> is no need for so many NaNs to make the ndarray homogeneous in size, and it
> brings the ndarray closer to the structuring of data in the mind of the user).
>
> Read you,
> Lluis
>
> --
> "And it's much the same thing with knowledge, for whenever you learn
> something new, the whole world becomes that much richer."
> -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
> Tollbooth
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> arr['Netherlands','2010']

Isn't this the __getitem__ action we were trying to avoid?

--Josh
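
For reference, the "something in the middle" variant quoted above could look roughly like the sketch below. It is only an illustration, not datarray's or Lluís's actual code: `Axis`, `NamedTable`, `_NamesIndexer`, `position`, `name` and `translate` are names made up here, and a nested list stands in for the ndarray. Two things the thread does not settle are assumed: every key passed to `names` is treated as a name (so 2010 means the year, not position 2010), and named slices are half-open like ordinary Python slices.

```python
from collections import OrderedDict


class Axis:
    """Hypothetical axis: an axis name plus an ordered name -> position map."""

    def __init__(self, axis_name, ticks):
        self.axis_name = axis_name
        self._ticks = list(ticks)
        # Ordered dict indexed by name, as described in the quoted message.
        self._positions = OrderedDict((t, i) for i, t in enumerate(self._ticks))

    def position(self, name):
        """Translate a name/tick into its integer position."""
        return self._positions[name]

    def name(self, position):
        """Translate an integer position back into its name/tick."""
        return self._ticks[position]

    def translate(self, key):
        """Turn a name, or a slice of names, into a positional key."""
        if isinstance(key, slice):
            start = None if key.start is None else self.position(key.start)
            stop = None if key.stop is None else self.position(key.stop)
            return slice(start, stop, key.step)
        return self.position(key)


class _NamesIndexer:
    """Accessor behind `arr.names[...]`: the only place names are looked up."""

    def __init__(self, table):
        self._table = table

    def __getitem__(self, key):
        positions = tuple(
            axis.translate(k) for axis, k in zip(self._table.axes, key)
        )
        return self._table[positions]


class NamedTable:
    """Toy 2-D stand-in for a labelled ndarray, stored as a nested list."""

    def __init__(self, rows, axes):
        self.rows = [list(r) for r in rows]
        self.axes = tuple(axes)
        self.names = _NamesIndexer(self)

    def __getitem__(self, key):
        # Default path: positional keys only, no name lookup at all.
        row_key, col_key = key
        picked = self.rows[row_key]
        if isinstance(row_key, slice):
            return [row[col_key] for row in picked]
        return picked[col_key]


countries = Axis("country", ["Netherlands", "Spain"])
years = Axis("year", [2008, 2009, 2010])
arr = NamedTable([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], (countries, years))

by_position = arr[0, 2]                       # plain integer indexing
by_name = arr.names["Netherlands", 2010]      # same cell, addressed by name
spain_range = arr.names["Spain", 2008:2010]   # named, half-open slice
```

With this split, `arr['Netherlands','2010']` on the default `__getitem__` would simply fail, which is the behaviour Josh's question is about; only a subclass that redirects `__getitem__` to `names.__getitem__` would pay the per-key translation cost.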