On 10/28/10 5:29 PM, Robert Kern wrote:
> On Thu, Oct 28, 2010 at 15:17, Ian Stokes-Rees
> <ijsto...@hkl.hms.harvard.edu> wrote:
>> I have an ndarray with named dimensions. I find myself writing some
>> fairly laborious code with lots of square brackets and quotes. It seems
>> like it wouldn't be such a big deal to overload __getattribute__ so
>> instead of doing:
>>
>> r = genfromtxt('results.dat',dtype=[('a','int'), ('b', 'f8'),
>> ('c','int'), ('d', 'a20')])
>> scatter(r[r['d'] == 'OK']['a'], r[r['d'] == 'OK']['b'])
>>
>> I could do:
>>
>> scatter(r[r.d == 'OK'].a, r[r.d == 'OK'].b)
>>
>> which is really a lot clearer. Is something like this already possible
>> somehow?
> See recarray which uses __getattribute__.
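
For reference, a minimal sketch of the recarray route suggested above. The sample rows are made up and stand in for results.dat, and the 'd' field is declared as 'U20' rather than 'a20' on the assumption of a current Python 3 / NumPy setup; neither detail comes from the original message.

```python
import numpy as np

# Hypothetical sample data standing in for results.dat (not from the thread).
r = np.array(
    [(1, 0.5, 10, "OK"), (2, 1.5, 20, "FAIL"), (3, 2.5, 30, "OK")],
    dtype=[("a", "int"), ("b", "f8"), ("c", "int"), ("d", "U20")],
)

# Viewing the structured array as a recarray exposes fields as attributes.
rec = r.view(np.recarray)

# Attribute-style equivalent of r[r['d'] == 'OK']['a'] and ...['b'].
ok = rec[rec.d == "OK"]
print(ok.a, ok.b)
```

The view shares memory with the original array, so no data is copied; the cost is the slower attribute lookup mentioned in the reply.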

Thanks -- I'll look into it.

>> Is there some reason not to map __getattr__ to __getitem__?
> Using __getattribute__ tends to slow down almost all operations on the
> array substantially. Perhaps __getattr__ would work better, but all of
> the methods and attributes would mask the fields. If you can find a
> better solution that doesn't have such an impact on normal
> performance, we'd be happy to hear it.

But wouldn't the performance hit only come when I use it in this way?
__getattr__ is only called if the named attribute is *not* found (I
guess it falls off the end of the case statement, or is the result of
the attribute hash table "miss"). So the proviso is "this shortcut only
works if the field names are distinct from any methods or attributes on
the ndarray object (or its sub-classes)".

You've gotta admit that the readability of the code goes up *a lot* with
the alternative I'm proposing.

Ian

--
Ian Stokes-Rees, PhD                 W: http://portal.nebiogrid.org
ijsto...@hkl.hms.harvard.edu         T: +1.617.432.5608 x75
NEBioGrid, Harvard Medical School    C: +1.617.331.5993
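
A rough sketch of the __getattr__ idea discussed above, written as an ndarray subclass. The class name FieldAttrArray and the sample data are hypothetical and not part of NumPy or the original thread; this only illustrates the fallback behaviour and the masking proviso, not a proposed implementation.

```python
import numpy as np


class FieldAttrArray(np.ndarray):
    """Hypothetical subclass: unknown attributes fall back to field lookup."""

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal attribute lookup fails,
        # so existing ndarray methods and attributes never reach this code.
        names = self.dtype.names
        if names is not None and name in names:
            return self[name]
        raise AttributeError(name)


# Made-up sample data; 'U20' is assumed in place of the thread's 'a20'.
data = np.array(
    [(1, 0.5, "OK"), (2, 1.5, "FAIL"), (3, 2.5, "OK")],
    dtype=[("a", "i8"), ("b", "f8"), ("d", "U20")],
).view(FieldAttrArray)

ok = data[data.d == "OK"]
print(ok.a, ok.b)

# The proviso in practice: a field called 'size' is masked by ndarray.size.
masked = np.array([(7, 2.0)], dtype=[("size", "i8"), ("b", "f8")]).view(FieldAttrArray)
print(masked.size)     # the element count from ndarray.size, not the field
print(masked["size"])  # the field is still reachable through indexing
```

Because the fallback runs only on a failed lookup, ordinary attribute access is unaffected; the trade-off is that any field sharing a name with an ndarray attribute must still be read with square brackets.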
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion