On Fri, Jul 9, 2010 at 1:17 PM, Joshua Holbrook <[email protected]> wrote:
> On Fri, Jul 9, 2010 at 11:42 AM, Rob Speer <[email protected]> wrote:
>> Now, the one part I've implemented that I just made up instead of
>> looking to the SciPy consensus (because there was no SciPy consensus)
>> was how to refer to multiple labeled axes without repeating ".axis"
>> all over the place. My choice, which I call "magical axis attributes",
>> is to have arr.somelabel == arr.axis.somelabel whenever it doesn't
>> mean something else. This turns the call
>>   arr.axis.country.named['Netherlands'].axis.year[-1]
>> into:
>>   arr.country.named['Netherlands'].year[-1]
>>
>> I got a message from Fernando Perez saying that he didn't like the
>> magical axis attributes, for the expected reason that it's
>> inconsistent. You shouldn't have to refer to your axis differently
>> just because you called it something like "mean". Another problem that
>> just occurred to me is that datarray-using code could break just
>> because DataArray, or even ndarray itself, grew a new method.
>>
>> I like the syntax that magical attributes provide, but I'm willing to
>> consider other options. Here's one:
>>
>> The __getattr__ only does its magic on attribute names that end in
>> "_index" or "_named", which should not conflict with other method
>> names. "arr.foo_index[3]" is the same as "arr.axis.foo[3]".
>> Furthermore, "arr.foo_named['bar']" is the same as
>> "arr.axis.foo.named['bar']". Then the above lookup becomes:
>>   arr.country_named['Netherlands'].year_index[-1]
>>
>> I don't find this as appealing as magical attributes, but perhaps it's
>> more responsible.
>> I'd like to know what other people think, so let me
>> summarize and name the existing proposals:
>>
>> arr.axis.country.named['Netherlands'].axis.year[-1]   # the default option -- works in any case
>> arr[ arr.aix.country.named['Netherlands'].year[-1] ]  # the "stuple" option
>> arr.country.named['Netherlands'].year[-1]             # the "magical" option
>> arr.country_named['Netherlands'].year_index[-1]       # the "semi-magical" option
>>
>> -- Rob
>>
>> On Fri, Jul 9, 2010 at 1:39 AM, Rob Speer <[email protected]> wrote:
>>> http://github.com/rspeer/datarray represents my best guess at the
>>> SciPy BOF consensus. I recently switched the method of accessing named
>>> ticks from .named() to .named[] based on further discussion here.
>>>
>>> My implementation is still missing the case with named ticks but
>>> positional axes, however. That is, you should be able to use .named
>>> directly on the top-level datarray without referring to any axis
>>> labels, to say something like arr.named['Netherlands', 2010], but you
>>> can't yet.
>>> -- Rob
>>>
>>> On Thu, Jul 8, 2010 at 11:44 PM, Keith Goodman <[email protected]> wrote:
>>>> On Thu, Jul 8, 2010 at 1:20 PM, Fernando Perez <[email protected]>
>>>> wrote:
>>>>
>>>>> The consensus at the BoF (not that it means it's set in stone, simply
>>>>> that there was good chance for back-and-forth on the topic with many
>>>>> voices) was that:
>>>>>
>>>>> 1. There are valid use cases for 'integer ticks', i.e. integers that
>>>>> index arbitrarily into an array instead of in 0..N-1 fashion.
>>>>>
>>>>> 2. That having plain arr[0] give anything but the first element in arr
>>>>> would be way too confusing in practice, and likely to cause too many
>>>>> problems.
>>>>>
>>>>> 3. That the best solution to allow integer ticks while retaining
>>>>> 'normal' indexing semantics for integers would be to have
>>>>>
>>>>> arr[int] -> normal indexing
>>>>> arr.somethin[int] -> tick-based indexing, where an int can mean anything.
>>>>
>>>> Has the Scipy 2010 BOF consensus been implemented in anyone's fork? I
>>>> don't understand the indexing so I'd like to try it.
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> [email protected]
>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>>
>>>
>>
>
> I personally find the magic attributes most appealing as well. I don't
> like the pseudomagic choice. I think what makes the magic attributes
> appealing is that it's so much less verbose than the
> alternatives--that is, axis.row --> row. While pseudo-magics is
> conceptually like magic attributes with decreased chance of conflicts,
> in practice it seems to merely turn that dot into an underscore--that
> is, axis.row --> axis_row.
>
> We'd still be able to do axis.row as it is, right? (I've been too busy
> being my parents' IT guy to get my hands dirty :( ) Maybe that would
> be the way to go--I mean, you have the option of the nice magic
> attribute action, but if it bothers you or you want your datarray to
> be more robust or whatever, you can use axis.row throughout. Maybe we
> could even have an enable/disable flag? I dunno.
>
> I almost feel like we should come up with some sort of hypothetical
> case of a datarray that we want to do specific things with, so we can
> talk about how we would do those things with a concrete example. It
> should probably be at least 3d. Maybe I'll mock one up over my lunch
> break.
>
> Oh, and in case anyone missed this email:
>
> On Thu, Jul 8, 2010 at 12:55 PM, Keith Goodman <[email protected]> wrote:
>> What do you think of adding a ticks parameter to DataArray? Would that
>> make sense?
>>
>> Current behavior:
>>
>> >> x = DataArray([[1, 2], [3, 4]], (('row', ['A','B']), ('col', ['C', 'D'])))
>> >> x.axes
>> (Axis(label='row', index=0, ticks=['A', 'B']),
>>  Axis(label='col', index=1, ticks=['C', 'D']))
>>
>> Proposed ticks as separate input parameter:
>>
>> >> x = DataArray([[1, 2], [3, 4]], labels=('row', 'col'), ticks=[['A', 'B'], ['C', 'D']])
>>
>> I think this would make it easier for new users to construct a
>> DataArray with ticks just from looking at the function signature. It
>> would match the function signature of Axis. My use case is to use
>> ticks only and not named axes (at first), so:
>>
>> >> x = DataArray([[1, 2], [3, 4]], labels=None, ticks=[['A', 'B'], ['C', 'D']])
>>
>> instead of the current:
>>
>> >> x = DataArray([[1, 2], [3, 4]], ((None, ['A','B']), (None, ['C', 'D'])))
>>
>> It might also cause fewer typos (parentheses matching) at the command line.
>>
>> I've only made a few DataArrays so I don't understand the
>> ramifications of what I am suggesting.
>>
>
> I was going to reply to it after I considered its contents but kinda
> forgot until now.
>
> Anyways: while I like the idea of having ticks that correspond to
> their axis being next to each other as the current behavior goes, I
> find this alternative syntax easier to read, probably due to fewer
> parentheses.
>
> At any rate, this is definitely worth discussion imo.
>
> --Josh
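
To make the quoted constructor comparison concrete, here is a minimal sketch of how separate labels= and ticks= arguments could be normalized into the (label, ticks) pairs that the current constructor takes. The helper name axes_spec and its ndim argument are made up for illustration; nothing here is datarray's actual API.

```python
def axes_spec(ndim, labels=None, ticks=None):
    """Hypothetical helper: turn separate labels/ticks arguments into
    the ((label, ticks), ...) form used by the current constructor."""
    # A missing argument means "nothing specified for any axis".
    labels = (None,) * ndim if labels is None else tuple(labels)
    ticks = (None,) * ndim if ticks is None else tuple(ticks)
    if len(labels) != ndim or len(ticks) != ndim:
        raise ValueError("need exactly one label and one tick list per axis")
    # Pair each axis label with the ticks for the same axis.
    return tuple(zip(labels, ticks))


# Labels and ticks given separately, as in the proposal:
both = axes_spec(2, labels=('row', 'col'), ticks=[['A', 'B'], ['C', 'D']])

# Ticks only, no named axes:
ticks_only = axes_spec(2, ticks=[['A', 'B'], ['C', 'D']])

print(both)
print(ticks_only)
```

Under these assumptions the two spellings carry the same information, so the choice is mostly about readability at the command line.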
I ran into a few more questions while playing with datarrays, so I started a list:

http://github.com/kwgoodman/datarrayQ
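
For reference, the "semi-magical" lookup from Rob's message can be sketched in plain Python. ToyArray and AxisRef are hypothetical stand-ins, not datarray classes, and indexing only reports what would be looked up instead of touching any data. The point is that __getattr__ runs only when normal attribute lookup fails, so a suffix rule keeps axis labels from colliding with methods.

```python
class AxisRef:
    """Hypothetical stand-in for what arr.axis.<label> would return."""

    def __init__(self, label, named):
        self.label = label
        self.named = named  # True: look up by tick, False: by position

    def __getitem__(self, key):
        kind = "tick" if self.named else "position"
        # Report the lookup instead of indexing real data.
        return (self.label, kind, key)


class ToyArray:
    """Toy object with labeled axes and one ordinary method."""

    def __init__(self, labels):
        self.labels = tuple(labels)

    def mean(self):
        # An ordinary method that an axis label might share a name with.
        return "ordinary method"

    def __getattr__(self, name):
        # Only reached when normal attribute lookup has already failed.
        for suffix, named in (("_index", False), ("_named", True)):
            if name.endswith(suffix):
                label = name[:-len(suffix)]
                if label in self.labels:
                    return AxisRef(label, named)
        raise AttributeError(name)


arr = ToyArray(["country", "year", "mean"])
print(arr.country_named["Netherlands"])  # ('country', 'tick', 'Netherlands')
print(arr.year_index[-1])                # ('year', 'position', -1)
print(arr.mean_index[0])                 # ('mean', 'position', 0)
print(arr.mean())                        # ordinary method
```

An axis labeled "mean" stays reachable through mean_index while arr.mean() is still the method, which is the robustness argument for the suffix form. A fully magical variant would drop the suffix check and would be shadowed by any method of the same name.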
