On Sat, Oct 29, 2011 at 3:32 AM, Charles R Harris <charlesr.har...@gmail.com> wrote:
>
>
> On Fri, Oct 28, 2011 at 6:45 PM, Wes McKinney <wesmck...@gmail.com> wrote:
>
>> On Fri, Oct 28, 2011 at 7:53 PM, Benjamin Root <ben.r...@ou.edu> wrote:
>> >
>> >
>> > On Friday, October 28, 2011, Matthew Brett <matthew.br...@gmail.com> wrote:
>> >> Hi,
>> >>
>> >> On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
>> >> <ralf.gomm...@googlemail.com> wrote:
>> >>>
>> >>>
>> >>> On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett <matthew.br...@gmail.com>
>> >>> wrote:
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
>> >>>> <charlesr.har...@gmail.com> wrote:
>> >>>> >
>> >>>> >
>> >>>> > On Fri, Oct 28, 2011 at 3:56 PM, Matthew Brett
>> >>>> > <matthew.br...@gmail.com> wrote:
>> >>>> >>
>> >>>> >> Hi,
>> >>>> >>
>> >>>> >> On Fri, Oct 28, 2011 at 2:43 PM, Matthew Brett
>> >>>> >> <matthew.br...@gmail.com> wrote:
>> >>>> >> > Hi,
>> >>>> >> >
>> >>>> >> > On Fri, Oct 28, 2011 at 2:41 PM, Charles R Harris
>> >>>> >> > <charlesr.har...@gmail.com> wrote:
>> >>>> >> >>
>> >>>> >> >>
>> >>>> >> >> On Fri, Oct 28, 2011 at 3:16 PM, Nathaniel Smith <n...@pobox.com>
>> >>>> >> >> wrote:
>> >>>> >> >>>
>> >>>> >> >>> On Tue, Oct 25, 2011 at 2:56 PM, Travis Oliphant
>> >>>> >> >>> <oliph...@enthought.com> wrote:
>> >>>> >> >>> > I think Nathaniel and Matthew provided very specific feedback that
>> >>>> >> >>> > was helpful in understanding other perspectives of a difficult
>> >>>> >> >>> > problem. In particular, I really wanted bit-patterns implemented.
>> >>>> >> >>> > However, I also understand that Mark did quite a bit of work and
>> >>>> >> >>> > altered his original designs quite a bit in response to community
>> >>>> >> >>> > feedback.
>> >>>> >> >>> > I wasn't a major part of the pull request discussion, nor did I
>> >>>> >> >>> > merge the changes, but I support Charles if he reviewed the code
>> >>>> >> >>> > and felt like it was the right thing to do. I likely would have
>> >>>> >> >>> > done the same thing rather than let Mark Wiebe's work languish.
>> >>>> >> >>>
>> >>>> >> >>> My connectivity is spotty this week, so I'll stay out of the
>> >>>> >> >>> technical discussion for now, but I want to share a story.
>> >>>> >> >>>
>> >>>> >> >>> Maybe a year ago now, Jonathan Taylor and I were debating what the
>> >>>> >> >>> best API for describing statistical models would be -- whether we
>> >>>> >> >>> wanted something like R's "formulas" (which I supported), or another
>> >>>> >> >>> approach based on sympy (his idea). To summarize, I thought his API
>> >>>> >> >>> was confusing, pointlessly complicated, and didn't actually solve the
>> >>>> >> >>> problem; he thought R-style formulas were superficially simpler but
>> >>>> >> >>> hopelessly confused and inconsistent underneath. Now, obviously, I
>> >>>> >> >>> was right and he was wrong. Well, obvious to me, anyway... ;-) But it
>> >>>> >> >>> wasn't like I could just wave a wand and make his arguments go away,
>> >>>> >> >>> no
[...]
>> >> I should point out that the implementation hasn't - as far as I can
>> >> see - changed the discussion. The discussion was about the API.
>> >> Implementations are useful for agreed APIs because they can point out
>> >> where the API does not make sense or cannot be implemented. In this
>> >> case, the API Mark said he was going to implement - he did implement -
>> >> at least as far as I can see.
>> >> Again, I'm happy to be corrected.
>> >>
>> >>>> In saying that we are insisting on our way, you are saying, implicitly,
>> >>>> 'I am not going to negotiate'.
>> >>>
>> >>> That is only your interpretation. The observation that Mark compromised
>> >>> quite a bit while you didn't seems largely correct to me.
>> >>
>> >> The problem here stems from our inability to work towards agreement,
>> >> rather than standing on set positions. I set out what changes I think
>> >> would make the current implementation OK. Can we please, please have
>> >> a discussion about those points instead of trying to argue about who
>> >> has given more ground.
>> >>
>> >>> That commitment would of course be good. However, even if that were
>> >>> possible before writing code and everyone agreed that the ideas of you
>> >>> and Nathaniel should be implemented in full, it's still not clear that
>> >>> either of you would be willing to write any code. Agreement without code
>> >>> still doesn't help us very much.
>> >>
>> >> I'm going to return to Nathaniel's point - it is a highly valuable
>> >> thing to set ourselves the target of resolving substantial discussions
>> >> by consensus. The route you are endorsing here is 'implementor
>> >> wins'. We don't need to do it that way. We're a mature sensible
>> >> bunch of adults who can talk out the issues until we agree they are
>> >> ready for implementation, and then implement. That's all Nathaniel is
>> >> saying. I think he's obviously right, and I'm sad that it isn't as
>> >> clear to y'all as it is to me.
>> >>
>> >> Best,
>> >>
>> >> Matthew
>> >>
>> >
>> > Everyone, can we please not do this?! I had enough of adults doing finger
>> > pointing back over the summer during the whole debt ceiling debate. I think
>> > we can all agree that we are better than the US congress?
>> >
>> > Forget about rudeness or decision processes.
>> >
>> > I will start by saying that I am willing to separate ignore and absent, but
>> > only on the write side of things. On read, I want a single way to identify
>> > the missing values. I also want only a single way to perform calculations
>> > (either skip or propagate).
>> >
>> > An indicator of success would be that people stop using NaNs and magic
>> > numbers (-9999, anyone?) and we could even deprecate nansum(), or at least
>> > strongly suggest in its docs to use NA.
>>
>> Well, I haven't completely made up my mind yet, will have to do some
>> more prototyping and playing (and potentially have some of my users
>> eat the differently-flavored dogfood), but I'm really not very
>> satisfied with the API at the moment. I'm mainly worried about the
>> abstraction leaking through to pandas users (this is a pretty large
>> group of people judging by # of downloads).
>>
>> The basic position I'm in is that I'm trying to push Python into a new
>> space, namely mainstream data analysis and statistical computing, one
>> that is solidly occupied by R and other such well-known players. My
>> target users are not computer scientists. They are not going to invest
>> in understanding dtypes very deeply or the internals of ndarray. In
>> fact I've spent a great deal of effort making it so that pandas users
>> can be productive and successful while having very little
>> understanding of NumPy. Yes, I essentially "protect" my users from
>> NumPy because using it well requires a certain level of sophistication
>> that I think is unfair to demand of people. This might seem totally
>> bizarre to some of you but it is simply the state of affairs. So far I
>> have been successful because more people are using Python and pandas
>> to do things that they used to do in R. The NA concept in R is dead
>> simple and I don't see why we are incapable of also implementing
>> something that is just as dead simple.
>> To we, the scipy elite let's call us, it seems simple: "oh, just pass an
>> extra flag to all my array constructors!" But this along with the masked
>> array concept is going to have two likely outcomes:
>>
>> 1) Create a great deal more complication in my already very large codebase
>>
>> and/or
>>
>> 2) force pandas users to understand the new masked arrays after I've
>> carefully made it so they can be largely ignorant of NumPy
>>
>> The mostly-NaN-based solution I've cobbled together and tweaked over
>> the last 42 months actually *works really well*, amazingly, with
>> relatively little cost in code complexity. Having found a reasonably
>> stable equilibrium I'm extremely resistant to upset the balance.
>>
>> So I don't know. After watching these threads bounce back and forth
>> I'm frankly not all that hopeful about a solution arising that
>> actually addresses my needs.
>>
>
> But Wes, what *are* your needs? You keep saying this, but we need examples
> of how you want to operate and how numpy fails. As to dtypes, internals, and
> all that, I don't see any of that in the current implementation, unless you
> mean the maskna and skipna keywords. I believe someone on the previous
> thread mentioned a way to deal with that.
>

From the release notes I just learned that skipna is basically the same as
in R: "R's parameter rm.na=T is spelled skipna=True in NumPy."

It provides a good summary of the current status in master:
https://github.com/numpy/numpy/blob/master/doc/release/2.0.0-notes.rst

Ralf
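
To make the skip-versus-propagate distinction from this thread concrete, here
is a minimal sketch of the NaN-based approach Wes describes. It relies only on
np.where, np.sum and np.nansum from released NumPy; it does not use the maskna
or skipna keywords from the development branch, and the sample values and the
-9999 sentinel are made up for illustration.

```python
import numpy as np

# Hypothetical readings; -9999 is a "magic number" marking a missing value.
raw = np.array([1.5, -9999.0, 2.5, 4.0])

# Make the missing value explicit by replacing the sentinel with NaN.
values = np.where(raw == -9999.0, np.nan, raw)

# Propagate: a single missing value makes the whole result missing.
propagated = np.sum(values)

# Skip: missing values are ignored, comparable to sum(x, na.rm=TRUE) in R.
skipped = np.nansum(values)

print(propagated)  # nan
print(skipped)     # 8.0
```

With the raw sentinel left in place, np.sum(raw) would silently return a
large negative number, which is the failure mode the "magic numbers" remark
above is about.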