Re: [Numpy-discussion] What is consensus anyway
26.04.2012 03:11, Travis Oliphant wrote: [clip] It would be nice if every pull request created a message to this list. Is that even possible?

Unidirectional forwarding is possible, for instance using Github's API: https://github.com/pv/github-pull-request-fwd

Github itself doesn't offer tools to do this, so an external server for sending the mails is needed, and the mailing list admins need to allow the corresponding mails to pass through.

Pauli
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
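The forwarding step described above could be sketched roughly as follows. This is only an illustration and not the code in the linked repository: the function name `pull_request_to_email`, the sender address, and the example record are all hypothetical, and the dict is assumed to carry `number`, `title`, `html_url`, `user.login` and `body` fields in the style of a pull-request record fetched from Github's API. Fetching the record and actually sending the mail through the external server are left out.

```python
from email.message import EmailMessage


def pull_request_to_email(pr, list_address, sender):
    """Turn one pull-request record (a plain dict) into a plain-text email.

    The field names read from `pr` are assumptions about the record's shape.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = list_address
    # Put the PR number and title in the subject so list readers can
    # tell the notifications apart at a glance.
    msg["Subject"] = "[PR #{0}] {1}".format(pr["number"], pr["title"])
    body = "{0} opened a pull request:\n{1}\n\n{2}\n".format(
        pr["user"]["login"],
        pr["html_url"],
        pr.get("body") or "(no description)",
    )
    msg.set_content(body)
    return msg


# Hypothetical record, made up for illustration only.
example_pr = {
    "number": 123,
    "title": "Fix typo in docs",
    "html_url": "https://example.org/pull/123",
    "user": {"login": "someone"},
    "body": "",
}

message = pull_request_to_email(
    example_pr, "NumPy-Discussion@scipy.org", "pr-forwarder@example.org"
)
print(message["Subject"])
```

Keeping the title in the subject line is the point of the sketch: it addresses the complaint elsewhere in this thread that feed entries showing only a PR number and an action are hard to keep track of.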
Re: [Numpy-discussion] What is consensus anyway
We're kind of drifting again here, but... Remember when all this discussion happened on Usenet? Perhaps we're in yet another awkward transition period and soon all email list-type discussions will be on Github, Bitbucket, StackOverflow (e.g. pandas), etc. There are advantages and disadvantages to any sort of discussion paradigm, but I can imagine a future version of Github where each project has a tab for a StackOverflow-esque forum. As a user, that all sounds pretty appealing to me. But this is all just speculation and conjecture... -paul

On Wed, Apr 25, 2012 at 9:48 PM, Fernando Perez fperez@gmail.com wrote: On Wed, Apr 25, 2012 at 6:28 PM, Benjamin Root ben.r...@ou.edu wrote: It would be nice if every pull request created a message to this list. Is that even possible? -Travis

This has been a concern of mine for matplotlib as well. The closest I can come is to set up an RSS feed, but all the titles are PR # and an action, so I lose track of which ones I want to view.

Same here for IPython. If anybody figures out a clean solution, please advertise it! I think a bunch of us want the same thing... Cheers, f
Re: [Numpy-discussion] What is consensus anyway
On Thu, Apr 26, 2012 at 6:37 AM, srean srean.l...@gmail.com wrote: On something else that was brought up: I do not consider myself competent/prepared enough to take on development, but it is not the case that I have _never_ felt the temptation. What I have found intimidating and stymieing is the perceived politics over development issues. The two places where I have felt this are a) on contentious threads on the list and b) what seem like legitimate patch tickets on trac that seem to be languishing for no compelling technical reason. I would be hard-pressed to quote specifics, but I have encountered this feeling a few times.

Patches languishing on Trac is a real problem. The issue here is not at all about not wanting those patches, but just about the overhead of getting them reviewed/fixed/committed. This problem has more or less disappeared with Github; there are very few PRs that are just sitting there. As for existing patches on Trac, if you or anyone else has an interest in one of them, checking that patch for test coverage / documentation and resubmitting it as a PR would be a massive help.

Ralf
Re: [Numpy-discussion] What is consensus anyway
On Mon, Apr 23, 2012 at 11:18 PM, Ralf Gommers wrote: Perhaps a more formal development release system could help here. IIUC, numpy pretty much has two things: This is a good idea - not for development releases but for master. Building nightly/weekly binaries would help more people try out new features.

good start, but I think master may fluctuate too quickly (and how often is it broken?) but better than nothing, yes?

2) there is the wxversion system wxversion was broken for a long time on Ubuntu too (~5 yrs ago). I don't exactly remember it as a good idea.

well, it was a good idea, maybe not a good implementation -- and it was very helpful a few years back when wx was in major flux. What we really need is python itself providing a package version selection mechanism, but Guido never saw the need (the existence of virtualenv proves the need if you ask me)

virtualenv also doesn't help, because if you can use that you know how to build from source anyway.

not true -- lots of folks use easy_install and/or pip with virtualenv. and the git barrier to entry is not trivial -- granted just getting master is not hard, but I know I've been using git for a couple months on a core project of mine, and I still find it's giving me far more pain than help. (I know I still haven't wrapped my brain around what DVCS really is...)

-Chris

-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
Re: [Numpy-discussion] What is consensus anyway
On Thu, Apr 26, 2012 at 7:02 PM, Chris Barker chris.bar...@noaa.gov wrote: On Mon, Apr 23, 2012 at 11:18 PM, Ralf Gommers wrote: Perhaps a more formal development release system could help here. IIUC, numpy pretty much has two things: This is a good idea - not for development releases but for master. Building nightly/weekly binaries would help more people try out new features. good start, but I think master may fluctuate too quickly (and how often is it broken?) but better than nothing, yes?

How often is it broken? A couple of failing tests yes, but hardly ever seriously broken.

2) there is the wxversion system wxversion was broken for a long time on Ubuntu too (~5 yrs ago). I don't exactly remember it as a good idea. well, it was a good idea, maybe not a good implementation -- and it was very helpful a few years back when wx was in major flux. What we really need is python itself providing a package version selection mechanism, but Guido never saw the need (the existence of virtualenv proves the need if you ask me)

agreed

virtualenv also doesn't help, because if you can use that you know how to build from source anyway. not true -- lots of folks use easy_install and/or pip with virtualenv.

Pip only installs from source, so if you haven't got the right compilers, development headers etc. it will fail for numpy. easy_install is also a lottery, and only works for numpy on Windows unless you are set up to build from source.

Ralf
Re: [Numpy-discussion] What is consensus anyway
Patches languishing on Trac is a real problem. The issue here is not at all about not wanting those patches,

Oh yes, I am sure of that. In the past it had not been clear what more was necessary to get them pulled in, or how to go about satisfying the requirements. The document you mailed on the scipy list goes a long way in addressing those issues. So thanks a lot. In fact it might be a good idea to add the link to it in the signature of the mail that trac replies with.

but just about the overhead of getting them reviewed/fixed/committed. This problem has more or less disappeared with Github; there are very few PRs that are just sitting there. As for existing patches on Trac, if you or anyone else has an interest in one of them, checking that patch for test coverage / documentation and resubmitting it as a PR would be a massive help. Ralf
Re: [Numpy-discussion] What is consensus anyway
On Wed, Apr 25, 2012 at 4:02 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Apr 24, 2012 at 8:56 PM, Fernando Perez fperez@gmail.com wrote: On Tue, Apr 24, 2012 at 6:12 PM, Charles R Harris charlesr.har...@gmail.com wrote: I admit to a certain curiosity about your own involvement in FOSS projects, and I know I'm not alone in this. Google shows several years of discussion on Monotone, but I have no idea what your contributions were

Seriously??? Please, let's rise above this. We discuss people's opinions *on their technical merit alone*, regardless of the background of the person presenting them. I don't care if Linus himself shows up on the list with a bad idea, it should be shot down; and if someone we'd never heard of brings up a valid point, we should respect it. The day we start checking credentials at the door is the day this project will die as an open source effort. Or at least I think so, but perhaps I don't have enough 'commit credits' in my account for my opinion to matter...

Fernando, I'm not checking credentials, I'm curious. Nathaniel has experience with FOSS projects, unlike us first timers, and I'd like to know what that experience was and what he learned from it. He has also mentioned Graydon Hoare in connection with Rust, and since Graydon was the prime mover in Monotone I'd like to know the story of the project.

Yeah, I don't want to get into resumes and such here, since it'd be hard to avoid turning it into one of those who-has-a-bigger-FOSS pecking-order contests, which I find both unpleasant and counter-productive. If I've learned anything useful from experience, then I've already tried to summarize it here (and really, experience may or may not guarantee any kind of wisdom). If you want to swap war stories, ask me some day over a $BEVERAGE :-).

After sleeping on it, I was wondering if part of your objection to the consensus stuff is just to the word veto? Would you feel more comfortable if it was phrased like, the maintainers have noticed that trying to pick and choose on contentious issues tends to come back and bite them, so they've decided that they will not accept changes unless they have reasonable certainty that all substantive objections from the userbase have been worked through and resolved? It means the same thing in the end, but perhaps makes clearer how the power actually works.

-- Nathaniel
Re: [Numpy-discussion] What is consensus anyway
On Wed, Apr 25, 2012 at 4:07 AM, Nathaniel Smith n...@pobox.com wrote: On Wed, Apr 25, 2012 at 4:02 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Apr 24, 2012 at 8:56 PM, Fernando Perez fperez@gmail.com wrote: On Tue, Apr 24, 2012 at 6:12 PM, Charles R Harris charlesr.har...@gmail.com wrote: I admit to a certain curiosity about your own involvement in FOSS projects, and I know I'm not alone in this. Google shows several years of discussion on Monotone, but I have no idea what your contributions were

Seriously??? Please, let's rise above this. We discuss people's opinions *on their technical merit alone*, regardless of the background of the person presenting them. I don't care if Linus himself shows up on the list with a bad idea, it should be shot down; and if someone we'd never heard of brings up a valid point, we should respect it. The day we start checking credentials at the door is the day this project will die as an open source effort. Or at least I think so, but perhaps I don't have enough 'commit credits' in my account for my opinion to matter...

Fernando, I'm not checking credentials, I'm curious. Nathaniel has experience with FOSS projects, unlike us first timers, and I'd like to know what that experience was and what he learned from it. He has also mentioned Graydon Hoare in connection with Rust, and since Graydon was the prime mover in Monotone I'd like to know the story of the project.

Yeah, I don't want to get into resumes and such here, since it'd be hard to avoid turning it into one of those who-has-a-bigger-FOSS pecking-order contests, which I find both unpleasant and counter-productive. If I've learned anything useful from experience, then I've already tried to summarize it here (and really, experience may or may not guarantee any kind of wisdom). If you want to swap war stories, ask me some day over a $BEVERAGE :-).

Well, you have already appealed to the authority of greater experience, so it's a bit late to declare disinterest in the subject ;) I mean, at this point I really would like to see how big your FOSS is.

After sleeping on it, I was wondering if part of your objection to the consensus stuff is just to the word veto? Would you feel more comfortable if it was phrased like, the maintainers have noticed that trying to pick and choose on contentious issues tends to come back and bite them, so they've decided that they will not accept changes unless they have reasonable certainty that all substantive objections from the userbase have been worked through and resolved? It means the same thing in the end, but perhaps makes clearer how the power actually works.

I don't agree here. People work on open source to scratch an itch, so the process of making a contribution needs to be easy. Widespread veto makes it more difficult and instead of opening up the process, closes it down. There is less freedom, not more. That is one of the reasons that the smaller scikits attract people, they have more freedom to do what they want and fewer people to answer to. Scipy also has some of that advantage because there are a number of packages to choose from. The more strict the process and the more people to please, the less appealing the environment becomes. This can be observed in practice and the voluntary nature of FOSS amplifies the effect.

But in the end, someone has to write the code. Steve McConnell (Code Complete) estimates that even in carefully planned projects code construction will take up 60-80 percent of the time and effort. And if the code isn't written, nothing else matters much. That is why people who write code are essential to a project, no amount of structure will take their place. And here again the voluntary nature of FOSS comes into play, folks can't be ordered to do the work. It can be suggested that certain things be done, and the desire to work with the group will lead people to do work they wouldn't consider doing for themselves, but unless they are interested in a particular feature they won't generally be motivated to sit down and devote the effort needed to get it done just because someone else wants it. And they will rightly be offended if anyone demands that they volunteer their work to implement some feature in a particular way. They have to be led there, not pushed.

Chuck
Re: [Numpy-discussion] What is consensus anyway
On Wed, Apr 25, 2012 at 06:03:25AM -0600, Charles R Harris wrote: Well, you have already appealed to the authority of greater experience, so it's a bit late to declare disinterest in the subject ;) I mean, at this point I really would like to see how big your FOSS is.

Chuck, I am not sure that this is helpful for the discussion. I think that it is a great discussion to have in real life, as it is one of those in which all participants can learn a lot, but on a mailing list with a wider diffusion, it can very easily drift into a pissing contest.

Gaël
Re: [Numpy-discussion] What is consensus anyway
On Wed, Apr 25, 2012 at 1:03 PM, Charles R Harris charlesr.har...@gmail.com wrote: That is one of the reasons that the smaller scikits attract people, they have more freedom to do what they want and fewer people to answer to. Scipy also has some of that advantage because there are a number of packages to choose from. The more strict the process and the more people to please, the less appealing the environment becomes.

A quick look shows ~100,000 downloads of 1.6.1 via PyPI. SF.net shows 600,000 numpy downloads in the last 12 months. I'm afraid the numpy developers have a lot of people to please, whether they like it or not :-).

OTOH I'm still confused at what kind of strictness you're worried about in practice. Not too many of those people actually show up on the mailing list, and usually the problem is convincing those that *do* show up into actually expressing their needs rather than just assuming that real developers must know better. Fernando spoke eloquently in this thread in support of consensus, and IPython doesn't seem to be laboring under a strict process that's driving away developers. AFAICT whole-heartedly adopting the consensus idea would only have actually altered one (!) decision in the project to date, which is not exactly jack-booted as these things go.

- N
Re: [Numpy-discussion] What is consensus anyway
I don't agree here. People work on open source to scratch an itch, so the process of making a contribution needs to be easy. Widespread veto makes it more difficult and instead of opening up the process, closes it down. There is less freedom, not more. That is one of the reasons that the smaller scikits attract people, they have more freedom to do what they want and fewer people to answer to. Scipy also has some of that advantage because there are a number of packages to choose from. The more strict the process and the more people to please, the less appealing the environment becomes. This can be observed in practice and the voluntary nature of FOSS amplifies the effect.

It is true that it is easier to get developers to contribute to small projects where they can control exactly what happens and not have to appeal to a wider audience to get code changed and committed. This effect is well-illustrated by the emergence of scikits in the presence of SciPy.

However, the idea that people work on open source to scratch an itch is incomplete. This is certainly one of the reasons volunteers work on open source. There are many people, however, that work on open source as part of their job. In the particular instance of the missing data support, Mark did much of the work as part of his job. It wasn't just to scratch an itch. So, we should not make assumptions on the basis of this incomplete model.

NumPy is far beyond the mode of a few people scratching an itch. It is in widespread use. It is a large project with a great deal of history and a diverse user-community. It needs people full-time to help maintain it. It needs maintainers who listen actively to anyone who will express their concerns cogently. It needs maintainers who recognize that any concern that somebody expresses is typically not a unique view. We cannot expect to find people like that who are just interested in scratching an itch and always working for free.

Most projects suffer from lack of feedback. We should be worried about how to get more feedback and input from *just users* and be very sensitive to anyone feeling like their legitimate concerns are not being heard. Most people, rather than express their concerns, will just work around the problem, write their own stuff, or move on to other languages and approaches.

Your point about somebody writing the code is absolutely true, I would just suggest that the view that FOSS is always just volunteer labor needs to expand. People do work full time on FOSS as part of their job. We need to bring that to NumPy. I know of at least 2 other people besides me who are actively trying to make this possible. At Continuum we offer the opportunity to work on NumPy. We plan to continue this. We are hiring. In this context, I'm especially interested in making sure that it's not just the developers who get to decide what happens to NumPy.

Nathaniel has clarified very well what veto-power really means. It's not absolute, it just means that users who write clear arguments get listened to actively. It doesn't replace the need for developers with wisdom and understanding of user-experiences, but active listening is a useful skill that we could all improve on: http://en.wikipedia.org/wiki/Active_listening

A list full of bright, interested, active listeners is the kind of culture we need on this list. It's the kind of attitude we need from maintainers of NumPy.

-Travis

But in the end, someone has to write the code. Steve McConnell (Code Complete) estimates that even in carefully planned projects code construction will take up 60-80 percent of the time and effort. And if the code isn't written, nothing else matters much. That is why people who write code are essential to a project, no amount of structure will take their place. And here again the voluntary nature of FOSS comes into play, folks can't be ordered to do the work. It can be suggested that certain things be done, and the desire to work with the group will lead people to do work they wouldn't consider doing for themselves, but unless they are interested in a particular feature they won't generally be motivated to sit down and devote the effort needed to get it done just because someone else wants it. And they will rightly be offended if anyone demands that they volunteer their work to implement some feature in a particular way. They have to be led there, not pushed. Chuck
Re: [Numpy-discussion] What is consensus anyway
Hi,

On Wed, Apr 25, 2012 at 9:39 AM, Travis Oliphant tra...@continuum.io wrote: I don't agree here. People work on open source to scratch an itch, so the process of making a contribution needs to be easy. Widespread veto makes it more difficult and instead of opening up the process, closes it down. There is less freedom, not more. That is one of the reasons that the smaller scikits attract people, they have more freedom to do what they want and fewer people to answer to. Scipy also has some of that advantage because there are a number of packages to choose from. The more strict the process and the more people to please, the less appealing the environment becomes. This can be observed in practice and the voluntary nature of FOSS amplifies the effect.

It is true that it is easier to get developers to contribute to small projects where they can control exactly what happens and not have to appeal to a wider audience to get code changed and committed. This effect is well-illustrated by the emergence of scikits in the presence of SciPy. However, the idea that people work on open source to scratch an itch is incomplete.

Do you agree that Numpy has not been very successful in recruiting and maintaining new developers compared to its large user-base? Compared to - say - Sympy? Why do you think this is? Would you consider asking that question directly on list and asking for the most honest possible answers?

Best, Matthew
Re: [Numpy-discussion] What is consensus anyway
Do you agree that Numpy has not been very successful in recruiting and maintaining new developers compared to its large user-base? Compared to - say - Sympy? Why do you think this is?

I don't know about SymPy. But in my view (and I'm just a typical user of NumPy), numpy seems to be at the base of what people actually need to do. I would assume most users of numpy actually use it because it's an underlying piece of software, i.e. for SciPy. It provides convenient, fast array structures to do maths. I would assume that most users see numpy as infrastructure, they write their own code on top of it. As a normal user of numpy, I wouldn't know where it would need improvement to suit my needs because it already does all I need. (Okay, masked arrays are something which could definitely improve, but that's another story.)

This is different from other, higher-level FOSS projects, which are closer to end user final requirements, where end users might be more compelled to contribute because it's closer to what they're actually doing. For example, I just wrote two enhancements to scipy.interpolate, which were / will be merged recently / soon. Plus, numpy is a lot of C code, and to me (again, as a user) it seems more complicated to contribute because things are not as isolated.

Just my 2 ct. Andreas.
Re: [Numpy-discussion] What is consensus anyway
On 4/25/2012 4:51 PM, Andreas H. wrote: I would assume that most users see numpy as infrastructure, they write their own code on top of it. As a normal user of numpy, I wouldn't know where it would need improvement to suit my needs because it already does all I need. (Okay, masked arrays are something which could definitely improve, but that's another story.) This is different from other, higher-level FOSS projects,

Thank you Andreas. I was debating whether to explain exactly this, to point out that I found Matthew's question inappropriately aggressive, or both. Now I can do both in a flash.

But I find I would also like to once again say thank you to the developers, who have given us an amazing piece of software. I would add that I am impressed by the deep respect they show each other even when dealing with hard issues.

Alan Isaac
Just another grateful user for many years.
Re: [Numpy-discussion] What is consensus anyway
I too have to agree with Andreas. I have been using Numpy for years in my work, but am not versed in C so I don't even understand what numpy is doing under the hood. I too would only be able to contribute to the code at the python level, or as Andreas said, at improving SciPy packages and other Numpy-based projects.

One area that you may be able to get more help from the general user base is with publicity, tutorials and word-of-mouth. I had recently shown Numpy to a friend who was versed in MATLAB, and he was really impressed because Numpy is easily incorporated into more general Python scripts. I've worked a lot with the Enthought Tool Suite and shown off some of that to my colleagues. They are impressed at the streamlined code-to-visuals process although I don't think they even realize that Numpy is responsible for all the numerics in the program. To this end, I think outreach would be helpful in recruiting new programmers. Once they understand that Numpy does a lot at the C-level and that it is not strictly a Python feature, they may realize it's something that they can contribute to.

On Wed, Apr 25, 2012 at 5:04 PM, Alan G Isaac alan.is...@gmail.com wrote: On 4/25/2012 4:51 PM, Andreas H. wrote: I would assume that most users see numpy as infrastructure, they write their own code on top of it. As a normal user of numpy, I wouldn't know where it would need improvement to suit my needs because it already does all I need. (Okay, masked arrays are something which could definitely improve, but that's another story.) This is different from other, higher-level FOSS projects,

Thank you Andreas. I was debating whether to explain exactly this, to point out that I found Matthew's question inappropriately aggressive, or both. Now I can do both in a flash. But I find I would also like to once again say thank you to the developers, who have given us an amazing piece of software. I would add that I am impressed by the deep respect they show each other even when dealing with hard issues. Alan Isaac Just another grateful user for many years.
Re: [Numpy-discussion] What is consensus anyway
Do you agree that Numpy has not been very successful in recruiting and maintaining new developers compared to its large user-base? Compared to - say - Sympy? Why do you think this is?

I think it's mostly because it's infrastructure that is a means to an end. I certainly wasn't excited to have to work on NumPy originally, when my main interest was SciPy. I've come to love the interesting plateau that NumPy lives on. But, I think it mostly does the job it is supposed to do. The fact that it is in C is also not very sexy. It is also rather complicated with a lot of inter-related parts.

I think NumPy could do much, much more --- but getting there is going to be a challenge of execution and education. You can get to know the code base. It just takes some time and patience. You also have to be comfortable with compilers and building software just to tweak the code.

Would you consider asking that question directly on list and asking for the most honest possible answers?

I'm always interested in honest answers and welcome any sincere perspective.

-Travis
Re: [Numpy-discussion] What is consensus anyway
Hi,

On Wed, Apr 25, 2012 at 2:35 PM, Travis Oliphant tra...@continuum.io wrote: Do you agree that Numpy has not been very successful in recruiting and maintaining new developers compared to its large user-base? Compared to - say - Sympy? Why do you think this is?

I think it's mostly because it's infrastructure that is a means to an end. I certainly wasn't excited to have to work on NumPy originally, when my main interest was SciPy. I've come to love the interesting plateau that NumPy lives on. But, I think it mostly does the job it is supposed to do. The fact that it is in C is also not very sexy. It is also rather complicated with a lot of inter-related parts. I think NumPy could do much, much more --- but getting there is going to be a challenge of execution and education. You can get to know the code base. It just takes some time and patience. You also have to be comfortable with compilers and building software just to tweak the code.

Would you consider asking that question directly on list and asking for the most honest possible answers?

I'm always interested in honest answers and welcome any sincere perspective.

Of course, there are potential explanations:

1) Numpy is too low-level for most people
2) The C code is too complicated
3) It's fine already, more or less

are some obvious ones. I would say these are the easy answers. But of course, the easy answer may not be the right answer. It may not be easy to get the right answer [1].

As you can see from Alan Isaac's reply on this thread, even asking the question can be taken as being in bad faith. In that situation, I think you'll find it hard to get sincere replies.

Best, Matthew

[1] http://en.wikipedia.org/wiki/Good_to_Great
Re: [Numpy-discussion] What is consensus anyway
On Wed, Apr 25, 2012 at 5:54 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Wed, Apr 25, 2012 at 2:35 PM, Travis Oliphant tra...@continuum.io wrote: Do you agree that Numpy has not been very successful in recruiting and maintaining new developers compared to its large user-base? Compared to - say - Sympy? Why do you think this is?

I think it's mostly because it's infrastructure that is a means to an end. I certainly wasn't excited to have to work on NumPy originally, when my main interest was SciPy. I've come to love the interesting plateau that NumPy lives on. But, I think it mostly does the job it is supposed to do. The fact that it is in C is also not very sexy. It is also rather complicated with a lot of inter-related parts. I think NumPy could do much, much more --- but getting there is going to be a challenge of execution and education. You can get to know the code base. It just takes some time and patience. You also have to be comfortable with compilers and building software just to tweak the code.

Would you consider asking that question directly on list and asking for the most honest possible answers?

I'm always interested in honest answers and welcome any sincere perspective.

Of course, there are potential explanations:

1) Numpy is too low-level for most people
2) The C code is too complicated
3) It's fine already, more or less

are some obvious ones. I would say these are the easy answers. But of course, the easy answer may not be the right answer. It may not be easy to get the right answer [1]. As you can see from Alan Isaac's reply on this thread, even asking the question can be taken as being in bad faith. In that situation, I think you'll find it hard to get sincere replies.

I don't see why these shouldn't be the sincere replies, I think these easy answers are also the right answer for most people. Maybe I would add

4) writing code for a few hundred thousand users is a big responsibility and a bit scary

Except for a few core C developers, most contributors contribute to parts of numpy, best example Pierre and masked arrays, or specific functions. Life goes on for most developers in the application areas, I guess. For example I'm very glad about the time that Pauli is spending on scipy.

numpy is great [1]

Josef

Best, Matthew [1] http://en.wikipedia.org/wiki/Good_to_Great

[1] http://sourceforge.net/projects/numpy/files/stats/timeline?dates=2000-01-11+to+2012-04-25 http://qa.debian.org/popcon.php?package=python-numpy
Re: [Numpy-discussion] What is consensus anyway
On Wednesday, April 25, 2012, Matthew Brett wrote: Hi, On Wed, Apr 25, 2012 at 2:35 PM, Travis Oliphant tra...@continuum.io wrote: Do you agree that Numpy has not been very successful in recruiting and maintaining new developers compared to its large user-base? Compared to - say - Sympy? Why do you think this is? I think it's mostly because it's infrastructure that is a means to an end. I certainly wasn't excited to have to work on NumPy originally, when my main interest was SciPy. I've come to love the interesting plateau that NumPy lives on. But, I think it mostly does the job it is supposed to do. The fact that it is in C is also not very sexy. It is also rather complicated with a lot of inter-related parts. I think NumPy could do much, much more --- but getting there is going to be a challenge of execution and education. You can get to know the code base. It just takes some time and patience. You also have to be comfortable with compilers and building software just to tweak the code. Would you consider asking that question directly on list and asking for the most honest possible answers? I'm always interested in honest answers and welcome any sincere perspective. Of course, there are potential explanations: 1) Numpy is too low-level for most people 2) The C code is too complicated 3) It's fine already, more or less are some obvious ones. I would say there are the easy answers. But of course, the easy answer may not be the right answer. It may not be easy to get right answer [1]. As you can see from Alan Isaac's reply on this thread, even asking the question can be taken as being in bad faith. In that situation, I think you'll find it hard to get sincere replies. As with anything, the phrasing of a question makes a world of difference with regard to replies. Ask any pollster. When phrased correctly, I would not have any doubt about the sincerity of replies, and I would not worry about previewed hostility -- when phrased correctly. 
As the questioner, the onus is upon you to gauge the community and adjust the question appropriately. I think the fact that we engage in these discussions shows that we value and care about each other's perceptions and opinions with regard to numpy. Cheers! Ben Root
Re: [Numpy-discussion] What is consensus anyway
Hi, On Wed, Apr 25, 2012 at 1:35 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Wed, Apr 25, 2012 at 9:39 AM, Travis Oliphant tra...@continuum.io wrote: I don't agree here. People work on open source to scratch an itch, so the process of making a contribution needs to be easy. Widespread veto makes it more difficult and instead of opening up the process, closes it down. There is less freedom, not more. That is one of the reasons that the smaller scikits attract people, they have more freedom to do what they want and fewer people to answer to. Scipy also has some of that advantage because there are a number of packages to choose from. The more strict the process and the more people to please, the less appealing the environment becomes. This can be observed in practice and the voluntary nature of FOSS amplifies the effect. It is true that it is easier to get developers to contribute to small projects where they can control exactly what happens and not have to appeal to a wider audience to get code changed and committed. This effect is well-illustrated by the emergence of scikits in the presence of SciPy. However, the idea that people work on open source to scratch an itch is incomplete. Do you agree that Numpy has not been very successful in recruiting and maintaining new developers compared to its large user-base? Compared to - say - Sympy? Why do you think this is? Would you consider asking that question directly on list and asking for the most honest possible answers? Aha - I now realize that I was reading too quickly under the influence (again) of too much caffeine, and missed this part of Travis' email: In this context, I'm especially interested in making sure that it's not just the developers who get to decide what happens to NumPy. Nathaniel has clarified very well what veto-power really means. It's not absolute, it just means that users who write clear arguments get listened to actively. 
It doesn't replace the need for developers with wisdom and understanding of user-experiences, but active listening is a useful skill that we could all improve on: http://en.wikipedia.org/wiki/Active_listening A list full of bright, interested, active listeners is the kind of culture we need on this list. It's the kind of attitude we need from maintainers of NumPy. which mostly answers my worry, and I apologize for pushing on an open door. See you, Matthew
Re: [Numpy-discussion] What is consensus anyway
On Wed, Apr 25, 2012 at 10:54 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Wed, Apr 25, 2012 at 2:35 PM, Travis Oliphant tra...@continuum.io wrote: Do you agree that Numpy has not been very successful in recruiting and maintaining new developers compared to its large user-base? Compared to - say - Sympy? Why do you think this is? I think it's mostly because it's infrastructure that is a means to an end. I certainly wasn't excited to have to work on NumPy originally, when my main interest was SciPy. I've come to love the interesting plateau that NumPy lives on. But, I think it mostly does the job it is supposed to do. The fact that it is in C is also not very sexy. It is also rather complicated with a lot of inter-related parts. I think NumPy could do much, much more --- but getting there is going to be a challenge of execution and education. You can get to know the code base. It just takes some time and patience. You also have to be comfortable with compilers and building software just to tweak the code. Would you consider asking that question directly on list and asking for the most honest possible answers? I'm always interested in honest answers and welcome any sincere perspective. Of course, there are potential explanations: 1) Numpy is too low-level for most people 2) The C code is too complicated 3) It's fine already, more or less are some obvious ones. I would say there are the easy answers. But of course, the easy answer may not be the right answer. It may not be easy to get right answer [1]. As you can see from Alan Isaac's reply on this thread, even asking the question can be taken as being in bad faith. In that situation, I think you'll find it hard to get sincere replies. While I don't think jumping into NumPy C code is as difficult as some people make it out to be, I think numpy has reaped most of the low-hanging fruit, and is now at a stage where it requires massive investment to get significantly better. 
I would suggest a different question, whose answer may serve as a proxy to uncover the lack of contributions: what needs to be done in NumPy, and how can we make it simpler for newcomers? Here is an incomplete, unshamelessly biased list:

- Fewer dependencies on CPython internals
- Allow 3rd parties to extend numpy at the C level in more fundamental ways (e.g. I wish something like the half-float dtype could be more easily developed out of tree)
- Separate memory representation from higher-level representation (slicing, broadcasting, etc…), to allow arrays to sit on non-contiguous memory areas, etc…
- Test and performance infrastructure so we can track our evolution, get coverage of our C code, etc…
- Fix bugs
- Better integration with 3rd-party on-disk storage (databases, etc…)

None of that is particularly simple nor has a fast learning curve, except for fixing bugs and maybe some of the infrastructure. I think most of this is necessary for the things Travis talked about a few weeks ago. What could make contributions easier:

- different levels of C API documentation (still lacking anything besides reference)
- ways to detect early when we break the ABI or slightly more obscure platforms (we need good CI, ways to publish binaries that people can easily test, etc...)
- improved infrastructure so that we can focus on the things we want to work on (improve the dire situation with bug tracking, etc…)

Also, lots of people just don't know/want to know C. But people with, say, web skills would be welcome: we have a website that could use some help… So
Re: [Numpy-discussion] What is consensus anyway
Hi, On Wed, Apr 25, 2012 at 3:24 PM, josef.p...@gmail.com wrote: On Wed, Apr 25, 2012 at 5:54 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Wed, Apr 25, 2012 at 2:35 PM, Travis Oliphant tra...@continuum.io wrote: Do you agree that Numpy has not been very successful in recruiting and maintaining new developers compared to its large user-base? Compared to - say - Sympy? Why do you think this is? I think it's mostly because it's infrastructure that is a means to an end. I certainly wasn't excited to have to work on NumPy originally, when my main interest was SciPy. I've come to love the interesting plateau that NumPy lives on. But, I think it mostly does the job it is supposed to do. The fact that it is in C is also not very sexy. It is also rather complicated with a lot of inter-related parts. I think NumPy could do much, much more --- but getting there is going to be a challenge of execution and education. You can get to know the code base. It just takes some time and patience. You also have to be comfortable with compilers and building software just to tweak the code. Would you consider asking that question directly on list and asking for the most honest possible answers? I'm always interested in honest answers and welcome any sincere perspective. Of course, there are potential explanations: 1) Numpy is too low-level for most people 2) The C code is too complicated 3) It's fine already, more or less are some obvious ones. I would say there are the easy answers. But of course, the easy answer may not be the right answer. It may not be easy to get right answer [1]. As you can see from Alan Isaac's reply on this thread, even asking the question can be taken as being in bad faith. In that situation, I think you'll find it hard to get sincere replies. I don't see why this shouldn't be the sincere replies, I think these easy answers are also the right answer for most people. I wasn't saying these replies are not sincere, of course they are factors. 
I have heard other people give reasons why they didn't enjoy numpy development much, but I can't speak for them, only for me. I have done some numpy development, but very little. I've done a moderate amount of scipy development. I have considered doing more numpy development; in particular, I did want to do some work on the longdouble parts of numpy. Part of the reason I didn't do this was because, when I raised the question on the list, it did not seem there was much interest in a change, or even a real discussion. Partly from the masked array discussions, but not only, it seemed that the process of making decisions was not clear, and there seemed to be as many views about how this was done as there were developers. I suppose I'd summarize the atmosphere, as I have felt it, as being that numpy was owned by someone else, and I wasn't quite sure who that was, but I was fairly sure it wasn't me. On the other hand, in some projects at least - of which Sympy is the most obvious example - I think it's easy to feel that all of us own Sympy (and I've only made one commit to Sympy, and that of someone else's idea). Adding to that, it does seem to me that the atmosphere on this list gets ugly sometimes. In particular it seems to me that there's a sort of conformity that starts to emerge in which people feel it is necessary to praise or criticize people, but not the arguments. I suppose that is because there was a long time during which Travis was not on the list to model what kind of discussion he wanted. I'm glad that has changed now. The reason I keep returning to process, even though it is 'non-technical' - is because it seems to me that the atmosphere that I'm describing will have the strong effect of discouraging enthusiastic developers. It certainly discourages me. I don't think open-source software is just developers scratching an itch, I think it's about community, and the pleasure of working with people you like and trust, to do something you think is important. 
If I've made that harder, then I am sorry, and I'm very happy to hear why that is, and how I can help. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Wed, Apr 25, 2012 at 7:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Wed, Apr 25, 2012 at 3:24 PM, josef.p...@gmail.com wrote: On Wed, Apr 25, 2012 at 5:54 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Wed, Apr 25, 2012 at 2:35 PM, Travis Oliphant tra...@continuum.io wrote: Do you agree that Numpy has not been very successful in recruiting and maintaining new developers compared to its large user-base? Compared to - say - Sympy? Why do you think this is? I think it's mostly because it's infrastructure that is a means to an end. I certainly wasn't excited to have to work on NumPy originally, when my main interest was SciPy. I've come to love the interesting plateau that NumPy lives on. But, I think it mostly does the job it is supposed to do. The fact that it is in C is also not very sexy. It is also rather complicated with a lot of inter-related parts. I think NumPy could do much, much more --- but getting there is going to be a challenge of execution and education. You can get to know the code base. It just takes some time and patience. You also have to be comfortable with compilers and building software just to tweak the code. Would you consider asking that question directly on list and asking for the most honest possible answers? I'm always interested in honest answers and welcome any sincere perspective. Of course, there are potential explanations: 1) Numpy is too low-level for most people 2) The C code is too complicated 3) It's fine already, more or less are some obvious ones. I would say there are the easy answers. But of course, the easy answer may not be the right answer. It may not be easy to get right answer [1]. As you can see from Alan Isaac's reply on this thread, even asking the question can be taken as being in bad faith. In that situation, I think you'll find it hard to get sincere replies. I don't see why this shouldn't be the sincere replies, I think these easy answers are also the right answer for most people. 
I wasn't saying these replies are not sincere, of course they are factors. I have heard other people give reasons why they didn't enjoy numpy development much, but I can't speak for them, only for me. I have done some numpy development, but very little. I've done a moderate amount of scipy development. I have considered doing more numpy development, in particular, I did want to do some work on the longdouble parts of numpy. Part of the reason I didn't do this was because, when I raised the question on the list, it did not seem there was much interest in a change, or even a real discussion. Partly from the masked array discussions, but not only, it seemed that the process of making decisions was not clear, and there seemed to be as many views about how this was done as there were developers. I suppose I'd summarize the atmosphere, as I have have felt it, as being that numpy was owned by someone else, and I wasn't quite sure who that was, but I was fairly sure it wasn't me. On the other hand, in some projects at least - of which Sympy is the most obvious example, I think it's easy to feel that all of us own Sympy (and I've only made one commit to Sympy, and that of someone else's idea). Adding to that, it does seem to me that the atmosphere on this list get ugly sometimes. In particular it seems to me that there's a sort of conformity that starts to emerge in which people feel it is necessary to praise or criticize people, but not the arguments. I suppose that is because there was a long time during which Travis was not on the list to model what kind of discussion he wanted. I'm glad that has changed now. The reason I keep returning to process, even though it is 'non-technical' - is because it seems to me that the atmosphere that I'm describing will have the strong effect of discouraging enthusiastic developers. It certainly discourages me. 
I don't think open-source software is just developers scratching an itch, I think it's about community, and the pleasure of working with people you like and trust, to do something you think is important. Except for the big changes like NA and datetime, I think the debate is pretty boring. The main problem that I see for discussing technical issues is whether there are many developers really interested in commenting on code and coding. I think it mostly comes down to the discussion on tickets or pull requests. First my own experience with scipy.stats. Most of the time when I was cleaning up scipy.stats, I was alone, except for some helpful comments by Robert. My itch was that the bugs in scipy.stats were bugging me, and I just kept working and committing without code review until the bugs that I thought urgent were gone. Now, with Warren and Ralf also working on scipy.stats it is a lot more fun, since there is actually a regular community of 3 developers. My impression (since I only
Re: [Numpy-discussion] What is consensus anyway
On Apr 25, 2012, at 7:18 PM, josef.p...@gmail.com wrote: Except for the big changes like NA and datetime, I think the debate is pretty boring. The main problem that I see for discussing technical issues is whether there are many developers really interested in commenting on code and coding. I think it mostly comes down to the discussion on tickets or pull requests. This is a very insightful comment. Github has been a great thing for both NumPy and SciPy. However, it has changed the community feel for many because these pull request discussions don't happen on this list. You have to comment on a pull request to get notified of future comments or changes. The process is actually pretty nice, but it does mean you can't just hang out watching this list. You have to look at the pull requests and get involved there. It would be nice if every pull request created a message to this list. Is that even possible? -Travis
Re: [Numpy-discussion] What is consensus anyway
On Wednesday, April 25, 2012, Travis Oliphant wrote: On Apr 25, 2012, at 7:18 PM, josef.p...@gmail.com wrote: Except for the big changes like NA and datetime, I think the debate is pretty boring. The main problem that I see for discussing technical issues is whether there are many developers really interested in commenting on code and coding. I think it mostly comes down to the discussion on tickets or pull requests. This is a very insightful comment. Github has been a great thing for both NumPy and SciPy. However, it has changed the community feel for many because these pull request discussions don't happen on this list. You have to comment on a pull request to get notified of future comments or changes. The process is actually pretty nice, but it does mean you can't just hang out watching this list. You have to look at the pull requests and get involved there. It would be nice if every pull request created a message to this list. Is that even possible? -Travis This has been a concern of mine for matplotlib as well. The closest I can come is to set up an RSS feed, but all the titles are PR # and an action, so I lose track of which ones I want to view. All devs get an initial email for each PR, but I can't figure out how to get that down to the public list, and it is hard to know if another dev took care of the PR or if it is just waiting. Cheers! Ben Root
Re: [Numpy-discussion] What is consensus anyway
On 4/25/12 8:11 PM, Travis Oliphant wrote: On Apr 25, 2012, at 7:18 PM, josef.p...@gmail.com wrote: Except for the big changes like NA and datetime, I think the debate is pretty boring. The main problem that I see for discussing technical issues is whether there are many developers really interested in commenting on code and coding. I think it mostly comes down to the discussion on tickets or pull requests. This is a very insightful comment. Github has been a great thing for both NumPy and SciPy. However, it has changed the community feel for many because these pull request discussions don't happen on this list. You have to comment on a pull request to get notified of future comments or changes. The process is actually pretty nice, but it does mean you can't just hang out watching this list. You have to look at the pull requests and get involved there. It would be nice if every pull request created a message to this list. Is that even possible? Sure. Github has a pretty extensive hook system that can notify (via hitting a URL) about lots of events. https://github.com/blog/964-all-of-the-hooks http://developer.github.com/v3/repos/hooks/ I haven't actually used it (just read the docs), so I may be mistaken... Thanks, Jason
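For anyone who wants to try the hook route Jason mentions, here is a rough sketch of the formatting half only: it takes the JSON body that a pull request hook delivers and turns it into a list message. The field names (`action`, `number`, `pull_request.title`, `pull_request.html_url`, `pull_request.user.login`) follow the pull request event payload as I understand it; the sender address and the function name are made up for illustration, and actually receiving the HTTP request and sending the mail are left out.

```python
import json
from email.message import EmailMessage

# The list address comes from this thread; the sender is a hypothetical bot.
LIST_ADDRESS = "NumPy-Discussion@scipy.org"
SENDER = "pr-bot@example.org"


def pr_event_to_email(raw_body):
    """Turn the raw JSON body of a pull request hook into an email.

    Returns None for anything other than a newly opened pull request,
    so the list gets one message per PR rather than one per event.
    """
    payload = json.loads(raw_body)
    if payload.get("action") != "opened" or "pull_request" not in payload:
        return None

    pr = payload["pull_request"]
    msg = EmailMessage()
    msg["From"] = SENDER
    msg["To"] = LIST_ADDRESS
    # Put the PR title in the subject, which is what the RSS feed lacks.
    msg["Subject"] = f"[PR #{payload['number']}] {pr['title']}"
    msg.set_content(
        f"{pr['user']['login']} opened a pull request:\n\n"
        f"{pr['html_url']}\n\n"
        f"{pr.get('body') or ''}\n"
    )
    return msg
```

Putting the title in the subject line is the point of the exercise, since the complaint elsewhere in the thread is that feed titles are only "PR # and an action". The list admins would still need to let mail from such a sender through.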
Re: [Numpy-discussion] What is consensus anyway
On Thu, Apr 26, 2012 at 6:41 AM, Travis Oliphant tra...@continuum.io wrote: [snip] It would be nice if every pull request created a message to this list. Is that even possible? That is definitely possible and shouldn't be too hard to do, like Jason said. But that can potentially cause some confusion, with some of the discussion starting off in the mailing list, and some of the discussion happening on the pull-request itself. Are my concerns justified? -- Puneeth ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On 4/25/12 11:08 PM, Puneeth Chaganti wrote: On Thu, Apr 26, 2012 at 6:41 AM, Travis Oliphant tra...@continuum.io wrote: [snip] It would be nice if every pull request created a message to this list. Is that even possible? That is definitely possible and shouldn't be too hard to do, like Jason said. But that can potentially cause some confusion, with some of the discussion starting off in the mailing list, and some of the discussion happening on the pull-request itself. Are my concerns justified? It wouldn't be too hard to have mailing list replies sent back to the pull request as comments (again, using the github API). Already, if you're on a ticket, you can just reply to a comment email and the reply is put as a comment in the pull request. Jason
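A sketch of the reverse direction, under the assumption that a list reply has already been matched to a pull request number: it only builds the HTTP request that would post the reply text as a comment, using the issue comments endpoint of the Github v3 API (pull request conversation comments go through the issues path). The function name, the token handling, and the owner/repo values in the usage note are illustrative, not something the thread specifies.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_comment_request(owner, repo, number, text, token):
    """Build (but do not send) a request that posts `text` as a comment.

    `number` is the pull request number; `token` is an API token for an
    account allowed to comment. Both are supplied by the caller.
    """
    url = f"{API_ROOT}/repos/{owner}/{repo}/issues/{number}/comments"
    data = json.dumps({"body": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )
```

Something like `build_comment_request("numpy", "numpy", 123, "Reply from the list", token)` would then be handed to `urllib.request.urlopen` by whatever service processes the list mail; the PR number 123 is a placeholder. Working out which pull request a given email belongs to is the harder part and is not addressed here.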
Re: [Numpy-discussion] What is consensus anyway
On Wed, Apr 25, 2012 at 11:08 PM, Puneeth Chaganti puncha...@gmail.com wrote: On Thu, Apr 26, 2012 at 6:41 AM, Travis Oliphant tra...@continuum.io wrote: [snip] It would be nice if every pull request created a message to this list. Is that even possible? That is definitely possible and shouldn't be too hard to do, like Jason said. But that can potentially cause some confusion, with some of the discussion starting off in the mailing list, and some of the discussion happening on the pull-request itself. Are my concerns justified? Related issue: some projects have a users' list and a devel list. It might be worth (re?)considering that option. They have their pros and cons, but I think I like the idea of a devel list and a separate "help wanted" list. Something else that might be helpful for contentious threads is a StackOverflow-esque system where readers can vote up the responses of others. Sometimes just an "I agree" or "I disagree" goes a long way, especially when you have many lurkers. On something else that was brought up: I do not consider myself competent/prepared enough to take on development, but it is not the case that I have _never_ felt the temptation. What I have found intimidating and stymieing is the perceived politics over development issues. The two places where I have felt this are a) on contentious threads on the list and b) what seem like legitimate patches and tickets on trac that seem to be languishing for no compelling technical reason. I would be hard-pressed to quote specifics, but I have encountered this feeling a few times. For my case it would not have mattered, because I doubt I would have contributed anything useful. However, it might be the case that more competent lurkers have felt the same way. The possibility of a patch relegated semipermanently to trac, or the possibility of getting caught up in the politics, is a bit of a disincentive. This is just an honest perception/observation. 
I am more of a "get on with it, get the code out and the rest will resolve itself eventually" kind of guy, so long political/philosophical/epistemic threads distance me. I know there are legitimate reasons to have these discussions. But it seems to me that they get a bit too wordy here sometimes. My 10E-2. -- srean
Re: [Numpy-discussion] What is consensus anyway
On Wed, Apr 25, 2012 at 6:28 PM, Benjamin Root ben.r...@ou.edu wrote: It would be nice if every pull request created a message to this list. Is that even possible? -Travis This has been a concern of mine for matplotlib as well. The closest I can come is to set up an RSS feed, but all the titles are PR # and an action, so I lose track of which ones I want to view. Same here for IPython. If anybody figures out a clean solution, please advertise it! I think a bunch of us want the same thing... Cheers, f
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 12:46 AM, Chris Barker chris.bar...@noaa.gov wrote: On Mon, Apr 23, 2012 at 3:08 PM, Travis Oliphant tra...@continuum.io wrote: Right now we are trying to balance difficult things: stable releases with experimental development. Perhaps a more formal development release system could help here. IIUC, numpy pretty much has two things: the latest release (and past ones) and master (and assorted experimental branches). If someone develops a new feature, we can either: have them submit a pull request, and people with the wherewithal can pull it, compile it, and start testing it on their own -- history shows that this is a small group. merge it with master -- and hope it gets the testing it should before it becomes part of a release, but: we are rightly hesitant to put experimental stuff in master, and it really doesn't get that much testing -- again only folks that are building master will even see it. Some projects have a more formal development release system. wxPython, for instance, has had for years development releases with odd numbers -- right now, the official release is 2.8.*, but there is a 2.9.* out there that is getting some use and testing. A couple of things help make this work: 1) Robin makes the effort to put out binaries for development releases -- it's easy to go get and give it a try. This is a good idea - not for development releases but for master. Building nightly/weekly binaries would help more people try out new features. 2) there is the wxversion system that makes it easy to install a new version of wx, and easily switch between them (it's actually broken on OS-X right now --- :-) ) -- this pre-dated virtualenv and friends, maybe virtualenv is enough for this now. wxversion was broken for a long time on Ubuntu too (~5 yrs ago). I don't exactly remember it as a good idea. virtualenv also doesn't help, because if you can use that you know how to build from source anyway. 
Ralf Anyway, it's a thought -- I think some more real-world use of new features before a real commitment to adopting them would be great. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
Re: [Numpy-discussion] What is consensus anyway
On Mon, Apr 23, 2012 at 11:35 PM, Fernando Perez fperez@gmail.comwrote: On Mon, Apr 23, 2012 at 8:49 PM, Stéfan van der Walt ste...@sun.ac.za wrote: If you are referring to the traditional concept of a fork, and not to the type we frequently make on GitHub, then I'm surprised that no one has objected already. What would a fork solve? To paraphrase the regexp saying: after forking, we'll simply have two problems. I concur with you here: github 'forks', yes, as many as possible! Hopefully every one of those will produce one or more PRs :) But a fork in the sense of a divergent parallel project? I think that would only be indicative of a complete failure to find a way to make progress here, and I doubt we're anywhere near that state. That forks are *possible* is indeed a valuable and important option in open source software, because it means that a truly dysfunctional original project team/direction can't hold a community hostage forever. But that doesn't mean that full-blown forks should be considered lightly, as they also carry enormous costs. I see absolutely nothing in the current scenario to even remotely consider that a full-blown fork would be a good idea, and I hope I'm right. It seems to me we're making progress on problems that led to real difficulties last year, but from multiple parties I see signs that give me reason to be optimistic that the project is getting better, not worse. We certainly aren't there at the moment, but I can see us heading that way. But let's back up a bit. Numpy 1.6.0 came out just about 1 year ago. Since then datetime, NA, polynomial work, and various other enhancements have gone in along with some 280 bug fixes. The major technical problem blocking a 1.7 release is getting datetime working reliably on windows. So I think that is where the short term effort needs to be. 
Meanwhile, we are spending effort to get out a 1.6.2 just so people can work with a stable version with some of the bug fixes, and potentially we will spend more time and effort to pull out the NA code. In the future there may be a transition to C++ and eventually a break with the current ABI. Or not. There are at least two motivations that get folks to write code for open source projects, scratching an itch and money. Money hasn't been a big part of the Numpy picture so far, so that leaves scratching an itch. One of the attractions of Numpy is that it is a small project, BSD licensed, and not overburdened with governance and process. This makes scratching an itch not as difficult as it would be in a large project. If Numpy remains a small project but acquires the encumbrances of a big project much of that attraction will be lost. Momentum and direction also attracts people, but numpy is stalled at the moment as the whole NA thing circles around once again. What would I suggest as a way forward with the NA option. Let's take the issues. 1) Adding slots to PyArrayObject_fields. I don't think this is likely to be a problem unless someone's code passes the struct by value or uses assignment to initialize a statically allocated instance. I'm not saying no one does that, low level scientific code can contain all sorts of bizarre and astonishing constructs and it is also possible that these sort of things might turn up in an old FORTRAN program. The question here is whether to allow any changes at all, and I think we will have to in the future. Given that, consistent use of accessors will make later changes to the organization or implementation of the base structure transparent. Numpy itself now uses accessors for the heritage slots, but not for the new NA slots. So I suggest at a minimum adding accessors for the maskna_dtype, maskna_data, and maskna_strides. Of course, later removing these slots will still remain a problem. 2) NA. 
This breaks down into API and implementation issues. Personally, I think marking the NA stuff experimental leaves room to modify both and would prefer to go with what we have and change it into whatever looks best by modification through pull requests. This kicks the can down the road, but not so far that people sufficiently interested in working on the topic can't get modifications in. My own preferences for future API modifications are as follows. a) All arrays should be implicitly masked, even if the mask isn't initially allocated. The maskna keyword can then be removed, taking with it the sense that there are two kinds of arrays. b) There needs to be a distinction between missing and ignore. The mechanism for this is already in place in the payload type, although it isn't clear to me that that is uniformly used in all the NA code. There is also a place for missing *and* ignored. Which leads to c) Sums, etc. should always skip ignored data. If missing data is present, but not ignored, then a sum should return NA. The main danger I see here is that the behavior of arrays becomes state dependent, something that can lead to subtle
Re: [Numpy-discussion] What is consensus anyway
Hi, On 24/04/2012 15:14, Charles R Harris wrote: a) All arrays should be implicitly masked, even if the mask isn't initially allocated. The maskna keyword can then be removed, taking with it the sense that there are two kinds of arrays.

From my lazy user perspective, having masked and non-masked arrays share the same look and feel would be a number one advantage over the existing numpy.ma arrays. I would like masked arrays to be as transparent as possible.

b) There needs to be a distinction between missing and ignore. The mechanism for this is already in place in the payload type, although it isn't clear to me that that is uniformly used in all the NA code. There is also a place for missing *and* ignored. Which leads to

If the idea of having two payloads is to avoid as many extra keywords of the skipna kind as possible, I would like it very much. My feeling, from my small experience with R, is that I end up calling every function with a different magical set of keywords (na.rm, na.action, ... and others I forget).

My 2 lazy user cents... Best, Pierre
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 9:43 AM, Pierre Haessig pierre.haes...@crans.org wrote: Hi, Le 24/04/2012 15:14, Charles R Harris a écrit : a) All arrays should be implicitly masked, even if the mask isn't initially allocated. The maskna keyword can then be removed, taking with it the sense that there are two kinds of arrays. From my lazy user perspective, having masked and non-masked arrays share the same look and feel would be a number one advantage over the existing numpy.ma arrays. I would like masked array to be as transparent as possible. I don't have any opinion about internal implementation. But users needs to be aware of whether they have masked arrays or not. Since many functions (most of scipy) wouldn't know how to handle NA and don't do any checks, (and shouldn't in my opinion if the NA check is costly). The result might be silently wrong numbers depending on the implementation. b) There needs to be a distinction between missing and ignore. The mechanism for this is already in place in the payload type, although it isn't clear to me that that is uniformly used in all the NA code. There is also a place for missing *and* ignored. Which leads to If the idea of having two payloads is to avoid a maximum of skipna friends extra keywords, I would like it much. My feeling with my small experience with R is that I end up calling every function with a different magical set of keywords (na.rm, na.action, ... and I forgot). There is a reason for requiring the user to decide what to do about NA's. Either we have utility functions/methods to help the user change the arrays and treat NA's before calling a function, or the function needs to ask the user what should be done about possible NAs. Doing it automatically might only be useful for specialised packages. My 2c Josef My 2 lazy user cents... 
Best, Pierre
Re: [Numpy-discussion] What is consensus anyway
Hi, On Tue, Apr 24, 2012 at 6:14 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Apr 23, 2012 at 11:35 PM, Fernando Perez fperez@gmail.com wrote: On Mon, Apr 23, 2012 at 8:49 PM, Stéfan van der Walt ste...@sun.ac.za wrote: If you are referring to the traditional concept of a fork, and not to the type we frequently make on GitHub, then I'm surprised that no one has objected already. What would a fork solve? To paraphrase the regexp saying: after forking, we'll simply have two problems. I concur with you here: github 'forks', yes, as many as possible! Hopefully every one of those will produce one or more PRs :) But a fork in the sense of a divergent parallel project? I think that would only be indicative of a complete failure to find a way to make progress here, and I doubt we're anywhere near that state. That forks are *possible* is indeed a valuable and important option in open source software, because it means that a truly dysfunctional original project team/direction can't hold a community hostage forever. But that doesn't mean that full-blown forks should be considered lightly, as they also carry enormous costs. I see absolutely nothing in the current scenario to even remotely consider that a full-blown fork would be a good idea, and I hope I'm right. It seems to me we're making progress on problems that led to real difficulties last year, but from multiple parties I see signs that give me reason to be optimistic that the project is getting better, not worse. We certainly aren't there at the moment, but I can see us heading that way. But let's back up a bit. Numpy 1.6.0 came out just about 1 year ago. Since then datetime, NA, polynomial work, and various other enhancements have gone in along with some 280 bug fixes. The major technical problem blocking a 1.7 release is getting datetime working reliably on windows. So I think that is where the short term effort needs to be. 
Meanwhile, we are spending effort to get out a 1.6.2 just so people can work with a stable version with some of the bug fixes, and potentially we will spend more time and effort to pull out the NA code. In the future there may be a transition to C++ and eventually a break with the current ABI. Or not. There are at least two motivations that get folks to write code for open source projects, scratching an itch and money. Money hasn't been a big part of the Numpy picture so far, so that leaves scratching an itch. One of the attractions of Numpy is that it is a small project, BSD licensed, and not overburdened with governance and process. This makes scratching an itch not as difficult as it would be in a large project. If Numpy remains a small project but acquires the encumbrances of a big project much of that attraction will be lost. Momentum and direction also attracts people, but numpy is stalled at the moment as the whole NA thing circles around once again. I think your assumptions are incorrect, although I have seen them before. No stated process leads to less encumbrance if and only if the implicit process works. It clearly doesn't work, precisely because the NA thing is circling round and round again. And the governance discussion. And previously the ABI breakage discussion. If you are on other mailing lists, I'm sure you are, you'll see that this does not happen to - say - Cython, or Sympy. In particular, I have not seen, on those lists, the current numpy way of simply blocking or avoiding discussion. Everything is discussed out to agreement, or at least until all parties accept the way forward. At the moment, the only hope I could imagine for the 'no governance is good governance' method, is that all those who don't agree would just shut up. It would be more peaceful, but for the reasons stated by Nathaniel, I think that would be a very bad outcome. 
Best, Matthew
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 9:25 AM, josef.p...@gmail.com wrote: On Tue, Apr 24, 2012 at 9:43 AM, Pierre Haessig pierre.haes...@crans.org wrote: Hi, Le 24/04/2012 15:14, Charles R Harris a écrit : a) All arrays should be implicitly masked, even if the mask isn't initially allocated. The maskna keyword can then be removed, taking with it the sense that there are two kinds of arrays. From my lazy user perspective, having masked and non-masked arrays share the same look and feel would be a number one advantage over the existing numpy.ma arrays. I would like masked array to be as transparent as possible. I don't have any opinion about internal implementation. But users needs to be aware of whether they have masked arrays or not. Since many functions (most of scipy) wouldn't know how to handle NA and don't do any checks, (and shouldn't in my opinion if the NA check is costly). The result might be silently wrong numbers depending on the implementation. There should be a flag saying whether or not NA has been allocated and allocation happens when NA is assigned to an array item, so that should be fast. I don't think scipy currently deals with masked arrays in all areas,, so I believe that the same problem exists there and would also exist for missing data types. I think this sort of compatibility problem is worth a whole discussion by itself. b) There needs to be a distinction between missing and ignore. The mechanism for this is already in place in the payload type, although it isn't clear to me that that is uniformly used in all the NA code. There is also a place for missing *and* ignored. Which leads to If the idea of having two payloads is to avoid a maximum of skipna friends extra keywords, I would like it much. My feeling with my small experience with R is that I end up calling every function with a different magical set of keywords (na.rm, na.action, ... and I forgot). There is a reason for requiring the user to decide what to do about NA's. 
Either we have utility functions/methods to help the user change the arrays and treat NA's before calling a function, or the function needs to ask the user what should be done about possible NAs. Doing it automatically might only be useful for specialised packages. That's what the different payloads would do. I think the common use case would always have the ignore bit set. What are the other sorts of actions you are interested in, and should they be part of the functions in Numpy, such as mean and std, or should they rather implemented in stats packages that may be more specialized? I see numpy.ma currently used in the following spots in scipy: scipy/stats/mstats_extras.py scipy/stats/tests/test_mstats_extras.py scipy/stats/tests/test_mstats_basic.py scipy/stats/mstats_basic.py scipy/signal/filter_design.py scipy/optimize/optimize.py The advantage of nans, I suppose, is that they are in the hardware and so already universally part of Numpy. NA would be introduced, so would require a bit more work. I expect it will be several (many) years before they are dealt with as a matter of course. At minimum, one would need to check if the masked flag is set. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 12:12 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Apr 24, 2012 at 9:25 AM, josef.p...@gmail.com wrote: On Tue, Apr 24, 2012 at 9:43 AM, Pierre Haessig pierre.haes...@crans.org wrote: Hi, On 24/04/2012 15:14, Charles R Harris wrote: a) All arrays should be implicitly masked, even if the mask isn't initially allocated. The maskna keyword can then be removed, taking with it the sense that there are two kinds of arrays.

From my lazy user perspective, having masked and non-masked arrays share the same look and feel would be a number one advantage over the existing numpy.ma arrays. I would like masked arrays to be as transparent as possible.

I don't have any opinion about internal implementation. But users need to be aware of whether they have masked arrays or not, since many functions (most of scipy) wouldn't know how to handle NA and don't do any checks (and shouldn't, in my opinion, if the NA check is costly). The result might be silently wrong numbers depending on the implementation.

There should be a flag saying whether or not NA has been allocated, and allocation happens when NA is assigned to an array item, so that should be fast. I don't think scipy currently deals with masked arrays in all areas, so I believe that the same problem exists there and would also exist for missing data types. I think this sort of compatibility problem is worth a whole discussion by itself.

To clarify a bit, an item could be marked as both missing and ignored. An item that is marked missing will propagate as missing, but if it is also ignored then things like mean and std will skip it. There would also be a clear operation that would clear the ignore bit but keep the missing bit.
Now I can see the advantage of explicitly specifying behavior in functions, as one knows right at the spot what is intended, whereas with the other alternative one needs to know the history of the array and whether ignore was ever set. But in that sense it is just like having default keyword values, and could be implemented as such. <snip> Chuck
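The missing/ignored distinction described in this message can be sketched in a few lines. This is only an illustration under stated assumptions: the two flags are modeled as separate boolean arrays alongside the data, NA is modeled as None, and masked_sum and clear_ignore are hypothetical helper names, not part of any NumPy API.

```python
import numpy as np

def masked_sum(values, missing, ignored):
    """Sum under the proposed semantics (hypothetical helper, not a NumPy API).

    Ignored items are skipped. A missing item that is not also ignored
    propagates, so the result is NA (modeled here as None).
    """
    values = np.asarray(values, dtype=float)
    missing = np.asarray(missing, dtype=bool)
    ignored = np.asarray(ignored, dtype=bool)
    if np.any(missing & ~ignored):
        return None  # stands in for NA
    return float(values[~ignored].sum())

def clear_ignore(ignored):
    """Clear every ignore bit; the missing bits live in a separate array and are kept."""
    return np.zeros_like(np.asarray(ignored, dtype=bool))

values = np.array([1.0, 2.0, 3.0, 4.0])
missing = np.array([False, True, False, False])
ignored = np.array([False, True, False, False])

print(masked_sum(values, missing, ignored))                # 8.0: item 1 is skipped
print(masked_sum(values, missing, clear_ignore(ignored)))  # None: item 1 now propagates
```

The two calls give different answers for the same data, which is the state dependence the message mentions: the result depends on whether ignore was ever set on the array.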
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 2:12 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Apr 24, 2012 at 9:25 AM, josef.p...@gmail.com wrote: On Tue, Apr 24, 2012 at 9:43 AM, Pierre Haessig pierre.haes...@crans.org wrote: Hi, Le 24/04/2012 15:14, Charles R Harris a écrit : a) All arrays should be implicitly masked, even if the mask isn't initially allocated. The maskna keyword can then be removed, taking with it the sense that there are two kinds of arrays. From my lazy user perspective, having masked and non-masked arrays share the same look and feel would be a number one advantage over the existing numpy.ma arrays. I would like masked array to be as transparent as possible. I don't have any opinion about internal implementation. But users needs to be aware of whether they have masked arrays or not. Since many functions (most of scipy) wouldn't know how to handle NA and don't do any checks, (and shouldn't in my opinion if the NA check is costly). The result might be silently wrong numbers depending on the implementation. There should be a flag saying whether or not NA has been allocated and allocation happens when NA is assigned to an array item, so that should be fast. I don't think scipy currently deals with masked arrays in all areas,, so I believe that the same problem exists there and would also exist for missing data types. I think this sort of compatibility problem is worth a whole discussion by itself. b) There needs to be a distinction between missing and ignore. The mechanism for this is already in place in the payload type, although it isn't clear to me that that is uniformly used in all the NA code. There is also a place for missing *and* ignored. Which leads to If the idea of having two payloads is to avoid a maximum of skipna friends extra keywords, I would like it much. My feeling with my small experience with R is that I end up calling every function with a different magical set of keywords (na.rm, na.action, ... and I forgot). 
There is a reason for requiring the user to decide what to do about NA's. Either we have utility functions/methods to help the user change the arrays and treat NA's before calling a function, or the function needs to ask the user what should be done about possible NAs. Doing it automatically might only be useful for specialised packages. That's what the different payloads would do. I think the common use case would always have the ignore bit set. What are the other sorts of actions you are interested in, and should they be part of the functions in Numpy, such as mean and std, or should they rather implemented in stats packages that may be more specialized? I see numpy.ma currently used in the following spots in scipy: Like you said, this whole issue probably should be in a separate discussion, but I would like to point out here with my thoughts on default payload. If we don't have some sort of mechanism for flagging which functions are NA-friendly or not, then it would be wise to have NA default to NaN behavior. If only to prevent bugs that mess up data from being undetected. That being said, the determination of NA payload is tricky. Some functions may need to react differently to an NA. One that comes to mind is np.gradient(). However, other functions may not need to do anything because they depend entirely upon other functions that have already been updated to support NA. Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
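The np.gradient() remark can be illustrated with NaN standing in for NA (an assumption made only for this sketch, since NaN is what current NumPy propagates). With the default central differences, a bad value contaminates its neighbours rather than its own position:

```python
import numpy as np

# NaN is used here as a stand-in for NA; this is an assumption of the sketch.
x = np.array([1.0, 2.0, np.nan, 4.0, 5.0])

# Interior points use (x[i+1] - x[i-1]) / 2, so the NaN at index 2
# shows up in the output at indices 1 and 3, not at index 2 itself.
grad = np.gradient(x)
nan_positions = np.isnan(grad)

print(grad)
print(nan_positions)
```

A function like this cannot simply skip or propagate the bad element in place, which is why a single default payload behaviour may not suit every function.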
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 2:43 PM, Pierre Haessig pierre.haes...@crans.org wrote: If the idea of having two payloads is to avoid a maximum of skipna friends extra keywords, I would like it much. My feeling with my small experience with R is that I end up calling every function with a different magical set of keywords (na.rm, na.action, ... and I forgot). While I can't in general defend R on consistency grounds, there is a logic to this particular case. Most basic R functions like 'sum' take the na.rm= argument, which can be True or False and is equivalent to the skipna argument we've talked about for ufuncs. The functions that take other arguments (like na.action= for model fitting functions, or use= for their equivalent of np.corrcoef) are the ones that have *more* than 2 ways to handle NAs. E.g. model fitting functions given NAs can raise an error, skip the NA cases, or pass the NA cases through, and the correlation matrix function has different options for what to do with cases where one column has an NA but there are two others that don't. Having a distinction between missing and ignored values doesn't really affect whether you need such options. (If anything I guess it could make such options even more complicated -- what if I want my regression function to error out on missing but skip over ignored values, etc.) - N ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
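The difference between a two-way switch such as na.rm= and a multi-way option such as na.action= can be sketched as follows. NaN stands in for NA, and total, prepare_rows and the option names "raise", "omit" and "pass" are hypothetical choices for illustration; only np.sum, np.nansum and np.isnan are real NumPy calls.

```python
import numpy as np

def total(x, skipna=False):
    """Two-way choice, like R's na.rm=: propagate NaN or skip it."""
    x = np.asarray(x, dtype=float)
    return float(np.nansum(x)) if skipna else float(np.sum(x))

def prepare_rows(X, na_action="raise"):
    """Three-way choice, loosely like na.action= for model fitting (hypothetical)."""
    X = np.asarray(X, dtype=float)
    bad = np.isnan(X).any(axis=1)  # rows containing at least one NaN
    if na_action == "raise":
        if bad.any():
            raise ValueError("input contains NA values")
        return X
    if na_action == "omit":
        return X[~bad]             # drop incomplete rows
    if na_action == "pass":
        return X                   # hand the NaNs through unchanged
    raise ValueError(f"unknown na_action: {na_action!r}")

print(total([1.0, np.nan, 2.0], skipna=True))   # 3.0
print(prepare_rows([[1.0, 2.0], [np.nan, 3.0]], na_action="omit"))
```

A boolean keyword covers the reduction case, but the row-wise case needs a richer option regardless of whether missing and ignored are distinguished.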
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 2:35 PM, Benjamin Root ben.r...@ou.edu wrote: On Tue, Apr 24, 2012 at 2:12 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Apr 24, 2012 at 9:25 AM, josef.p...@gmail.com wrote: On Tue, Apr 24, 2012 at 9:43 AM, Pierre Haessig pierre.haes...@crans.org wrote: Hi, Le 24/04/2012 15:14, Charles R Harris a écrit : a) All arrays should be implicitly masked, even if the mask isn't initially allocated. The maskna keyword can then be removed, taking with it the sense that there are two kinds of arrays. From my lazy user perspective, having masked and non-masked arrays share the same look and feel would be a number one advantage over the existing numpy.ma arrays. I would like masked array to be as transparent as possible. I don't have any opinion about internal implementation. But users needs to be aware of whether they have masked arrays or not. Since many functions (most of scipy) wouldn't know how to handle NA and don't do any checks, (and shouldn't in my opinion if the NA check is costly). The result might be silently wrong numbers depending on the implementation. There should be a flag saying whether or not NA has been allocated and allocation happens when NA is assigned to an array item, so that should be fast. I don't think scipy currently deals with masked arrays in all areas,, so I believe that the same problem exists there and would also exist for missing data types. I think this sort of compatibility problem is worth a whole discussion by itself. b) There needs to be a distinction between missing and ignore. The mechanism for this is already in place in the payload type, although it isn't clear to me that that is uniformly used in all the NA code. There is also a place for missing *and* ignored. Which leads to If the idea of having two payloads is to avoid a maximum of skipna friends extra keywords, I would like it much. 
My feeling with my small experience with R is that I end up calling every function with a different magical set of keywords (na.rm, na.action, ... and I forgot). There is a reason for requiring the user to decide what to do about NA's. Either we have utility functions/methods to help the user change the arrays and treat NA's before calling a function, or the function needs to ask the user what should be done about possible NAs. Doing it automatically might only be useful for specialised packages. That's what the different payloads would do. I think the common use case would always have the ignore bit set. What are the other sorts of actions you are interested in, and should they be part of the functions in Numpy, such as mean and std, or should they rather implemented in stats packages that may be more specialized? I see numpy.ma currently used in the following spots in scipy: I think most functions that operate on an axis are mostly unambiguous ignore, std, mean, var, histogram, should stay in numpy, np.cov might have pairwise or row/column wise deletion option (but I don't know what other packages are doing). (While I had to run off, Nathaniel explained this.) The main cases in stats (or statsmodels) for handling NaNs or NAs would be rowwise ignore or pretend temporarily that they are zero or some other neutral value. Like you said, this whole issue probably should be in a separate discussion, but I would like to point out here with my thoughts on default payload. If we don't have some sort of mechanism for flagging which functions are NA-friendly or not, then it would be wise to have NA default to NaN behavior. If only to prevent bugs that mess up data from being undetected. In scipy.stats it's currently the responsibility of the user, unless explicitly mentioned that a function knows how to handle nans or masked arrays, the default is we don't check and what you get returned might be anything. 
If there is a flag (and a cheap way to verify whether there are NaNs or NAs), then we could just add a check in every function. Josef That being said, the determination of NA payload is tricky. Some functions may need to react differently to an NA. One that comes to mind is np.gradient(). However, other functions may not need to do anything because they depend entirely upon other functions that have already been updated to support NA. Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
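A per-function check of the kind Josef mentions might look like the sketch below. It assumes NaN as the marker and scans the whole array, which is exactly the cost a cheap flag would avoid; reject_nan and spread are hypothetical names, not scipy.stats functions.

```python
import functools
import numpy as np

def reject_nan(func):
    """Decorator (hypothetical) that refuses NaN input instead of returning garbage."""
    @functools.wraps(func)
    def wrapper(x, *args, **kwargs):
        x = np.asarray(x, dtype=float)
        # Full scan of the data; a stored "has NA" flag would make this O(1).
        if np.isnan(x).any():
            raise ValueError(f"{func.__name__} does not handle NaN input")
        return func(x, *args, **kwargs)
    return wrapper

@reject_nan
def spread(x):
    """Toy statistic used only to demonstrate the check."""
    return float(x.max() - x.min())

print(spread([1.0, 5.0, 3.0]))  # 4.0
```

With such a wrapper the failure is loud rather than silent, at the price of one extra pass over the data per call.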
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris charlesr.har...@gmail.com wrote: The advantage of nans, I suppose, is that they are in the hardware and so

Why are we having a discussion on NAN's in a thread on consensus? This is a strong indicator of the problem we're facing.

Stéfan
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 3:23 PM, Stéfan van der Walt ste...@sun.ac.za wrote: On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris charlesr.har...@gmail.com wrote: The advantage of nans, I suppose, is that they are in the hardware and so

Why are we having a discussion on NAN's in a thread on consensus? This is a strong indicator of the problem we're facing. Stéfan

Good catch! Looks like we got off-track when the discussion talked about forks.

Ben Root
Re: [Numpy-discussion] What is consensus anyway
2012/4/24 Stéfan van der Walt ste...@sun.ac.za On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris charlesr.har...@gmail.com wrote: The advantage of nans, I suppose, is that they are in the hardware and so

Why are we having a discussion on NAN's in a thread on consensus? This is a strong indicator of the problem we're facing.

We seem to have a consensus regarding interest in the topic.

Chuck
Re: [Numpy-discussion] What is consensus anyway
Hi, On Tue, Apr 24, 2012 at 2:25 PM, Charles R Harris charlesr.har...@gmail.com wrote: 2012/4/24 Stéfan van der Walt ste...@sun.ac.za On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris charlesr.har...@gmail.com wrote: The advantage of nans, I suppose, is that they are in the hardware and so

Why are we having a discussion on NAN's in a thread on consensus? This is a strong indicator of the problem we're facing.

We seem to have a consensus regarding interest in the topic.

This email is mainly to Travis. This thread seems to be dying, condemning us to keep repeating the same conversation with no result. Chuck has made it clear he is not interested in this conversation. Until it is clear you are interested in this conversation, it will keep dying. As you know, I think that will be very bad for numpy, and, as you know, I care a great deal about that. So, please, if you care about this, and agree that something should be done, say so, and if you don't agree something should be done, say so. It can't get better without your help.

See you, Matthew
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 2:25 PM, Charles R Harris charlesr.har...@gmail.com wrote: Why are we having a discussion on NAN's in a thread on consensus? This is a strong indicator of the problem we're facing. We seem to have a consensus regarding interest in the topic.

For the benefit of those of us interested in both discussions, would you kindly start a new thread on the MA topic?

In response to Travis's suggestion of writing up a short summary of community principles, as well as Matthew's initial formulation, I agree that this would be helpful in enshrining the values we cherish here, as well as in communicating those values to the next generation of developers. From observing the community, I would guess that these values include:

- That any party with an interest in NumPy is given the opportunity to speak and to be heard on the list.
- That discussions that influence the course of the project take place openly, for anyone to observe.
- That decisions are made once consensus is reached, i.e., if everyone agrees that they can live with the outcome.

To summarize: NumPy development that is free, fair, open and unified. We'll sometimes mess up and not follow our own guidelines, but with them in place at least we'll have something to refer back to as a reminder.

Regards Stéfan
Re: [Numpy-discussion] What is consensus anyway
Thanks for the reminder, Stéfan, and for keeping us on track. It is very helpful to those trying to sort through the messages to keep the discussions to one subject per thread.

-Travis

On Apr 24, 2012, at 2:23 PM, Stéfan van der Walt wrote: On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris charlesr.har...@gmail.com wrote: The advantage of nans, I suppose, is that they are in the hardware and so

Why are we having a discussion on NAN's in a thread on consensus? This is a strong indicator of the problem we're facing. Stéfan
Re: [Numpy-discussion] What is consensus anyway
On Apr 24, 2012, at 6:01 PM, Stéfan van der Walt wrote: On Tue, Apr 24, 2012 at 2:25 PM, Charles R Harris charlesr.har...@gmail.com wrote: Why are we having a discussion on NAN's in a thread on consensus? This is a strong indicator of the problem we're facing. We seem to have a consensus regarding interest in the topic.

For the benefit of those of us interested in both discussions, would you kindly start a new thread on the MA topic? In response to Travis's suggestion of writing up a short summary of community principles, as well as Matthew's initial formulation, I agree that this would be helpful in enshrining the values we cherish here, as well as in communicating those values to the next generation of developers. From observing the community, I would guess that these values include:

- That any party with an interest in NumPy is given the opportunity to speak and to be heard on the list.
- That discussions that influence the course of the project take place openly, for anyone to observe.
- That decisions are made once consensus is reached, i.e., if everyone agrees that they can live with the outcome.

This is well stated. Thank you, Stéfan. Some will argue about what consensus means or who everyone is. But if we are really worrying about that, then we have stopped listening to each other, which is the number one community value that we should be promoting, demonstrating, and living by.

Consensus to me means that anyone who can produce a well-reasoned argument, and who demonstrates by their persistence that they are actually using the code and are aware of the issues, has veto power on pull requests. At times people with commit rights to NumPy might perform a pull request anyway, but they should acknowledge at least in the comment (but for major changes --- on this list) that they are doing so and provide their reasons.
If I decide later that I think the pull request was made inappropriately in the face of objections and the reasons were not justified, then I will reserve the right to revert the pull request. I would like core developers of NumPy to have the same ability to check me as well. But if there is a disagreement at that level, then I will reserve the right to decide.

Basically, what we have in this situation is that the masked arrays were added to NumPy master with serious objections to the API. What I'm trying to decide right now is whether we can move forward and satisfy the objections without removing the ndarrayobject changes entirely (I do think the concerns warrant removal of the changes). The discussion around that is the most helpful right now, but should take place on another thread.

Thanks, -Travis
Re: [Numpy-discussion] What is consensus anyway
On Apr 24, 2012, at 5:52 PM, Matthew Brett wrote: Hi, On Tue, Apr 24, 2012 at 2:25 PM, Charles R Harris charlesr.har...@gmail.com wrote: 2012/4/24 Stéfan van der Walt ste...@sun.ac.za On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris charlesr.har...@gmail.com wrote: The advantage of nans, I suppose, is that they are in the hardware and so Why are we having a discussion on NAN's in a thread on consensus? This is a strong indicator of the problem we're facing. We seem to have a consensus regarding interest in the topic. This email is mainly to Travis. This thread seems to be dying, condemning us to keep repeating the same conversation with no result. Chuck has made it clear he is not interested in this conversation. Until it is clear you are interested in this conversation, it will keep dying. As you know, I think that will be very bad for numpy, and, as you know, I care a great deal about that. I am interested in the conversation, but I think I've already stated my views as well as I know how. I'm not sure what else I should do at this point. We do need consensus (defined as the absence of serious objectors) for me to agree to a NumPy 1.X release. I don't think it helps us get to a consensus to further discuss non-technical issues at this point. There is much interest in ideas for finding common ground in the masked array situation, but that should happen on another thread. -Travis ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
2012/4/24 Stéfan van der Walt ste...@sun.ac.za On Tue, Apr 24, 2012 at 2:25 PM, Charles R Harris charlesr.har...@gmail.com wrote: Why are we having a discussion on NAN's in a thread on consensus? This is a strong indicator of the problem we're facing. We seem to have a consensus regarding interest in the topic. For the benefit of those of us interested in both discussions, would you kindly start a new thread on the MA topic? In response to Travis's suggestion of writing up a short summary of community principles, as well as Matthew's initial formulation, I agree that this would be helpful in enshrining the values we cherish here, as well as in communicating those values to the next generation of developers. I think we adhere to these pretty well already, the problem is with the word 'everyone'. I grew up in Massachusetts where town meetings were a tradition. At those meetings the townsfolk voted on the budget, zoning, construction of public buildings, use of public spaces and other such topics. A quorum of voters was needed to make the votes binding, and apart from that the meeting was limited to people who lived in the town, they, after all, paid the taxes and had to live with the decisions. Outsiders could sit in by invitation, but had to sit in a special area and were not expected to speak unless called upon and certainly couldn't vote. So that is one tradition, a democratic tradition with a history of success. We are a much smaller community, physically separated, and don't need that sort of exclusivity, but even so we have our version of resident and taxes, which consists of hanging out on the list and contributing work. I think everyone is welcome to express an opinion and make an argument, but not everyone has a veto. 
I think a veto is a privilege, not a right, and to have that privilege I think one needs to demonstrate an investment in the project, consisting in this case of code contributions, code review, and other such mundane tasks that demonstrate a larger interest and a willingness to work. Anyone can do this, it doesn't require permission or special dispensation, Numpy is very open in that regard. Folks working in related projects, such as ipython and pandas, are also going to be listened to because they have made that investment in time and work and the popularity of Numpy depends on keeping them happy. But a right to veto doesn't automatically extend to everyone who happens to have an interest in a topic. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 5:24 PM, Travis Oliphant tra...@continuum.io wrote: On Apr 24, 2012, at 6:01 PM, Stéfan van der Walt wrote: On Tue, Apr 24, 2012 at 2:25 PM, Charles R Harris charlesr.har...@gmail.com wrote: Why are we having a discussion on NAN's in a thread on consensus? This is a strong indicator of the problem we're facing. We seem to have a consensus regarding interest in the topic. For the benefit of those of us interested in both discussions, would you kindly start a new thread on the MA topic? In response to Travis's suggestion of writing up a short summary of community principles, as well as Matthew's initial formulation, I agree that this would be helpful in enshrining the values we cherish here, as well as in communicating those values to the next generation of developers. From observing the community, I would guess that these values include: - That any party with an interest in NumPy is given the opportunity to speak and to be heard on the list. - That discussions that influence the course of the project take place openly, for anyone to observe. - That decisions are made once consensus is reached, i.e., if everyone agrees that they can live with the outcome. This is well stated. Thank you Stefan. Some will argue about what consensus means or who everyone is. But, if we are really worrying about that, then we have stopped listening to each other, which is the number one community value that we should be promoting, demonstrating, and living by. Consensus to me means that anyone who can produce a well-reasoned argument and demonstrates by their persistence that they are actually using the code and are aware of the issues has veto power on pull requests. At times people with commit rights to NumPy might perform a pull request anyway, but they should acknowledge at least in the comment (but for major changes --- on this list) that they are doing so and provide their reasons. 
If I decide later that I think the pull request was made inappropriately in the face of objections and the reasons were not justified, then I will reserve the right to revert the pull request. I would like core developers of NumPy to have the same ability to check me as well. But, if there is a disagreement at that level, then I will reserve the right to decide. Basically, what we have in this situation is that the masked arrays were added to NumPy master with serious objections to the API. What I'm trying to decide right now is whether we can move forward and satisfy the objections without removing the ndarrayobject changes entirely (I do think the concerns warrant removal of the changes). The discussion around that is the most helpful right now, but should take place on another thread. Travis, if you are playing the BDFL role, then just make the darn decision and remove the code so we can get on with life. As it is, you go back and forth and that does none of us any good; you're a big guy and you're rocking the boat. I don't agree with that decision, I'd rather evolve the code we have, but I'm willing to compromise with your decision in this matter. I'm not willing to compromise with Nathaniel's, nor, it seems, vice versa. Nathaniel has volunteered to do the work, just ask him to submit a patch. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 4:49 PM, Charles R Harris charlesr.har...@gmail.com wrote: But a right to veto doesn't automatically extend to everyone who happens to have an interest in a topic. The time has long gone when we simply hacked on NumPy for our own benefit; if you will, NumPy users are our customers, and they have a stake in its development (or, to phrase it differently, I think we have a commitment to them). If we strongly encourage people to discuss, but still give them an avenue to object, we keep ourselves honest (both w.r.t. expectations on numpy and our own insight into problems and their solutions). Stéfan ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Tuesday, April 24, 2012, Matthew Brett wrote: Hi, On Tue, Apr 24, 2012 at 2:25 PM, Charles R Harris charlesr.har...@gmail.com wrote: 2012/4/24 Stéfan van der Walt ste...@sun.ac.za On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris charlesr.har...@gmail.com wrote: The advantage of nans, I suppose, is that they are in the hardware and so Why are we having a discussion on NAN's in a thread on consensus? This is a strong indicator of the problem we're facing. We seem to have a consensus regarding interest in the topic. This email is mainly to Travis. This thread seems to be dying, condemning us to keep repeating the same conversation with no result. Chuck has made it clear he is not interested in this conversation. Until it is clear you are interested in this conversation, it will keep dying. As you know, I think that will be very bad for numpy, and, as you know, I care a great deal about that. So, please, if you care about this, and agree that something should be done, please, say so, and if you don't agree something should be done, say so. It can't get better without your help, See you, Matthew Matthew, I agree with the general idea of consensus, and I think many of us here agree with the ideal in principle. Quite frankly, I am not sure what more you want from us. You are only going to get so much leeway on a philosophical discussion on governance on a numerical computation mailing list. The thread keeps dying (I say it is getting distracted) because coders are champing at the bit to get stuff done. In a sense, I think there is a consensus, if you will, to move on. All in favor, say Aye! Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Wed, Apr 25, 2012 at 12:49 AM, Charles R Harris charlesr.har...@gmail.com wrote: I think we adhere to these pretty well already, the problem is with the word 'everyone'. I grew up in Massachusetts where town meetings were a tradition. At those meetings the townsfolk voted on the budget, zoning, construction of public buildings, use of public spaces and other such topics. A quorum of voters was needed to make the votes binding, and apart from that the meeting was limited to people who lived in the town, they, after all, paid the taxes and had to live with the decisions. Outsiders could sit in by invitation, but had to sit in a special area and were not expected to speak unless called upon and certainly couldn't vote. So that is one tradition, a democratic tradition with a history of success. We are a much smaller community, physically separated, and don't need that sort of exclusivity, but even so we have our version of resident and taxes, which consists of hanging out on the list and contributing work. I think everyone is welcome to express an opinion and make an argument, but not everyone has a veto. I think a veto is a privilege, not a right, and to have that privilege I think one needs to demonstrate an investment in the project, consisting in this case of code contributions, code review, and other such mundane tasks that demonstrate a larger interest and a willingness to work. Anyone can do this, it doesn't require permission or special dispensation, Numpy is very open in that regard. Folks working in related projects, such as ipython and pandas, are also going to be listened to because they have made that investment in time and work and the popularity of Numpy depends on keeping them happy. But a right to veto doesn't automatically extend to everyone who happens to have an interest in a topic. Consensus-seeking isn't about privilege or moral rights. It's about ruthless pragmatism. 
The end of your message actually gets very close to the position I'm advocating -- except that I'm saying, instead of trying to judge which people are worth keeping happy by looking up their commit record on projects you've heard of, you're safer erring on the side of assuming that anyone taking the time to show up probably has some good reason for doing so, and that their concerns are probably shared by a larger group. You wouldn't refuse to try a chef's cooking until she's proven herself by washing dishes -- why the heck would you demand that people perform mundane tasks before you're willing to trust they have some insight? Acting as maintainer isn't a privilege -- it's a gift you give. So is feedback. Ignoring it is just a way of shooting your own project in the foot. - N ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Apr 24, 2012, at 7:16 PM, Stéfan van der Walt wrote: On Tue, Apr 24, 2012 at 4:49 PM, Charles R Harris charlesr.har...@gmail.com wrote: But a right to veto doesn't automatically extend to everyone who happens to have an interest in a topic. This is not my view, but it is Charles's view, and as he is an active developer in the NumPy community, this carries weight. I hope he can be convinced that active users are an important part of the community. Charles has made tremendous contributions to this community starting with significant code in Numarray that now lives in NumPy, significant commitment to code quality, significant effort on responding to pull requests, diligence in triaging and applying bug-fixes in tickets, and even responding to people who disagree with him. The time has long gone when we simply hacked on NumPy for our own benefit; if you will, NumPy users are our customers, and they have a stake in its development (or, to phrase it differently, I think we have a commitment to them). If we strongly encourage people to discuss, but still give them an avenue to object, we keep ourselves honest (both w.r.t. expectations on numpy and our own insight into problems and their solutions). +1 Stéfan ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 2:14 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Apr 23, 2012 at 11:35 PM, Fernando Perez fperez@gmail.com wrote: On Mon, Apr 23, 2012 at 8:49 PM, Stéfan van der Walt ste...@sun.ac.za wrote: If you are referring to the traditional concept of a fork, and not to the type we frequently make on GitHub, then I'm surprised that no one has objected already. What would a fork solve? To paraphrase the regexp saying: after forking, we'll simply have two problems. I concur with you here: github 'forks', yes, as many as possible! Hopefully every one of those will produce one or more PRs :) But a fork in the sense of a divergent parallel project? I think that would only be indicative of a complete failure to find a way to make progress here, and I doubt we're anywhere near that state. That forks are *possible* is indeed a valuable and important option in open source software, because it means that a truly dysfunctional original project team/direction can't hold a community hostage forever. But that doesn't mean that full-blown forks should be considered lightly, as they also carry enormous costs. I see absolutely nothing in the current scenario to even remotely consider that a full-blown fork would be a good idea, and I hope I'm right. It seems to me we're making progress on problems that led to real difficulties last year, but from multiple parties I see signs that give me reason to be optimistic that the project is getting better, not worse. We certainly aren't there at the moment, but I can see us heading that way. But let's back up a bit. Numpy 1.6.0 came out just about 1 year ago. Since then datetime, NA, polynomial work, and various other enhancements have gone in along with some 280 bug fixes. The major technical problem blocking a 1.7 release is getting datetime working reliably on windows. So I think that is where the short term effort needs to be. 
Meanwhile, we are spending effort to get out a 1.6.2 just so people can work with a stable version with some of the bug fixes, and potentially we will spend more time and effort to pull out the NA code. In the future there may be a transition to C++ and eventually a break with the current ABI. Or not. There are at least two motivations that get folks to write code for open source projects, scratching an itch and money. Money hasn't been a big part of the Numpy picture so far, so that leaves scratching an itch. One of the attractions of Numpy is that it is a small project, BSD licensed, and not overburdened with governance and process. This makes scratching an itch not as difficult as it would be in a large project. If Numpy remains a small project but acquires the encumbrances of a big project much of that attraction will be lost. Momentum and direction also attracts people, but numpy is stalled at the moment as the whole NA thing circles around once again. I don't think we need a fork, or to start maintaining separate stable and unstable trees, or any of the other complicated process changes that have been suggested. There are tons of projects that routinely make much bigger changes than we're talking about, and they do it without needing that kind of overhead. I know that these suggestions are all made in good faith, but they remind me of a line from that Apache page I linked earlier: People tend to avoid conflict and thrash around looking for something to substitute - somebody in charge, a rule, a process, stagnation. None of these tend to be very good substitutes for doing the hard work of resolving the conflict. I also think if you talk to potential contributors, you'll find that clear, simple processes and a history of respecting everyone's input are much more attractive than a no-rules free-for-all. Good engineering practices are not an encumbrance. Resolving conflicts before merging is a good engineering practice. 
What happened with the NA discussion is this: - There was substantial disagreement about whether NEP-style masks, or indeed, focusing on a mask-based implementation *at all*, was the best way forward. - There was also a perceived time constraint, that we had to either implement something immediately while Mark was there, or have nothing. So in the end, the latter concern outweighed the former, the discussion was cut off, and Mark's best guess at an API was merged into master. I totally understand how this decision made sense at the time, but the result is what we see now: it's left numpy stalled, rifts on the mailing list, boring discussions about process, and still no agreement about whether NEP-style masks will actually solve our users' problems. Getting past this isn't *complicated* -- it's just hard work. What would I suggest as a way forward with the NA option. Let's take the issues. 1) Adding slots to PyArrayObject_fields. I don't think this is likely to be a problem unless someone's code passes the struct by value or
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 6:56 PM, Nathaniel Smith n...@pobox.com wrote: On Tue, Apr 24, 2012 at 2:14 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Apr 23, 2012 at 11:35 PM, Fernando Perez fperez@gmail.com wrote: On Mon, Apr 23, 2012 at 8:49 PM, Stéfan van der Walt ste...@sun.ac.za wrote: If you are referring to the traditional concept of a fork, and not to the type we frequently make on GitHub, then I'm surprised that no one has objected already. What would a fork solve? To paraphrase the regexp saying: after forking, we'll simply have two problems. I concur with you here: github 'forks', yes, as many as possible! Hopefully every one of those will produce one or more PRs :) But a fork in the sense of a divergent parallel project? I think that would only be indicative of a complete failure to find a way to make progress here, and I doubt we're anywhere near that state. That forks are *possible* is indeed a valuable and important option in open source software, because it means that a truly dysfunctional original project team/direction can't hold a community hostage forever. But that doesn't mean that full-blown forks should be considered lightly, as they also carry enormous costs. I see absolutely nothing in the current scenario to even remotely consider that a full-blown fork would be a good idea, and I hope I'm right. It seems to me we're making progress on problems that led to real difficulties last year, but from multiple parties I see signs that give me reason to be optimistic that the project is getting better, not worse. We certainly aren't there at the moment, but I can see us heading that way. But let's back up a bit. Numpy 1.6.0 came out just about 1 year ago. Since then datetime, NA, polynomial work, and various other enhancements have gone in along with some 280 bug fixes. The major technical problem blocking a 1.7 release is getting datetime working reliably on windows. So I think that is where the short term effort needs to be. 
Meanwhile, we are spending effort to get out a 1.6.2 just so people can work with a stable version with some of the bug fixes, and potentially we will spend more time and effort to pull out the NA code. In the future there may be a transition to C++ and eventually a break with the current ABI. Or not. There are at least two motivations that get folks to write code for open source projects, scratching an itch and money. Money hasn't been a big part of the Numpy picture so far, so that leaves scratching an itch. One of the attractions of Numpy is that it is a small project, BSD licensed, and not overburdened with governance and process. This makes scratching an itch not as difficult as it would be in a large project. If Numpy remains a small project but acquires the encumbrances of a big project much of that attraction will be lost. Momentum and direction also attracts people, but numpy is stalled at the moment as the whole NA thing circles around once again. I don't think we need a fork, or to start maintaining separate stable and unstable trees, or any of the other complicated process changes that have been suggested. There are tons of projects that routinely make much bigger changes than we're talking about, and they do it without needing that kind of overhead. I know that these suggestions are all made in good faith, but they remind me of a line from that Apache page I linked earlier: People tend to avoid conflict and thrash around looking for something to substitute - somebody in charge, a rule, a process, stagnation. None of these tend to be very good substitutes for doing the hard work of resolving the conflict. I also think if you talk to potential contributors, you'll find that clear, simple processes and a history of respecting everyone's input are much more attractive than a no-rules free-for-all. Good engineering practices are not an encumbrance. Resolving conflicts before merging is a good engineering practice. 
What happened with the NA discussion is this: - There was substantial disagreement about whether NEP-style masks, or indeed, focusing on a mask-based implementation *at all*, was the best way forward. - There was also a perceived time constraint, that we had to either implement something immediately while Mark was there, or have nothing. So in the end, the latter concern outweighed the former, the discussion was cut off, and Mark's best guess at an API was merged into master. I totally understand how this decision made sense at the time, but the result is what we see now: it's left numpy stalled, rifts on the mailing list, boring discussions about process, and still no agreement about whether NEP-style masks will actually solve our users' problems. Getting past this isn't *complicated* -- it's just hard work. I admit to a certain curiosity about your own involvement in
Re: [Numpy-discussion] What is consensus anyway
Hi, On Tue, Apr 24, 2012 at 6:12 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Apr 24, 2012 at 6:56 PM, Nathaniel Smith n...@pobox.com wrote: On Tue, Apr 24, 2012 at 2:14 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Apr 23, 2012 at 11:35 PM, Fernando Perez fperez@gmail.com wrote: On Mon, Apr 23, 2012 at 8:49 PM, Stéfan van der Walt ste...@sun.ac.za wrote: If you are referring to the traditional concept of a fork, and not to the type we frequently make on GitHub, then I'm surprised that no one has objected already. What would a fork solve? To paraphrase the regexp saying: after forking, we'll simply have two problems. I concur with you here: github 'forks', yes, as many as possible! Hopefully every one of those will produce one or more PRs :) But a fork in the sense of a divergent parallel project? I think that would only be indicative of a complete failure to find a way to make progress here, and I doubt we're anywhere near that state. That forks are *possible* is indeed a valuable and important option in open source software, because it means that a truly dysfunctional original project team/direction can't hold a community hostage forever. But that doesn't mean that full-blown forks should be considered lightly, as they also carry enormous costs. I see absolutely nothing in the current scenario to even remotely consider that a full-blown fork would be a good idea, and I hope I'm right. It seems to me we're making progress on problems that led to real difficulties last year, but from multiple parties I see signs that give me reason to be optimistic that the project is getting better, not worse. We certainly aren't there at the moment, but I can see us heading that way. But let's back up a bit. Numpy 1.6.0 came out just about 1 year ago. Since then datetime, NA, polynomial work, and various other enhancements have gone in along with some 280 bug fixes. 
The major technical problem blocking a 1.7 release is getting datetime working reliably on windows. So I think that is where the short term effort needs to be. Meanwhile, we are spending effort to get out a 1.6.2 just so people can work with a stable version with some of the bug fixes, and potentially we will spend more time and effort to pull out the NA code. In the future there may be a transition to C++ and eventually a break with the current ABI. Or not. There are at least two motivations that get folks to write code for open source projects, scratching an itch and money. Money hasn't been a big part of the Numpy picture so far, so that leaves scratching an itch. One of the attractions of Numpy is that it is a small project, BSD licensed, and not overburdened with governance and process. This makes scratching an itch not as difficult as it would be in a large project. If Numpy remains a small project but acquires the encumbrances of a big project much of that attraction will be lost. Momentum and direction also attracts people, but numpy is stalled at the moment as the whole NA thing circles around once again. I don't think we need a fork, or to start maintaining separate stable and unstable trees, or any of the other complicated process changes that have been suggested. There are tons of projects that routinely make much bigger changes than we're talking about, and they do it without needing that kind of overhead. I know that these suggestions are all made in good faith, but they remind me of a line from that Apache page I linked earlier: People tend to avoid conflict and thrash around looking for something to substitute - somebody in charge, a rule, a process, stagnation. None of these tend to be very good substitutes for doing the hard work of resolving the conflict. I also think if you talk to potential contributors, you'll find that clear, simple processes and a history of respecting everyone's input are much more attractive than a no-rules free-for-all. 
Good engineering practices are not an encumbrance. Resolving conflicts before merging is a good engineering practice. What happened with the NA discussion is this: - There was substantial disagreement about whether NEP-style masks, or indeed, focusing on a mask-based implementation *at all*, was the best way forward. - There was also a perceived time constraint, that we had to either implement something immediately while Mark was there, or have nothing. So in the end, the latter concern outweighed the former, the discussion was cut off, and Mark's best guess at an API was merged into master. I totally understand how this decision made sense at the time, but the result is what we see now: it's left numpy stalled, rifts on the mailing list, boring discussions about process, and still no agreement about whether NEP-style masks will actually solve our users' problems. Getting past this
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 6:12 PM, Charles R Harris charlesr.har...@gmail.com wrote: I admit to a certain curiosity about your own involvement in FOSS projects, and I know I'm not alone in this. Google shows several years of discussion on Monotone, but I have no idea what your contributions were Seriously??? Please, let's rise above this. We discuss people's opinions *on their technical merit alone*, regardless of the background of the person presenting them. I don't care if Linus himself shows up on the list with a bad idea, it should be shot down; and if someone we'd never heard of brings up a valid point, we should respect it. The day we start checking credentials at the door is the day this project will die as an open source effort. Or at least I think so, but perhaps I don't have enough 'commit credits' in my account for my opinion to matter... Cheers, f ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 8:56 PM, Fernando Perez fperez@gmail.com wrote: On Tue, Apr 24, 2012 at 6:12 PM, Charles R Harris charlesr.har...@gmail.com wrote: I admit to a certain curiosity about your own involvement in FOSS projects, and I know I'm not alone in this. Google shows several years of discussion on Monotone, but I have no idea what your contributions were Seriously??? Please, let's rise above this. We discuss people's opinions *on their technical merit alone*, regardless of the background of the person presenting them. I don't care if Linus himself shows up on the list with a bad idea, it should be shot down; and if someone we'd never heard of brings up a valid point, we should respect it. The day we start checking credentials at the door is the day this project will die as an open source effort. Or at least I think so, but perhaps I don't have enough 'commit credits' in my account for my opinion to matter... Fernando, I'm not checking credentials, I'm curious. Nathaniel has experience with FOSS projects, unlike us first timers, and I'd like to know what that experience was and what he learned from it. He has also mentioned Graydon Hoare in connection with Rust, and since Graydon was the prime mover in Monotone I'd like to know the story of the project. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 8:02 PM, Charles R Harris charlesr.har...@gmail.com wrote: Fernando, I'm not checking credentials, I'm curious. Well, at least I think that an inquisitive query about someone's background, phrased like that, can be very easily misread. I can only speak for myself, but I immediately had the impression that you were indeed trying to validate his background as a proxy for the discussion, and suggesting that others had the same curiosity... Had the question been something more like "Hey Nathaniel, what other projects do you think could inform our current view, maybe from stuff you've done in the past or lists you've lurked on?", I would have a very different reaction. But this sentence: "I admit to a certain curiosity about your own involvement in FOSS projects, and I know I'm not alone in this." definitely reads to me with a rather dark and unpleasant angle. Upon rereading it again now, I still don't like the tone. I trust you when you indicate that your intent was different; perhaps it's a matter of phrasing, or the fact that English is not my native language and I may miss subtleties of native speakers. Cheers, f ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 11:28 PM, Fernando Perez fperez@gmail.com wrote: On Tue, Apr 24, 2012 at 8:02 PM, Charles R Harris charlesr.har...@gmail.com wrote: Fernando, I'm not checking credentials, I'm curious. Well, at least I think that an inquisitive query about someone's background, phrased like that, can be very easily misread. I can only speak for myself, but I immediately had the impression that you were indeed trying to validate his background as a proxy for the discussion, and suggesting that others had the same curiosity... Had the question been something more like "Hey Nathaniel, what other projects do you think could inform our current view, maybe from stuff you've done in the past or lists you've lurked on?", I would have a very different reaction. But this sentence: "I admit to a certain curiosity about your own involvement in FOSS projects, and I know I'm not alone in this." definitely reads to me with a rather dark and unpleasant angle. Upon rereading it again now, I still don't like the tone. I trust you when you indicate that your intent was different; perhaps it's a matter of phrasing, or the fact that English is not my native language and I may miss subtleties of native speakers. I agree with the interpretation; however, whenever I look at this thread in Google Gmail, I see the first line "If you hang around big FOSS projects, you'll see the word consensus". I'm only hanging around in this neighborhood (9 mailing lists), so I have no idea about big FOSS projects. Josef Cheers, f ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 9:28 PM, Fernando Perez fperez@gmail.com wrote: On Tue, Apr 24, 2012 at 8:02 PM, Charles R Harris charlesr.har...@gmail.com wrote: Fernando, I'm not checking credentials, I'm curious. Well, at least I think that an inquisitive query about someone's background, phrased like that, can be very easily misread. I can only speak for myself, but I immediately had the impression that you were indeed trying to validate his background as a proxy for the discussion, and suggesting that others had the same curiosity... Had the question been something more like Hey Nathaniel, what other projects do you think could inform our current view, maybe from stuff you've done in the past or lists you've lurked on?, I would have a very different reaction. But this sentence: I admit to a certain curiosity about your own involvement in FOSS projects, and I know I'm not alone in this. definitely reads to me with a rather dark and unpleasant angle. Upon rereading it again now, I still don't like the tone. I trust you when you indicate that your intent was different; perhaps it's a matter of phrasing, or the fact that English is not my native language and I may miss subtleties of native speakers. Perhaps it was a bit colored, but even so, I'd like to know some specifics of his experience. Monotone was one of the projects that sprang up after Linus started using Bitkeeper as an open alternative, but that is actually fairly recent (2003 or so) and much of the discussion seems to have been carried on over IRC, rather than a mailing list. I'm guessing that some other projects could have taken place in the 90's, but things have changed so much since then that it is hard to know what was going on in that decade. There was certainly work on the C++ Template library, Linux, Python, and various utilities. But it is hard to know. In any case, I'd guess that Monotone was a fairly tight knit community, and about 2007 most of the developers left. 
I'd guess it was mostly a case of git and mercurial becoming dominant, and possibly they also lost interest in DVCS and moved on to other things. Numpy itself has gone through several of those transitions, and looking back, I think one of the problems was that when Travis left for Enthought he didn't officially hand off maintenance. The whole transition was a bit lucky, with David, Pauli, and myself unofficially continuing the work for the 1.3 and 1.4 releases. At that point I was hoping David could more or less take over, but he graduated, and Pauli would have been an excellent choice, but he took up his graduate studies. Turnover is a problem with open source, and no matter how much discussion there is, if people aren't doing the work the whole thing sort of peters out. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Apr 24, 2012, at 9:41 PM, Matthew Brett wrote: Hi, On Tue, Apr 24, 2012 at 6:12 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Apr 24, 2012 at 6:56 PM, Nathaniel Smith n...@pobox.com wrote: On Tue, Apr 24, 2012 at 2:14 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Apr 23, 2012 at 11:35 PM, Fernando Perez fperez@gmail.com wrote: On Mon, Apr 23, 2012 at 8:49 PM, Stéfan van der Walt ste...@sun.ac.za wrote: If you are referring to the traditional concept of a fork, and not to the type we frequently make on GitHub, then I'm surprised that no one has objected already. What would a fork solve? To paraphrase the regexp saying: after forking, we'll simply have two problems. I concur with you here: github 'forks', yes, as many as possible! Hopefully every one of those will produce one or more PRs :) But a fork in the sense of a divergent parallel project? I think that would only be indicative of a complete failure to find a way to make progress here, and I doubt we're anywhere near that state. That forks are *possible* is indeed a valuable and important option in open source software, because it means that a truly dysfunctional original project team/direction can't hold a community hostage forever. But that doesn't mean that full-blown forks should be considered lightly, as they also carry enormous costs. I see absolutely nothing in the current scenario to even remotely consider that a full-blown fork would be a good idea, and I hope I'm right. It seems to me we're making progress on problems that led to real difficulties last year, but from multiple parties I see signs that give me reason to be optimistic that the project is getting better, not worse. We certainly aren't there at the moment, but I can see us heading that way. But let's back up a bit. Numpy 1.6.0 came out just about 1 year ago. Since then datetime, NA, polynomial work, and various other enhancements have gone in along with some 280 bug fixes. 
The major technical problem blocking a 1.7 release is getting datetime working reliably on windows. So I think that is where the short term effort needs to be. Meanwhile, we are spending effort to get out a 1.6.2 just so people can work with a stable version with some of the bug fixes, and potentially we will spend more time and effort to pull out the NA code. In the future there may be a transition to C++ and eventually a break with the current ABI. Or not. There are at least two motivations that get folks to write code for open source projects, scratching an itch and money. Money hasn't been a big part of the Numpy picture so far, so that leaves scratching an itch. One of the attractions of Numpy is that it is a small project, BSD licensed, and not overburdened with governance and process. This makes scratching an itch not as difficult as it would be in a large project. If Numpy remains a small project but acquires the encumbrances of a big project much of that attraction will be lost. Momentum and direction also attracts people, but numpy is stalled at the moment as the whole NA thing circles around once again. I don't think we need a fork, or to start maintaining separate stable and unstable trees, or any of the other complicated process changes that have been suggested. There are tons of projects that routinely make much bigger changes than we're talking about, and they do it without needing that kind of overhead. I know that these suggestions are all made in good faith, but they remind me of a line from that Apache page I linked earlier: People tend to avoid conflict and thrash around looking for something to substitute - somebody in charge, a rule, a process, stagnation. None of these tend to be very good substitutes for doing the hard work of resolving the conflict. I also think if you talk to potential contributors, you'll find that clear, simple processes and a history of respecting everyone's input are much more attractive than a no-rules free-for-all. 
Good engineering practices are not an encumbrance. Resolving conflicts before merging is a good engineering practice. What happened with the NA discussion is this: - There was substantial disagreement about whether NEP-style masks, or indeed, focusing on a mask-based implementation *at all*, was the best way forward. - There was also a perceived time constraint, that we had to either implement something immediately while Mark was there, or have nothing. So in the end, the latter concern outweighed the former, the discussion was cut off, and Mark's best guess at an API was merged into master. I totally understand how this decision made sense at the time, but the result is what we see now: it's left numpy stalled, rifts on the mailing list, boring discussions about process, and still no agreement about whether NEP-style masks will actually solve our users' problems. Getting past
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 8:50 PM, Charles R Harris charlesr.har...@gmail.com wrote: Turnover is a problem with open source, and no matter how much discussion there is, if people aren't doing the work the whole thing sort of peters out. That's very true, and I hope that by building a friendly and welcoming environment, we'll raise the chances of getting sufficient new contributors to help with this issue. For my talk at Euroscipy last year [1] I made some plots collecting git statistics that show how heavily most scientific python projects rest on the shoulders of very, very few people. I really hope we can find ways of spreading the load a bit wider, and everything we can do to make projects more appealing to new contributors is an effort worth making. Cheers, f [1] http://fperez.org/talks/1108_euroscipy_keynote.pdf ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Apr 24, 2012, at 10:50 PM, Charles R Harris wrote: On Tue, Apr 24, 2012 at 9:28 PM, Fernando Perez fperez@gmail.com wrote: On Tue, Apr 24, 2012 at 8:02 PM, Charles R Harris charlesr.har...@gmail.com wrote: Fernando, I'm not checking credentials, I'm curious. Well, at least I think that an inquisitive query about someone's background, phrased like that, can be very easily misread. I can only speak for myself, but I immediately had the impression that you were indeed trying to validate his background as a proxy for the discussion, and suggesting that others had the same curiosity... Had the question been something more like Hey Nathaniel, what other projects do you think could inform our current view, maybe from stuff you've done in the past or lists you've lurked on?, I would have a very different reaction. But this sentence: I admit to a certain curiosity about your own involvement in FOSS projects, and I know I'm not alone in this. definitely reads to me with a rather dark and unpleasant angle. Upon rereading it again now, I still don't like the tone. I trust you when you indicate that your intent was different; perhaps it's a matter of phrasing, or the fact that English is not my native language and I may miss subtleties of native speakers. Perhaps it was a bit colored, but even so, I'd like to know some specifics of his experience. Monotone was one of the projects that sprang up after Linus started using Bitkeeper as an open alternative, but that is actually fairly recent (2003 or so) and much of the discussion seems to have been carried on over IRC, rather than a mailing list. I'm guessing that some other projects could have taken place in the 90's, but things have changed so much since then that it is hard to know what was going on in that decade. There was certainly work on the C++ Template library, Linux, Python, and various utilities. But it is hard to know. 
In any case, I'd guess that Monotone was a fairly tight knit community, and about 2007 most of the developers left. I'd guess it was mostly a case of git and mercurial becoming dominant, and possibly they also lost interest in DVCS and moved on to other things. Numpy itself has gone through several of those transitions, and looking back, I think one of the problems was that when Travis left for Enthought he didn't officially hand off maintenance. The whole transition was a bit lucky, with David, Pauli, and myself unofficially continuing the work for the 1.3 and 1.4 releases. At that point I was hoping David could more or less take over, but he graduated, and Pauli would have been an excellent choice, but he took up his graduate studies. Turnover is a problem with open source, and no matter how much discussion there is, if people aren't doing the work the whole thing sort of peters out. Thanks for explaining yourself. The tone you used could earlier have been mis-interpreted (though I would hope that people would look at your record of contribution and give you the benefit of the doubt). Your last sentence is very true. In this particular case, however, there is enough interest that the whole thing will not peter out, but there is a strong chance that there will be competing groups with divergent needs and interests vying for how the project should develop. There are many people who rely on NumPy and are concerned about its progress. NumFocus was created to fight for resources to further the whole ecosystem and not just rely on volunteers that are available. I fundamentally do not believe that model can scale. There are, however, ways to keep things open source and allow people to work on NumPy as their day-job. Several companies now exist that benefit from the NumPy code base and will be interested in seeing it grow. It is a mis-characterization to imply that I left the project without a hand-off. I never handed off the project because I never left it. 
I was very busy at Enthought. I will still be busy now. But, NumPy is very important to me and has remained so. I have spent a great deal of mental effort trying to figure out how to contribute to its growth. Yes, I allowed other people to contribute significantly to the project and was very receptive to their pull requests (even when I didn't think it was the most urgent thing or something I actually disagreed with). That should not be interpreted as having left. NumPy grew because it solved a useful problem and people were willing to tolerate its problems to make a difference by contributing. None of us matter as much to NumPy as the problems it helps people solve. To the degree it does that we are lucky to be able to contribute to the project. I hope all NumPy developers continue to be lucky enough to have people actually care about the problems NumPy solves now and can solve in the future. -Travis Chuck
Re: [Numpy-discussion] What is consensus anyway
On Wed, Apr 25, 2012 at 12:25 AM, Travis Oliphant tra...@continuum.io wrote: On Apr 24, 2012, at 10:50 PM, Charles R Harris wrote: On Tue, Apr 24, 2012 at 9:28 PM, Fernando Perez fperez@gmail.com wrote: On Tue, Apr 24, 2012 at 8:02 PM, Charles R Harris charlesr.har...@gmail.com wrote: Fernando, I'm not checking credentials, I'm curious. Well, at least I think that an inquisitive query about someone's background, phrased like that, can be very easily misread. I can only speak for myself, but I immediately had the impression that you were indeed trying to validate his background as a proxy for the discussion, and suggesting that others had the same curiosity... Had the question been something more like Hey Nathaniel, what other projects do you think could inform our current view, maybe from stuff you've done in the past or lists you've lurked on?, I would have a very different reaction. But this sentence: I admit to a certain curiosity about your own involvement in FOSS projects, and I know I'm not alone in this. definitely reads to me with a rather dark and unpleasant angle. Upon rereading it again now, I still don't like the tone. I trust you when you indicate that your intent was different; perhaps it's a matter of phrasing, or the fact that English is not my native language and I may miss subtleties of native speakers. Perhaps it was a bit colored, but even so, I'd like to know some specifics of his experience. Monotone was one of the projects that sprang up after Linus started using Bitkeeper as an open alternative, but that is actually fairly recent (2003 or so) and much of the discussion seems to have been carried on over IRC, rather than a mailing list. I'm guessing that some other projects could have taken place in the 90's, but things have changed so much since then that it is hard to know what was going on in that decade. There was certainly work on the C++ Template library, Linux, Python, and various utilities. But it is hard to know. 
In any case, I'd guess that Monotone was a fairly tight knit community, and about 2007 most of the developers left. I'd guess it was mostly a case of git and mercurial becoming dominant, and possibly they also lost interest in DVCS and moved on to other things. Numpy itself has gone through several of those transitions, and looking back, I think one of the problems was that when Travis left for Enthought he didn't officially hand off maintenance. The whole transition was a bit lucky, with David, Pauli, and myself unofficially continuing the work for the 1.3 and 1.4 releases. At that point I was hoping David could more or less take over, but he graduated, and Pauli would have been an excellent choice, but he took up his graduate studies. Turnover is a problem with open source, and no matter how much discussion there is, if people aren't doing the work the whole thing sort of peters out. Thanks for explaining yourself. The tone you used could earlier have been mis-interpreted (though I would hope that people would look at your record of contribution and give you the benefit of the doubt). Your last sentence is very true. In this particular case, however, there is enough interest that the whole thing will not peter out, but there is a strong chance that there will be competing groups with divergent needs and interests vying for how the project should develop. There are many people who rely on NumPy and are concerned about its progress. NumFocus was created to fight for resources to further the whole ecosystem and not just rely on volunteers that are available. I fundamentally do not believe that model can scale. There are, however, ways to keep things open source and allow people to work on NumPy as their day-job. Several companies now exist that benefit from the NumPy code base and will be interested in seeing it grow. It is a mis-characterization to imply that I left the project without a hand-off. I never handed off the project because I never left it. 
I was very busy at Enthought. I will still be busy now. But, NumPy is very important to me and has remained so. I have spent a great deal of mental effort trying to figure out how to contribute to its growth. Yes, I allowed other people to contribute significantly to the project and was very receptive to their pull requests (even when I didn't think it was the most urgent thing or something I actually disagreed with). Sorry that I missed this part of numpy history, I always had the impression that numpy is run by a community led by Chuck and the young guys, David, Pauli, Stefan, Pierre; and Robert on the mailing list . (But I came late, and am just a balcony muppet.) Josef That should not be interpreted as having left. NumPy grew because it solved a useful problem and people were willing to tolerate its problems to make a difference by contributing. None of us matter as much to NumPy as
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 10:25 PM, Travis Oliphant tra...@continuum.io wrote: On Apr 24, 2012, at 10:50 PM, Charles R Harris wrote: On Tue, Apr 24, 2012 at 9:28 PM, Fernando Perez fperez@gmail.com wrote: On Tue, Apr 24, 2012 at 8:02 PM, Charles R Harris charlesr.har...@gmail.com wrote: Fernando, I'm not checking credentials, I'm curious. Well, at least I think that an inquisitive query about someone's background, phrased like that, can be very easily misread. I can only speak for myself, but I immediately had the impression that you were indeed trying to validate his background as a proxy for the discussion, and suggesting that others had the same curiosity... Had the question been something more like Hey Nathaniel, what other projects do you think could inform our current view, maybe from stuff you've done in the past or lists you've lurked on?, I would have a very different reaction. But this sentence: I admit to a certain curiosity about your own involvement in FOSS projects, and I know I'm not alone in this. definitely reads to me with a rather dark and unpleasant angle. Upon rereading it again now, I still don't like the tone. I trust you when you indicate that your intent was different; perhaps it's a matter of phrasing, or the fact that English is not my native language and I may miss subtleties of native speakers. Perhaps it was a bit colored, but even so, I'd like to know some specifics of his experience. Monotone was one of the projects that sprang up after Linus started using Bitkeeper as an open alternative, but that is actually fairly recent (2003 or so) and much of the discussion seems to have been carried on over IRC, rather than a mailing list. I'm guessing that some other projects could have taken place in the 90's, but things have changed so much since then that it is hard to know what was going on in that decade. There was certainly work on the C++ Template library, Linux, Python, and various utilities. But it is hard to know. 
In any case, I'd guess that Monotone was a fairly tight knit community, and about 2007 most of the developers left. I'd guess it was mostly a case of git and mercurial becoming dominant, and possibly they also lost interest in DVCS and moved on to other things. Numpy itself has gone through several of those transitions, and looking back, I think one of the problems was that when Travis left for Enthought he didn't officially hand off maintenance. The whole transition was a bit lucky, with David, Pauli, and myself unofficially continuing the work for the 1.3 and 1.4 releases. At that point I was hoping David could more or less take over, but he graduated, and Pauli would have been an excellent choice, but he took up his graduate studies. Turnover is a problem with open source, and no matter how much discussion there is, if people aren't doing the work the whole thing sort of peters out. Thanks for explaining yourself. The tone you used could earlier have been mis-interpreted (though I would hope that people would look at your record of contribution and give you the benefit of the doubt). Your last sentence is very true. In this particular case, however, there is enough interest that the whole thing will not peter out, but there is a strong chance that there will be competing groups with divergent needs and interests vying for how the project should develop. There are many people who rely on NumPy and are concerned about its progress. NumFocus was created to fight for resources to further the whole ecosystem and not just rely on volunteers that are available. I fundamentally do not believe that model can scale. There are, however, ways to keep things open source and allow people to work on NumPy as their day-job. Several companies now exist that benefit from the NumPy code base and will be interested in seeing it grow. It is a mis-characterization to imply that I left the project without a hand-off. I never handed off the project because I never left it. 
I was very busy at Enthought. I will still be busy now. But, NumPy is very important to me and has remained so. I have spent a great deal of mental effort trying to figure out how to contribute to its growth. Yes, I allowed other people to contribute significantly to the project and was very receptive to their pull requests (even when I didn't think it was the most urgent thing or something I actually disagreed with). Well then, let's say you should have handed off, because you no longer had the time to devote to it. You made the 1.2.1 release, and after that you weren't really involved until recently. Now I'm sure that you didn't lose interest, but you did lose the time, and I think it would have been better if you had realized that fact up front. As it was, I suggested to David that it was time for a 1.3 release, and we proceeded without permission from the usual suspects, yourself and Jarrod. I
Re: [Numpy-discussion] What is consensus anyway
On Apr 25, 2012, at 12:02 AM, Charles R Harris wrote: On Tue, Apr 24, 2012 at 10:25 PM, Travis Oliphant tra...@continuum.io wrote: On Apr 24, 2012, at 10:50 PM, Charles R Harris wrote: On Tue, Apr 24, 2012 at 9:28 PM, Fernando Perez fperez@gmail.com wrote: On Tue, Apr 24, 2012 at 8:02 PM, Charles R Harris charlesr.har...@gmail.com wrote: Fernando, I'm not checking credentials, I'm curious. Well, at least I think that an inquisitive query about someone's background, phrased like that, can be very easily misread. I can only speak for myself, but I immediately had the impression that you were indeed trying to validate his background as a proxy for the discussion, and suggesting that others had the same curiosity... Had the question been something more like Hey Nathaniel, what other projects do you think could inform our current view, maybe from stuff you've done in the past or lists you've lurked on?, I would have a very different reaction. But this sentence: I admit to a certain curiosity about your own involvement in FOSS projects, and I know I'm not alone in this. definitely reads to me with a rather dark and unpleasant angle. Upon rereading it again now, I still don't like the tone. I trust you when you indicate that your intent was different; perhaps it's a matter of phrasing, or the fact that English is not my native language and I may miss subtleties of native speakers. Perhaps it was a bit colored, but even so, I'd like to know some specifics of his experience. Monotone was one of the projects that sprang up after Linus started using Bitkeeper as an open alternative, but that is actually fairly recent (2003 or so) and much of the discussion seems to have been carried on over IRC, rather than a mailing list. I'm guessing that some other projects could have taken place in the 90's, but things have changed so much since then that it is hard to know what was going on in that decade. 
There was certainly work on the C++ Template library, Linux, Python, and various utilities. But it is hard to know. In any case, I'd guess that Monotone was a fairly tight knit community, and about 2007 most of the developers left. I'd guess it was mostly a case of git and mercurial becoming dominant, and possibly they also lost interest in DVCS and moved on to other things. Numpy itself has gone through several of those transitions, and looking back, I think one of the problems was that when Travis left for Enthought he didn't officially hand off maintenance. The whole transition was a bit lucky, with David, Pauli, and myself unofficially continuing the work for the 1.3 and 1.4 releases. At that point I was hoping David could more or less take over, but he graduated, and Pauli would have been an excellent choice, but he took up his graduate studies. Turnover is a problem with open source, and no matter how much discussion there is, if people aren't doing the work the whole thing sort of peters out. Thanks for explaining yourself. The tone you used could earlier have been mis-interpreted (though I would hope that people would look at your record of contribution and give you the benefit of the doubt). Your last sentence is very true. In this particular case, however, there is enough interest that the whole thing will not peter out, but there is a strong chance that there will be competing groups with divergent needs and interests vying for how the project should develop. There are many people who rely on NumPy and are concerned about its progress. NumFocus was created to fight for resources to further the whole ecosystem and not just rely on volunteers that are available. I fundamentally do not believe that model can scale. There are, however, ways to keep things open source and allow people to work on NumPy as their day-job. Several companies now exist that benefit from the NumPy code base and will be interested in seeing it grow. 
It is a mis-characterization to imply that I left the project without a hand-off. I never handed off the project because I never left it. I was very busy at Enthought. I will still be busy now. But, NumPy is very important to me and has remained so. I have spent a great deal of mental effort trying to figure out how to contribute to its growth. Yes, I allowed other people to contribute significantly to the project and was very receptive to their pull requests (even when I didn't think it was the most urgent thing or something I actually disagreed with). Well then, let's say you should have handed off, because you no longer had the time to devote to it. You made the 1.2.1 release, and after that you weren't really involved until recently. Now I'm sure that you didn't lose interest, but you did lose the time, and I think it would have been better if you had realized that fact up front. I will grant you that.
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 10:02 PM, josef.p...@gmail.com wrote: Sorry that I missed this part of numpy history, I always had the impression that numpy is run by a community led by Chuck and the young guys, David, Pauli, Stefan, Pierre; and Robert on the mailing list . (But I came late, and am just a balcony muppet.) Travis, when you have a free minute (ha :) it would be very nice if you wrote up a blog post with some of the history from say the 2000s with Numeric, through Numarray and into Numpy. Some of us saw all that happen first hand and know it well, but since most of it simply happened on mailing lists, conferences and assorted meetings, it's actually quite hard to understand that history if you arrive now. It's not really written up anywhere, and nobody is going to read 10 years' worth of email archives :) Guido a while back wrote a fantastic set of posts on the history of python itself that I've greatly enjoyed: http://python-history.blogspot.com/ something similar for numpy would be nice to have... Though thinking more about it, perhaps a better alternative could be a 'history of the scipy world' where multiple people could write guest posts about each project they've had a part of. I think something like that could be a lot of fun, and also useful :) Cheers, f ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Tue, Apr 24, 2012 at 05:59:09PM -0600, Charles R Harris wrote: Travis, if you are playing the BDFL role, then just make the darn decision and remove the code so we can get on with life. As it is you go back and forth and that does none of us any good, you're a big guy and you're rocking the boat. I don't agree with that decision, I'd rather evolve the code we have, but I'm willing to compromise with your decision in this matter. I think that Chuck's point here, in a thread on consensus, is very important: sometimes design discussions stall. If, in such a situation, a BDFL makes a decision, acknowledging that he has no divine power to see the best of all options but needs to move on, it can help the project go forward. As long as nobody's feelings are hurt, a bit of dictatorship well used moves a project forward. Of course, as with any leadership, it only works because we as a community trust the leader. My 2 cents, Gael ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
I've given several talks on the subject, but I don't think I've ever written a blog-post about it. A reasonable history does exist in the beginning of the Guide to NumPy which is still available for free at http://www.tramy.us/numpybook.pdf -Travis On Apr 25, 2012, at 12:18 AM, Fernando Perez wrote: On Tue, Apr 24, 2012 at 10:02 PM, josef.p...@gmail.com wrote: Sorry that I missed this part of numpy history, I always had the impression that numpy is run by a community led by Chuck and the young guys, David, Pauli, Stefan, Pierre; and Robert on the mailing list . (But I came late, and am just a balcony muppet.) Travis, when you have a free minute (ha :) it would be very nice if you wrote up a blog post with some of the history from say the 2000s with Numeric, through Numarray and into Numpy. Some of us saw all that happen first hand and know it well, but since most of it simply happened on mailing lists, conferences and assorted meetings, it's actually quite hard to understand that history if you arrive now. It's not really written up anywhere, and nobody is going to read 10 years' worth of email archives :) Guido a while back wrote a fantastic set of posts on the history of python itself that I've greatly enjoyed: http://python-history.blogspot.com/ something similar for numpy would be nice to have... Though thinking more about it, perhaps a better alternative could be a 'history of the scipy world' where multiple people could write guest posts about each project they've had a part of. I think something like that could be a lot of fun, and also useful :) Cheers, f ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
Hi, On Sun, Apr 22, 2012 at 3:15 PM, Nathaniel Smith n...@pobox.com wrote: If you hang around big FOSS projects, you'll see the word consensus come up a lot. For example, the glibc steering committee recently dissolved itself in favor of governance directly by the consensus of the people active in glibc development[1]. It's the governing rule of the IETF, which defines many of the most important internet standards[2]. It is the primary way decisions are made on Wikipedia[3]. It's one of the fundamental aspects of accomplishing things within the Apache framework[4]. [1] https://lwn.net/Articles/488778/ [2] https://www.ietf.org/tao.html#getting.things.done [3] https://en.wikipedia.org/wiki/Wikipedia:Consensus [4] https://www.apache.org/foundation/voting.html I think the big problem here is that Chuck (I hope I'm not misrepresenting him) is not interested in discussion of process, and the last time we had a specific thread on governance, Travis strongly implied he was not very interested either, at least at the time. In that situation, there's rather a high threshold to pass before getting involved in the discussion, and I think you're seeing some evidence for that. So, as before, and as we discussed on gchat :) - whether this discussion can go anywhere depends on Travis. Travis - what do you think? See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Mon, Apr 23, 2012 at 1:04 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Sun, Apr 22, 2012 at 4:15 PM, Nathaniel Smith n...@pobox.com wrote: If you hang around big FOSS projects, you'll see the word consensus come up a lot. For example, the glibc steering committee recently dissolved itself in favor of governance directly by the consensus of the people active in glibc development[1]. It's the governing rule of the IETF, which defines many of the most important internet standards[2]. It is the primary way decisions are made on Wikipedia[3]. It's one of the fundamental aspects of accomplishing things within the Apache framework[4]. [1] https://lwn.net/Articles/488778/ [2] https://www.ietf.org/tao.html#getting.things.done [3] https://en.wikipedia.org/wiki/Wikipedia:Consensus [4] https://www.apache.org/foundation/voting.html But it turns out that this consensus thing is actually somewhat mysterious, and one that most programmers immersed in this culture pick up by osmosis. And numpy in particular has a lot of developers who are not coming from a classic FOSS programmer background! So this is my personal attempt to articulate what it is, and why requiring consensus is probably the best possible approach to project decision making. So what is consensus? Like, voting or something? This is surprisingly subtle and specific. Consensus means something like, everyone who cares is satisfied with the result. It does *not* mean * Every opinion counts equally * We vote on anything * Every solution must be perfect and flawless * Every solution must leave everyone overjoyed * Everyone must sign off on every solution. It *does* mean * We invite people to speak up * We generally trust individuals to decide how important their opinion is * We generally trust individuals to decide whether or not they can live with some outcome * If they can't, then we take the time to find something better. One simple way of stating this is, everyone has a veto.
In practice, such vetoes are almost never used, so this rule is not particularly illuminating on its own. Hence, the rest of this document. What a waste of time! That all sounds very pretty on paper, but we have stuff to get done. --- First, I'll note that this seemingly utopian scheme has a track record of producing such impractical systems as TCP/IP, SMTP, DNS, Apache, GCC, Linux, Samba, Python, ... Linux is Linus' private tree. Everything that goes in is his decision, everything that stays out is his decision. Of course, he delegates much of the work to people he trusts, but it doesn't even reach the level of a BDFL, it's DFL. As for consensus, it basically comes down to convincing the gatekeepers one level below Linus that your code might be useful. So bad example. Same with TCP/IP, which was basically Kahn and Cerf consulting with a few others and working by request of DARPA. GCC was Richard Stallman (I got one of the first tapes for a $30 donation), Python was Guido. Some of the projects later developed some form of governance but Guido, for instance, can veto anything he dislikes even if he is disinclined to do so. I'm not saying you're wrong about open source, I'm just saying that that each project differs and it is wrong to imply that they follow some common form of governance under the rubric FOSS and that they all seek consensus. And they certainly don't *start* that way. And there are also plenty of projects that fail when the prime mover loses interest or folks get tired of the politics. So a few points here: Consensus-based decision-making is an ideal and a guide, not an algorithm. There's nothing at all inconsistent between having a BDFL and using consensus as the primary guide for decision making -- it just means that the BDFL chooses to exercise their power in that way, and is generally trusted to make judgement calls about specific cases. See Fernando's reply down-thread for an example of this. 
And I'm not saying that all FOSS projects follow some common form of governance. But I am saying that there's a substantial amount of shared development culture across most successful FOSS projects, and a ton of experience on how to run a project successfully. Project management is a difficult and arcane skill set, and one that's hard to learn except through apprenticeship and osmosis. And it's definitely not included in most courses on programming for scientists! So it'd be nice if numpy could avoid having to re-make some of these mistakes... But the other effect of this being cultural values rather than something explicit and articulated is that sometimes you can't see it from the outside. For example: Linux: Technically, everything you say is true. In practice, good luck convincing Linus or a subsystem maintainer to accept your patch when other
Re: [Numpy-discussion] What is consensus anyway
Hi, On Mon, Apr 23, 2012 at 12:33 PM, Nathaniel Smith n...@pobox.com wrote: On Mon, Apr 23, 2012 at 1:04 AM, Charles R Harris charlesr.har...@gmail.com wrote: Linux is Linus' private tree. Everything that goes in is his decision, everything that stays out is his decision. Of course, he delegates much of the work to people he trusts, but it doesn't even reach the level of a BDFL, it's DFL. As for consensus, it basically comes down to convincing the gatekeepers one level below Linus that your code might be useful. So bad example. Same with TCP/IP, which was basically Kahn and Cerf consulting with a few others and working by request of DARPA. GCC was Richard Stallman (I got one of the first tapes for a $30 donation), Python was Guido. Some of the projects later developed some form of governance but Guido, for instance, can veto anything he dislikes even if he is disinclined to do so. I'm not saying you're wrong about open source, I'm just saying that that each project differs and it is wrong to imply that they follow some common form of governance under the rubric FOSS and that they all seek consensus. And they certainly don't *start* that way. And there are also plenty of projects that fail when the prime mover loses interest or folks get tired of the politics. [snip] Linux: Technically, everything you say is true. In practice, good luck convincing Linus or a subsystem maintainer to accept your patch when other people are raising substantive complaints. Here's an email I googled up in a few moments, in which Linus yells at people for trying to submit a patch to him without making sure that all interested parties have agreed: https://lkml.org/lkml/2009/9/14/481 Stuff regularly sits outside the kernel tree in limbo for *years* while people debate different approaches back and forth. 
To which I'd add: In fact, for [Linus'] decisions to be received as legitimate, they have to be consistent with the consensus of the opinions of participating developers as manifest on Linux mailing lists. It is not unusual for him to back down from a decision under the pressure of criticism from other developers. His position is based on the recognition of his fitness by the community of Linux developers and this type of authority is, therefore, constantly subject to withdrawal. His role is not that of a boss or a manager in the usual sense. In the final analysis, the direction of the project springs from the cumulative synthesis of modifications contributed by individual developers. http://shareable.net/blog/governance-of-open-source-george-dafermos-interview See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
Linux: Technically, everything you say is true. In practice, good luck convincing Linus or a subsystem maintainer to accept your patch when other people are raising substantive complaints. Here's an email I googled up in a few moments, in which Linus yells at people for trying to submit a patch to him without making sure that all interested parties have agreed: https://lkml.org/lkml/2009/9/14/481 Stuff regularly sits outside the kernel tree in limbo for *years* while people debate different approaches back and forth. To which I'd add: In fact, for [Linus'] decisions to be received as legitimate, they have to be consistent with the consensus of the opinions of participating developers as manifest on Linux mailing lists. It is not unusual for him to back down from a decision under the pressure of criticism from other developers. His position is based on the recognition of his fitness by the community of Linux developers and this type of authority is, therefore, constantly subject to withdrawal. His role is not that of a boss or a manager in the usual sense. In the final analysis, the direction of the project springs from the cumulative synthesis of modifications contributed by individual developers. http://shareable.net/blog/governance-of-open-source-george-dafermos-interview This is the model that I have for NumPy development. It is my view of how NumPy has evolved already and how Numarray and Numeric evolved before it as well. I also feel like these things are fundamentally determined by the people involved and by the personalities and styles of those who participate. There certainly are globally applicable principles (like code review, building consensus, and mutual respect) that are worth emphasizing over and over again. If it helps, let's write those down and say these are the principles we live by. I am suspicious that you can go beyond this in formalizing the process as you ultimately are at the mercy of the people involved and their judgment, anyway.
I can also see that for the benefit of newcomers and occasional contributors it can be beneficial to have some documentation of the natural, emergent methods and interactions that apply to cooperative software development. But I would hesitate to put some kind of aura of authority around such a document that implies the processes cannot be violated if good judgment demands that they should be. That is the basis of my hesitation to spend much time on officially documenting our process. Right now we are trying to balance difficult things: stable releases with experimental development. The fact that we had such differences of opinion last year on masked arrays / missing values and how to incorporate them into a common object model means that we should not have committed the code to master until we figured out a way to reconcile Nathaniel's concerns. That is my current view. I was very enthused that we had someone contributing large scale changes that clearly showed an ability to understand the code and contribute to it --- that hadn't happened in a while. I wanted to encourage that. I still do. I think the process itself has shown that you can have an impact on NumPy just by voicing your opinion. Clearly, you have more of an effect on NumPy by submitting pull requests, but NumPy development does listen carefully to the voices of users. Best, -Travis See you, Matthew
Re: [Numpy-discussion] What is consensus anyway
Hi, On Mon, Apr 23, 2012 at 3:08 PM, Travis Oliphant tra...@continuum.io wrote: Linux: Technically, everything you say is true. In practice, good luck convincing Linus or a subsystem maintainer to accept your patch when other people are raising substantive complaints. Here's an email I googled up in a few moments, in which Linus yells at people for trying to submit a patch to him without making sure that all interested parties have agreed: https://lkml.org/lkml/2009/9/14/481 Stuff regularly sits outside the kernel tree in limbo for *years* while people debate different approaches back and forth. To which I'd add: In fact, for [Linus'] decisions to be received as legitimate, they have to be consistent with the consensus of the opinions of participating developers as manifest on Linux mailing lists. It is not unusual for him to back down from a decision under the pressure of criticism from other developers. His position is based on the recognition of his fitness by the community of Linux developers and this type of authority is, therefore, constantly subject to withdrawal. His role is not that of a boss or a manager in the usual sense. In the final analysis, the direction of the project springs from the cumulative synthesis of modifications contributed by individual developers. http://shareable.net/blog/governance-of-open-source-george-dafermos-interview This is the model that I have for NumPy development. It is my view of how NumPy has evolved already and how Numarray, and Numeric evolved before it as well. I also feel like these things are fundamentally determined by the people involved and by the personalities and styles of those who participate. There certainly are globally applicable principles (like code review, building consensus, and mutual respect) that are worth emphasizing over and over again. If it helps let's write those down and say these are the principles we live by. 
I am suspicious that you can go beyond this in formalizing the process as you ultimately are at the mercy of the people involved and their judgment, anyway. I think writing it down would help enormously. For example, if you do agree to Nathaniel's view of consensus - *in principle* - and we write that down and agree, we have a document to appeal to when we next run into trouble. Maybe the document could say something like: We strive for consensus [some refs here]. Any substantial new feature is subject to consensus. Only if all avenues for consensus have been documented, and exhausted, will we [vote, defer to Travis, or some other tie-breaking thing]. Best, Matthew
Re: [Numpy-discussion] What is consensus anyway
On Mon, Apr 23, 2012 at 3:08 PM, Travis Oliphant tra...@continuum.io wrote: Right now we are trying to balance difficult things: stable releases with experimental development. Perhaps a more formal development release system could help here. IIUC, numpy pretty much has two things: the latest release (and past ones) and master (and assorted experimental branches). If someone develops a new feature, we can either: (a) have them submit a pull request, and people with the wherewithal can pull it, compile it, and start testing it on their own -- history shows that this is a small group; or (b) merge it with master -- and hope it gets the testing it should before it becomes part of a release, but: we are rightly hesitant to put experimental stuff in master, and it really doesn't get that much testing -- again only folks that are building master will even see it. Some projects have a more formal development release system. wxPython, for instance, has had for years development releases with odd numbers -- right now, the official release is 2.8.*, but there is a 2.9.* out there that is getting some use and testing. A couple of things help make this work: 1) Robin makes the effort to put out binaries for development releases -- it's easy to go get and give it a try. 2) there is the wxversion system that makes it easy to install a new version of wx, and easily switch between them (it's actually broken on OS-X right now --- :-) ) -- this pre-dated virtualenv and friends, maybe virtualenv is enough for this now. Anyway, it's a thought -- I think some more real-world use of new features before a real commitment to adopting them would be great. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
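The odd/even numbering described above (2.8.* official, 2.9.* development) can be sketched in a few lines of Python. This is only an illustration of the convention, not something numpy or wxPython provides: the function name `release_kind` is hypothetical, and the rule that the parity of the second version component decides the kind of release is an assumption drawn from the wxPython example.

```python
def release_kind(version: str) -> str:
    """Classify a version string under a hypothetical odd/even scheme.

    Assumption: the second component (the "minor" number) decides;
    odd means a development release, even means a stable one.
    """
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least MAJOR.MINOR, got {version!r}")
    minor = int(parts[1])  # raises ValueError if the component is not numeric
    return "development" if minor % 2 else "stable"


if __name__ == "__main__":
    for v in ("2.8.12", "2.9.3"):
        print(v, "->", release_kind(v))
```

A check like this only labels releases; the cost raised elsewhere in the thread (building binaries and keeping two lines alive) is not something the numbering itself solves.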
Re: [Numpy-discussion] What is consensus anyway
That is an excellent thought. We could make the odd numbered releases experimental and the even-numbered as stable. That makes some sense. What do others think? -Travis On Apr 23, 2012, at 5:46 PM, Chris Barker wrote: On Mon, Apr 23, 2012 at 3:08 PM, Travis Oliphant tra...@continuum.io wrote: Right now we are trying to balance difficult things: stable releases with experimental development. Perhaps a more formal development release system could help here. IIUC, numpy pretty much has two things: the latest release (and past ones) and master (and assorted experimental branches). If someone develops a new feature, we can either: (a) have them submit a pull request, and people with the wherewithal can pull it, compile it, and start testing it on their own -- history shows that this is a small group; or (b) merge it with master -- and hope it gets the testing it should before it becomes part of a release, but: we are rightly hesitant to put experimental stuff in master, and it really doesn't get that much testing -- again only folks that are building master will even see it. Some projects have a more formal development release system. wxPython, for instance, has had for years development releases with odd numbers -- right now, the official release is 2.8.*, but there is a 2.9.* out there that is getting some use and testing. A couple of things help make this work: 1) Robin makes the effort to put out binaries for development releases -- it's easy to go get and give it a try. 2) there is the wxversion system that makes it easy to install a new version of wx, and easily switch between them (it's actually broken on OS-X right now --- :-) ) -- this pre-dated virtualenv and friends, maybe virtualenv is enough for this now. Anyway, it's a thought -- I think some more real-world use of new features before a real commitment to adopting them would be great. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
Re: [Numpy-discussion] What is consensus anyway
On Mon, Apr 23, 2012 at 5:02 PM, Travis Oliphant tra...@continuum.io wrote: That is an excellent thought. We could make the odd numbered releases experimental and the even-numbered as stable. That makes some sense. What do others think? I'm starting to think that a fork might be the best solution to the present problem. There is plenty of precedent for forks in FOSS, for example GCC, EGCS, Redhat 1.97, LLVM and emacs, xemacs. There are several semi-official forks of Linux (Android, the real time Kernel, etc.) Zeromq just forked, OpenOffice forked, there was XFree86 forked to Xorg, etc. Linus encourages forks, so there is even authority for that ;) Of course, the further the fork diverges from the original the harder reintegration becomes, witness Android and wake-locks. But a fork would cure a lot of contention. Chuck
Re: [Numpy-discussion] What is consensus anyway
On Mon, Apr 23, 2012 at 4:39 PM, Charles R Harris charlesr.har...@gmail.com wrote: I'm starting to think that a fork might be the best solution to the present problem. If you are referring to the traditional concept of a fork, and not to the type we frequently make on GitHub, then I'm surprised that no one has objected already. What would a fork solve? To paraphrase the regexp saying: after forking, we'll simply have two problems. It's really not that hard to focus our attention on technical issues and to reach consensus. Stéfan ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Mon, Apr 23, 2012 at 4:02 PM, Travis Oliphant tra...@continuum.io wrote: That is an excellent thought. We could make the odd numbered releases experimental and the even-numbered as stable. That makes some sense. What do others think? I think the concern with that is manpower: it effectively requires keeping two complete projects alive in parallel. As far as I know, a number of projects that used to have that model have backed off (the Linux kernel included) to better enable a limited team to focus on development. I'm skeptical that numpy has the manpower to sustain that approach. Cheers, f
Re: [Numpy-discussion] What is consensus anyway
On Mon, Apr 23, 2012 at 8:49 PM, Stéfan van der Walt ste...@sun.ac.za wrote: If you are referring to the traditional concept of a fork, and not to the type we frequently make on GitHub, then I'm surprised that no one has objected already. What would a fork solve? To paraphrase the regexp saying: after forking, we'll simply have two problems. I concur with you here: github 'forks', yes, as many as possible! Hopefully every one of those will produce one or more PRs :) But a fork in the sense of a divergent parallel project? I think that would only be indicative of a complete failure to find a way to make progress here, and I doubt we're anywhere near that state. That forks are *possible* is indeed a valuable and important option in open source software, because it means that a truly dysfunctional original project team/direction can't hold a community hostage forever. But that doesn't mean that full-blown forks should be considered lightly, as they also carry enormous costs. I see absolutely nothing in the current scenario to even remotely consider that a full-blown fork would be a good idea, and I hope I'm right. It seems to me we're making progress on problems that led to real difficulties last year, but from multiple parties I see signs that give me reason to be optimistic that the project is getting better, not worse. Cheers, f ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] What is consensus anyway
On Sun, Apr 22, 2012 at 4:15 PM, Nathaniel Smith n...@pobox.com wrote: If you hang around big FOSS projects, you'll see the word consensus come up a lot. For example, the glibc steering committee recently dissolved itself in favor of governance directly by the consensus of the people active in glibc development[1]. It's the governing rule of the IETF, which defines many of the most important internet standards[2]. It is the primary way decisions are made on Wikipedia[3]. It's one of the fundamental aspects of accomplishing things within the Apache framework[4]. [1] https://lwn.net/Articles/488778/ [2] https://www.ietf.org/tao.html#getting.things.done [3] https://en.wikipedia.org/wiki/Wikipedia:Consensus [4] https://www.apache.org/foundation/voting.html But it turns out that this consensus thing is actually somewhat mysterious, and one that most programmers immersed in this culture pick up by osmosis. And numpy in particular has a lot of developers who are not coming from a classic FOSS programmer background! So this is my personal attempt to articulate what it is, and why requiring consensus is probably the best possible approach to project decision making. So what is consensus? Like, voting or something? This is surprisingly subtle and specific. Consensus means something like, everyone who cares is satisfied with the result. It does *not* mean * Every opinion counts equally * We vote on anything * Every solution must be perfect and flawless * Every solution must leave everyone overjoyed * Everyone must sign off on every solution. It *does* mean * We invite people to speak up * We generally trust individuals to decide how important their opinion is * We generally trust individuals to decide whether or not they can live with some outcome * If they can't, then we take the time to find something better. One simple way of stating this is, everyone has a veto. In practice, such vetoes are almost never used, so this rule is not particularly illuminating on its own.
Hence, the rest of this document. What a waste of time! That all sounds very pretty on paper, but we have stuff to get done. First, I'll note that this seemingly utopian scheme has a track record of producing such impractical systems as TCP/IP, SMTP, DNS, Apache, GCC, Linux, Samba, Python, ... Linux is Linus' private tree. Everything that goes in is his decision, everything that stays out is his decision. Of course, he delegates much of the work to people he trusts, but it doesn't even reach the level of a BDFL, it's DFL. As for consensus, it basically comes down to convincing the gatekeepers one level below Linus that your code might be useful. So bad example. Same with TCP/IP, which was basically Kahn and Cerf consulting with a few others and working by request of DARPA. GCC was Richard Stallman (I got one of the first tapes for a $30 donation), Python was Guido. Some of the projects later developed some form of governance but Guido, for instance, can veto anything he dislikes even if he is disinclined to do so. I'm not saying you're wrong about open source, I'm just saying that each project differs and it is wrong to imply that they follow some common form of governance under the rubric FOSS and that they all seek consensus. And they certainly don't *start* that way. And there are also plenty of projects that fail when the prime mover loses interest or folks get tired of the politics. But mere empirical results are often less convincing than a good story, so I will give you two. Why does a requirement for consensus work? Reason 1 (for optimists): *All of us are smarter than any of us.* For a complex project with many users, it's extraordinarily difficult for any one person to understand the full ramifications of any decision, particularly the sort of far-reaching architectural decisions that are most important. It's even more difficult to understand all the possibilities of all the different possible solutions.
In fact, it's *extremely* common that the correct solution to a problem is the one that no-one thinks of until after a month of annoying debate. Spending a month to avoid an architectural problem that will haunt us for years is an *excellent* trade-off, even if it feels interminable at the time. Even two months. Usually disagreements are an indication that a better solution is possible, even when it's not clear what that would be. Reason 2 (for pessimists): *You **will** reach consensus sooner or later; it's less painful to do up front.* Example: NA handling. There are two schemes that people use for this right now -- numpy.ma and ugly NaN kluges (see e.g. nanmean). These are generally agreed to be suboptimal. Recently, two new contenders have shown up: the NEP masked-NA support currently in master, and the
Re: [Numpy-discussion] What is consensus anyway
Hi Nathaniel, thanks for a solid writeup of this topic. I just want to add a note from personal experience, regarding this specific point: On Sun, Apr 22, 2012 at 3:15 PM, Nathaniel Smith n...@pobox.com wrote: Usually disagreements are an indication that a better solution is possible, even when it's not clear what that would be. I think this is *extremely* important, so I want to highlight it from the rest of your post. Regarding how IPython operates, I think we have good evidence to illustrate the value of this... One of the members of the IPython team who joined earliest is Brian Granger: he started working on IPython around 2004 after a conversation we had in the context of a SciPy conference. Some of you may know that Brian and I went to graduate school together, which means we've known each other for much longer than IPython, and we've been good friends since. But that alone doesn't ensure a smooth collaboration; in fact Brian and I very often disagree *deeply* on design decisions about IPython. And yet, I think the project makes solid progress, not despite this but in an important way *thanks* to this divergence. Whenever we disagree, it typically means that each of us is seeing a partial solution to a problem, but not a really solid and complete one. I don't recall ever using my 'BDFL vote' in one of these discussions; instead we just keep going around the problem. Typically what happens is that after much discussion, we settle on a new solution that neither of us had quite seen at the start. I mention Brian specifically because he and I seem to be at opposite ends of some weird spectrum; disagreement between the other parties appears to fall somewhere in between. Here's an example that is currently in open discussion, and despite the fact that I'm completely convinced that something like this should go into IPython, I'm waiting.
We'll continue the discussion to either find arguments that convince me otherwise, or to convince Brian of the value of the PR: https://github.com/ipython/ipython/pull/1343 It takes both patience and trust for this to work: we have to be willing to wait out the long discussion, and we have to trust that despite how much we may disagree on something, we both play fair and ultimately only want what's best for the project. That means giving the other party the benefit of the doubt at every turn, and having a willingness to let the discussion happen openly as long as is necessary for the project to remain healthy. For example in this case, I'm *really* convinced of my point, and I think blocking this PR actively hurts users. Is it worth saying OK, I'm overriding your concerns here and pushing this forward? Absolutely NOT! I'd only: - alienate Brian, a key member of the project without whom IPython would be nowhere near where it is today, and decrease his motivation to continue working - kill the opportunity for a discussion to produce an even cleaner solution than what we've seen so far - piss off a good friend. I put this last because while that's actually a very important reason for me, the fact that Brian and I are good personal friends is secondary here: this is about discussion between contributors independent of their personal relationships. I hope this perspective is useful... 1. Make it as easy as possible for people to see what's going on and join the discussion. All decisions and reasoning behind decisions take place in public. (On this note, it would be *really* good if pull request notifications went to the list.) If anyone knows how to do this, let me know; I'd like to do the same for IPython and our -dev list. Cheers, f ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
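On the pull-request-notification question above, one piece of such a forwarder can be sketched without any network access: turning a pull request event into a list-ready email whose subject carries the PR title, not just a number and an action. This is a rough sketch under stated assumptions, not how any existing forwarder works. The function name `pull_request_mail` and the dictionary keys (`number`, `title`, `action`, `user`, `url`) are hypothetical and do not reproduce GitHub's API payload layout; the addresses and URL in the usage example are placeholders. Only Python's standard `email.message.EmailMessage` is used.

```python
from email.message import EmailMessage


def pull_request_mail(pr: dict, list_address: str, sender: str) -> EmailMessage:
    """Build a notification email for one pull request event.

    `pr` is a plain dict with hypothetical keys: number, title,
    action, user, url. Fetching events and sending mail are left out.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = list_address
    # Put the title in the subject so readers can tell PRs apart at a glance.
    msg["Subject"] = f"[PR #{pr['number']}] {pr['title']} ({pr['action']})"
    msg.set_content(
        f"{pr['user']} {pr['action']} pull request #{pr['number']}:\n\n"
        f"{pr['title']}\n{pr['url']}\n"
    )
    return msg


if __name__ == "__main__":
    example = {
        "number": 1,
        "title": "Example change",
        "action": "opened",
        "user": "someone",
        "url": "https://example.invalid/pull/1",
    }
    mail = pull_request_mail(example, "dev-list@example.invalid", "bot@example.invalid")
    print(mail["Subject"])
```

Actually delivering the message would still need an external server and the list admins' agreement to let the mail through.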