Re: [Numpy-discussion] What is consensus anyway

2012-05-04 Thread Pauli Virtanen

26.04.2012 03:11, Travis Oliphant wrote:
[clip]
 It would be nice if every pull request created a message to this list.
 Is that even possible?

Unidirectional forwarding is possible, for instance using Github's API,

https://github.com/pv/github-pull-request-fwd

Github itself doesn't offer tools to do this, so an external server for
sending the mails is needed, and the mailing list admins need to allow
the corresponding mails to pass through.
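
As a rough illustration of the forwarding idea above, here is a minimal Python sketch. It is not the code from the linked repository. The repository URL queried, the sender address, and all function names are assumptions made for this example; only the JSON field names (number, title, html_url, user.login) follow Github's pull request API.

```python
import json
import urllib.request
from email.message import EmailMessage

# Assumed values for illustration only; none of these come from the thread.
API_URL = "https://api.github.com/repos/numpy/numpy/pulls?state=open"
LIST_ADDRESS = "NumPy-Discussion@scipy.org"
SENDER = "pr-forwarder@example.org"  # hypothetical sender address


def fetch_open_pull_requests(url=API_URL):
    """Return the open pull requests as decoded JSON (needs network access)."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def unseen(pull_requests, seen_numbers):
    """Keep only pull requests whose number has not been forwarded yet."""
    return [pr for pr in pull_requests if pr["number"] not in seen_numbers]


def pull_request_to_mail(pr, sender=SENDER, recipient=LIST_ADDRESS):
    """Build one plain-text mail describing a single pull request."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # Put the real title in the subject so readers can tell PRs apart.
    msg["Subject"] = "[PR #{}] {}".format(pr["number"], pr["title"])
    msg.set_content(
        "{} opened a pull request:\n\n{}\n\n{}\n".format(
            pr["user"]["login"], pr["title"], pr["html_url"]
        )
    )
    return msg
```

Actually delivering the messages would still need an external mail server (for example via smtplib) and list admins willing to let the sender address through, as noted above.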

Pauli

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] What is consensus anyway

2012-04-26 Thread Paul Hobson

We're kind of drifting again here, but...

Remember when all this discussion happened on usenet? Perhaps we're in
yet another awkward transition period and soon all email list-type
discussions will be on Github, Bitbucket, StackOverflow (e.g. pandas),
etc.

There are advantages and disadvantages to any sort of discussion
paradigm, but I can imagine a future version of Github where each
project has a tab for a StackOverflow-esque forum. As a user, that all
sounds pretty appealing to me. But this is all just speculation and
conjecture...
-paul

On Wed, Apr 25, 2012 at 9:48 PM, Fernando Perez fperez@gmail.com wrote:
 On Wed, Apr 25, 2012 at 6:28 PM, Benjamin Root ben.r...@ou.edu wrote:
 It would be nice if every pull request created a message to this list.
  Is that even possible?

 -Travis


 This has been a concern of mine for matplotlib as well.  The closest I can
 come is to set up an RSS feed, but all the titles are PR # and an action, so
 I lose track of which ones I want to view.

 Same here for IPython.  If anybody figures out a clean solution,
 please advertise it!  I think a bunch of us want the same thing...

 Cheers,

 f

Re: [Numpy-discussion] What is consensus anyway

2012-04-26 Thread Ralf Gommers

On Thu, Apr 26, 2012 at 6:37 AM, srean srean.l...@gmail.com wrote:


 On something else that was brought up: I do not consider myself
 competent/prepared enough to take on development, but it is not the
 case that I have _never_ felt the temptation. What I have found
 intimidating and stymieing is the perceived politics over development
 issues.  The two places where I have felt this are a) on contentious
 threads on the list and b) what seem like legitimate patch tickets
 on trac that seem to be languishing for no compelling technical
 reason. I would be hardpressed to quote specifics, but I have
 encountered this feeling a few times.


Patches languishing on Trac is a real problem. The issue here is not at all
about not wanting those patches, but just about the overhead of getting
them reviewed/fixed/committed. This problem has more or less disappeared
with Github; there are very few PRs that are just sitting there.

As for existing patches on Trac, if you or anyone else has an interest in
one of them, checking that patch for test coverage / documentation and
resubmitting it as a PR would be a massive help.

Ralf

Re: [Numpy-discussion] What is consensus anyway

2012-04-26 Thread Chris Barker

On Mon, Apr 23, 2012 at 11:18 PM, Ralf Gommers

 Perhaps a more formal development release system could help here.
 IIUC, numpy pretty much has two things:

 This is a good idea - not for development releases but for master. Building
 nightly/weekly binaries would help more people try out new features.

good start, but I think master may fluctuate too quickly (and how
often is it broken?) but better than nothing, yes?

 2) there is the wxversion system

 wxversion was broken for a long time on Ubuntu too (~5 yrs ago). I don't
 exactly remember it as a good idea.

well, it was a good idea, maybe not a good implementation -- and it
was very helpful a few years back when wx was in major flux. What we
really need is python itself providing a package version selection
mechanism, but Guido never saw the need (the existence of
virtualenv proves the need if you ask me)

 virtualenv also doesn't help, because if you can use that you know how to 
 build from source anyway.

not true -- lots of folks use easy_install and/or pip with virtualenv.

and the git barrier to entry is not trivial -- granted just getting
master is not hard, but I know I've been using git for a couple months
on a core project of mine, and I still find it's giving me far more
pain than help. (I know I still haven't wrapped my brain around what
DVCS really is...)

-Chris



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov

Re: [Numpy-discussion] What is consensus anyway

2012-04-26 Thread Ralf Gommers

On Thu, Apr 26, 2012 at 7:02 PM, Chris Barker chris.bar...@noaa.gov wrote:

 On Mon, Apr 23, 2012 at 11:18 PM, Ralf Gommers

  Perhaps a more formal development release system could help here.
  IIUC, numpy pretty much has two things:

  This is a good idea - not for development releases but for master.
 Building
  nightly/weekly binaries would help more people try out new features.

 good start, but I think master may fluctuate too quickly (and how
 often is it broken?) but better than nothing, yes?


How often is it broken? A couple of failing tests yes, but hardly ever
seriously broken.


  2) there is the wxversion system

  wxversion was broken for a long time on Ubuntu too (~5 yrs ago). I don't
  exactly remember it as a good idea.

 well, it was a good idea, maybe not a good implementation -- and it
 was very helpful a few years back when wx was in major flux. What we
 really need is python itself providing a package version selection
 mechanism, but Guido never saw the need (the existence of
 virtualenv proves the need if you ask me)

 agreed


  virtualenv also doesn't help, because if you can use that you know how
 to build from source anyway.

 not true -- lots of folks use easy_install and/or pip with virtualenv.


Pip only installs from source, so if you haven't got the right compilers,
development headers etc. it will fail for numpy. easy_install is also a
lottery, and only works for numpy on Windows unless you are set up to build
from source.

Ralf

Re: [Numpy-discussion] What is consensus anyway

2012-04-26 Thread srean

 Patches languishing on Trac is a real problem. The issue here is not at all
 about not wanting those patches,

Oh yes I am sure of that, in the past it had not been clear what more
is necessary to get them pulled in, or how to go about satisfying the
requirements. The document you mailed on the scipy list goes a long
way in addressing those issues. So thanks a lot. In fact it might be a
good idea to add the link to it in the signature of the mail that trac
replies with.

 but just about the overhead of getting them
 reviewed/fixed/committed. This problem has more or less disappeared with
 Github; there are very few PRs that are just sitting there.

 As for existing patches on Trac, if you or anyone else has an interest in
 one of them, checking that patch for test coverage / documentation and
 resubmitting it as a PR would be a massive help.

 Ralf

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Nathaniel Smith

On Wed, Apr 25, 2012 at 4:02 AM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Tue, Apr 24, 2012 at 8:56 PM, Fernando Perez fperez@gmail.com
 wrote:

 On Tue, Apr 24, 2012 at 6:12 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
  I admit to a certain curiosity about your own involvement in FOSS
  projects,
  and I know I'm not alone in this. Google shows several years of
  discussion
  on Monotone, but I have no idea what your contributions were

 Seriously???

 Please, let's rise above this.  We discuss people's opinions *on their
 technical merit alone*, regardless of the background of the person
 presenting them.  I don't care if Linus himself shows up on the list
 with a bad idea, it should be shot down; and if someone we'd never
 heard of brings up a valid point, we should respect it.

 The day we start checking credentials at the door is the day this
 project will die as an open source effort.  Or at least I think so,
 but perhaps I don't have enough 'commit credits' in my account for my
 opinion to matter...


 Fernando, I'm not checking credentials, I'm curious. Nathaniel has
 experience with FOSS projects, unlike us first timers, and I'd like to know
 what that experience was and what he learned from it. He has also mentioned
 Graydon Hoare in connection with RUST, and since Graydon was the prime mover
 in Monotone I'd like to know the story of the project.

Yeah, I don't want to get into resumes and such here, since it'd be
hard to avoid turning it into one of those who-has-a-bigger-FOSS
pecking-order contests, which I find both unpleasant and
counter-productive. If I've learned anything useful from experience,
then I've already tried to summarize it here (and really, experience
may or may not guarantee any kind of wisdom). If you want to swap war
stories, ask me some day over a $BEVERAGE :-).

After sleeping on it, I was wondering if part of your objection to the
consensus stuff is just to the word "veto"? Would you feel more
comfortable if it was phrased like, "the maintainers have noticed that
trying to pick and choose on contentious issues tends to come back and
bite them, so they've decided that they will not accept changes unless
they have reasonable certainty that all substantive objections from
the userbase have been worked through and resolved"? It means the same
thing in the end, but perhaps makes clearer how the power actually
works.

-- Nathaniel

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Charles R Harris

On Wed, Apr 25, 2012 at 4:07 AM, Nathaniel Smith n...@pobox.com wrote:

 On Wed, Apr 25, 2012 at 4:02 AM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  On Tue, Apr 24, 2012 at 8:56 PM, Fernando Perez fperez@gmail.com
  wrote:
 
  On Tue, Apr 24, 2012 at 6:12 PM, Charles R Harris
  charlesr.har...@gmail.com wrote:
   I admit to a certain curiosity about your own involvement in FOSS
   projects,
   and I know I'm not alone in this. Google shows several years of
   discussion
   on Monotone, but I have no idea what your contributions were
 
  Seriously???
 
  Please, let's rise above this.  We discuss people's opinions *on their
  technical merit alone*, regardless of the background of the person
  presenting them.  I don't care if Linus himself shows up on the list
  with a bad idea, it should be shot down; and if someone we'd never
  heard of brings up a valid point, we should respect it.
 
  The day we start checking credentials at the door is the day this
  project will die as an open source effort.  Or at least I think so,
  but perhaps I don't have enough 'commit credits' in my account for my
  opinion to matter...
 
 
  Fernando, I'm not checking credentials, I'm curious. Nathaniel has
  experience with FOSS projects, unlike us first timers, and I'd like to
 know
  what that experience was and what he learned from it. He has also
 mentioned
  Graydon Hoare in connection with RUST, and since Graydon was the prime
 mover
  in Monotone I'd like to know the story of the project.

 Yeah, I don't want to get into resumes and such here, since it'd be
 hard to avoid turning it into one of those who-has-a-bigger-FOSS
 pecking-order contests, which I find both unpleasant and
 counter-productive. If I've learned anything useful from experience,
 then I've already tried to summarize it here (and really, experience
 may or may not guarantee any kind of wisdom). If you want to swap war
 stories, ask me some day over a $BEVERAGE :-).


Well, you have already appealed to the authority of greater experience, so
it's a bit late to declare disinterest in the subject ;) I mean, at this
point I really would like to see how big your FOSS is.


 After sleeping on it, I was wondering if part of your objection to the
 consensus stuff is just to the word "veto"? Would you feel more
 comfortable if it was phrased like, "the maintainers have noticed that
 trying to pick and choose on contentious issues tends to come back and
 bite them, so they've decided that they will not accept changes unless
 they have reasonable certainty that all substantive objections from
 the userbase have been worked through and resolved"? It means the same
 thing in the end, but perhaps makes clearer how the power actually
 works.


I don't agree here. People work on open source to scratch an itch, so the
process of making a contribution needs to be easy. Widespread veto makes it
more difficult and instead of opening up the process, closes it down. There
is less freedom, not more. That is one of the reasons that the smaller
scikits attract people, they have more freedom to do what they want and
fewer people to answer to. Scipy also has some of that advantage because
there are a number of packages to choose from. The more strict the process
and the more people to please, the less appealing the environment becomes.
This can be observed in practice and the voluntary nature of FOSS amplifies
the effect.

But in the end, someone has to write the code. Steve McConnell (Code
Complete) estimates that even in carefully planned projects code
construction will take up 60-80 percent of the time and effort. And if the
code isn't written, nothing else matters much. That is why people who write
code are essential to a project, no amount of structure will take their
place. And here again the voluntary nature of FOSS comes into play, folks
can't be ordered to do the work. It can be suggested that certain things be
done, and the desire to work with the group will lead people to do work
they wouldn't consider doing for themselves, but unless they are interested
in a particular feature they won't generally be motivated to sit down and
devote the effort needed to get it done just because someone else wants it.
And they will rightly be offended if anyone demands that they volunteer
their work to implement some feature in a particular way. They have to be
led there, not pushed.

Chuck

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Gael Varoquaux

On Wed, Apr 25, 2012 at 06:03:25AM -0600, Charles R Harris wrote:
 Well, you have already appealed to the authority of greater experience, so
 it's a bit late to declare disinterest in the subject ;) I mean, at this
 point I really would like to see how big your FOSS is.

Chuck, I am not sure that this is helpful for the discussion. I think
that it is a great discussion to have in real life, as it is one of those
in which all participants can learn a lot, but on a mailing list with a
wider diffusion, it can very easily drift into a pissing contest.

Gaël

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Nathaniel Smith

On Wed, Apr 25, 2012 at 1:03 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
 That is one of the reasons that the smaller
 scikits attract people, they have more freedom to do what they want and
 fewer people to answer to. Scipy also has some of that advantage because
 there are a number of packages to choose from. The more strict the process
 and the more people to please, the less appealing the environment becomes.

A quick look shows ~100,000 downloads of 1.6.1 via PyPI. SF.net shows
600,000 numpy downloads in the last 12 months. I'm afraid the numpy
developers have a lot of people to please, whether they like it or not
:-).

OTOH I'm still confused at what kind of strictness you're worried
about in practice. Not too many of those people actually show up on
the mailing list, and usually the problem is convincing those that
*do* show up into actually expressing their needs rather than just
assuming that real developers must know better. Fernando spoke
eloquently in this thread in support of consensus, and IPython doesn't
seem to be laboring under a strict process that's driving away
developers. AFAICT whole-heartedly adopting the consensus idea would
only have actually altered one (!) decision in the project to date,
which is not exactly jack-booted as these things go.

- N

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Travis Oliphant

 
 I don't agree here. People work on open source to scratch an itch, so the 
 process of making a contribution needs to be easy. Widespread veto makes it 
 more difficult and instead of opening up the process, closes it down. There 
 is less freedom, not more. That is one of the reasons that the smaller 
 scikits attract people, they have more freedom to do what they want and fewer 
 people to answer to. Scipy also has some of that advantage because there are 
 a number of packages to choose from. The more strict the process and the more 
 people to please, the less appealing the environment becomes. This can be 
 observed in practice and the voluntary nature of FOSS amplifies the effect.

It is true that it is easier to get developers to contribute to small projects 
where they can control exactly what happens and not have to appeal to a wider 
audience to get code changed and committed.   This effect is well-illustrated 
by the emergence of scikits in the presence of SciPy. 

However, the idea that people work on open source to scratch an itch is 
incomplete.   This is certainly one of the reasons volunteers work on open 
source.   There are many people, however, who work on open source as part of 
their job. In the particular instance of the missing data support, Mark did 
much of the work as part of his job.   It wasn't just to scratch an itch.
So, we should not make assumptions on the basis of this incomplete model.  

NumPy is far beyond the mode of a few people scratching an itch.   It is in 
widespread use.   It is a large project with a great deal of history and a 
diverse user community.   It needs people full-time to help maintain it.   It 
needs maintainers who listen actively to anyone who will express their concerns 
cogently.   It needs maintainers who recognize that any concern that somebody 
expresses is typically not a unique view.   We cannot expect to find people 
like that who are just interested in scratching an itch and always working 
for free.

Most projects suffer from lack of feedback.   We should be worried about how to 
get more feedback and input from *just users* and be very sensitive to anyone 
feeling like their legitimate concerns are not being heard.  Most people, 
rather than express their concerns, will just work-around the problem, write 
their own stuff, or move on to other languages and approaches.  

Your point about somebody writing the code is absolutely true, I would just 
suggest that the view that FOSS is always just volunteer labor needs to expand. 
 People do work full time on FOSS as part of their job.   We need to bring that 
to NumPy.   I know of at least 2 other people besides me who are actively 
trying to make this possible.   

At Continuum we offer the opportunity to work on NumPy.   We plan to continue 
this.  We are hiring.   In this context, I'm especially interested in making 
sure that it's not just the developers who get to decide what happens to NumPy. 
  Nathaniel has clarified very well what veto-power really means.  It's not 
absolute, it just means that users who write clear arguments get listened to 
actively.   It doesn't replace the need for developers with wisdom and 
understanding of user-experiences, but active listening is a useful skill 
that we could all improve on:  http://en.wikipedia.org/wiki/Active_listening
A list full of bright, interested, active listeners is the kind of culture we 
need on this list.  It's the kind of attitude we need from maintainers of 
NumPy. 

-Travis






 
 But in the end, someone has to write the code. Steve McConnell (Code 
 Complete) estimates that even in carefully planned projects code construction 
 will take up 60-80 percent of the time and effort. And if the code isn't 
 written, nothing else matters much. That is why people who write code are 
 essential to a project, no amount of structure will take their place. And 
 here again the voluntary nature of FOSS comes into play, folks can't be 
 ordered to do the work. It can be suggested that certain things be done, and 
 the desire to work with the group will lead people to do work they wouldn't 
 consider doing for themselves, but unless they are interested in a particular 
 feature they won't generally be motivated to sit down and devote the effort 
 needed to get it done just because someone else wants it. And they will 
 rightly be offended if anyone demands that they volunteer their work to 
 implement some feature in a particular way. They have to be led there, not 
 pushed.
 
 Chuck

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Matthew Brett

Hi,

On Wed, Apr 25, 2012 at 9:39 AM, Travis Oliphant tra...@continuum.io wrote:

 I don't agree here. People work on open source to scratch an itch, so the
 process of making a contribution needs to be easy. Widespread veto makes it
 more difficult and instead of opening up the process, closes it down. There
 is less freedom, not more. That is one of the reasons that the smaller
 scikits attract people, they have more freedom to do what they want and
 fewer people to answer to. Scipy also has some of that advantage because
 there are a number of packages to choose from. The more strict the process
 and the more people to please, the less appealing the environment becomes.
 This can be observed in practice and the voluntary nature of FOSS amplifies
 the effect.


 It is true that it is easier to get developers to contribute to small
 projects where they can control exactly what happens and not have to appeal
 to a wider audience to get code changed and committed.   This effect is
 well-illustrated by the emergence of scikits in the presence of SciPy.

 However, the idea that people work on open source to scratch an itch is
 incomplete.

Do you agree that Numpy has not been very successful in recruiting and
maintaining new developers compared to its large user-base?

Compared to - say - Sympy?

Why do you think this is?

Would you consider asking that question directly on list and asking
for the most honest possible answers?

Best,

Matthew

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Andreas H.

 Do you agree that Numpy has not been very successful in recruiting and
 maintaining new developers compared to its large user-base?
 
 Compared to - say - Sympy?
 
 Why do you think this is?

I don't know about SymPy. But in my view (and I'm just a typical user of
NumPy), numpy seems to be at the base of what people actually need to
do. I would assume most users of numpy actually use it because it is the
underlying layer of other software, e.g. SciPy. It provides convenient, fast
array structures to do maths. I would assume that most users see numpy
as infrastructure, they write their own code on top of it. As a normal
user of numpy, I wouldn't know where it would need improvement to suit
my needs because it already does all I need. (Okay, masked arrays are
something which could definitely improve, but that's another story.)

This is different from other, higher-level FOSS projects, which are
closer to end user final requirements, where end users might be more
compelled to contribute because it's closer to what they're actually
doing. For example, I just wrote two enhancements to scipy.interpolate,
which were / will be merged recently / soon.

Plus, numpy is a lot of C code, and to me (again, as a user) it seems
more complicated to contribute because things are not as isolated.

Just my 2 ct.

Andreas.

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Alan G Isaac

On 4/25/2012 4:51 PM, Andreas H. wrote:
 I would assume that most users see numpy
 as infrastructure, they write their own code on top of it. As a normal
 user of numpy, I wouldn't know where it would need improvement to suit
 my needs because it already does all I need. (Okay, masked arrays are
 something which could definitely improve, but that's another story.)

 This is different from other, higher-level FOSS projects,


Thank you Andreas.  I was debating whether to explain exactly this,
to point out that I found Matthew's question inappropriately aggressive,
or both. Now I can do both in a flash.

But I find I would also like to once again say thank you to the
developers, who have given us an amazing piece of software.
I would add that I am impressed by the deep respect they show
each other even when dealing with hard issues.

Alan Isaac
Just another grateful user for many years.


Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Adam Hughes

I too have to agree with Andreas.  I have been using Numpy for years in my
work, but am not versed in C so I don't even understand what numpy is doing
under the hood.  I too would only be able to contribute to the code at the
python level, or as Andreas said, at improving SciPy packages and other
Numpy-based projects.

One area that you may be able to get more help from the general user base
is with publicity, tutorials and word-of-mouth.  I had recently shown Numpy
to a friend who was versed in matlab, and he was really impressed because
Numpy is easily incorporated into more general Python scripts.  I've worked
a lot with the Enthought Tool Suite and shown off some of that to my
colleagues.  They are impressed at the streamlined code-to-visuals process
although I don't think they even realize that Numpy is responsible for all
the numerics in the program.  To this end, I think outreach would be
helpful in recruiting new programmers.  Once they understand that Numpy
does a lot at the C-level and that it is not strictly a Python feature,
they may realize it's something that they can contribute to.

On Wed, Apr 25, 2012 at 5:04 PM, Alan G Isaac alan.is...@gmail.com wrote:

 On 4/25/2012 4:51 PM, Andreas H. wrote:
  I would assume that most users see numpy
  as infrastructure, they write their own code on top of it. As a normal
  user of numpy, I wouldn't know where it would need improvement to suit
  my needs because it already does all I need. (Okay, masked arrays are
  something which could definitely improve, but that's another story.)
 
  This is different from other, higher-level FOSS projects,


 Thank you Andreas.  I was debating whether to explain exactly this,
 to point out that I found Matthew's question inappropriately aggressive,
 or both. Now I can do both in a flash.

 But I find I would also like to once again say thank you to the
 developers, who have given us an amazing piece of software.
 I would add that I am impressed by the deep respect they show
 each other even when dealing with hard issues.

 Alan Isaac
 Just another grateful user for many years.


Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Travis Oliphant

 
 Do you agree that Numpy has not been very successful in recruiting and
 maintaining new developers compared to its large user-base?
 
 Compared to - say - Sympy?
 
 Why do you think this is?

I think it's mostly because it's infrastructure that is a means to an end.   I 
certainly wasn't excited to have to work on NumPy originally, when my main 
interest was SciPy.   I've come to love the interesting plateau that NumPy 
lives on.   But, I think it mostly does the job it is supposed to do.   The 
fact that it is in C is also not very sexy.   It is also rather complicated 
with a lot of inter-related parts.   

I think NumPy could do much, much more --- but getting there is going to be a 
challenge of execution and education.  

You can get to know the code base.  It just takes some time and patience.   You 
also have to be comfortable with compilers and building software just to tweak 
the code. 


 
 Would you consider asking that question directly on list and asking
 for the most honest possible answers?

I'm always interested in honest answers and welcome any sincere perspective.  

-Travis


Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Matthew Brett

Hi,

On Wed, Apr 25, 2012 at 2:35 PM, Travis Oliphant tra...@continuum.io wrote:

 Do you agree that Numpy has not been very successful in recruiting and
 maintaining new developers compared to its large user-base?

 Compared to - say - Sympy?

 Why do you think this is?

 I think it's mostly because it's infrastructure that is a means to an end.   
 I certainly wasn't excited to have to work on NumPy originally, when my main 
 interest was SciPy.    I've come to love the interesting plateau that NumPy 
 lives on.    But, I think it mostly does the job it is supposed to do.     
 The fact that it is in C is also not very sexy.   It is also rather 
 complicated with a lot of inter-related parts.

 I think NumPy could do much, much more --- but getting there is going to be a 
 challenge of execution and education.

 You can get to know the code base.  It just takes some time and patience.   
 You also have to be comfortable with compilers and building software just to 
 tweak the code.



 Would you consider asking that question directly on list and asking
 for the most honest possible answers?

 I'm always interested in honest answers and welcome any sincere perspective.

Of course, there are potential explanations:

1) Numpy is too low-level for most people
2) The C code is too complicated
3) It's fine already, more or less

are some obvious ones. I would say these are the easy answers. But of
course, the easy answer may not be the right answer. It may not be
easy to get the right answer [1].   As you can see from Alan Isaac's reply
on this thread, even asking the question can be taken as being in bad
faith.  In that situation, I think you'll find it hard to get sincere
replies.

Best,

Matthew

[1] http://en.wikipedia.org/wiki/Good_to_Great

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread josef . pktd

On Wed, Apr 25, 2012 at 5:54 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Wed, Apr 25, 2012 at 2:35 PM, Travis Oliphant tra...@continuum.io wrote:

 Do you agree that Numpy has not been very successful in recruiting and
 maintaining new developers compared to its large user-base?

 Compared to - say - Sympy?

 Why do you think this is?

 I think it's mostly because it's infrastructure that is a means to an end.   
 I certainly wasn't excited to have to work on NumPy originally, when my main 
 interest was SciPy.    I've come to love the interesting plateau that NumPy 
 lives on.    But, I think it mostly does the job it is supposed to do.     
 The fact that it is in C is also not very sexy.   It is also rather 
 complicated with a lot of inter-related parts.

 I think NumPy could do much, much more --- but getting there is going to be 
 a challenge of execution and education.

 You can get to know the code base.  It just takes some time and patience.   
 You also have to be comfortable with compilers and building software just to 
 tweak the code.



 Would you consider asking that question directly on list and asking
 for the most honest possible answers?

 I'm always interested in honest answers and welcome any sincere perspective.

 Of course, there are potential explanations:

 1) Numpy is too low-level for most people
 2) The C code is too complicated
 3) It's fine already, more or less

 are some obvious ones. I would say there are the easy answers. But of
 course, the easy answer may not be the right answer. It may not be
 easy to get right answer [1].   As you can see from Alan Isaac's reply
 on this thread, even asking the question can be taken as being in bad
 faith.  In that situation, I think you'll find it hard to get sincere
 replies.

I don't see why these shouldn't be sincere replies; I think these
easy answers are also the right answers for most people.

maybe I would add
4) writing code for a few hundred thousand users is a big
responsibility and a bit scary

Except for a few core C developers, most contributors contribute to
parts of numpy (the best example being Pierre and masked arrays) or to
specific functions. Life goes on for most developers in the application
areas, I guess. For example, I'm very glad about the time that Pauli is
spending on scipy.

numpy is great [1]

Josef


 Best,

 Matthew

 [1] http://en.wikipedia.org/wiki/Good_to_Great

[1]
http://sourceforge.net/projects/numpy/files/stats/timeline?dates=2000-01-11+to+2012-04-25
http://qa.debian.org/popcon.php?package=python-numpy


Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Benjamin Root

On Wednesday, April 25, 2012, Matthew Brett wrote:

 Hi,

 On Wed, Apr 25, 2012 at 2:35 PM, Travis Oliphant 
 tra...@continuum.io
 wrote:
 
  Do you agree that Numpy has not been very successful in recruiting and
  maintaining new developers compared to its large user-base?
 
  Compared to - say - Sympy?
 
  Why do you think this is?
 
  I think it's mostly because it's infrastructure that is a means to an
 end.   I certainly wasn't excited to have to work on NumPy originally, when
 my main interest was SciPy.    I've come to love the interesting plateau
 that NumPy lives on.    But, I think it mostly does the job it is supposed
 to do.    The fact that it is in C is also not very sexy.   It is also
 rather complicated with a lot of inter-related parts.
 
  I think NumPy could do much, much more --- but getting there is going to
 be a challenge of execution and education.
 
  You can get to know the code base.  It just takes some time and
 patience.   You also have to be comfortable with compilers and building
 software just to tweak the code.
 
 
 
  Would you consider asking that question directly on list and asking
  for the most honest possible answers?
 
  I'm always interested in honest answers and welcome any sincere
 perspective.

 Of course, there are potential explanations:

 1) Numpy is too low-level for most people
 2) The C code is too complicated
 3) It's fine already, more or less

 are some obvious ones. I would say there are the easy answers. But of
 course, the easy answer may not be the right answer. It may not be
 easy to get right answer [1].   As you can see from Alan Isaac's reply
 on this thread, even asking the question can be taken as being in bad
 faith.  In that situation, I think you'll find it hard to get sincere
 replies.


As with anything, the phrasing of a question makes a world of difference
with regard to replies. Ask any pollster.  When phrased correctly, I would
not have any doubt about the sincerity of replies, and I would not worry
about perceived hostility -- when phrased correctly. As the questioner, the
onus is upon you to gauge the community and adjust the question
appropriately.

I think the fact that we engage in these discussions shows that we value and
care about each other's perceptions and opinions with regard to numpy.

Cheers!
Ben Root

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Matthew Brett

Hi,

On Wed, Apr 25, 2012 at 1:35 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Wed, Apr 25, 2012 at 9:39 AM, Travis Oliphant tra...@continuum.io wrote:

 I don't agree here. People work on open source to scratch an itch, so the
 process of making a contribution needs to be easy. Widespread veto makes it
 more difficult and instead of opening up the process, closes it down. There
 is less freedom, not more. That is one of the reasons that the smaller
 scikits attract people, they have more freedom to do what they want and
 fewer people to answer to. Scipy also has some of that advantage because
 there are a number of packages to choose from. The more strict the process
 and the more people to please, the less appealing the environment becomes.
 This can be observed in practice and the voluntary nature of FOSS amplifies
 the effect.


 It is true that it is easier to get developers to contribute to small
 projects where they can control exactly what happens and not have to appeal
 to a wider audience to get code changed and committed.   This effect is
 well-illustrated by the emergence of scikits in the presence of SciPy.

 However, the idea that people work on open source to scratch an itch is
 incomplete.

 Do you agree that Numpy has not been very successful in recruiting and
 maintaining new developers compared to its large user-base?

 Compared to - say - Sympy?

 Why do you think this is?

 Would you consider asking that question directly on list and asking
 for the most honest possible answers?

Aha - I now realize that I was reading too quickly under the influence
(again) of too much caffeine, and missed this part of Travis' email:

  In this context, I'm especially interested
 in making sure that it's not just the developers who get to decide what
 happens to NumPy.   Nathaniel has clarified very well what veto-power
 really means.  It's not absolute, it just means that users who write clear
 arguments get listened to actively.   It doesn't replace the need for
 developers with wisdom and understanding of user-experiences, but active
 listening is a useful skill that we could all improve on:
  http://en.wikipedia.org/wiki/Active_listening    A list full of bright,
 interested, active listeners is the kind of culture we need on this list.
  It's the kind of attitude we need from maintainers of NumPy.

which mostly answers my worry, and I apologize for pushing on an open door.

See you,

Matthew

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread David Cournapeau

On Wed, Apr 25, 2012 at 10:54 PM, Matthew Brett matthew.br...@gmail.com wrote:

 Hi,

 On Wed, Apr 25, 2012 at 2:35 PM, Travis Oliphant tra...@continuum.io
 wrote:
 
  Do you agree that Numpy has not been very successful in recruiting and
  maintaining new developers compared to its large user-base?
 
  Compared to - say - Sympy?
 
  Why do you think this is?
 
  I think it's mostly because it's infrastructure that is a means to an
 end.   I certainly wasn't excited to have to work on NumPy originally, when
 my main interest was SciPy.    I've come to love the interesting plateau
 that NumPy lives on.    But, I think it mostly does the job it is supposed
 to do.    The fact that it is in C is also not very sexy.   It is also
 rather complicated with a lot of inter-related parts.
 
  I think NumPy could do much, much more --- but getting there is going to
 be a challenge of execution and education.
 
  You can get to know the code base.  It just takes some time and
 patience.   You also have to be comfortable with compilers and building
 software just to tweak the code.
 
 
 
  Would you consider asking that question directly on list and asking
  for the most honest possible answers?
 
  I'm always interested in honest answers and welcome any sincere
 perspective.

 Of course, there are potential explanations:

 1) Numpy is too low-level for most people
 2) The C code is too complicated
 3) It's fine already, more or less

 are some obvious ones. I would say there are the easy answers. But of
 course, the easy answer may not be the right answer. It may not be
 easy to get right answer [1].   As you can see from Alan Isaac's reply
 on this thread, even asking the question can be taken as being in bad
 faith.  In that situation, I think you'll find it hard to get sincere
 replies.


While I don't think jumping into NumPy C code is as difficult as some
people make it out to be, I think numpy has reaped most of the low-hanging
fruit, and is now at a stage where it requires massive investment to get
significantly better.

I would suggest a different question, whose answer may serve as a proxy to
uncover the lack of contributions: what needs to be done in NumPy, and how
can we make it simpler for newcomers? Here is an incomplete,
shamelessly biased list:

  - Fewer dependencies on CPython internals
  - Allow for 3rd parties to extend numpy at the C level in more
fundamental ways (e.g. I wish something like a half-float dtype could be
more easily developed out of tree)
  - Separate memory representation from higher level representation
(slicing, broadcasting, etc…), to allow arrays to sit on non-contiguous
memory areas, etc…
  - Test and performance infrastructure so we can track our evolution, get
coverage of our C code, etc…
  - Fix bugs
  - Better integration with 3rd party on-disk storage (database, etc…)

None of that is particularly simple or quick to learn, except
for fixing bugs and maybe some of the infrastructure. I think most of this
is necessary for the things Travis talked about a few weeks ago.

What could make contributions easier:
  - different levels of C API documentation (still lacking anything besides
reference)
  - ways to detect early when we break ABI, slightly more obscure platforms
(we need good CI, ways to publish binaries that people can easily test,
etc...)
  - improve infrastructure so that we can focus on the things we want to
work on (improve the dire situation with bug tracking, etc…)

Also, lots of people just don't know/want to know C. But people with, say,
web skills would be welcome: we have a website that could use some help…

So

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Matthew Brett

Hi,

On Wed, Apr 25, 2012 at 3:24 PM,  josef.p...@gmail.com wrote:
 On Wed, Apr 25, 2012 at 5:54 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Wed, Apr 25, 2012 at 2:35 PM, Travis Oliphant tra...@continuum.io wrote:

 Do you agree that Numpy has not been very successful in recruiting and
 maintaining new developers compared to its large user-base?

 Compared to - say - Sympy?

 Why do you think this is?

 I think it's mostly because it's infrastructure that is a means to an end.  
  I certainly wasn't excited to have to work on NumPy originally, when my 
 main interest was SciPy.    I've come to love the interesting plateau that 
 NumPy lives on.    But, I think it mostly does the job it is supposed to 
 do.     The fact that it is in C is also not very sexy.   It is also rather 
 complicated with a lot of inter-related parts.

 I think NumPy could do much, much more --- but getting there is going to be 
 a challenge of execution and education.

 You can get to know the code base.  It just takes some time and patience.   
 You also have to be comfortable with compilers and building software just 
 to tweak the code.



 Would you consider asking that question directly on list and asking
 for the most honest possible answers?

 I'm always interested in honest answers and welcome any sincere perspective.

 Of course, there are potential explanations:

 1) Numpy is too low-level for most people
 2) The C code is too complicated
 3) It's fine already, more or less

 are some obvious ones. I would say there are the easy answers. But of
 course, the easy answer may not be the right answer. It may not be
 easy to get right answer [1].   As you can see from Alan Isaac's reply
 on this thread, even asking the question can be taken as being in bad
 faith.  In that situation, I think you'll find it hard to get sincere
 replies.

 I don't see why this shouldn't be the sincere replies, I think these
 easy answers are also the right answer for most people.

I wasn't saying these replies are not sincere, of course they are factors.

I have heard other people give reasons why they didn't enjoy numpy
development much, but I can't speak for them, only for me.

I have done some numpy development, but very little.

I've done a moderate amount of scipy development.

I have considered doing more numpy development, in particular, I did
want to do some work on the longdouble parts of numpy.

Part of the reason I didn't do this was because, when I raised the
question on the list, it did not seem there was much interest in a
change, or even a real discussion.

Partly from the masked array discussions, but not only, it seemed that
the process of making decisions was not clear, and there seemed to be
as many views about how this was done as there were developers.

I suppose I'd summarize the atmosphere, as I have felt it, as
being that numpy was owned by someone else, and I wasn't quite sure
who that was, but I was fairly sure it wasn't me.   On the other hand,
in some projects at least - of which Sympy is the most obvious
example, I think it's easy to feel that all of us own Sympy (and I've
only made one commit to Sympy, and that of someone else's idea).

Adding to that, it does seem to me that the atmosphere on this list
gets ugly sometimes.  In particular it seems to me that there's a sort
of conformity that starts to emerge in which people feel it is
necessary to praise or criticize people, but not the arguments.   I
suppose that is because there was a long time during which Travis was
not on the list to model what kind of discussion he wanted.  I'm glad
that has changed now.

The reason I keep returning to process, even though it is
'non-technical' - is because it seems to me that the atmosphere that
I'm describing will have the strong effect of discouraging
enthusiastic developers.  It certainly discourages me.  I don't think
open-source software is just developers scratching an itch, I think
it's about community, and the pleasure of working with people you like
and trust, to do something you think is important.

If I've made that harder, then I am sorry, and I'm very happy to hear
why that is, and how I can help.

Best,

Matthew

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread josef . pktd

On Wed, Apr 25, 2012 at 7:08 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Wed, Apr 25, 2012 at 3:24 PM,  josef.p...@gmail.com wrote:
 On Wed, Apr 25, 2012 at 5:54 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Wed, Apr 25, 2012 at 2:35 PM, Travis Oliphant tra...@continuum.io 
 wrote:

 Do you agree that Numpy has not been very successful in recruiting and
 maintaining new developers compared to its large user-base?

 Compared to - say - Sympy?

 Why do you think this is?

 I think it's mostly because it's infrastructure that is a means to an end. 
   I certainly wasn't excited to have to work on NumPy originally, when my 
 main interest was SciPy.    I've come to love the interesting plateau that 
 NumPy lives on.    But, I think it mostly does the job it is supposed to 
 do.     The fact that it is in C is also not very sexy.   It is also 
 rather complicated with a lot of inter-related parts.

 I think NumPy could do much, much more --- but getting there is going to 
 be a challenge of execution and education.

 You can get to know the code base.  It just takes some time and patience.  
  You also have to be comfortable with compilers and building software just 
 to tweak the code.



 Would you consider asking that question directly on list and asking
 for the most honest possible answers?

 I'm always interested in honest answers and welcome any sincere 
 perspective.

 Of course, there are potential explanations:

 1) Numpy is too low-level for most people
 2) The C code is too complicated
 3) It's fine already, more or less

 are some obvious ones. I would say there are the easy answers. But of
 course, the easy answer may not be the right answer. It may not be
 easy to get right answer [1].   As you can see from Alan Isaac's reply
 on this thread, even asking the question can be taken as being in bad
 faith.  In that situation, I think you'll find it hard to get sincere
 replies.

 I don't see why this shouldn't be the sincere replies, I think these
 easy answers are also the right answer for most people.

 I wasn't saying these replies are not sincere, of course they are factors.

 I have heard other people give reasons why they didn't enjoy numpy
 development much, but I can't speak for them, only for me.

 I have done some numpy development, but very little.

 I've done a moderate amount of scipy development.

 I have considered doing more numpy development, in particular, I did
 want to do some work on the longdouble parts of numpy.

 Part of the reason I didn't do this was because, when I raised the
 question on the list, it did not seem there was much interest in a
 change, or even a real discussion.

 Partly from the masked array discussions, but not only, it seemed that
 the process of making decisions was not clear, and there seemed to be
 as many views about how this was done as there were developers.

 I suppose I'd summarize the atmosphere, as I have have felt it, as
 being that numpy was owned by someone else, and I wasn't quite sure
 who that was, but I was fairly sure it wasn't me.   On the other hand,
 in some projects at least - of which Sympy is the most obvious
 example, I think it's easy to feel that all of us own Sympy (and I've
 only made one commit to Sympy, and that of someone else's idea).

 Adding to that, it does seem to me that the atmosphere on this list
 get ugly sometimes.  In particular it seems to me that there's a sort
 of conformity that starts to emerge in which people feel it is
 necessary to praise or criticize people, but not the arguments.   I
 suppose that is because there was a long time during which Travis was
 not on the list to model what kind of discussion he wanted.  I'm glad
 that has changed now.

 The reason I keep returning to process, even though it is
 'non-technical' - is because it seems to me that the atmosphere that
 I'm describing will have the strong effect of discouraging
 enthusiastic developers.  It certainly discourages me.  I don't think
 open-source software is just developers scratching an itch, I think
 it's about community, and the pleasure of working with people you like
 and trust, to do something you think is important.

Except for the big changes like NA and datetime, I think the debate is
pretty boring.
The main problem that I see for discussing technical issues is whether
there are many developers really interested in commenting on code and
coding.
I think it mostly comes down to the discussion on tickets or pull requests.

First my own experience with scipy.stats. Most of the time when I was
cleaning up scipy.stats, I was alone, except for some helpful comments
by Robert. My itch was that the bugs in scipy.stats were bugging me, and
I just kept working and committing without code review until the bugs
that I thought urgent were gone.
Now, with Warren and Ralf also working on scipy.stats it is a lot more
fun, since there is actually a regular community of 3 developers.

My impression (since I only

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Travis Oliphant


On Apr 25, 2012, at 7:18 PM, josef.p...@gmail.com wrote:

 
 Except for the big changes like NA and datetime, I think the debate is
 pretty boring.
 The main problem that I see for discussing technical issues is whether
 there are many
 developers really interested in commenting on code and coding.
 I think it mostly comes down to the discussion on tickets or pull requests.

This is a very insightful comment.   Github has been a great thing for both 
NumPy and SciPy.   However, it has changed the community feel for many because 
these pull request discussions don't happen on this list. 

You have to comment on a pull request to get notified of future comments or 
changes.    The process is actually pretty nice, but it does mean you can't 
just hang out watching this list.  You have to look at the pull requests and 
get involved there. 

It would be nice if every pull request created a message to this list.    Is 
that even possible? 

-Travis


Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Benjamin Root

On Wednesday, April 25, 2012, Travis Oliphant wrote:


 On Apr 25, 2012, at 7:18 PM, josef.p...@gmail.com wrote:

 
  Except for the big changes like NA and datetime, I think the debate is
  pretty boring.
  The main problem that I see for discussing technical issues is whether
  there are many
  developers really interested in commenting on code and coding.
  I think it mostly comes down to the discussion on tickets or pull
 requests.

 This is a very insightful comment.   Github has been a great thing for
 both NumPy and SciPy.   However, it has changed the community feel for many
 because these pull request discussions don't happen on this list.

 You have to comment on a pull request to get notified of future comments
 or changes.    The process is actually pretty nice, but it does mean you
 can't just hang out watching this list.  You have to look at the pull
 requests and get involved there.

 It would be nice if every pull request created a message to this list.
  Is that even possible?

 -Travis


This has been a concern of mine for matplotlib as well.  The closest I can
come is to set up an RSS feed, but all the titles are just the PR number and
an action, so I lose track of which ones I want to view.

All devs get an initial email for each PR, but I can't figure out how to get
that down to the public list, and it is hard to know if another dev took
care of the PR or if it is just waiting.

Cheers!
Ben Root

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Jason Grout

On 4/25/12 8:11 PM, Travis Oliphant wrote:

 On Apr 25, 2012, at 7:18 PM, josef.p...@gmail.com wrote:


 Except for the big changes like NA and datetime, I think the debate is
 pretty boring.
 The main problem that I see for discussing technical issues is whether
 there are many
 developers really interested in commenting on code and coding.
 I think it mostly comes down to the discussion on tickets or pull requests.

 This is a very insightful comment.   Github has been a great thing for both 
 NumPy and SciPy.   However, it has changed the community feel for many 
 because these pull request discussions don't happen on this list.

 You have to comment on a pull request to get notified of future comments or 
 changes.    The process is actually pretty nice, but it does mean you can't 
 just hang out watching this list.  You have to look at the pull requests and 
 get involved there.

 It would be nice if every pull request created a message to this list.    Is 
 that even possible?

Sure.  Github has a pretty extensive hook system that can notify (via 
hitting a URL) about lots of events.

https://github.com/blog/964-all-of-the-hooks

http://developer.github.com/v3/repos/hooks/

I haven't actually used it (just read the docs), so I may be mistaken...
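As a rough illustration of what a hook receiver might do with a
pull-request notification, here is a minimal sketch in Python. It only
builds the mail; it does not receive the hook or send anything. The
payload keys (action, number, pull_request, title, html_url, user,
login, body) follow my understanding of GitHub's pull request event and
should be checked against the hook documentation. The function name
pull_request_to_email and the addresses are made up for the example.

```python
from email.message import EmailMessage


def pull_request_to_email(payload, list_address, sender):
    """Build a list message from a decoded pull-request event payload.

    `payload` is assumed to be the JSON body of the hook, already parsed
    into a dict. The key names used here are an assumption about that
    payload, not something confirmed in this thread.
    """
    pr = payload["pull_request"]
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = list_address
    # Put the title in the subject so that readers can tell PRs apart.
    msg["Subject"] = f"PR #{payload['number']} {payload['action']}: {pr['title']}"
    lines = [
        f"{pr['user']['login']} {payload['action']} pull request #{payload['number']}",
        pr["html_url"],
        "",
        pr.get("body") or "(no description)",
    ]
    msg.set_content("\n".join(lines))
    return msg


if __name__ == "__main__":
    # Hypothetical payload, trimmed to the keys used above.
    example = {
        "action": "opened",
        "number": 42,
        "pull_request": {
            "title": "Fix typo in docs",
            "html_url": "https://example.org/pr/42",
            "user": {"login": "someone"},
            "body": "Small documentation fix.",
        },
    }
    print(pull_request_to_email(example, "list@example.org", "bot@example.org"))
```

Actually delivering the message would still need a mail server and the
list admins' agreement to let the bot address post.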

Thanks,

Jason

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Puneeth Chaganti

On Thu, Apr 26, 2012 at 6:41 AM, Travis Oliphant tra...@continuum.io wrote:
[snip]

 It would be nice if every pull request created a message to this list.    Is 
 that even possible?

That is definitely possible and shouldn't be too hard to do, like
Jason said.  But that can potentially cause some confusion, with some
of the discussion starting off in the mailing list, and some of the
discussion happening on the pull-request itself.  Are my concerns
justified?

--
Puneeth

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Jason Grout

On 4/25/12 11:08 PM, Puneeth Chaganti wrote:
 On Thu, Apr 26, 2012 at 6:41 AM, Travis Oliphant tra...@continuum.io wrote:
 [snip]

 It would be nice if every pull request created a message to this list.    Is 
 that even possible?

 That is definitely possible and shouldn't be too hard to do, like
 Jason said.  But that can potentially cause some confusion, with some
 of the discussion starting off in the mailing list, and some of the
 discussion happening on the pull-request itself.  Are my concerns
 justified?

It wouldn't be too hard to have mailing list replies sent back to the 
pull request as comments (again, using the github API).  Already, if 
you're on a ticket, you can just reply to a comment email and the reply 
is put as a comment in the pull request.
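
A sketch of that reverse direction, in Python. It only constructs the
HTTP request that would post a list reply as a comment; nothing is sent.
It assumes the GitHub v3 endpoint for issue comments (which pull requests
share, as far as I know) and token authentication. The helper name
build_comment_request, the repository and the token are hypothetical.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_comment_request(owner, repo, number, text, token):
    """Prepare (but do not send) a request that adds a comment to a PR.

    Assumes comments on a pull request are created through the issue
    comments endpoint of the v3 API.
    """
    url = f"{API_ROOT}/repos/{owner}/{repo}/issues/{number}/comments"
    data = json.dumps({"body": text}).encode("utf-8")
    headers = {
        "Authorization": f"token {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


if __name__ == "__main__":
    # Hypothetical values; a real bridge would take them from the mail.
    req = build_comment_request("numpy", "numpy", 42, "Reply from the list", "secret")
    print(req.get_method(), req.full_url)
```

Passing the result to urllib.request.urlopen would perform the call; a
real bridge would also need to map each list reply to the right pull
request number, for instance from the subject line.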

Jason



Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread srean

On Wed, Apr 25, 2012 at 11:08 PM, Puneeth Chaganti puncha...@gmail.com wrote:
 On Thu, Apr 26, 2012 at 6:41 AM, Travis Oliphant tra...@continuum.io wrote:
 [snip]

 It would be nice if every pull request created a message to this list.    Is 
 that even possible?

 That is definitely possible and shouldn't be too hard to do, like
 Jason said.  But that can potentially cause some confusion, with some
 of the discussion starting off in the mailing list, and some of the
 discussion happening on the pull-request itself.  Are my concerns
 justified?

Related issue: some projects have a users' list and a devel list. It
might be worth (re?)considering that option. They have their pros and
cons, but I think I like the idea of a devel list and a separate
help-wanted list.

Something else that might be helpful for contentious threads is a
StackOverflow-esque system where readers can vote up the responses of
others. Sometimes just an "I agree" or "I disagree" goes a long way,
especially when you have many lurkers.

On something else that was brought up: I do not consider myself
competent/prepared enough to take on development, but it is not the
case that I have _never_ felt the temptation. What I have found
intimidating and stymieing is the perceived politics over development
issues.  The two places where I have felt this are a) on contentious
threads on the list and b) what seem like legitimate patches and tickets
on trac that seem to be languishing for no compelling technical
reason. I would be hard-pressed to quote specifics, but I have
encountered this feeling a few times.

In my case it would not have mattered, because I doubt I would have
contributed anything useful. However, it might be the case that more
competent lurkers have felt the same way. The possibility of a
patch relegated semipermanently to trac, or the possibility of getting
caught up in the politics, is a bit of a disincentive. This is just an
honest perception/observation.

I am more of a "get on with it, get the code out and the rest will resolve
itself eventually" kind of guy, thus long
political/philosophical/epistemic threads distance me. I know there
are legitimate reasons to have these discussions. But it seems to me
that they get a bit too wordy here sometimes.

My 10E-2.

-- srean

Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Fernando Perez

On Wed, Apr 25, 2012 at 6:28 PM, Benjamin Root ben.r...@ou.edu wrote:
 It would be nice if every pull request created a message to this list.
  Is that even possible?

 -Travis


 This has been a concern of mine for matplotlib as well.  The closest I can
 come is to set up an RSS feed, but all the titles are just the PR number and
 an action, so I lose track of which ones I want to view.

Same here for IPython.  If anybody figures out a clean solution,
please advertise it!  I think a bunch of us want the same thing...

Cheers,

f

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Ralf Gommers

On Tue, Apr 24, 2012 at 12:46 AM, Chris Barker chris.bar...@noaa.gov wrote:

 On Mon, Apr 23, 2012 at 3:08 PM, Travis Oliphant tra...@continuum.io
 wrote:
  Right now we are trying to balance difficult things:  stable releases
 with experimental development.

 Perhaps a more formal development release system could help here.
 IIUC, numpy pretty much has two things: the latest release (and past
 ones) and master (and assorted experimental branches). If someone
 develops a new feature, we can either:

 have them submit a pull request, and people with the wherewithal
 can pull it, compile it, and start testing it on their own -- history
 shows that this is a small group.

 merge it with master -- and hope it gets the testing it should before
 it becomes part of a release, but: we are rightly hesitant to put
 experimental stuff in master, and it really doesn't get that much
 testing -- again only folks that are building master will even see it.


 Some projects have a more formal development release system.
 wxPython, for instance has had for years development releases with odd
 numbers -- right now, the official release is 2.8.*, but there is a
 2.9.* out there that is getting some use and testing. A couple of
 things help make this work:

 1) Robin makes the effort to put out binaries for development releases
 -- it's easy to go get and give it a try.


This is a good idea - not for development releases but for master. Building
nightly/weekly binaries would help more people try out new features.


 2) there is the wxversion system that makes it easy to install a new
 version of wx, and easily switch between them (it's actually broken on
 OS-X right now --- :-) ) -- this pre-dated virtualenv and friends,
 maybe virtualenv is enough for this now.


wxversion was broken for a long time on Ubuntu too (~5 yrs ago). I don't
exactly remember it as a good idea.

virtualenv also doesn't help, because if you can use that you know how to
build from source anyway.

Ralf




 Anyway, it's a thought -- I think some more real-world use of new
 features before a real commitment to adopting them would be great.

 -Chris




 --

 Christopher Barker, Ph.D.
 Oceanographer

 Emergency Response Division
 NOAA/NOS/ORR(206) 526-6959   voice
 7600 Sand Point Way NE   (206) 526-6329   fax
 Seattle, WA  98115   (206) 526-6317   main reception

 chris.bar...@noaa.gov

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Charles R Harris

On Mon, Apr 23, 2012 at 11:35 PM, Fernando Perez fperez@gmail.com wrote:

 On Mon, Apr 23, 2012 at 8:49 PM, Stéfan van der Walt ste...@sun.ac.za
 wrote:
  If you are referring to the traditional concept of a fork, and not to
  the type we frequently make on GitHub, then I'm surprised that no one
  has objected already.  What would a fork solve? To paraphrase the
  regexp saying: after forking, we'll simply have two problems.

 I concur with you here: github 'forks', yes, as many as possible!
 Hopefully every one of those will produce one or more PRs :)  But a
 fork in the sense of a divergent parallel project?  I think that would
 only be indicative of a complete failure to find a way to make
 progress here, and I doubt we're anywhere near that state.

 That forks are *possible* is indeed a valuable and important option in
 open source software, because it means that a truly dysfunctional
 original project team/direction can't hold a community hostage
 forever.  But that doesn't mean that full-blown forks should be
 considered lightly, as they also carry enormous costs.

 I see absolutely nothing in the current scenario to even remotely
 consider that a full-blown fork would be a good idea, and I hope I'm
 right.  It seems to me we're making progress on problems that led to
 real difficulties last year, but from multiple parties I see signs
 that give me reason to be optimistic that the project is getting
 better, not worse.


We certainly aren't there at the moment, but I can see us heading that way.
But let's back up a bit. Numpy 1.6.0 came out just about 1 year ago. Since
then datetime, NA, polynomial work, and various other enhancements have
gone in along with some 280 bug fixes. The major technical problem blocking
a 1.7 release is getting datetime working reliably on windows. So I think
that is where the short term effort needs to be. Meanwhile, we are spending
effort to get out a 1.6.2 just so people can work with a stable version
with some of the bug fixes, and potentially we will spend more time and
effort to pull out the NA code. In the future there may be a transition to
C++ and eventually a break with the current ABI. Or not.

There are at least two motivations that get folks to write code for open
source projects, scratching an itch and money. Money hasn't been a big part
of the Numpy picture so far, so that leaves scratching an itch. One of the
attractions of Numpy is that it is a small project, BSD licensed, and not
overburdened with governance and process. This makes scratching an itch not
as difficult as it would be in a large project. If Numpy remains a small
project but acquires the encumbrances of a big project much of that
attraction will be lost. Momentum and direction also attracts people, but
numpy is stalled at the moment as the whole NA thing circles around once
again.

What would I suggest as a way forward with the NA option? Let's take the
issues.

1) Adding slots to PyArrayObject_fields. I don't think this is likely to be
a problem unless someone's code passes the struct by value or uses
assignment to initialize a statically allocated instance. I'm not saying no
one does that, low level scientific code can contain all sorts of bizarre
and astonishing constructs and it is also possible that these sort of
things might turn up in an old FORTRAN program. The question here is
whether to allow any changes at all, and I think we will have to in the
future. Given that, consistent use of accessors will make later changes to
the organization or implementation of the base structure transparent. Numpy
itself now uses accessors for the heritage slots, but not for the new NA
slots. So I suggest at a minimum adding accessors for the maskna_dtype,
maskna_data, and maskna_strides. Of course, later removing these slots will
still remain a problem.

2) NA. This breaks down into API and implementation issues. Personally, I
think marking the NA stuff experimental leaves room to modify both and
would prefer to go with what we have and change it into whatever looks best
by modification through pull requests. This kicks the can down the road,
but not so far that people sufficiently interested in working on the topic
can't get modifications in. My own preferences for future API modifications
are as follows.

a) All arrays should be implicitly masked, even if the mask isn't initially
allocated. The maskna keyword can then be removed, taking with it the sense
that there are two kinds of arrays.

b) There needs to be a distinction between missing and ignore. The
mechanism for this is already in place in the payload type, although it
isn't clear to me that that is uniformly used in all the NA code. There is
also a place for missing *and* ignored. Which leads to

c) Sums, etc. should always skip ignored data. If missing data is present,
but not ignored, then a sum should return NA. The main danger I see here is
that the behavior of arrays becomes state dependent, something that can
lead to subtle

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Pierre Haessig

Hi,

Le 24/04/2012 15:14, Charles R Harris a écrit :

 a) All arrays should be implicitly masked, even if the mask isn't
 initially allocated. The maskna keyword can then be removed, taking
 with it the sense that there are two kinds of arrays.


From my lazy user perspective, having masked and non-masked arrays share
the same look and feel would be a number one advantage over the
existing numpy.ma arrays. I would like masked array to be as transparent
as possible.

 b) There needs to be a distinction between missing and ignore. The
 mechanism for this is already in place in the payload type, although
 it isn't clear to me that that is uniformly used in all the NA code.
 There is also a place for missing *and* ignored. Which leads to

If the idea of having two payloads is to avoid as many extra keywords
(skipna and friends) as possible, I would like that very much. My feeling with my small
experience with R is that I end up calling every function with a
different magical set of keywords (na.rm, na.action, ... and I forgot).

My 2 lazy user cents...

Best,
Pierre




Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread josef . pktd

On Tue, Apr 24, 2012 at 9:43 AM, Pierre Haessig
pierre.haes...@crans.org wrote:
 Hi,

 Le 24/04/2012 15:14, Charles R Harris a écrit :

 a) All arrays should be implicitly masked, even if the mask isn't
 initially allocated. The maskna keyword can then be removed, taking
 with it the sense that there are two kinds of arrays.


 From my lazy user perspective, having masked and non-masked arrays share
 the same look and feel would be a number one advantage over the
 existing numpy.ma arrays. I would like masked array to be as transparent
 as possible.

I don't have any opinion about internal implementation.

But users need to be aware of whether they have masked arrays or not.
Since many functions (most of scipy) wouldn't know how to handle NA
and don't do any checks, (and shouldn't in my opinion if the NA check
is costly). The result might be silently wrong numbers depending on
the implementation.
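
A minimal sketch of the failure mode described above, using the existing
numpy.ma rather than the proposed NA code. The data values and the
naive_mean helper are made up for illustration. A function that converts
its input with np.asarray drops the mask without any warning, so the
masked entry leaks into the result.

```python
import numpy as np

# The third value is a placeholder for a bad reading and is masked out.
data = np.ma.masked_array([1.0, 2.0, 1000.0], mask=[False, False, True])

def naive_mean(values):
    # Stands in for a library function that is unaware of masks:
    # np.asarray returns the underlying data and discards the mask.
    return np.asarray(values).mean()

mask_aware = data.mean()          # uses only the two unmasked values
mask_unaware = naive_mean(data)   # silently includes the masked 1000.0
```

The two results differ by more than two orders of magnitude, and nothing
in the second call signals that the mask was lost.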


 b) There needs to be a distinction between missing and ignore. The
 mechanism for this is already in place in the payload type, although
 it isn't clear to me that that is uniformly used in all the NA code.
 There is also a place for missing *and* ignored. Which leads to

 If the idea of having two payloads is to avoid a maximum of skipna 
 friends extra keywords, I would like it much. My feeling with my small
 experience with R is that I end up calling every function with a
 different magical set of keywords (na.rm, na.action, ... and I forgot).

There is a reason for requiring the user to decide what to do about NA's.
Either we have utility functions/methods to help the user change the
arrays and treat NA's before calling a function, or the function needs
to ask the user what should be done about possible NAs.
Doing it automatically might only be useful for specialised packages.

My 2c

Josef


 My 2 lazy user cents...

 Best,
 Pierre



Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Matthew Brett

Hi,

On Tue, Apr 24, 2012 at 6:14 AM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Mon, Apr 23, 2012 at 11:35 PM, Fernando Perez fperez@gmail.com
 wrote:

 On Mon, Apr 23, 2012 at 8:49 PM, Stéfan van der Walt ste...@sun.ac.za
 wrote:
  If you are referring to the traditional concept of a fork, and not to
  the type we frequently make on GitHub, then I'm surprised that no one
  has objected already.  What would a fork solve? To paraphrase the
  regexp saying: after forking, we'll simply have two problems.

 I concur with you here: github 'forks', yes, as many as possible!
 Hopefully every one of those will produce one or more PRs :)  But a
 fork in the sense of a divergent parallel project?  I think that would
 only be indicative of a complete failure to find a way to make
 progress here, and I doubt we're anywhere near that state.

 That forks are *possible* is indeed a valuable and important option in
 open source software, because it means that a truly dysfunctional
 original project team/direction can't hold a community hostage
 forever.  But that doesn't mean that full-blown forks should be
 considered lightly, as they also carry enormous costs.

 I see absolutely nothing in the current scenario to even remotely
 consider that a full-blown fork would be a good idea, and I hope I'm
 right.  It seems to me we're making progress on problems that led to
 real difficulties last year, but from multiple parties I see signs
 that give me reason to be optimistic that the project is getting
 better, not worse.


 We certainly aren't there at the moment, but I can see us heading that way.
 But let's back up a bit. Numpy 1.6.0 came out just about 1 year ago. Since
 then datetime, NA, polynomial work, and various other enhancements have gone
 in along with some 280 bug fixes. The major technical problem blocking a 1.7
 release is getting datetime working reliably on windows. So I think that is
 where the short term effort needs to be. Meanwhile, we are spending effort
 to get out a 1.6.2 just so people can work with a stable version with some
 of the bug fixes, and potentially we will spend more time and effort to pull
 out the NA code. In the future there may be a transition to C++ and
 eventually a break with the current ABI. Or not.

 There are at least two motivations that get folks to write code for open
 source projects, scratching an itch and money. Money hasn't been a big part
 of the Numpy picture so far, so that leaves scratching an itch. One of the
 attractions of Numpy is that it is a small project, BSD licensed, and not
 overburdened with governance and process. This makes scratching an itch not
 as difficult as it would be in a large project. If Numpy remains a small
 project but acquires the encumbrances of a big project much of that
 attraction will be lost. Momentum and direction also attracts people, but
 numpy is stalled at the moment as the whole NA thing circles around once
 again.

I think your assumptions are incorrect, although I have seen them before.

No stated process leads to less encumbrance if and only if the
implicit process works.

It clearly doesn't work, precisely because the NA thing is circling
round and round again.

And the governance discussion.

And previously the ABI breakage discussion.

If you are on other mailing lists, I'm sure you are, you'll see that
this does not happen to - say - Cython, or Sympy.   In particular, I
have not seen, on those lists, the current numpy way of simply
blocking or avoiding discussion.   Everything is discussed out to
agreement, or at least until all parties accept the way forward.

At the moment, the only hope I could imagine for the 'no governance is
good governance' method, is that all those who don't agree would just
shut up.   It would be more peaceful, but for the reasons stated by
Nathaniel, I think that would be a very bad outcome.

Best,

Matthew

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Charles R Harris

On Tue, Apr 24, 2012 at 9:25 AM, josef.p...@gmail.com wrote:

 On Tue, Apr 24, 2012 at 9:43 AM, Pierre Haessig
 pierre.haes...@crans.org wrote:
  Hi,
 
  Le 24/04/2012 15:14, Charles R Harris a écrit :
 
  a) All arrays should be implicitly masked, even if the mask isn't
  initially allocated. The maskna keyword can then be removed, taking
  with it the sense that there are two kinds of arrays.
 
 
  From my lazy user perspective, having masked and non-masked arrays share
  the same look and feel would be a number one advantage over the
  existing numpy.ma arrays. I would like masked array to be as transparent
  as possible.

 I don't have any opinion about internal implementation.

 But users need to be aware of whether they have masked arrays or not.
 Since many functions (most of scipy) wouldn't know how to handle NA
 and don't do any checks, (and shouldn't in my opinion if the NA check
 is costly). The result might be silently wrong numbers depending on
 the implementation.


There should be a flag saying whether or not NA has been allocated and
allocation happens when NA is assigned to an array item, so that should be
fast. I don't think scipy currently deals with masked arrays in all areas,
so I believe that the same problem exists there and would also exist for
missing data types. I think this sort of compatibility problem is worth a
whole discussion by itself.
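
As a rough analogy only, the existing numpy.ma already behaves this way:
the mask is a sentinel (np.ma.nomask) until something is actually masked,
and checking for that sentinel is cheap. The sketch below shows numpy.ma,
not the NA implementation under discussion, which may differ.

```python
import numpy as np

a = np.ma.array([1.0, 2.0, 3.0])

# No mask storage exists yet: getmask returns the nomask sentinel.
before = np.ma.getmask(a) is np.ma.nomask

# Masking one element allocates a full boolean mask on demand.
a[1] = np.ma.masked
after = np.ma.getmask(a) is np.ma.nomask
mask_now = np.ma.getmask(a).tolist()
```

A function could use the same identity test as a fast path and only look
at individual elements once the mask has been allocated.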



 
  b) There needs to be a distinction between missing and ignore. The
  mechanism for this is already in place in the payload type, although
  it isn't clear to me that that is uniformly used in all the NA code.
  There is also a place for missing *and* ignored. Which leads to
 
  If the idea of having two payloads is to avoid a maximum of skipna 
  friends extra keywords, I would like it much. My feeling with my small
  experience with R is that I end up calling every function with a
  different magical set of keywords (na.rm, na.action, ... and I forgot).

 There is a reason for requiring the user to decide what to do about NA's.
 Either we have utility functions/methods to help the user change the
 arrays and treat NA's before calling a function, or the function needs
 to ask the user what should be done about possible NAs.
 Doing it automatically might only be useful for specialised packages.


That's what the different payloads would do. I think the common use case
would always have the ignore bit set. What are the other sorts of actions
you are interested in, and should they be part of the functions in Numpy,
such as mean and std, or should they rather be implemented in stats packages
that may be more specialized? I see numpy.ma currently used in the
following spots in scipy:

scipy/stats/mstats_extras.py
scipy/stats/tests/test_mstats_extras.py
scipy/stats/tests/test_mstats_basic.py
scipy/stats/mstats_basic.py
scipy/signal/filter_design.py
scipy/optimize/optimize.py

The advantage of nans, I suppose, is that they are in the hardware and so
already universally part of Numpy. NA would be introduced, so would require
a bit more work. I expect it will be several (many) years before they are
dealt with as a matter of course. At minimum, one would need to check if
the masked flag is set.
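
For comparison, a short sketch of how NaN already behaves in NumPy today,
with made-up values: a NaN propagates through an ordinary sum, np.nansum
skips it, and a caller who wants to know has to check explicitly.

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0])

plain_total = x.sum()               # NaN propagates through the sum
skipping_total = np.nansum(x)       # NaN entries are treated as zero
has_nan = bool(np.isnan(x).any())   # the explicit check a caller would need
```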

Chuck

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Charles R Harris

On Tue, Apr 24, 2012 at 12:12 PM, Charles R Harris 
charlesr.har...@gmail.com wrote:



 On Tue, Apr 24, 2012 at 9:25 AM, josef.p...@gmail.com wrote:

 On Tue, Apr 24, 2012 at 9:43 AM, Pierre Haessig
 pierre.haes...@crans.org wrote:
  Hi,
 
  Le 24/04/2012 15:14, Charles R Harris a écrit :
 
  a) All arrays should be implicitly masked, even if the mask isn't
  initially allocated. The maskna keyword can then be removed, taking
  with it the sense that there are two kinds of arrays.
 
 
  From my lazy user perspective, having masked and non-masked arrays share
  the same look and feel would be a number one advantage over the
  existing numpy.ma arrays. I would like masked array to be as
 transparent
  as possible.

 I don't have any opinion about internal implementation.

 But users need to be aware of whether they have masked arrays or not.
 Since many functions (most of scipy) wouldn't know how to handle NA
 and don't do any checks, (and shouldn't in my opinion if the NA check
 is costly). The result might be silently wrong numbers depending on
 the implementation.


 There should be a flag saying whether or not NA has been allocated and
 allocation happens when NA is assigned to an array item, so that should be
 fast. I don't think scipy currently deals with masked arrays in all areas,
 so I believe that the same problem exists there and would also exist for
 missing data types. I think this sort of compatibility problem is worth a
 whole discussion by itself.


To clarify a bit, an item could be marked as both missing and ignore. An
item that is marked missing will propagate as missing, but if it is also
ignored then things like mean and std will skip it. There would also be a
clear operation that would clear the ignore bit but keep the missing bit.
Now I can see the advantage of explicitly specifying behavior in functions
as one knows right at the spot what is intended whereas with the other
alternative one needs to know the history of the array and whether ignore
was ever set, but in that sense it is just like having default keyword
values and could be implemented as such.
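
The semantics described above could be sketched roughly as follows. This
is an illustration under stated assumptions, not the NA code in NumPy:
the two boolean arrays, the names na_sum and clear_ignore, and the use of
None as a stand-in for an NA result are all hypothetical.

```python
import numpy as np

NA = None  # hypothetical stand-in for an NA result

def na_sum(values, missing, ignored):
    """Skip ignored items; return NA if any missing item is not ignored."""
    values = np.asarray(values, dtype=float)
    missing = np.asarray(missing, dtype=bool)
    ignored = np.asarray(ignored, dtype=bool)
    if (missing & ~ignored).any():
        return NA
    return float(values[~ignored].sum())

def clear_ignore(ignored):
    """Clear every ignore bit; missing bits live in a separate array and are kept."""
    return np.zeros_like(np.asarray(ignored, dtype=bool))

values = [1.0, 2.0, 0.0, 4.0]
missing = [False, False, True, False]
ignored = [False, False, True, False]

skipped = na_sum(values, missing, ignored)                    # 1 + 2 + 4
propagated = na_sum(values, missing, clear_ignore(ignored))   # NA
```

The same array gives two different answers depending on whether the
ignore bit was cleared earlier, which is the state dependence mentioned
above.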
snip

Chuck

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Benjamin Root

On Tue, Apr 24, 2012 at 2:12 PM, Charles R Harris charlesr.har...@gmail.com
 wrote:



 On Tue, Apr 24, 2012 at 9:25 AM, josef.p...@gmail.com wrote:

 On Tue, Apr 24, 2012 at 9:43 AM, Pierre Haessig
 pierre.haes...@crans.org wrote:
  Hi,
 
  Le 24/04/2012 15:14, Charles R Harris a écrit :
 
  a) All arrays should be implicitly masked, even if the mask isn't
  initially allocated. The maskna keyword can then be removed, taking
  with it the sense that there are two kinds of arrays.
 
 
  From my lazy user perspective, having masked and non-masked arrays share
  the same look and feel would be a number one advantage over the
  existing numpy.ma arrays. I would like masked array to be as
 transparent
  as possible.

 I don't have any opinion about internal implementation.

 But users need to be aware of whether they have masked arrays or not.
 Since many functions (most of scipy) wouldn't know how to handle NA
 and don't do any checks, (and shouldn't in my opinion if the NA check
 is costly). The result might be silently wrong numbers depending on
 the implementation.


 There should be a flag saying whether or not NA has been allocated and
 allocation happens when NA is assigned to an array item, so that should be
 fast. I don't think scipy currently deals with masked arrays in all areas,
 so I believe that the same problem exists there and would also exist for
 missing data types. I think this sort of compatibility problem is worth a
 whole discussion by itself.



 
  b) There needs to be a distinction between missing and ignore. The
  mechanism for this is already in place in the payload type, although
  it isn't clear to me that that is uniformly used in all the NA code.
  There is also a place for missing *and* ignored. Which leads to
 
  If the idea of having two payloads is to avoid a maximum of skipna 
  friends extra keywords, I would like it much. My feeling with my small
  experience with R is that I end up calling every function with a
  different magical set of keywords (na.rm, na.action, ... and I forgot).

 There is a reason for requiring the user to decide what to do about NA's.
 Either we have utility functions/methods to help the user change the
 arrays and treat NA's before calling a function, or the function needs
 to ask the user what should be done about possible NAs.
 Doing it automatically might only be useful for specialised packages.


 That's what the different payloads would do. I think the common use case
 would always have the ignore bit set. What are the other sorts of actions
 you are interested in, and should they be part of the functions in Numpy,
 such as mean and std, or should they rather be implemented in stats packages
 that may be more specialized? I see numpy.ma currently used in the
 following spots in scipy:


Like you said, this whole issue probably should be in a separate
discussion, but I would like to point out here with my thoughts on default
payload.  If we don't have some sort of mechanism for flagging which
functions are NA-friendly or not, then it would be wise to have NA default
to NaN behavior.  If only to prevent bugs that mess up data from being
undetected.

That being said, the determination of NA payload is tricky.  Some functions
may need to react differently to an NA.  One that comes to mind is
np.gradient().  However, other functions may not need to do anything
because they depend entirely upon other functions that have already been
updated to support NA.
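
A small sketch of why np.gradient is a tricky case, using NaN as a
stand-in for NA and made-up data. With the default central differences,
the NaN contaminates its two neighbours, while the position of the NaN
itself gets a finite value computed from those neighbours.

```python
import numpy as np

y = np.array([1.0, 2.0, np.nan, 4.0, 5.0])

# Interior points use (y[i + 1] - y[i - 1]) / 2, so index 2 never
# reads its own value, but indices 1 and 3 do read the NaN.
grad = np.gradient(y)
nan_positions = np.isnan(grad).tolist()
```

So NaN-style propagation alone does not mark the right elements here; a
function like this would need its own rule for what an NA input means.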

Cheers!
Ben Root

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Nathaniel Smith

On Tue, Apr 24, 2012 at 2:43 PM, Pierre Haessig
pierre.haes...@crans.org wrote:
 If the idea of having two payloads is to avoid a maximum of skipna 
 friends extra keywords, I would like it much. My feeling with my small
 experience with R is that I end up calling every function with a
 different magical set of keywords (na.rm, na.action, ... and I forgot).

While I can't in general defend R on consistency grounds, there is a
logic to this particular case.

Most basic R functions like 'sum' take the na.rm= argument, which can
be True or False and is equivalent to the skipna argument we've talked
about for ufuncs. The functions that take other arguments (like
na.action= for model fitting functions, or use= for their equivalent
of np.corrcoef) are the ones that have *more* than 2 ways to handle
NAs. E.g. model fitting functions given NAs can raise an error, skip
the NA cases, or pass the NA cases through, and the correlation matrix
function has different options for what to do with cases where one
column has an NA but there are two others that don't.
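
To make the correlation case concrete, here is a sketch of two of those
strategies in Python, with NaN standing in for NA. The function names and
the data are made up for illustration; this is not an existing NumPy or R
API.

```python
import numpy as np

def corr_complete_cases(data):
    """Drop every row that contains a NaN, then correlate the columns."""
    keep = ~np.isnan(data).any(axis=1)
    return np.corrcoef(data[keep], rowvar=False)

def corr_pairwise(data):
    """For each pair of columns, use all rows where both values are present."""
    n_cols = data.shape[1]
    out = np.empty((n_cols, n_cols))
    present = ~np.isnan(data)
    for i in range(n_cols):
        for j in range(n_cols):
            both = present[:, i] & present[:, j]
            out[i, j] = np.corrcoef(data[both, i], data[both, j])[0, 1]
    return out

# Made-up data; only the third column has a gap.
data = np.array([[1.0, 2.0, 1.0],
                 [2.0, 1.0, np.nan],
                 [3.0, 4.0, 2.0],
                 [4.0, 3.0, 5.0],
                 [5.0, 6.0, 3.0]])

complete = corr_complete_cases(data)
pairwise = corr_pairwise(data)
```

The correlation between the first two columns differs between the two
results, because the pairwise version keeps the second row for that pair
while the complete-case version drops it. A single True/False flag cannot
express that choice.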

Having a distinction between missing and ignored values doesn't really
affect whether you need such options. (If anything I guess it could
make such options even more complicated -- what if I want my
regression function to error out on missing but skip over ignored
values, etc.)

- N

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread josef . pktd

On Tue, Apr 24, 2012 at 2:35 PM, Benjamin Root ben.r...@ou.edu wrote:
 On Tue, Apr 24, 2012 at 2:12 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:



 On Tue, Apr 24, 2012 at 9:25 AM, josef.p...@gmail.com wrote:

 On Tue, Apr 24, 2012 at 9:43 AM, Pierre Haessig
 pierre.haes...@crans.org wrote:
  Hi,
 
  Le 24/04/2012 15:14, Charles R Harris a écrit :
 
  a) All arrays should be implicitly masked, even if the mask isn't
  initially allocated. The maskna keyword can then be removed, taking
  with it the sense that there are two kinds of arrays.
 
 
  From my lazy user perspective, having masked and non-masked arrays
  share
  the same look and feel would be a number one advantage over the
  existing numpy.ma arrays. I would like masked array to be as
  transparent
  as possible.

 I don't have any opinion about internal implementation.

 But users need to be aware of whether they have masked arrays or not.
 Since many functions (most of scipy) wouldn't know how to handle NA
 and don't do any checks, (and shouldn't in my opinion if the NA check
 is costly). The result might be silently wrong numbers depending on
 the implementation.


 There should be a flag saying whether or not NA has been allocated and
 allocation happens when NA is assigned to an array item, so that should be
 fast. I don't think scipy currently deals with masked arrays in all areas,
 so I believe that the same problem exists there and would also exist for
 missing data types. I think this sort of compatibility problem is worth a
 whole discussion by itself.



 
  b) There needs to be a distinction between missing and ignore. The
  mechanism for this is already in place in the payload type, although
  it isn't clear to me that that is uniformly used in all the NA code.
  There is also a place for missing *and* ignored. Which leads to
 
  If the idea of having two payloads is to avoid a maximum of skipna 
  friends extra keywords, I would like it much. My feeling with my small
  experience with R is that I end up calling every function with a
  different magical set of keywords (na.rm, na.action, ... and I forgot).

 There is a reason for requiring the user to decide what to do about NA's.
 Either we have utility functions/methods to help the user change the
 arrays and treat NA's before calling a function, or the function needs
 to ask the user what should be done about possible NAs.
 Doing it automatically might only be useful for specialised packages.


 That's what the different payloads would do. I think the common use case
 would always have the ignore bit set. What are the other sorts of actions
 you are interested in, and should they be part of the functions in Numpy,
 such as mean and std, or should they rather be implemented in stats packages
 that may be more specialized? I see numpy.ma currently used in the following
 spots in scipy:

I think most functions that operate on an axis are mostly unambiguous
ignore, std, mean, var, histogram, should stay in numpy, np.cov might
have pairwise or row/column wise deletion option (but I don't know
what other packages are doing).

(While I had to run off, Nathaniel explained this.)

The main cases in stats (or statsmodels) for handling NaNs or NAs
would be rowwise ignore or pretend temporarily that they are zero or
some other neutral value.
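
Both approaches can be sketched with plain NumPy, using NaN as the marker
and made-up data. Zero is the neutral value for a sum; other operations
would need a different one.

```python
import numpy as np

x = np.array([[1.0, 2.0],
              [np.nan, 4.0],
              [5.0, 6.0]])

# Rowwise ignore: drop every row that has at least one NaN.
complete_rows = x[~np.isnan(x).any(axis=1)]

# Neutral value: temporarily replace NaN by 0.0, the identity for addition.
zero_filled = np.where(np.isnan(x), 0.0, x)
column_sums = zero_filled.sum(axis=0)
```

For sums, the zero-filled result matches what np.nansum returns along the
same axis.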



 Like you said, this whole issue probably should be in a separate discussion,
 but I would like to point out here with my thoughts on default payload.  If
 we don't have some sort of mechanism for flagging which functions are
 NA-friendly or not, then it would be wise to have NA default to NaN
 behavior.  If only to prevent bugs that mess up data from being undetected.

In scipy.stats it's currently the responsibility of the user, unless
explicitly mentioned that a function knows how to handle nans or
masked arrays, the default is we don't check and what you get
returned might be anything.

If there is a flag (and a cheap way to verify whether there are NaNs
or NAs), then we could just add a check in every function.
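
Such a check might look like the sketch below. The decorator name
reject_missing and the example function spread are hypothetical; nothing
like this exists in scipy.stats. It uses NaN and numpy.ma masks as the
things to detect, since the NA flag itself is still under discussion.

```python
import functools
import numpy as np

def reject_missing(func):
    """Hypothetical decorator: refuse input that contains NaN or masked items."""
    @functools.wraps(func)
    def wrapper(values, *args, **kwargs):
        if np.ma.is_masked(values):
            raise ValueError("input contains masked values")
        arr = np.asarray(values, dtype=float)
        if np.isnan(arr).any():
            raise ValueError("input contains NaN")
        return func(arr, *args, **kwargs)
    return wrapper

@reject_missing
def spread(values):
    # Stands in for a statistics function that assumes clean input.
    return float(values.max() - values.min())
```

Note that the NaN test scans the whole array on every call, which is the
cost concern raised earlier; a per-array flag would avoid that scan.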

Josef


 That being said, the determination of NA payload is tricky.  Some functions
 may need to react differently to an NA.  One that comes to mind is
 np.gradient().  However, other functions may not need to do anything because
 they depend entirely upon other functions that have already been updated to
 support NA.

 Cheers!
 Ben Root



Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Stéfan van der Walt

On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris
charlesr.har...@gmail.com wrote:
 The advantage of nans, I suppose, is that they are in the hardware and so

Why are we having a discussion on NAN's in a thread on consensus?
This is a strong indicator of the problem we're facing.

Stéfan

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Benjamin Root

On Tue, Apr 24, 2012 at 3:23 PM, Stéfan van der Walt ste...@sun.ac.za wrote:

 On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris
 charlesr.har...@gmail.com wrote:
  The advantage of nans, I suppose, is that they are in the hardware and so

 Why are we having a discussion on NAN's in a thread on consensus?
 This is a strong indicator of the problem we're facing.

 Stéfan


Good catch!  Looks like we got off-track when the discussion talked about
forks.

Ben Root

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Charles R Harris

2012/4/24 Stéfan van der Walt ste...@sun.ac.za

 On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris
 charlesr.har...@gmail.com wrote:
  The advantage of nans, I suppose, is that they are in the hardware and so

 Why are we having a discussion on NAN's in a thread on consensus?
 This is a strong indicator of the problem we're facing.


We seem to have a consensus regarding interest in the topic.

Chuck

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Matthew Brett

Hi,

On Tue, Apr 24, 2012 at 2:25 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 2012/4/24 Stéfan van der Walt ste...@sun.ac.za

 On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris
 charlesr.har...@gmail.com wrote:
  The advantage of nans, I suppose, is that they are in the hardware and
  so

 Why are we having a discussion on NAN's in a thread on consensus?
 This is a strong indicator of the problem we're facing.


 We seem to have a consensus regarding interest in the topic.

This email is mainly to Travis.

This thread seems to be dying, condemning us to keep repeating the
same conversation with no result.

Chuck has made it clear he is not interested in this conversation.
Until it is clear you are interested in this conversation, it will
keep dying.   As you know, I think that will be very bad for numpy,
and, as you know, I care a great deal about that.

So, please, if you care about this, and agree that something should be
done, please, say so, and if you don't agree something should be done,
say so.  It can't get better without your help.

See you,

Matthew

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Stéfan van der Walt

On Tue, Apr 24, 2012 at 2:25 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
 Why are we having a discussion on NAN's in a thread on consensus?
 This is a strong indicator of the problem we're facing.

 We seem to have a consensus regarding interest in the topic.

For the benefit of those of us interested in both discussions, would
you kindly start a new thread on the MA topic?

In response to Travis's suggestion of writing up a short summary of
community principles, as well as Matthew's initial formulation, I
agree that this would be helpful in enshrining the values we cherish
here, as well as in communicating those values to the next generation
of developers.

From observing the community, I would guess that these values include:

- That any party with an interest in NumPy is given the opportunity to
speak and to be heard on the list.
- That discussions that influence the course of the project take place
openly, for anyone to observe.
- That decisions are made once consensus is reached, i.e., if everyone
agrees that they can live with the outcome.

To summarize: NumPy development that is free & fair, open and unified.

We'll sometimes mess up and not follow our own guidelines, but with
them in place at least we'll have something to refer back to as a
reminder.

Regards
Stéfan

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Travis Oliphant

Thanks for the reminder, Stefan, and for keeping us on track.

It is very helpful to those trying to sort through the messages to keep the 
discussions to one subject per thread. 

-Travis




On Apr 24, 2012, at 2:23 PM, Stéfan van der Walt wrote:

 On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 The advantage of nans, I suppose, is that they are in the hardware and so
 
 Why are we having a discussion on NAN's in a thread on consensus?
 This is a strong indicator of the problem we're facing.
 
 Stéfan

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Travis Oliphant


On Apr 24, 2012, at 6:01 PM, Stéfan van der Walt wrote:

 On Tue, Apr 24, 2012 at 2:25 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 Why are we having a discussion on NAN's in a thread on consensus?
 This is a strong indicator of the problem we're facing.
 
 We seem to have a consensus regarding interest in the topic.
 
 For the benefit of those of us interested in both discussions, would
 you kindly start a new thread on the MA topic?
 
 In response to Travis's suggestion of writing up a short summary of
 community principles, as well as Matthew's initial formulation, I
 agree that this would be helpful in enshrining the values we cherish
 here, as well as in communicating those values to the next generation
 of developers.
 
 From observing the community, I would guess that these values include:
 
 - That any party with an interest in NumPy is given the opportunity to
 speak and to be heard on the list.
 - That discussions that influence the course of the project take place
 openly, for anyone to observe.
 - That decisions are made once consensus is reached, i.e., if everyone
 agrees that they can live with the outcome.

This is well stated.  Thank you Stefan. 

Some will argue about what consensus means or who everyone is.  But if
we are really worrying about that, then we have stopped listening to each other,
which is the number one community value that we should be promoting,
demonstrating, and living by.

Consensus to me means that anyone who can produce a well-reasoned argument and 
demonstrates by their persistence that they are actually using the code and are 
aware of the issues has veto power on pull requests. At times people with 
commit rights to NumPy might perform a pull request anyway, but they should 
acknowledge at least in the comment (but for major changes ---  on this list) 
that they are doing so and provide their reasons.

If I decide later that I think the pull request was made inappropriately in the 
face of objections and the reasons were not justified, then I will reserve the 
right to revert the pull request.  I would like core developers of NumPy to
have the same ability to check me as well.  But, if there is a disagreement
at that level, then I will reserve the right to decide. 

Basically, what we have in this situation is that the masked arrays were added 
to NumPy master with serious objections to the API.   What I'm trying to decide 
right now is can we move forward and satisfy the objections without removing 
the ndarrayobject changes entirely (I do think the concerns warrant removal of 
the changes).   The discussion around that is the most helpful right now, but 
should take place on another thread. 

Thanks,

-Travis



Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Travis Oliphant


On Apr 24, 2012, at 5:52 PM, Matthew Brett wrote:

 Hi,
 
 On Tue, Apr 24, 2012 at 2:25 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
 2012/4/24 Stéfan van der Walt ste...@sun.ac.za
 
 On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 The advantage of nans, I suppose, is that they are in the hardware and
 so
 
 Why are we having a discussion on NAN's in a thread on consensus?
 This is a strong indicator of the problem we're facing.
 
 
 We seem to have a consensus regarding interest in the topic.
 
 This email is mainly to Travis.
 
 This thread seems to be dying, condemning us to keep repeating the
 same conversation with no result.
 
 Chuck has made it clear he is not interested in this conversation.
 Until it is clear you are interested in this conversation, it will
 keep dying.   As you know, I think that will be very bad for numpy,
 and, as you know, I care a great deal about that.

I am interested in the conversation, but I think I've already stated my views 
as well as I know how.   I'm not sure what else I should do at this point.
We do need consensus (defined as the absence of serious objectors) for me to 
agree to a NumPy 1.X release.  

I don't think it helps us get to a consensus to further discuss non-technical 
issues at this point. 

There is much interest in ideas for finding common ground in the masked array 
situation, but that should happen on another thread.

-Travis






Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Charles R Harris

2012/4/24 Stéfan van der Walt ste...@sun.ac.za

 On Tue, Apr 24, 2012 at 2:25 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
  Why are we having a discussion on NAN's in a thread on consensus?
  This is a strong indicator of the problem we're facing.
 
  We seem to have a consensus regarding interest in the topic.

 For the benefit of those of us interested in both discussions, would
 you kindly start a new thread on the MA topic?

 In response to Travis's suggestion of writing up a short summary of
 community principles, as well as Matthew's initial formulation, I
 agree that this would be helpful in enshrining the values we cherish
 here, as well as in communicating those values to the next generation
 of developers.


I think we adhere to these pretty well already, the problem is with the
word 'everyone'. I grew up in Massachusetts where town meetings were a
tradition. At those meetings the townsfolk voted on the budget, zoning,
construction of public buildings, use of public spaces and other such
topics. A quorum of voters was needed to make the votes binding, and apart
from that the meeting was limited to people who lived in the town, they,
after all, paid the taxes and had to live with the decisions. Outsiders
could sit in by invitation, but had to sit in a special area and were not
expected to speak unless called upon and certainly couldn't vote. So that
is one tradition, a democratic tradition with a history of success. We are
a much smaller community, physically separated, and don't need that sort of
exclusivity, but even so we have our version of resident and taxes, which
consists of hanging out on the list and contributing work. I think everyone
is welcome to express an opinion and make an argument, but not everyone has
a veto. I think a veto is a privilege, not a right, and to have that
privilege I think one needs to demonstrate an investment in the project,
consisting in this case of code contributions, code review, and other such
mundane tasks that demonstrate a larger interest and a willingness to work.
Anyone can do this, it doesn't require permission or special dispensation,
Numpy is very open in that regard. Folks working in related projects, such
as ipython and pandas, are also going to be listened to because they have
made that investment in time and work and the popularity of Numpy depends
on keeping them happy. But a right to veto doesn't automatically extend to
everyone who happens to have an interest in a topic.

Chuck

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Charles R Harris

On Tue, Apr 24, 2012 at 5:24 PM, Travis Oliphant tra...@continuum.io wrote:


 On Apr 24, 2012, at 6:01 PM, Stéfan van der Walt wrote:

  On Tue, Apr 24, 2012 at 2:25 PM, Charles R Harris
  charlesr.har...@gmail.com wrote:
  Why are we having a discussion on NAN's in a thread on consensus?
  This is a strong indicator of the problem we're facing.
 
  We seem to have a consensus regarding interest in the topic.
 
  For the benefit of those of us interested in both discussions, would
  you kindly start a new thread on the MA topic?
 
  In response to Travis's suggestion of writing up a short summary of
  community principles, as well as Matthew's initial formulation, I
  agree that this would be helpful in enshrining the values we cherish
  here, as well as in communicating those values to the next generation
  of developers.
 
  From observing the community, I would guess that these values include:
 
  - That any party with an interest in NumPy is given the opportunity to
  speak and to be heard on the list.
  - That discussions that influence the course of the project take place
  openly, for anyone to observe.
  - That decisions are made once consensus is reached, i.e., if everyone
  agrees that they can live with the outcome.

 This is well stated.  Thank you Stefan.

 Some will argue about what consensus means or who everyone is.  But,
 if we are really worrying about that, then we have stopped listening to
 each other which is the number one community value that we should be
 promoting, demonstrating, and living by.

 Consensus to me means that anyone who can produce a well-reasoned argument
 and demonstrates by their persistence that they are actually using the code
 and are aware of the issues has veto power on pull requests. At times
 people with commit rights to NumPy might perform a pull request anyway, but
 they should acknowledge at least in the comment (but for major changes ---
  on this list) that they are doing so and provide their reasons.

 If I decide later that I think the pull request was made inappropriately
 in the face of objections and the reasons were not justified, then I will
 reserve the right to revert the pull request.  I would like core
 developers of NumPy to have the same ability to check me as well.  But,
 if there is a disagreement at that level, then I will reserve the right to
 decide.

 Basically, what we have in this situation is that the masked arrays were
 added to NumPy master with serious objections to the API.   What I'm trying
 to decide right now is can we move forward and satisfy the objections
 without removing the ndarrayobject changes entirely (I do think the
 concerns warrant removal of the changes).   The discussion around that is
 the most helpful right now, but should take place on another thread.


Travis, if you are playing the BDFL role, then just make the darn decision
and remove the code so we can get on with life. As it is you go back and
forth and that does none of us any good, you're a big guy and you're
rocking the boat. I don't agree with that decision, I'd rather evolve the
code we have, but I'm willing to compromise with your decision in this
matter. I'm not willing to compromise with Nathaniel's, nor it seems
vice-versa. Nathaniel has volunteered to do the work, just ask him to
submit a patch.

Chuck

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Stéfan van der Walt

On Tue, Apr 24, 2012 at 4:49 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
 But a right to veto doesn't automatically extend to everyone who happens to
 have an interest in a topic.

The time has long gone when we simply hacked on NumPy for our own
benefit; if you will, NumPy users are our customers, and they have a
stake in its development (or, to phrase it differently, I think we
have a commitment to them).

If we strongly encourage people to discuss, but still give them an
avenue to object, we keep ourselves honest (both w.r.t. expectations
on numpy and our own insight into problems and their solutions).

Stéfan

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Benjamin Root

On Tuesday, April 24, 2012, Matthew Brett wrote:

 Hi,

 On Tue, Apr 24, 2012 at 2:25 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  2012/4/24 Stéfan van der Walt ste...@sun.ac.za
 
  On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris
  charlesr.har...@gmail.com wrote:
   The advantage of nans, I suppose, is that they are in the hardware and
   so
 
  Why are we having a discussion on NAN's in a thread on consensus?
  This is a strong indicator of the problem we're facing.
 
 
  We seem to have a consensus regarding interest in the topic.

 This email is mainly to Travis.

 This thread seems to be dying, condemning us to keep repeating the
 same conversation with no result.

 Chuck has made it clear he is not interested in this conversation.
 Until it is clear you are interested in this conversation, it will
 keep dying.   As you know, I think that will be very bad for numpy,
 and, as you know, I care a great deal about that.

 So, please, if you care about this, and agree that something should be
 done, please, say so, and if you don't agree something should be done,
 say so.  It can't get better without your help.

 See you,

 Matthew


Matthew,

I agree with the general idea of consensus, and I think many of us here
agree with the ideal in principle. Quite frankly, I am not sure what more
you want from us. You are only going to get so much leeway on a
philosophical discussion on governance on a numerical computation mailing list.
The thread keeps dying (I say it is getting distracted) because coders
are champing at the bit to get stuff done.

In a sense, I think there is a consensus, if you will, to move on.  All in
favor, say Aye!

Cheers!
Ben Root

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Nathaniel Smith

On Wed, Apr 25, 2012 at 12:49 AM, Charles R Harris
charlesr.har...@gmail.com wrote:
 I think we adhere to these pretty well already, the problem is with the word
 'everyone'. I grew up in Massachusetts where town meetings were a tradition.
 At those meetings the townsfolk voted on the budget, zoning, construction of
 public buildings, use of public spaces and other such topics. A quorum of
 voters was needed to make the votes binding, and apart from that the meeting
 was limited to people who lived in the town, they, after all, paid the taxes
 and had to live with the decisions. Outsiders could sit in by invitation,
 but had to sit in a special area and were not expected to speak unless
 called upon and certainly couldn't vote. So that is one tradition, a
 democratic tradition with a history of success. We are a much smaller
 community, physically separated, and don't need that sort of exclusivity,
 but even so we have our version of resident and taxes, which consists of
 hanging out on the list and contributing work. I think everyone is welcome
 to express an opinion and make an argument, but not everyone has a veto. I
 think a veto is a privilege, not a right, and to have that privilege I think
 one needs to demonstrate an investment in the project, consisting in this
 case of code contributions, code review, and other such mundane tasks that
 demonstrate a larger interest and a willingness to work. Anyone can do this,
 it doesn't require permission or special dispensation, Numpy is very open in
 that regard. Folks working in related projects, such as ipython and pandas,
 are also going to be listened to because they have made that investment in
 time and work and the popularity of Numpy depends on keeping them happy. But
 a right to veto doesn't automatically extend to everyone who happens to have
 an interest in a topic.

Consensus-seeking isn't about privilege or moral rights. It's about
ruthless pragmatism.

The end of your message actually gets very close to the position I'm
advocating -- except that I'm saying, instead of trying to judge which
people are worth keeping happy by looking up their commit record on
projects you've heard of, you're safer erring on the side of
assuming that anyone taking the time to show up probably has some good
reason for doing so, and that their concerns are probably shared by a
larger group.

You wouldn't refuse to try a chef's cooking until she's proven herself
by washing dishes -- why the heck would you demand that people perform
mundane tasks before you're willing to trust they have some insight?
Acting as maintainer isn't a privilege -- it's a gift you give. So is
feedback. Ignoring it is just a way of shooting your own project in
the foot.

- N

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Travis Oliphant

On Apr 24, 2012, at 7:16 PM, Stéfan van der Walt wrote:

 On Tue, Apr 24, 2012 at 4:49 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 But a right to veto doesn't automatically extend to everyone who happens to
 have an interest in a topic.

This is not my view, but it is Charles's view, and as he is an active developer in
the NumPy community, it carries weight.  I hope he can be convinced that
active users are an important part of the community.

Charles has made tremendous contributions to this community starting with 
significant code in Numarray that now lives in NumPy, significant commitment to 
code quality, significant effort on responding to pull requests, diligence in 
triaging and applying bug-fixes in tickets, and even responding to people who 
disagree with him.  

 
 The time has long gone when we simply hacked on NumPy for our own
 benefit; if you will, NumPy users are our customers, and they have a
 stake in its development (or, to phrase it differently, I think we
 have a commitment to them).
 
 If we strongly encourage people to discuss, but still give them an
 avenue to object, we keep ourselves honest (both w.r.t. expectations
 on numpy and our own insight into problems and their solutions).

+1

 
 Stéfan

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Nathaniel Smith

On Tue, Apr 24, 2012 at 2:14 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Mon, Apr 23, 2012 at 11:35 PM, Fernando Perez fperez@gmail.com
 wrote:

 On Mon, Apr 23, 2012 at 8:49 PM, Stéfan van der Walt ste...@sun.ac.za
 wrote:
  If you are referring to the traditional concept of a fork, and not to
  the type we frequently make on GitHub, then I'm surprised that no one
  has objected already.  What would a fork solve? To paraphrase the
  regexp saying: after forking, we'll simply have two problems.

 I concur with you here: github 'forks', yes, as many as possible!
 Hopefully every one of those will produce one or more PRs :)  But a
 fork in the sense of a divergent parallel project?  I think that would
 only be indicative of a complete failure to find a way to make
 progress here, and I doubt we're anywhere near that state.

 That forks are *possible* is indeed a valuable and important option in
 open source software, because it means that a truly dysfunctional
 original project team/direction can't hold a community hostage
 forever.  But that doesn't mean that full-blown forks should be
 considered lightly, as they also carry enormous costs.

 I see absolutely nothing in the current scenario to even remotely
 consider that a full-blown fork would be a good idea, and I hope I'm
 right.  It seems to me we're making progress on problems that led to
 real difficulties last year, but from multiple parties I see signs
 that give me reason to be optimistic that the project is getting
 better, not worse.


 We certainly aren't there at the moment, but I can see us heading that way.
 But let's back up a bit. Numpy 1.6.0 came out just about 1 year ago. Since
 then datetime, NA, polynomial work, and various other enhancements have gone
 in along with some 280 bug fixes. The major technical problem blocking a 1.7
 release is getting datetime working reliably on windows. So I think that is
 where the short term effort needs to be. Meanwhile, we are spending effort
 to get out a 1.6.2 just so people can work with a stable version with some
 of the bug fixes, and potentially we will spend more time and effort to pull
 out the NA code. In the future there may be a transition to C++ and
 eventually a break with the current ABI. Or not.

 There are at least two motivations that get folks to write code for open
 source projects, scratching an itch and money. Money hasn't been a big part
 of the Numpy picture so far, so that leaves scratching an itch. One of the
 attractions of Numpy is that it is a small project, BSD licensed, and not
 overburdened with governance and process. This makes scratching an itch not
 as difficult as it would be in a large project. If Numpy remains a small
 project but acquires the encumbrances of a big project much of that
 attraction will be lost. Momentum and direction also attracts people, but
 numpy is stalled at the moment as the whole NA thing circles around once
 again.

I don't think we need a fork, or to start maintaining separate stable
and unstable trees, or any of the other complicated process changes
that have been suggested. There are tons of projects that routinely
make much bigger changes than we're talking about, and they do it
without needing that kind of overhead. I know that these suggestions
are all made in good faith, but they remind me of a line from that
Apache page I linked earlier: People tend to avoid conflict and
thrash around looking for something to substitute - somebody in
charge, a rule, a process, stagnation. None of these tend to be very
good substitutes for doing the hard work of resolving the conflict.

I also think if you talk to potential contributors, you'll find that
clear, simple processes and a history of respecting everyone's input
are much more attractive than a no-rules free-for-all. Good
engineering practices are not an encumbrance. Resolving conflicts
before merging is a good engineering practice.

What happened with the NA discussion is this:
  - There was substantial disagreement about whether NEP-style masks,
or indeed, focusing on a mask-based implementation *at all*, was the
best way forward.
  - There was also a perceived time constraint, that we had to either
implement something immediately while Mark was there, or have nothing.

So in the end, the latter concern outweighed the former, the
discussion was cut off, and Mark's best guess at an API was merged
into master. I totally understand how this decision made sense at the
time, but the result is what we see now: it's left numpy stalled,
rifts on the mailing list, boring discussions about process, and still
no agreement about whether NEP-style masks will actually solve our
users' problems.

Getting past this isn't *complicated* -- it's just hard work.

 What would I suggest as a way forward with the NA option. Let's take the
 issues.

 1) Adding slots to PyArrayObject_fields. I don't think this is likely to be
 a problem unless someone's code passes the struct by value or

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Charles R Harris

On Tue, Apr 24, 2012 at 6:56 PM, Nathaniel Smith n...@pobox.com wrote:

 On Tue, Apr 24, 2012 at 2:14 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  On Mon, Apr 23, 2012 at 11:35 PM, Fernando Perez fperez@gmail.com
  wrote:
 
  On Mon, Apr 23, 2012 at 8:49 PM, Stéfan van der Walt ste...@sun.ac.za
  wrote:
   If you are referring to the traditional concept of a fork, and not to
   the type we frequently make on GitHub, then I'm surprised that no one
   has objected already.  What would a fork solve? To paraphrase the
   regexp saying: after forking, we'll simply have two problems.
 
  I concur with you here: github 'forks', yes, as many as possible!
  Hopefully every one of those will produce one or more PRs :)  But a
  fork in the sense of a divergent parallel project?  I think that would
  only be indicative of a complete failure to find a way to make
  progress here, and I doubt we're anywhere near that state.
 
  That forks are *possible* is indeed a valuable and important option in
  open source software, because it means that a truly dysfunctional
  original project team/direction can't hold a community hostage
  forever.  But that doesn't mean that full-blown forks should be
  considered lightly, as they also carry enormous costs.
 
  I see absolutely nothing in the current scenario to even remotely
  consider that a full-blown fork would be a good idea, and I hope I'm
  right.  It seems to me we're making progress on problems that led to
  real difficulties last year, but from multiple parties I see signs
  that give me reason to be optimistic that the project is getting
  better, not worse.
 
 
  We certainly aren't there at the moment, but I can see us heading that way.
  But let's back up a bit. Numpy 1.6.0 came out just about 1 year ago. Since
  then datetime, NA, polynomial work, and various other enhancements have gone
  in along with some 280 bug fixes. The major technical problem blocking a 1.7
  release is getting datetime working reliably on windows. So I think that is
  where the short term effort needs to be. Meanwhile, we are spending effort
  to get out a 1.6.2 just so people can work with a stable version with some
  of the bug fixes, and potentially we will spend more time and effort to pull
  out the NA code. In the future there may be a transition to C++ and
  eventually a break with the current ABI. Or not.
 
  There are at least two motivations that get folks to write code for open
  source projects, scratching an itch and money. Money hasn't been a big part
  of the Numpy picture so far, so that leaves scratching an itch. One of the
  attractions of Numpy is that it is a small project, BSD licensed, and not
  overburdened with governance and process. This makes scratching an itch not
  as difficult as it would be in a large project. If Numpy remains a small
  project but acquires the encumbrances of a big project much of that
  attraction will be lost. Momentum and direction also attracts people, but
  numpy is stalled at the moment as the whole NA thing circles around once
  again.

 I don't think we need a fork, or to start maintaining separate stable
 and unstable trees, or any of the other complicated process changes
 that have been suggested. There are tons of projects that routinely
 make much bigger changes than we're talking about, and they do it
 without needing that kind of overhead. I know that these suggestions
 are all made in good faith, but they remind me of a line from that
 Apache page I linked earlier: People tend to avoid conflict and
 thrash around looking for something to substitute - somebody in
 charge, a rule, a process, stagnation. None of these tend to be very
 good substitutes for doing the hard work of resolving the conflict.

 I also think if you talk to potential contributors, you'll find that
 clear, simple processes and a history of respecting everyone's input
 are much more attractive than a no-rules free-for-all. Good
 engineering practices are not an encumbrance. Resolving conflicts
 before merging is a good engineering practice.

 What happened with the NA discussion is this:
  - There was substantial disagreement about whether NEP-style masks,
 or indeed, focusing on a mask-based implementation *at all*, was the
 best way forward.
  - There was also a perceived time constraint, that we had to either
 implement something immediately while Mark was there, or have nothing.

 So in the end, the latter concern outweighed the former, the
 discussion was cut off, and Mark's best guess at an API was merged
 into master. I totally understand how this decision made sense at the
 time, but the result is what we see now: it's left numpy stalled,
 rifts on the mailing list, boring discussions about process, and still
 no agreement about whether NEP-style masks will actually solve our
 users' problems.

 Getting past this isn't *complicated* -- it's just hard work.


I admit to a certain curiosity about your own involvement in

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Matthew Brett

Hi,

On Tue, Apr 24, 2012 at 6:12 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Tue, Apr 24, 2012 at 6:56 PM, Nathaniel Smith n...@pobox.com wrote:

 On Tue, Apr 24, 2012 at 2:14 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  On Mon, Apr 23, 2012 at 11:35 PM, Fernando Perez fperez@gmail.com
  wrote:
 
  On Mon, Apr 23, 2012 at 8:49 PM, Stéfan van der Walt ste...@sun.ac.za
  wrote:
   If you are referring to the traditional concept of a fork, and not to
   the type we frequently make on GitHub, then I'm surprised that no one
   has objected already.  What would a fork solve? To paraphrase the
   regexp saying: after forking, we'll simply have two problems.
 
  I concur with you here: github 'forks', yes, as many as possible!
  Hopefully every one of those will produce one or more PRs :)  But a
  fork in the sense of a divergent parallel project?  I think that would
  only be indicative of a complete failure to find a way to make
  progress here, and I doubt we're anywhere near that state.
 
  That forks are *possible* is indeed a valuable and important option in
  open source software, because it means that a truly dysfunctional
  original project team/direction can't hold a community hostage
  forever.  But that doesn't mean that full-blown forks should be
  considered lightly, as they also carry enormous costs.
 
  I see absolutely nothing in the current scenario to even remotely
  consider that a full-blown fork would be a good idea, and I hope I'm
  right.  It seems to me we're making progress on problems that led to
  real difficulties last year, but from multiple parties I see signs
  that give me reason to be optimistic that the project is getting
  better, not worse.
 
 
  We certainly aren't there at the moment, but I can see us heading that way.
  But let's back up a bit. Numpy 1.6.0 came out just about 1 year ago. Since
  then datetime, NA, polynomial work, and various other enhancements have gone
  in along with some 280 bug fixes. The major technical problem blocking a 1.7
  release is getting datetime working reliably on windows. So I think that is
  where the short term effort needs to be. Meanwhile, we are spending effort
  to get out a 1.6.2 just so people can work with a stable version with some
  of the bug fixes, and potentially we will spend more time and effort to pull
  out the NA code. In the future there may be a transition to C++ and
  eventually a break with the current ABI. Or not.
 
  There are at least two motivations that get folks to write code for open
  source projects, scratching an itch and money. Money hasn't been a big part
  of the Numpy picture so far, so that leaves scratching an itch. One of the
  attractions of Numpy is that it is a small project, BSD licensed, and not
  overburdened with governance and process. This makes scratching an itch not
  as difficult as it would be in a large project. If Numpy remains a small
  project but acquires the encumbrances of a big project much of that
  attraction will be lost. Momentum and direction also attracts people, but
  numpy is stalled at the moment as the whole NA thing circles around once
  again.

 I don't think we need a fork, or to start maintaining separate stable
 and unstable trees, or any of the other complicated process changes
 that have been suggested. There are tons of projects that routinely
 make much bigger changes than we're talking about, and they do it
 without needing that kind of overhead. I know that these suggestions
 are all made in good faith, but they remind me of a line from that
 Apache page I linked earlier: People tend to avoid conflict and
 thrash around looking for something to substitute - somebody in
 charge, a rule, a process, stagnation. None of these tend to be very
 good substitutes for doing the hard work of resolving the conflict.

 I also think if you talk to potential contributors, you'll find that
 clear, simple processes and a history of respecting everyone's input
 are much more attractive than a no-rules free-for-all. Good
 engineering practices are not an encumbrance. Resolving conflicts
 before merging is a good engineering practice.

 What happened with the NA discussion is this:
  - There was substantial disagreement about whether NEP-style masks,
 or indeed, focusing on a mask-based implementation *at all*, was the
 best way forward.
  - There was also a perceived time constraint, that we had to either
 implement something immediately while Mark was there, or have nothing.

 So in the end, the latter concern outweighed the former, the
 discussion was cut off, and Mark's best guess at an API was merged
 into master. I totally understand how this decision made sense at the
 time, but the result is what we see now: it's left numpy stalled,
 rifts on the mailing list, boring discussions about process, and still
 no agreement about whether NEP-style masks will actually solve our
 users' problems.

 Getting past this

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Fernando Perez

On Tue, Apr 24, 2012 at 6:12 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
 I admit to a certain curiosity about your own involvement in FOSS projects,
 and I know I'm not alone in this. Google shows several years of discussion
 on Monotone, but I have no idea what your contributions were

Seriously???

Please, let's rise above this.  We discuss people's opinions *on their
technical merit alone*, regardless of the background of the person
presenting them.  I don't care if Linus himself shows up on the list
with a bad idea, it should be shot down; and if someone we'd never
heard of brings up a valid point, we should respect it.

The day we start checking credentials at the door is the day this
project will die as an open source effort.  Or at least I think so,
but perhaps I don't have enough 'commit credits' in my account for my
opinion to matter...

Cheers,

f

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Charles R Harris

On Tue, Apr 24, 2012 at 8:56 PM, Fernando Perez fperez@gmail.com wrote:

 On Tue, Apr 24, 2012 at 6:12 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
  I admit to a certain curiosity about your own involvement in FOSS
 projects,
  and I know I'm not alone in this. Google shows several years of
 discussion
  on Monotone, but I have no idea what your contributions were

 Seriously???

 Please, let's rise above this.  We discuss people's opinions *on their
 technical merit alone*, regardless of the background of the person
 presenting them.  I don't care if Linus himself shows up on the list
 with a bad idea, it should be shot down; and if someone we'd never
 heard of brings up a valid point, we should respect it.

 The day we start checking credentials at the door is the day this
 project will die as an open source effort.  Or at least I think so,
 but perhaps I don't have enough 'commit credits' in my account for my
 opinion to matter...


Fernando, I'm not checking credentials, I'm curious. Nathaniel has
experience with FOSS projects, unlike us first timers, and I'd like to know
what that experience was and what he learned from it. He has also mentioned
Graydon Hoare in connection with RUST, and since Graydon was the prime
mover in Monotone I'd like to know the story of the project.

Chuck

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Fernando Perez

On Tue, Apr 24, 2012 at 8:02 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
 Fernando, I'm not checking credentials, I'm curious.

Well, at least I think that an inquisitive query about someone's
background, phrased like that, can be very easily misread.  I can only
speak for myself, but I immediately had the impression that you were
indeed trying to validate his background as a proxy for the
discussion, and suggesting that others had the same curiosity...

Had the question been something more like "Hey Nathaniel, what other
projects do you think could inform our current view, maybe from stuff
you've done in the past or lists you've lurked on?", I would have had a
very different reaction.  But this sentence:


I admit to a certain curiosity about your own involvement in FOSS
projects, and I know I'm not alone in this.


definitely reads to me with a rather dark and unpleasant angle. Upon
rereading it again now, I still don't like the tone.  I trust you when
you indicate that your intent was different; perhaps it's a matter of
phrasing, or the fact that English is not my native language and I may
miss subtleties of native speakers.

Cheers,

f

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread josef . pktd

On Tue, Apr 24, 2012 at 11:28 PM, Fernando Perez fperez@gmail.com wrote:
 On Tue, Apr 24, 2012 at 8:02 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 Fernando, I'm not checking credentials, I'm curious.

 Well, at least I think that an inquisitive query about someone's
 background, phrased like that, can be very easily misread.  I can only
 speak for myself, but I immediately had the impression that you were
 indeed trying to validate his background as a proxy for the
 discussion, and suggesting that others had the same curiosity...

 Had the question been something more like Hey Nathaniel, what other
 projects do you think could inform our current view, maybe from stuff
 you've done in the past or lists you've lurked on?, I would have a
 very different reaction.  But this sentence:

 
 I admit to a certain curiosity about your own involvement in FOSS
 projects, and I know I'm not alone in this.
 

 definitely reads to me with a rather dark and unpleasant angle. Upon
 rereading it again now, I still don't like the tone.  I trust you when
 you indicate that your intent was different; perhaps it's a matter of
 phrasing, or the fact that English is not my native language and I may
 miss subtleties of native speakers.

I agree with the interpretation. However, whenever I look at this
thread in Gmail, I see the first line:
"If you hang around big FOSS projects, you'll see the word consensus"

I'm only hanging around in this neighborhood (9 mailing lists), so I
have no idea about big FOSS projects.

Josef



 Cheers,

 f

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Charles R Harris

On Tue, Apr 24, 2012 at 9:28 PM, Fernando Perez fperez@gmail.com wrote:

 On Tue, Apr 24, 2012 at 8:02 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
  Fernando, I'm not checking credentials, I'm curious.

 Well, at least I think that an inquisitive query about someone's
 background, phrased like that, can be very easily misread.  I can only
 speak for myself, but I immediately had the impression that you were
 indeed trying to validate his background as a proxy for the
 discussion, and suggesting that others had the same curiosity...

 Had the question been something more like Hey Nathaniel, what other
 projects do you think could inform our current view, maybe from stuff
 you've done in the past or lists you've lurked on?, I would have a
 very different reaction.  But this sentence:

 
 I admit to a certain curiosity about your own involvement in FOSS
 projects, and I know I'm not alone in this.
 

 definitely reads to me with a rather dark and unpleasant angle. Upon
 rereading it again now, I still don't like the tone.  I trust you when
 you indicate that your intent was different; perhaps it's a matter of
 phrasing, or the fact that English is not my native language and I may
 miss subtleties of native speakers.


Perhaps it was a bit colored, but even so, I'd like to know some specifics
of his experience. Monotone was one of the projects that sprang up as an
open alternative after Linus started using Bitkeeper, but that is actually
fairly recent (2003 or so) and much of the discussion seems to have been
carried on over IRC, rather than a mailing list. I'm guessing that some
other projects could have taken place in the 90's, but things have changed
so much since then that it is hard to know what was going on in that
decade. There was certainly work on the C++ Template library, Linux,
Python, and various utilities. But it is hard to know. In any case, I'd
guess that Monotone was a fairly tight knit community, and about 2007 most
of the developers left. I'd guess it was mostly a case of git and mercurial
becoming dominant, and possibly they also lost interest in DVCS and moved
on to other things.

Numpy itself has gone through several of those transitions, and looking
back, I think one of the problems was that when Travis left for Enthought
he didn't officially hand off maintenance. The whole transition was a bit
lucky, with David, Pauli, and myself unofficially continuing the work for
the 1.3 and 1.4 releases. At that point I was hoping David could more or
less take over, but he graduated, and Pauli would have been an excellent
choice, but he took up his graduate studies. Turnover is a problem with
open source, and no matter how much discussion there is, if people aren't
doing the work the whole thing sort of peters out.

Chuck

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Travis Oliphant


On Apr 24, 2012, at 9:41 PM, Matthew Brett wrote:

 Hi,
 
 On Tue, Apr 24, 2012 at 6:12 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
 On Tue, Apr 24, 2012 at 6:56 PM, Nathaniel Smith n...@pobox.com wrote:
 
 On Tue, Apr 24, 2012 at 2:14 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
 On Mon, Apr 23, 2012 at 11:35 PM, Fernando Perez fperez@gmail.com
 wrote:
 
 On Mon, Apr 23, 2012 at 8:49 PM, Stéfan van der Walt ste...@sun.ac.za
 wrote:
 If you are referring to the traditional concept of a fork, and not to
 the type we frequently make on GitHub, then I'm surprised that no one
 has objected already.  What would a fork solve? To paraphrase the
 regexp saying: after forking, we'll simply have two problems.
 
 I concur with you here: github 'forks', yes, as many as possible!
 Hopefully every one of those will produce one or more PRs :)  But a
 fork in the sense of a divergent parallel project?  I think that would
 only be indicative of a complete failure to find a way to make
 progress here, and I doubt we're anywhere near that state.
 
 That forks are *possible* is indeed a valuable and important option in
 open source software, because it means that a truly dysfunctional
 original project team/direction can't hold a community hostage
 forever.  But that doesn't mean that full-blown forks should be
 considered lightly, as they also carry enormous costs.
 
 I see absolutely nothing in the current scenario to even remotely
 consider that a full-blown fork would be a good idea, and I hope I'm
 right.  It seems to me we're making progress on problems that led to
 real difficulties last year, but from multiple parties I see signs
 that give me reason to be optimistic that the project is getting
 better, not worse.
 
 
 We certainly aren't there at the moment, but I can see us heading that way.
 But let's back up a bit. Numpy 1.6.0 came out just about 1 year ago. Since
 then datetime, NA, polynomial work, and various other enhancements have gone
 in along with some 280 bug fixes. The major technical problem blocking a 1.7
 release is getting datetime working reliably on windows. So I think that is
 where the short term effort needs to be. Meanwhile, we are spending effort
 to get out a 1.6.2 just so people can work with a stable version with some
 of the bug fixes, and potentially we will spend more time and effort to pull
 out the NA code. In the future there may be a transition to C++ and
 eventually a break with the current ABI. Or not.
 
 There are at least two motivations that get folks to write code for open
 source projects, scratching an itch and money. Money hasn't been a big part
 of the Numpy picture so far, so that leaves scratching an itch. One of the
 attractions of Numpy is that it is a small project, BSD licensed, and not
 overburdened with governance and process. This makes scratching an itch not
 as difficult as it would be in a large project. If Numpy remains a small
 project but acquires the encumbrances of a big project much of that
 attraction will be lost. Momentum and direction also attracts people, but
 numpy is stalled at the moment as the whole NA thing circles around once
 again.
 
 I don't think we need a fork, or to start maintaining separate stable
 and unstable trees, or any of the other complicated process changes
 that have been suggested. There are tons of projects that routinely
 make much bigger changes than we're talking about, and they do it
 without needing that kind of overhead. I know that these suggestions
 are all made in good faith, but they remind me of a line from that
 Apache page I linked earlier: People tend to avoid conflict and
 thrash around looking for something to substitute - somebody in
 charge, a rule, a process, stagnation. None of these tend to be very
 good substitutes for doing the hard work of resolving the conflict.
 
 I also think if you talk to potential contributors, you'll find that
 clear, simple processes and a history of respecting everyone's input
 are much more attractive than a no-rules free-for-all. Good
 engineering practices are not an encumbrance. Resolving conflicts
 before merging is a good engineering practice.
 
 What happened with the NA discussion is this:
  - There was substantial disagreement about whether NEP-style masks,
 or indeed, focusing on a mask-based implementation *at all*, was the
 best way forward.
  - There was also a perceived time constraint, that we had to either
 implement something immediately while Mark was there, or have nothing.
 
 So in the end, the latter concern outweighed the former, the
 discussion was cut off, and Mark's best guess at an API was merged
 into master. I totally understand how this decision made sense at the
 time, but the result is what we see now: it's left numpy stalled,
 rifts on the mailing list, boring discussions about process, and still
 no agreement about whether NEP-style masks will actually solve our
 users' problems.
 
 Getting past

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Fernando Perez

On Tue, Apr 24, 2012 at 8:50 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
 Turnover is a problem with open source, and no matter how much discussion
 there is, if people aren't doing the work the whole thing sort of peters
 out.

That's very true, and I hope that by building a friendly and welcoming
environment, we'll raise the chances of getting sufficient new
contributors to help with this issue.  For my talk at Euroscipy last
year [1] I made some plots collecting git statistics that show how
heavily most scientific Python projects rest on the shoulders of
very, very few.  I really hope we can find ways of spreading the load
a bit wider, and everything we can do to make projects more appealing
to new contributors is an effort worth making.

Cheers,

f

[1] http://fperez.org/talks/1108_euroscipy_keynote.pdf

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Travis Oliphant


On Apr 24, 2012, at 10:50 PM, Charles R Harris wrote:

 
 
 On Tue, Apr 24, 2012 at 9:28 PM, Fernando Perez fperez@gmail.com wrote:
 On Tue, Apr 24, 2012 at 8:02 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
  Fernando, I'm not checking credentials, I'm curious.
 
 Well, at least I think that an inquisitive query about someone's
 background, phrased like that, can be very easily misread.  I can only
 speak for myself, but I immediately had the impression that you were
 indeed trying to validate his background as a proxy for the
 discussion, and suggesting that others had the same curiosity...
 
 Had the question been something more like Hey Nathaniel, what other
 projects do you think could inform our current view, maybe from stuff
 you've done in the past or lists you've lurked on?, I would have a
 very different reaction.  But this sentence:
 
 
 I admit to a certain curiosity about your own involvement in FOSS
 projects, and I know I'm not alone in this.
 
 
 definitely reads to me with a rather dark and unpleasant angle. Upon
 rereading it again now, I still don't like the tone.  I trust you when
 you indicate that your intent was different; perhaps it's a matter of
 phrasing, or the fact that English is not my native language and I may
 miss subtleties of native speakers.
 
 
 Perhaps it was a bit colored, but even so, I'd like to know some specifics of 
 his experience. Monotone was one of the projects that sprang up after Linus 
 started using Bitkeeper as an open alternative, but that is actually fairly 
 recent (2003 or so) and much of the discussion seems to have been carried on 
 over IRC, rather than a mailing list. I'm guessing that some other projects 
 could have taken place in the 90's, but things have changed so much since 
 then that it is hard to know what was going on in that decade. There was 
 certainly work on the C++ Template library, Linux, Python, and various 
 utilities. But it is hard to know. In any case, I'd guess that Monotone was a 
 fairly tight knit community, and about 2007 most of the developers left. I'd 
 guess it was mostly a case of git and mercurial becoming dominant, and 
 possibly they also lost interest in DVCS and moved on to other things.
 
 Numpy itself has gone through several of those transitions, and looking back, 
 I think one of the problems was that when Travis left for Enthought he didn't 
 officially hand off maintenance. The whole transition was a bit lucky, with 
 David, Pauli, and myself unofficially continuing the work for the 1.3 and 1.4 
 releases. At that point I was hoping David could more or less take over, but 
 he graduated, and Pauli would have been an excellent choice, but he took up 
 his graduate studies. Turnover is a problem with open source, and no matter 
 how much discussion there is, if people aren't doing the work the whole thing 
 sort of peters out.

Thanks for explaining yourself.  The tone you used earlier could have been
misinterpreted (though I would hope that people would look at your record of
contribution and give you the benefit of the doubt).  Your last sentence is
very true.  In this particular case, however, there is enough interest that
the whole thing will not peter out, but there is a strong chance that there
will be competing groups with divergent needs and interests vying for how the
project should develop.

There are many people who rely on NumPy and are concerned about its progress.
NumFocus was created to fight for resources to further the whole ecosystem and
not just rely on volunteers that are available.  I fundamentally do not
believe that model can scale.  There are, however, ways to keep things open
source and allow people to work on NumPy as their day-job.  Several companies
now exist that benefit from the NumPy code base and will be interested in
seeing it grow.

It is a mischaracterization to imply that I left the project without a
hand-off.  I never handed off the project because I never left it.  I was
very busy at Enthought.  I will still be busy now.  But NumPy is very
important to me and has remained so.  I have spent a great deal of mental
effort trying to figure out how to contribute to its growth.  Yes, I allowed
other people to contribute significantly to the project and was very receptive
to their pull requests (even when I didn't think a change was the most urgent
thing, or when I actually disagreed with it).

That should not be interpreted as having left.   NumPy grew because it solved 
a useful problem and people were willing to tolerate its problems to make a 
difference by contributing. None of us matter as much to NumPy as the 
problems it helps people solve.   To the degree it does that we are lucky to 
be able to contribute to the project.   I hope all NumPy developers continue to 
be lucky enough to have people actually care about the problems NumPy solves 
now and can solve in the future. 

-Travis

 Chuck

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread josef . pktd

On Wed, Apr 25, 2012 at 12:25 AM, Travis Oliphant tra...@continuum.io wrote:

 On Apr 24, 2012, at 10:50 PM, Charles R Harris wrote:



 On Tue, Apr 24, 2012 at 9:28 PM, Fernando Perez fperez@gmail.com
 wrote:

 On Tue, Apr 24, 2012 at 8:02 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
  Fernando, I'm not checking credentials, I'm curious.

 Well, at least I think that an inquisitive query about someone's
 background, phrased like that, can be very easily misread.  I can only
 speak for myself, but I immediately had the impression that you were
 indeed trying to validate his background as a proxy for the
 discussion, and suggesting that others had the same curiosity...

 Had the question been something more like Hey Nathaniel, what other
 projects do you think could inform our current view, maybe from stuff
 you've done in the past or lists you've lurked on?, I would have a
 very different reaction.  But this sentence:

 
 I admit to a certain curiosity about your own involvement in FOSS
 projects, and I know I'm not alone in this.
 

 definitely reads to me with a rather dark and unpleasant angle. Upon
 rereading it again now, I still don't like the tone.  I trust you when
 you indicate that your intent was different; perhaps it's a matter of
 phrasing, or the fact that English is not my native language and I may
 miss subtleties of native speakers.


 Perhaps it was a bit colored, but even so, I'd like to know some specifics
 of his experience. Monotone was one of the projects that sprang up after
 Linus started using Bitkeeper as an open alternative, but that is actually
 fairly recent (2003 or so) and much of the discussion seems to have been
 carried on over IRC, rather than a mailing list. I'm guessing that some
 other projects could have taken place in the 90's, but things have changed
 so much since then that it is hard to know what was going on in that decade.
 There was certainly work on the C++ Template library, Linux, Python, and
 various utilities. But it is hard to know. In any case, I'd guess that
 Monotone was a fairly tight knit community, and about 2007 most of the
 developers left. I'd guess it was mostly a case of git and mercurial
 becoming dominant, and possibly they also lost interest in DVCS and moved on
 to other things.

 Numpy itself has gone through several of those transitions, and looking
 back, I think one of the problems was that when Travis left for Enthought he
 didn't officially hand off maintenance. The whole transition was a bit
 lucky, with David, Pauli, and myself unofficially continuing the work for
 the 1.3 and 1.4 releases. At that point I was hoping David could more or
 less take over, but he graduated, and Pauli would have been an excellent
 choice, but he took up his graduate studies. Turnover is a problem with open
 source, and no matter how much discussion there is, if people aren't doing
 the work the whole thing sort of peters out.


 Thanks for explaining yourself.    The tone you used could earlier have been
 mis-interpreted (though I would hope that people would look at your record
 of contribution and give you the benefit of the doubt).   Your last sentence
 is very true.   In this particular case, however, there is enough interest
 that the whole thing will not peter out, but there is a strong chance that
 there will be competing groups with divergent needs and interests vying for
 how the project should develop.

 There are many people who rely on NumPy and are concerned about its
 progress.   NumFocus was created to fight for resources to further the whole
 ecosystem and not just rely on volunteers that are available.   I
 fundamentally do not believe that model can scale.    There are, however,
 ways to keep things open source and allow people to work on NumPy as their
 day-job.  Several companies now exist that benefit from the NumPy code base
 and will be interested in seeing it grow.

 It is a mis-characterization to imply that I left the project without a
 hand-off.   I never handed off the project because I never left it.   I
 was very busy at Enthought.  I will still be busy now.   But, NumPy is very
 important to me and has remained so.   I have spent a great deal of mental
 effort trying to figure out how to contribute to its growth.   Yes, I
 allowed other people to contribute significantly to the project and was very
 receptive to their pull requests (even when I didn't think it was the most
 urgent thing or something I actually disagreed with).

Sorry that I missed this part of numpy history. I always had the
impression that numpy is run by a community led by Chuck and the young
guys, David, Pauli, Stefan, Pierre; and Robert on the mailing list.
(But I came late, and am just a balcony muppet.)

Josef


 That should not be interpreted as having left.   NumPy grew because it
 solved a useful problem and people were willing to tolerate its problems to
 make a difference by contributing.     None of us matter as much to NumPy as

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Charles R Harris

On Tue, Apr 24, 2012 at 10:25 PM, Travis Oliphant tra...@continuum.io wrote:


 On Apr 24, 2012, at 10:50 PM, Charles R Harris wrote:



 On Tue, Apr 24, 2012 at 9:28 PM, Fernando Perez fperez@gmail.com wrote:

 On Tue, Apr 24, 2012 at 8:02 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
  Fernando, I'm not checking credentials, I'm curious.

 Well, at least I think that an inquisitive query about someone's
 background, phrased like that, can be very easily misread.  I can only
 speak for myself, but I immediately had the impression that you were
 indeed trying to validate his background as a proxy for the
 discussion, and suggesting that others had the same curiosity...

 Had the question been something more like Hey Nathaniel, what other
 projects do you think could inform our current view, maybe from stuff
 you've done in the past or lists you've lurked on?, I would have a
 very different reaction.  But this sentence:

 
 I admit to a certain curiosity about your own involvement in FOSS
 projects, and I know I'm not alone in this.
 

 definitely reads to me with a rather dark and unpleasant angle. Upon
 rereading it again now, I still don't like the tone.  I trust you when
 you indicate that your intent was different; perhaps it's a matter of
 phrasing, or the fact that English is not my native language and I may
 miss subtleties of native speakers.


 Perhaps it was a bit colored, but even so, I'd like to know some specifics
 of his experience. Monotone was one of the projects that sprang up after
 Linus started using Bitkeeper as an open alternative, but that is actually
 fairly recent (2003 or so) and much of the discussion seems to have been
 carried on over IRC, rather than a mailing list. I'm guessing that some
 other projects could have taken place in the 90's, but things have changed
 so much since then that it is hard to know what was going on in that
 decade. There was certainly work on the C++ Template library, Linux,
 Python, and various utilities. But it is hard to know. In any case, I'd
 guess that Monotone was a fairly tight knit community, and about 2007 most
 of the developers left. I'd guess it was mostly a case of git and mercurial
 becoming dominant, and possibly they also lost interest in DVCS and moved
 on to other things.

 Numpy itself has gone through several of those transitions, and looking
 back, I think one of the problems was that when Travis left for Enthought
 he didn't officially hand off maintenance. The whole transition was a bit
 lucky, with David, Pauli, and myself unofficially continuing the work for
 the 1.3 and 1.4 releases. At that point I was hoping David could more or
 less take over, but he graduated, and Pauli would have been an excellent
 choice, but he took up his graduate studies. Turnover is a problem with
 open source, and no matter how much discussion there is, if people aren't
 doing the work the whole thing sort of peters out.


 Thanks for explaining yourself.   The tone you used earlier could have
 been mis-interpreted (though I would hope that people would look at your
 record of contribution and give you the benefit of the doubt).   Your last
 sentence is very true.   In this particular case, however, there is enough
 interest that the whole thing will not peter out, but there is a strong
 chance that there will be competing groups with divergent needs and
 interests vying for how the project should develop.

 There are many people who rely on NumPy and are concerned about its
 progress.   NumFocus was created to fight for resources to further the
 whole ecosystem and not just rely on volunteers that are available.   I
 fundamentally do not believe that model can scale.   There are, however,
 ways to keep things open source and allow people to work on NumPy as their
 day-job.  Several companies now exist that benefit from the NumPy code base
 and will be interested in seeing it grow.

 It is a mis-characterization to imply that I left the project without a
 hand-off.   I never handed off the project because I never left it.   I
 was very busy at Enthought.  I will still be busy now.   But, NumPy is very
 important to me and has remained so.   I have spent a great deal of mental
 effort trying to figure out how to contribute to its growth.   Yes, I
 allowed other people to contribute significantly to the project and was
 very receptive to their pull requests (even when I didn't think it was the
 most urgent thing or something I actually disagreed with).


Well then, let's say you should have handed off, because you no longer had
the time to devote to it. You made the 1.2.1 release, and after that you
weren't really involved until recently. Now I'm sure that you didn't lose
interest, but you did lose the time, and I think it would have been better
if you had realized that fact up front. As it was, I suggested to David
that it was time for a 1.3 release, and we proceeded without permission from
the usual suspects, yourself and Jarrod. I

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Travis Oliphant


On Apr 25, 2012, at 12:02 AM, Charles R Harris wrote:

 
 
 On Tue, Apr 24, 2012 at 10:25 PM, Travis Oliphant tra...@continuum.io wrote:
 
 On Apr 24, 2012, at 10:50 PM, Charles R Harris wrote:
 
 
 
 On Tue, Apr 24, 2012 at 9:28 PM, Fernando Perez fperez@gmail.com wrote:
 On Tue, Apr 24, 2012 at 8:02 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
  Fernando, I'm not checking credentials, I'm curious.
 
 Well, at least I think that an inquisitive query about someone's
 background, phrased like that, can be very easily misread.  I can only
 speak for myself, but I immediately had the impression that you were
 indeed trying to validate his background as a proxy for the
 discussion, and suggesting that others had the same curiosity...
 
 Had the question been something more like Hey Nathaniel, what other
 projects do you think could inform our current view, maybe from stuff
 you've done in the past or lists you've lurked on?, I would have a
 very different reaction.  But this sentence:
 
 
 I admit to a certain curiosity about your own involvement in FOSS
 projects, and I know I'm not alone in this.
 
 
 definitely reads to me with a rather dark and unpleasant angle. Upon
 rereading it again now, I still don't like the tone.  I trust you when
 you indicate that your intent was different; perhaps it's a matter of
 phrasing, or the fact that English is not my native language and I may
 miss subtleties of native speakers.
 
 
 Perhaps it was a bit colored, but even so, I'd like to know some specifics 
 of his experience. Monotone was one of the projects that sprang up after 
 Linus started using Bitkeeper as an open alternative, but that is actually 
 fairly recent (2003 or so) and much of the discussion seems to have been 
 carried on over IRC, rather than a mailing list. I'm guessing that some 
 other projects could have taken place in the 90's, but things have changed 
 so much since then that it is hard to know what was going on in that decade. 
 There was certainly work on the C++ Template library, Linux, Python, and 
 various utilities. But it is hard to know. In any case, I'd guess that 
 Monotone was a fairly tight knit community, and about 2007 most of the 
 developers left. I'd guess it was mostly a case of git and mercurial 
 becoming dominant, and possibly they also lost interest in DVCS and moved on 
 to other things.
 
 Numpy itself has gone through several of those transitions, and looking 
 back, I think one of the problems was that when Travis left for Enthought he 
 didn't officially hand off maintenance. The whole transition was a bit 
 lucky, with David, Pauli, and myself unofficially continuing the work for 
 the 1.3 and 1.4 releases. At that point I was hoping David could more or 
 less take over, but he graduated, and Pauli would have been an excellent 
 choice, but he took up his graduate studies. Turnover is a problem with open 
 source, and no matter how much discussion there is, if people aren't doing 
 the work the whole thing sort of peters out.
 
 Thanks for explaining yourself.   The tone you used earlier could have been
 mis-interpreted (though I would hope that people would look at your record of 
 contribution and give you the benefit of the doubt).   Your last sentence is 
 very true.   In this particular case, however, there is enough interest that 
 the whole thing will not peter out, but there is a strong chance that there 
 will be competing groups with divergent needs and interests vying for how the 
 project should develop.   
 
 There are many people who rely on NumPy and are concerned about its progress. 
   NumFocus was created to fight for resources to further the whole ecosystem 
 and not just rely on volunteers that are available.   I fundamentally do not 
 believe that model can scale.   There are, however, ways to keep things open
 source and allow people to work on NumPy as their day-job.  Several companies 
 now exist that benefit from the NumPy code base and will be interested in 
 seeing it grow.
 
 It is a mis-characterization to imply that I left the project without a 
 hand-off.   I never handed off the project because I never left it.   I was 
 very busy at Enthought.  I will still be busy now.   But, NumPy is very 
 important to me and has remained so.   I have spent a great deal of mental 
 effort trying to figure out how to contribute to its growth.   Yes, I allowed 
 other people to contribute significantly to the project and was very 
 receptive to their pull requests (even when I didn't think it was the most 
 urgent thing or something I actually disagreed with). 
 
 Well then, let's say you should have handed off, because you no longer had 
 the time to devote to it. You made the 1.2.1 release, and after that you 
 weren't really involved until recently. Now I'm sure that you didn't lose 
 interest, but you did lose the time, and I think it would have been better if 
 you had realized that fact up front.

I will grant you that.

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Fernando Perez

On Tue, Apr 24, 2012 at 10:02 PM,  josef.p...@gmail.com wrote:
 Sorry that I missed this part of numpy history, I always had the
 impression that numpy is run by a community led by Chuck and the young
 guys, David, Pauli, Stefan, Pierre; and Robert on the mailing list.
 (But I came late, and am just a balcony muppet.)

Travis, when you have a free minute (ha :)  it would be very nice if
you wrote up a blog post with some of the history from say the 2000s
with Numeric, through Numarray and into Numpy.  Some of us saw all
that happen first hand and know it well, but since most of it simply
happened on mailing lists, conferences and assorted meetings, it's
actually quite hard to understand that history if you arrive now.
It's not really written up anywhere, and nobody is going to read 10
years' worth of email archives :)

Guido a while back wrote a fantastic set of posts on the history of
python itself that I've greatly enjoyed:

http://python-history.blogspot.com/

something similar for numpy would be nice to have...

Though thinking more about it, perhaps a better alternative could be a
'history of the scipy world' where multiple people could write guest
posts about each project they've had a part of.  I think something
like that could be a lot of fun, and also useful :)

Cheers,

f

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Gael Varoquaux

On Tue, Apr 24, 2012 at 05:59:09PM -0600, Charles R Harris wrote:
 Travis, if you are playing the BDFL role, then just make the darn decision
 and remove the code so we can get on with life. As it is you go back and
 forth and that does none of us any good, you're a big guy and you're
 rocking the boat. I don't agree with that decision, I'd rather evolve the
 code we have, but I'm willing to compromise with your decision in this
 matter.

I think that Chuck's point here, in a thread on consensus, is very
important: sometimes design discussions stall. If, in such a situation, a
BDFL makes a decision, acknowledging that he has no divine power to see
the best of all options but needs to move on, it can help the project go
forward.

As long as nobody's feelings are hurt, a bit of dictatorship well used
moves a project forward. Of course, as with any leadership, it only works
because we as a community trust the leader.

My 2 cents,

Gael

Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Travis Oliphant

I've given several talks on the subject, but I don't think I've ever written a
blog-post about it. A reasonable history does exist in the beginning of
the Guide to NumPy, which is still available for free at

http://www.tramy.us/numpybook.pdf


-Travis





Re: [Numpy-discussion] What is consensus anyway

2012-04-23 Thread Matthew Brett

Hi,

On Sun, Apr 22, 2012 at 3:15 PM, Nathaniel Smith n...@pobox.com wrote:
 If you hang around big FOSS projects, you'll see the word consensus
 come up a lot. For example, the glibc steering committee recently
 dissolved itself in favor of governance directly by the consensus of
 the people active in glibc development[1]. It's the governing rule of
 the IETF, which defines many of the most important internet
 standards[2]. It is the primary way decisions are made on
 Wikipedia[3]. It's one of the fundamental aspects of accomplishing
 things within the Apache framework[4].

 [1] https://lwn.net/Articles/488778/
 [2] https://www.ietf.org/tao.html#getting.things.done
 [3] https://en.wikipedia.org/wiki/Wikipedia:Consensus
 [4] https://www.apache.org/foundation/voting.html

I think the big problem here is that Chuck (I hope I'm not
misrepresenting him) is not interested in discussion of process, and
the last time we had a specific thread on governance, Travis strongly
implied he was not very interested either, at least at the time.

In that situation, there's rather a high threshold to pass before
getting involved in the discussion, and I think you're seeing some
evidence for that.

So, as before, and as we discussed on gchat :) - whether this
discussion can go anywhere depends on Travis.   Travis - what do you
think?

See you,

Matthew

Re: [Numpy-discussion] What is consensus anyway

2012-04-23 Thread Nathaniel Smith

On Mon, Apr 23, 2012 at 1:04 AM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Sun, Apr 22, 2012 at 4:15 PM, Nathaniel Smith n...@pobox.com wrote:

 If you hang around big FOSS projects, you'll see the word consensus
 come up a lot. For example, the glibc steering committee recently
 dissolved itself in favor of governance directly by the consensus of
 the people active in glibc development[1]. It's the governing rule of
 the IETF, which defines many of the most important internet
 standards[2]. It is the primary way decisions are made on
 Wikipedia[3]. It's one of the fundamental aspects of accomplishing
 things within the Apache framework[4].

 [1] https://lwn.net/Articles/488778/
 [2] https://www.ietf.org/tao.html#getting.things.done
 [3] https://en.wikipedia.org/wiki/Wikipedia:Consensus
 [4] https://www.apache.org/foundation/voting.html

 But it turns out that this consensus thing is actually somewhat
 mysterious, and one that most programmers immersed in this culture
 pick up by osmosis. And numpy in particular has a lot of developers
 who are not coming from a classic FOSS programmer background! So this
 is my personal attempt to articulate what it is, and why requiring
 consensus is probably the best possible approach to project decision
 making.

 So what is consensus? Like, voting or something?
 ------------------------------------------------

 This is surprisingly subtle and specific.

 Consensus means something like, everyone who cares is satisfied
 with the result.

 It does *not* mean
 * Every opinion counts equally
 * We vote on anything
 * Every solution must be perfect and flawless
 * Every solution must leave everyone overjoyed
 * Everyone must sign off on every solution.

 It *does* mean
 * We invite people to speak up
 * We generally trust individuals to decide how important their opinion is
 * We generally trust individuals to decide whether or not they can
 live with some outcome
 * If they can't, then we take the time to find something better.

 One simple way of stating this is, everyone has a veto. In practice,
 such vetoes are almost never used, so this rule is not particularly
 illuminating on its own. Hence, the rest of this document.

 What a waste of time! That all sounds very pretty on paper, but we
 have stuff to get done.
 ------------------------------------------------------------------

 First, I'll note that this seemingly utopian scheme has a track record
 of producing such impractical systems as TCP/IP, SMTP, DNS, Apache,
 GCC, Linux, Samba, Python, ...


 Linux is Linus' private tree. Everything that goes in is his decision,
 everything that stays out is his decision. Of course, he delegates much of
 the work to people he trusts, but it doesn't even reach the level of a BDFL,
 it's DFL. As for consensus, it basically comes down to convincing the
 gatekeepers one level below Linus that your code might be useful. So bad
 example. Same with TCP/IP, which was basically Kahn and Cerf consulting with
 a few others and working by request of DARPA. GCC was Richard Stallman (I
 got one of the first tapes for a $30 donation), Python was Guido. Some of
 the projects later developed some form of governance but Guido, for
 instance, can veto anything he dislikes even if he is disinclined to do so.
 I'm not saying you're wrong about open source, I'm just saying that
 each project differs and it is wrong to imply that they follow some common
 form of governance under the rubric FOSS and that they all seek consensus.
 And they certainly don't *start* that way. And there are also plenty of
 projects that fail when the prime mover loses interest or folks get tired of
 the politics.

So a few points here:

Consensus-based decision-making is an ideal and a guide, not an
algorithm. There's nothing at all inconsistent between having a BDFL
and using consensus as the primary guide for decision making -- it
just means that the BDFL chooses to exercise their power in that way,
and is generally trusted to make judgement calls about specific cases.
See Fernando's reply down-thread for an example of this.

And I'm not saying that all FOSS projects follow some common form of
governance. But I am saying that there's a substantial amount of
shared development culture across most successful FOSS projects, and a
ton of experience on how to run a project successfully. Project
management is a difficult and arcane skill set, and one that's hard to
learn except through apprenticeship and osmosis. And it's definitely
not included in most courses on programming for scientists! So it'd be
nice if numpy could avoid having to re-make some of these mistakes...

But the other effect of this being cultural values rather than
something explicit and articulated is that sometimes you can't see it
from the outside. For example:

Linux: Technically, everything you say is true. In practice, good luck
convincing Linus or a subsystem maintainer to accept your patch when
other

Re: [Numpy-discussion] What is consensus anyway

2012-04-23 Thread Matthew Brett

Hi,

On Mon, Apr 23, 2012 at 12:33 PM, Nathaniel Smith n...@pobox.com wrote:
 On Mon, Apr 23, 2012 at 1:04 AM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 Linux is Linus' private tree. Everything that goes in is his decision,
 everything that stays out is his decision. Of course, he delegates much of
 the work to people he trusts, but it doesn't even reach the level of a BDFL,
 it's DFL. As for consensus, it basically comes down to convincing the
 gatekeepers one level below Linus that your code might be useful. So bad
 example. Same with TCP/IP, which was basically Kahn and Cerf consulting with
 a few others and working by request of DARPA. GCC was Richard Stallman (I
 got one of the first tapes for a $30 donation), Python was Guido. Some of
 the projects later developed some form of governance but Guido, for
 instance, can veto anything he dislikes even if he is disinclined to do so.
 I'm not saying you're wrong about open source, I'm just saying that
 each project differs and it is wrong to imply that they follow some common
 form of governance under the rubric FOSS and that they all seek consensus.
 And they certainly don't *start* that way. And there are also plenty of
 projects that fail when the prime mover loses interest or folks get tired of
 the politics.

[snip]

 Linux: Technically, everything you say is true. In practice, good luck
 convincing Linus or a subsystem maintainer to accept your patch when
 other people are raising substantive complaints. Here's an email I
 googled up in a few moments, in which Linus yells at people for trying
 to submit a patch to him without making sure that all interested
 parties have agreed:
  https://lkml.org/lkml/2009/9/14/481
 Stuff regularly sits outside the kernel tree in limbo for *years*
 while people debate different approaches back and forth.

To which I'd add:

In fact, for [Linus'] decisions to be received as legitimate, they
have to be consistent with the consensus of the opinions of
participating developers as manifest on Linux mailing lists. It is not
unusual for him to back down from a decision under the pressure of
criticism from other developers. His position is based on the
recognition of his fitness by the community of Linux developers and
this type of authority is, therefore, constantly subject to
withdrawal. His role is not that of a boss or a manager in the usual
sense. In the final analysis, the direction of the project springs
from the cumulative synthesis of modifications contributed by
individual developers.
http://shareable.net/blog/governance-of-open-source-george-dafermos-interview

See you,

Matthew

Re: [Numpy-discussion] What is consensus anyway

2012-04-23 Thread Travis Oliphant

 
 Linux: Technically, everything you say is true. In practice, good luck
 convincing Linus or a subsystem maintainer to accept your patch when
 other people are raising substantive complaints. Here's an email I
 googled up in a few moments, in which Linus yells at people for trying
 to submit a patch to him without making sure that all interested
 parties have agreed:
  https://lkml.org/lkml/2009/9/14/481
 Stuff regularly sits outside the kernel tree in limbo for *years*
 while people debate different approaches back and forth.
 
 To which I'd add:
 
 In fact, for [Linus'] decisions to be received as legitimate, they
 have to be consistent with the consensus of the opinions of
 participating developers as manifest on Linux mailing lists. It is not
 unusual for him to back down from a decision under the pressure of
 criticism from other developers. His position is based on the
 recognition of his fitness by the community of Linux developers and
 this type of authority is, therefore, constantly subject to
 withdrawal. His role is not that of a boss or a manager in the usual
 sense. In the final analysis, the direction of the project springs
 from the cumulative synthesis of modifications contributed by
 individual developers.
 http://shareable.net/blog/governance-of-open-source-george-dafermos-interview
 

This is the model that I have for NumPy development.   It is my view of how
NumPy has evolved already and how Numarray and Numeric evolved before it as
well.   I also feel like these things are fundamentally determined by the
people involved and by the personalities and styles of those who participate.
There certainly are globally applicable principles (like code review, building
consensus, and mutual respect) that are worth emphasizing over and over again.
If it helps, let's write those down and say these are the principles we live
by.   I am suspicious that you can go beyond this in formalizing the process,
as you ultimately are at the mercy of the people involved and their judgment
anyway.

I can also see that for the benefit of newcomers and occasional contributors it
can be beneficial to have some documentation of the natural, emergent methods
and interactions that apply to cooperative software development.   But I would
hesitate to put some kind of aura of authority around such a document that
implies the processes cannot be violated if good judgment demands that they
should be.   That is the basis of my hesitation to spend much time on
officially documenting our process.

Right now we are trying to balance difficult things:  stable releases with
experimental development.   The fact that we had such differences of opinion
last year on masked arrays / missing values and how to incorporate them into a
common object model means that we should not have committed the code to master
until we figured out a way to reconcile Nathaniel's concerns.   That is my
current view.   I was very enthused that we had someone contributing large
scale changes that clearly showed an ability to understand the code and
contribute to it --- that hadn't happened in a while.   I wanted to encourage
that.  I still do.

I think the process itself has shown that you can have an impact on NumPy just 
by voicing your opinion.   Clearly, you have more of an effect on NumPy by 
submitting pull requests, but NumPy development does listen carefully to the 
voices of users. 

Best, 

-Travis



 See you,
 
 Matthew

Re: [Numpy-discussion] What is consensus anyway

2012-04-23 Thread Matthew Brett

Hi,

On Mon, Apr 23, 2012 at 3:08 PM, Travis Oliphant tra...@continuum.io wrote:

 Linux: Technically, everything you say is true. In practice, good luck
 convincing Linus or a subsystem maintainer to accept your patch when
 other people are raising substantive complaints. Here's an email I
 googled up in a few moments, in which Linus yells at people for trying
 to submit a patch to him without making sure that all interested
 parties have agreed:
  https://lkml.org/lkml/2009/9/14/481
 Stuff regularly sits outside the kernel tree in limbo for *years*
 while people debate different approaches back and forth.

 To which I'd add:

 In fact, for [Linus'] decisions to be received as legitimate, they
 have to be consistent with the consensus of the opinions of
 participating developers as manifest on Linux mailing lists. It is not
 unusual for him to back down from a decision under the pressure of
 criticism from other developers. His position is based on the
 recognition of his fitness by the community of Linux developers and
 this type of authority is, therefore, constantly subject to
 withdrawal. His role is not that of a boss or a manager in the usual
 sense. In the final analysis, the direction of the project springs
 from the cumulative synthesis of modifications contributed by
 individual developers.
 http://shareable.net/blog/governance-of-open-source-george-dafermos-interview


 This is the model that I have for NumPy development.   It is my view of how 
 NumPy has evolved already and how Numarray, and Numeric evolved before it as 
 well.    I also feel like these things are fundamentally determined by the 
 people involved and by the personalities and styles of those who participate. 
    There certainly are globally applicable principles (like code review, 
 building consensus, and mutual respect) that are worth emphasizing over and 
 over again.   If it helps let's write those down and say these are the 
 principles we live by.   I am suspicious that you can go beyond this in 
 formalizing the process as you ultimately are at the mercy of the people 
 involved and their judgment, anyway.

I think writing it down would help enormously.  For example, if you do
agree to Nathaniel's view of consensus - *in principle* - and we write
that down and agree, we have a document to appeal to when we next run
into trouble.   Maybe the document could say something like:


We strive for consensus [some refs here].

Any substantial new feature is subject to consensus.

Only if all avenues for consensus have been documented, and exhausted,
will we [vote, defer to Travis, or some other tie-breaking thing].


Best,

Matthew

Re: [Numpy-discussion] What is consensus anyway

2012-04-23 Thread Chris Barker

On Mon, Apr 23, 2012 at 3:08 PM, Travis Oliphant tra...@continuum.io wrote:
 Right now we are trying to balance difficult things:  stable releases with 
 experimental development.

Perhaps a more formal development release system could help here.
IIUC, numpy pretty much has two things: the latest release (and past
ones) and master (and assorted experimental branches). If someone
develops a new feature, we can either:

have them submit a pull request, and people with the wherewithal
can pull it, compile it, and start testing it on their own -- history
shows that this is a small group.

merge it with master -- and hope it gets the testing it should before
it becomes part of a release, but: we are rightly hesitant to put
experimental stuff in master, and it really doesn't get that much
testing -- again, only folks that are building master will even see it.


Some projects have a more formal development release system.
wxPython, for instance, has had for years development releases with odd
numbers -- right now, the official release is 2.8.*, but there is a
2.9.* out there that is getting some use and testing. A couple of
things help make this work:

1) Robin makes the effort to put out binaries for development releases
-- it's easy to go get one and give it a try.

2) there is the wxversion system that makes it easy to install a new
version of wx, and easily switch between them (it's actually broken on
OS-X right now --- :-) ) -- this pre-dated virtualenv and friends,
maybe virtualenv is enough for this now.


Anyway, it's a thought -- I think some more real-world use of new
features before a real commitment to adopting them would be great.
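
As a rough illustration of the odd/even convention described above, here
is a minimal sketch. It assumes the parity lives in the second component
of the version string (as with wxPython's 2.8.* and 2.9.*); the function
name is hypothetical and is not part of wxPython or numpy.

```python
def is_development_release(version):
    """Return True if `version` has an odd minor number (e.g. '2.9.1').

    Hypothetical helper: assumes a 'major.minor[.micro...]' string in which
    an odd minor number marks a development series and an even minor number
    marks a stable series.
    """
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError("expected at least 'major.minor', got %r" % version)
    minor = int(parts[1])  # only the minor component decides the parity
    return minor % 2 == 1


for v in ("2.8.12", "2.9.3"):
    kind = "development" if is_development_release(v) else "stable"
    print(v, "->", kind)
```

A check like this could live in a release script, so that an odd-numbered
series is never announced as a stable release by accident.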

-Chris




-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov

Re: [Numpy-discussion] What is consensus anyway

2012-04-23 Thread Travis Oliphant

That is an excellent thought.

We could make the odd numbered releases experimental and the even-numbered as 
stable.  

That makes some sense.   What do others think?

-Travis



On Apr 23, 2012, at 5:46 PM, Chris Barker wrote:

 On Mon, Apr 23, 2012 at 3:08 PM, Travis Oliphant tra...@continuum.io wrote:
 Right now we are trying to balance difficult things:  stable releases with 
 experimental development.
 
 Perhaps a more formal development release system could help here.
 IIUC, numpy pretty much has two things: the latest release (and past
 ones) and master (and assorted experimental branches). If someone
 develops a new feature, we can either:
 
 have them submit a pull request, and people with the wherewithal
 can pull it, compile it, and start testing it on their own -- history
 shows that this is a small group.
 
 merge it with master -- and hope it gets the testing it should before
 it becomes part of a release, but: we are rightly hesitant to put
 experimental stuff in master, and it really doesn't get that much
 testing -- again only folks that are building master will even see it.
 
 
 Some projects have a more formal development release system.
 wxPython, for instance, has had for years development releases with odd
 numbers -- right now, the official release is 2.8.*, but there is a
 2.9.* out there that is getting some use and testing. A couple of
 things help make this work:
 
 1) Robin makes the effort to put out binaries for development releases
 -- it's easy to go get one and give it a try.
 
 2) there is the wxversion system that makes it easy to install a new
 version of wx, and easily switch between them (it's actually broken on
 OS-X right now --- :-) ) -- this pre-dated virtualenv and friends,
 maybe virtualenv is enough for this now.
 
 
 Anyway, it's a thought -- I think some more real-world use of new
 features before a real commitment to adopting them would be great.
 
 -Chris
 
 
 
 

Re: [Numpy-discussion] What is consensus anyway

2012-04-23 Thread Charles R Harris

On Mon, Apr 23, 2012 at 5:02 PM, Travis Oliphant tra...@continuum.io wrote:

 That is an excellent thought.

 We could make the odd numbered releases experimental and the
 even-numbered as stable.

 That makes some sense. What do others think?


I'm starting to think that a fork might be the best solution to the present
problem. There is plenty of precedent for forks in FOSS, for example GCC,
EGCS, Redhat 1.97, LLVM and emacs, xemacs. There are several semi-official
forks of linux (Android, the real time Kernel, etc.) Zeromq just forked,
OpenOffice forked, there was XFree86 forked to Xorg, etc. Linus encourages
forks, so there is even authority for that ;) Of course, the further the
fork diverges from the original the harder reintegration becomes, witness
Android and wake-locks. But a fork would cure a lot of contention.

Chuck

Re: [Numpy-discussion] What is consensus anyway

2012-04-23 Thread Stéfan van der Walt

On Mon, Apr 23, 2012 at 4:39 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
 I'm starting to think that a fork might be the best solution to the present
 problem.

If you are referring to the traditional concept of a fork, and not to
the type we frequently make on GitHub, then I'm surprised that no one
has objected already.  What would a fork solve? To paraphrase the
regexp saying: after forking, we'll simply have two problems.

It's really not that hard to focus our attention on technical issues
and to reach consensus.

Stéfan

Re: [Numpy-discussion] What is consensus anyway

2012-04-23 Thread Fernando Perez

On Mon, Apr 23, 2012 at 4:02 PM, Travis Oliphant tra...@continuum.io wrote:
 That is an excellent thought.

 We could make the odd numbered releases experimental and the even-numbered 
 as stable.

 That makes some sense.    What do others think?

I think the concern with that is manpower: it effectively requires
keeping two complete projects alive in parallel.  As far as I
know, a number of projects that used to have that model have backed off
(the Linux kernel included) to better enable a limited team to focus
on development.  I'm skeptical that numpy has the manpower to sustain
that approach.

Cheers,

f

Re: [Numpy-discussion] What is consensus anyway

2012-04-23 Thread Fernando Perez

On Mon, Apr 23, 2012 at 8:49 PM, Stéfan van der Walt ste...@sun.ac.za wrote:
 If you are referring to the traditional concept of a fork, and not to
 the type we frequently make on GitHub, then I'm surprised that no one
 has objected already.  What would a fork solve? To paraphrase the
 regexp saying: after forking, we'll simply have two problems.

I concur with you here: github 'forks', yes, as many as possible!
Hopefully every one of those will produce one or more PRs :)  But a
fork in the sense of a divergent parallel project?  I think that would
only be indicative of a complete failure to find a way to make
progress here, and I doubt we're anywhere near that state.

That forks are *possible* is indeed a valuable and important option in
open source software, because it means that a truly dysfunctional
original project team/direction can't hold a community hostage
forever.  But that doesn't mean that full-blown forks should be
considered lightly, as they also carry enormous costs.

I see absolutely nothing in the current scenario to suggest, even
remotely, that a full-blown fork would be a good idea, and I hope I'm
right.  It seems to me we're making progress on problems that led to
real difficulties last year, and from multiple parties I see signs
that give me reason to be optimistic that the project is getting
better, not worse.

Cheers,

f

Re: [Numpy-discussion] What is consensus anyway

2012-04-22 Thread Charles R Harris

On Sun, Apr 22, 2012 at 4:15 PM, Nathaniel Smith n...@pobox.com wrote:

 If you hang around big FOSS projects, you'll see the word consensus
 come up a lot. For example, the glibc steering committee recently
 dissolved itself in favor of governance directly by the consensus of
 the people active in glibc development[1]. It's the governing rule of
 the IETF, which defines many of the most important internet
 standards[2]. It is the primary way decisions are made on
 Wikipedia[3]. It's one of the fundamental aspects of accomplishing
 things within the Apache framework[4].

 [1] https://lwn.net/Articles/488778/
 [2] https://www.ietf.org/tao.html#getting.things.done
 [3] https://en.wikipedia.org/wiki/Wikipedia:Consensus
 [4] https://www.apache.org/foundation/voting.html

 But it turns out that this consensus thing is actually somewhat
 mysterious, and most programmers immersed in this culture
 pick it up by osmosis. And numpy in particular has a lot of developers
 who are not coming from a classic FOSS programmer background! So this
 is my personal attempt to articulate what it is, and why requiring
 consensus is probably the best possible approach to project decision
 making.

 So what is consensus? Like, voting or something?
 ------------------------------------------------

 This is surprisingly subtle and specific.

 Consensus means something like, "everyone who cares is satisfied
 with the result."

 It does *not* mean
 * Every opinion counts equally
 * We vote on anything
 * Every solution must be perfect and flawless
 * Every solution must leave everyone overjoyed
 * Everyone must sign off on every solution.

 It *does* mean
 * We invite people to speak up
 * We generally trust individuals to decide how important their opinion is
 * We generally trust individuals to decide whether or not they can
 live with some outcome
 * If they can't, then we take the time to find something better.

 One simple way of stating this is, "everyone has a veto." In practice,
 such vetoes are almost never used, so this rule is not particularly
 illuminating on its own. Hence, the rest of this document.

 What a waste of time! That all sounds very pretty on paper, but we
 have stuff to get done.
 -------------------------------------------------------------------

 First, I'll note that this seemingly utopian scheme has a track record
 of producing such impractical systems as TCP/IP, SMTP, DNS, Apache,
 GCC, Linux, Samba, Python, ...


Linux is Linus' private tree. Everything that goes in is his decision,
everything that stays out is his decision. Of course, he delegates much of
the work to people he trusts, but it doesn't even reach the level of a
BDFL, it's DFL. As for consensus, it basically comes down to convincing the
gatekeepers one level below Linus that your code might be useful. So bad
example. Same with TCP/IP, which was basically Kahn and Cerf consulting
with a few others and working by request of DARPA. GCC was Richard Stallman
(I got one of the first tapes for a $30 donation), Python was Guido. Some
of the projects later developed some form of governance but Guido, for
instance, can veto anything he dislikes even if he is disinclined to do so.
I'm not saying you're wrong about open source, I'm just saying that
each project differs and it is wrong to imply that they follow some common
form of governance under the rubric FOSS and that they all seek consensus.
And they certainly don't *start* that way. And there are also plenty of
projects that fail when the prime mover loses interest or folks get tired
of the politics.

 But mere empirical results are often less convincing than a good
 story, so I will give you two. Why does a requirement for consensus
 work?

 Reason 1 (for optimists): *All of us are smarter than any of us.* For
 a complex project with many users, it's extraordinarily difficult for
 any one person to understand the full ramifications of any decision,
 particularly the sort of far-reaching architectural decisions that are
 most important. It's even more difficult to understand all the
 possibilities of all the different possible solutions. In fact, it's
 *extremely* common that the correct solution to a problem is the one
 that no-one thinks of until after a month of annoying debate. Spending
 a month to avoid an architectural problem that will haunt us for years
 is an *excellent* trade-off, even if it feels interminable at the
 time. Even two months. Usually disagreements are an indication that a
 better solution is possible, even when it's not clear what that would
 be.

 Reason 2 (for pessimists): *You **will** reach consensus sooner or
 later; it's less painful to do up front.* Example: NA handling. There
 are two schemes that people use for this right now -- numpy.ma and
 ugly NaN kluges (see e.g. nanmean). These are generally agreed to be
 suboptimal. Recently, two new contenders have shown up: the NEP
 masked-NA support currently in master, and the

Re: [Numpy-discussion] What is consensus anyway

2012-04-22 Thread Fernando Perez

Hi Nathaniel,

thanks for a solid writeup of this topic.  I just want to add a note
from personal experience, regarding this specific point:

On Sun, Apr 22, 2012 at 3:15 PM, Nathaniel Smith n...@pobox.com wrote:
 Usually disagreements are an indication that a
 better solution is possible, even when it's not clear what that would
 be.

I think this is *extremely* important, so I want to highlight it from
the rest of your post.  Regarding how IPython operates, I think we
have good evidence to illustrate the value of this... One of the
members of the IPython team who joined earliest is Brian Granger: he
started working on IPython around 2004 after a conversation we had in
the context of a SciPy conference.  Some of you may know that Brian
and I went to graduate school together, which means we've known each
other for much longer than IPython, and we've been good friends since.
 But that alone doesn't ensure a smooth collaboration; in fact Brian
and I extremely often disagree *deeply* on design decisions about
IPython.

And yet, I think the project makes solid progress, not despite this
but in an important way *thanks* to this divergence.  Whenever we
disagree, it typically means that each of us is seeing a partial
solution to a problem, but not a really solid and complete one.  I
don't recall ever using my 'BDFL vote' in one of these discussions;
instead we just keep going around the problem.  Typically what happens
is that after much discussion, we settle on a new solution that
neither of us had quite seen at the start. I mention Brian
specifically because he and I seem to be at opposite ends of some
weird spectrum; disagreement between the other parties appears to fall
somewhere in between.

Here's an example that is currently in open discussion, and despite
the fact that I'm completely convinced that something like this should
go into IPython, I'm waiting.  We'll continue the discussion to either
find arguments that convince me otherwise, or to convince Brian of the
value of the PR:

https://github.com/ipython/ipython/pull/1343

It takes both patience and trust for this to work: we have to be
willing to wait out the long discussion, and we have to trust that
despite how much we may disagree on something, we both play fair and
ultimately only want what's best for the project.  That means giving
the other party the benefit of the doubt at every turn, and having a
willingness to let the discussion happen openly as long as is
necessary for the project to remain healthy.

For example, in this case, I'm *really* convinced of my point, and I
think blocking this PR actively hurts users.  Is it worth saying "OK,
I'm overriding your concerns here and pushing this forward"?
Absolutely NOT!  I'd only:

- alienate Brian, a key member of the project without whom IPython
would be nowhere near where it is today, and decrease his motivation
to continue working
- kill the opportunity for a discussion to produce an even cleaner
solution than what we've seen so far
- piss off a good friend.  I put this last because while that's
actually a very important reason for me, the fact that Brian and I are
good personal friends is secondary here: this is about discussion
between contributors independent of their personal relationships.

I hope this perspective is useful...

 1. Make it as easy as possible for people to see what's going on and
 join the discussion. All decisions and reasoning behind decisions take
 place in public. (On this note, it would be *really* good if pull
 request notifications went to the list.)

If anyone knows how to do this, let me know; I'd like to do the same
for IPython and our -dev list.
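
One possible shape for such a notifier, as a rough sketch only: it formats a single pull request record as a plain-text email using the standard library. It assumes the record is a dict with `number`, `title`, `html_url` and `user.login` fields, like the pull request objects returned by GitHub's API. The function name, the addresses and the example record are hypothetical. Fetching the records and actually sending the mail (for instance with `smtplib`) are left out.

```python
from email.message import EmailMessage


def pull_request_to_email(pr, list_address, sender):
    """Format one pull request record as a plain-text email for a list."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = list_address
    # Put the PR title in the subject so readers can tell PRs apart.
    msg["Subject"] = "[PR #{0}] {1}".format(pr["number"], pr["title"])
    body = "{0} opened pull request #{1}:\n\n{2}\n\n{3}\n".format(
        pr["user"]["login"], pr["number"], pr["title"], pr["html_url"])
    msg.set_content(body)
    return msg


# Hypothetical record, for illustration only.
example_pr = {
    "number": 1,
    "title": "Example change",
    "html_url": "https://example.org/project/pull/1",
    "user": {"login": "someone"},
}
example_msg = pull_request_to_email(
    example_pr, "dev-list@example.org", "pr-bot@example.org")

if __name__ == "__main__":
    print(example_msg)
```

The list admins would still need to allow mail from whatever sender address is used.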

Cheers,

f