Re: Recruiting more maintainers for Apache Arrow

2018-06-30 Thread Donald E. Foss
For what it's worth, this email thread and your summary writeup, Wes, are a 
significant call to action on their own. 

I've been passive, not by choice, but by policy. Given the significance and 
need of this project, I'll see what I can do on my side. It will be at least a 
week given the US holiday. 

Donald E. Foss

> On Jun 30, 2018, at 2:15 PM, Marco Neumann wrote:
> 
> Hey,
> 
> first of all, thanks a lot for your, Uwe's, the mergers' and the
> contributors' work. Now, to the maintainer problem:
> 
> # Arrow as "a library"
> One thing that makes Arrow special is that it is not a single, but many
> libraries (one for each language) and many of them are not only a
> binding to a C/C++ lib, but partly a complete re-implementation of the
> protocol, e.g.:
> 
> - C++: one core, but also contains Python specialties
> - Java: another core
> - Rust: yet another core
> - Python: a binding to C++ but also a lot more stuff because of Pandas
> ...
> 
> And you two are maintaining all of them and I doubt that you have the
> capacities and knowledge to do this at the desired level of quality
> (which is natural, not a personal issue or offense). So this I would
> call "pseudo-maintenance", since you're solely the gatekeeper that does
> some shallow reviewing and has the burden to do the housekeeping and
> the merging. So why accept these language bindings in the first
> place without having a core maintainer in place? For example, let's
> say someone proposes a binding to Haskell now. That should not be
> accepted as part of the official Apache implementation without a
> dedicated maintainer (ideally the PR author would be that person, but
> there may be others who step up).
> 
> Right now, it might be too late to remove some of the incomplete / WIP
> implementations that don't have a core maintainer though.
> 
> # GitHub
> Another special thing to consider is that Arrow is (ab)using GitHub as
> a code hosting platform. Even as a contributor, this has obvious bad
> uncool consequences:
> 
> - you have yet another issue hosting system to log in to
> - there is yet another information channel to keep track of (this ML
>   for example, which has a semi-informative web interface telling you
>   that you can only log in using Google but does not tell you how to
>   subscribe to the list)
> - links to issues don't work in the known magic way
> - you're merging the PRs by closing them, which is not a very nice
>   way because it does not reflect the contributor's work in the
>   project overview and personal profiles, but exactly this is a
>   large part of the GitHub community (btw: merging PRs without using
>   GitHub's merge button IS possible, as bors/bors-ng prove)
> 
> So as a potential maintainer, this is already a bummer, since I know
> that there are things less comfortable than the system I would get from
> any normal GitHub or Gitlab project.
> 
> I'm not really sure how to solve this or if it should be solved (read
> about the laziness aspect in "Contribution VS Maintenance" below)
> 
> # Time / Payment
> Yes, this is indeed a big issue. From what I can tell from the open
> source projects I was involved in is that for large contributor crowds,
> you normally have full/half-time positions in place for the core
> maintainer (look at the Mozilla projects, the Blender Foundation, Gnome
> / Red Hat). So at some point I think maintaining isn't a part-time /
> hobby thing anymore (without downplaying the hard work of hobby
> contributors, on the contrary). I don't have a link at hand, but I recall
> some discussion about GitHub and its importance for hiring (since it
> acts as a CV) after MS bought it, and some of the responses were
> "doing all this work in your free time is a privilege of wealthy,
> mostly-white men", which, without endorsing this statement in this really
> bare form, already shows a problem of the open source world.
> 
> # Contribution VS Maintenance
> The very "nice" thing about patch/PR contribution is that you do your
> work and then you can walk away and it's the maintainer's problem to
> release the artifact, upgrade/migrate your code and ensure that the
> tests you've written never break. It's comfortable. Being a maintainer
> means all the opposite things. And in the end, you get blamed for not
> supporting certain features (see the open source paragraph here:
> https://blog.ghost.org/5/ ) or for security disasters (remember the
> OpenSSL disaster).
> 
> I think together with the previous point this means, we have to get
> companies to pay for that work, and not just dump their features to an
> OSS repo.
> 
> # Path to Maintainership
> So I think (from my narrow point of view!) that many people expect that
> the path from "outsider" to "maintainer" takes the route over "a lot of
> patch/PR contributions". If I'm reading your mail right, that is not
> necessarily the case for Apache projects and I think that's great. The
> "review PRs" path sounds great, but I think GitHub or any platform I'm
> aware of doesn't do a

Re: Recruiting more maintainers for Apache Arrow

2018-06-30 Thread Wes McKinney
hi Antoine,

On Sat, Jun 30, 2018 at 2:35 PM, Antoine Pitrou wrote:
>
> Hi Wes,
>
>> I'm not sure what's the best way to address this problem. The quality
>> of our code review has declined at times as we struggle to keep up
>> with the flow of patches -- I don't think this is good. Having the
>> patch queue pile up isn't great either.
>
> I'd like to do more reviews but due to the breadth of topics and
> technologies in our code base I don't feel competent for many of the PRs
> that are being posted.

You are one of the top 3 maintainers (by # of patches merged) in 2018, and
the newest committer, so there is no need to apologize for anything.

>
> For example, on a Rust PR I may do a brief review of concepts, APIs or
> general cleanliness, but not much more.
>
>> Personally, I'm having a
>> difficult time balancing project maintenance and patch authoring,
>> particularly in the last 6 months.
>
> I think it's ok to spend most of your time on reviewing and project
> maintenance.

That's what I will do for a while, but honestly it is creating a lot
of stress for me because we are not progressing very quickly towards a
feature-complete iteration of the columnar format and the ability to
do a 1.0 release. If I were able to spend more time writing patches, I
feel I could put more pressure on the project to reach that point
sooner.

>
>> Any thoughts about how we can grow the maintainership? Somehow we need
>> to reach ~5-6 core maintainers over the next year.
>
> Or more of them, if we want all topics to be covered by at least 1-2
> maintainers.

Agreed. As an example, Kou has done an excellent job maintaining the
C/GLib subproject and has been super responsive dealing with debugging
and packaging / release management issues.

>
> Regards
>
> Antoine.


Re: Recruiting more maintainers for Apache Arrow

2018-06-30 Thread Wes McKinney
hi Marco,

some comments inline

On Sat, Jun 30, 2018 at 2:15 PM, Marco Neumann wrote:
> Hey,
>
> first of all, thanks a lot for your, Uwe's, the mergers' and the
> contributors' work. Now, to the maintainer problem:
>
> # Arrow as "a library"
> One thing that makes Arrow special is that it is not a single, but many
> libraries (one for each language) and many of them are not only a
> binding to a C/C++ lib, but partly a complete re-implementation of the
> protocol, e.g.:
>
> - C++: one core, but also contains Python specialties
> - Java: another core
> - Rust: yet another core
> - Python: a binding to C++ but also a lot more stuff because of Pandas
> ...
>
> And you two are maintaining all of them and I doubt that you have the
> capacities and knowledge to do this at the desired level of quality
> (which is natural, not a personal issue or offense). So this I would
> call "pseudo-maintenance", since you're solely the gatekeeper that does
> some shallow reviewing and has the burden to do the housekeeping and
> the merging. So why accept these language bindings in the first
> place without having a core maintainer in place? For example, let's
> say someone proposes a binding to Haskell now. That should not be
> accepted as part of the official Apache implementation without a
> dedicated maintainer (ideally the PR author would be that person, but
> there may be others who step up).

The most development activity, and where we have the most need of
help, is in C++ and Python. The other area is in dev/CI infrastructure
and release management.

We're falling behind on implementation and design work involving
Java-land (I have been trying for about a year to hammer down an
improved Interval type), but that's a separate problem.

We are about to reach a point (particularly if Gandiva becomes part of
Apache Arrow) where more languages will become dependent on the C++
library. This makes the need for more C++ maintainers even more
urgent.

I think the other libraries have done a good job of self-managing
their code (e.g. Java, JavaScript), and I frequently merge patches
when there is a +1 or some other consensus.

>
> Right now, it might be too late to remove some of the incomplete / WIP
> implementations that don't have a core maintainer though.

Honestly, the incomplete/WIP projects are not causing any maintenance
burden. It's the main projects and their development lifecycle that is
creating a lot of work.

>
> # GitHub
> Another special thing to consider is that Arrow is (ab)using GitHub as
> a code hosting platform. Even as a contributor, this has obvious bad
> uncool consequences:

I think these issues are red herrings. If maintainers are more
motivated by the gamification of their open source contributions
rather than the health and success of the project, I really question
how valuable of a maintainer they are.

>
> - you have yet another issue hosting system to log in to

I strongly dispute the notion that using JIRA is a deterrent to
maintainers. If anyone, it's a filter for drive-by contributors and
unserious maintainers. I say this as the project's primary JIRA
gardener.

> - there is yet another information channel to keep track of (this ML
>   for example, which has a semi-informative web interface telling you
>   that you can only log in using Google but does not tell you how to
>   subscribe to the list)
> - links to issues don't work in the known magic way

I think these things might deter passers-by, but I don't see why they
would be a problem for someone who is concerned with the health of the
project. As the primary maintainer of the project, these things don't
impact me in any way.

> - you're merging the PRs by closing them, which is not a very nice
>   way because it does not reflect the contributor's work in the
>   project overview and personal profiles, but exactly this is a
>   large part of the GitHub community (btw: merging PRs without using
>   GitHub's merge button IS possible, as bors/bors-ng prove)

For each patch you contribute, you get one contribution "point" on
GitHub, but it won't show that you have a PR "merged". I don't see why
we should have to comply with GitHub's gamified approach to open
source.

>
> So as a potential maintainer, this is already a bummer, since I know
> that there are things less comfortable than the system I would get from
> any normal GitHub or Gitlab project.
>
> I'm not really sure how to solve this or if it should be solved (read
> about the laziness aspect in "Contribution VS Maintenance" below)

I don't mean to be too dismissive of these concerns (they are common;
people have a difficult time with change) -- I've long been critical
of people concerned with their "GitHub High Score". See some writing
on this from a while ago:
http://wesmckinney.com/blog/github-open-source-contributions/

>
> # Time / Payment
> Yes, this is indeed a big issue. From what I can tell from the open
> source projects I was involved in is that for large contributor crowds,
> you 

Re: Recruiting more maintainers for Apache Arrow

2018-06-30 Thread Marco Neumann
Hey,

first of all, thanks a lot for your, Uwe's, the mergers' and the
contributors' work. Now, to the maintainer problem:

# Arrow as "a library"
One thing that makes Arrow special is that it is not a single, but many
libraries (one for each language) and many of them are not only a
binding to a C/C++ lib, but partly a complete re-implementation of the
protocol, e.g.:

- C++: one core, but also contains Python specialties
- Java: another core
- Rust: yet another core
- Python: a binding to C++ but also a lot more stuff because of Pandas
...

And you two are maintaining all of them and I doubt that you have the
capacities and knowledge to do this at the desired level of quality
(which is natural, not a personal issue or offense). So this I would
call "pseudo-maintenance", since you're solely the gatekeeper that does
some shallow reviewing and has the burden to do the housekeeping and
the merging. So why accept these language bindings in the first
place without having a core maintainer in place? For example, let's
say someone proposes a binding to Haskell now. That should not be
accepted as part of the official Apache implementation without a
dedicated maintainer (ideally the PR author would be that person, but
there may be others who step up).

Right now, it might be too late to remove some of the incomplete / WIP
implementations that don't have a core maintainer though.

# GitHub
Another special thing to consider is that Arrow is (ab)using GitHub as
a code hosting platform. Even as a contributor, this has obvious bad
uncool consequences:

- you have yet another issue hosting system to log in to
- links to issues don't work in the known magic way
- you're merging the PRs by closing them, which is not a very nice
  way because it does not reflect the contributor's work in the
  project overview and personal profiles, but exactly this is a
  large part of the GitHub community (btw: merging PRs without using
  GitHub's merge button IS possible, as bors/bors-ng prove)

So as a potential maintainer, this is already a bummer, since I know
that there are things less comfortable than the system I would get from
any normal GitHub or Gitlab project.

I'm not really sure how to solve this or if it should be solved (read
about the laziness aspect in "Contribution VS Maintenance" below)

# Time / Payment
Yes, this is indeed a big issue. From what I can tell from the open
source projects I was involved in is that for large contributor crowds,
you normally have full/half-time positions in place for the core
maintainer (look at the Mozilla projects, the Blender Foundation, Gnome
/ Red Hat). So at some point I think maintaining isn't a part-time /
hobby thing anymore (without downplaying the hard work of hobby
contributors, on the contrary). I don't have a link at hand, but I recall
some discussion about GitHub and its importance for hiring (since it
acts as a CV) after MS bought it, and some of the responses were
"doing all this work in your free time is a privilege of wealthy,
mostly-white men", which, without endorsing this statement in this really
bare form, already shows a problem of the open source world.

# Contribution VS Maintenance
The very "nice" thing about patch/PR contribution is that you do your
work and then you can walk away and it's the maintainer's problem to
release the artifact, upgrade/migrate your code and ensure that the
tests you've written never break. It's comfortable. Being a maintainer
means all the opposite things. And in the end, you get blamed for not
supporting certain features (see the open source paragraph here:
https://blog.ghost.org/5/ ) or for security disasters (remember the
OpenSSL disaster).

I think together with the previous point this means, we have to get
companies to pay for that work, and not just dump their features to an
OSS repo.

# Path to Maintainership
So I think (from my narrow point of view!) that many people expect that
the path from "outsider" to "maintainer" takes the route over "a lot of
patch/PR contributions". If I'm reading your mail right, that is not
necessarily the case for Apache projects and I think that's great. The
"review PRs" path sounds great, but I think GitHub or any platform I'm
aware of doesn't do a good job of getting people to do so. I mean, I see a
PR and I can leave a review, but for me it is not really clear what
consequences this has (naturally, random people don't have a veto on
changes). So I can jump in when I think something is wrong, but I
cannot approve a PR. This makes sense, but it poses the question of
"how?!". I mean, it is pretty clear how to become a patch/PR
contributor, but it is not clear how to become a maintainer, at
least not in an easy way. (I'm sure it's written down somewhere).

So, overall I think a clear Call for Action at the top of the README
could help. Like "Hey, we're looking for maintainers, you could start
by reviewing some PRs and after some reviews maintainers will just be
the last gatekeeper and after some more time, 

Re: Recruiting more maintainers for Apache Arrow

2018-06-30 Thread Antoine Pitrou


Hi Wes,

> I'm not sure what's the best way to address this problem. The quality
> of our code review has declined at times as we struggle to keep up
> with the flow of patches -- I don't think this is good. Having the
> patch queue pile up isn't great either.

I'd like to do more reviews but due to the breadth of topics and
technologies in our code base I don't feel competent for many of the PRs
that are being posted.

For example, on a Rust PR I may do a brief review of concepts, APIs or
general cleanliness, but not much more.

> Personally, I'm having a
> difficult time balancing project maintenance and patch authoring,
> particularly in the last 6 months.

I think it's ok to spend most of your time on reviewing and project
maintenance.

> Any thoughts about how we can grow the maintainership? Somehow we need
> to reach ~5-6 core maintainers over the next year.

Or more of them, if we want all topics to be covered by at least 1-2
maintainers.

Regards

Antoine.


Re: Recruiting more maintainers for Apache Arrow

2018-06-30 Thread Marco Neumann
Hey,

first of all, thanks a lot for your, Uwe's, the mergers' and the
contributors' work. Now, to the maintainer problem:

# Arrow as "a library"
One thing that makes Arrow special is that it is not a single, but many
libraries (one for each language) and many of them are not only a
binding to a C/C++ lib, but partly a complete re-implementation of the
protocol, e.g.:

- C++: one core, but also contains Python specialties
- Java: another core
- Rust: yet another core
- Python: a binding to C++ but also a lot more stuff because of Pandas
...

And you two are maintaining all of them and I doubt that you have the
capacities and knowledge to do this at the desired level of quality
(which is natural, not a personal issue or offense). So this I would
call "pseudo-maintenance", since you're solely the gatekeeper that does
some shallow reviewing and has the burden to do the housekeeping and
the merging. So why accept these language bindings in the first
place without having a core maintainer in place? For example, let's
say someone proposes a binding to Haskell now. That should not be
accepted as part of the official Apache implementation without a
dedicated maintainer (ideally the PR author would be that person, but
there may be others who step up).

Right now, it might be too late to remove some of the incomplete / WIP
implementations that don't have a core maintainer though.

# GitHub
Another special thing to consider is that Arrow is (ab)using GitHub as
a code hosting platform. Even as a contributor, this has obvious bad
uncool consequences:

- you have yet another issue hosting system to log in to
- there is yet another information channel to keep track of (this ML
  for example, which has a semi-informative web interface telling you
  that you can only log in using Google but does not tell you how to
  subscribe to the list)
- links to issues don't work in the known magic way
- you're merging the PRs by closing them, which is not a very nice
  way because it does not reflect the contributor's work in the
  project overview and personal profiles, but exactly this is a
  large part of the GitHub community (btw: merging PRs without using
  GitHub's merge button IS possible, as bors/bors-ng prove)

So as a potential maintainer, this is already a bummer, since I know
that there are things less comfortable than the system I would get from
any normal GitHub or Gitlab project.

I'm not really sure how to solve this or if it should be solved (read
about the laziness aspect in "Contribution VS Maintenance" below)

# Time / Payment
Yes, this is indeed a big issue. From what I can tell from the open
source projects I was involved in is that for large contributor crowds,
you normally have full/half-time positions in place for the core
maintainer (look at the Mozilla projects, the Blender Foundation, Gnome
/ Red Hat). So at some point I think maintaining isn't a part-time /
hobby thing anymore (without downplaying the hard work of hobby
contributors, on the contrary). I don't have a link at hand, but I recall
some discussion about GitHub and its importance for hiring (since it
acts as a CV) after MS bought it, and some of the responses were
"doing all this work in your free time is a privilege of wealthy,
mostly-white men", which, without endorsing this statement in this really
bare form, already shows a problem of the open source world.

# Contribution VS Maintenance
The very "nice" thing about patch/PR contribution is that you do your
work and then you can walk away and it's the maintainer's problem to
release the artifact, upgrade/migrate your code and ensure that the
tests you've written never break. It's comfortable. Being a maintainer
means all the opposite things. And in the end, you get blamed for not
supporting certain features (see the open source paragraph here:
https://blog.ghost.org/5/ ) or for security disasters (remember the
OpenSSL disaster).

I think together with the previous point this means, we have to get
companies to pay for that work, and not just dump their features to an
OSS repo.

# Path to Maintainership
So I think (from my narrow point of view!) that many people expect that
the path from "outsider" to "maintainer" takes the route over "a lot of
patch/PR contributions". If I'm reading your mail right, that is not
necessarily the case for Apache projects and I think that's great. The
"review PRs" path sounds great, but I think GitHub or any platform I'm
aware of doesn't do a good job of getting people to do so. I mean, I see a
PR and I can leave a review, but for me it is not really clear what
consequences this has (naturally, random people don't have a veto on
changes). So I can jump in when I think something is wrong, but I
cannot approve a PR. This makes sense, but it poses the question of
"how?!". I mean, it is pretty clear how to become a patch/PR
contributor, but it is not clear how to become a maintainer, at
least not in an easy way. (I'm sure it's written down somewhere).

So, overall I think a clear Call 

Re: Recruiting more maintainers for Apache Arrow

2018-06-30 Thread Holden Karau
One of the things I’ve started doing in the Spark project is live code
reviews to encourage other folks to get involved in the review process and
help it seem more achievable (see
https://www.youtube.com/playlist?list=PLRLebp9QyZtYF46jlSnIu2x1NDBkKa2uw ).

Another thing that I think has helped us is making it clear that one of the steps to
becoming a committer (something often valued by corporate employers) is
being involved in the review process.

I don’t know how much this applies, but some of the committers have also
found our PR dashboard, which gives a view of PRs that are ready to merge,
organized by area, to be helpful (see
http://spark-prs.appspot.com ).

YMMV of course, but this is a problem which I spend a lot of time thinking
about (only sometimes with answers), so I'm really interested to see where the
discussion goes.

I gave a somewhat related talk (Dealing with Contributor Overload) at FOSS
Backstage recently:
https://youtu.be/XS8cTLAuHUw

I’m not really all that involved with the Arrow project but if folks would
be open to it I’d be happy to add it to my list of projects I do livestream
reviews with.

On Sat, Jun 30, 2018 at 7:58 AM Wes McKinney wrote:

> hi folks,
>
> Arrow has grown by leaps and bounds over the last 2.5 years. We are
> approaching our 2000th patch and on track to surpass 200 unique
> contributors by year end.
>
> All this contribution growth is great, but it has a hidden cost: the
> maintenance. The burden of maintaining the project, particularly
> reviewing and merging patches, has fallen on a very small number of
> people. From the commit logs, we can see how many patches each
> committer has merged:
>
> $ git shortlog -csn d5aa7c46692474376a3c31704cfc4783c86338f2..master
>   1289  Wes McKinney
>    268  Uwe L. Korn
>     74  Korn, Uwe
>     54  Antoine Pitrou
>     52  Julien Le Dem
>     39  Philipp Moritz
>     18  Kouhei Sutou
>     18  Steven Phillips
>     13  Bryan Cutler
>     11  Jacques Nadeau
>     10  Phillip Cloud
>      8  Brian Hulette
>      5  Robert Nishihara
>      5  adeneche
>      4  GitHub
>      3  Sidd
>      3  siddharth
>      1  AbdelHakim Deneche
>      1  Your Name Here
>
> So Uwe and I have merged ~84% of the patches in the project so far.
> This isn't a completely accurate reflection of the maintainer burden,
> since many others contribute to code reviews and other aspects of
> patch maintenance, and you have to be a committer to earn a place on
> this list.
>
> I'm not sure what's the best way to address this problem. The quality
> of our code review has declined at times as we struggle to keep up
> with the flow of patches -- I don't think this is good. Having the
> patch queue pile up isn't great either. Personally, I'm having a
> difficult time balancing project maintenance and patch authoring,
> particularly in the last 6 months.
>
> Unfortunately, many people believe that writing patches is the primary
> mode of contribution to an open source project. Apache projects
> explicitly state that non-patch contributions are valued in earning
> karma (committership and PMC membership). We're starting to have more
> corporate contributors come out of the woodwork, and while it's great
> for contributors to be paid to write patches for the project, they are
> rarely given the time and space to contribute meaningfully to
> maintenance.
>
> Any thoughts about how we can grow the maintainership? Somehow we need
> to reach ~5-6 core maintainers over the next year.
>
> Thanks,
> Wes
>
-- 
Twitter: https://twitter.com/holdenkarau


[jira] [Created] (ARROW-2771) [JS] Add row proxy object accessor

2018-06-30 Thread Brian Hulette (JIRA)
Brian Hulette created ARROW-2771:


             Summary: [JS] Add row proxy object accessor
                 Key: ARROW-2771
                 URL: https://issues.apache.org/jira/browse/ARROW-2771
             Project: Apache Arrow
          Issue Type: Improvement
          Components: JavaScript
            Reporter: Brian Hulette
            Assignee: Brian Hulette


The {{Table}} class would be much easier to interact with if it returned
familiar JavaScript objects representing a row. As Jeff Heer
[demonstrated|https://beta.observablehq.com/@jheer/from-apache-arrow-to-javascript-objects],
it's possible to create JS Proxy objects that read directly from Arrow memory.
We should generate these types of objects in {{Table.get}} and in the {{Table}}
iterator.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-2770) [Python] Account for conda-forge compiler migration in conda recipes

2018-06-30 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-2770:
---

             Summary: [Python] Account for conda-forge compiler migration in conda recipes
                 Key: ARROW-2770
                 URL: https://issues.apache.org/jira/browse/ARROW-2770
             Project: Apache Arrow
          Issue Type: Bug
          Components: Packaging
            Reporter: Wes McKinney
             Fix For: 0.10.0


See https://github.com/conda-forge/arrow-cpp-feedstock/pull/53



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Recruiting more maintainers for Apache Arrow

2018-06-30 Thread Wes McKinney
hi folks,

Arrow has grown by leaps and bounds over the last 2.5 years. We are
approaching our 2000th patch and on track to surpass 200 unique
contributors by year end.

All this contribution growth is great, but it has a hidden cost: the
maintenance. The burden of maintaining the project, particularly
reviewing and merging patches, has fallen on a very small number of
people. From the commit logs, we can see how many patches each
committer has merged:

$ git shortlog -csn d5aa7c46692474376a3c31704cfc4783c86338f2..master
  1289  Wes McKinney
   268  Uwe L. Korn
    74  Korn, Uwe
    54  Antoine Pitrou
    52  Julien Le Dem
    39  Philipp Moritz
    18  Kouhei Sutou
    18  Steven Phillips
    13  Bryan Cutler
    11  Jacques Nadeau
    10  Phillip Cloud
     8  Brian Hulette
     5  Robert Nishihara
     5  adeneche
     4  GitHub
     3  Sidd
     3  siddharth
     1  AbdelHakim Deneche
     1  Your Name Here

So Uwe and I have merged ~84% of the patches in the project so far.
This isn't a completely accurate reflection of the maintainer burden,
since many others contribute to code reviews and other aspects of
patch maintenance, and you have to be a committer to earn a place on
this list.
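
As a rough cross-check of the ~84% figure, the shortlog counts above can be
tallied in a few lines of Python. This is only a sketch: the counts are copied
by hand from the output above, the helper name `merged_share` is made up for
illustration, and treating "Uwe L. Korn" and "Korn, Uwe" as the same person is
an assumption.

```python
# Merge counts copied by hand from the `git shortlog -csn` output above.
merge_counts = {
    "Wes McKinney": 1289,
    "Uwe L. Korn": 268,
    "Korn, Uwe": 74,
    "Antoine Pitrou": 54,
    "Julien Le Dem": 52,
    "Philipp Moritz": 39,
    "Kouhei Sutou": 18,
    "Steven Phillips": 18,
    "Bryan Cutler": 13,
    "Jacques Nadeau": 11,
    "Phillip Cloud": 10,
    "Brian Hulette": 8,
    "Robert Nishihara": 5,
    "adeneche": 5,
    "GitHub": 4,
    "Sidd": 3,
    "siddharth": 3,
    "AbdelHakim Deneche": 1,
    "Your Name Here": 1,
}


def merged_share(counts, names):
    """Fraction of all merges attributed to the given committer names."""
    total = sum(counts.values())
    return sum(counts[name] for name in names) / total


total = sum(merge_counts.values())

# Wes plus the "Uwe L. Korn" entry only.
share_two = merged_share(merge_counts, ["Wes McKinney", "Uwe L. Korn"])

# Assumption: "Korn, Uwe" is the same committer under a second identity.
share_three = merged_share(
    merge_counts, ["Wes McKinney", "Uwe L. Korn", "Korn, Uwe"]
)

print(f"total merges: {total}")
print(f"Wes + Uwe (one identity):    {share_two:.1%}")
print(f"Wes + Uwe (both identities): {share_three:.1%}")
```

With the counts as listed, the first share comes to about 83% of 1876 merges
and the second to about 87%, so the ~84% quoted above is in the right range
either way.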

I'm not sure what's the best way to address this problem. The quality
of our code review has declined at times as we struggle to keep up
with the flow of patches -- I don't think this is good. Having the
patch queue pile up isn't great either. Personally, I'm having a
difficult time balancing project maintenance and patch authoring,
particularly in the last 6 months.

Unfortunately, many people believe that writing patches is the primary
mode of contribution to an open source project. Apache projects
explicitly state that non-patch contributions are valued in earning
karma (committership and PMC membership). We're starting to have more
corporate contributors come out of the woodwork, and while it's great
for contributors to be paid to write patches for the project, they are
rarely given the time and space to contribute meaningfully to
maintenance.

Any thoughts about how we can grow the maintainership? Somehow we need
to reach ~5-6 core maintainers over the next year.

Thanks,
Wes


[jira] [Created] (ARROW-2769) [Python] Deprecate and rename add_metadata methods

2018-06-30 Thread Krisztian Szucs (JIRA)
Krisztian Szucs created ARROW-2769:
--

             Summary: [Python] Deprecate and rename add_metadata methods
                 Key: ARROW-2769
                 URL: https://issues.apache.org/jira/browse/ARROW-2769
             Project: Apache Arrow
          Issue Type: Improvement
            Reporter: Krisztian Szucs


Deprecate and replace `pyarrow.Field.add_metadata` (and other similarly named
methods) with replace_metadata, set_metadata or with_metadata. Knowing Spark's
immutable API, I would have chosen with_metadata, but I guess this is probably
not the name the average Python user would expect.
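
To make the naming question concrete, here is a minimal sketch of what a
with_metadata-style method suggests to a reader: it returns a new object and
leaves the original untouched. The `Field` class below is a hypothetical
stand-in written for illustration, not `pyarrow.Field`, and it says nothing
about how pyarrow implements or will name the method.

```python
from dataclasses import dataclass, field, replace
from typing import Mapping


@dataclass(frozen=True)
class Field:
    """Hypothetical immutable schema field (a stand-in, not pyarrow.Field)."""

    name: str
    metadata: Mapping[str, str] = field(default_factory=dict)

    def with_metadata(self, metadata: Mapping[str, str]) -> "Field":
        # Build and return a new Field; `self` is not modified.
        return replace(self, metadata=dict(metadata))


original = Field("price")
annotated = original.with_metadata({"unit": "EUR"})

print(original.metadata)   # metadata of the untouched original
print(annotated.metadata)  # metadata of the new copy
```

A name such as add_metadata or set_metadata can be read as mutating the
object in place, which is the mismatch the rename is meant to avoid.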



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)