For what it's worth, this email thread and your summary writeup, Wes, are a significant call to action on their own.
I've been passive, not by choice, but by policy. Given the significance and need of this project, I'll see what I can do on my side. It will be at least a week given the US holiday.

Donald E. Foss

> On Jun 30, 2018, at 2:15 PM, Marco Neumann <ma...@crepererum.net.INVALID> wrote:
>
> Hey,
>
> first of all, thanks a lot for your, Uwe's, the mergers' and the
> contributors' work. Now, to the maintainer problem:
>
> # Arrow as "a library"
> One thing that makes Arrow special is that it is not a single library
> but many (one for each language), and many of them are not only a
> binding to a C/C++ lib, but partly a complete re-implementation of the
> protocol, e.g.:
>
> - C++: one core, but also contains Python specialties
> - Java: another core
> - Rust: yet another core
> - Python: a binding to C++ but also a lot more stuff because of Pandas
> ...
>
> And you two are maintaining all of them, and I doubt that you have the
> capacity and knowledge to do this at the desired level of quality
> (which is natural, not a personal issue or offense). So this I would
> call "pseudo-maintenance", since you're solely the gatekeeper that does
> some shallow reviewing and has the burden of doing the housekeeping and
> the merging. So why accept these language bindings in the first
> place without bringing a core maintainer in place? For example, let's
> say someone proposes a binding to Haskell now. That should not be
> accepted as part of the official Apache implementation without a
> dedicated maintainer (ideally the PR author would be that person, but
> there may be others who step up).
>
> Right now, it might be too late to remove some of the incomplete / WIP
> implementations that don't have a core maintainer though.
>
> # GitHub
> Another special thing to consider is that Arrow is (ab)using GitHub as
> a code hosting platform.
> Even as a contributor, this has obvious bad,
> uncool consequences:
>
> - you have yet another issue hosting system to log in to
> - there is yet another information channel to keep track of (this ML
>   for example, which has a semi-informative web interface telling you
>   that you can only log in using Google but does not tell you how to
>   subscribe to the list)
> - links to issues don't work in the known magic way
> - you're merging the PRs by closing them, which is by all means a not
>   very nice way because it does not reflect the contributors' work in
>   the project overview and personal profiles, but exactly this is a
>   large part of the GitHub community (btw: merging PRs without using
>   GitHub's merge button IS possible, as bors/bors-ng prove)
>
> So as a potential maintainer, this is already a bummer, since I know
> that things are less comfortable than the system I would get from
> any normal GitHub or GitLab project.
>
> I'm not really sure how to solve this or if it should be solved (read
> about the laziness aspect in "Contribution VS Maintenance" below)
>
> # Time / Payment
> Yes, this is indeed a big issue. From what I can tell from the open
> source projects I was involved in, for large contributor crowds
> you normally have full/half-time positions in place for the core
> maintainer (look at the Mozilla projects, the Blender Foundation, Gnome
> / Red Hat). So at some point I think maintaining isn't a part-time /
> hobby thing anymore (w/o downgrading the hard work of hobby
> contributors, on the contrary). I don't have a link at hand, but I
> recall some discussion about GitHub and its importance for hiring
> (since it acts as a CV) after MS bought it, and some of the responses
> were "doing all this work in your free time is a privilege of wealthy,
> mostly-white men", which, without my signing this statement in this
> really bare form, already shows a problem of the open source world.
>
> # Contribution VS Maintenance
> The very "nice" thing about patch/PR contribution is that you do your
> work and then you can walk away and it's the maintainer's problem to
> release the artifact, upgrade/migrate your code and ensure that the
> tests you've written never break. It's comfortable. Being a maintainer
> means all the opposite things. And in the end, you get blamed for not
> supporting certain features (see the open source paragraph here
> https://blog.ghost.org/5/ ) or for security disasters (remember the
> OpenSSL disaster).
>
> I think together with the previous point this means we have to get
> companies to pay for that work, and not just dump their features into
> an OSS repo.
>
> # Path to Maintainership
> So I think (from my narrow point of view!) that many people expect that
> the path from "outsider" to "maintainer" takes the route over "a lot of
> patch/PR contributions". If I'm reading your mail right, that is not
> necessarily the case for Apache projects and I think that's great. The
> "review PRs" path sounds great, but I think neither GitHub nor any
> other platform I'm aware of does a good job of getting people to do so.
> I mean, I see a PR and I can leave a review, but for me it is not
> really clear which consequences this has (naturally, random people
> don't have a veto on changes). So I can jump in when I think something
> is wrong, but I cannot approve a PR. This makes sense, but it poses the
> question of "how?!". I mean, it is pretty clear how to become a
> patch/PR contributor, but it is not clear how to become a maintainer,
> at least not in an easy way. (I'm sure it's written down somewhere).
>
> So, overall I think a clear Call for Action at the top of the README
> could help. Like "Hey, we're looking for maintainers, you could start
> by reviewing some PRs and after some reviews maintainers will just be
> the last gatekeeper and after some more time, you can even merge PRs on
> your own".
>
> # My personal contribution
> Triggered by this call for help, I'll try to get more involved in
> Python, C++ and Rust reviews.
>
> So, these are some thoughts that I hope may help.
>
> Thanks again for addressing this issue and your time and passion,
> Marco
>
>> On 2018/06/30 14:57:42, Wes McKinney <w...@gmail.com> wrote:
>> hi folks,
>>
>> Arrow has grown by leaps and bounds over the last 2.5 years. We are
>> approaching our 2000th patch and on track to surpass 200 unique
>> contributors by year end.
>>
>> All this contribution growth is great, but it has a hidden cost: the
>> maintenance. The burden of maintaining the project, particularly
>> reviewing and merging patches, has fallen on a very small number of
>> people. From the commit logs, we can see how many patches each
>> committer has merged:
>>
>> $ git shortlog -csn d5aa7c46692474376a3c31704cfc4783c86338f2..master
>>   1289  Wes McKinney
>>    268  Uwe L. Korn
>>     74  Korn, Uwe
>>     54  Antoine Pitrou
>>     52  Julien Le Dem
>>     39  Philipp Moritz
>>     18  Kouhei Sutou
>>     18  Steven Phillips
>>     13  Bryan Cutler
>>     11  Jacques Nadeau
>>     10  Phillip Cloud
>>      8  Brian Hulette
>>      5  Robert Nishihara
>>      5  adeneche
>>      4  GitHub
>>      3  Sidd
>>      3  siddharth
>>      1  AbdelHakim Deneche
>>      1  Your Name Here
>>
>> So Uwe and I have merged ~84% of the patches in the project so far.
>> This isn't a completely accurate reflection of the maintainer burden,
>> since many others contribute to code reviews and other aspects of
>> patch maintenance, and you have to be a committer to earn a place on
>> this list.
>>
>> I'm not sure what's the best way to address this problem. The quality
>> of our code review has declined at times as we struggle to keep up
>> with the flow of patches -- I don't think this is good. Having the
>> patch queue pile up isn't great either.
>> Personally, I'm having a
>> difficult time balancing project maintenance and patch authoring,
>> particularly in the last 6 months.
>>
>> Unfortunately, many people believe that writing patches is the primary
>> mode of contribution to an open source project. Apache projects
>> explicitly state that non-patch contributions are valued in earning
>> karma (committership and PMC membership). We're starting to have more
>> corporate contributors come out of the woodwork, and while it's great
>> for contributors to be paid to write patches for the project, they are
>> rarely given the time and space to contribute meaningfully to
>> maintenance.
>>
>> Any thoughts about how we can grow the maintainership? Somehow we need
>> to reach ~5-6 core maintainers over the next year.
>>
>> Thanks,
>> Wes