Mike,

A few top-level thoughts, without replying directly inline below. I
get pretty tired of reading in-between threads, so I thought
I would summarize here:

1. I’ve been on tons of projects that had commit, review later,
commit. In fact, pretty much every project I’m on works that way,
mostly. On Nutch, we commit things directly sometimes - sometimes we
review. On Tika, we commit things directly sometimes - sometimes things
break. No one yells at me if I broke something. We just fix it, and
make the tests pass again. I’ve worked on a ton of projects over
the years at Apache. See the thing is too, I’m not proposing one
model or the other. I think there are situations to support both
on projects.

Like I said, in my mind: RTC is great if it’s controversial; or if
you just desire a review. There is nothing stopping you from *always*
RTC’ing what you do. That’s your prerogative. In the hypothetical
situation where I commit anything to OCW at some point in the future
(though having done a ton of work upstream of the project at JPL
before coming to Apache), I just don’t want that imposed on *me*.
And moreover, I think some of this stuff in terms of conversations
I’m seeing on the Githubs could be obviated by Kyo not having to
post examples; or slides; or 10 comments before he can just push
the dang thing, see if the test broke, if it did, fix it (or you
fix it; or Kim fixes or someone fixes it out of the kindness of
their hearts). If no one fixes it, and no one reverts it, we’ve got
a dead community. We’re using version control. Having things perfect
before doing things is unnecessary and I honestly don’t think people
should have to make even 1 Github comment before making a JIRA
issue, or just pushing code. We want minimum barriers for some;
bigger barriers for others; in-betweens, we want the whole lot.  We
have to be flexible.

2. We have to make it possible to capture *everyone’s* contributions,
whatever pace they go at. Realize that the pace may change, depending
on someone’s job, life, funding, etc. My #1 goal, considering
the state of this project right now, is to reduce barriers for getting
the project going in terms of pushing things. That means when I see
lots of conversation on Github, and talk about code being pushed;
or things in presentations rather than code simply being pushed, I
think there are more barriers than we need. It may be that you
always want PRs and Kim and you love conversing on Github about the
PRs, and then when you finally have things the way you agreed upon,
you push the code. I am just saying be open to when that’s not
the case, and accept it - it will earn more people around this
project in 10 years including yourself. Should someone ever come
in and be funded full time to work on this project or 10 people
come to be funded full time to work on this project, if we followed
only a strictly RTC approach, we would end up with what Hadoop ended
up with, and the potential for something like Spark to end up with.
I’m glad you brought Spark up. I convinced them to come to Apache.
I know plenty about their community that’s great - but also some
things that aren’t. Dirty laundry that’s emerged in public about
barriers to contributions and higher bars for committers and PMC
which are now split in that project.  It’s certainly a model for
doing great things, but realize too - there are 124+ contributors;
large VC-backed companies whose only goal is to work on software,
not necessarily do science with little funding and contributions,
etc., so it’s a largely different ballgame.  When you and I and
Zimmy made the Bot for OCW Climate, it was with little pieces of
our time - imagine if I had you spend your time writing up the
ENTIRE plan (which I never actually saw a document describing mind
you) for the OCW bot before I asked you to look at the Spark one
and do it? Would you have had an easy time writing up the plan FIRST
before doing the work?

3. To the point about JIRA and Github notifications being as easy
to spot as plain old email: that simply is not backed up by fact.
In fact look at Spark - they had to create a completely separate
mailing list to handle their conversations because honestly dev@s.a.o
was impossible to stay subscribed to due to all the automated nonsense.
I am still unsubscribed from dev@s.a.o to this day b/c of that. In
addition, look up emails over the 15+ year history of this foundation
(you have access now as a member) - JIRA and SVN auto commits and
Git auto notifications and conversations have proven NOT to be as
easy to follow as something with a simple subject line, written by
a human being and not a bot.

4. RE: Git - committing locally is fine, but the ASF only cares
about code on the ASF’s hardware. Github (in the guide you referenced)
is NOT the canonical source for Apache OCW. The canonical source
is the writeable Git repos at the ASF. This was one of the major
concerns about Git that the ASF had in initially implementing it.
Storing stuff in branches and committing all you want creates
nightmares when the PMC who is responsible for managing the code
tries to collectively work on a codebase *together* and to have
shared stewardship of the code base.  I don’t see that right now.
I see a very small number of people doing things every now and then,
with most of the commentary on whether it’s right or wrong coming
from you. And people like Kyo not even knowing if they have write
access to the repository at Apache or not. That’s not an Apache
project’s model and it needs to be fixed here in OCW. There are no
BDFLs in this project. We are all members of the PMC and have
shared rights and stewardship to the code.


Those are my thoughts on this.

Cheers,
Chris


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Michael Joyce <jo...@apache.org>
Reply-To: "dev@climate.apache.org" <dev@climate.apache.org>
Date: Tuesday, May 12, 2015 at 11:49 AM
To: "dev@climate.apache.org" <dev@climate.apache.org>
Subject: Re: Project Workflow

>Pre-email note - 'you' is used here to collectively refer to a nonexistent
>person, not a specific person in this chain of emails.
>
>---
>
>I would certainly agree that RTC COULD cause problems if it wasn't applied
>fairly. But it's applied generally across all commits from all
>contributors
>here. If no one gives any feedback on a patch after a while (again, I
>usually stick with the fairly standard 72 hours idea), merge the dang
>thing. That's always been my stance on it. I think applying this to all
>contributions regardless of status is actually more fair than most
>projects' approaches where contributors have to submit a patch and wait
>for someone to hopefully come along and give a crap.
>
>Also, I think it's a huge misconception that CTR turns into anything other than
>Commit then Commit some more. I have never seen someone do a code review
>after pushing commits on any project I've worked on. People don't want to
>do reviews, they're certainly not going to do it after the fact. Do early
>reviews always fix the problem of bugs being introduced? Of course not. Do
>they make it worse? Absolutely not. We've seen that early reviews prevent
>broken tests and bugs from being committed to our code base multiple times
>(and that has been on many people's contributions, including my own many
>times). Even if this project decided to switch to a CTR approach every one
>of my contributions would go through a PR. It keeps you honest with
>regards
>to running the tests and it gives people opportunities to help you make
>your code better. That's always a plus to me. I wish more people would
>actually give feedback on my contributions/PRs. That would make me really
>happy.
>
>Also, no committer/PMC member is excluded from merging pull requests.
>Everyone who is a committer/PMC can merge pull requests. So if you want to
>be responsible for validating that the pull request doesn't break stuff
>and
>getting it merged, please do! The jenkins build goes a long way towards
>helping with that and does most of the heavy lifting anyway. If anyone is
>worried that only Kim and I have been merging PRs then step up and help
>get
>PRs merged.
>
>Regarding conversations being buried on Github: All conversations are
>mapped back to the mailing list under the appropriate PR title and to the
>appropriate JIRA ticket (assuming the workflow laid out on the wiki is
>followed). I don't see the conversation being on a PR being much different
>than the conversation being on JIRA when someone submits a patch (a rather
>common workflow on other projects in my experience). I'm open to hearing
>how that is problematic though. My thought, if people are going to ignore
>github chatter, then they're probably going to ignore JIRA chatter, and
>they're probably not going to notice emails either.
>
>Also, we're using git. If you want to scratch your own itch, make a branch
>and do work. Commit locally all you want. You have a full version of the
>repository. It's exactly the same thing on github/asf servers. Keeping
>your
>branch up to date with master is trivially easy. When you want to push
>that
>contribution out make a pull request. It's extremely easy to branch a
>billion times and merge and commit locally and do your own thing. I'm not
>terribly certain how having a pull request/code review centric workflow
>hinders this??
>
>One more final note. Apache Spark, one of the most active ASF projects
>uses
>a more complicated Github based PR-centric workflow that uses RTC. It
>certainly hasn't prevented them from getting hundreds of
>committers/contributors.
>
>
>-- Jimmy
>
>On Tue, May 12, 2015 at 10:02 AM, Ramirez, Paul M (398M) <
>paul.m.rami...@jpl.nasa.gov> wrote:
>
>> I believe the current workflow has inhibited progress and caused
>>tension.
>> Although I've been more of an observer than contributor it seems to me
>>that
>> when your community is small that the emphasis should be on allowing as
>> Chris states for people to "scratch their own itch." Additionally, I've
>> seen at times that reviews have focused on minor items and that
>>discussions
>> often took longer than either side reaching out to the other to help get
>> the patch in and build community and camaraderie amongst all those that
>>are
>> already committers. This has gone on to an extent that appears
>> detrimental to the project.
>>
>> I would actively support CTR at this point so that energy and progress
>>can
>> be infused. Sure this could cause some technical debt to build up but
>> community over code would seemingly once again come to the forefront
>>which
>> appears to be lacking at the moment.
>>
>>
>> --Paul
>>
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> Paul Ramirez, M.S.
>> Technical Group Supervisor
>> Computer Science for Data Intensive Applications (398M)
>> Instrument Software and Science Data Systems Section (398)
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 158-264, Mailstop: 158-242
>> Email: paul.m.rami...@jpl.nasa.gov<mailto:paul.m.rami...@jpl.nasa.gov>
>> Office: 818-354-1015
>> Cell: 818-395-8194
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> On May 12, 2015, at 9:33 AM, "Mattmann, Chris A (3980)" <
>> chris.a.mattm...@jpl.nasa.gov<mailto:chris.a.mattm...@jpl.nasa.gov>>
>>  wrote:
>>
>> I don’t think we should dictate everything be code reviewed. I’ve
>> seen this directly lead to conversations that are relevant to
>> development being buried in Github. Take for example your and
>> Whitehall’s conversation(s) with Kyo that I doubt anyone here has
>> ever seen since they aren’t even commenting on the Github. Yes,
>> Github emails are sent to the dev list, my guess is that people
>> ignore them.
>>
>> On the code review issue - Kyo (or others) shouldn’t have to endlessly
>> debate or discuss the advantages or disadvantages of this or that
>> before simply pushing code, and pushing tests. My general rule of
>> thumb is that there are CTR and RTC workflows and use cases for
>> both. RTC works great when it’s possibly controversial or when you
>> really want someone’s eyes on your code for review. However it’s
>> also overhead that I do not believe is needed if a developer wants
>> to push forward and scratch his or her itch, in an area of the
>> codebase that they are working on. The codebase is factored out
>> reasonably well so that people can work on things in parallel
>> and independently. When in doubt, ask.
>>
>>
>> I’m also pretty worried since anyone that looks at the Git and
>> project history over the last year can easily see that Mike has
>> pretty much been doing the bulk load of the pushing and code
>> committing here. Kim’s stepped up recently as has Kyo, which is
>> 3 people, which is great, but I’m worried about a project with
>> a small number of active developers (really 1 major) imposing
>> RTC - I don’t have time to look up citations but you are free
>> to scope these out over the ASF archives. RTC on smaller projects
>> just leads to barriers. We need to be flexible and make it inviting
>> for at the very least, our own developers to contribute to the project,
>> let alone attracting new people. Ross was elected in December 2014,
>> which is great, but we need to do better.
>>
>> Cheers,
>> Chris
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Chief Architect
>> Instrument Software and Science Data Systems Section (398)
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 168-519, Mailstop: 168-527
>> Email: chris.a.mattm...@nasa.gov<mailto:chris.a.mattm...@nasa.gov>
>> WWW:  http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Associate Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: Michael Joyce <jo...@apache.org<mailto:jo...@apache.org>>
>> Reply-To: "dev@climate.apache.org<mailto:dev@climate.apache.org>" <
>> dev@climate.apache.org<mailto:dev@climate.apache.org>>
>> Date: Tuesday, May 12, 2015 at 8:55 AM
>> To: "dev@climate.apache.org<mailto:dev@climate.apache.org>" <
>> dev@climate.apache.org<mailto:dev@climate.apache.org>>
>> Subject: Project Workflow
>>
>> Hi folks,
>>
>> Since this has been brought up a few times on various tickets I thought
>> now
>> would be a good time to go over our project workflow and make sure it's
>> working for us.
>>
>> A general overview of the workflow that we use is available at [1]. A
>> brief
>> overview is that:
>> - ALL changes are pushed up to Github for code review before being
>>merged.
>> - If no issues are raised within a reasonable amount of time (usually 72
>> hours is what I stick with) those changes can be merged.
>>
>> In general, I've been quite happy with this workflow. We have a low
>>enough
>> throughput that this isn't overwhelming and I think it's great that we
>>can
>> get together and review each other's code. I know I appreciate the
>> opportunity for people to find issues with my code before we merge it. I
>> think it would be beneficial to flesh out the docs a bit more on the
>> workflow (for instance, how to run tests should be included in there,
>>how
>> long to wait for a merge, etc.). So community, what do we think of our
>> workflow? Do we like it so far? Is it working for us? Are there pain
>> points? What don't we like? Etc.
>>
>> [1]
>> 
>> https://cwiki.apache.org/confluence/display/CLIMATE/Developer+Getting+Started+Guide
>>
>>
>> -- Jimmy
>>
>>
>>
