On Sun, Sep 2, 2012 at 7:58 AM, Steve Loughran <[email protected]> wrote:
> On 1 September 2012 09:20, Todd Lipcon <[email protected]> wrote:
>
>> Thanks for starting this thread, Steve. I think your points below are
>> good. I've snipped most of your comment and will reply inline to one
>> bit below:
>>
>> On Fri, Aug 31, 2012 at 10:07 AM, Steve Loughran
>> <[email protected]> wrote:
>>
>> >
>> > How then do we get (a) more dev projects working and integrated by the
>> > current committers, and (b) a process in which people who are not yet
>> > contributors/committers can develop non-trivial changes to the project
>> > in a way that it is done with the knowledge, support and mentorship of
>> > the rest of the community?
>>
>
> Both HDFS2 and MRv2 are in trunk, therefore I consider them successes.
>
>> Here's one proposal, making use of git as an easy way to allow
>> non-committers to "commit" code while still tracking development in
>> the usual places:
>>
>
> This is effectively what people do. I'm less worried about the code side of
> things than the integration and mentoring
>
>> - Upon anyone's request, we create a new "Version" tag in JIRA.
>>
>
> -1. There are enough versions. There is a "tag" field in JIRA for precisely
> this purpose
>
>> - The developers create an umbrella JIRA for the project, and file the
>> individual work items as subtasks (either up front, or as they are
>> developed if using a more iterative model)
>>
>
> as today
>
>> - On the umbrella, they add a pointer to a git branch to be used as
>> the staging area for the branch. As they develop each subtask, they
>> can use the JIRA to discuss the development like they would with a
>> normally committed JIRA, but when they feel it is ready to go (not
>> requiring a +1 from any committer) they commit to their git branch
>> instead of the SVN repo.
>>
>
> some integration w/ jenkins and pull testing would be good here
>
>> - When the branch is ready to merge, they can call a merge vote, which
>> requires +1 from 3 committers, same as a branch being proposed by an
>> existing committer. A committer would then use git-svn to merge their
>> branch commit-by-commit, or if it is less extensive, simply generate a
>> single big patch to commit into SVN.
>>
>> My thinking is that this would provide a low-friction way for people
>> to collaborate with the community and develop in the open, without
>> having to work closely with any committer to review every individual
>> subtask.
>>
>> Another alternative, if people are reluctant to use git, would be to
>> add a "sandbox/" repository inside our SVN, and hand out commit bit to
>> branches inside there without any PMC vote. Anyone interested in
>> contributing could request a branch in the sandbox, and be granted
>> access as soon as they get an apache SVN account.
>>
>
> I don't see the technical issues with how the merge is done as the main
> problem.
>
> The barriers to getting your stuff in are
> 1. getting people to care enough to help develop the feature -mentorship,
> collaborative development.
> 2. getting incremental parts in to avoid the continual
> merge-regression-test hell that you go through if you are trying to keep a
> separate branch alive. It's not the technical aspects of the merge so much
> as the need to run all the hadoop tests and your own test suite, and track
> down whether a failure is a regression in -trunk or something in your code.
>
> Jun's patch is an example of this situation. We haven't seen the effort he
> and his colleagues have done with merge and test, but I'm confident it's
> been there. What they now have is a "big bang" class of patch which is so
> big that anyone reviewing it would have to spend a couple of weeks going
> through the codebase trying to understand it. Which as we all know means
> two weeks not doing all the things you are committed to doing.
>
> We know it's there, we know it's current -so how to use this as an exercise
> in something to pull in incrementally?

Jun's patches from HADOOP-8468 (which were developed on a private GitHub
repo) are being pulled into trunk incrementally. There's no feature branch,
which I think would have been a better route, but at least the current
approach has not prevented some progress.

All the recent examples I can think of where a feature was developed
upstream first, at Apache, on a feature branch have gone well.

Thanks,
Eli
