On 2009-03-13 10:17-0600 Geoffrey Furnish wrote:

> [...]The main purpose in this post is just to sample the other
> developers, all of you who are currently much more actively involved
> in PLplot than I am, or even than I expect to be once I regain my
> stride, so to speak, and just see if any of you would be interested
> in seeing PLplot switch to git.

Hi Geoffrey:

Thanks for your thoughtful and thought-provoking post. However,
although I normally tend to be an early adopter, I don't think we
should move to git at this time. Reasons in context below.

> Here's the SF page relating to project git support:
>
> http://apps.sourceforge.net/trac/sourceforge/wiki/Git
>
> [..]One of the biggest things that hits newcomers to git, is the
> whole concept of distributed source control. Git docs talk about
> peer relationships and so on, and people sometimes wonder how you
> maintain any control in a git-managed project. The git docs
> sometimes talk as if they're all about empowering anti-establishment
> rogue coders, and this fuels the perception that git is impossible
> to use for a project where anyone wants a sense of control.
>
> My answer to this is that it's just a policy decision, which is
> outside the space of SCM per se. In other words, you just designate
> a repo as your "official" project/company repo, and that's it. It's
> official because you say so, not because it occupies any particular
> point in the software collaboration interaction diagram. SF provides
> a technique for hosting and communicating with that repo, so it's
> not at all hard for us to establish the identity of the "master
> project repo". The policy needs to be backed up with some relatively
> straightforward understandings, such as, releases will be built from
> tags present in the project master repo, etc.

Fair enough.

> The next big thing [...]
> which is truly superlative, it's the branch management.

I believe you, but see my comments about branches below.

> The next thing to understand about git is that it supports a wide
> variety of what I call "collaboration paradigms". And I could hardly
> write down all the possible work flows. Instead, let me just paint a
> picture of a few.
>
> 1) The same-old same-old cvs/svn approach, practiced with git.
>
> 2) Individuals using branches, still mostly centralized
>
> 3) "Distributed development gone wild"
>
> The project has a master repo, the one at SF.
> Developers clone it.
> Developers create topic branches, do work thereupon.
> Developers publish their changes to "collaboration hubs".
> Other developers pull back and forth from their peers on an
> as-interested basis.
> Eventually subteams decide something is "done enough for public
> consumption".
> Someone does a git co master; git merge topic; git commit; git push
> origin to make the multi-developer topic branch integrated into the
> master branch and available in the project master repo.
>
> So that's a description of some work flows. Of course that's just
> the beginning. You can get a lot wilder than 3) if you're so
> inclined. And I guess a key point here is that different members can
> practice the above collaboration styles concurrently. It's not so
> much about abolishing option 1), as just enabling options 2) and 3).

From my observations of projects that use git (such as the Intel X
driver stack which includes at least the kernel, drm, mesa, X server
and the Intel X driver), (3) is a huge issue. It appears each one of
those stack components has a whole variety of possible git versions,
so a lot of testing time is wasted trying to figure out which git
version to use for each component. I have an X instability problem
for my Intel hardware, but I have largely given up reporting it
because the X Intel developers appear to be off in their own land
working with inconsistent bits and pieces which nobody else tests.
(I fully expect this situation to improve once the pace of Intel X
development slows down, but I have been saying that for the past
year, and there is still no end in sight with more and more features
and no fix for instability issues.)
Carl Worth (a well-respected X developer who works for Intel and who
is keen as mustard on git) is having big problems with this issue
just trying to sort out what will go into their next release. If he
is having trouble sorting out the git bits and pieces for the X
Intel stack, there is little hope for the rest of us mere mortals!

That stack of software is obviously an extreme case of git
development gone wild, and I doubt PLplot development will ever be
that wild. Nevertheless, it is a concern if PLplot moves to git
because even now we have trouble getting users and developers to
report back the exact version of PLplot they were using when they
give bug reports, and it appears the possibility of additional git
versions would just exacerbate that problem.

> So why bother? Why is it worth it to take a step up to a greater
> degree of complexity in the software development process and
> tooling?
>
> In my professional projects where I have used git exclusively for
> over two years now, I find these to be the most significant points I
> have observed on my git journey.
>
> 1) SCM systems, particularly centralized ones like cvs/svn, actually
> discourage some developers from participation, due to the overhead
> of synchronization.
>
> Just look at PLplot trunk activity. Just during the period where
> I've been trying to get re-engaged here over the last few weeks, I
> see two or three roughly completely independent whirlwinds of
> activity, all swirling around in the code base at the top of trunk.
> Every time somebody does something, they "commit" it to trunk.
> Everybody's got to update, and this causes lots of rerunning cmake,
> rebuilding, etc.

For a large project that might be an issue, but a rebuild of PLplot
takes very little time at all (especially with CMake).
Also, the various components of PLplot are really nicely separated
with few side effects so you really have to work at it to mess up
others who are working on some different component of PLplot on the
svn trunk. Of course, if two people are working on, e.g., qt
together, they could mess each other up, but I personally like the
implied build discipline demanded of all our developers by working
on the trunk together. Thus, there was little point in working on qt
on a branch. Similarly, I had no interest in working on the recent
qsastime stuff (still on-going by the way with more leap seconds
functionality to come) on a branch. The point is, our build system
is flexible enough to easily work around most issues introduced by
working together on the trunk version. In practice, our de facto
development on svn trunk has worked out well with trunk build
breakages for default configurations being extremely rare, and when
they do exist, short-lived.

> Hardly anyone is using branches. We actually used to use branches a
> bit back in the cvs days, but during the svn era of PLplot, I'm not
> sure there have been any branches used, save the python branch which
> I didn't manage to merge quick enough, and now it's hopelessly
> orphaned in the svn quagmire. With centralized SCM, the only good
> way to collaborate with peers, is through the central repo. And with
> svn branches being quite uninviting, to put it politely, people tend
> to collaborate through the central repo trunk (master branch in
> git-speak). This is obviously inefficient, and it's disruptive
> enough that it discourages people from involvement.

I admit my bias here. I am just not that convinced of the
desirability of branches for most PLplot code development. Let's
face it, a lot of our development focusses on language bindings,
examples, and device drivers and not the core library.
Thus, the natural separation of most of those peripheral components
allows us to develop together in the trunk version without messing
each other up rather than separately on branches. Of course, if a
change is really disruptive, it should be done on a branch. Your
historical development of our dynamic device functionality is a good
case in point. It is possible your recent development of a plframe
GUI capability for python also belongs in that category, but the
question there is obfuscated by a lot of cruft in our python
bindings and examples, which I probably think belongs to you and
vice versa so it might have been better to do a cleanout of the old
cruft first, then do that development on the trunk.

> The way git changes this game, is by making it really really easy
> for a person to "live on a branch". If you fork a branch, and work
> on it for a while, you don't need to feel like you're getting lost,
> like something that fell off a fast moving train. At any time, you
> can just do git merge master on your topic branch, and you're
> tracking master, without having to inject your own, not yet ready
> changes onto the master. You can track master, and develop your
> topic at your own speed, and commit it to master and push it up for
> public consumption only when you're really ready.
>
> That means fewer "pushes" hit the project master, means other
> developers spend less time "integrating" the changes from others,
> spells less wasted time on the not-so-fun part of collaboration, and
> more developer focus just going into your own topic branch
> development work.
>
> Quite literally, git delivers a huge windfall of incentive for
> developers to actually participate in an SCM managed project,
> instead of avoiding it.

I realize my bias on this subject so I would be willing to concede
this point. Everybody tells me that once they have tried git, they
do not ever want to move back to subversion. The problem for me is I
haven't tried git so I don't miss it a bit.
:-)

> 2) Exploratory work is greatly facilitated.
>
> To me, this is perhaps the greatest benefit of using git. In
> cvs/svn, the only way you can "commit" your work, is to shove it
> into the one centralized master project repo. Even if you use a
> branch, it's still up there, occupying space in the permanent
> record. Because of this, it is actually a disincentive to do
> exploratory work. People get conservative over time, and only want
> to "commit" stuff which is going to be a "keeper". Since people
> don't like to not commit their work, they tend to consolidate around
> only "keeper" activities, and true innovation suffers. In part, this
> is traceable to centralized repos. If you do something dumb, and you
> check it in, it's up there forever.
>
> With git (and this would be true of other decentralized scms), you
> can do your exploratory development, check it in, obtain all the
> benefits of using an SCM, and *still not be committed to having
> others see it*. You only need to push it up when you decide it's a
> keeper. Sometimes you don't know that at the beginning.
>
> With git, you can do exploratory development, and then throw it
> away, without ever bothering others with it, if you so choose. This
> is really enabling.
>
> Fantastic branching, and decentralized repos. Those are the things
> that make git such a game changer in the world of software
> development. IMO.

Excellent points. git sounds like a paradigm shift that would shake
up PLplot development considerably in a positive way.

> [...]What would a transition plan look like? Suppose we all try git
> for a bit, decide we like it, decide to switch. How does that go?
>
> There are two primary options:
>
> 1) Import the entire PLplot history, stuff that up into the SF
> plplot git repo as the starting point.
>
> 2) Take plplot svn trunk, check it into git, and go from there.
>
> Option 1) seems like it might be the natural/obvious choice.
> That said, in my professional circles, we've actually opted for 2).
> And been happy that way, even over the next year or two.

I am convinced our code history is worth preserving. That allows us
to delete unused code now in confidence that if we ever need it back
we can easily get it from the repo. (For example, I am going to
propose removing the currently unused parts of the sys directory
tree in the near future because I KNOW we can get any of it back
from svn.) Furthermore, our history helps us to keep track of
licensing issues which is really important for an open-source
project. Finally, our project is one of the oldest open-source
projects around so it would be a shame to delete all that history.
So to my mind approach (1) is an absolute requirement, and approach
(2) would be a showstopper.

I felt so strongly about this issue, that I put in long hours on the
conversion from cvs to svn (including extensive automatic checks of
every separate commit message) just to make sure our history was
completely preserved. I will demand similar care for any proposed
svn to git conversion project, but I warn such care is going to take
lots of effort by anybody who volunteers to do that conversion work.

In any case, I think we should put off this decision at least for a
year. My idea is to let other projects be the SF guinea pigs to work
out all the bugs in the freshly minted SF git support as well as to
improve the tools for conversion of a project (including all its
history) from svn to git.

In sum, I am against moving to git for now, but we should consider
this possibility again in the future (say when a majority of our
developers are enthused about git from personal experience with it
for other projects, and when someone steps forward to do the svn to
git conversion work along with the hard part which is the required
detailed checking of same).

Alan
__________________________
Alan W.
Irwin

Astronomical research affiliation with Department of Physics and
Astronomy, University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state
implementation for stellar interiors (freeeos.sf.net); PLplot
scientific plotting software package (plplot.org); the libLASi
project (unifont.org/lasi); the Loads of Linux Links project
(loll.sf.net); and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________