Re: [Plplot-devel] SF adds support for git, anyone interested?

Geoffrey Furnish Fri, 13 Mar 2009 22:18:43 -0700

Alan W. Irwin writes:
 > On 2009-03-13 10:17-0600 Geoffrey Furnish wrote:
 > > [...]The main purpose in this post is just to sample the other
 > > developers, all of you who are currently much more actively involved in
 > > PLplot than I am, or even than I expect to be once I regain my stride,
 > > so to speak, and just see if any of you would be interested in seeing
 > > PLplot switch to git.
 > 
 > Thanks for your thoughtful and thought-provoking post.  However, although I
 > normally tend to be an early adopter, I don't think we should move to git at
 > this time.
 > 
 > Reasons in context below.
 > [...]
 > >
 > > The next thing to understand about git is that it supports a wide
 > > variety of what I call "collaboration paradigms".  And I could hardly
 > > write down all the possible work flows.  Instead, let me just paint a
 > > picture of a few.
 > >
 > > 1) The same-old same-old cvs/svn approach, practiced with git.
 > >
 > > 2) Individuals using branches, still mostly centralized
 > >
 > > 3) "Distributed development gone wild"
 > >
 > > The project has a master repo, the one at SF.
 > > Developers clone it.
 > > Developers create topic branches, do work thereupon.
 > > Developers publish their changes to "collaboration hubs".
 > > Other developers pull back and forth from their peers on as-interested
 > > basis.
 > > Eventually subteams decide something is "done enough for public
 > > consumption".
 > > Someone does a git co master; git merge topic; git commit; git push
 > > origin to make the multi-developer topic branch integrated into the
 > > master branch and available in the project master repo.
 > >
 > > So that's a description of some work flows.  Of course that's just the
 > > beginning.  You can get a lot wilder than 3) if you're so inclined.  And
 > > I guess a key point here is that different members can practice the
 > > above collaboration styles concurrently.  It's not so much about
 > > abolishing option 1), as just enabling options 2) and 3).
 > 
 > From my observations of projects that use git (such as the Intel X driver
 > stack which includes at least the kernel, drm, mesa, X server and the
 > Intel X driver), (3) is a huge issue.  It appears each one of those stack
 > components has a whole variety of possible git version possibilities so a
 > lot of time is wasted in testing trying to figure out which git version to
 > use for each component.  I have a X instability problem for my Intel
 > hardware, but I have largely given up reporting it because the X Intel
 > developers appear to be off in their own land working with inconsistent
 > bits and pieces which nobody else tests.  (I fully expect this situation
 > to improve once the pace of Intel X development slows down, but I have
 > been saying that for the past year, and there is still no end in sight
 > with more and more features and no fix for instability issues.) Carl Worth
 > (a well-respected X developer who works for Intel and who is keen as
 > mustard on git) is having big problems with this issue just trying to sort
 > out what will go into their next release.  If he is having trouble sorting
 > out the git bits and pieces for the X intel stack, there is little hope
 > for the rest of us mere mortals!


I hope you don't mind a couple of clarifying questions.  Are you saying that
there is trouble with different versions of git?  Or are you saying there are
several separate pieces of software, each with several developers, each
publishing their own publicly accessible git repos, so that end users are
bewildered with trying to figure out what to pull from where to assemble all
the pieces of a working system?

At first I thought you were saying the former, but after rereading your
comments several times, I'm starting to think you are saying the latter.

If the latter, well, I think it's a bit hyperbolic to apply it to PLplot.

What I imagine for PLplot, if we were to switch to git, is that we would
continue to maintain a clear "master repo" (the one hosted at SF).  We would
put tags on that repo, do release builds from only tags that are in that
repo, and so on.  Users would be expected to take file releases, or at least
be working with their origin set to our master project repo, if they want
help from the list.
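
A rough sketch of what "release builds only from tags in the master repo"
could look like, under stated assumptions: a throwaway local bare repo stands
in for the SF-hosted one, and the tag name v0.0.1-demo and the file plplot.c
are made up purely for illustration.

```shell
#!/bin/sh
# Sketch only: a local bare repo plays the role of the project master repo.
set -e
tmp=$(mktemp -d)
git init -q --bare "$tmp/master.git"
git clone -q "$tmp/master.git" "$tmp/dev"
cd "$tmp/dev"
git config user.name "Release Manager"
git config user.email "rm@example.invalid"

# Create one commit and an annotated tag, then publish both to the master repo.
echo "demo source" > plplot.c
git add plplot.c
git commit -q -m "Demo commit"
git tag -a v0.0.1-demo -m "Hypothetical release tag"
git push -q origin HEAD v0.0.1-demo

# Build the release tarball only if the master repo itself carries the tag.
tag=v0.0.1-demo
if [ -n "$(git ls-remote --tags origin "refs/tags/$tag")" ]; then
    git archive --format=tar --prefix="plplot-$tag/" "$tag" > "$tmp/plplot-$tag.tar"
fi

# List the tarball contents to confirm what would be shipped.
tar -tf "$tmp/plplot-$tag.tar"
```

The ls-remote check is the piece that matters: a tag that exists only in
somebody's private clone never produces a file release.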

I could further imagine that developers might advertise "collaboration hubs"
amongst themselves.  Not meaning private e-mail only.  Plplot-devel would be
fine.  But the point is, X says to Y, "Pull my branch uvw from my publicly
accessible git collab hub at git://wherever/plplot", and Y says "Great, now
fetch my further updates from git://elsewhere/plplot".  When X and Y are
mutually satisfied, one of them pushes it up.
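
The exchange above can be sketched roughly as follows.  This is only an
illustration under assumptions: local bare repos stand in for the SF master
repo and for X's hub (instead of the git:// URLs), and the directory names x,
y, hub-x.git and the file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: local bare repos stand in for the SF master repo and X's hub.
set -e
work=$(mktemp -d)
cd "$work"

git init -q --bare master.git
git --git-dir=master.git symbolic-ref HEAD refs/heads/master
git init -q --bare hub-x.git

# Developer X clones the master repo and seeds it with a first commit.
git clone -q "$work/master.git" x
cd x
git symbolic-ref HEAD refs/heads/master
git config user.name "Developer X"
git config user.email "x@example.invalid"
echo "base" > README
git add README
git commit -q -m "Initial commit"
git push -q origin master

# X works on topic branch uvw and publishes it to a personal hub.
git checkout -q -b uvw
echo "work in progress" > feature.txt
git add feature.txt
git commit -q -m "Start topic uvw"
git remote add hub "$work/hub-x.git"
git push -q hub uvw
cd "$work"

# Developer Y clones the master repo, then fetches X's branch from the hub.
git clone -q "$work/master.git" y
cd y
git config user.name "Developer Y"
git config user.email "y@example.invalid"
git remote add x "$work/hub-x.git"
git fetch -q x
git checkout -q -b uvw x/uvw

# Once X and Y are mutually satisfied, one of them merges and pushes it up.
git checkout -q master
git merge -q uvw
git push -q origin master
cd "$work"

# The master repo now contains the topic work.
git --git-dir=master.git log --format=%s master
```

Note that nothing reaches the master repo until the final push; everything
before that travels only between X's hub and Y's clone.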

Meanwhile, developer Z might be following a more classic work flow of just
cloning the project master, working on the master branch, and pushing up when
ready.  The two models are not mutually exclusive.

In all cases, we're talking about a "software stack" of just one git module:
plplot.  

So the piecemeal assemblage problem you describe with the Intel X driver
project seems like a distant worry for a git-based PLplot future, to me.

 > That stack of software is obviously an extreme case of git development
 > gone wild, and I doubt PLplot development will ever be that wild.
 > Nevertheless, it is a concern if PLplot moves to git because even now we
 > have trouble getting users and developers to report back the exact version
 > of PLplot they were using when they give bug reports, and it appears the
 > possibility of additional git versions would just exacerbate that problem.

I think I agree, except for the extent of my worry about this.  Every user
can report their origin, and the git commit identifier.  I would suggest that
unless their origin is our SF-hosted repo, they don't have a reasonable
expectation of getting much help from the PLplot core team.
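
For what it's worth, collecting those two facts is cheap.  A minimal sketch,
assuming a hypothetical helper name (plplot_version_report) and a throwaway
demo repo in place of a real PLplot clone:

```shell
#!/bin/sh
set -e

# Print what a bug report would need: where the clone came from and which
# commit is checked out. The function name is hypothetical.
plplot_version_report() {
    printf 'origin: %s\n' "$(git config --get remote.origin.url || echo '(none)')"
    printf 'commit: %s\n' "$(git rev-parse HEAD)"
    printf 'describe: %s\n' "$(git describe --tags --always)"
}

# Demo on a throwaway repo; "upstream" stands in for the project master repo.
demo=$(mktemp -d)
git init -q "$demo/upstream"
cd "$demo/upstream"
git config user.name "Demo"
git config user.email "demo@example.invalid"
echo "hello" > file.txt
git add file.txt
git commit -q -m "First commit"
git tag v0.0-demo

# A user clones it and generates the report from inside their clone.
git clone -q "$demo/upstream" "$demo/user"
cd "$demo/user"
report=$(plplot_version_report)
printf '%s\n' "$report"
```

If the origin line does not name the project master repo, that tells the list
right away that the report belongs with whoever published the other repo.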

If one of us were to publish our own PLplot git repo, and some user had
issues with something they pulled from that, then it should be pretty clear
they need to contact that same developer for support.  The PLplot team should
be able to feel pretty relaxed about communicating to users that if they want
help, they need to be asking about a branch that is available in the project
master repo, if they want support on list.

That doesn't seem at all unreasonable to me, nor would I expect it to be
off-putting or unexpected by the general user community.

 > > 1) SCM systems, particularly centralized ones like cvs/svn, actually
 > >   discourage some developers from participation, due to the overhead of
 > > synchronization.
 > >
 > > Just look at PLplot trunk activity.  Just during the period where I've
 > > been trying to get re-engaged here over the last few weeks, I see two or
 > > three roughly completely independent whirlwinds of activity, all
 > > swirling around in the code base at the top of trunk.  Every time
 > > somebody does something, they "commit" it to trunk.  Everybody's got to
 > > update, and this causes lots of rerunning cmake, rebuilding, etc.
 > 
 > For a large project that might be an issue, but a rebuild of PLplot takes
 > very little time at all (especially with CMake).  Also, the various
 > components of PLplot are really nicely separated with few side effects so
 > you really have to work at it to mess up others who are working on some
 > different component of PLplot on the svn trunk.  

Not to be contrarian, or mean-spirited, but I should just point out that from
my perspective, PLplot is pretty "messed up" right now.  And has been for a
very very long time.

I have two different code bases in my professional life that link to PLplot.
Both of them are locked into years-old versions of PLplot.  About a year ago
I made the mistake of bumping one of them up to a modern PLplot release, and
then got flooded by users of my code reporting segfaults which took down the
whole application.  

The problems are mostly apparently something having to do with the Tk widget
(plframe).  But the point is, it's been broken for many many releases, it
prevents me from updating to trunk on two professional projects until I find
and fix the bug(s).

I am committed to doing this, as time allows, and will propagate the fixes to
the PLplot repo (svn or perhaps a git repo if we can get there) when I have
them. 

I am just trying to drive the point that the current situation with PLplot is
not entirely ducky.  There are long standing problems that need fixing.

 > Of course, if two people are working on, e.g., qt together, they could
 > mess each other up, but I personally like the implied build discipline
 > demanded of all our developers by working on the trunk together.

To me, this is more distractive than helpful.

 > Thus, there was little point to work on
 > qt on a branch. Similarly, I had no interest in working on the recent
 > qsastime stuff (still on-going by the way with more leap seconds
 > functionality to come) on a branch.  The point is, our build system is
 > flexible enough to easily work around most issues introduced by working
 > together on the trunk version.

Distraction is the big issue I see.  No one working style will ever suit
everyone, and I am not really directly trying to say "Alan, you should work
on a branch".  But what I am trying to say is that some people would choose
to work on a branch a lot more, if the SCM system was better with branches.
Svn is winning a reputation as a particularly weak contender for some working
paradigms that some people like a lot.  Other people are happy with it.

What I see as being significant about git is that it supports more
collaboration paradigms effectively, without forcing you to use just one
limited one.  Some people may just not like the perceived complexity, but
there are simple working styles that are well supported with git, as well as
much more sophisticated working styles that are also supported well by git.
Unfortunately, statements like this cannot be made about svn.

 > In practice, our de facto development on svn trunk has worked out well with
 > trunk build breakages for default configurations being extremely rare, and
 > when they do exist, short lived.

It's great to be associated with a project with so much vitality.  But I do
think it would help if things hit the master/trunk in more complete shape
when they first showed up.  "Fewer, bigger pushes" we would say in
git-speak. 

 > > Hardly anyone is using branches.  We actually used to use branches a bit
 > > back in the cvs days, but during the svn era of PLplot, I'm not sure
 > > there have been any branches used, save the python branch which I didn't
 > > manage to merge quick enough, and now it's hopelessly orphaned in the
 > > svn quagmire.  With centralized SCM, the only good way to collaborate
 > > with peers, is through the central repo.  And with svn branches being
 > > quite uninviting, to put it politely, people tend to collaborate through
 > > the central repo trunk (master branch in git-speak).  This is obviously
 > > inefficient, and it's disruptive enough that it discourages people from
 > > involvement.
 > 
 > I admit my bias here.  I am just not that convinced of the desirability of
 > branches for most PLplot code development.  Let's face it, a lot of our
 > development focusses on language bindings, examples, and device drivers and
 > not the core library.  Thus, the natural separation of most of those
 > peripheral components allows us to develop together in the trunk version
 > without messing each other up rather than separately on branches.

A huge fraction of the commits (this is my subjective observation/evaluation)
is hitting in the area of the cmake config files, which affect everyone.
Perhaps it usually works.  But I'm just saying, there's no reason for that
stuff to be on master/trunk till it's "more done".

The reason for the current practice is, I think, largely driven by the
weakness of svn, rather than by inherent superiority of the software
development process practiced in this manner.

 > Of course, if a change is really disruptive, it should be done on a
 > branch.  Your historical development of our dynamic device functionality
 > is a good case in point.  It is possible your recent development of a
 > plframe GUI capability for python also belongs in that category, but the
 > question there is obfuscated by a lot of cruft in our python bindings and
 > examples, which I probably think belongs to you and vice versa so it might
 > have been better to do a cleanout of the old cruft first, then do that
 > development on the trunk.

Part of the reason I started the python branch in PLplot/svn, was, well, err,
I guess a couple:

1) I was ridiculously naive about svn.  I thought it would work better for
   branch-based development.  Boy did I get "educated".

2) I knew I would be diddling with the build system, and I saw no reason to
   foist that upon others till I had it worked out better.  I am very
comfortable with that call, in retrospect.  When I do merge that work into
trunk in the coming days, I'm probably still going to need help from others,
but at least you guys missed all the thrashing along the way up until now.
That's one of my big points here.  The whole group doesn't need to have my
(or anyone else's) thrashing in their face.  We just need the done stuff.
Sometimes getting done requires help, and thus we need ways to collaborate.
Checking in partially working stuff on the trunk of the master project repo
is one way to pass work to collaborators.  But it also pushes the same work
in progress (WIP) materials into the update/checkout path for others who
*don't* have interest in collaborating on that particular thing.

Branches, and outside-of-the-project-master-repo, collaboration schemes, as
offered by git, provide a way around these disruptions to the rest of the
team. 

 > I realize my bias on this subject so I would be willing to concede this
 > point.  Everybody tells me that once they have tried git, they do not
 > ever want to move back to subversion.  The problem for me is I haven't tried
 > git so I don't miss it a bit.  :-)

I totally understand.  One thing I'm looking for here in this thread, is to
see if others are interested, and if so, let's identify what we could do to
help people get the exposure they would need to feel good about making a
decision. 

Certainly one thing we need, is to hear from people who develop on Windows
regularly, if they can live with any of the git clients for Windows.

But the group will surely have other requirements that would factor into a
decision to switch, and I would like to learn what these other requirements
might be.  Understanding work flows is probably another area where people
have questions.  And may or may not feel they know how to state the
questions.  

 > > [...]What would a transition plan look like?  Suppose we all try git for
 > > a bit, decide we like it, decide to switch.  How does that go?
 > >
 > > There are two primary options:
 > >
 > > 1) Import the entire PLplot history, stuff that up into the SF plplot git
 > >   repo as the starting point.
 > >
 > > 2) Take plplot svn trunk, check it into git, and go from there.
 > >
 > > Option 1) seems like it might be the natural/obvious choice.  That said,
 > > in my professional circles, we've actually opted for 2).  And been happy
 > > that way, even over the next year or two.
 > 
 > I am convinced our code history is worth preserving.  That allows us to
 > delete unused code now in confidence that if we ever need it back we can
 > easily get it from the repo.  (For example, I am going to propose removing
 > the currently unused parts of the sys directory tree in the near future
 > because I KNOW we can get any of it back from svn.) Furthermore, our
 > history helps us to keep track of licensing issues which is really
 > important for an open-source project. Finally, our project is one of the
 > oldest open-source projects around so it would be a shame to delete all
 > that history.  So to my mind approach (1) is an absolute requirement, and
 > approach (2) would be a showstopper.  I felt so strongly about this issue,
 > that I put in long hours on the conversion from cvs to svn (including
 > extensive automatic checks of every separate commit message) just to make
 > sure our history was completely preserved.  I will demand similar care for
 > any proposed svn to git conversion project, but I warn such care is going
 > to take lots of effort by anybody who volunteers to do that conversion
 > work. 

In the professional cases I mentioned above, we did not "sacrifice history".
We just agreed that if we needed history prior to the cutover date, we'd go
back to the old SCM to get it.  We didn't throw out the old SCM (which was
cvs), and rm -rf * it.  We held on to it, and used it if necessary.  Our team
was satisfied with that.  Like you, I value the history, and if we switched
to git, and used option 2), I would definitely also require that SVN stay
"live". 

But if you realized that option already, and still mean you require the old
history to be preserved in the new repo, so that you would never have to type
svn again, well, it is definitely possible to achieve that, and I'm certainly
not opposed to it.

 > In any case, I think we should put off this decision at least for a
 > year.  

!!!  Yikes.  Svn is killing me!  Okay, slight exaggeration.

But I would like to understand your requirements in detail.  That would help
me assess the work involved in satisfying them.  If your requirements knock
me over, then yeah, waiting a year for others to hammer on the svn to git
import system might turn out to be the best option.  But I've migrated stuff
from cvs into git, and been happy with the result, and have used git
exclusively for two years now, with absolutely no regrets.  So rather than
just glibly putting it off for a year, I'd rather understand what you feel is
essential before you could be convinced.  And I'd rather not make a decision
on a timeline/goal, until I see what your list looks like.

 > My idea is to let other projects be the SF guinea pigs to work out
 > all the bugs in the freshly minted SF git support as well as to improve
 > the tools for conversion of a project (including all its history) from svn
 > to git.
 > 
 > In sum, I am against moving to git for now, but we should consider this
 > possibility again in the future (say when a majority of our developers are
 > enthused about git from personal experience with it for other projects, and
 > when someone steps forward to do the svn to git conversion work along with
 > the hard part which is the required detailed checking of same).

Mmm.  Are there ways we could explore the potential of git directly on this
project?  I'll bet there are.  If there's interest, there surely is a way to
dabble and test the waters.

I'm not talking about cutting over from svn to git in some sort of a whiplash
inducing sudden irreversible transaction.  Rather, I'm looking for ways we
could keep svn as the project master repo for now, but test the waters a bit,
shall we say, with git.

If I found a way to publish a PLplot git repo with something new in it, is
there anyone who would be willing to pull it and give it a whirl?

I don't actually have a good way to do that right now, outside of SF.  My
current employer is not into providing web services for extracurricular
activities, so to speak.  So I'll have to think a bit about how I might do
that.

Cheers,

-- 
Geoff
