Hello all,

Greetings after a long sabbatical.

Lately I've been trying to get re-engaged with PLplot development.  One of
the first things I'm trying to get done is to merge the python branch that I
worked on about 15 months ago, and get that stuff onto trunk.  Then there is
some more python-oriented work I'd like to do from there.

We've undertaken a few big changes on the PLplot project over the years.  I
guess the biggest one of late, was the cutover to cmake.  Personally, I'm
still trying to recover from that.

In my outside-PLplot life, I and most people I work with in the professional
arena, ditched cvs/svn, and switched to git.  For me, and most of my closest
professional colleagues, this switch has been both huge, and positive.

By a huge switch, I don't so much mean to evoke worries of astronomical
catalysis barriers, as just to say that the change to git has had a really
big impact on how we do development.

I have become so enamored with git, that I have come to view it as somewhat
arduous to work with other, lesser systems.  SVN is at the top of my personal
list of "lesser" systems.  Sometimes I wonder if the problem is just lack of
education on svn-ways, but I keep running across internet articles where
others lament the same difficulties as I have, so I don't think it's just me.

Lately, in my attempts to get caught up with PLplot, I've actually been
using the git svn client to track our svn repo.  Some of you might've
noticed my trifling commits a few days ago.  Those were committed using
git-svn, with no direct personal interaction with svn itself.

The main purpose of this post is just to poll the other developers, all of
you who are currently much more actively involved in PLplot than I am (or
even than I expect to be once I regain my stride, so to speak), and see
if any of you would be interested in seeing PLplot switch to git.

Here's the SF page relating to project git support:

    http://apps.sourceforge.net/trac/sourceforge/wiki/Git

Is anyone besides me interested in making the switch to git for the PLplot
project? 

On the off chance that there are some of you who don't have a strong opinion
yet, and are willing to consider the options, I thought I'd just relate some
of my personal experiences surrounding the cutover to git.

At my former company, Lightspeed Logic, we eventually realized that we were
outgrowing cvs.  We hired more people, had more going on, and decided to
consider switching SCM's in the interest of promoting team productivity.  We
did a sort of pilot study, in which people tried out both svn and git, then
we had a results comparison meeting, and ended up choosing git.

Looking back, the one thing I would say about our efforts to compare svn
versus git, is that we were way too gracious to svn in the comparison.  The
reasons we chose to switch to git instead of svn, proved to be much stronger
in retrospect than we realized at the time we made the decision.

I really don't want to provoke any sort of major flame war here, so I enter
into a discussion of my perception of the technical stuff with some
trepidation.  Fine with me if others  post alternative or contradictory
viewpoints, but if it comes to that, let's just keep in mind that we're
talking about personal perceptions and experiences.

One of the biggest things that hits newcomers to git, is the whole concept of
distributed source control.  Git docs talk about peer relationships and so
on, and people sometimes wonder how you maintain any control in a git-managed
project.  The git docs sometimes talk as if they're all about empowering
anti-establishment rogue coders, and this fuels the perceptions that git is
impossible to use for a project where anyone wants a sense of control.

My answer to this is that it's just a policy decision, which is outside the
space of SCM per se.  In other words, you just designate a repo as your
"official" project/company repo, and that's it.  It's official because you
say so, not because it occupies any particular point in the software
collaboration interaction diagram.  SF provides a technique for hosting and
communicating with that repo, so it's not at all hard for us to establish the
identity of the "master project repo".  The policy needs to be backed up
with some relatively straightforward understandings, such as, releases will
be built from tags present in the project master repo, etc.

The next big thing to understand about git is that it makes branching
beautiful.  Others have written about the mechanisms of git branching, so I
won't go into the mechanisms.  I just want to talk here about the experience,
so to speak.  In git, you can fork a branch in a heartbeat.  You can switch
between branches just as fast.  Well, you might want to do a commit of your
work before switching away from a branch, but anyway it's still fast.  And you
can merge branches essentially willy-nilly, and git always seems to know what
you're doing.
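
A minimal sketch of that experience, run in a throwaway repository.  The file
name, branch name, and identity below are made up for illustration and are not
anything from PLplot.

```shell
#!/bin/sh
# Sketch: fork a branch, commit on it, and switch back, in a scratch repo.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # pin the branch name regardless of local defaults
git config user.name "Example Dev"           # placeholder identity, local to this repo only
git config user.email "dev@example.invalid"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

git checkout -q -b topic                     # fork a branch and switch to it in one step
echo "experiment" >> notes.txt
git commit -q -a -m "Work on topic"          # commit before switching away

git checkout -q master                       # switch back; the working tree follows the branch
```

After the last command, notes.txt holds only the master version again, while
the topic commit stays safely on its own branch.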

With cvs, you could fork a branch, work on it for a while, then merge it back
to the head.  As long as you used branches that way, they worked okay.  In
svn, it seems--almost unbelievably--worse.  Apparently to do branch work
right in svn, you have to actually keep crib notes of your revisions for
subsequent manual merging.  Frankly what I've read about it in the context of
svn has been so off-putting, that it's the main reason I haven't merged the
python branch back to trunk yet.  The way I'll get it done, at this point, is
probably going to be through git (using git svn if PLplot stays svn-centric).

With git, when you want to merge a branch, you just say git merge <other
branch>, and that merges other branch into your current branch.  And this
works in all directions.  You can merge topic branches into master,
quickly and easily.  And you can merge the master branch into your topic
branch just as easily.  This lets you "track" trunk (called master in
git-speak), easily, while maintaining as-yet unfinished topic work on a topic
branch, until it's really ready to merge.
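
Here is a rough sketch of merging in both directions, again in a scratch
repository.  The file names and the "topic" branch are hypothetical; the
--no-edit flag is only there so the script does not stop to open an editor for
the merge message.

```shell
#!/bin/sh
# Sketch: merge master into a topic branch, then the topic branch into master.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # pin the branch name regardless of local defaults
git config user.name "Example Dev"           # placeholder identity
git config user.email "dev@example.invalid"

echo "base" > core.txt
git add core.txt
git commit -q -m "Base"

git checkout -q -b topic                     # unfinished work lives here
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Topic work"

git checkout -q master                       # meanwhile, master moves on
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix on master"

git checkout -q topic
git merge -q --no-edit master                # track master from the topic branch

git checkout -q master
git merge -q --no-edit topic                 # later, when ready: topic into master
```

Because topic already contains everything on master at that point, the final
merge is a simple fast-forward.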

If I had to pick one thing about git which is truly superlative, it's the
branch management.  But, there isn't just one thing about git which makes it
great.  It's beyond being just merely "great" because of so many features
that all conspire together to make git a true game-changer in the world of
SCM. 

The next thing to understand about git is that it supports a wide variety of
what I call "collaboration paradigms".  And I could hardly write down all the
possible work flows.  Instead, let me just paint a picture of a few.

1) The same-old same-old cvs/svn approach, practiced with git.

The project has a master repo, the one at SF.
Developers clone it.
They change code, check it in (git commit) to their local repo.
Developers do "git pull" to track upstream changes.
Developers do "git push" to push their changes up to the project master repo.

2) Individuals using branches, still mostly centralized

The project has a master repo, the one at SF.
Developers clone it.
Developers create topic branches, do work thereupon.
Developers push their branches up to the project master repo, as well
as their changes to the master branch.
Developers occasionally merge topic branches to the master branch, and
push up this result of integration.

3) "Distributed development gone wild"

The project has a master repo, the one at SF.
Developers clone it.
Developers create topic branches, do work thereupon.
Developers publish their changes to "collaboration hubs".
Other developers pull back and forth from their peers on as-interested
basis. 
Eventually subteams decide something is "done enough for public
consumption".  
Someone does a git checkout master; git merge topic; git push origin master
to make the multi-developer topic branch integrated into the master branch
and available in the project master repo.

So that's a description of some work flows.  Of course that's just the
beginning.  You can get a lot wilder than 3) if you're so inclined.  And I
guess a key point here is that different members can practice the above
collaboration styles concurrently.  It's not so much about abolishing option
1), as just enabling options 2) and 3).
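
Workflows 1) and 2) can be sketched with local directories standing in for the
SF-hosted repo.  Everything named here (the "official.git" path, the developers
"alice" and "bob", the file names) is an assumption for illustration; a real
setup would use the project's actual repo URL instead of a local path.

```shell
#!/bin/sh
# Sketch: a designated "official" bare repo, two clones, push and pull.
set -eu

base=$(mktemp -d)

# Seed some history and publish it as the bare "official" repo.
git init -q "$base/seed"
cd "$base/seed"
git symbolic-ref HEAD refs/heads/master      # pin the branch name regardless of local defaults
git config user.name "Seed"                  # placeholder identity
git config user.email "seed@example.invalid"
echo "v1" > README
git add README
git commit -q -m "Initial import"
git clone -q --bare "$base/seed" "$base/official.git"

# Developers clone it.
git clone -q "$base/official.git" "$base/alice"
git clone -q "$base/official.git" "$base/bob"

# Workflow 1: commit locally, then push to the official master.
cd "$base/alice"
git config user.name "Alice"
git config user.email "alice@example.invalid"
echo "alice change" >> README
git commit -q -a -m "Alice's change"
git push -q origin master

# Workflow 2: publish a topic branch alongside master.
git checkout -q -b topic
echo "plan" > plan.txt
git add plan.txt
git commit -q -m "Start topic work"
git push -q origin topic

# Another developer tracks upstream changes.
cd "$base/bob"
git config user.name "Bob"
git config user.email "bob@example.invalid"
git pull -q --ff-only origin master          # pick up Alice's change on master
git fetch -q origin                          # learn about the published topic branch
```

Nothing in git itself marks official.git as special; it is the master repo
only because the developers agree to push to it and pull from it.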

So why bother?  Why is it worth it to take a step up to a greater degree of
complexity in the software development process and tooling?

In my professional projects where I have used git exclusively for over two
years now, I find these to be the most significant points I have observed on
my git journey.

1) SCM systems, particularly centralized ones like cvs/svn, actually
   discourage some developers from participation, due to the overhead of
   synchronization.

Just look at PLplot trunk activity.  Just during the period where I've been
trying to get re-engaged here over the last few weeks, I see two or three
roughly completely independent whirlwinds of activity, all swirling around in
the code base at the top of trunk.  Every time somebody does something, they
"commit" it to trunk.  Everybody's got to update, and this causes lots of
rerunning cmake, rebuilding, etc.  Hardly anyone is using branches.  We
actually used to use branches a bit back in the cvs days, but during the svn
era of PLplot, I'm not sure there have been any branches used, save the
python branch which I didn't manage to merge quickly enough, and now it's
hopelessly orphaned in the svn quagmire.  With centralized SCM, the only
good way to collaborate with peers, is through the central repo.  And with
svn branches being quite uninviting, to put it politely, people tend to
collaborate through the central repo trunk (master branch in git-speak).
This is obviously inefficient, and it's disruptive enough that it discourages
people from involvement.

The way git changes this game, is by making it really really easy for a
person to "live on a branch".  If you fork a branch, and work on it for a
while, you don't need to feel like you're getting lost, like something that
fell off a fast moving train.  At any time, you can just do git merge master
on your topic branch, and you're tracking master, without having to inject
your own, not yet ready changes onto the master.  You can track master, and
develop your topic at your own speed, and commit it to master and push it up
for public consumption only when you're really ready.

That means fewer "pushes" hit the project master, means other developers
spend less time "integrating" the changes from others, spells less wasted
time on the not-so-fun part of collaboration, and more developer focus just
going into your own topic branch development work.

Quite literally, git delivers a huge windfall of incentive for developers to
actually participate in an SCM managed project, instead of avoiding it.

2) Exploratory work is greatly facilitated.

To me, this is perhaps the greatest benefit of using git.  In cvs/svn, the
only way you can "commit" your work, is to shove it into the one centralized
master project repo.  Even if you use a branch, it's still up there,
occupying space in the permanent record.  Because of this, it is actually a
disincentive to do exploratory work.  People get conservative over time, and
only want to "commit" stuff which is going to be a "keeper".  Since people
don't like to not commit their work, they tend to consolidate around only
"keeper" activities, and true innovation suffers.  In part, this is traceable
to centralized repos.  If you do something dumb, and you check it in, it's up
there forever.

With git (and this would be true of other decentralized scms), you can do
your exploratory development, check it in, obtain all the benefits of using
an SCM, and *still not be committed to having others see it*.  You only need
to push it up when you decide it's a keeper.  Sometimes you don't know that
at the beginning.  

With git, you can do exploratory development, and then throw it away,
without ever bothering others with it, if you so choose.   This is really
enabling. 
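
As a small sketch of that throwaway style (the "experiment" branch and plot.c
file are hypothetical names), the following commits exploratory work locally
and then discards it.  Nothing is ever pushed, so no shared repo sees it.

```shell
#!/bin/sh
# Sketch: commit exploratory work on a local branch, then throw it away.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # pin the branch name regardless of local defaults
git config user.name "Example Dev"           # placeholder identity
git config user.email "dev@example.invalid"

echo "stable" > plot.c
git add plot.c
git commit -q -m "Stable baseline"

git checkout -q -b experiment                # exploratory work gets full version control
echo "wild idea" >> plot.c
git commit -q -a -m "Try a wild idea"
echo "second attempt" >> plot.c
git commit -q -a -m "Refine the idea"

git checkout -q master                       # decide it is not a keeper
git branch -q -D experiment                  # -D deletes the branch even though it is unmerged
```

Had the experiment turned out to be a keeper, a merge into master followed by a
push would have published it instead.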

Fantastic branching, and decentralized repos.  Those are the things that make
git such a game changer in the world of software development.  IMO.

Questions I could see people asking:

Why not just use git-svn?  If you love git so much, how about you just use
git-svn?

Well, I will, if that's where this leads.  But with direct git hosting now an
SF supported option, I thought I should just put it out there for
consideration, that we could in fact, switch the whole project to git.  Maybe
the question is, why switch the project, when you could just use git-svn on
your own?

Well, git-svn ain't git.  And if PLplot sticks with svn as the master repo,
then we're going to keep seeing the whole developer base doing everything on
trunk, because it doesn't really look like anyone's very interested in using
svn branches (no surprise, since svn branching is so unappealing).

Git repos are also smaller, and git is much faster.

I'd like to see us switch.  But I will use git-svn personally if that's my
only option.


What would a transition plan look like?  Suppose we all try git for a bit,
decide we like it, decide to switch.  How does that go?

There are two primary options:

1) Import the entire PLplot history, stuff that up into the SF plplot git
   repo as the starting point.

2) Take plplot svn trunk, check it into git, and go from there.

Option 1) seems like it might be the natural/obvious choice.  That said, in
my professional circles, we've actually opted for 2).  And been happy that
way, even over the next year or two.

So, personally, I'd probably just do 2) again, if others were willing to make
the switch, and willing to take this shortcut.  But if others were willing to
make the switch only on the condition of doing it the 1) way, well, I'd be
willing to help with that too.
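
For a feel of what option 2) involves, here is a minimal sketch.  It assumes a
plain directory standing in for an exported copy of svn trunk and a local bare
repo standing in for the SF-hosted git repo; both paths and the sample files
are made up.  Option 1) would instead go through a history-preserving import
such as git svn clone, which is not shown because it needs a live svn server.

```shell
#!/bin/sh
# Sketch of option 2): start git history from a snapshot of trunk.
set -eu

base=$(mktemp -d)

# Stand-in for an exported svn trunk (no .svn metadata inside).
mkdir -p "$base/trunk/src"
echo "int main(void) { return 0; }" > "$base/trunk/src/main.c"
echo "project(demo)" > "$base/trunk/CMakeLists.txt"

# Stand-in for the empty, hosted project repo.
git init -q --bare "$base/official.git"

cd "$base/trunk"
git init -q
git symbolic-ref HEAD refs/heads/master      # pin the branch name regardless of local defaults
git config user.name "Example Dev"           # placeholder identity
git config user.email "dev@example.invalid"
git add .
git commit -q -m "Import svn trunk snapshot"

git remote add origin "$base/official.git"
git push -q origin master                    # publish the single starting commit
```

The old svn repo would remain the place to look up pre-switch history.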

So there it is.  I look forward to seeing what others have to say about
this.  Bottom line:  Is anybody other than me interested in switching PLplot
to git?

-- 
Geoff

_______________________________________________
Plplot-devel mailing list
Plplot-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/plplot-devel
