Fwd: Best practices for updating old repos

Michael O'Cleirigh Fri, 16 Jun 2017 03:17:10 -0700

(Sorry I sent this originally last night in gmail but not in plain
text mode and it bounced)

Hi Michael,

In git, if you don't merge often, you end up in this kind of
merge-conflict hell.

In my experience the main conflicts come not from the unified diff of
those 130 commits but from differences in the surrounding code.

Merging, rebasing, or cherry-picking directly onto the latest upstream
sounds impossible to me.

These conflicts come from the distance between the local fork branch
and the upstream branch.

You need to merge through closer commits first to have a hope of
getting something automatic to work.

Something like getting the list of releases made upstream in the last
5 years and merging them in order into the fork branch.

i.e. merge v1, merge v2, ... merge v300
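
A rough sketch of that loop, in case it helps. It assumes the upstream
releases are tagged with names matching "v*" that sort sensibly under
git's version sort, and that the fork branch is checked out; the
function name merge_releases_in_order is made up for illustration.

```shell
#!/bin/sh
# Sketch only: merge upstream release tags into the current (fork)
# branch one at a time, oldest first, stopping at the first conflict.
# Assumption: release tags match the given pattern (default "v*").
merge_releases_in_order() {
    pattern=${1:-v*}
    for tag in $(git tag --list "$pattern" --sort=version:refname); do
        # Skip releases the current branch already contains.
        if git merge-base --is-ancestor "$tag" HEAD; then
            continue
        fi
        echo "Merging $tag"
        if ! git merge --no-edit "$tag"; then
            echo "Conflict merging $tag; resolve, commit, re-run." >&2
            return 1
        fi
    done
}
```

After fixing a conflict and committing, running the function again
picks up at the next release, since anything already merged is skipped.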

I went through something similar with a subversion repo we converted to git.

In subversion they were cherry-picking finished work into a release branch.

In git a feature branch mode was being used.

It turned out some commits were never cherry picked and bringing them
to the latest release was hard.

We tried many of the approaches you outlined, took what git would give
us automatically, and in the hairiest cases recreated the changes on
the latest upstream by reading the diff of the original commit and
rewriting it on the latest code.

In terms of how the history looks after the merge conflicts are
resolved, you could fold the fixups into a single commit applied
onto the original fork branch, so that history would show the
130-commit branch merged directly into the upstream.

You would use the git commit-tree command to reuse the merged tree id
in a new merge commit whose parents are the 130th commit id and the
upstream commit id.
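
Roughly like this, as a sketch rather than a tested recipe. The branch
names in the usage note (fork, upstream/master, merge-work) and the
function name squash_merge_history are placeholders, not anything from
your repo.

```shell
#!/bin/sh
# Sketch only: build one merge commit that carries the fully resolved
# tree but has just two parents, the fork tip and the upstream commit.
# Arguments (all hypothetical names supplied by the caller):
#   $1  tip of the original fork branch (the 130th commit)
#   $2  upstream commit being merged
#   $3  commit whose tree holds the final, conflict-resolved result
squash_merge_history() {
    fork_tip=$1
    upstream_tip=$2
    resolved=$3
    # Reuse the tree of the resolved commit as-is.
    tree=$(git rev-parse "$resolved^{tree}") || return 1
    # commit-tree prints the id of the new commit on stdout.
    git commit-tree "$tree" -p "$fork_tip" -p "$upstream_tip" \
        -m "Merge $upstream_tip into $fork_tip"
}
```

For example: new=$(squash_merge_history fork upstream/master merge-work)
followed by git branch fork-merged "$new". Nothing is rewritten in
place, so the incremental merge branch stays around until you delete it.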

Regards,

Michael

On Thu, Jun 15, 2017 at 8:52 PM, Michael Eager <ea...@eagerm.com> wrote:
>
> Hi All --
>
> I'm working with code that is based on a five year old repository.
> There are 130 local commits since the repo was forked.  Naturally,
> the upstream project has moved on significantly.
>
> I'm wondering about best approaches to updating the repo to the
> current upstream version.  Here are the approaches I've considered:
>
> - Rebase from upstream.  Likely almost every patch will fail with
>   multiple merge conflicts.
>
> - Merge local branch into upstream.  Likely many merge failures, but
>   fewer than with rebase.
>
> - Apply individual patches from the old repo to the upstream repo.
>   Fix merge conflicts, rebuild, fix build failures.  There may be
>   some duplication and additional merge problems created, where a
>   later patch from the old repo fixes the same conflict or build
>   failure.
>
> I've tried each of these approaches on various projects.  Each has
> problems. After resolving merge issues there are build failures which
> need to be resolved and additional patches created.  The result is
> that the patch history is a bit chaotic, where there are later patches
> which fix problems with early patches.  I've tried to sort the fix
> patches to follow the patch they correct, so that the fixes were
> together and I could merge them, but that can be difficult.
>
> I've used Stacked Git a little, but don't know if it will make
> any of this easier.
>
> On some projects, I've reimplemented changes in the upstream repo,
> abandoning the patch history from the old repo:
>
> - Create diff of old repo and upstream.  Apply only the changes
>   to add new functionality, which are in the patches to the
>   old repo.   Fix problems caused by API changes, renamed files, etc.
>
> - Re-implement the changes on the upstream repo.  Some of the old
>   code would be re-used, but modified to fit in the current upstream.
>   Some new code would be written.
>
> One other variant of the rebase approach I've thought of is to do
> this incrementally, rebasing the old repo against an upstream commit
> a short time after the old repo was forked, fixing any conflicts,
> rebuilding and fixing build failures.  Then repeat, with a bit
> newer commit.  Then repeat, until I get to the top.  This sounds
> tedious, but some of it can be automated.  It also might result in
> my making the changes compatible with upstream code which was later
> abandoned or significantly changed.
>
> Anyone have a different approach that I should consider?  Or maybe
> offer advice on how to make one of these approaches work better?
> What is best practice to update an old repo?
>
> --
> Michael Eager    ea...@eagercon.com
> 1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
