Thanks for taking the time to clean these up!
Christopher wrote:
Not much change. After doing 'git gc --aggressive --prune=now' on a 'git
clone --mirror', the repo size was 33M before the removal of these refs,
and 27M after. Since they were mostly pointing to existing blobs, I
wouldn't expect it to have dropped much. I'm actually a bit surprised it
dropped as much as 6M.
That said, I don't know how frequently ASF runs a 'gc', or whether git does
that automatically on the ASF remote, so I don't know if or when the
slightly smaller size will benefit anybody.
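For anyone who wants to repeat the measurement, here is a minimal sketch of the procedure described above. It builds a throwaway local repository to stand in for the real remote (the temporary paths and the "demo" identity are assumptions for illustration only); against the real repository you would pass its URL to 'git clone --mirror' instead.

```shell
# Throwaway identity so the demo commit works on any machine (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Stand-in for the real remote: a scratch repository with one commit.
origin=$(mktemp -d)
git init -q "$origin"
git -C "$origin" commit -q --allow-empty -m "initial commit"

# Mirror-clone it, repack as aggressively as possible, then report the size.
mirror=$(mktemp -d)/mirror.git
git clone -q --mirror "$origin" "$mirror"
git -C "$mirror" gc --quiet --aggressive --prune=now
git -C "$mirror" count-objects -vH   # "size-pack" is the packed size
du -sh "$mirror"                     # total on-disk size of the mirror
```

Running the last three commands before and after deleting the refs gives the two numbers to compare.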
On Fri, Mar 4, 2016 at 4:32 PM William Slacum<[email protected]> wrote:
Any stats on what the repo size is after removing the refs and doing
something like `git gc`?
On Fri, Mar 4, 2016 at 4:25 PM, Christopher<[email protected]> wrote:
I was able to delete 135 duplicate refs of the kind I described. Only one
resulted in a new branch being created (ACCUMULO-722). We probably don't
need that at all, but it might be useful to turn its commits into patches
to attach to the "Won't Fix" ticket, rather than preserve them as an
inactive branch.
Also note that the ACCUMULO-722 branch is not rooted on any other branches
in our git repo. It was essentially just a sandbox in svn where Eric had
been working.
On Wed, Mar 2, 2016 at 6:14 PM Christopher<[email protected]> wrote:
(tl;dr version: I'm going to clean up refs/remotes/** in git, which
contains duplicate history and messes with 'git clone --mirror'; these are
refs which are neither branches nor tags and are leftover from git-svn)
So, when we switched from svn to git, a lot of refs were left behind in
the git repository. They come from old branches/history which have already
been merged into the branches/tags that we've since created. I think they
are the result of weird git-svn behavior. These can, and should, be
cleaned up.
You can see all of them when you do a:
git ls-remote origin
In that output, our current branches are the refs/heads/*, and our tags
are the refs/tags/*
The extras which need to be cleaned up are the refs/remotes/* (including
refs/remotes/tags/*)
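To pick out just the extras from that output, something like the following should work. This is a sketch against a throwaway local repository that imitates the layout above; the ref names "old-branch" and "1.0.0-RC1" are made up for illustration, and on the real repository you would replace "$origin" with the remote name or URL.

```shell
# Throwaway identity so the demo commit works on any machine (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Scratch "origin" with one branch, one tag, and two hypothetical leftovers.
origin=$(mktemp -d)
git init -q "$origin"
git -C "$origin" symbolic-ref HEAD refs/heads/master
git -C "$origin" commit -q --allow-empty -m "initial commit"
tip=$(git -C "$origin" rev-parse HEAD)
git -C "$origin" tag 1.0.0
git -C "$origin" update-ref refs/remotes/old-branch "$tip"
git -C "$origin" update-ref refs/remotes/tags/1.0.0-RC1 "$tip"

# Keep only the refs that are neither branches nor tags.
extras=$(git ls-remote "$origin" | awk '$2 ~ /^refs\/remotes\// { print $2 }')
printf '%s\n' "$extras"
```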
As you can see, these are duplicates of branches which have been merged in
already, or temporary tags which didn't make it to a release (release
candidates) but whose relevant history is already in our normal git
history, or they are branches which were abandoned on purpose
(ACCUMULO-722).
Usually these extra refs don't present a problem, because we don't
normally see them when we clone (they aren't branches which are normally
fetched). However, there are a few cases where this is a problem. In
particular, they show up when you do "git clone --mirror", and if you push
this mirror to another git repository, like a GitLab mirror (git push
--mirror), they show up as extra branches which don't appear to exist in
the original (a very confusing situation for a "mirror").
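The difference between the two kinds of clone can be reproduced locally. The sketch below uses a scratch repository with one made-up leftover ref ("old-branch" is hypothetical) and compares what a plain clone and a mirror clone end up with.

```shell
# Throwaway identity so the demo commit works on any machine (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Scratch "origin" with one branch and one hypothetical leftover ref.
origin=$(mktemp -d)
git init -q "$origin"
git -C "$origin" symbolic-ref HEAD refs/heads/master
git -C "$origin" commit -q --allow-empty -m "initial commit"
git -C "$origin" update-ref refs/remotes/old-branch \
    "$(git -C "$origin" rev-parse HEAD)"

work=$(mktemp -d)
git clone -q "$origin" "$work/plain"                 # fetches branches (and tags) only
git clone -q --mirror "$origin" "$work/mirror.git"   # copies every ref as-is

echo "plain clone:"
git -C "$work/plain" for-each-ref --format='  %(refname)'
echo "mirror clone:"
git -C "$work/mirror.git" for-each-ref --format='  %(refname)'
```

Only the mirror clone lists refs/remotes/old-branch, which is what then gets pushed along by 'git push --mirror'.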
The interesting thing about these is that even when they have the same
history as the git branches/tags we maintain now, the SHA1s don't match up.
This seems to imply they were leftover from a previous invocation of
git-svn.
So, what I'd like to do is go through each of these extra refs one by one,
and figure out if we already have this history in our branches/tags. If we
do, then I'd delete these extras. If we don't (as in the case of
ACCUMULO-722), I'd just convert that to a normal git branch (refs/heads/*)
until we decide what to do with it at some future point in time (for
example, perhaps do a 'git format-patch' on it and attach the files to the
"Won't Fix" ticket so we can delete the dead branch? not sure, but that can
be deferred).
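A rough sketch of how that pass could be scripted follows. It is not necessarily how the cleanup was actually done. Because the commit SHA1s differ, it compares tree SHA1s instead: an extra ref counts as a duplicate if its tip tree also appears somewhere in the history of our branches/tags. That is a heuristic and an assumption on my part, so anything it flags still deserves a manual look. The scratch repository, the ref name "1.4", and the file contents are made up for the demo.

```shell
# Throwaway identity so the demo commits work on any machine (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/master
git -C "$repo" commit -q --allow-empty -m "initial commit"

# Same tree as master but a different commit SHA1: stands in for a duplicate.
tree=$(git -C "$repo" rev-parse 'HEAD^{tree}')
dup=$(git -C "$repo" commit-tree "$tree" -m "re-imported from svn")
git -C "$repo" update-ref refs/remotes/1.4 "$dup"

# A tree found nowhere else: stands in for the ACCUMULO-722 sandbox.
blob=$(echo "sandbox work" | git -C "$repo" hash-object -w --stdin)
newtree=$(printf '100644 blob %s\tnotes.txt\n' "$blob" | git -C "$repo" mktree)
orphan=$(git -C "$repo" commit-tree "$newtree" -m "sandbox work")
git -C "$repo" update-ref refs/remotes/ACCUMULO-722 "$orphan"

# Every tree that appears in the history of maintained branches and tags.
known=$(git -C "$repo" log --format=%T --branches --tags)

# Snapshot the list of extras first, then delete or convert each one.
extras=$(git -C "$repo" for-each-ref --format='%(refname)' refs/remotes)
for ref in $extras; do
  sha=$(git -C "$repo" rev-parse "$ref")
  if printf '%s\n' "$known" | grep -qxF "$(git -C "$repo" rev-parse "$ref^{tree}")"; then
    echo "duplicate, deleting: $ref"
  else
    echo "unique, keeping as branch: $ref"
    git -C "$repo" update-ref "refs/heads/${ref#refs/remotes/}" "$sha"
  fi
  git -C "$repo" update-ref -d "$ref"
done
```

This only touches a local repository. On the real remote, the delete would be a push of an empty source, e.g. 'git push origin :refs/remotes/<name>', run from a clone where the refs to keep have already been copied under refs/heads/.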