On Thu, 14 Sep 2017, Jeff King wrote:

> On Thu, Sep 14, 2017 at 07:32:11AM -0400, Robert P. J. Day wrote:
>
> >   [is this the right place to ask questions about git usage? or is
> > there a different forum where one can submit possibly
> > embarrassingly silly questions?]
>
> No, this is the right place for embarrassing questions. :)
>
> >   say, early on, one commits a sizable directory of content, call
> > it /mydir. that directory sits there for a while until it becomes
> > obvious it's out of date and worthless and should never have been
> > committed. the obvious solution would seem to be:
> >
> >   $ git filter-branch --tree-filter 'rm -rf mydir' HEAD
> >
> > correct?
>
> That would work, though note that using an --index-filter would be
> more efficient (since it avoids checking out each tree as it walks
> the history).

  i'm just digging into --index-filter as we speak, i realize it's
noticeably faster.

> >   however, say one version of that directory was committed early
> > on, then later tossed for being useless with "git rm", and
> > subsequently replaced by newer content under exactly the same
> > name. now i'd like to go back and delete the history related to
> > that early version of /mydir, but not the second.
>
> Makes sense as a goal.
>
> >   obviously, i can't use the above command as it would delete both
> > versions. so it appears the solution would be a trivial
> > application of the "--commit-filter" option:
> >
> >    git filter-branch --commit-filter '
> >      if [ "$GIT_COMMIT" = "<commit-id>" ] ; then
> >        skip_commit "$@";
> >      else
> >        git commit-tree "$@";
> >      fi' HEAD
> >
> > where <commit-id> is the commit that introduced the first version of
> > /mydir. do i have that right? is there a simpler way to do this?
>
> No, this won't work. Filter-branch is not walking the history and
> applying the changes to each commit, like rebase does.  It's
> literally operating on each commit object, and recall that each
> commit object points to a tree that is a snapshot of the repository
> contents.
>
> So if you skip a commit, that commit itself goes away. But the
> commit after it (which didn't touch the unwanted contents) will
> still mention those contents in its tree.
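
  [a throwaway sketch of the snapshot point above, not taken from the
thread: the repo, "mydir/old.txt" and "other.txt" are made-up names.
it builds a two-commit repo in a temp dir and shows that a commit
which never touched mydir still lists it in its tree.]

```shell
#!/bin/sh
# Hypothetical scratch repo; every path below is invented for the demo.
set -e
snap=$(mktemp -d)
cd "$snap"
git init -q .
git config user.email demo@example.com
git config user.name demo

# Commit 1 introduces the unwanted directory.
mkdir mydir
echo junk >mydir/old.txt
git add .
git commit -q -m "add mydir"

# Commit 2 touches an unrelated file only.
echo hello >other.txt
git add other.txt
git commit -q -m "unrelated change"

# What commit 2 changed relative to its parent...
changed=$(git diff-tree --no-commit-id --name-only -r HEAD)
# ...versus the full snapshot its tree records.
snapshot=$(git ls-tree -r --name-only HEAD)

echo "changed by HEAD: $changed"
echo "tree of HEAD:"
echo "$snapshot"
```

  so dropping commit 1 alone would leave mydir in the tree of commit 2.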

  ah, of course, duh.

> I think you want to stick with a --tree-filter (or an
> --index-filter), but just selectively decide when to do the
> deletion. For example, if you can tell the difference between the
> two states based on the presence of some file, then perhaps:
>
>   git filter-branch --prune-empty --index-filter '
>       if git rev-parse --verify :dir/sentinel >/dev/null 2>&1
>       then
>         git rm --cached -rf dir
>       fi
>   ' HEAD
>
> The "--prune-empty" is optional, but will drop commits that become
> empty because they _only_ touched that directory.
>
> We use ":dir/sentinel" to see if the entry is in the index, because
> the index filter won't have the tree checked out. Likewise, we need
> to use "rm --cached" to just touch the index.
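
  [a minimal sketch for trying the filter above on a scratch repo
before pointing it at real history. the temp repo, README, "dir/file"
and the three commits are assumptions made up for the test; only the
filter itself comes from the message above. FILTER_BRANCH_SQUELCH_WARNING
just silences the notice that newer git versions print.]

```shell
#!/bin/sh
# Hypothetical scratch repo; file names and commits are invented for the demo.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name demo

# Commit 1: the old version of dir, recognizable by dir/sentinel.
mkdir dir
echo old >dir/sentinel
echo keep >README
git add .
git commit -q -m "add old dir"

# Commit 2: the old version is removed with "git rm".
git rm -q -r dir
git commit -q -m "remove old dir"

# Commit 3: new content under the same name, without the sentinel.
mkdir dir
echo new >dir/file
git add .
git commit -q -m "add new dir"

# Drop dir only from commits whose index still contains the sentinel.
git filter-branch --prune-empty --index-filter '
    if git rev-parse --verify :dir/sentinel >/dev/null 2>&1
    then
      git rm --cached -rf dir
    fi
' HEAD >/dev/null 2>&1

# The sentinel should be gone from history; the new dir should survive.
echo "commits left: $(git rev-list --count HEAD)"
echo "commits touching dir/sentinel: $(git log --format=%H HEAD -- dir/sentinel | wc -l)"
echo "dir at HEAD: $(git ls-files dir)"
```

  in this setup the "remove old dir" commit becomes empty after the
rewrite, so --prune-empty drops it.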

  got it. one last query -- i note that there is no "else" clause in
that code for "--index-filter". am i assuming correctly that if i was
using "--tree-filter" instead, i really would need if/then/else along
the lines of:

  if blah ; then
    skip_commit "$@"
  else
    git commit-tree "$@"
  fi

thank you kindly.

rday

-- 

========================================================================
Robert P. J. Day                                 Ottawa, Ontario, CANADA
                        http://crashcourse.ca

Twitter:                                       http://twitter.com/rpjday
LinkedIn:                               http://ca.linkedin.com/in/rpjday
========================================================================
