Re: Converting repo from HG, `git filter-branch --prune-empty -- --all` is extremely slow and errors out.

2013-09-13 Thread Felipe Contreras
On Thu, Sep 12, 2013 at 7:47 PM, Felipe Contreras wrote:

> Indeed, I remember writing my own simplified version of 'git
> filter-branch' that was much faster. If I recall correctly, the trick
> was avoiding 'git write-tree' which can be done if you are not using
> any tree filter, but 'git filter-branch' is not that smart.
>
> If all you want to do is prune empty commits, it should be easy to
> write a script that simply does 'git commit-tree'. I might decide to
> do that based on my script if I have time today.

Here it is; it's straightforward and should be easy to understand.

-- 
Felipe Contreras


[Attachment: filter-branch (binary data)]


Re: Converting repo from HG, `git filter-branch --prune-empty -- --all` is extremely slow and errors out.

2013-09-12 Thread Felipe Contreras
On Thu, Sep 12, 2013 at 5:01 PM, John Gietzen wrote:
> Background:
> Windows, git version 1.8.3.msysgit.0
> bare repo, 54k commits after migration from HG
> git filter-branch --prune-empty -- --all
>
> I'm trying to clean up our repository after migrating it from HG.  I'm
> running the filter-branch command listed above in an effort to clean up all
> of the garbage commits that HG required ("closing branch" commits and their ilk).
>
> From my past experience, "git filter-branch" is extremely quick when using 
> simple filters, like env-filter, since it doesn't have to touch the working 
> dir.  However, in our case each revision is taking 1-3 seconds; our entire 
> repo will take 30 hours to clean up at this rate.  Normally, this wouldn't be 
> a problem, except that we are getting "sh.exe couldn't start" errors after 
> anywhere between the 5000th and 6000th rewritten commit.  Filter-branch 
> doesn't have support for picking up where it left off, so we are entirely 
> unable to clean up our repo.

Indeed, I remember writing my own simplified version of 'git
filter-branch' that was much faster. If I recall correctly, the trick
was avoiding 'git write-tree' which can be done if you are not using
any tree filter, but 'git filter-branch' is not that smart.

If all you want to do is prune empty commits, it should be easy to
write a script that simply does 'git commit-tree'. I might decide to
do that based on my script if I have time today.
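
The pruning rule described above can be modelled on a toy in-memory history. This is only a sketch and not the script mentioned in this thread: `Commit`, `prune_empty`, and the ids such as "t1" are made up for illustration. It assumes the commits arrive parents-first (as `git rev-list --reverse --topo-order --parents` would list them) and treats a commit as empty when it has exactly one rewritten parent and the same tree as that parent. A real script would call `git commit-tree <tree> -p <parent>` where the comment indicates.

```python
from collections import namedtuple

# Toy stand-in for a commit: its id, the id of its tree, and its parent ids.
Commit = namedtuple("Commit", ["id", "tree", "parents"])


def prune_empty(history):
    """Return (kept_commits, old_to_new) for a parents-first list of commits."""
    old_to_new = {}  # old commit id -> id that replaces it
    new_tree = {}    # rewritten commit id -> tree id
    kept = []
    for c in history:
        # Point at the rewritten parents, dropping duplicates but keeping order.
        parents = []
        for p in c.parents:
            mapped = old_to_new[p]
            if mapped not in parents:
                parents.append(mapped)
        # "Empty" here means: one parent, and the tree did not change.
        if len(parents) == 1 and new_tree[parents[0]] == c.tree:
            old_to_new[c.id] = parents[0]
            continue
        # With real git, 'git commit-tree' would create the new commit here.
        # The toy model marks a commit whose parents changed with a "*" suffix.
        new_id = c.id if tuple(parents) == tuple(c.parents) else c.id + "*"
        old_to_new[c.id] = new_id
        new_tree[new_id] = c.tree
        kept.append(Commit(new_id, c.tree, tuple(parents)))
    return kept, old_to_new


history = [
    Commit("A", "t1", ()),
    Commit("B", "t1", ("A",)),  # tree unchanged, e.g. a "closing branch" commit
    Commit("C", "t2", ("B",)),
]
kept, old_to_new = prune_empty(history)
print([c.id for c in kept])  # ['A', 'C*']
```

No tree is ever written in this model; only tree ids are compared, which is the point of skipping 'git write-tree' when no tree filter is in use.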

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Converting repo from HG, `git filter-branch --prune-empty -- --all` is extremely slow and errors out.

2013-09-12 Thread John Gietzen
Background:
Windows, git version 1.8.3.msysgit.0
bare repo, 54k commits after migration from HG
git filter-branch --prune-empty -- --all

I'm trying to clean up our repository after migrating it from HG.  I'm running
the filter-branch command listed above in an effort to clean up all of the
garbage commits that HG required ("closing branch" commits and their ilk).

From my past experience, "git filter-branch" is extremely quick when using
simple filters, like env-filter, since it doesn't have to touch the working
dir.  However, in our case each revision is taking 1-3 seconds; our entire
repo will take 30 hours to clean up at this rate.  Normally, this wouldn't be
a problem, except that we are getting "sh.exe couldn't start" errors after
anywhere between the 5000th and 6000th rewritten commit.  Filter-branch
doesn't have support for picking up where it left off, so we are entirely
unable to clean up our repo.

All that being said, I have 3 questions:
  1.  Is there anything I can do to speed up the filter-branch command? 
(Alternatively, is there a way I can profile git-filter-branch.sh on msysgit?)
  2.  Any idea why sh.exe would fail?
  3.  Is there a way I can resume the filter-branch command when/if it fails?  
(Alternatively, is there a way I can do the filter-branch in pieces and 
efficiently rebase... or something?)
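
On question 3, one way a custom rewrite script could be made resumable is to persist the old-id to new-id map as it goes, so that a rerun skips whatever an earlier run finished. The sketch below is an assumption-laden illustration and not a feature of filter-branch: `load_map`, `save_map`, `rewrite_resumable`, the `rewrite_one` callback, and the JSON state file are all hypothetical names and choices.

```python
import json
import os


def load_map(path):
    """Load the saved old-id -> new-id map, or start with an empty one."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {}


def save_map(path, mapping):
    """Write to a temporary file first, then swap it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(mapping, f)
    os.replace(tmp, path)


def rewrite_resumable(commit_ids, rewrite_one, state_path, save_every=1000):
    """Rewrite commits parents-first, skipping those an earlier run finished.

    rewrite_one(old_id, mapping) is a caller-supplied (hypothetical) function
    that returns the new id for one commit.
    """
    mapping = load_map(state_path)
    done_since_save = 0
    try:
        for old in commit_ids:
            if old in mapping:
                continue  # already rewritten by an earlier run
            mapping[old] = rewrite_one(old, mapping)
            done_since_save += 1
            if done_since_save >= save_every:
                save_map(state_path, mapping)
                done_since_save = 0
    finally:
        # Persist progress even if rewrite_one raised part-way through.
        save_map(state_path, mapping)
    return mapping
```

If a run dies part-way, calling `rewrite_resumable` again with the same `state_path` continues from the first commit that is not yet in the map.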

I have already had to modify git-filter-branch.sh myself (to support the 
immense number of refs we are rewriting), so I'm comfortable with that.

Any help you can offer would be appreciated.
 
Thanks in advance,
John Gietzen
