Re: [OT] Re: Git repository with full GCC history

2007-06-08 Thread Harvey Harrison

On 6/7/07, Bernardo Innocenti [EMAIL PROTECTED] wrote:
> Harvey Harrison wrote:
>> The final results of a repository holding a clone of trunk:
>
> With or without branches? (shouldn't matter that much, just
> for the record)

Just trunk.

>> Size of git packs:
>> pack + index - 286344kB
>> git svn metadata - nearly 13MB, allows incremental updates as more
>> commits made in svn.
>
> That's great, but... could you tell us how you did it?


git repack -a -d -f --window=100 --depth=100

(I know the depth is a bit much; depth=50 produces a pack of about 420MB.)
Be prepared to wait a long time for it to finish.  Hopefully I'll have it
up somewhere soon for others.
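
For anyone wanting to compare window/depth settings on their own clone, here is a minimal sketch. It assumes a reasonably recent Git (it relies on `git -C`), and the function name `repack_and_report` is made up for illustration. It runs the repack command quoted above and then prints the `size-pack` figure (in KiB) that `git count-objects -v` reports.

```shell
# Repack the repository at $1 and print the resulting pack size in KiB.
# Window ($2) and depth ($3) default to the values used in this message.
repack_and_report() {
    repo=$1
    window=${2:-100}
    depth=${3:-100}
    # -a: pack everything into one pack, -d: drop redundant packs,
    # -f: recompute deltas instead of reusing existing ones.
    git -C "$repo" repack -a -d -f -q --window="$window" --depth="$depth" || return 1
    # size-pack is the disk space consumed by packs, in KiB.
    git -C "$repo" count-objects -v | awk '/^size-pack:/ { print $2 }'
}
```

Calling `repack_and_report /path/to/clone 100 50` and then `repack_and_report /path/to/clone 100 100` would show how much the extra depth saves on a given repository. Expect each run on a GCC-sized history to take a long time.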

Cheers,

Harvey


Re: [OT] Re: Git repository with full GCC history

2007-06-07 Thread Harvey Harrison

On 6/4/07, David Woodhouse [EMAIL PROTECTED] wrote:

On Sun, 2007-06-03 at 19:57 -0700, Harvey Harrison wrote:
 If I can reproduce it I'll see if I can find some webspace.



I figured out my operator error with git gc.

The final results of a repository holding a clone of trunk:

Size of git packs:
pack + index - 286344kB
git svn metadata - nearly 13MB, allows incremental updates as more
commits are made in svn.

I'm in touch with David getting some space to upload it.

Cheers,

Harvey Harrison


Re: [OT] Re: Git repository with full GCC history

2007-06-07 Thread Bernardo Innocenti

Harvey Harrison wrote:


The final results of a repository holding a clone of trunk:


With or without branches? (shouldn't matter that much, just
for the record)


Size of git packs:
pack + index - 286344kB
git svn metadata - nearly 13MB, allows incremental updates as more
commits made in svn.


That's great, but... could you tell us how you did it?

--
  // Bernardo Innocenti
\X/  http://www.codewiz.org/


Re: [OT] Re: Git repository with full GCC history

2007-06-04 Thread David Woodhouse
On Sun, 2007-06-03 at 19:57 -0700, Harvey Harrison wrote:
 If I can reproduce it I'll see if I can find some webspace.

If you mail me a SSH public key you can also put it on
git.infradead.org.

-- 
dwmw2



Re: [OT] Re: Git repository with full GCC history

2007-06-04 Thread Bernardo Innocenti

David Woodhouse wrote:

On Sun, 2007-06-03 at 19:57 -0700, Harvey Harrison wrote:

If I can reproduce it I'll see if I can find some webspace.


If you mail me a SSH public key you can also put it on
git.infradead.org.


Come visit git.infradead.org and its GCC development fork.

--
  // Bernardo Innocenti
\X/  http://www.codewiz.org/


Re: [OT] Re: Git repository with full GCC history

2007-06-04 Thread Jan-Benedict Glaw
On Mon, 2007-06-04 05:17:17 -0400, Bernardo Innocenti [EMAIL PROTECTED] wrote:
 David Woodhouse wrote:
  On Sun, 2007-06-03 at 19:57 -0700, Harvey Harrison wrote:
   If I can reproduce it I'll see if I can find some webspace.
 
  If you mail me a SSH public key you can also put it on
  git.infradead.org.
 
 Come visit git.infradead.org and its GCC development fork.

*cough* No reason to fork.  At least I'm just too used to GIT these
days and like it quite a lot, that's why I work on getting the
toolchain repos converted (and kept up-to-date!) somewhere as GIT
repos.

This just eases the pain of keeping patches that aren't yet ready for
merging up-to-date in some branches.

MfG, JBG

-- 
  Jan-Benedict Glaw  [EMAIL PROTECTED]  +49-172-7608481
 Signature of:  http://perl.plover.com/Questions.html
 the second  :




Re: [OT] Re: Git repository with full GCC history

2007-06-04 Thread Bernardo Innocenti

Jan-Benedict Glaw wrote:


Come visit git.infradead.org and its GCC development fork.


*cough* No reason to fork.  At least I'm just too used to GIT these
days and like it quite a lot, that's why I work on getting the
toolchain repos converted (and kept up-to-date!) somewhere as GIT
repos.


Err... Of course I was just joking.



This just eases the pain keeping those patches up-to-date in some
branches, that aren't yet ready for merging.


Indeed, but if we moved the git repository to gcc.gnu.org it
could also be used for pushing patches to the centralized
GCC repository.

For everyday development, I'd much prefer using Git to
Subversion.

--
  // Bernardo Innocenti
\X/  http://www.codewiz.org/


Re: [OT] Re: Git repository with full GCC history

2007-06-03 Thread Bernardo Innocenti

Harvey Harrison wrote:



I get about 1.4 GB for the pack with the default
depth and window parameters.


I forgot to mention that I obtained an ~800MB repository
with git 1.5.0.x after increasing the window size to 20.



The defaults changed significantly somewhere near version 1.5.1 I
believe with the delta caching mechanism making it much less expensive
to use a deeper delta depth.


Did you enable UseDeltaBaseOffset perhaps?  Or any of the git-gc
configuration parameters?

I suspect I may have huge loose objects created by
git-svn because of how it does branching and rebasing.



I don't know what Harvey is up to, he claimed his tree
was between 400 and 500 MB.


Sorry, my mistake, the repo I was looking at only contained a clone of trunk.


Mine too. I have not cloned the branches.



I'm starting to pull the rest of the branches now, but I am also a
little surprised at the difference.


Please, make your tree available for inspection.


--
  // Bernardo Innocenti
\X/  http://www.codewiz.org/


Re: [OT] Re: Git repository with full GCC history

2007-06-03 Thread Harvey Harrison

On 6/3/07, Bernardo Innocenti [EMAIL PROTECTED] wrote:

Harvey Harrison wrote:


 I get about 1.4 GB for the pack with the default
 depth and window parameters.

I forgot to mention that I obtained an ~800MB repository
with git 1.5.0.x after increasing the window size to 20.




Now I don't know what is going on, I tried to reproduce my packfile
and get a 1.4GB packfile.  But I'm sure I redid the same operations?
I'd just disregard my size numbers until I can figure out what I've
done wrong.


 The defaults changed significantly somewhere near version 1.5.1 I
 believe with the delta caching mechanism making it much less expensive
 to use a deeper delta depth.

Did you enable UseDeltaBaseOffset perhaps?  Or any of the git-gc
configuration parameters?


I did recently upgrade to the tip of 'next' branch of git, will go
back to my previous version.

I have set

[core]
legacyheaders = false
[repack]
usedeltabaseoffset = true

in my .gitconfig.



Please, make your tree available for inspection.



If I can reproduce it I'll see if I can find some webspace.

Harvey


Re: Git repository with full GCC history

2007-06-01 Thread Bernardo Innocenti

Harvey Harrison wrote:


Was this repo made with svnimport or git-svn? svnimport is faster but
chooses bad delta bases as a result.  git repack -a -d -f would allow
git to choose better deltas rather than reusing the deltas that
svnimport created.


I used:

git-svn fetch
git-fetch . remotes/git-svn


Yes, I did a git-repack -a -d -f too.  And I even did
one with --window=20, but nothing changed.



(I think, I'm not a git expert).


Neither am I, but after all, who is?  (Linus, you don't count)



What version of git did you use? 1.5.0.6 here.


1.5.2


I shall try it...  That's probably it.

--
  // Bernardo Innocenti
\X/  http://www.codewiz.org/


[OT] Re: Git repository with full GCC history

2007-06-01 Thread Gabriel Paubert
On Fri, Jun 01, 2007 at 02:52:43AM -0400, Bernardo Innocenti wrote:
 Harvey Harrison wrote:
 
 Was this repo made with svnimport or git-svn? svnimport is faster but
 chooses bad delta bases as a result.  git repack -a -d -f would allow
 git to choose better deltas rather than reusing the deltas that
 svnimport created.
 
 I used:
 
 git-svn fetch
 git-fetch . remotes/git-svn
 
 
 Yes, I did a git-repack -a -d -f too.  And I even did
 one with --window=20, but nothing changed.
 
 
 (I think, I'm not a git expert).
 
 Neither am I, but after all, who is?  (Linus, you don't count)
 
 
 What version of git did you use? 1.5.0.6 here.
 
 1.5.2
 
 I shall try it...  That's probably it.
 

This may be the pack depth which was increased to 50 according
to 1.5.2 release notes:

  - The default pack depth has been increased to 50, as the
 recent addition of delta_base_cache makes deeper delta chains
 much less expensive to access.  Depending on the project, it was
 reported that this reduces the resulting pack file by 10%
 or so.

I'm almost certain that the savings will be much larger than 10% for 
some files, for example the ChangeLogs.

BTW, there is a strange line in the current ChangeLog, between May 30th
and May 31st entries:  .r125234. Is it just me, a subversion
glitch or something else?

Gabriel


Re: Git repository with full GCC history

2007-06-01 Thread Bernardo Innocenti

Jan-Benedict Glaw wrote:

On Thu, 2007-05-31 21:34:33 -0400, Bernardo Innocenti [EMAIL PROTECTED] wrote:

I've set up a Git mirror of the entire GCC history on
server space kindly provided by David Woodhouse.

You can clone it with:

git-clone git://git.infradead.org/gcc.git


How often will it be synced with upstream SVN?


I've set up a cron job that runs every hour, but I can increase the
frequency if needed.  git-svn is not a cpu/bandwidth hog.
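
As a rough illustration of what such an hourly job might look like (this is a sketch, not the script actually running on git.infradead.org): the path in `MIRROR_DIR`, the function name `sync_mirror`, and the crontab entry are all assumptions made up for the example.

```shell
# Hypothetical sync script for an hourly git-svn mirror update.
# MIRROR_DIR is an assumed location, not taken from the thread.
MIRROR_DIR=${MIRROR_DIR:-/path/to/gcc-svn-mirror}

sync_mirror() {
    # Fetch any new SVN revisions into the git-svn tracking refs.
    git -C "$1" svn fetch
}

# Only attempt a sync when the mirror directory actually exists.
if [ -d "$MIRROR_DIR" ]; then
    sync_mirror "$MIRROR_DIR"
fi

# Example crontab entry (minute 0 of every hour), assuming the script is
# installed under the hypothetical name below:
#   0 * * * * /usr/local/bin/sync-gcc-mirror.sh >/dev/null 2>&1
```

Since `git svn fetch` only transfers revisions that are not yet present locally, running it more often than hourly should mostly cost a round trip to the SVN server.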


While you're at it,
would David mind to also place a binutils, glibc and glibc-ports GIT
repo next to it?  That way, there would be a nice single point of GIT
repos for the whole toolchain.


For this, I'd prefer waiting for David's answer.  David,
my guess is that all of these combined should be smaller
than GCC alone.  There should be fewer users, too.



Thanks for the work, I'll just clone it right now :)


Be our guest, and let me know if you find a way to
repack the repo to a smaller size.

Not that I care that much... 800MB is small enough for
today's bandwidth.

--
  // Bernardo Innocenti
\X/  http://www.codewiz.org/


Re: Git repository with full GCC history

2007-06-01 Thread Jan-Benedict Glaw
On Fri, 2007-06-01 04:47:11 -0400, Bernardo Innocenti [EMAIL PROTECTED] wrote:
 Jan-Benedict Glaw wrote:
  On Thu, 2007-05-31 21:34:33 -0400, Bernardo Innocenti [EMAIL PROTECTED] 
  wrote:
   I've set up a Git mirror of the entire GCC history on
   server space kindly provided by David Woodhouse.
  
   You can clone it with:
  
   git-clone git://git.infradead.org/gcc.git
 
  How often will it be synced with upstream SVN?
 
 I've setup a cron job every hour, but I can increase the
 frequency if needed.  git-svn is not a cpu/bandwidth hog.

Can a single SVN commit (or a really small number of them) be pulled
efficiently? A year ago, I had something similar in place and called
the update script from procmail (so that it would pull the last commit
right after receiving the commit email).

But hourly mirroring sounds like more than enough.

  While you're at it,
  would David mind to also place a binutils, glibc and glibc-ports GIT
  repo next to it?  That way, there would be a nice single point of GIT
  repos for the whole toolchain.
 
 For this, I'd prefer waiting for David's answer.  David,
 my guess is that all of these combined should be smaller
 than GCC alone.  There should be fewer users, too.

I guess that'll be in the 150..200 MB range.

  Thanks for the work, I'll just clone it right now :)
 
 Be our guest, and let me know if you find a way to
 repack the repo to a smaller size.

You already did a full repack, as I gather from the other emails. I don't
think it'll pack any smaller.  You could increase the window sizes, but
that simply won't pack it to 400MB :)  It's surely not worth burning
lots of CPU cycles for a megabyte or two...

 Not that I care that much... 800MB is small enough for
 today's bandwidth.

Maybe we'd find one or two of these root servers offered by some
ISPs with unlimited traffic and start to spread the load, or ask the
kernel.org guys if they'd also host a copy of the repo.

MfG, JBG

-- 
  Jan-Benedict Glaw  [EMAIL PROTECTED]  +49-172-7608481
Signature of:  Everything will be fine! ...and today it's already getting a little better.
the second  :




Re: Git repository with full GCC history

2007-06-01 Thread David Woodhouse
On Fri, 2007-06-01 at 10:39 +0200, Jan-Benedict Glaw wrote:
 How often will it be synced with upstream SVN?  While you're at it,
 would David mind to also place a binutils, glibc and glibc-ports GIT
 repo next to it?  That way, there would be a nice single point of GIT
 repos for the whole toolchain. 

Sounds like a fine plan. Bernie, if you want to create these in your
home directory I'll move them to /srv/git next to gcc.git.

-- 
dwmw2



Re: Git repository with full GCC history

2007-06-01 Thread Gabriel Paubert
On Fri, Jun 01, 2007 at 04:47:11AM -0400, Bernardo Innocenti wrote:
 Jan-Benedict Glaw wrote:
 On Thu, 2007-05-31 21:34:33 -0400, Bernardo Innocenti [EMAIL PROTECTED] 
 wrote:
 I've set up a Git mirror of the entire GCC history on
 server space kindly provided by David Woodhouse.
 
 You can clone it with:
 
 git-clone git://git.infradead.org/gcc.git
 
 How often will it be synced with upstream SVN?
 
 I've setup a cron job every hour, but I can increase the
 frequency if needed.  git-svn is not a cpu/bandwidth hog.
 
 While you're at it,
 would David mind to also place a binutils, glibc and glibc-ports GIT
 repo next to it?  That way, there would be a nice single point of GIT
 repos for the whole toolchain.
 
 For this, I'd prefer waiting for David's answer.  David,
 my guess is that all of these combined should be smaller
 than GCC alone.  There should be fewer users, too.
 
 
 Thanks for the work, I'll just clone it right now :)
 
 Be our guest, and let me know if you find a way to
 repack the repo to a smaller size.

I just upgraded my git to 1.5.2 and repacked the git repository
with git-gc --aggressive. It is quite impressive: the size of 
the pack file was almost cut in half, from ~23MB to ~12MB!

Gabriel


Re: Git repository with full GCC history

2007-06-01 Thread Jan-Benedict Glaw
On Thu, 2007-05-31 21:34:33 -0400, Bernardo Innocenti [EMAIL PROTECTED] wrote:
 I've set up a Git mirror of the entire GCC history on
 server space kindly provided by David Woodhouse.
 
 You can clone it with:
 
 git-clone git://git.infradead.org/gcc.git

How often will it be synced with upstream SVN?  While you're at it,
would David mind to also place a binutils, glibc and glibc-ports GIT
repo next to it?  That way, there would be a nice single point of GIT
repos for the whole toolchain.

Thanks for the work, I'll just clone it right now :)

MfG, JBG

-- 
  Jan-Benedict Glaw  [EMAIL PROTECTED]  +49-172-7608481
Signature of:http://www.chiark.greenend.org.uk/~sgtatham/bugs.html
the second  :




Re: Git repository with full GCC history

2007-06-01 Thread Jan-Benedict Glaw
On Fri, 2007-06-01 12:12:59 +0200, Gabriel Paubert [EMAIL PROTECTED] wrote:
 On Fri, Jun 01, 2007 at 04:47:11AM -0400, Bernardo Innocenti wrote:
  Be our guest, and let me know if you find a way to
  repack the repo to a smaller size.
 
 I just upgraded my git to 1.5.2 and repacked the git repository
 with git-gc --aggressive. It is quite impressive: the size of 
 the pack file was almost cut in half, from ~23MB to ~12MB!

This is way more than I expected. I'm officially impressed right now.

MfG, JBG

-- 
  Jan-Benedict Glaw  [EMAIL PROTECTED]  +49-172-7608481
Signature of:  Progress means taking a step in such a way
the second  :   that you can still take the next one.




Re: [OT] Re: Git repository with full GCC history

2007-06-01 Thread Bernardo Innocenti

Gabriel Paubert wrote:


This may be the pack depth which was increased to 50 according
to 1.5.2 release notes:


I've repacked with 1.5.2, and it doesn't seem to decrease
the repo size considerably.

I'm now repacking with git-repack -a -d -f --window=20 --depth=100,
but it takes a lot of time on this old mule.

--
  // Bernardo Innocenti
\X/  http://www.codewiz.org/


Re: Git repository with full GCC history

2007-06-01 Thread Bernardo Innocenti

Gabriel Paubert wrote:


I just upgraded my git to 1.5.2 and repacked the git repository
with git-gc --aggressive. It is quite impressive: the size of 
the pack file was almost cut in half, from ~23MB to ~12MB!


The --aggressive option is undocumented in 1.5.2.  What
is it supposed to do?

--
  // Bernardo Innocenti
\X/  http://www.codewiz.org/


[OT] Re: Git repository with full GCC history

2007-06-01 Thread Gabriel Paubert
On Fri, Jun 01, 2007 at 11:00:29AM -0400, Bernardo Innocenti wrote:
 Gabriel Paubert wrote:
 
 I just upgraded my git to 1.5.2 and repacked the git repository
 with git-gc --aggressive. It is quite impressive: the size of 
 the pack file was almost cut in half, from ~23MB to ~12MB!
 
 The --aggressive option is undocumented in 1.5.2.  What
 is it supposed to do?
 

It is documented in my freshly compiled and installed git:

   --aggressive
  Usually git-gc runs very quickly while providing good disk space
  utilization and performance. This option will cause git-gc to more
  aggressively optimize the repository at the expense of taking much
  more time. The effects of this optimization are persistent, so this
  option only needs to be used sporadically; every few hundred changesets
  or so.

Regards,
Gabriel


Re: [OT] Re: Git repository with full GCC history

2007-06-01 Thread Bernardo Innocenti

Gabriel Paubert wrote:

On Fri, Jun 01, 2007 at 11:00:29AM -0400, Bernardo Innocenti wrote:

Gabriel Paubert wrote:


I just upgraded my git to 1.5.2 and repacked the git repository
with git-gc --aggressive. It is quite impressive: the size of 
the pack file was almost cut in half, from ~23MB to ~12MB!

The --aggressive option is undocumented in 1.5.2.  What
is it supposed to do?


It is documented in my freshly compiled and installed git:


In the source, I see it just passes -f to git-repack, which
I already did manually, with no improvement.

So there must be something strange in your repository if
it packs that much better than mine.  Could you please
publish it somewhere so I can make some tests?

--
  // Bernardo Innocenti
\X/  http://www.codewiz.org/


Re: Git repository with full GCC history

2007-05-31 Thread Harvey Harrison

Are you sure it packs to 877MB?  Oh, are you including a checked out
gcc source tree in that total?

In my gcc-svn clone of trunk:

~/dev/trunk$ du -s
873200

~/dev/trunk$ du .git -s
423064

This is a fully packed repo with default packing settings. (git gc)
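
To make the two `du -s` figures above easier to compare, here is a small arithmetic sketch. It assumes `du` reported 1 KiB blocks (the GNU default); the helper name `kib_to_mib` is made up for illustration.

```shell
# Sizes in KiB as reported by `du -s` in this message.
total_kib=873200   # whole checkout, including .git
git_kib=423064     # .git directory only

kib_to_mib() {
    # Integer division is enough for a rough comparison.
    echo $(( $1 / 1024 ))
}

# Whatever is not in .git is the checked-out working tree.
worktree_kib=$(( total_kib - git_kib ))

echo ".git:         $(kib_to_mib "$git_kib") MiB"
echo "working tree: $(kib_to_mib "$worktree_kib") MiB"
```

Under that assumption the repository itself is about 413 MiB and the checked-out sources about 439 MiB, so the 873200 total is roughly half history and half working tree.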

Cheers,

Harvey Harrison


Re: Git repository with full GCC history

2007-05-31 Thread Bernardo Innocenti

Harvey Harrison wrote:


Are you sure it packs to 877MB?  Oh, are you including a checked out
gcc source tree in that total?


No, I only computed the .git size.



~/dev/trunk$ du .git -s
423064


I have a single, huge pack of 863M:

-r--r--r-- 1 bernie bernie 863M May 31 21:42 pack-88472c7e9d0d8b80da5f7a815685d4347bee9546.pack



This is a fully packed repo with default packing settings. (git gc)


I did it with git-repack -a -d... should have the same result.

How many commits do you have? 81193 here.

What version of git did you use? 1.5.0.6 here.

--
  // Bernardo Innocenti
\X/  http://www.codewiz.org/


Re: Git repository with full GCC history

2007-05-31 Thread Harvey Harrison

Whoops, trimmed CC:

On 5/31/07, Bernardo Innocenti [EMAIL PROTECTED] wrote:

Harvey Harrison wrote:

 Are you sure it packs to 877MB?  Oh, are you including a checked out
 gcc source tree in that total?

No, I only computed the .git size.



OK, just seemed like my size with working tree was close to your reported size.


 This is a fully packed repo with default packing settings. (git gc)

I did it with git-repack -a -d... should have the same result.


Was this repo made with svnimport or git-svn? svnimport is faster but
chooses bad delta bases as a result.  git repack -a -d -f would allow
git to choose better deltas rather than reusing the deltas that
svnimport created.  (I think, I'm not a git expert).



How many commits do you have? 81193 here.



git rev-list HEAD | wc -l
80419

Hmmm, mine is only a clone of trunk, but I am surprised by the blowup
in size.  I'll go pick up the rest of the svn commits and see what
that does to my pack.
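
One way to check whether two clones cover the same history is to count commits both from a single ref and from all refs. The sketch below assumes a Git new enough to support `git -C` and `rev-list --count` (both newer than the versions discussed in this thread; piping `git rev-list` into `wc -l` is the older equivalent). The function names are made up for illustration.

```shell
# Count commits reachable from one ref (defaults to HEAD).
count_commits() {
    repo=$1
    ref=${2:-HEAD}
    git -C "$repo" rev-list --count "$ref"
}

# Count commits reachable from any ref, i.e. including all branches.
count_all_commits() {
    git -C "$1" rev-list --count --all
}
```

If `count_all_commits` is noticeably larger than `count_commits` for the same clone, the repository holds more than just trunk, which would also affect the pack size.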


What version of git did you use? 1.5.0.6 here.



1.5.2

Harvey