Re: [OT] Re: Git repository with full GCC history
On 6/7/07, Bernardo Innocenti [EMAIL PROTECTED] wrote:
> Harvey Harrison wrote:
>> The final results of a repository holding a clone of trunk:
> With or without branches? (shouldn't matter that much, just for the record)

Just trunk.

>> Size of git packs: pack + index - 286344kB
>> git svn metadata - nearly 13MB, allows incremental updates as more commits made in svn.
> That's great, but... could you tell us how you did it?

    git repack -a -d -f --window=100 --depth=100

(I know the depth is a bit much, depth=50 produces about 420MB pack). Be prepared to wait a long time to finish. But hopefully I'll have it up somewhere soon for others.

Cheers,
Harvey
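Editor's sketch of the repack step described above, for readers who want to repeat the measurement. It assumes a reasonably recent Git on the PATH; `repack_and_report` is a hypothetical helper name, and the `du`-based figure only approximates the "pack + index" size quoted in the message.

```shell
#!/bin/sh
# Sketch only: repack_and_report is a hypothetical helper, not part of Git.
# Usage: repack_and_report <repo> [window] [depth]
repack_and_report() (
    cd "$1" || exit 1
    window=${2:-100}   # defaults are the values quoted in the message
    depth=${3:-100}
    # -a: put all objects into a single pack
    # -d: delete packs made redundant by the new one
    # -f: recompute deltas instead of reusing existing ones
    git repack -a -d -f -q --window="$window" --depth="$depth" || exit 1
    # Print the combined size of the pack directory (pack + index) in kB.
    du -sk "$(git rev-parse --git-dir)/objects/pack" | cut -f1
)
```

Calling `repack_and_report ~/dev/trunk 100 50` would try the shallower depth mentioned in the message. As the message warns, large window and depth values can take a very long time on a repository of this size.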
Re: [OT] Re: Git repository with full GCC history
On 6/4/07, David Woodhouse [EMAIL PROTECTED] wrote: On Sun, 2007-06-03 at 19:57 -0700, Harvey Harrison wrote: If I can reproduce it I'll see if I can find some webspace. I figured out my operator error with git gc. The final results of a repository holding a clone of trunk: Size of git packs: pack + index - 286344kB git svn metadata - nearly 13MB, allows incremental updates as more commits made in svn. I'm in touch with David getting some space to upload it. Cheers, Harvey Harrison
Re: [OT] Re: Git repository with full GCC history
Harvey Harrison wrote: The final results of a repository holding a clone of trunk: With or without branches? (shouldn't matter that much, just for the record) Size of git packs: pack + index - 286344kB git svn metadata - nearly 13MB, allows incremental updates as more commits made in svn. That's great, but... could you tell us how you did it? -- // Bernardo Innocenti \X/ http://www.codewiz.org/
Re: [OT] Re: Git repository with full GCC history
On Sun, 2007-06-03 at 19:57 -0700, Harvey Harrison wrote: If I can reproduce it I'll see if I can find some webspace. If you mail me a SSH public key you can also put it on git.infradead.org. -- dwmw2
Re: [OT] Re: Git repository with full GCC history
David Woodhouse wrote: On Sun, 2007-06-03 at 19:57 -0700, Harvey Harrison wrote: If I can reproduce it I'll see if I can find some webspace. If you mail me a SSH public key you can also put it on git.infradead.org. Come visit git.infradead.org and its GCC development fork. -- // Bernardo Innocenti \X/ http://www.codewiz.org/
Re: [OT] Re: Git repository with full GCC history
On Mon, 2007-06-04 05:17:17 -0400, Bernardo Innocenti [EMAIL PROTECTED] wrote:
> David Woodhouse wrote:
>> On Sun, 2007-06-03 at 19:57 -0700, Harvey Harrison wrote:
>>> If I can reproduce it I'll see if I can find some webspace.
>> If you mail me a SSH public key you can also put it on git.infradead.org.
> Come visit git.infradead.org and its GCC development fork.

*cough* No reason to fork. At least I'm just too used to GIT these days and like it quite a lot, that's why I work on getting the toolchain repos converted (and kept up-to-date!) somewhere as GIT repos. This just eases the pain of keeping those patches up-to-date in some branches that aren't yet ready for merging.

Kind regards, JBG
-- 
Jan-Benedict Glaw [EMAIL PROTECTED] +49-172-7608481
Signature of: http://perl.plover.com/Questions.html
the second :
Re: [OT] Re: Git repository with full GCC history
Jan-Benedict Glaw wrote: Come visit git.infradead.org and its GCC development fork. *cough* No reason to fork. At least I'm just too used to GIT these days and like it quite a lot, that's why I work on getting the toolchain repos converted (and kept up-to-date!) somewhere as GIT repos. Err... Of course I was just joking. This just eases the pain keeping those patches up-to-date in some branches, that aren't yet ready for merging. Indeed, but if we moved the git repository to gcc.gnu.org it could also be used for pushing patches to the centralized GCC repository. For everyday development, I'd very much prefer using Git than Subversion. -- // Bernardo Innocenti \X/ http://www.codewiz.org/
Re: [OT] Re: Git repository with full GCC history
Harvey Harrison wrote: I get about 1.4 GB for the pack with the default depth and window parameters. I forgot to mention that I obtained an ~800MB repository with git 1.5.0.x after increasing the window size to 20. The defaults changed significantly somewhere near version 1.5.1 I believe with the delta caching mechanism making it much less expensive to use a deeper delta depth. Did you enable UseDeltaBaseOffset perhaps? Or any of the git-gc configuration parameters? I suspect I may have huge loose objects created by git-svn because of how it does branching and rebasing. I don't know what Harvey is up to, he claimed his tree was between 400 and 500 MB. Sorry, my mistake, the repo I was looking at only contained a clone of trunk. Mine too. I have not cloned the branches. I'm starting to pull the rest of the branches now, but I am also a little surprised at the difference. Please, make your tree available for inspection. -- // Bernardo Innocenti \X/ http://www.codewiz.org/
Re: [OT] Re: Git repository with full GCC history
On 6/3/07, Bernardo Innocenti [EMAIL PROTECTED] wrote: Harvey Harrison wrote: I get about 1.4 GB for the pack with the default depth and window parameters. I forgot to mention that I obtained an ~800MB repository with git 1.5.0.x after increasing the window size to 20. Now I don't know what is going on, I tried to reproduce my packfile and get a 1.4GB packfile. But I'm sure I redid the same operations? I'd just disregard my size numbers until I can figure out what I've done wrong. The defaults changed significantly somewhere near version 1.5.1 I believe with the delta caching mechanism making it much less expensive to use a deeper delta depth. Did you enable UseDeltaBaseOffset perhaps? Or any of the git-gc configuration parameters? I did recently upgrade to the tip of 'next' branch of git, will go back to my previous version. I have set [core] legacyheaders = false [repack] usedeltabaseoffset = true in my .gitconfig. Please, make your tree available for inspection. If I can reproduce it I'll see if I can find some webspace. Harvey
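Editor's sketch: the two `.gitconfig` settings quoted in the message above can also be written with `git config`. The key names are taken from the message; whether a given Git release still honours them depends on the version, so treat this as an illustration. `write_pack_settings` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: write_pack_settings is a hypothetical helper name.
# Usage: write_pack_settings [config-file]   (defaults to ~/.gitconfig)
write_pack_settings() {
    cfg=${1:-"$HOME/.gitconfig"}
    # [core] legacyheaders = false
    git config --file "$cfg" core.legacyheaders false
    # [repack] usedeltabaseoffset = true
    git config --file "$cfg" repack.usedeltabaseoffset true
}
```

Passing an explicit file name makes it possible to try the settings on a scratch file before touching the real `~/.gitconfig`.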
Re: Git repository with full GCC history
Harvey Harrison wrote:
> Was this repo made with svnimport or git-svn? svnimport is faster but chooses bad delta bases as a result. git repack -a -d -f would allow git to choose better deltas rather than reusing the deltas that svnimport created.

I used:

    git-svn fetch
    git-fetch . remotes/git-svn

Yes, I did a git-repack -a -d -f too. And I even did one with --window=20, but nothing changed.

> (I think, I'm not a git expert).

Neither am I, but after all, who is? (Linus, you don't count)

>> What version of git did you use? 1.5.0.6 here.
> 1.5.2

I shall try it... That's probably it.

-- 
// Bernardo Innocenti
\X/ http://www.codewiz.org/
[OT] Re: Git repository with full GCC history
On Fri, Jun 01, 2007 at 02:52:43AM -0400, Bernardo Innocenti wrote:
> Harvey Harrison wrote:
>> Was this repo made with svnimport or git-svn? svnimport is faster but chooses bad delta bases as a result. git repack -a -d -f would allow git to choose better deltas rather than reusing the deltas that svnimport created.
> I used:
>     git-svn fetch
>     git-fetch . remotes/git-svn
> Yes, I did a git-repack -a -d -f too. And I even did one with --window=20, but nothing changed.
>> (I think, I'm not a git expert).
> Neither am I, but after all, who is? (Linus, you don't count)
>>> What version of git did you use? 1.5.0.6 here.
>> 1.5.2
> I shall try it... That's probably it.

This may be the pack depth, which was increased to 50 according to the 1.5.2 release notes:

  - The default pack depth has been increased to 50, as the recent addition of delta_base_cache makes deeper delta chains much less expensive to access. Depending on the project, it was reported that this reduces the resulting pack file by 10% or so.

I'm almost certain that the savings will be much larger than 10% for some files, for example the ChangeLogs.

BTW, there is a strange line in the current ChangeLog, between the May 30th and May 31st entries: .r125234. Is it just me, a subversion glitch or something else?

Gabriel
Re: Git repository with full GCC history
Jan-Benedict Glaw wrote:
> On Thu, 2007-05-31 21:34:33 -0400, Bernardo Innocenti [EMAIL PROTECTED] wrote:
>> I've set up a Git mirror of the entire GCC history on server space kindly provided by David Woodhouse. You can clone it with:
>>     git-clone git://git.infradead.org/gcc.git
> How often will it be synced with upstream SVN?

I've set up a cron job every hour, but I can increase the frequency if needed. git-svn is not a cpu/bandwidth hog.

> While you're at it, would David mind to also place a binutils, glibc and glibc-ports GIT repo next to it? That way, there would be a nice single point of GIT repos for the whole toolchain.

For this, I'd prefer waiting for David's answer. David, my guess is that all of these combined should be smaller than GCC alone. There should be fewer users, too.

> Thanks for the work, I'll just clone it right now :)

Be our guest, and let me know if you find a way to repack the repo to a smaller size. Not that I care that much... 800MB is small enough for today's bandwidth.

-- 
// Bernardo Innocenti
\X/ http://www.codewiz.org/
Re: Git repository with full GCC history
On Fri, 2007-06-01 04:47:11 -0400, Bernardo Innocenti [EMAIL PROTECTED] wrote:
> Jan-Benedict Glaw wrote:
>> On Thu, 2007-05-31 21:34:33 -0400, Bernardo Innocenti [EMAIL PROTECTED] wrote:
>>> I've set up a Git mirror of the entire GCC history on server space kindly provided by David Woodhouse. You can clone it with:
>>>     git-clone git://git.infradead.org/gcc.git
>> How often will it be synced with upstream SVN?
> I've set up a cron job every hour, but I can increase the frequency if needed. git-svn is not a cpu/bandwidth hog.

Can a single SVN commit (or a really small number of them) be pulled efficiently? A year ago, I had something similar in place and called the update script from procmail (so that it would pull the last commit right after receiving the commit email). But hourly mirroring sounds like more than enough.

>> While you're at it, would David mind to also place a binutils, glibc and glibc-ports GIT repo next to it? That way, there would be a nice single point of GIT repos for the whole toolchain.
> For this, I'd prefer waiting for David's answer. David, my guess is that all of these combined should be smaller than GCC alone. There should be fewer users, too.

I guess that'll be in the 150..200 MB range.

>> Thanks for the work, I'll just clone it right now :)
> Be our guest, and let me know if you find a way to repack the repo to a smaller size.

You already did a full repack, as I gather from the other emails. I don't think it'll pack any smaller. You could increase the window sizes, but that simply won't pack it to 400MB :) It's surely not worth burning lots of CPU cycles for one megabyte, or two...

> Not that I care that much... 800MB is small enough for today's bandwidth.

Maybe we'd find one or two of these root servers offered by some ISPs with unlimited traffic and start to spread the load, or ask the kernel.org guys if they'd also host a copy of the repo.

Kind regards, JBG
-- 
Jan-Benedict Glaw [EMAIL PROTECTED] +49-172-7608481
Signature of: Everything will be fine! ...and today it is already getting a little better.
the second :
Re: Git repository with full GCC history
On Fri, 2007-06-01 at 10:39 +0200, Jan-Benedict Glaw wrote: How often will it be synced with upstream SVN? While you're at it, would David mind to also place a binutils, glibc and glibc-ports GIT repo next to it? That way, there would be a nice single point of GIT repos for the whole toolchain. Sounds like a fine plan. Bernie, if you want to create these in your home directory I'll move them to /srv/git next to gcc.git. -- dwmw2
Re: Git repository with full GCC history
On Fri, Jun 01, 2007 at 04:47:11AM -0400, Bernardo Innocenti wrote: Jan-Benedict Glaw wrote: On Thu, 2007-05-31 21:34:33 -0400, Bernardo Innocenti [EMAIL PROTECTED] wrote: I've set up a Git mirror of the entire GCC history on server space kindly provided by David Woodhouse. You can clone it with: git-clone git://git.infradead.org/gcc.git How often will it be synced with upstream SVN? I've setup a cron job every hour, but I can increase the frequency if needed. git-svn is not a cpu/bandwidth hog. While you're at it, would David mind to also place a binutils, glibc and glibc-ports GIT repo next to it? That way, there would be a nice single point of GIT repos for the whole toolchain. For this, I'd prefer waiting for David's answer. David, my guess is that all of these combined should be smaller than GCC alone. There should be fewer users, too. Thanks for the work, I'll just clone it right now :) Be our guest, and let me know if you find a way to repack the repo to a smaller size. I just upgraded my git to 1.5.2 and repacked the git repository with git-gc --aggressive. It is quite impressive: the size of the pack file was almost cut in half, from ~23MB to ~12MB! Gabriel
Re: Git repository with full GCC history
On Thu, 2007-05-31 21:34:33 -0400, Bernardo Innocenti [EMAIL PROTECTED] wrote:
> I've set up a Git mirror of the entire GCC history on server space kindly provided by David Woodhouse. You can clone it with:
>     git-clone git://git.infradead.org/gcc.git

How often will it be synced with upstream SVN?

While you're at it, would David mind to also place a binutils, glibc and glibc-ports GIT repo next to it? That way, there would be a nice single point of GIT repos for the whole toolchain.

Thanks for the work, I'll just clone it right now :)

Kind regards, JBG
-- 
Jan-Benedict Glaw [EMAIL PROTECTED] +49-172-7608481
Signature of: http://www.chiark.greenend.org.uk/~sgtatham/bugs.html
the second :
Re: Git repository with full GCC history
On Fri, 2007-06-01 12:12:59 +0200, Gabriel Paubert [EMAIL PROTECTED] wrote:
> On Fri, Jun 01, 2007 at 04:47:11AM -0400, Bernardo Innocenti wrote:
>> Be our guest, and let me know if you find a way to repack the repo to a smaller size.
> I just upgraded my git to 1.5.2 and repacked the git repository with git-gc --aggressive. It is quite impressive: the size of the pack file was almost cut in half, from ~23MB to ~12MB!

This is way more than I expected. I'm officially impressed right now.

Kind regards, JBG
-- 
Jan-Benedict Glaw [EMAIL PROTECTED] +49-172-7608481
Signature of: Progress means taking a step in such a way
the second : that you can still take the next one.
Re: [OT] Re: Git repository with full GCC history
Gabriel Paubert wrote: This may be the pack depth which was increased to 50 according to 1.5.2 release notes: I've repacked with 1.5.2, and it doesn't seem to decrease the repo size considerably. I'm now repacking with git-repack -a -d -f --window=20 --depth=100, but it takes a lot of time on this old mule. -- // Bernardo Innocenti \X/ http://www.codewiz.org/
Re: Git repository with full GCC history
Gabriel Paubert wrote: I just upgraded my git to 1.5.2 and repacked the git repository with git-gc --aggressive. It is quite impressive: the size of the pack file was almost cut in half, from ~23MB to ~12MB! The --aggressive option is undocumented in 1.5.2. What is it supposed to do? -- // Bernardo Innocenti \X/ http://www.codewiz.org/
[OT] Re: Git repository with full GCC history
On Fri, Jun 01, 2007 at 11:00:29AM -0400, Bernardo Innocenti wrote:
> Gabriel Paubert wrote:
>> I just upgraded my git to 1.5.2 and repacked the git repository with git-gc --aggressive. It is quite impressive: the size of the pack file was almost cut in half, from ~23MB to ~12MB!
> The --aggressive option is undocumented in 1.5.2. What is it supposed to do?

It is documented in my freshly compiled and installed git:

    --aggressive
        Usually git-gc runs very quickly while providing good disk space utilization and performance. This option will cause git-gc to more aggressively optimize the repository at the expense of taking much more time. The effects of this optimization are persistent, so this option only needs to be used sporadically; every few hundred changesets or so.

Regards,
Gabriel
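Editor's sketch of a before/after measurement like the one reported above. It assumes a recent Git whose `git count-objects -v` output includes a `size-pack:` line (in KiB); `pack_kb` and `gc_compare` are hypothetical helper names, not Git commands.

```shell
#!/bin/sh
# Sketch only: pack_kb and gc_compare are hypothetical helper names.

# Print the packed size in KiB as reported by `git count-objects -v`.
pack_kb() {
    git -C "$1" count-objects -v | awk '/^size-pack:/ { print $2 }'
}

# Run an aggressive gc and print the packed size before and after.
gc_compare() {
    before=$(pack_kb "$1")
    git -C "$1" gc --aggressive --quiet || return 1
    after=$(pack_kb "$1")
    echo "before=${before}KiB after=${after}KiB"
}
```

Running `gc_compare /path/to/repo` prints a single line with both figures. Loose objects are not counted in `size-pack`, so the "before" number can understate the space a freshly imported repository occupies.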
Re: [OT] Re: Git repository with full GCC history
Gabriel Paubert wrote: On Fri, Jun 01, 2007 at 11:00:29AM -0400, Bernardo Innocenti wrote: Gabriel Paubert wrote: I just upgraded my git to 1.5.2 and repacked the git repository with git-gc --aggressive. It is quite impressive: the size of the pack file was almost cut in half, from ~23MB to ~12MB! The --aggressive option is undocumented in 1.5.2. What is it supposed to do? It is documented in my freshly compiled and installed git: In the source, I see it just passes -f to git-repack, which I already did manually, with no improvement. So there must be something strange in your repository if it packs that much better than mine. Could you please publish it somewhere so I can make some tests? -- // Bernardo Innocenti \X/ http://www.codewiz.org/
Re: Git repository with full GCC history
Are you sure it packs to 877MB? Oh, are you including a checked out gcc source tree in that total?

In my gcc-svn clone of trunk:

    ~/dev/trunk$ du -s
    873200
    ~/dev/trunk$ du .git -s
    423064

This is a fully packed repo with default packing settings. (git gc)

Cheers,
Harvey Harrison
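Editor's sketch: the question above is whether a reported size includes the checked-out working tree. The two `du` calls can be combined so the split is explicit. `size_kb` and `report_sizes` are hypothetical helper names; sizes are in kB as printed by `du -sk`.

```shell
#!/bin/sh
# Sketch only: size_kb and report_sizes are hypothetical helper names.

# Print the disk usage of a directory in kB.
size_kb() {
    du -sk "$1" | cut -f1
}

# Print total, .git and working-tree sizes for a checkout.
report_sizes() {
    [ -d "$1/.git" ] || return 1
    total=$(size_kb "$1")
    gitdir=$(size_kb "$1/.git")
    echo "total=${total}kB git=${gitdir}kB worktree=$((total - gitdir))kB"
}
```

With the figures quoted in the message (873200 total, 423064 for `.git`), the working tree would account for 873200 - 423064 = 450136 kB.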
Re: Git repository with full GCC history
Harvey Harrison wrote:
> Are you sure it packs to 877MB? Oh, are you including a checked out gcc source tree in that total?

No, I only computed the .git size.

> ~/dev/trunk$ du .git -s
> 423064

I have a single, huge pack of 863M:

    -r--r--r-- 1 bernie bernie 863M May 31 21:42 pack-88472c7e9d0d8b80da5f7a815685d4347bee9546.pack

> This is a fully packed repo with default packing settings. (git gc)

I did it with git-repack -a -d... should have the same result.

How many commits do you have? 81193 here.

What version of git did you use? 1.5.0.6 here.

-- 
// Bernardo Innocenti
\X/ http://www.codewiz.org/
Re: Git repository with full GCC history
Whoops, trimmed CC:

On 5/31/07, Bernardo Innocenti [EMAIL PROTECTED] wrote:
> Harvey Harrison wrote:
>> Are you sure it packs to 877MB? Oh, are you including a checked out gcc source tree in that total?
> No, I only computed the .git size.

OK, just seemed like my size with working tree was close to your reported size.

>> This is a fully packed repo with default packing settings. (git gc)
> I did it with git-repack -a -d... should have the same result.

Was this repo made with svnimport or git-svn? svnimport is faster but chooses bad delta bases as a result. git repack -a -d -f would allow git to choose better deltas rather than reusing the deltas that svnimport created. (I think, I'm not a git expert).

> How many commits do you have? 81193 here.

    git rev-list HEAD | wc
    80419

Hmmm, mine is only a clone of trunk, but I am surprised by the blowup in size. I'll go pick up the rest of the svn commits and see what that does to my pack.

> What version of git did you use? 1.5.0.6 here.

1.5.2

Harvey
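Editor's sketch of the commit count used above, written so that two clones can be compared on the same terms. `count_commits` is a hypothetical helper name; passing `--all` instead of `HEAD` counts commits reachable from every ref, which matters when one clone holds only trunk and the other holds branches as well.

```shell
#!/bin/sh
# Sketch only: count_commits is a hypothetical helper name.
# Usage: count_commits <repo> [rev-list argument, default HEAD]
count_commits() {
    # wc -l pads its output on some systems, so strip the spaces.
    git -C "$1" rev-list "${2:-HEAD}" | wc -l | tr -d ' '
}
```

For example, `count_commits ~/dev/trunk` counts commits reachable from `HEAD`, while `count_commits ~/dev/trunk --all` includes every branch and tag in the repository.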