Re: [HACKERS] Reducing buildfarm disk usage: remove temp installs when done
On 01/19/2015 09:53 AM, Tom Lane wrote:
> Andrew Dunstan writes:
>> But I'm wondering if we should look at using the tricks git-new-workdir
>> uses, setting up symlinks instead of a full clone. Then we'd have one
>> clone with a bunch of different work dirs. That plus a bit of explicitly
>> done garbage collection and possibly a periodic re-clone might do the trick.
>
> Yeah, I was wondering whether it'd be okay to depend on git-new-workdir.
> That would fix the problem pretty nicely. But in the installations I've
> seen, that's not in PATH but squirreled away in some hard-to-guess library
> directory ...

We should move this discussion to the buildfarm members list. I'll be
publishing a patch there.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reducing buildfarm disk usage: remove temp installs when done
On 1/19/15 1:07 PM, Andres Freund wrote:
> On 2015-01-18 17:48:11 -0500, Tom Lane wrote:
>> One of the biggest causes of buildfarm run failures is "out of disk
>> space". That's not just because people are running buildfarm critters
>> on small slow machines; it's because "make check-world" is an enormous
>> space hog. Some numbers from current HEAD:
>>
>> clean source tree:              120MB
>> built source tree:              400MB
>> tree after make check-world:    3GB
>>
>> (This is excluding ~250MB for one's git repo.)
>>
>> The reason for all the bloat is the temporary install trees that we
>> create, which tend to eat up about 100MB apiece, and there are dozens
>> of them (eg, one per testable contrib module). Those don't get removed
>> until the end of the test run, so the usage is cumulative.
>>
>> The attached proposed patch removes each temp install tree as soon as
>> we're done with it, in the normal case where no error was detected.
>> This brings the peak space usage down from ~3GB to ~750MB.
>
> I was wondering before if we couldn't always do the temp installation
> into $top_builddir/tmp_install or something like it. With an additional
> small ugly hack on top we could even avoid reinstalling for every
> target in check-world.

FWIW, if anyone's going to do some serious tinkering in here, it'd be
really nice to create a separate utility for managing temporary installs.
That would make it trivial for PGXN modules to use something other than
pg_regress for their test framework.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Reducing buildfarm disk usage: remove temp installs when done
On 2015-01-18 17:48:11 -0500, Tom Lane wrote:
> One of the biggest causes of buildfarm run failures is "out of disk
> space". That's not just because people are running buildfarm critters
> on small slow machines; it's because "make check-world" is an enormous
> space hog. Some numbers from current HEAD:
>
> clean source tree:              120MB
> built source tree:              400MB
> tree after make check-world:    3GB
>
> (This is excluding ~250MB for one's git repo.)
>
> The reason for all the bloat is the temporary install trees that we
> create, which tend to eat up about 100MB apiece, and there are dozens
> of them (eg, one per testable contrib module). Those don't get removed
> until the end of the test run, so the usage is cumulative.
>
> The attached proposed patch removes each temp install tree as soon as
> we're done with it, in the normal case where no error was detected.
> This brings the peak space usage down from ~3GB to ~750MB.

I was wondering before if we couldn't always do the temp installation
into $top_builddir/tmp_install or something like it. With an additional
small ugly hack on top we could even avoid reinstalling for every
target in check-world.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Reducing buildfarm disk usage: remove temp installs when done
On 01/19/2015 09:53 AM, Tom Lane wrote:
> Andrew Dunstan writes:
>> But I'm wondering if we should look at using the tricks git-new-workdir
>> uses, setting up symlinks instead of a full clone. Then we'd have one
>> clone with a bunch of different work dirs. That plus a bit of explicitly
>> done garbage collection and possibly a periodic re-clone might do the trick.
>
> Yeah, I was wondering whether it'd be okay to depend on git-new-workdir.
> That would fix the problem pretty nicely. But in the installations I've
> seen, that's not in PATH but squirreled away in some hard-to-guess library
> directory ...

Yeah. Luckily, there are really only half a dozen or so lines of script
that do the actual work - the rest is sanity checks. I think we can
replicate that without requiring the script. I'll have a stab later in
the week.

cheers

andrew
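For readers following along, here is a rough C sketch of the symlink idea discussed above: a second work directory gets its own .git directory whose shared entries are symlinks into the original clone. This is not the buildfarm client's code and not the git-new-workdir script itself. The function name link_shared_git_entries, the WORKDIR_PATH_MAX limit, and the choice of which entries to link are all assumptions made for illustration. As far as I recall, the real script also copies HEAD rather than linking it and then runs a checkout, which this sketch leaves out.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Arbitrary path length limit chosen for this sketch. */
#define WORKDIR_PATH_MAX 4096

/*
 * Hypothetical helper: create new_gitdir (if needed) and fill it with
 * symlinks pointing at the same-named entries of orig_gitdir.
 *
 * orig_gitdir should be an absolute path, because a relative symlink
 * target would be resolved relative to the link's own directory.
 * Only top-level entry names are handled; nested names such as
 * "logs/refs" would need their parent directory created first.
 *
 * Returns the number of links created, or -1 on the first error.
 */
int
link_shared_git_entries(const char *orig_gitdir, const char *new_gitdir,
                        const char *const *names, int nnames)
{
    int created = 0;

    assert(orig_gitdir != NULL && new_gitdir != NULL && names != NULL);

    if (mkdir(new_gitdir, 0777) != 0 && errno != EEXIST)
        return -1;

    for (int i = 0; i < nnames; i++)
    {
        char target[WORKDIR_PATH_MAX];
        char linkpath[WORKDIR_PATH_MAX];
        int  n;

        n = snprintf(target, sizeof(target), "%s/%s", orig_gitdir, names[i]);
        if (n < 0 || (size_t) n >= sizeof(target))
            return -1;

        n = snprintf(linkpath, sizeof(linkpath), "%s/%s", new_gitdir, names[i]);
        if (n < 0 || (size_t) n >= sizeof(linkpath))
            return -1;

        /* The link is created even if the target does not exist yet. */
        if (symlink(target, linkpath) != 0)
            return -1;
        created++;
    }

    return created;
}
```

A caller would pass something like the names "objects", "refs" and "config", so that every work dir shares one object store and newly fetched objects are stored only once. Which entries are safe to share depends on the git version in use, so treat any such list as something to verify rather than copy.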
Re: [HACKERS] Reducing buildfarm disk usage: remove temp installs when done
Andrew Dunstan writes:
> But I'm wondering if we should look at using the tricks git-new-workdir
> uses, setting up symlinks instead of a full clone. Then we'd have one
> clone with a bunch of different work dirs. That plus a bit of explicitly
> done garbage collection and possibly a periodic re-clone might do the trick.

Yeah, I was wondering whether it'd be okay to depend on git-new-workdir.
That would fix the problem pretty nicely. But in the installations I've
seen, that's not in PATH but squirreled away in some hard-to-guess library
directory ...

regards, tom lane
Re: [HACKERS] Reducing buildfarm disk usage: remove temp installs when done
On 01/19/2015 12:28 AM, Tom Lane wrote:
>> An alternative would be to remove the pgsql directory at the end of the
>> run and thus do a complete fresh checkout each run. As you say it would
>> cost some time but save some space. At least it would be doable as an
>> option, not sure I'd want to make it non-optional.
>
> What I was thinking is that a complete-fresh-checkout approach would
> remove the need for the copy_source step that happens now, thus buying
> back at least most of the I/O cost. But that's only considering the
> working tree. The real issue here seems to be about having duplicative
> git repos ... seems like we ought to be able to avoid that.

It won't save a copy in the case of a vpath build, because there's no
copying done then.

But I'm wondering if we should look at using the tricks git-new-workdir
uses, setting up symlinks instead of a full clone. Then we'd have one
clone with a bunch of different work dirs. That plus a bit of explicitly
done garbage collection and possibly a periodic re-clone might do the trick.

cheers

andrew
Re: [HACKERS] Reducing buildfarm disk usage: remove temp installs when done
Andrew Dunstan writes:
> On 01/18/2015 09:20 PM, Tom Lane wrote:
>> What I see on dromedary, which has been around a bit less than a year,
>> is that the at-rest space consumption for all 6 active branches is
>> 2.4G even though a single copy of the git repo is just over 400MB:
>> $ du -hsc pgmirror.git HEAD REL*
>> 416M    pgmirror.git
>> 363M    HEAD
>> 345M    REL9_0_STABLE
>> 351M    REL9_1_STABLE
>> 354M    REL9_2_STABLE
>> 358M    REL9_3_STABLE
>> 274M    REL9_4_STABLE
>> 2.4G    total

> This isn't happening for me. Here's crake:
> [andrew@emma root]$ du -shc pgmirror.git/ [RH]*/pgsql
> 218M    pgmirror.git/
> 149M    HEAD/pgsql
> 134M    REL9_0_STABLE/pgsql
> 138M    REL9_1_STABLE/pgsql
> 140M    REL9_2_STABLE/pgsql
> 143M    REL9_3_STABLE/pgsql
> 146M    REL9_4_STABLE/pgsql
> 1.1G    total
> Maybe you need some git garbage collection?

Weird ... for me, dromedary and prairiedog are both showing very similar
numbers. Shouldn't GC be automatic? These machines are not running
latest and greatest git (looks like 1.7.3.1 and 1.7.9.6 respectively),
maybe that has something to do with it?

A fresh clone from git://git.postgresql.org/git/postgresql.git right now
is 167MB (using dromedary's git version), so we're both showing some bloat
over the minimum possible repo size, but it's curious that mine is so much
worse.

But the larger point is that git fetch does not, AFAICT, have the same
kind of optimization that git clone does to do hard-linking when copying
an object from a local source repo. With or without GC, the resulting
duplicative storage is going to be the dominant effect after a while on a
machine tracking a full set of branches.

> An alternative would be to remove the pgsql directory at the end of the
> run and thus do a complete fresh checkout each run. As you say it would
> cost some time but save some space. At least it would be doable as an
> option, not sure I'd want to make it non-optional.

What I was thinking is that a complete-fresh-checkout approach would
remove the need for the copy_source step that happens now, thus buying
back at least most of the I/O cost. But that's only considering the
working tree. The real issue here seems to be about having duplicative
git repos ... seems like we ought to be able to avoid that.

regards, tom lane
Re: [HACKERS] Reducing buildfarm disk usage: remove temp installs when done
On 01/18/2015 09:20 PM, Tom Lane wrote:
> Andrew Dunstan writes:
>> On 01/18/2015 05:48 PM, Tom Lane wrote:
>>> One of the biggest causes of buildfarm run failures is "out of disk
>>> space". That's not just because people are running buildfarm critters
>>> on small slow machines; it's because "make check-world" is an enormous
>>> space hog. Some numbers from current HEAD:
>>
>> I don't have an issue, but you should be aware that the buildfarm
>> doesn't in fact run "make check-world", and it doesn't do a test install
>> for each contrib module, since it runs "installcheck", not "check" for
>> those. It also cleans up some data directories as it goes.
>
> Darn. I knew that it didn't use check-world per se, but I'd supposed
> it was doing something morally equivalent. But I checked just now and
> didn't see the space consumption of the pgsql.build + inst trees going
> much above about 750MB, so it's clearly not as bad as "make check-world".
>
> I think the patch I proposed is still worthwhile though, because it
> looks like the buildfarm is doing this on a case-by-case basis and
> missing some cases: I see the tmp_check directories for pg_upgrade and
> test_decoding sticking around till the end of the run. That could
> be fixed in the script of course, but why not have pg_regress do it?
>
> Also, investigating space consumption on my actual buildfarm critters,
> it seems like there might be some low hanging fruit in terms of git
> checkout management. It looks to me like each branch has a git repo
> that only shares objects that existed as of the initial cloning, so
> that over time each branch eats more and more unshared space. Also
> I wonder about the value of keeping around a checked-out tree per
> branch and copying it each time rather than just checking out fresh.
>
> What I see on dromedary, which has been around a bit less than a year,
> is that the at-rest space consumption for all 6 active branches is
> 2.4G even though a single copy of the git repo is just over 400MB:
>
> $ du -hsc pgmirror.git HEAD REL*
> 416M    pgmirror.git
> 363M    HEAD
> 345M    REL9_0_STABLE
> 351M    REL9_1_STABLE
> 354M    REL9_2_STABLE
> 358M    REL9_3_STABLE
> 274M    REL9_4_STABLE
> 2.4G    total
>
> It'd presumably be worse on a critter that's existed longer.
>
> Curious to know if you've looked into alternatives here. I realize
> that the tradeoffs might be different with an external git repo,
> but for one being managed by the buildfarm script, it seems like
> we could do better than this space-wise, for (maybe) little time
> penalty. I'd be willing to do some experimenting if you don't have
> time for it.

This isn't happening for me. Here's crake:

[andrew@emma root]$ du -shc pgmirror.git/ [RH]*/pgsql
218M    pgmirror.git/
149M    HEAD/pgsql
134M    REL9_0_STABLE/pgsql
138M    REL9_1_STABLE/pgsql
140M    REL9_2_STABLE/pgsql
143M    REL9_3_STABLE/pgsql
146M    REL9_4_STABLE/pgsql
1.1G    total

Maybe you need some git garbage collection?

An alternative would be to remove the pgsql directory at the end of the
run and thus do a complete fresh checkout each run. As you say it would
cost some time but save some space. At least it would be doable as an
option, not sure I'd want to make it non-optional.

cheers

andrew
Re: [HACKERS] Reducing buildfarm disk usage: remove temp installs when done
Andrew Dunstan writes:
> On 01/18/2015 05:48 PM, Tom Lane wrote:
>> One of the biggest causes of buildfarm run failures is "out of disk
>> space". That's not just because people are running buildfarm critters
>> on small slow machines; it's because "make check-world" is an enormous
>> space hog. Some numbers from current HEAD:

> I don't have an issue, but you should be aware that the buildfarm
> doesn't in fact run "make check-world", and it doesn't do a test install
> for each contrib module, since it runs "installcheck", not "check" for
> those. It also cleans up some data directories as it goes.

Darn. I knew that it didn't use check-world per se, but I'd supposed
it was doing something morally equivalent. But I checked just now and
didn't see the space consumption of the pgsql.build + inst trees going
much above about 750MB, so it's clearly not as bad as "make check-world".

I think the patch I proposed is still worthwhile though, because it
looks like the buildfarm is doing this on a case-by-case basis and
missing some cases: I see the tmp_check directories for pg_upgrade and
test_decoding sticking around till the end of the run. That could
be fixed in the script of course, but why not have pg_regress do it?

Also, investigating space consumption on my actual buildfarm critters,
it seems like there might be some low hanging fruit in terms of git
checkout management. It looks to me like each branch has a git repo
that only shares objects that existed as of the initial cloning, so
that over time each branch eats more and more unshared space. Also
I wonder about the value of keeping around a checked-out tree per
branch and copying it each time rather than just checking out fresh.

What I see on dromedary, which has been around a bit less than a year,
is that the at-rest space consumption for all 6 active branches is
2.4G even though a single copy of the git repo is just over 400MB:

$ du -hsc pgmirror.git HEAD REL*
416M    pgmirror.git
363M    HEAD
345M    REL9_0_STABLE
351M    REL9_1_STABLE
354M    REL9_2_STABLE
358M    REL9_3_STABLE
274M    REL9_4_STABLE
2.4G    total

It'd presumably be worse on a critter that's existed longer.

Curious to know if you've looked into alternatives here. I realize
that the tradeoffs might be different with an external git repo,
but for one being managed by the buildfarm script, it seems like
we could do better than this space-wise, for (maybe) little time
penalty. I'd be willing to do some experimenting if you don't have
time for it.

regards, tom lane
Re: [HACKERS] Reducing buildfarm disk usage: remove temp installs when done
On 01/18/2015 05:48 PM, Tom Lane wrote:
> One of the biggest causes of buildfarm run failures is "out of disk
> space". That's not just because people are running buildfarm critters
> on small slow machines; it's because "make check-world" is an enormous
> space hog. Some numbers from current HEAD:
>
> clean source tree:              120MB
> built source tree:              400MB
> tree after make check-world:    3GB
>
> (This is excluding ~250MB for one's git repo.)
>
> The reason for all the bloat is the temporary install trees that we
> create, which tend to eat up about 100MB apiece, and there are dozens
> of them (eg, one per testable contrib module). Those don't get removed
> until the end of the test run, so the usage is cumulative.
>
> The attached proposed patch removes each temp install tree as soon as
> we're done with it, in the normal case where no error was detected.
> This brings the peak space usage down from ~3GB to ~750MB.
>
> To make things better in the buildfarm, we'd have to back-patch this into
> all active branches, but I don't see any big problem with doing so.
> Any objections?

I don't have an issue, but you should be aware that the buildfarm
doesn't in fact run "make check-world", and it doesn't do a test install
for each contrib module, since it runs "installcheck", not "check" for
those. It also cleans up some data directories as it goes.

cheers

andrew
Re: [HACKERS] Reducing buildfarm disk usage: remove temp installs when done
On Mon, Jan 19, 2015 at 7:48 AM, Tom Lane wrote:
> To make things better in the buildfarm, we'd have to back-patch this into
> all active branches, but I don't see any big problem with doing so.
> Any objections?

Back-patching sounds like a good idea to me. At least this will allow
hamster to build all the active branches.

--
Michael
[HACKERS] Reducing buildfarm disk usage: remove temp installs when done
One of the biggest causes of buildfarm run failures is "out of disk
space". That's not just because people are running buildfarm critters
on small slow machines; it's because "make check-world" is an enormous
space hog. Some numbers from current HEAD:

clean source tree:              120MB
built source tree:              400MB
tree after make check-world:    3GB

(This is excluding ~250MB for one's git repo.)

The reason for all the bloat is the temporary install trees that we
create, which tend to eat up about 100MB apiece, and there are dozens
of them (eg, one per testable contrib module). Those don't get removed
until the end of the test run, so the usage is cumulative.

The attached proposed patch removes each temp install tree as soon as
we're done with it, in the normal case where no error was detected.
This brings the peak space usage down from ~3GB to ~750MB.

To make things better in the buildfarm, we'd have to back-patch this into
all active branches, but I don't see any big problem with doing so.
Any objections?

regards, tom lane

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index caae3f0..ee3b80b 100644
*** a/src/test/regress/pg_regress.c
--- b/src/test/regress/pg_regress.c
*************** regression_main(int argc, char *argv[],
*** 2668,2673 ****
--- 2668,2686 ----
          stop_postmaster();
      }
  
+     /*
+      * If there were no errors, remove the temp installation immediately to
+      * conserve disk space.  (If there were errors, we leave the installation
+      * in place for possible manual investigation.)
+      */
+     if (temp_install && fail_count == 0 && fail_ignore_count == 0)
+     {
+         header(_("removing temporary installation"));
+         if (!rmtree(temp_install, true))
+             fprintf(stderr, _("\n%s: could not remove temp installation \"%s\": %s\n"),
+                     progname, temp_install, strerror(errno));
+     }
+ 
      fclose(logfile);
  
      /*
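For anyone who wants to try the policy outside of pg_regress, the following is a minimal standalone sketch of the same rule: delete the temporary install tree only when both failure counters are zero, and otherwise leave it for inspection. It is not the patch itself. remove_tree is a stand-in for PostgreSQL's rmtree() built on POSIX nftw(), and cleanup_temp_install is a hypothetical wrapper name introduced only for this example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <ftw.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * nftw() callback: remove a single file or directory.  With FTW_DEPTH,
 * a directory is visited only after everything inside it.
 */
static int
remove_entry(const char *path, const struct stat *sb, int typeflag,
             struct FTW *ftwbuf)
{
    (void) sb;
    (void) typeflag;
    (void) ftwbuf;
    return remove(path);
}

/*
 * Stand-in for PostgreSQL's rmtree(): recursively delete the tree rooted
 * at path, without following symlinks.  Returns true on success.
 */
bool
remove_tree(const char *path)
{
    assert(path != NULL);
    return nftw(path, remove_entry, 16, FTW_DEPTH | FTW_PHYS) == 0;
}

/*
 * Hypothetical wrapper mirroring the condition in the patch: remove the
 * temp install only if a temp install exists and no test failed, counting
 * ignored failures too.  Returns true if the tree was removed.
 */
bool
cleanup_temp_install(const char *temp_install, int fail_count,
                     int fail_ignore_count)
{
    if (temp_install == NULL || fail_count != 0 || fail_ignore_count != 0)
        return false;       /* keep the tree for manual investigation */

    if (!remove_tree(temp_install))
    {
        perror("could not remove temp installation");
        return false;
    }
    return true;
}
```

Because each tree is deleted right after its own test run succeeds, peak usage is bounded by roughly one temp install at a time instead of the sum of all of them, which is the effect described above.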