Re: A new and better way to do make readmes?
[ Sorry to be so late in following up on this; lost track for a while ]

On Sat, 28 Jan 2012 16:53:17 + Matthew Seaman m.sea...@infracaninophile.co.uk wrote:

> On 28/01/2012 16:28, Conrad J. Sabatier wrote:
>> rubbing eyes in disbelief
>>
>> Am I understanding you correctly? Are you saying you built 20,000+ port READMEs in only 9 seconds?! How is that possible? Or do you mean 9 seconds for each one?
>
> 9 seconds sounds quite reasonable for generating 23000 or so files.

It sounds incredible to me! :-)

>>> Selective updating isn't going to help because 99.9% of the time is spent in the categories and it only takes a single port update to make a category file obsolete.
>>
>> This is the part I find troubling. It would seem that it should be more work to create an individual port README, with its plucking the appropriate line out of the INDEX-* file and then parsing it into its respective pieces and filling in a template, than to simply string together a list of references to a bunch of already built port READMEs into a category README. What am I not getting here?
>
> No -- you're quite right. You could generate the category README.html files entirely from the data in the INDEX. It's not quite as easy as all that, because there aren't entries for each category separately, so you'll have to parse the structure out of all of the paths in the INDEX.

Well, the idea I had in mind was that, if all of the individual ports' README.html files are already in place, then it should be trivial to just ls or find them under each category to fill in the category's README.html. No need to reference the INDEX or anything else. Or???

The workaround method I've been running out of cron for the last month or so is:

1) Create a sentinel file under /tmp to use as a timestamp, just before running cvs update on ports (I update my ports tree from a local copy of the CVS repo maintained via csup)

2) After cvs completes, look for any port directories containing updates (check timestamps against the sentinel file) and do a make readme for each one:

    find $PORTSDIR -type f ! -path '*/CVS/*' -newercm $SENTINEL -depth 3 | xargs dirname | sort -u | xargs -I@ /bin/sh -c 'cd @ && make readme'

3) Last, but not least, build the category README.html for any categories with ports containing newly updated README.html files.

I have noticed while doing this that, as you mentioned, the category READMEs take considerably longer than the individual ports'. I don't even bother to rebuild the top-level file, since it's basically unchanging anyway.

>>> I think the way to speed this up is to have the script generate the category files too. There's no point in bringing in the top-level README since that's already fast.
>>
>> So what's making the category READMEs so slow then?
>
> The big problem with performance in all this INDEX and README.html building is that it takes a relatively long time to run make(1) within any port or category directory. make(1) has to read in a lot of other files and stat(2) many more[*] -- all of which involves a lot of random-access disk IO, and that's always going to take quite a lot of time. Now, doing 'make readme' in a category directory doesn't just run make in that directory, but also in every port in that category. Popular categories can contain many hundreds of ports.

I'm a little rusty on the actual mechanics of make, but shouldn't it be possible to run a single, over-arching make on each category that wouldn't need to spawn a bunch of sub-makes?

> Maybe I should add README.html generation to my FreeBSD::Portindex stuff. Should be pretty simple -- all the necessary bits are readily available and it is just a matter of formatting it as HTML and printing it out.

Maybe? Whaddya mean, maybe? :-) Sounds like it would definitely be worth doing!

> Cheers,
> Matthew
>
> [*] Running 'make -dA' with maximum debug output is quite enlightening, as is running make under truss(1)

Enlightening, perhaps. Sometimes overwhelming, is more like it. :-)

--
Conrad J. Sabatier
conr...@cox.net
___
freebsd-ports@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-ports
To unsubscribe, send any mail to freebsd-ports-unsubscr...@freebsd.org
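For readers who want to experiment with the sentinel idea outside of find(1), here is a minimal Python sketch of steps 2 and 3 of the workaround. It is not the cron job described above: the function names `changed_port_dirs` and `affected_categories` are made up for illustration, and it compares plain modification times, whereas find's `-newercm` compares each file's inode change time against the sentinel's modification time.

```python
import os
from pathlib import Path


def changed_port_dirs(ports_dir, sentinel):
    """Return sorted 'category/port' names holding files newer than the sentinel.

    Assumption: mtime is compared to mtime; find's -newercm uses the
    file's ctime instead, so results can differ slightly.
    """
    root = Path(ports_dir)
    cutoff = Path(sentinel).stat().st_mtime
    changed = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Never descend into CVS bookkeeping directories.
        dirnames[:] = [d for d in dirnames if d != "CVS"]
        rel = Path(dirpath).relative_to(root)
        if len(rel.parts) < 2:
            continue  # still at the top level or inside a category
        # We are in category/port: look only at files directly inside it
        # (depth 3 from the root) and do not descend any further.
        dirnames[:] = []
        for name in filenames:
            if (Path(dirpath) / name).stat().st_mtime > cutoff:
                changed.add(rel.as_posix())
                break
    return sorted(changed)


def affected_categories(port_dirs):
    """Categories whose README.html would need rebuilding (step 3)."""
    return sorted({p.split("/")[0] for p in port_dirs})
```

Each name returned by `changed_port_dirs` is a directory in which `make readme` would then be run, and `affected_categories` gives the list of category files to rebuild afterwards.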
Re: A new and better way to do make readmes?
On Sat, 28 Jan 2012 18:44:48 +0100 Torfinn Ingolfsen tin...@gmail.com wrote:

> Hello,
>
> On Sat, Jan 28, 2012 at 3:03 AM, Conrad J. Sabatier conr...@cox.net wrote:
>> I've been thinking for a long time that we need a better way to do make readmes, one that would be properly integrated into our ports Mk infrastructure, to take advantage of make's ability to recognize which files are up-to-date and which really do need rebuilding.
>>
>> I like to make sure my README.html files are all up-to-date after my nightly ports tree update, but with the current scheme, that means either rebuilding *all* of the files in the tree, or (as I'm doing at present) using some sort of kludgey (kludgy?) workaround.
>
> So people are actually using the readme files? Are many people using them? I ask because I *never* use them (unless they are used by 'make search'?), I always use freshports.org (BTW, thanks for an excellent service!) when I need to find out anything about a port.

Well, in actual practice, it's true, I don't use them a *lot*, but I do use them from time to time when I'm looking for a new port to install for a certain purpose. It's nice to have up-to-date README.html files locally when the need arises. But they sure are expensive to maintain currently.

--
Conrad J. Sabatier
conr...@cox.net
Re: A new and better way to do make readmes?
On Thu, Feb 02, 2012 at 03:21:37PM -0600, Conrad J. Sabatier thus spake:

> [snip]
>
> Enlightening, perhaps. Sometimes overwhelming, is more like it. :-)

Not too fancy, but I used this when I was updating the readmes to not break.

    #!/bin/sh
    cd /usr/ports
    # Outer loop: every category listed in the top-level SUBDIR
    for i in `make -V SUBDIR | sed s/local//g`; do
        # Inner loop: every port listed in that category's SUBDIR
        for p in `make -C $i -V SUBDIR`; do
            echo $i/$p
            sudo make -C $i/$p readme
        done
    done > ~/readmes.log

-jgh

--
Jason Helfman
System Administrator
experts-exchange.com
http://www.experts-exchange.com/M_4830110.html
E4AD 7CF1 1396 27F6 79DD 4342 5E92 AD66 8C8C FBA5
Re: A new and better way to do make readmes?
On Thu, 2 Feb 2012 13:25:14 -0800 Jason Helfman jhelf...@e-e.com wrote:

> On Thu, Feb 02, 2012 at 03:21:37PM -0600, Conrad J. Sabatier thus spake:
>> [snip]
>>
>> The workaround method I've been running out of cron for the last month or so is:
>>
>> 1) Create a sentinel file under /tmp to use as a timestamp, just before running cvs update on ports (I update my ports tree from a local copy of the CVS repo maintained via csup)
>>
>> 2) After cvs completes, look for any port directories containing updates (check timestamps against the sentinel file) and do a make readme for each one:
>>
>>     find $PORTSDIR -type f ! -path '*/CVS/*' -newercm $SENTINEL -depth 3 | xargs dirname | sort -u | xargs -I@ /bin/sh -c 'cd @ && make readme'
>>
>> 3) Last, but not least, build the category README.html for any categories with ports containing newly updated README.html files.
>>
>> I have noticed while doing this that, as you mentioned, the category READMEs take considerably longer than the individual ports'. I don't even bother to rebuild the top-level file, since it's basically unchanging anyway.
>>
>> [snip]
>
> Not too fancy, but I used this when I was updating the readmes to not break.
>
>     #!/bin/sh
>     cd /usr/ports
>     # Outer loop: every category listed in the top-level SUBDIR
>     for i in `make -V SUBDIR | sed s/local//g`; do
>         # Inner loop: every port listed in that category's SUBDIR
>         for p in `make -C $i -V SUBDIR`; do
>             echo $i/$p
>             sudo make -C $i/$p readme
>         done
>     done > ~/readmes.log
>
> -jgh

Interesting. I'll take a look at using that. Thanks!

--
Conrad J. Sabatier
conr...@cox.net
Re: A new and better way to do make readmes?
On Fri, 27 Jan 2012 20:03:25 -0600 Conrad J. Sabatier wrote:

> I've been thinking for a long time that we need a better way to do make readmes, one that would be properly integrated into our ports Mk infrastructure, to take advantage of make's ability to recognize which files are up-to-date and which really do need rebuilding.

This won't help, and I think there's a better way that will make it up to 700 times faster.

When a 'make readmes' is done at the top level, the top-level and category READMEs are created by make targets and the per-port READMEs are created by a perl script in one go from the INDEX-n file. I once timed this and the 64 category READMEs took 2 hours, but the ~20,000 port READMEs only took about 9 seconds.

Selective updating isn't going to help because 99.9% of the time is spent in the categories and it only takes a single port update to make a category file obsolete.

I think the way to speed this up is to have the script generate the category files too. There's no point in bringing in the top-level README since that's already fast.

I've been toying with the idea of doing this, but have never got around to it. If anyone wants to have a go I think it would be sensible to write it in awk, since perl is no longer in the base system and the existing perl script isn't really complex enough to be worth hanging on to.
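To make the "have the script generate the category files too" idea concrete, here is a minimal sketch of the grouping step, written in Python rather than the awk suggested above. It assumes the usual pipe-separated INDEX layout in which field 1 is the package name, field 2 the port path and field 4 the one-line comment; the function name `ports_by_category` and the sample lines are invented for illustration.

```python
from collections import defaultdict


def ports_by_category(index_lines, ports_dir="/usr/ports"):
    """Group (port, pkgname, comment) tuples by category directory.

    The category is taken from the port path (second INDEX field),
    since the INDEX has no per-category entries of its own.
    """
    prefix = ports_dir.rstrip("/") + "/"
    categories = defaultdict(list)
    for line in index_lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 4 or not fields[1].startswith(prefix):
            continue  # skip malformed or foreign lines
        parts = fields[1][len(prefix):].split("/")
        if len(parts) != 2:
            continue  # expect exactly category/port
        category, port = parts
        categories[category].append((port, fields[0], fields[3]))
    for entries in categories.values():
        entries.sort()
    return dict(categories)


# Made-up sample lines, shortened to the fields used above.
SAMPLE = [
    "vim-7.3|/usr/ports/editors/vim|/usr/local|Vi workalike",
    "joe-3.7|/usr/ports/editors/joe|/usr/local|Joe's own editor",
    "lighttpd-1.4|/usr/ports/www/lighttpd|/usr/local|A web server",
]

for category, entries in sorted(ports_by_category(SAMPLE).items()):
    print(category, [port for port, _, _ in entries])
```

A single pass over the INDEX yields every category's port list, so no make(1) run per port or per category is needed for this step.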
Re: A new and better way to do make readmes?
On Sat, 28 Jan 2012 14:37:34 + RW rwmailli...@googlemail.com wrote:

> On Fri, 27 Jan 2012 20:03:25 -0600 Conrad J. Sabatier wrote:
>> I've been thinking for a long time that we need a better way to do make readmes, one that would be properly integrated into our ports Mk infrastructure, to take advantage of make's ability to recognize which files are up-to-date and which really do need rebuilding.
>
> This won't help, and I think there's a better way that will make it up to 700 times faster.
>
> When a 'make readmes' is done at the top level, the top-level and category READMEs are created by make targets and the per-port READMEs are created by a perl script in one go from the INDEX-n file. I once timed this and the 64 category READMEs took 2 hours, but the ~20,000 port READMEs only took about 9 seconds.

rubbing eyes in disbelief

Am I understanding you correctly? Are you saying you built 20,000+ port READMEs in only 9 seconds?! How is that possible? Or do you mean 9 seconds for each one?

> Selective updating isn't going to help because 99.9% of the time is spent in the categories and it only takes a single port update to make a category file obsolete.

This is the part I find troubling. It would seem that it should be more work to create an individual port README, with its plucking the appropriate line out of the INDEX-* file and then parsing it into its respective pieces and filling in a template, than to simply string together a list of references to a bunch of already built port READMEs into a category README. What am I not getting here?

> I think the way to speed this up is to have the script generate the category files too. There's no point in bringing in the top-level README since that's already fast.

So what's making the category READMEs so slow then?

> I've been toying with the idea of doing this, but have never got around to it. If anyone wants to have a go I think it would be sensible to write it in awk, since perl is no longer in the base system and the existing perl script isn't really complex enough to be worth hanging on to.

Oooo, awk! Been a while since I wrote any sizeable bit of code in it, but I do remember it was rather fun to work with. :-)

I'm still not sure I read that paragraph above correctly, though (re: the times). :-)

--
Conrad J. Sabatier
conr...@cox.net
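Since the timings quoted above caused some disbelief, the arithmetic can be spelled out. The sketch below uses only the figures given in the thread (64 category READMEs in about 2 hours, roughly 20,000 port READMEs in about 9 seconds, both for the whole batch); the variable names are arbitrary.

```python
# Figures as quoted in the thread; both are totals, not per-file times.
category_seconds = 2 * 60 * 60   # 64 category READMEs, about 2 hours
port_seconds = 9                 # ~20,000 port READMEs, about 9 seconds

per_category = category_seconds / 64        # seconds per category README
per_port = port_seconds / 20_000            # seconds per port README
share = category_seconds / (category_seconds + port_seconds)
ratio = category_seconds / port_seconds     # category time vs. port time

print("per category README: %.1f s" % per_category)
print("per port README:     %.5f s" % per_port)
print("share spent on categories: %.1f%%" % (share * 100))
print("category total / port total: %.0f" % ratio)
```

On these numbers a category README costs a couple of minutes each while a port README costs well under a millisecond, which is consistent with the "99.9% of the time is spent in the categories" remark and is of the same order as the "up to 700 times faster" estimate.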
Re: A new and better way to do make readmes?
On 28/01/2012 16:28, Conrad J. Sabatier wrote:

> rubbing eyes in disbelief
>
> Am I understanding you correctly? Are you saying you built 20,000+ port READMEs in only 9 seconds?! How is that possible? Or do you mean 9 seconds for each one?

9 seconds sounds quite reasonable for generating 23000 or so files.

>> Selective updating isn't going to help because 99.9% of the time is spent in the categories and it only takes a single port update to make a category file obsolete.
>
> This is the part I find troubling. It would seem that it should be more work to create an individual port README, with its plucking the appropriate line out of the INDEX-* file and then parsing it into its respective pieces and filling in a template, than to simply string together a list of references to a bunch of already built port READMEs into a category README. What am I not getting here?

No -- you're quite right. You could generate the category README.html files entirely from the data in the INDEX. It's not quite as easy as all that, because there aren't entries for each category separately, so you'll have to parse the structure out of all of the paths in the INDEX.

>> I think the way to speed this up is to have the script generate the category files too. There's no point in bringing in the top-level README since that's already fast.
>
> So what's making the category READMEs so slow then?

The big problem with performance in all this INDEX and README.html building is that it takes a relatively long time to run make(1) within any port or category directory. make(1) has to read in a lot of other files and stat(2) many more[*] -- all of which involves a lot of random-access disk IO, and that's always going to take quite a lot of time. Now, doing 'make readme' in a category directory doesn't just run make in that directory, but also in every port in that category. Popular categories can contain many hundreds of ports.

Maybe I should add README.html generation to my FreeBSD::Portindex stuff. Should be pretty simple -- all the necessary bits are readily available and it is just a matter of formatting it as HTML and printing it out.

Cheers,

Matthew

[*] Running 'make -dA' with maximum debug output is quite enlightening, as is running make under truss(1)

--
Dr Matthew J Seaman MA, D.Phil.
7 Priory Courtyard, Flat 3
Ramsgate
Kent, CT11 9PW
PGP: http://www.infracaninophile.co.uk/pgpkey
JID: matt...@infracaninophile.co.uk
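The "just a matter of formatting it as HTML and printing it out" step might look roughly like the following Python sketch. It is not FreeBSD::Portindex code and does not use the ports tree's own README template; the function name `category_readme_html` and the page layout are made up, and the input is assumed to be a list of (port, package name, comment) tuples already extracted from the INDEX.

```python
from html import escape


def category_readme_html(category, entries):
    """Render one category page from (port, pkgname, comment) tuples.

    Hypothetical layout: a heading plus one list item per port, each
    linking to that port's own README.html.
    """
    items = []
    for port, pkgname, comment in sorted(entries):
        items.append(
            '<li><a href="{}/README.html">{}</a>: {}</li>'.format(
                escape(port, quote=True), escape(pkgname), escape(comment)
            )
        )
    title = "Ports in category " + escape(category)
    lines = [
        "<html><head><title>" + title + "</title></head>",
        "<body><h1>" + title + "</h1>",
        "<ul>",
    ]
    lines += items
    lines += ["</ul>", "</body></html>"]
    return "\n".join(lines)


# Made-up entries for illustration.
print(category_readme_html("editors", [
    ("vim", "vim-7.3", "Vi workalike"),
    ("joe", "joe-3.7", "Joe's own editor"),
]))
```

Because the page is built from data already in memory, producing it involves no make(1) invocation at all, which is where the time goes according to the explanation above.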
Re: A new and better way to do make readmes?
Hello,

On Sat, Jan 28, 2012 at 3:03 AM, Conrad J. Sabatier conr...@cox.net wrote:

> I've been thinking for a long time that we need a better way to do make readmes, one that would be properly integrated into our ports Mk infrastructure, to take advantage of make's ability to recognize which files are up-to-date and which really do need rebuilding.
>
> I like to make sure my README.html files are all up-to-date after my nightly ports tree update, but with the current scheme, that means either rebuilding *all* of the files in the tree, or (as I'm doing at present) using some sort of kludgey (kludgy?) workaround.

So people are actually using the readme files? Are many people using them? I ask because I *never* use them (unless they are used by 'make search'?), I always use freshports.org (BTW, thanks for an excellent service!) when I need to find out anything about a port.

--
Regards,
Torfinn Ingolfsen
Re: A new and better way to do make readmes?
Matthew Seaman said:

> The big problem with performance in all this INDEX and README.html building is that it takes a relatively long time to run make(1) within any port or category directory. make(1) has to read in a lot of other files and stat(2) many more[*] -- all of which involves a lot of random-access disk IO, and that's always going to take quite a lot of time. Now, doing 'make readme' in a category directory doesn't just run make in that directory, but also in every port in that category. Popular categories can contain many hundreds of ports.
>
> Maybe I should add README.html generation to my FreeBSD::Portindex stuff. Should be pretty simple -- all the necessary bits are readily available and it is just a matter of formatting it as HTML and printing it out.

Indeed, the following python script

http://www.lpthe.jussieu.fr/~talon/show_index.py

parses the index in a few seconds and can display exactly the same information as the README.html files on demand in a web browser, which is far cleaner than polluting the ports tree with the readmes. Alternatively, I have an fcgi version that can be coupled to web servers supporting fcgi, like lighttpd:

http://www.lpthe.jussieu.fr/~talon/show_index.fcgi

This was already done 5 years ago ...

--
Michel Talon
ta...@lpthe.jussieu.fr
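As an illustration of the on-demand approach (and not a copy of show_index.py, whose internals are not shown in this thread), the sketch below loads the INDEX once into a dictionary and answers lookups from memory. It assumes the usual pipe-separated INDEX layout with the package name, port path, comment, maintainer and categories in fields 1, 2, 4, 6 and 7; `load_index`, `describe` and the sample line are hypothetical.

```python
def load_index(index_lines, ports_dir="/usr/ports"):
    """Map 'category/port' to a small dict of INDEX fields."""
    prefix = ports_dir.rstrip("/") + "/"
    index = {}
    for line in index_lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 7 or not fields[1].startswith(prefix):
            continue  # skip lines that do not look like INDEX entries
        origin = fields[1][len(prefix):]
        index[origin] = {
            "name": fields[0],
            "comment": fields[3],
            "maintainer": fields[5],
            "categories": fields[6].split(),
        }
    return index


def describe(index, origin):
    """Return a short plain-text description, built on demand."""
    info = index.get(origin)
    if info is None:
        return "no such port: " + origin
    return "\n".join([
        "Port:       " + origin,
        "Package:    " + info["name"],
        "Comment:    " + info["comment"],
        "Maintainer: " + info["maintainer"],
        "Categories: " + ", ".join(info["categories"]),
    ])


# Made-up sample line for illustration.
SAMPLE_INDEX = [
    "vim-7.3|/usr/ports/editors/vim|/usr/local|Vi workalike|"
    "/usr/ports/editors/vim/pkg-descr|someone@example.org|editors",
]

print(describe(load_index(SAMPLE_INDEX), "editors/vim"))
```

Nothing is written into the ports tree; a web front end would only need to call something like `describe` (or an HTML variant of it) per request.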