Re: A new and better way to do make readmes?

2012-02-02 Thread Conrad J. Sabatier
[ Sorry to be so late in following up on this; lost track for a while ]

On Sat, 28 Jan 2012 16:53:17 +
Matthew Seaman m.sea...@infracaninophile.co.uk wrote:

 On 28/01/2012 16:28, Conrad J. Sabatier wrote:
  *rubbing eyes in disbelief*  Am I understanding you correctly?  Are
  you saying you built 20,000+ port READMEs in only 9 seconds?!  How
  is that possible?  Or do you mean 9 seconds for each one?
 
 9 seconds sounds quite reasonable for generating 23000 or so files.

It sounds incredible to me!  :-)

   Selective updating isn't going to help because 99.9% of the time
   is spent in the categories and it only takes a single port
   update to make a category file obsolete.
 
  This is the part I find troubling.  It would seem that it should be
  more work to create an individual port README, with its plucking the
  appropriate line out of the INDEX-* file and then parsing it into
  its respective pieces and filling in a template, than to simply
  string together a list of references to a bunch of already built
  port READMEs into a category README.
  
  What am I not getting here?
 
 No -- you're quite right.  You could generate the category README.html
 files entirely from the data in the INDEX.  It's not quite as easy as
 all that, because there aren't entries for each category separately,
 so you'll have to parse the structure out of all of the paths in the
 INDEX.

Well, the idea I had in mind was that, if all of the individual ports'
README.html files already are in place, then it should be trivial to
just ls or find them under each category to fill in the category's
README.html.  No need to reference the INDEX or anything else.  Or???
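
Just to make that concrete, something along these lines is what I'm
picturing -- purely a sketch on my part, and the HTML skeleton is a
made-up placeholder, not whatever markup the real readme target emits:

#!/bin/sh
# Sketch: build one category's README.html purely from the per-port
# README.html files already sitting on disk -- no INDEX lookup at all.
PORTSDIR=${PORTSDIR:-/usr/ports}
CATEGORY=${1:?usage: $0 category}

cd "$PORTSDIR/$CATEGORY" || exit 1
{
    echo "<html><head><title>FreeBSD Ports: $CATEGORY</title></head><body>"
    echo "<h1>$CATEGORY</h1><ul>"
    # one list entry per port that already has a README.html
    find . -mindepth 2 -maxdepth 2 -name README.html | sort |
    while read -r readme; do
        port=$(dirname "$readme"); port=${port#./}
        echo "<li><a href=\"$port/README.html\">$port</a></li>"
    done
    echo "</ul></body></html>"
} > README.html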

The workaround method I've been running out of cron for the last month
or so is:

1) Create a sentinel file under /tmp to use as a timestamp, just
before running cvs update on ports (I update my ports tree from a
local copy of the CVS repo maintained via csup)

2) After cvs completes, look for any port directories containing
updates (check timestamps against the sentinel file) and do a make
readme for each one:

find "$PORTSDIR" -type f ! -path '*/CVS/*' -newercm "$SENTINEL" -depth 3 |
xargs dirname |
sort -u | xargs -I@ /bin/sh -c 'cd "@" && make readme'

3) Last, but not least, build the category README.html for any
categories with ports containing newly updated README.html files.

I have noticed while doing this that, as you mentioned, the category
READMEs take considerably longer than the individual ports'.

I don't even bother to rebuild the top-level file, since it's basically
unchanging anyway.
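
For anyone curious, the three steps strung together look roughly like
this -- just a sketch; the sentinel path is arbitrary and the update step
stands in for whatever csup/cvs invocation you already run:

#!/bin/sh
# Rough sketch of the nightly cron job described above.
PORTSDIR=${PORTSDIR:-/usr/ports}
SENTINEL=/tmp/ports-readme-sentinel

# 1) drop a timestamp just before updating the tree
touch "$SENTINEL"

# ... run csup / cvs update against the local repo copy here ...

# 2) regenerate README.html in each updated port directory
#    (the find | xargs pipeline shown above goes here)

# 3) rebuild the category README.html wherever a port README.html just changed
#    (this still pays the full category-level "make readme" cost mentioned above)
find "$PORTSDIR" -depth 3 -name README.html -newercm "$SENTINEL" |
    xargs -n1 dirname | xargs -n1 dirname | sort -u |
    xargs -I@ /bin/sh -c 'cd "@" && make readme'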

   I think the way to speed this up is to have the script generate
   the category files too. There's no point in bringing in the
   top-level README since that's already fast.
 
  So what's making the category READMEs so slow then?
 
 The big problem with performance in all this INDEX and README.html
 building is that it takes a relatively long time to run make(1) within
 any port or category directory.  make(1) has to read in a lot
 of other files and stat(2) many more[*] -- all of which involves a
 lot of random-access disk IO, and that's always going to take quite a
 lot of time.  Now, doing 'make readme' in a category directory
 doesn't just run make in that directory, but also in every port in
 that category. Popular categories can contain many hundreds of ports.

I'm a little rusty on the actual mechanics of make, but shouldn't it be
possible to run a single, over-arching make on each category that
wouldn't need to spawn a bunch of sub-makes?

 Maybe I should add README.html generation to my FreeBSD::Portindex
 stuff.  Should be pretty simple -- all the necessary bits are readily
 available and it is just a matter of formatting it as HTML and
 printing it out.

Maybe?  Whaddya mean, maybe?  :-)  Sounds like it would definitely
be worth doing!

   Cheers,
 
   Matthew
 
 [*] Running 'make -dA' with maximum debug output is quite
 enlightening, as is running make under truss(1)

Enlightening, perhaps.  Sometimes overwhelming, is more like it.  :-)

-- 
Conrad J. Sabatier
conr...@cox.net


Re: A new and better way to do make readmes?

2012-02-02 Thread Conrad J. Sabatier
On Sat, 28 Jan 2012 18:44:48 +0100
Torfinn Ingolfsen tin...@gmail.com wrote:

 Hello,
 
 On Sat, Jan 28, 2012 at 3:03 AM, Conrad J. Sabatier conr...@cox.net
 wrote:
 
  I've been thinking for a long time that we need a better way to do
  make readmes, one that would be properly integrated into our
  ports Mk infrastructure, to take advantage of make's ability to
  recognize which files are up-to-date and which really do need
  rebuilding.
 
  I like to make sure my README.html files are all up-to-date after my
  nightly ports tree update, but with the current scheme, that means
  either rebuilding *all* of the files in the tree, or (as I'm doing
  at present) using some sort of kludgey (kludgy?) workaround.
 
 
 So people are actually using the readme files?
 Are many people using them?
 I ask because I *never* use them (unless they are used by 'make
 search'?), I always use freshports.org (BTW, thanks for an excellent
 service!) when I need to find out anything about a port.
 

Well, in actual practice, it's true, I don't use them a *lot*, but I do
use them from time to time when I'm looking for a new port to install
for a certain purpose.  It's nice to have up-to-date README.html files
locally when the need arises.  But they sure are expensive to maintain
currently.

-- 
Conrad J. Sabatier
conr...@cox.net


Re: A new and better way to do make readmes?

2012-02-02 Thread Jason Helfman

On Thu, Feb 02, 2012 at 03:21:37PM -0600, Conrad J. Sabatier thus spake:

[ Sorry to be so late in following up on this; lost track for a while ]

On Sat, 28 Jan 2012 16:53:17 +
Matthew Seaman m.sea...@infracaninophile.co.uk wrote:


On 28/01/2012 16:28, Conrad J. Sabatier wrote:
 *rubbing eyes in disbelief*  Am I understanding you correctly?  Are
 you saying you built 20,000+ port READMEs in only 9 seconds?!  How
 is that possible?  Or do you mean 9 seconds for each one?

9 seconds sounds quite reasonable for generating 23000 or so files.


It sounds incredible to me!  :-)


  Selective updating isn't going to help because 99.9% of the time
  is spent in the categories and it only takes a single port
  update to make a category file obsolete.

 This is the part I find troubling.  It would seem that it should be
 more work to create an individual port README, with its plucking the
 appropriate line out of the INDEX-* file and then parsing it into
 its respective pieces and filling in a template, than to simply
 string together a list of references to a bunch of already built
 port READMEs into a category README.

 What am I not getting here?

No -- you're quite right.  You could generate the category README.html
files entirely from the data in the INDEX.  It's not quite as easy as
all that, because there aren't entries for each category separately,
so you'll have to parse the structure out of all of the paths in the
INDEX.


Well, the idea I had in mind was that, if all of the individual ports'
README.html files already are in place, then it should be trivial to
just ls or find them under each category to fill in the category's
README.html.  No need to reference the INDEX or anything else.  Or???

The workaround method I've been running out of cron for the last month
or so is:

1) Create a sentinel file under /tmp to use as a timestamp, just
before running cvs update on ports (I update my ports tree from a
local copy of the CVS repo maintained via csup)

2) After cvs completes, look for any port directories containing
updates (check timestamps against the sentinel file) and do a make
readme for each one:

find "$PORTSDIR" -type f ! -path '*/CVS/*' -newercm "$SENTINEL" -depth 3 |
   xargs dirname |
   sort -u | xargs -I@ /bin/sh -c 'cd "@" && make readme'

3) Last, but not least, build the category README.html for any
categories with ports containing newly updated README.html files.

I have noticed while doing this that, as you mentioned, the category
READMEs take considerably longer than the individual ports'.

I don't even bother to rebuild the top-level file, since it's basically
unchanging anyway.


  I think the way to speed this up is to have the script generate
  the category files too. There's no point in bringing in the
  top-level README since that's already fast.

 So what's making the category READMEs so slow then?

The big problem with performance in all this INDEX and README.html
building is that it takes a relatively long time to run make(1) within
any port or category directory.  make(1) has to read in a lot
of other files and stat(2) many more[*] -- all of which involves a
lot of random-access disk IO, and that's always going to take quite a
lot of time.  Now, doing 'make readme' in a category directory
doesn't just run make in that directory, but also in every port in
that category. Popular categories can contain many hundreds of ports.


I'm a little rusty on the actual mechanics of make, but shouldn't it be
possible to run a single, over-arching make on each category that
wouldn't need to spawn a bunch of sub-makes?


Maybe I should add README.html generation to my FreeBSD::Portindex
stuff.  Should be pretty simple -- all the necessary bits are readily
available and it is just a matter of formatting it as HTML and
printing it out.


Maybe?  Whaddya mean, maybe?  :-)  Sounds like it would definitely
be worth doing!


Cheers,

Matthew

[*] Running 'make -dA' with maximum debug output is quite
enlightening, as is running make under truss(1)


Enlightening, perhaps.  Sometimes overwhelming, is more like it.  :-)



Not too fancy, but this is what I used when I was updating the readmes
so it wouldn't break.

#!/bin/sh
cd /usr/ports
for i in `make -V SUBDIR | sed s/local//g`; do
    for p in `make -C $i -V SUBDIR`; do
        echo $i/$p ; sudo make -C $i/$p readme
    done
done > ~/readmes.log

-jgh

--
Jason Helfman
System Administrator
experts-exchange.com
http://www.experts-exchange.com/M_4830110.html
E4AD 7CF1 1396 27F6 79DD  4342 5E92 AD66 8C8C FBA5


Re: A new and better way to do make readmes?

2012-02-02 Thread Conrad J. Sabatier
On Thu, 2 Feb 2012 13:25:14 -0800
Jason Helfman jhelf...@e-e.com wrote:

 On Thu, Feb 02, 2012 at 03:21:37PM -0600, Conrad J. Sabatier thus
 spake:

[snip]

 The workaround method I've been running out of cron for the last
 month or so is:
 
 1) Create a sentinel file under /tmp to use as a timestamp, just
 before running cvs update on ports (I update my ports tree from a
 local copy of the CVS repo maintained via csup)
 
 2) After cvs completes, look for any port directories containing
 updates (check timestamps against the sentinel file) and do a make
 readme for each one:
 
 find $PORTSDIR -type f ! -path */CVS/* -newercm $SENTINEL -depth 3
 |
 xargs dirname |
 sort -u | xargs -I@ /bin/sh -c cd @  make readme
 
 3) Last, but not least, build the category README.html for any
 categories with ports containing newly updated README.html files.
 
 I have noticed while doing this that, as you mentioned, the category
 READMEs take considerably longer than the individual ports'.
 
 I don't even bother to rebuild the top-level file, since it's
 basically unchanging anyway.

[snip] 

 Not too fancy, but this is what I used when I was updating the readmes
 so it wouldn't break.
 
 #!/bin/sh
 cd /usr/ports
 for i in `make -V SUBDIR | sed s/local//g`; do
     for p in `make -C $i -V SUBDIR`; do
         echo $i/$p ; sudo make -C $i/$p readme
     done
 done > ~/readmes.log
 
 -jgh

Interesting.  I'll take a look at using that.  Thanks!

-- 
Conrad J. Sabatier
conr...@cox.net


Re: A new and better way to do make readmes?

2012-01-28 Thread RW
On Fri, 27 Jan 2012 20:03:25 -0600
Conrad J. Sabatier wrote:

 I've been thinking for a long time that we need a better way to do
 make readmes, one that would be properly integrated into our
 ports Mk infrastructure, to take advantage of make's ability to
 recognize which files are up-to-date and which really do need
 rebuilding.

This won't help, and I think there's a better way that will make it up to
700 times faster.

When a 'make readmes' is done at the top level, the top-level and
category READMEs are created by make targets and the per-port READMEs
are created by a perl script in one go from the INDEX-n file.

I once timed this and the 64 category READMEs took 2 hours, but the
~20,000 port READMEs only took about 9 seconds.  Selective updating
isn't going to help because 99.9% of the time is spent in the
categories and it only takes a single port update to make a category
file obsolete.

I think the way to speed this up is to have the script generate the
category files too. There's no point in bringing in the top-level
README since that's already fast.

I've been toying with the idea of doing this, but have never got around
to it. If anyone wants to have a go I think it would be sensible to
write it in awk, since perl is no longer in the base system and the
existing perl script isn't really complex enough to be worth hanging on
to.
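
To sketch the shape of what I have in mind -- the HTML below is made up,
not what the real targets write, and it assumes the usual INDEX layout
with the port's path in field 2 and its one-line comment in field 4:

#!/bin/sh
# Sketch: write every category README.html in one pass over INDEX,
# without running make in any port directory.
PORTSDIR=${PORTSDIR:-/usr/ports}
INDEX=${INDEX:-$PORTSDIR/INDEX-9}       # or whichever INDEX-n you build

awk -F'|' -v portsdir="$PORTSDIR" '
{
    # $2 looks like /usr/ports/<category>/<port>
    n = split($2, part, "/")
    category = part[n-1]; port = part[n]
    body[category] = body[category] \
        sprintf("<li><a href=\"%s/README.html\">%s</a> -- %s</li>\n",
                port, port, $4)
}
END {
    for (category in body) {
        out = portsdir "/" category "/README.html"
        printf("<html><body><h1>%s</h1>\n<ul>\n%s</ul>\n</body></html>\n",
               category, body[category]) > out
        close(out)
    }
}' "$INDEX"

That keeps everything in sh and awk, so the perl dependency goes away.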


Re: A new and better way to do make readmes?

2012-01-28 Thread Conrad J. Sabatier
On Sat, 28 Jan 2012 14:37:34 +
RW rwmailli...@googlemail.com wrote:

 On Fri, 27 Jan 2012 20:03:25 -0600
 Conrad J. Sabatier wrote:
 
  I've been thinking for a long time that we need a better way to do
  make readmes, one that would be properly integrated into our
  ports Mk infrastructure, to take advantage of make's ability to
  recognize which files are up-to-date and which really do need
  rebuilding.
 
 This won't help, and I think there's a better way that will make it up
 to 700 times faster.
 
 When a 'make readmes' is done at the top level, the top-level and
 category READMEs are created by make targets and the per-port READMEs
 are created by a perl script in one go from the INDEX-n file.
 
 I once timed this and the 64 category READMEs took 2 hours, but the
 ~20,000 port READMEs only took about 9 seconds.

*rubbing eyes in disbelief*  Am I understanding you correctly?  Are you
saying you built 20,000+ port READMEs in only 9 seconds?!  How is that
possible?  Or do you mean 9 seconds for each one?

 Selective updating isn't going to help because 99.9% of the time is
 spent in the categories and it only takes a single port update to
 make a category file obsolete.

This is the part I find troubling.  It would seem that it should be
more work to create an individual port README, with its plucking the
appropriate line out of the INDEX-* file and then parsing it into its
respective pieces and filling in a template, than to simply string
together a list of references to a bunch of already built port READMEs
into a category README.

What am I not getting here?

 I think the way to speed this up is to have the script generate the
 category files too. There's no point in bringing in the top-level
 README since that's already fast.

So what's making the category READMEs so slow then?

 I've been toying with the idea of doing this, but have never got
 around to it. If anyone wants to have a go I think it would be
 sensible to write it in awk, since perl is no longer in the base
 system and the existing perl script isn't really complex enough to be
 worth hanging on to.

Oooo, awk!  Been a while since I wrote any sizeable bit of code in it,
but I do remember it was rather fun to work with.  :-)

I'm still not sure I read that paragraph above correctly, though (re:
the times).  :-)

-- 
Conrad J. Sabatier
conr...@cox.net


Re: A new and better way to do make readmes?

2012-01-28 Thread Matthew Seaman
On 28/01/2012 16:28, Conrad J. Sabatier wrote:
 *rubbing eyes in disbelief*  Am I understanding you correctly?  Are you
 saying you built 20,000+ port READMEs in only 9 seconds?!  How is that
 possible?  Or do you mean 9 seconds for each one?

9 seconds sounds quite reasonable for generating 23000 or so files.

  Selective updating isn't going to help because 99.9% of the time is
  spent in the categories and it only takes a single port update to
  make a category file obsolete.

 This is the part I find troubling.  It would seem that it should be
 more work to create an individual port README, with its plucking the
 appropriate line out of the INDEX-* file and then parsing it into its
 respective pieces and filling in a template, than to simply string
 together a list of references to a bunch of already built port READMEs
 into a category README.
 
 What am I not getting here?

No -- you're quite right.  You could generate the category README.html
files entirely from the data in the INDEX.  It's not quite as easy as
all that, because there aren't entries for each category separately, so
you'll have to parse the structure out of all of the paths in the INDEX.

  I think the way to speed this up is to have the script generate the
  category files too. There's no point in bringing in the top-level
  README since that's already fast.

 So what's making the category READMEs so slow then?

The big problem with performance in all this INDEX and README.html
building is that it takes a relatively long time to run make(1) within
any port or category directory.  make(1) has to read in a lot of
other files and stat(2) many more[*] -- all of which involves a lot of
random-access disk IO, and that's always going to take quite a lot of
time.  Now, doing 'make readme' in a category directory doesn't just run
make in that directory, but also in every port in that category.
Popular categories can contain many hundreds of ports.
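
If you want a feel for the numbers, something like this is illustrative
-- the port and the INDEX-n name are arbitrary examples, and the stat
count is only a rough one:

cd /usr/ports/editors/vim
time make -V PKGNAME                            # one full make(1) start-up
truss -f make -V PKGNAME 2>&1 | grep -c stat    # very rough stat(2)-family count
time awk 'END { print NR }' /usr/ports/INDEX-9  # vs. one sequential pass over INDEX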

Maybe I should add README.html generation to my FreeBSD::Portindex
stuff.  Should be pretty simple -- all the necessary bits are readily
available and it is just a matter of formatting it as HTML and printing
it out.

Cheers,

Matthew

[*] Running 'make -dA' with maximum debug output is quite enlightening,
as is running make under truss(1)

-- 
Dr Matthew J Seaman MA, D.Phil.
PGP: http://www.infracaninophile.co.uk/pgpkey
JID: matt...@infracaninophile.co.uk
7 Priory Courtyard, Flat 3, Ramsgate, Kent, CT11 9PW





Re: A new and better way to do make readmes?

2012-01-28 Thread Torfinn Ingolfsen
Hello,

On Sat, Jan 28, 2012 at 3:03 AM, Conrad J. Sabatier conr...@cox.net wrote:

 I've been thinking for a long time that we need a better way to do
 make readmes, one that would be properly integrated into our
 ports Mk infrastructure, to take advantage of make's ability to
 recognize which files are up-to-date and which really do need
 rebuilding.

 I like to make sure my README.html files are all up-to-date after my
 nightly ports tree update, but with the current scheme, that means
 either rebuilding *all* of the files in the tree, or (as I'm doing at
 present) using some sort of kludgey (kludgy?) workaround.


So people are actually using the readme files?
Are many people using them?
I ask because I *never* use them (unless they are used by 'make search'?),
I always use freshports.org (BTW, thanks for an excellent service!) when I
need to find out anything about a port.

-- 
Regards,
Torfinn Ingolfsen


Re: A new and better way to do make readmes?

2012-01-28 Thread Michel Talon
Matthew Seaman said
The big problem with performance in all this INDEX and README.html
building is that it takes quite a long time relatively to run make(1)
within any port or category directory.  make(1) has to read in a lot of
other files and stat(2) many more[*] -- all of which involves a lot of
random-access disk IO, and that's always going to take quite a lot of
time.  Now, doing 'make readme' in a category directory doesn't just run
make in that directory, but also in every port in that category.
Popular categories can contain many hundreds of ports.

Maybe I should add README.html generation to my FreeBSD::Portindex
stuff.  Should be pretty simple -- all the necessary bits are readily
available and it is just a matter of formatting it as HTML and printing
it out.

Indeed, the following python script
http://www.lpthe.jussieu.fr/~talon/show_index.py
parses the index in a few seconds and can display exactly the same
information as the readme.html on demand in a web browser, which is far
cleaner than polluting the ports tree with the readmes.  Alternatively, I
have an FCGI version that can be coupled to web servers supporting FCGI,
such as lighttpd:
http://www.lpthe.jussieu.fr/~talon/show_index.fcgi
This was already done five years ago ...


--

Michel Talon
ta...@lpthe.jussieu.fr




