Re: Looking for speed increases in make index and pkg_version for ports

2007-05-30 Thread Bakul Shah
Peter Jeremy [EMAIL PROTECTED] wrote:
> On 2007-May-27 16:12:54 -0700, Bakul Shah [EMAIL PROTECTED] wrote:
> > Given the size and complexity of the port system I have long
> > felt that rather than do everything via more and more complex
> > Mk/*.mk what is needed is a ports server and a thin CLI
> > frontend to it.
>
> I don't believe this is practical.  Both package names and
> port dependencies depend on the options that are selected as
> well as what other ports are already installed.  A centralised
> ports server is not going to have access to this information.

I didn't mean a centralized server at freebsd.org but one on your
own FreeBSD system, which can know what ports are installed.
Conditional dependencies have to be dealt with.  Perhaps the
underlying reason for changing package names can be handled
in a different way.

What happens now is that mostly static information from
various files is recomputed many times.  While that can be
handled by a local database, it seems to me that a daemon
provides a lot of benefits.

Come to think of it, even a centralized server can work as
there are a finite number of combinations and it can cache
the ones in use.  But all this is just an educated guess.
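
A rough sketch of the local-cache variant, to make the idea concrete.  It
assumes the answer for a port depends only on its Makefile and on the set
of installed packages (selected options are ignored here).  The function
names, the cache directory, and the use of /var/db/pkg as the package
database are illustrative assumptions, not an existing tool.

```shell
#!/bin/sh
# cache_key MAKEFILE PKGDB_DIR
# Prints a checksum over the Makefile contents plus the names of the
# installed packages, so the key changes when either of them changes.
cache_key() {
    { cat "$1"; ls "$2"; } | cksum | cut -d ' ' -f 1
}

# cached_lookup CACHE_DIR KEY COMMAND...
# Prints the cached result for KEY.  On a miss, COMMAND is run once and
# its output is stored (via a temporary file, so a failed run leaves no
# half-written entry behind).
cached_lookup() {
    dir=$1
    key=$2
    shift 2
    mkdir -p "$dir" || return 1
    if [ ! -f "$dir/$key" ]; then
        "$@" > "$dir/$key.tmp" && mv "$dir/$key.tmp" "$dir/$key" || return 1
    fi
    cat "$dir/$key"
}
```

From inside a port directory this might be used as
cached_lookup /var/tmp/pkgname-cache "$(cache_key Makefile /var/db/pkg)" make -V PKGNAME
so that make is only started when the Makefile or the installed set changed.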
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-30 Thread Stephen Montgomery-Smith



On Wed, 30 May 2007, Bakul Shah wrote:

> Peter Jeremy [EMAIL PROTECTED] wrote:
> > On 2007-May-27 16:12:54 -0700, Bakul Shah [EMAIL PROTECTED] wrote:
> > > Given the size and complexity of the port system I have long
> > > felt that rather than do everything via more and more complex
> > > Mk/*.mk what is needed is a ports server and a thin CLI
> > > frontend to it.
> >
> > I don't believe this is practical.  Both package names and
> > port dependencies depend on the options that are selected as
> > well as what other ports are already installed.  A centralised
> > ports server is not going to have access to this information.
>
> I didn't mean a centralized server at freebsd.org but one on your
> own FreeBSD system, which can know what ports are installed.
> Conditional dependencies have to be dealt with.  Perhaps the
> underlying reason for changing package names can be handled
> in a different way.
>
> What happens now is that mostly static information from
> various files is recomputed many times.  While that can be
> handled by a local database, it seems to me that a daemon
> provides a lot of benefits.
>
> Come to think of it, even a centralized server can work as
> there are a finite number of combinations and it can cache
> the ones in use.  But all this is just an educated guess.


Your idea really looks very fine to me.  From reading other emails on this 
thread, I get the impression that a lot of the underlying work has already 
been done in perhaps the portupgrade port, and so all you would have to do 
is to provide an interface from the make file to the database produced by 
portupgrade.  Perhaps this could be made conditional, so that those who 
don't install portupgrade wouldn't use it.  Even so, I also get the 
feeling that to implement this would be quite some work, so a volunteer 
needs to step forward.  But my gut reaction is that this is almost certain 
to make things like make clean and pkg_version way way faster.


And I must admit that I am having more doubts about my "calculate the
variables just in time" idea.  The thought of working really hard to make
it work, and then seeing mediocre speed increases, is rather off-putting to me.




Re: Looking for speed increases in make index and pkg_version for ports

2007-05-29 Thread Hartmut Brandt

Mike Meyer wrote:
> In [EMAIL PROTECTED], Hartmut Brandt [EMAIL PROTECTED] typed:
> > Mike Meyer wrote:
> > > In [EMAIL PROTECTED], Hartmut Brandt [EMAIL PROTECTED] typed:
> > > > 1. make and its sub-makes for a) reading the file; b) parsing the file
> > > > (note that .if and .for processing is done while parsing); c) processing
> > > > targets.
> > >
> > > Make and submakes have been gone over already. See URL:
> > > http://miller.emu.id.au/pmiller/books/rmch/ .
> > >
> > > I'm not sure it can be applied to the ports tree, though. I haven't
> > > looked into it, but recalled this paper when you mentioned measuring
> > > makes and sub-makes.
> >
> > Unfortunately you deleted the sentence before, so I rephrase it: before
> > looking into optimizations find out where the time is actually spent -
> > how many seconds of the hours the process takes are actually spent in
> > make and sub-makes. If the entire process takes 2 hours of which the
> > makes take 20 seconds then by enhancing performance of make by 50% you
> > win 10 seconds. This is probably not worth a single line of additional code.
> >
> > The paper you point to talks about something entirely different.
>
> I think we're talking about two different things. You're talking
> about the efficiency of make, whereas he's talking about the
> efficiency of make. Um, wait.
>
> You're talking about what I'll call the *internal* efficiency of make,
> defined as how fast it does the things it does. He's talking about
> what I'll call the *external* efficiency of make, which is how well it
> does at doing the minimum amount of work it needs to do. I hope you
> can see where the confusion comes from.


Yeah, it comes from you deleting the other two of my points in your
response, where I talked about shells and external commands executed by
make. You cited the point where I asked for numbers on *internal*
efficiency and then pointed to a paper about *external* efficiency.


I've seen no numbers showing WHAT actually makes the ports stuff so slow.
To make my point a last time: until there are numbers, there is no point
in guessing what to do.
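
The arithmetic behind the "2 hours, of which the makes take 20 seconds"
example quoted above can be written down as a small helper.  This is only
an illustration of the reasoning: the function names are made up, and the
figures are the hypothetical ones from that example, not measurements.

```shell
#!/bin/sh
# saved_seconds PART_SECONDS SPEEDUP_PERCENT
# Seconds saved when one part of a job, taking PART_SECONDS, is made
# SPEEDUP_PERCENT percent faster (integer arithmetic).
saved_seconds() {
    echo $(( $1 * $2 / 100 ))
}

# saved_percent TOTAL_SECONDS PART_SECONDS SPEEDUP_PERCENT
# The same saving expressed as a percentage of the whole run.
saved_percent() {
    awk -v t="$1" -v p="$2" -v s="$3" \
        'BEGIN { printf "%.2f\n", 100 * (p * s / 100) / t }'
}

saved_seconds 20 50          # prints 10
saved_percent 7200 20 50     # prints 0.14
```

So halving the time spent in make would shorten that two-hour run by about
0.14 percent, which is why the numbers have to come first.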


harti


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-29 Thread Peter Jeremy
On 2007-May-27 16:12:54 -0700, Bakul Shah [EMAIL PROTECTED] wrote:
> Given the size and complexity of the port system I have long
> felt that rather than do everything via more and more complex
> Mk/*.mk what is needed is a ports server and a thin CLI
> frontend to it.

I don't believe this is practical.  Both package names and
port dependencies depend on the options that are selected as
well as what other ports are already installed.  A centralised
ports server is not going to have access to this information.

-- 
Peter Jeremy




Re: Looking for speed increases in make index and pkg_version for ports

2007-05-29 Thread Peter Jeremy
On 2007-May-27 15:52:16 -0500, Stephen Montgomery-Smith [EMAIL PROTECTED] wrote:
> I have been thinking a lot about looking for speed increases for make
> index and pkg_version and things like that.  So for example, in
> pkg_version, it calls make -V PKGNAME for every installed package. Now
> make -V PKGNAME should be a speedy operation, but the make has to load in
> and analyze bsd.port.mk, a quite complicated file with about 200,000
> characters in it, when all it is needing to do is to figure out the value of
> the variable PKGNAME.

This would be trivial if some packages didn't change names depending
on options and what was installed.  I agree that parsing a 210KB file
17,000 times is not going to be fast.  Especially since some ports
include bsd.port.mk multiple times...

> I suggest rewriting make so that variables are only evaluated on a need
> to know basis.

This sounds like a good idea but I suspect it's not going to be
feasible.  The biggest problem I see is that the make language allows
variables to be expanded either when they are assigned or when they
are referenced.  If a variable expansion is delayed from the
assignment to the first use then the expansion must be performed using
the state of make as it was when the variable was assigned.  The cost
of keeping this state probably exceeds the cost of actually evaluating
the variable.

> So, for example, if all we need to know is PKGNAME, there
> is no need to evaluate, for example, _RUN_LIB_DEPENDS, unless the writer of
> that particular port has done something like having PORTNAME depend on the
> value of _RUN_LIB_DEPENDS.

Rather than trying to develop a tool that can quickly expand PKGNAME
irrespective of what convoluted code the author has used, how about a
partial solution?  For most ports, PKGNAME depends solely on 3 or 4
variables that are statically defined in the port Makefile.  The
obvious solution would seem to be to develop a script that can handle
the easy cases itself and punt the difficult cases back to make.  The
definition of 'easy' can be adjusted over time to increase performance.
This approach would seem to have a relatively low bar to entry whilst
offering good effort/performance tradeoff at the low end.
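
A minimal sketch of such a script, under stated assumptions: it treats a
port as "easy" only when PORTNAME and PORTVERSION are plain assignments,
the Makefile has no .if/.for directives, and no PKGNAMEPREFIX or
PKGNAMESUFFIX is set.  The way the name is composed (an "_PORTREVISION"
and ",PORTEPOCH" suffix when those are non-zero) is my understanding of
bsd.port.mk and should be checked against it.  The function names are
made up for illustration.

```shell
#!/bin/sh
# easy_pkgname MAKEFILE
# Prints the package name for an "easy" port Makefile, or fails (exit
# status 1, no output) so the caller can punt to make.
easy_pkgname() {
    awk '
        # Conditionals, loops and name prefixes/suffixes are "difficult".
        /^\.[ \t]*(if|for)/ { hard = 1 }
        /^PKGNAME(PREFIX|SUFFIX)/ { hard = 1 }
        /^(PORTNAME|PORTVERSION|PORTREVISION|PORTEPOCH)[ \t]*[?+:!]?=/ {
            eq = index($0, "=")
            name = substr($0, 1, eq - 1)
            value = substr($0, eq + 1)
            gsub(/[ \t]/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            # Only plain "NAME= literal" assignments are handled here.
            if (name !~ /^[A-Z]+$/ || value == "" || index(value, "$") ||
                index(value, "#") || index(value, "\\"))
                hard = 1
            v[name] = value
        }
        END {
            if (hard || !("PORTNAME" in v) || !("PORTVERSION" in v))
                exit 1
            out = v["PORTNAME"] "-" v["PORTVERSION"]
            if (("PORTREVISION" in v) && v["PORTREVISION"] != "0")
                out = out "_" v["PORTREVISION"]
            if (("PORTEPOCH" in v) && v["PORTEPOCH"] != "0")
                out = out "," v["PORTEPOCH"]
            print out
        }
    ' "$1"
}

# pkgname PORTDIR
# Handles the easy case directly and punts everything else back to make.
pkgname() {
    easy_pkgname "$1/Makefile" || (cd "$1" && make -V PKGNAME)
}
```

Ports that use DISTVERSION, conditionals, or variable references simply
fall through to make, so widening the definition of "easy" later only
means relaxing the checks in the awk program.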

The various depends lists would seem amenable to the same approach -
though the entry level tool will have far lower coverage due to the
extensive use of USE_GNOME=... and similar 'macro'-style constructs.

-- 
Peter Jeremy




Re: Looking for speed increases in make index and pkg_version for ports

2007-05-29 Thread Peter Jeremy
On 2007-May-27 15:30:48 -0700, Jeremy Chadwick [EMAIL PROTECTED] wrote:
> This sounds like a good solution.  In fact, I'm led to believe that
> heavy reliance on /bin/sh is part of why the ports collection is slow.

Someone needs to enable accounting on a recent -current (with the
high-resolution accounting records) and look at where the time is
actually going.  (My -current box needs upgrading before I could
do this).

That said, /bin/sh is dynamically linked and a fork/exec is not cheap.
Some quick-and-not-necessarily-reliable tests on 6.2-STABLE/amd64 show
that /bin/sh takes about 2.5 times as long to start as /rescue/sh
(though it's only 2:1 on i386).  (These are different boxes so the
absolute times aren't comparable).

amd64% time sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); echo foo; done' /dev/null
sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); echo foo; done'  0.20s user 0.08s system 98% cpu 0.283 total
amd64% time sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); echo foo; done' /dev/null
sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); echo foo; done'  0.22s user 0.06s system 97% cpu 0.287 total
amd64% time sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); echo foo; done' /dev/null
sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); echo foo; done'  0.19s user 0.10s system 98% cpu 0.288 total
amd64% time sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); /rescue/sh -c echo foo; done' /dev/null
sh -c   /dev/null  0.84s user 6.12s system 97% cpu 7.162 total
amd64% time sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); /rescue/sh -c echo foo; done' /dev/null
sh -c   /dev/null  1.12s user 6.05s system 97% cpu 7.366 total
amd64% time sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); /bin/sh -c echo foo; done' /dev/null
sh -c   /dev/null  5.72s user 13.40s system 96% cpu 19.734 total
amd64% time sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); /bin/sh -c echo foo; done' /dev/null
sh -c   /dev/null  5.97s user 12.89s system 97% cpu 19.407 total
amd64%

i386% time sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); echo foo; done' /dev/null
sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); echo foo; done'  0.17s user 0.03s system 95% cpu 0.208 total
i386% time sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); echo foo; done' /dev/null
sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); echo foo; done'  0.17s user 0.03s system 99% cpu 0.199 total
i386% time sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); echo foo; done' /dev/null
sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); echo foo; done'  0.16s user 0.04s system 99% cpu 0.200 total
i386% time sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); /rescue/sh -c echo foo; done' /dev/null
sh -c   /dev/null  3.68s user 18.19s system 98% cpu 22.212 total
i386% time sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); /rescue/sh -c echo foo; done' /dev/null
sh -c   /dev/null  3.34s user 18.54s system 98% cpu 22.110 total
i386% time sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); /bin/sh -c echo foo; done' /dev/null
sh -c   /dev/null  12.03s user 29.42s system 98% cpu 41.965 total
i386% time sh -c 'i=0; while [ $i -lt 1 ]; do i=$(($i+1)); /bin/sh -c echo foo; done' /dev/null
sh -c   /dev/null  12.20s user 29.25s system 98% cpu 41.975 total
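
For anyone wanting to repeat the comparison, the test above can be wrapped
in a small function.  This is a sketch rather than the exact commands used
for the transcript; the function name and the iteration count in the
usage line are arbitrary choices.

```shell
#!/bin/sh
# spawn_bench SHELL COUNT
# Starts SHELL COUNT times, each time running a trivial command, and
# prints how many of those starts succeeded.  Almost all of the elapsed
# time is fork/exec and dynamic-linking overhead, which is the quantity
# being compared.
spawn_bench() {
    i=0
    ok=0
    while [ "$i" -lt "$2" ]; do
        i=$((i + 1))
        "$1" -c 'echo foo' > /dev/null && ok=$((ok + 1))
    done
    echo "$ok"
}
```

Run under a shell whose time keyword can time functions (zsh or bash),
for example "time spawn_bench /rescue/sh 1000" followed by
"time spawn_bench /bin/sh 1000" on the same machine.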

-- 
Peter Jeremy




Re: Looking for speed increases in make index and pkg_version for ports

2007-05-29 Thread soralx

Ivan Voras [EMAIL PROTECTED] wrote:
> > Because the information is not a constant.  For example, the mpg123
> > port changes its PKGNAME as soon as esound is installed.
>
> Maybe the time has come to give up on some of the flexibility the
> ports tree has (and this particular one is confusing to the users)
> [...]

Maybe more confusing to portupgrade -- one day I'll get tired of
running 'pkgdb -F' almost every time before portupgrade to fix old
dependencies (packages with old config or things like ru-xmms instead of
xmms). This is portupgrade's problem really, but doing away with such
flexibility would make life slightly easier...


[SorAlx]  ridin' VN1500-B2


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-29 Thread Joan Picanyol i Puig
* Hartmut Brandt [EMAIL PROTECTED] [20070529 08:21]:
> I've seen no numbers showing WHAT actually makes the ports stuff so slow.
> To make my point a last time: until there are numbers, there is no point
> in guessing what to do.

It just occurred to me that DTrace could be a big help with this task. I
suggest someone convinces jb@ to take a look.

qvb
--
pica


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-29 Thread Jeremy Chadwick
On Tue, May 29, 2007 at 08:34:29PM +1000, Peter Jeremy wrote:
> On 2007-May-27 15:30:48 -0700, Jeremy Chadwick [EMAIL PROTECTED] wrote:
> > This sounds like a good solution.  In fact, I'm led to believe that
> > heavy reliance on /bin/sh is part of why the ports collection is slow.
>
> Someone needs to enable accounting on a recent -current (with the
> high-resolution accounting records) and look at where the time is
> actually going.  (My -current box needs upgrading before I could
> do this).

The best I was able to do: I have a 6.2-RELEASE box which contains
profiled libraries, and managed to make a profiled /bin/sh.  This
didn't take much work (just some modification of src/bin/sh/Makefile),
but one thing which did stump me was the .gmon file never getting written.
Turns out trap.c calls _exit(2) not exit(3).  Change that and voila.

Admittedly I'm not that familiar with gprof, and I'm also left wondering
if profiling /bin/sh is going to help us, since we don't have a direct
way of determining which shell commands take the most time -- just which
C functions are most heavily used.

-- 
| Jeremy Chadwick                                  jdc at parodius.com |
| Parodius Networking                         http://www.parodius.com/ |
| UNIX Systems Administrator                    Mountain View, CA, USA |
| Making life hard for others since 1977.                PGP: 4BD6C0CB |



Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Matthew Seaman

Stephen Montgomery-Smith wrote:
> I have been thinking a lot about looking for speed increases for make
> index and pkg_version and things like that.  So for example, in
> pkg_version, it calls make -V PKGNAME for every installed package. Now
> make -V PKGNAME should be a speedy operation, but the make has to load
> in and analyze bsd.port.mk, a quite complicated file with about 200,000
> characters in it, when all it is needing to do is to figure out the
> value of the variable PKGNAME.

pkg_version is one thing -- but to build the INDEX you need to extract
at least the values of the following variables:

  PKGNAME
  .CURDIR
  PREFIX
  COMMENT
  DESCR
  MAINTAINER
  CATEGORIES
  EXTRACT_DEPENDS
  PATCH_DEPENDS
  FETCH_DEPENDS
  BUILD_DEPENDS
  RUN_DEPENDS
  LIB_DEPENDS

Plus you need to grep in the referenced pkg-descr file for any WWW
links.  I also extract the values of:

  MASTER_PORT
  .MAKEFILE_LIST
  SUBDIR

for my FreeBSD::Portindex stuff.

Trouble is, by the time you've extracted all that lot, you have pretty
much done the same level of variable processing as you would were you
actually going to build the port.

One thing that would speed up this process would be a make option
to just do parsing of the Makefile and variable expansion, without
calling stat(2) on all the various sources and dependencies involved.

For instance:

happy-idiot-talk:...ports/databases/mysql-connector-java:% truss make -V PKGNAME | grep stat | wc -l
  49

It is quite instructive to see what files make(1) touches while doing
that.  At least half of them are irrelevant if all make(1) is going to
do is print out the values of some variables.  Multiply that by 17,000
and it adds up to a big waste of effort.

Cheers,

Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.   7 Priory Courtyard
  Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey Ramsgate
  Kent, CT11 9PW


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Hartmut Brandt

Matthew Seaman wrote:
> Stephen Montgomery-Smith wrote:
> > I have been thinking a lot about looking for speed increases for make
> > index and pkg_version and things like that.  So for example, in
> > pkg_version, it calls make -V PKGNAME for every installed package. Now
> > make -V PKGNAME should be a speedy operation, but the make has to load
> > in and analyze bsd.port.mk, a quite complicated file with about 200,000
> > characters in it, when all it is needing to do is to figure out the
> > value of the variable PKGNAME.
>
> pkg_version is one thing -- but to build the INDEX you need to extract
> at least the values of the following variables:
>
>   PKGNAME
>   .CURDIR
>   PREFIX
>   COMMENT
>   DESCR
>   MAINTAINER
>   CATEGORIES
>   EXTRACT_DEPENDS
>   PATCH_DEPENDS
>   FETCH_DEPENDS
>   BUILD_DEPENDS
>   RUN_DEPENDS
>   LIB_DEPENDS
>
> Plus you need to grep in the referenced pkg-descr file for any WWW
> links.  I also extract the values of:
>
>   MASTER_PORT
>   .MAKEFILE_LIST
>   SUBDIR
>
> for my FreeBSD::Portindex stuff.
>
> Trouble is, by the time you've extracted all that lot, you have pretty
> much done the same level of variable processing as you would were you
> actually going to build the port.
>
> One thing that would speed up this process would be a make option
> to just do parsing of the Makefile and variable expansion, without
> calling stat(2) on all the various sources and dependencies involved.
>
> For instance:
>
> happy-idiot-talk:...ports/databases/mysql-connector-java:% truss make -V PKGNAME | grep stat | wc -l
>   49
>
> It is quite instructive to see what files make(1) touches while doing
> that.  At least half of them are irrelevant if all make(1) is going to
> do is print out the values of some variables.  Multiply that by 17,000
> and it adds up to a big waste of effort.


Are you sure? A good deal of the stat() calls are make poking around for
the make infrastructure. These should be in the cache. And there are a
couple of stats for the *done* files that might be avoided by doing
something in the ports infrastructure. But as I already said in my previous
mail: numbers, please, no guessing.


harti



Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Hartmut Brandt

Stephen Montgomery-Smith wrote:
> I have been thinking a lot about looking for speed increases for make
> index and pkg_version and things like that.  So for example, in
> pkg_version, it calls make -V PKGNAME for every installed package. Now
> make -V PKGNAME should be a speedy operation, but the make has to load
> in and analyze bsd.port.mk, a quite complicated file with about 200,000
> characters in it, when all it is needing to do is to figure out the
> value of the variable PKGNAME.
>
> I suggest rewriting make so that variables are only evaluated on a
> need to know basis.  So, for example, if all we need to know is
> PKGNAME, there is no need to evaluate, for example, _RUN_LIB_DEPENDS,
> unless the writer of that particular port has done something like having
> PORTNAME depend on the value of _RUN_LIB_DEPENDS.  So make should
> analyze all the code it is given, and only figure it out if it is needed
> to do so.  This would include, for example, figuring out .for and .if
> directives on a need to know basis as well.
>
> I have only poked around a little inside the source for make, but I have
> a sense that this would be a major undertaking.  I certainly have not
> thought through what it entails in more than a cursory manner.  However
> I am quite excited about the possibility of doing this, albeit I may
> well put off the whole thing for a year or two or even forever depending
> upon other priorities in my life.
>
> However, in the mean time I want to throw this idea out there to get
> some feedback, either of the form "this won't work", or of the form
> "I will do it", or "I have tried to do this".


Having done a great deal of rewriting of make some two years ago I can
tell you that even a small change to make is a tough job testing-wise:
run all the combinations of !-j and -j N on all architectures and run
the change through the port-building cluster. That's a warning to start
with.

Second I would start with careful profiling to find out where the
problem actually is. You might be surprised. As an example: several
times the idea came up to use a hash structure instead of linear lists
for make variables. I got a patch for this and - it makes absolutely no
difference performance-wise (well, there was some indication that
performance gets worse, but that was around or below noise level). By
"careful" I mean finding out what takes the time:

1. make and its sub-makes for a) reading the file; b) parsing the file
(note that .if and .for processing is done while parsing); c) processing
targets.

2. sub-shells executed for running the targets' commands (note that make
optimizes the subshells away when there are no special shell symbols in
the command line)

3. executed programs (find, sort, ...)

Until you have numbers for this everything is rather moot. It might be a
good idea to put some performance measurement hooks into make to do
this.

If anybody wants to work on make, I would rather recommend implementing
%-rules :-) And if anybody wants to recommend gmake over make(1) - look
into the code and see what a mess that is :-/

Regards,
harti




Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Stephen Montgomery-Smith

Hartmut Brandt wrote:
> Having done a great deal of rewriting of make some two years ago I can
> tell you that even a small change to make is a tough job testing-wise:
> run all the combinations of !-j and -j N on all architectures and run
> the change through the port-building cluster. That's a warning to start
> with.
>
> Second I would start with careful profiling to find out where the
> problem actually is. You might be surprised. As an example: several
> times the idea came up to use a hash structure instead of linear lists
> for make variables. I got a patch for this and - it makes absolutely no
> difference performance-wise (well, there was some indication that
> performance gets worse, but that was around or below noise level). By
> "careful" I mean finding out what takes the time:


Yes, I must admit that I thought that a hash structure for the variables 
would greatly increase the speed of make.  I rewrote it using Berkeley 
databases, and like you said - absolutely no difference!!  I even tried 
btrees.




Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Hartmut Brandt

Stephen Montgomery-Smith wrote:
> Hartmut Brandt wrote:
> > Having done a great deal of rewriting of make some two years ago I can
> > tell you that even a small change to make is a tough job testing-wise:
> > run all the combinations of !-j and -j N on all architectures and run
> > the change through the port-building cluster. That's a warning to start
> > with.
> >
> > Second I would start with careful profiling to find out where the
> > problem actually is. You might be surprised. As an example: several
> > times the idea came up to use a hash structure instead of linear lists
> > for make variables. I got a patch for this and - it makes absolutely no
> > difference performance-wise (well, there was some indication that
> > performance gets worse, but that was around or below noise level). By
> > "careful" I mean finding out what takes the time:
>
> Yes, I must admit that I thought that a hash structure for the variables
> would greatly increase the speed of make.  I rewrote it using Berkeley
> databases, and like you said - absolutely no difference!!  I even tried
> btrees.





My guess at that time was that because there are actually many variable
tables (one per target and the global one) and only a small number of
variables in most of the tables, the initialisation overhead outweighs
what you win through the hashing.


As for the profiling - I did some profiling on buildworld then. Of the
several hours a buildworld took, only one or two minutes were used by all
the makes. At this point I stopped optimizing make :-) (I don't remember
the exact numbers - that was two or three years ago).


harti


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Ivan Voras
Stephen Montgomery-Smith wrote:
> I have been thinking a lot about looking for speed increases for make
> index and pkg_version and things like that.  So for example, in
> pkg_version, it calls make -V PKGNAME for every installed package. Now
> make -V PKGNAME should be a speedy operation, but the make has to load
> in and analyze bsd.port.mk, a quite complicated file with about 200,000
> characters in it, when all it is needing to do is to figure out the
> value of the variable PKGNAME.

As long as far-out ideas are being discussed, how about caching such
information (including dependencies) in a file (I'd call it a database
but then I'd have to start a holy war :) ) so it's calculated only once,
preferably on the portsnap / cvsup servers and not at the end-user?





Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Stephen Montgomery-Smith

Ivan Voras wrote:
> Stephen Montgomery-Smith wrote:
> > I have been thinking a lot about looking for speed increases for make
> > index and pkg_version and things like that.  So for example, in
> > pkg_version, it calls make -V PKGNAME for every installed package. Now
> > make -V PKGNAME should be a speedy operation, but the make has to load
> > in and analyze bsd.port.mk, a quite complicated file with about 200,000
> > characters in it, when all it is needing to do is to figure out the
> > value of the variable PKGNAME.
>
> As long as far-out ideas are being discussed, how about caching such
> information (including dependencies) in a file (I'd call it a database
> but then I'd have to start a holy war :) ) so it's calculated only once,
> preferably on the portsnap / cvsup servers and not at the end-user?


Because the information is not a constant.  For example, the mpg123 port 
changes its PKGNAME as soon as esound is installed.





Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Stephen Montgomery-Smith

Jeremy Chadwick wrote:
> On Sun, May 27, 2007 at 03:52:16PM -0500, Stephen Montgomery-Smith wrote:
> > I have been thinking a lot about looking for speed increases for make
> > index and pkg_version and things like that.  So for example, in
> > pkg_version, it calls make -V PKGNAME for every installed package. Now
> > make -V PKGNAME should be a speedy operation, but the make has to load in
> > and analyze bsd.port.mk, a quite complicated file with about 200,000
> > characters in it, when all it is needing to do is to figure out the value of
> > the variable PKGNAME.
>
> I have a related question, pertaining to make all-depends-list and the
> utter atrocity that is the make variable ALL-DEPENDS-LIST.  If you don't
> know what it is, look for ^ALL-DEPENDS-LIST around line 5175, in
> bsd.port.mk.


I posted this to [EMAIL PROTECTED], but now I am realizing that it is 
[EMAIL PROTECTED] that gets more responses.  Anyway, here is a 
multithreaded program all-depends-list that can get you double the 
speed on dual processor systems, and even some small speed gains on 
single processor systems.  E.g.


all-depends-list /usr/ports/x11/xorg

http://www.math.missouri.edu/~stephen/all-depends-list.c



Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread David Naylor
On Monday 28 May 2007 03:43, you wrote:
> Maybe I should look at the inner workings of cmake and gmake.  Maybe
> they have some good ideas.  However having looked through the source
> code of make, and also looking at the cvs logs, it does seem to be well
> written.  The only possibility I see of making it go a lot faster is a
> complete redesign, e.g. my just in time idea for processing variables.
>
> Stephen

Just in time (JIT), if I remember correctly, is a term used for Java
interpreters that compile the byte code into machine code.  Perhaps this
could be developed for makefiles, especially bsd.*.mk.

This, I think, could be done in two ways:
1) Develop the bsd.*.mk files in C and link them in with make, or
2) Use the makefiles as source to compile into machine code (possibly via
C-ASM).  The machine code could be created on demand, or cached and only
updated if the source makefile changes.

I am not sure if this could work or even if it will have any significant speed
increase.  However if method 2 does work it has the potential to radically
increase the speed of ports _while_ maintaining the flexibility.

All that will be needed is an API for the machine code and a compiler???

David




Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Matthew Seaman
Ivan Voras wrote:
 Stephen Montgomery-Smith wrote:
 I have been thinking a lot about looking for speed increases for make
 index and pkg_version and things like that.  So for example, in
 pkg_version, it calls make -V PKGNAME for every installed package. Now
 make -V PKGNAME should be a speedy operation, but the make has to load
 in and analyze bsd.port.mk, a quite complicated file with about 200,000
 characters in it, when all it is needing to do is to figure out the
 value of the variable PKGNAME.
 
 As long as far-out ideas are being discussed, how about caching such
 information (including dependencies) in a file (I'd call it a database
 but then I'd have to start a holy war :) ) so it's calculated only once,
 preferably on the portsnap / cvsup servers and not at the end-user?

Good idea.

   http://www.infracaninophile.co.uk/portindex/

Been done before though.  

Cheers,

Matthew





-- 
Dr Matthew J Seaman MA, D.Phil.   7 Priory Courtyard
  Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey Ramsgate
  Kent, CT11 9PW


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Stephen Montgomery-Smith



On Mon, 28 May 2007, David Naylor wrote:


On Monday 28 May 2007 03:43, you wrote:

Maybe I should look at the inner workings of cmake and gmake.  Maybe
they have some good ideas.  However having looked through the source
code of make, and also looking at the cvs logs, it does seem to be well
written.  The only possibility I see of making it go a lot faster is a
complete redesign, e.g. my just in time idea for processing variables.

Stephen


Just in time (JIT), if I remember correctly, is a term used by Java
interpreters which compile the byte code into machine code!  Perhaps this
could be developed for makefiles, especially bsd.*.mk.

This, I think, could be done in two ways:
1) Develop the bsd.*.mk files in C and link them in with make, or
2) Use the makefiles as source to compile into machine code (possibly via
C-ASM).  The machine code could be created on demand, or cached and only
updated if the source makefile changes.

I am not sure if this could work or even if it would bring any significant speed
increase.  However, if method 2 does work it has the potential to radically
increase the speed of ports _while_ maintaining the flexibility.

All that would be needed is an API for the machine code and a compiler?



My gut reaction is the same as yours - I doubt that this would bring any 
speed increases.  And the programming effort would be huge.





Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Mike Meyer
In [EMAIL PROTECTED], Hartmut Brandt [EMAIL PROTECTED] typed:
 1. make and its sub-makes for a) reading the file; b) parsing the file
 (note that .if and .for processing is done while parsing); c) processing
 targets.

Make and submakes have been gone over already. See URL:
http://miller.emu.id.au/pmiller/books/rmch/ .

I'm not sure it can be applied to the ports tree, though. I haven't
looked into it, but recalled this paper when you mentioned measuring
makes and sub-makes.

mike
-- 
Mike Meyer [EMAIL PROTECTED]  http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Garrett Cooper

Mike Meyer wrote:

In [EMAIL PROTECTED], Hartmut Brandt [EMAIL PROTECTED] typed:

1. make and its sub-makes for a) reading the file; b) parsing the file
(note that .if and .for processing is done while parsing); c) processing
targets.


Make and submakes have been gone over already. See URL:
http://miller.emu.id.au/pmiller/books/rmch/ .

I'm not sure it can be applied to the ports tree, though. I haven't
looked into it, but recalled this paper when you mentioned measuring
makes and sub-makes.

mike


Reducing the number of variables will certainly cut down on the amount 
of overhead in the make/submake context switches by a long shot.


Maybe someone should consider running a 'pre-make' using the .mk files, 
find the variables of interest for all particular sub-ports, and then 
carry on the 'root make', i.e. make process in each port directory, with 
just the variables of interest.


-Garrett


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Garrett Cooper

Garrett Cooper wrote:

Mike Meyer wrote:

In [EMAIL PROTECTED], Hartmut Brandt [EMAIL PROTECTED] typed:

1. make and its sub-makes for a) reading the file; b) parsing the file
(note that .if and .for processing is done while parsing); c) processing
targets.


Make and submakes have been gone over already. See URL:
http://miller.emu.id.au/pmiller/books/rmch/ .

I'm not sure it can be applied to the ports tree, though. I haven't
looked into it, but recalled this paper when you mentioned measuring
makes and sub-makes.

mike


Reducing the number of variables will certainly cut down on the amount 
of overhead in the make/submake context switches by a long shot.


Maybe someone should consider running a 'pre-make' using the .mk files, 
find the variables of interest for all particular sub-ports, and then 
carry on the 'root make', i.e. make process in each port directory, with 
just the variables of interest.


-Garrett


s/long shot/possibly a lot/g

Also, I was thinking in particular of the X.Org 7.2 packages, because 
the vast majority of them are small and compile in a short 
amount of time.


-Garrett


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Ivan Voras
Stephen Montgomery-Smith wrote:
 Ivan Voras wrote:

 As long as far-out ideas are being discussed, how about caching such
 information (including dependencies) in a file (I'd call it a database
 but then I'd have to start a holy war :) ) so it's calculated only once,
 preferably on the portsnap / cvsup servers and not at the end-user?
 
 Because the information is not a constant.  For example, the mpg123 port
 changes its PKGNAME as soon as esound is installed.

Maybe the time has come to give up on some of the flexibility the ports
tree has (and this particular one is confusing to the users) in favour
of gaining speed?
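
One possible middle ground between caching and the objection that PKGNAME is
not constant is to make the cache key include the state the value depends on,
so that a changed system simply misses the cache.  The sketch below is only an
illustration in Python: PkgnameCache, fake_pkgname, and the package names and
version string are made up, and it does not address the hard part, which is
knowing which installed ports matter to a given port.

```python
class PkgnameCache:
    """Cache PKGNAME per (port origin, relevant installed ports)."""

    def __init__(self, compute):
        self._compute = compute   # stands in for running make -V PKGNAME
        self._store = {}
        self.misses = 0           # how many times we had to recompute

    def get(self, origin, installed):
        # The installed set is part of the key, so installing esound
        # produces a different key instead of a stale answer.
        key = (origin, frozenset(installed))
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._compute(origin, installed)
        return self._store[key]


def fake_pkgname(origin, installed):
    # Hypothetical rule modelled on the mpg123/esound example in the thread;
    # the suffix and version are invented for illustration.
    suffix = "-esound" if "esound" in installed else ""
    return "mpg123" + suffix + "-0.0"


cache = PkgnameCache(fake_pkgname)
print(cache.get("audio/mpg123", []))          # computed
print(cache.get("audio/mpg123", []))          # served from the cache
print(cache.get("audio/mpg123", ["esound"]))  # new key, computed again
print(cache.misses)
```

Keying on every installed package would invalidate far too often in practice,
so a real design would need a narrower notion of "relevant" state.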





Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Hartmut Brandt

Mike Meyer wrote:

In [EMAIL PROTECTED], Hartmut Brandt [EMAIL PROTECTED] typed:

1. make and its sub-makes for a) reading the file; b) parsing the file
(note that .if and .for processing is done while parsing); c) processing
targets.


Make and submakes have been gone over already. See URL:
http://miller.emu.id.au/pmiller/books/rmch/ .

I'm not sure it can be applied to the ports tree, though. I haven't
looked into it, but recalled this paper when you mentioned measuring
makes and sub-makes.


Unfortunately you deleted the sentence before that, so I will rephrase it: before 
looking into optimizations, find out where the time is actually spent - 
how many seconds, of the hours the process takes, are actually spent in 
make and sub-makes. If the entire process takes 2 hours, of which the 
makes take 20 seconds, then by enhancing the performance of make by 50% you 
win 10 seconds. This is probably not worth a single line of additional code.
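
The arithmetic in that example can be written out as a small calculation.
This is only a sketch in Python; the function name seconds_saved is made up,
and "enhancing performance by 50%" is read here as "the make part takes half
as long", which is the reading that gives the 10 seconds quoted above.

```python
def seconds_saved(part_seconds, fraction_faster):
    """Wall-clock seconds saved when one part of a run gets faster."""
    return part_seconds * fraction_faster


total = 2 * 60 * 60   # the whole process: 2 hours, in seconds
make_part = 20        # time spent inside make and sub-makes

saved = seconds_saved(make_part, 0.50)
share = saved / total  # saving as a fraction of the whole run

print(saved)
print(round(share * 100, 2))  # percent of the total run
```

The point of measuring first is that the saving is bounded by the size of the
part being optimized, however clever the optimization is.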


The paper you point to talks about something entirely different.

harti


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Roman Divacky
On Mon, May 28, 2007 at 11:34:24AM -0500, Stephen Montgomery-Smith wrote:
 Jeremy Chadwick wrote:
 On Sun, May 27, 2007 at 03:52:16PM -0500, Stephen Montgomery-Smith wrote:
  I have been thinking a lot about looking for speed increases for make 
  index and pkg_version and things like that.  So for example, in 
  pkg_version, it calls make -V PKGNAME for every installed package. Now 
  make -V PKGNAME should be a speedy operation, but the make has to load 
  in and analyze bsd.port.mk, a quite complicated file with about 200,000 
  characters in it, when all it is needing to do is to figure out the 
  value of the variable PKGNAME.
 
 I have a related question, pertaining to make all-depends-list and the
 utter atrocity that is the make variable ALL-DEPENDS-LIST.  If you don't
 know what it is, look for ^ALL-DEPENDS-LIST around line 5175, in
 bsd.port.mk.
 
 I posted this to [EMAIL PROTECTED], but now I am realizing that it is 
 [EMAIL PROTECTED] that gets more responses.  Anyway, here is a 
 multithreaded program all-depends-list that can get you double the 
 speed on dual processor systems, and even some small speed gains on 
 single processor systems.  E.g.
 
 all-depends-list /usr/ports/x11/xorg
 
 http://www.math.missouri.edu/~stephen/all-depends-list.c

btw.. Stephen, when are you getting a commit bit? :) I certainly hope it's soon 
enough ;)

roman


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Mike Meyer
In [EMAIL PROTECTED], Hartmut Brandt [EMAIL PROTECTED] typed:
 Mike Meyer wrote:
  In [EMAIL PROTECTED], Hartmut Brandt [EMAIL PROTECTED] typed:
  1. make and its sub-makes for a) reading the file; b) parsing the file
  (note that .if and .for processing is done while parsing); c) processing
  targets.
  
  Make and submakes have been gone over already. See URL:
  http://miller.emu.id.au/pmiller/books/rmch/ .
  
  I'm not sure it can be applied to the ports tree, though. I haven't
  looked into it, but recalled this paper when you mentioned measuring
  makes and sub-makes.
 Unfortunately you deleted the sentence before, so I rephrase it: before 
 looking into optimizations find out where the time is actually spent - 
 how many seconds of the hours the process takes, are actually spent in 
 make and sub-makes. If the entire process takes 2 hours of which the 
 makes take 20 seconds then by enhancing performance of make by 50% you 
 win 10 seconds. This is probably not worth a single line of additional code.
 
 The paper you point to talks about something entirely different.

I think we're talking about two different things. You're talking
about the efficiency of make, whereas he's talking about the
efficiency of make. Um, wait.

You're talking about what I'll call the *internal* efficiency of make,
defined as how fast it does the things it does. He's talking about
what I'll call the *external* efficiency of make, which is how well it
does at doing the minimum amount of work it needs to do. I hope you
can see where the confusion comes from.

In particular, he talks about how recursive makefiles screw up
evaluating complex variables, causing them to be executed multiple
times. So if you're running a makefile to pull some variable's value,
as opposed to running real commands, and your entire process takes 2 hours
and the Makefile takes 20 seconds, but it evaluates all the variables
twice, then by fixing your makefile you win at least 59 minutes and 50
seconds. I think cutting the run time by 50% is worth some work.

Benchmarking can help you decide which things it pays to work on if
all you're worried about is the internal efficiency. However, the goal
is to make the process faster, so we need to worry about the external
efficiency as well. The problem here is that the worse it is, the less
it looks like you stand to gain by looking at your makefile when you
look at the benchmarks.

Given that the ports system has both highly complex variables and is
very recursive, I believe that it warrants investigation if you're
going to work on making make in the ports faster.

mike
-- 
Mike Meyer [EMAIL PROTECTED]  http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Alexander Nedotsukov
Correct me if I'm wrong, but didn't you miss the fact that chdir(2) changes 
a process-wide attribute?

Though it's easy to fix with the -C option.
Stephen Montgomery-Smith wrote:

Jeremy Chadwick wrote:
On Sun, May 27, 2007 at 03:52:16PM -0500, Stephen Montgomery-Smith wrote:
 I have been thinking a lot about looking for speed increases for make 
 index and pkg_version and things like that.  So for example, in 
 pkg_version, it calls make -V PKGNAME for every installed package. Now 
 make -V PKGNAME should be a speedy operation, but the make has to load in 
 and analyze bsd.port.mk, a quite complicated file with about 200,000 
 characters in it, when all it is needing to do is to figure out the value of 
 the variable PKGNAME.


I have a related question, pertaining to make all-depends-list and the
utter atrocity that is the make variable ALL-DEPENDS-LIST.  If you don't
know what it is, look for ^ALL-DEPENDS-LIST around line 5175, in
bsd.port.mk.


I posted this to [EMAIL PROTECTED], but now I am realizing that it is 
[EMAIL PROTECTED] that gets more responses.  Anyway, here is a 
multithreaded program all-depends-list that can get you double the 
speed on dual processor systems, and even some small speed gains on 
single processor systems.  E.g.


all-depends-list /usr/ports/x11/xorg

http://www.math.missouri.edu/~stephen/all-depends-list.c





Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Stephen Montgomery-Smith

Roman Divacky wrote:

On Mon, May 28, 2007 at 11:34:24AM -0500, Stephen Montgomery-Smith wrote:

Jeremy Chadwick wrote:

On Sun, May 27, 2007 at 03:52:16PM -0500, Stephen Montgomery-Smith wrote:
I have been thinking a lot about looking for speed increases for make 
index and pkg_version and things like that.  So for example, in 
pkg_version, it calls make -V PKGNAME for every installed package. Now 
make -V PKGNAME should be a speedy operation, but the make has to load 
in and analyze bsd.port.mk, a quite complicated file with about 200,000 
characters in it, when all it is needing to do is to figure out the 
value of the variable PKGNAME.

I have a related question, pertaining to make all-depends-list and the
utter atrocity that is the make variable ALL-DEPENDS-LIST.  If you don't
know what it is, look for ^ALL-DEPENDS-LIST around line 5175, in
bsd.port.mk.
I posted this to [EMAIL PROTECTED], but now I am realizing that it is 
[EMAIL PROTECTED] that gets more responses.  Anyway, here is a 
multithreaded program all-depends-list that can get you double the 
speed on dual processor systems, and even some small speed gains on 
single processor systems.  E.g.


all-depends-list /usr/ports/x11/xorg

http://www.math.missouri.edu/~stephen/all-depends-list.c


btw.. Stephen, when are you getting a commit bit? :) I certainly hope it's soon 
enough ;)


Probably not.  The program seems to have a bug in it.  In particular, I 
didn't read the fgetln man page sufficiently well.  So think of it as a 
proof of concept rather than a finished product.


I'm going to rest from this stuff for a while, but I enjoyed the 
exchanges and it has given me encouragement to work on it again in the 
future sometime.


Stephen


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Garrett Cooper

Stephen Montgomery-Smith wrote:

Roman Divacky wrote:

On Mon, May 28, 2007 at 11:34:24AM -0500, Stephen Montgomery-Smith wrote:

Jeremy Chadwick wrote:
On Sun, May 27, 2007 at 03:52:16PM -0500, Stephen Montgomery-Smith wrote:
I have been thinking a lot about looking for speed increases for 
make index and pkg_version and things like that.  So for example, 
in pkg_version, it calls make -V PKGNAME for every installed 
package. Now make -V PKGNAME should be a speedy operation, but 
the make has to load in and analyze bsd.port.mk, a quite 
complicated file with about 200,000 characters in it, when all it 
is needing to do is to figure out the value of the variable PKGNAME.

I have a related question, pertaining to make all-depends-list and the
utter atrocity that is the make variable ALL-DEPENDS-LIST.  If you don't
know what it is, look for ^ALL-DEPENDS-LIST around line 5175, in
bsd.port.mk.
I posted this to [EMAIL PROTECTED], but now I am realizing that it is 
[EMAIL PROTECTED] that gets more responses.  Anyway, here is a 
multithreaded program all-depends-list that can get you double the 
speed on dual processor systems, and even some small speed gains on 
single processor systems.  E.g.


all-depends-list /usr/ports/x11/xorg

http://www.math.missouri.edu/~stephen/all-depends-list.c


btw.. Stephen, when are you getting a commit bit? :) I certainly hope 
it's soon enough ;)


Probably not.  The program seems to have a bug in it.  In particular, I 
didn't read the fgetln man page sufficiently well.  So think of it as a 
proof of concept rather than a finished product.


I'm going to rest from this stuff for a while, but I enjoyed the 
exchanges and it has given me encouragement to work on it again in the 
future sometime.


Stephen


fgetln(3) just scans ahead to the next newline, so a pointer to the 
line is returned and the length of the line (with newline char 
included) is stored in the len variable (2nd parameter to function).


-Garrett


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-28 Thread Rick C. Petty
On Sun, May 27, 2007 at 03:30:48PM -0700, Jeremy Chadwick wrote:
 
 That said, I'll ask this out in the open: am I the only one who sees the
 benefit of GNU make in regards to this?  There's a lot of built-in
 functions in GNU make which could help in regards to ports.  I have no
 qualms with PMake per se, but if another tool gives us what we need,
 then maybe we should consider the pros and cons of adapting that.
 There's also CMake, which is incredibly fast.

Yes, you are.  What gmake benefits?

Gmake does not provide the flexibility and power that pmake provides.  Off
the top of my head:  gmake does not have .for loops, variable expansion
modifiers, or even the != shell command variable assignment.  I use these
in almost every Makefile I write, and the ports uses these things quite a
bit.  Also, gmake syntax is horrendous compared to pmake.  People are
already complaining about how ugly the ports makefiles are-- they'd be
worse under gmake.  Might as well rewrite the whole infrastructure in
/bin/sh ...

Also, there are the licensing issues.  Remember-- any significant changes to
this infrastructure have to work with the core utilities.  This leaves out
gmake, python, ruby, etc.  I doubt anyone will find anything as powerful
as pmake without sacrificing the much-used flexibility it provides.

-- Rick C. Petty


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-27 Thread Jeremy Chadwick
On Sun, May 27, 2007 at 03:52:16PM -0500, Stephen Montgomery-Smith wrote:
  I have been thinking a lot about looking for speed increases for make 
  index and pkg_version and things like that.  So for example, in 
  pkg_version, it calls make -V PKGNAME for every installed package. Now 
  make -V PKGNAME should be a speedy operation, but the make has to load in 
  and analyze bsd.port.mk, a quite complicated file with about 200,000 
  characters in it, when all it is needing to do is to figure out the value of 
  the variable PKGNAME.

I have a related question, pertaining to make all-depends-list and the
utter atrocity that is the make variable ALL-DEPENDS-LIST.  If you don't
know what it is, look for ^ALL-DEPENDS-LIST around line 5175, in
bsd.port.mk.

I call it an atrocity because it's a mix of make variable expansion
combined with sh scripting, and it's nearly impossible to read.  It's
not commented either, so folks like myself are left thinking "What IS
this mess?!".  It's expanded via $$(${ALL-DEPENDS-LIST}) in for loops,
throughout several places in bsd.port.mk.

I do not entirely understand what ALL-DEPENDS-LIST is about (that should
be apparent), but upon performing some of my benchmarks, I found this to
be a very slow piece of bsd.port.mk.  make -V _DEPEND_DIRS is incredibly
fast, but ALL-DEPENDS-LIST is not.

Does it need to be done this way?  Can we just iterate through all of
the ports, call make -V _DEPEND_DIRS, then sort | uniq the results?  I
suppose that depends on the operation (make vs. make clean vs.
others)...

The port I used for testing some of the benchmarks was net/gacxtool.  It
seems to be a good example of a hefty port.

  I suggest rewriting make so that variables are only evaluated on a need 
  to know basis.  So, for example, if all we need to know is PKGNAME, there 
  is no need to evaluate, for example, _RUN_LIB_DEPENDS, unless the writer of 
  that particular port has done something like having PORTNAME depend on the 
  value of _RUN_LIB_DEPENDS.  So make should analyze all the code it is 
  given, and only figure it out if it is needed to do so.  This would include, 
  for example, figuring out .for and .if directives on a need to know basis as 
  well.

This sounds like a good solution.  In fact, I'm led to believe that
heavy reliance on /bin/sh is part of why the ports collection is slow.
No, it's not the sole reason, but it's one of many.  I'm of the belief
that anything we can do to migrate portions into native make would be
beneficial.

That said, I'll ask this out in the open: am I the only one who sees the
benefit of GNU make in regards to this?  There's a lot of built-in
functions in GNU make which could help in regards to ports.  I have no
qualms with PMake per se, but if another tool gives us what we need,
then maybe we should consider the pros and cons of adopting that.
There's also CMake, which is incredibly fast.

-- 
| Jeremy Chadwick                    jdc at parodius.com |
| Parodius Networking           http://www.parodius.com/ |
| UNIX Systems Administrator      Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: Looking for speed increases in make index and pkg_version for ports

2007-05-27 Thread Bakul Shah
Not quite what you asked for but...
 
Given the size and complexity of the port system I have long
felt that rather than do everything via more and more complex
Mk/*.mk what is needed is a ports server and a thin CLI
frontend to it.

This server can store dependency data in an efficient manner,
deal with conditional dependencies, port renames, security
and what not.  It can build or fetch or serve packages,
handle updates etc.  Things mentioned in UPDATING file can
instead be done by the server.  In general it can automate a
lot of stuff, remove error prone redundancies etc.  If it is
small enough and written in C, it can even be shipped with
the base system instead of various pkg_* programs.

It can provide two interfaces, one for normal users (with
commands like add, check, config, delete, info, search,
update, which) and one for port developers (commands for
adding/removing/renaming ports, etc.).  Initially it must work
with existing Makefiles.

 I have been thinking a lot about looking for speed increases for make 
 index and pkg_version and things like that.  So for example, in 
 pkg_version, it calls make -V PKGNAME for every installed package. 
 Now make -V PKGNAME should be a speedy operation, but the make has to 
 load in and analyze bsd.port.mk, a quite complicated file with about 
 200,000 characters in it, when all it is needing to do is to figure out 
 the value of the variable PKGNAME.
 
 I suggest rewriting make so that variables are only evaluated on a 
 need to know basis.  So, for example, if all we need to know is 
 PKGNAME, there is no need to evaluate, for example, _RUN_LIB_DEPENDS, 
 unless the writer of that particular port has done something like having 
 PORTNAME depend on the value of _RUN_LIB_DEPENDS.  So make should 
 analyze all the code it is given, and only figure it out if it is needed 
 to do so.  This would include, for example, figuring out .for and .if 
 directives on a need to know basis as well.
 
 I have only poked around a little inside the source for make, but I have 
 a sense that this would be a major undertaking.  I certainly have not 
 thought through what it entails in more than a cursory manner.  However 
 I am quite excited about the possibility of doing this, albeit I may 
 well put off the whole thing for a year or two or even forever depending 
 upon other priorities in my life.
 
 However, in the mean time I want to throw this idea out there to get 
 some feedback, either of the form of "this won't work", or of the form 
 "I will do it", or "I have tried to do this".
 
 Best regards, Stephen


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-27 Thread Stephen Montgomery-Smith

Jeremy Chadwick wrote:

On Sun, May 27, 2007 at 03:52:16PM -0500, Stephen Montgomery-Smith wrote:
 I have been thinking a lot about looking for speed increases for make 
 index and pkg_version and things like that.  So for example, in 
 pkg_version, it calls make -V PKGNAME for every installed package. Now 
 make -V PKGNAME should be a speedy operation, but the make has to load in 
 and analyze bsd.port.mk, a quite complicated file with about 200,000 
 characters in it, when all it is needing to do is to figure out the value of 
 the variable PKGNAME.


I have a related question, pertaining to make all-depends-list and the
utter atrocity that is the make variable ALL-DEPENDS-LIST.  If you don't
know what it is, look for ^ALL-DEPENDS-LIST around line 5175, in
bsd.port.mk.

I call it an atrocity because it's a mix of make variable expansion
combined with sh scripting, and it's nearly impossible to read.  It's
not commented either, so folks like myself are left thinking What IS
this mess?!.  It's expanded via $$(${ALL-DEPENDS-LIST}) in for loops,
throughout several places in bsd.port.mk.

I do not entirely understand what ALL-DEPENDS-LIST is about (that should
be apparent), but upon performing some of my benchmarks, I found this to
be a very slow piece of bsd.port.mk.  make -V _DEPEND_DIRS is incredibly
fast, but ALL-DEPENDS-LIST is not.

Does it need to be done this way?  Can we just iterate through all of
the ports, call make -V _DEPEND_DIRS, then sort | uniq the results?  I
suppose that depends on the operation (make vs. make clean vs.
others)...

The port I used for testing some of the benchmarks was net/gacxtool.  It
seems to be a good example of a hefty port.


I looked very hard at this particular piece of code.  Once you 
understand how it works, you realize that it is rather well written.  It 
basically does what you suggest, except it keeps a list of ports it has 
already checked, so that it doesn't do them over again.  This piece of 
code is as efficient as it can possibly be, given that the program has 
to recursively call make on every dependency port at least once (and as 
it happens only once).
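
The approach described above (recurse through dependencies, but remember which
ports have already been checked) can be sketched as follows.  This is an
illustration in Python, not the shell code in bsd.port.mk: all_depends and
fake_depend_dirs are made-up names, the port origins in GRAPH are invented,
and the callable stands in for running make -V _DEPEND_DIRS in a port
directory.

```python
def all_depends(origin, depend_dirs):
    """Collect every dependency of `origin`, querying each port only once.

    `depend_dirs(port)` stands in for running `make -V _DEPEND_DIRS` in
    that port's directory; here it is any callable returning a list.
    """
    checked = set()          # ports already queried
    pending = [origin]       # ports still to query
    while pending:
        port = pending.pop()
        if port in checked:
            continue         # already done, do not call make again
        checked.add(port)
        pending.extend(depend_dirs(port))
    checked.discard(origin)  # a port is not listed as its own dependency
    return sorted(checked)


# Made-up dependency graph, for illustration only.
GRAPH = {
    "x11/app": ["x11/libA", "x11/libB"],
    "x11/libA": ["devel/base"],
    "x11/libB": ["devel/base"],
    "devel/base": [],
}

calls = []


def fake_depend_dirs(port):
    calls.append(port)       # record each simulated make invocation
    return GRAPH[port]


deps = all_depends("x11/app", fake_depend_dirs)
print(deps)
print(len(calls))            # one simulated make call per distinct port
```

Without the checked set, devel/base would be queried twice here, once through
each library, and the repetition grows quickly in a deep dependency tree.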





 I suggest rewriting make so that variables are only evaluated on a need 
 to know basis.  So, for example, if all we need to know is PKGNAME, there 
 is no need to evaluate, for example, _RUN_LIB_DEPENDS, unless the writer of 
 that particular port has done something like having PORTNAME depend on the 
 value of _RUN_LIB_DEPENDS.  So make should analyze all the code it is 
 given, and only figure it out if it is needed to do so.  This would include, 
 for example, figuring out .for and .if directives on a need to know basis as 
 well.


This sounds like a good solution.  In fact, I'm lead to believe that
heavy reliance on /bin/sh is part of why the ports collection is slow.
No, it's not the sole reason, but it's one of many.  I'm of the belief
that anything we can do to migrate portions into native make would be
benefitial.


I have done profiling tests on make, and in its current form, 
bsd.port.mk actually spends rather little time inside of /bin/sh.  The 
writers of bsd.port.mk have done a very good job of minimizing the 
/bin/sh calls.




That said, I'll ask this out in the open: am I the only one who sees the
benefit of GNU make in regards to this?  There's a lot of built-in
functions in GNU make which could help in regards to ports.  I have no
qualms with PMake per se, but if another tool gives us what we need,
then maybe we should consider the pros and cons of adapting that.
There's also CMake, which is incredibly fast.



Maybe I should look at the inner workings of cmake and gmake.  Maybe 
they have some good ideas.  However having looked through the source 
code of make, and also looking at the cvs logs, it does seem to be well 
written.  The only possibility I see of making it go a lot faster is a 
complete redesign, e.g. my just in time idea for processing variables.
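
As a rough picture of what evaluating variables on a need to know basis could
mean, here is a toy sketch in Python.  It is not how make works internally:
the expand function, the ${NAME} handling, and the variable values (including
the version number) are assumptions made for illustration, and it ignores
.if and .for, which make processes while parsing, as well as cyclic
definitions.

```python
import re

# Matches simple ${NAME} references only; real make syntax is richer.
REF = re.compile(r"\$\{([A-Za-z0-9_]+)\}")


def expand(name, definitions, cache, evaluated):
    """Expand one variable, recursing only into variables it references."""
    if name not in cache:
        evaluated.append(name)   # record what we actually had to evaluate
        raw = definitions.get(name, "")
        cache[name] = REF.sub(
            lambda m: expand(m.group(1), definitions, cache, evaluated),
            raw,
        )
    return cache[name]


# Invented definitions; the version number is a placeholder.
DEFS = {
    "PORTNAME": "mpg123",
    "PORTVERSION": "0.0",
    "PKGNAME": "${PORTNAME}-${PORTVERSION}",
    "_RUN_LIB_DEPENDS": "${PORTNAME}-something-expensive",
}

evaluated = []
value = expand("PKGNAME", DEFS, {}, evaluated)
print(value)
print(evaluated)   # _RUN_LIB_DEPENDS never appears in this list
```

Only the variables that PKGNAME actually refers to are touched; anything else
in the definitions is left unevaluated unless a later request needs it.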


Stephen


Re: Looking for speed increases in make index and pkg_version for ports

2007-05-27 Thread Stephen Montgomery-Smith
I'm looking for something that will work with the existing framework.
But yes, I get the feeling that using make to process the ports
might be the source of the problem.  Make is a program primarily
designed for figuring out which was made first, the target or the
source, but in the ports what we really want is a scripting language
that presides over "cd WRKSRC; make".


(P.S. sorry for top-posting, but I am following your lead.)


Bakul Shah wrote:

Not quite what you asked for but...
 
Given the size and complexity of the port system I have long
felt that rather than do everything via more and more complex
Mk/*.mk what is needed is a ports server and a thin CLI
frontend to it.

This server can store dependency data in an efficient manner,
deal with conditional dependencies, port renames, security
and what not.  It can build or fetch or serve packages,
handle updates etc.  Things mentioned in the UPDATING file can
instead be done by the server.  In general it can automate a
lot of stuff, remove error-prone redundancies etc.  If it is
small enough and written in C, it can even be shipped with
the base system instead of various pkg_* programs.

It can provide two interfaces, one for normal users (with
commands like add, check, config, delete, info, search,
update, which) and one for port developers (commands for
adding/removing/renaming ports, etc.).  Initially it must work
with existing Makefiles.

I have been thinking a lot about looking for speed increases for
"make index" and pkg_version and things like that.  So for example,
pkg_version calls "make -V PKGNAME" for every installed package.
Now "make -V PKGNAME" should be a speedy operation, but make has to
load and analyze bsd.port.mk, a quite complicated file with about
200,000 characters in it, when all it needs to do is figure out
the value of the variable PKGNAME.


I suggest rewriting make so that variables are only evaluated on a
need-to-know basis.  So, for example, if all we need to know is
PKGNAME, there is no need to evaluate, for example, _RUN_LIB_DEPENDS,
unless the writer of that particular port has done something like having
PORTNAME depend on the value of _RUN_LIB_DEPENDS.  So make should
analyze all the code it is given, but only evaluate a piece of it when
that is actually needed.  This would include, for example, evaluating
.for and .if directives on a need-to-know basis as well.
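
To make the need-to-know idea concrete, here is a minimal sketch in Python rather than in make itself.  It is only an illustration under stated assumptions: the LazyVars class, its get() method and the sample variable values are all hypothetical, and it models nothing but ${NAME} references, not .if/.for directives, modifiers or anything else that bsd.port.mk really uses.  The point it tries to show is that asking for PKGNAME touches only the variables PKGNAME refers to, and leaves _RUN_LIB_DEPENDS unevaluated.

```python
import re


class LazyVars:
    """Hypothetical store of make-style variables, expanded only on demand."""

    _REF = re.compile(r"\$\{(\w+)\}")

    def __init__(self, definitions):
        self.raw = dict(definitions)   # unexpanded text, as read from a file
        self.cache = {}                # values that have already been expanded
        self.evaluated = []            # order in which variables were expanded

    def get(self, name, _stack=()):
        """Expand `name`, recursively expanding only what it refers to."""
        if name in self.cache:
            return self.cache[name]
        if name in _stack:
            raise ValueError("recursive variable: " + name)
        text = self.raw.get(name, "")  # undefined variables expand to ""
        value = self._REF.sub(
            lambda m: self.get(m.group(1), _stack + (name,)), text
        )
        self.cache[name] = value
        self.evaluated.append(name)
        return value


# Made-up definitions, loosely shaped like what a port might set.
defs = {
    "PORTNAME": "example",
    "PORTVERSION": "1.2",
    "PKGNAME": "${PORTNAME}-${PORTVERSION}",
    "LIB_DEPENDS": "libfoo",
    "RUN_DEPENDS": "bar",
    "_RUN_LIB_DEPENDS": "${LIB_DEPENDS} ${RUN_DEPENDS}",
}

v = LazyVars(defs)
print(v.get("PKGNAME"))   # example-1.2
print(v.evaluated)        # ['PORTNAME', 'PORTVERSION', 'PKGNAME']
```

If a port did make PORTNAME depend on _RUN_LIB_DEPENDS, the same get() call would pull it in automatically, which is the exception described above.  The hard part in a real make would be doing the same for conditionals and loops, which this sketch does not attempt.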


I have only poked around a little inside the source for make, but I have
a sense that this would be a major undertaking.  I certainly have not
thought through what it entails in more than a cursory manner.  However,
I am quite excited about the possibility of doing this, although I may
well put off the whole thing for a year or two or even forever depending
upon other priorities in my life.


However, in the meantime I want to throw this idea out there to get
some feedback, whether of the form "this won't work", or of the form
"I will do it", or "I have tried to do this".


Best regards, Stephen