Hi Peter,

* Peter O'Gorman wrote on Thu, May 05, 2005 at 02:13:31PM CEST:
> Ralf Wildenhues wrote:
>
> | I'm pretty sure I can get it quite a bit faster even. The patches need
> | cleanup so that they use allowed file names and work properly in corner
> | cases as well, but those don't scale with the number of objects, so they
> | matter less.
>
> | + $ECHO "$oldobjs" | $SP2NL | $SED -n -e '/./p' >_objs
>
> Ralf, you really rock!

Thanks. :-)

> I do worry about this echo though. How big is $oldobjs? Will we exceed the
> max_cmd_len if echo is an external program?

Yep, that's right, thanks for noting.  It doesn't matter here, though, as
my next patch makes it unnecessary. :-)

That one still needs testing, but the idea is to kill all quadratic
loops in the func_mode_link initialization.  Stuff like
  compile_command="$compile_command $qarg"

is better written as
  $ECHO " $qarg" >&FD_COMPILE_COMMAND

where FD_COMPILE_COMMAND is an m4 macro which evaluates to a file
descriptor for the file holding the contents of compile_command.
Same for oldobjs and a couple of other iteratively-set parameters.
The desired value is then retrieved quickly by
  compile_command=`$SED 's/^ //' <"$compile_command_file" | $NL2SP`

This gets me down from 2 minutes of initialization to about 30 seconds
for the convenience lib in question.  It can easily lead to worse
results if $ECHO is not a shell builtin, though, so preventing that will
be even more important.

Also, I previously thought that some modern shell out there optimized
  foo="$foo $bar"

assignments, but I can't find it now.

Another thing I am wondering about is the following: All shells I've
tested (under Linux) do repeated dup()s and close()s for something like
this:
  exec 5>file
  echo foo >&5
  echo bar >&5

strace of the important part:
| open("file", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
| fcntl64(5, F_DUPFD, 10) = -1 EBADF (Bad file descriptor)
| fcntl64(3, F_DUPFD, 5) = 5
| close(3) = 0
| fcntl64(1, F_DUPFD, 10) = 10
| close(1) = 0
| fcntl64(5, F_DUPFD, 1) = 1
| write(1, "foo\n", 4) = 4
| close(1) = 0
| fcntl64(10, F_DUPFD, 1) = 1
| close(10) = 0
| fcntl64(1, F_DUPFD, 10) = 10
| close(1) = 0
| fcntl64(5, F_DUPFD, 1) = 1
| write(1, "bar\n", 4) = 4
| close(1) = 0
| fcntl64(10, F_DUPFD, 1) = 1
| close(10) = 0

Why can't they write() to the right fd immediately and skip all the fd
creation and destruction?
Note I don't know whether that is a limiting factor, but it sure looks
like a good candidate.

libjava really is a nice test candidate for killing bottlenecks.

Regards,
Ralf
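
For illustration, here is a minimal standalone sketch of the append-to-a-file-descriptor idea described above. It is not the actual patch: fd 5 is hard-coded where the mail proposes the FD_COMPILE_COMMAND m4 macro, plain echo, sed and tr stand in for $ECHO, $SED and $NL2SP, the argument list is made up, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Sketch only. Assumptions: fd 5 replaces FD_COMPILE_COMMAND, and
# echo/sed/tr replace libtool's $ECHO, $SED and $NL2SP.
compile_command_file=`mktemp` || exit 1

# Open the accumulator file once, on a dedicated descriptor.
exec 5>"$compile_command_file"

# Hypothetical argument list, for illustration only.
for qarg in -O2 -c foo.c -o foo.o; do
  # Instead of: compile_command="$compile_command $qarg"
  # (which copies the whole string on every iteration), append one line.
  echo " $qarg" >&5
done

# Close the descriptor before reading the file back.
exec 5>&-

# Strip the leading space from each line and join the lines with spaces.
# Note: the joined value ends in one trailing space.
compile_command=`sed 's/^ //' <"$compile_command_file" | tr '\n' ' '`
rm -f "$compile_command_file"

echo "$compile_command"
```

Whether this actually beats the string append depends on echo being a shell builtin; with an external echo, every iteration pays for a fork and exec.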