Re: fork speed vs /bin/sh
Peter Wemm wrote:
> What this shows is that vfork() is 3 times faster than fork() on static
> binaries, and 9 times faster on dynamic binaries. If people are
> worried about a 40% slowdown, then perhaps they'd like to investigate
> a speedup that works no matter whether it's static or dynamic? There is
> a reason that popen(3) uses vfork(). /bin/sh should too, regardless of
> whether it's dynamic or static. csh/tcsh already uses vfork() for the
> same reason.

I'm a big fan of vfork(); the one problem I have with the use of it is that people tend to treat it as "a faster fork()", when it definitely is not.

The utility of vfork() is limited to the list of allowed system calls, which are _exit() and execve(); all other usage is undefined -- specifically, you cannot control things like whether it's the parent or the child that gets affected by calls like setsid(), setpgrp(), etc.

The other place that vfork() really sucks is in applications like "screen" or other applications that have multiple children and act as muxes for them: during the vfork() to spawn off a new child from the parent, the parent is stalled, and this in turn stalls all the children as well.

The vfork() system call is a good thing, particularly compared to the fork() system call, IFF it's used appropriately. For the most part, FreeBSD should consider creating a posix_spawn() system call instead, for most uses to which people put either the fork() or vfork() system calls today.

-- Terry
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
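For readers wondering what the posix_spawn() route suggested above looks like from userland, here is a minimal sketch. It assumes a system that already provides the POSIX <spawn.h> interface. The helper name spawn_and_wait() is made up for illustration and is not taken from any FreeBSD source. The caller never runs code in the child, so the question of which calls are legal after vfork() does not come up.

```c
#include <stddef.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/*
 * Hypothetical helper: start "path" with the given argv via posix_spawn()
 * and wait for it to finish.  Returns the child's exit status (0-255),
 * or -1 if the spawn failed or the child did not exit normally.
 */
int
spawn_and_wait(const char *path, char *const argv[])
{
    pid_t pid;
    int status;

    /* posix_spawn() returns an error number directly, not -1 with errno. */
    if (posix_spawn(&pid, path, NULL, NULL, argv, environ) != 0)
        return (-1);

    if (waitpid(pid, &status, 0) < 0)
        return (-1);

    return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

A call such as spawn_and_wait("/bin/sh", argv) with argv set to { "sh", "-c", "exit 0", NULL } would cover the simple "run a command and collect its status" case. Anything that needs setsid(), setpgrp() or similar in the child would have to go through the spawn attributes or fall back to fork().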
Re: fork speed vs /bin/sh
:What this shows is that vfork() is 3 times faster than fork() on static
:binaries, and 9 times faster on dynamic binaries. If people are
:worried about a 40% slowdown, then perhaps they'd like to investigate
:a speedup that works no matter whether it's static or dynamic? There is
:a reason that popen(3) uses vfork(). /bin/sh should too, regardless of
:whether it's dynamic or static. csh/tcsh already uses vfork() for the
:same reason.
:
:NetBSD have already taken advantage of this speedup and their /bin/sh uses
:vfork(). Some enterprising individual who cares about /bin/sh speed should
:check out that. Start looking near #ifdef DO_SHAREDVFORK.

That isn't really a fair comparison because your vfork is hitting a degenerate case and isn't actually doing anything significant. You really need to exec() something. I've included a program below that [v]fork/exec's "./sh -c exit 0" 5000 times.

Dell2550, 2xCPU (MP build), DFly

0.000u  4.095s 0:02.53 161.6%  154+107k 0+0io 0pf+0w  VFORK/EXEC STATIC SH
0.000u  6.681s 0:04.04 165.3%   94+97k  0+0io 0pf+0w  FORK/EXEC STATIC SH
0.500u 16.844s 0:16.34 106.1%   53+84k  0+0io 0pf+0w  VFORK/EXEC DYNAMIC SH
0.093u 18.303s 0:23.86  77.0%   42+79k  0+0io 0pf+0w  FORK/EXEC DYNAMIC SH

Athlon64, 2xCPU (UP), DFly

0.078u 0.687s 0:00.74 101.3%  399+226k 0+0io 0pf+0w  VFORK/EXEC STATIC SH
0.117u 0.968s 0:01.07 100.0%  273+208k 0+0io 0pf+0w  FORK/EXEC STATIC SH
2.218u 2.484s 0:04.71  99.5%  121+180k 0+0io 1pf+0w  VFORK/EXEC DYNAMIC SH
2.281u 2.773s 0:04.98 101.4%  113+179k 0+0io 0pf+0w  FORK/EXEC DYNAMIC SH
1.304u 2.289s 0:03.60  99.4%  121+180k 0+0io 0pf+0w  VFORK/EXEC DYNAMIC SH WITH PREBINDING
1.296u 2.648s 0:03.90 100.7%  112+180k 0+0io 1pf+0w  FORK/EXEC DYNAMIC SH WITH PREBINDING

These results were rather unexpected, actually. I'm not sure why the numbers on the Dell box are so bad with a dynamic 'sh', but I suspect that the dynamic linking is blowing out the L1 cache.
In any case, taking the Athlon64 system, the difference between static and dynamic is around 4 seconds while the difference between vfork and fork is only around 0.25 seconds, so while moving to vfork() helps, it doesn't help all that much.

Unless you happen to be hitting a boundary condition on the L1 cache, that is. If that is the case on the Dell box, as I presume (it only has a 16K L1 cache, whereas the AMD64 has a 64K L1 cache), then the difference is around 14 seconds between vfork static and vfork dynamic, versus an additional 8 seconds going from vfork to fork. Vfork would probably be a significant improvement on the Dell box.

Prebinding yields around a 20% improvement for the dynamic 'sh' on the Athlon64, but on the Dell2550 prebinding actually made things go slower (not shown above), from 23.8 seconds to 26 seconds. I think there is an edge case due to prebinding having a greater L1 cache impact. For larger, more complex programs prebinding shows definite, if small, improvements.

-Matt

/*
 * CD into the directory containing the ./sh executable before running
 */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int
main(void)
{
    int i;
    pid_t pid;

    for (i = 0; i < 5000; ++i) {
        if ((pid = vfork()) == 0) {    /* <-- CHANGE THIS FORK/VFORK */
            execl("./sh", "./sh", "-c", "exit", "0", (char *)NULL);
            /* only reached if the exec failed */
            write(2, "problem\n", 8);
            _exit(1);
        }
        if (pid > 0)
            waitpid(pid, NULL, 0);
    }
    return(0);
}
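To compare both variants from one binary instead of editing and recompiling the test program above, something along these lines might work. This is only a sketch: the function name time_spawns() and its parameters are invented here, it measures wall-clock time with clock_gettime() rather than the shell's time built-in used for the figures above, and it takes the shell path as an argument instead of assuming ./sh. It will not reproduce the user/system split shown in the tables.

```c
/* Asks glibc to declare vfork() under strict -std modes. */
#define _DEFAULT_SOURCE

#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: [v]fork/exec "sh_path -c 'exit 0'" the given number
 * of times and return the elapsed wall-clock seconds, or -1.0 if a fork
 * fails or a child does not exit with status 0.
 */
double
time_spawns(int use_vfork, int iterations, const char *sh_path)
{
    struct timespec start, end;
    pid_t pid;
    int i, status;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < iterations; ++i) {
        pid = use_vfork ? vfork() : fork();
        if (pid == 0) {
            /* Child: nothing but exec or _exit, as vfork() requires. */
            execl(sh_path, sh_path, "-c", "exit 0", (char *)NULL);
            _exit(127);
        }
        if (pid < 0)
            return (-1.0);
        if (waitpid(pid, &status, 0) < 0 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            return (-1.0);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    return ((double)(end.tv_sec - start.tv_sec) +
        (double)(end.tv_nsec - start.tv_nsec) / 1e9);
}
```

Calling time_spawns(1, 5000, "./sh") and then time_spawns(0, 5000, "./sh") from the directory holding the static or dynamic sh would give a vfork/fork pair for that build. Running each a few times and discarding the first run is probably wise, since cache effects were the suspected culprit on the Dell box.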
fork speed vs /bin/sh
I *know* I'm going to regret posting this, but if people care about the speed of their shell, then perhaps you want to look at this:

[EMAIL PROTECTED]:46am]/tmp-149> cc -O -o vforkathon.dynamic vforkathon.c
[EMAIL PROTECTED]:46am]/tmp-150> cc -O -static -o vforkathon.static vforkathon.c
[EMAIL PROTECTED]:47am]/tmp-151> cc -O -static -o forkathon.static forkathon.c
[EMAIL PROTECTED]:47am]/tmp-152> cc -O -o forkathon.dynamic forkathon.c
[EMAIL PROTECTED]:47am]/tmp-153> time ./forkathon.dynamic
0.120u 17.192s 0:17.81 97.1%   5+169k 0+0io 0pf+0w
[EMAIL PROTECTED]:47am]/tmp-154> time ./forkathon.static
0.051u 5.939s 0:06.38 93.7%   15+177k 0+0io 0pf+0w
[EMAIL PROTECTED]:47am]/tmp-155> time ./vforkathon.dynamic
0.015u 2.006s 0:02.30 87.3%   5+176k 0+0io 0pf+0w
[EMAIL PROTECTED]:48am]/tmp-156> time ./vforkathon.static
0.022u 2.020s 0:02.34 87.1%   16+182k 0+0io 0pf+0w

What this shows is that vfork() is 3 times faster than fork() on static binaries, and 9 times faster on dynamic binaries. If people are worried about a 40% slowdown, then perhaps they'd like to investigate a speedup that works no matter whether it's static or dynamic? There is a reason that popen(3) uses vfork(). /bin/sh should too, regardless of whether it's dynamic or static. csh/tcsh already uses vfork() for the same reason.

NetBSD have already taken advantage of this speedup and their /bin/sh uses vfork(). Some enterprising individual who cares about /bin/sh speed should check that out. Start looking near #ifdef DO_SHAREDVFORK.
In case anybody was wondering:

[EMAIL PROTECTED]:48am]/tmp-157> cat forkathon.c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int
main(int ac, char *av[])
{
        int i;
        pid_t pid;

        for (i = 0; i < 10; i++) {
                pid = fork();
                switch (pid) {
                case 0:
                        _exit(0);
                default:
                        waitpid(pid, NULL, 0);
                }
        }
}
[EMAIL PROTECTED]:53am]/tmp-158> diff forkathon.c vforkathon.c
12c12
<               pid = fork();
---
>               pid = vfork();

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
"All of this is for nothing if we don't go to the stars" - JMS/B5