Re: Optimising cygwin fork performance
On 16/12/2020 20:37, Brian Inglis wrote: > On 2020-12-16 10:36, Marco Atzeri via Cygwin-apps wrote: >> On 16.12.2020 13:13, Hamish McIntyre-Bhatty via Cygwin-apps wrote: >>> So I know it's been mentioned a lot that fork is slow on Cygwin, but >>> compared to other people's machines, eg when building, it seems way >>> slower for me. >>> >>> First I'd like to know if there's a good way to measure this that >>> anyone >>> has found, because I'm not sure how to measure it. If I print multiple >>> lines with echo in a script, I can see it printing maybe 2-3 a second - >>> it's very slow. >>> >>> I think this might be because I'm using a Virtual Machine with >>> VirtualBox, and QEMU/KVM might be quicker. I'm using Avira Antivurus, >>> with exceptions for the cygwin install folders (C:\cygwin64, >>> C:\cygwin). >>> >>> It might be nice if we could so some comparisons so I can figure out >>> what's wrong. > > Running strace on your forking executable should give you accurate > numbers in the output, with some work to extract the relevant values. > >> Same AV here, W10 64bit (no VM), 2 year old Laptop >> >> model name : Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz >> 4 cores >> >> https://github.com/mondalaci/fork-benchmark >> it seems there is a aging effect >> >> $ ./fork-benchmark.exe 1000 >> Forked, executed and destroyed 1000 processes in 39.928576 seconds. >> >> $ ./fork-benchmark.exe 1000 >> Forked, executed and destroyed 1000 processes in 42.701295 seconds. >> >> $ ./fork-benchmark.exe 1000 >> Forked, executed and destroyed 1000 processes in 49.890909 seconds. >> >> $ ./fork-benchmark.exe 1000 >> Forked, executed and destroyed 1000 processes in 61.657031 seconds. > > It's all part of your current process tree. > Running cygserver may help this with appropriate process settings: > > $ grep -C1 'kern\.srv\..*_.*' /etc/cygserver.conf > > # kern.srv.cleanup_threads: No. of cygserver threads used for cleanup > tasks. > # Default: 2, Min: 1, Max: 16, command line option -c, --cleanup-threads > #kern.srv.cleanup_threads 2 > kern.srv.cleanup_threads 16 > > # kern.srv.request_threads: No. of cygserver threads used to serve > # application requests. > # Default: 10, Min: 1, Max: 310, command line option -r, > --request-threads > #kern.srv.request_threads 10 > kern.srv.request_threads 310 > > # kern.srv.process_cache_size: No. of concurrent processes which can > be handled > # by Cygserver concurrently. > # Default: 62, Min: 1, Max: 310, command line option -p, --process-cache > #kern.srv.process_cache_size 62 > kern.srv.process_cache_size 310 > > $ ./fork-benchmark 1000 > Forked, executed and destroyed 1000 processes in 33.397727 seconds. > $ ./fork-benchmark 1000 > Forked, executed and destroyed 1000 processes in 34.70389 seconds. > $ ./fork-benchmark 1000 > Forked, executed and destroyed 1000 processes in 34.186709 seconds. > $ ./fork-benchmark 1000 > Forked, executed and destroyed 1000 processes in 33.65649 seconds. > $ sed -En '/^model name|^cpu MHz/p;/MHz/q' /proc/cpuinfo > model name : AMD A10-9700 RADEON R7, 10 COMPUTE CORES 4C+6G > cpu MHz : 3500.000 > > $ strace -o fork-benchmark.strace ./fork-benchmark 10 > $ egrep '^--- Process [0-9]+ (crea|\(pid: [0-9]+\) exi)ted|dwProcessId > [0-9]+|ExitProcess n 0x' fork-benchmark.strace > fork-benchmark.log > $ awk '/ [0-9]+! pinfo::exit: /{t+=$2};END{print t/1"ms"}' > fork-benchmark.log > 34.1855ms > > Faster CPUs, faster memory, bigger caches, SSD drive may help. Just running in KVM seems to have helped me. My times are more like 13 seconds for 1000 forks now. There appears to be drop-off, but after a big package build and some idling, it then gets back down to 13 seconds. Is there a way for me to check that the cygserver options are working? Either way, my package builds are about twice as quick as they were before now, possibly just from a reinstall and using KVM. Hamish 0x87B761FE07F548D6.asc Description: application/pgp-keys signature.asc Description: OpenPGP digital signature
Re: Optimising cygwin fork performance
On 2020-12-16 10:36, Marco Atzeri via Cygwin-apps wrote: On 16.12.2020 13:13, Hamish McIntyre-Bhatty via Cygwin-apps wrote: So I know it's been mentioned a lot that fork is slow on Cygwin, but compared to other people's machines, eg when building, it seems way slower for me. First I'd like to know if there's a good way to measure this that anyone has found, because I'm not sure how to measure it. If I print multiple lines with echo in a script, I can see it printing maybe 2-3 a second - it's very slow. I think this might be because I'm using a Virtual Machine with VirtualBox, and QEMU/KVM might be quicker. I'm using Avira Antivurus, with exceptions for the cygwin install folders (C:\cygwin64, C:\cygwin). It might be nice if we could so some comparisons so I can figure out what's wrong. Running strace on your forking executable should give you accurate numbers in the output, with some work to extract the relevant values. Same AV here, W10 64bit (no VM), 2 year old Laptop model name : Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz 4 cores https://github.com/mondalaci/fork-benchmark it seems there is a aging effect $ ./fork-benchmark.exe 1000 Forked, executed and destroyed 1000 processes in 39.928576 seconds. $ ./fork-benchmark.exe 1000 Forked, executed and destroyed 1000 processes in 42.701295 seconds. $ ./fork-benchmark.exe 1000 Forked, executed and destroyed 1000 processes in 49.890909 seconds. $ ./fork-benchmark.exe 1000 Forked, executed and destroyed 1000 processes in 61.657031 seconds. It's all part of your current process tree. Running cygserver may help this with appropriate process settings: $ grep -C1 'kern\.srv\..*_.*' /etc/cygserver.conf # kern.srv.cleanup_threads: No. of cygserver threads used for cleanup tasks. # Default: 2, Min: 1, Max: 16, command line option -c, --cleanup-threads #kern.srv.cleanup_threads 2 kern.srv.cleanup_threads 16 # kern.srv.request_threads: No. of cygserver threads used to serve # application requests. # Default: 10, Min: 1, Max: 310, command line option -r, --request-threads #kern.srv.request_threads 10 kern.srv.request_threads 310 # kern.srv.process_cache_size: No. of concurrent processes which can be handled # by Cygserver concurrently. # Default: 62, Min: 1, Max: 310, command line option -p, --process-cache #kern.srv.process_cache_size 62 kern.srv.process_cache_size 310 $ ./fork-benchmark 1000 Forked, executed and destroyed 1000 processes in 33.397727 seconds. $ ./fork-benchmark 1000 Forked, executed and destroyed 1000 processes in 34.70389 seconds. $ ./fork-benchmark 1000 Forked, executed and destroyed 1000 processes in 34.186709 seconds. $ ./fork-benchmark 1000 Forked, executed and destroyed 1000 processes in 33.65649 seconds. $ sed -En '/^model name|^cpu MHz/p;/MHz/q' /proc/cpuinfo model name : AMD A10-9700 RADEON R7, 10 COMPUTE CORES 4C+6G cpu MHz : 3500.000 $ strace -o fork-benchmark.strace ./fork-benchmark 10 $ egrep '^--- Process [0-9]+ (crea|\(pid: [0-9]+\) exi)ted|dwProcessId [0-9]+|ExitProcess n 0x' fork-benchmark.strace > fork-benchmark.log $ awk '/ [0-9]+! pinfo::exit: /{t+=$2};END{print t/1"ms"}' fork-benchmark.log 34.1855ms Faster CPUs, faster memory, bigger caches, SSD drive may help. -- Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada This email may be disturbing to some readers as it contains too much technical detail. Reader discretion is advised. [Data in binary units and prefixes, physical quantities in SI.] --- Process 9704 created 65 18385 [main] fork-benchmark 13592 pinfo::thisproc: myself dwProcessId 9704 --- Process 6712 created 797102 [main] fork-benchmark 13593 pinfo::thisproc: myself dwProcessId 6712 --- Process 11804 created 755642 [main] fork-benchmark 13593 pinfo::thisproc: myself dwProcessId 11804 51 34992 [main] fork-benchmark 13593! pinfo::exit: Calling ExitProcess n 0x400, exitcode 0x0 41 11124 [main] fork-benchmark 13593 pinfo::exit: Calling ExitProcess n 0x0, exitcode 0x0 --- Process 11804 (pid: 13593) exited with status 0x0 --- Process 4176 created 686571 [main] fork-benchmark 13594 pinfo::thisproc: myself dwProcessId 4176 --- Process 1924 created 775786 [main] fork-benchmark 13594 pinfo::thisproc: myself dwProcessId 1924 51 33582 [main] fork-benchmark 13594! pinfo::exit: Calling ExitProcess n 0x400, exitcode 0x0 40 10465 [main] fork-benchmark 13594 pinfo::exit: Calling ExitProcess n 0x0, exitcode 0x0 --- Process 1924 (pid: 13594) exited with status 0x0 --- Process 10576 created 646030 [main] fork-benchmark 13595 pinfo::thisproc: myself dwProcessId 10576 --- Process 6932 created 765798 [main] fork-benchmark 13595 pinfo::thisproc: myself dwProcessId 6932 42 31674 [main] fork-benchmark 13595! pinfo::exit: Calling ExitProcess n 0x400, exitcode 0x0 41 10602 [main] fork-benchmark 13595 pinfo::exit: Calling ExitProcess n
Re: Optimising cygwin fork performance
Marco Atzeri via Cygwin-apps wrote: On 16.12.2020 13:13, Hamish McIntyre-Bhatty via Cygwin-apps wrote: Hi, So I know it's been mentioned a lot that fork is slow on Cygwin, but compared to other people's machines, eg when building, it seems way slower for me. First I'd like to know if there's a good way to measure this that anyone has found, because I'm not sure how to measure it. If I print multiple lines with echo in a script, I can see it printing maybe 2-3 a second - it's very slow. I think this might be because I'm using a Virtual Machine with VirtualBox, and QEMU/KVM might be quicker. I'm using Avira Antivurus, with exceptions for the cygwin install folders (C:\cygwin64, C:\cygwin). It might be nice if we could so some comparisons so I can figure out what's wrong. Hamish Same AV here, W10 64bit (no VM), 2 year old Laptop model name : Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz 4 cores https://github.com/mondalaci/fork-benchmark it seems there is a aging effect Result of power management? Could possibly be prevented if another minor cpu load is run in parallel. $ ./fork-benchmark.exe 1000 Forked, executed and destroyed 1000 processes in 39.928576 seconds. $ ./fork-benchmark.exe 1000 Forked, executed and destroyed 1000 processes in 42.701295 seconds. $ ./fork-benchmark.exe 1000 Forked, executed and destroyed 1000 processes in 49.890909 seconds. $ ./fork-benchmark.exe 1000 Forked, executed and destroyed 1000 processes in 61.657031 seconds. 9 year old PC, W10 64bit, no VM, Intel i7-2600K CPU @ 3.4GHz, 4 cores / 8 threads AV: Windows Defender Cygwin x86_64 Protection on: $ ./fork-benchmark-64 1000 Forked, executed and destroyed 1000 processes in 16.651101 seconds. $ ./fork-benchmark-64 1000 Forked, executed and destroyed 1000 processes in 16.674107 seconds. Protection off: $ ./fork-benchmark-64 1000 Forked, executed and destroyed 1000 processes in 14.281071 seconds. $ ./fork-benchmark-64 1000 Forked, executed and destroyed 1000 processes in 14.426482 seconds. An alternative which could be run out-of-the-box is the good old 'date(s) per second' benchmark. Its results are comparable: $ while true; do date; done | uniq -c ... 53 Wed Dec 16 19:23:26 CET 2020 <== Protection on 54 Wed Dec 16 19:23:27 CET 2020 56 Wed Dec 16 19:23:28 CET 2020 55 Wed Dec 16 19:23:29 CET 2020 56 Wed Dec 16 19:23:30 CET 2020 ... 52 Wed Dec 16 19:23:37 CET 2020 54 Wed Dec 16 19:23:38 CET 2020 56 Wed Dec 16 19:23:39 CET 2020 63 Wed Dec 16 19:23:40 CET 2020 <== Protection off 63 Wed Dec 16 19:23:41 CET 2020 62 Wed Dec 16 19:23:42 CET 2020 64 Wed Dec 16 19:23:43 CET 2020 ... 63 Wed Dec 16 19:23:51 CET 2020 64 Wed Dec 16 19:23:52 CET 2020 63 Wed Dec 16 19:23:53 CET 2020 55 Wed Dec 16 19:23:54 CET 2020 <== Protection on 48 Wed Dec 16 19:23:55 CET 2020 53 Wed Dec 16 19:23:56 CET 2020 54 Wed Dec 16 19:23:57 CET 2020 54 Wed Dec 16 19:23:58 CET 2020 ... Cygwin x86 is somewhat slower: Protection on: $ ./fork-benchmark-32.exe 1000 Forked, executed and destroyed 1000 processes in 19.231766 seconds. Protection off: $ ./fork-benchmark-32.exe 1000 Forked, executed and destroyed 1000 processes in 17.107739 seconds. Regards, Christian
Re: Optimising cygwin fork performance
On 16.12.2020 13:13, Hamish McIntyre-Bhatty via Cygwin-apps wrote: Hi, So I know it's been mentioned a lot that fork is slow on Cygwin, but compared to other people's machines, eg when building, it seems way slower for me. First I'd like to know if there's a good way to measure this that anyone has found, because I'm not sure how to measure it. If I print multiple lines with echo in a script, I can see it printing maybe 2-3 a second - it's very slow. I think this might be because I'm using a Virtual Machine with VirtualBox, and QEMU/KVM might be quicker. I'm using Avira Antivurus, with exceptions for the cygwin install folders (C:\cygwin64, C:\cygwin). It might be nice if we could so some comparisons so I can figure out what's wrong. Hamish Same AV here, W10 64bit (no VM), 2 year old Laptop model name : Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz 4 cores https://github.com/mondalaci/fork-benchmark it seems there is a aging effect $ ./fork-benchmark.exe 1000 Forked, executed and destroyed 1000 processes in 39.928576 seconds. $ ./fork-benchmark.exe 1000 Forked, executed and destroyed 1000 processes in 42.701295 seconds. $ ./fork-benchmark.exe 1000 Forked, executed and destroyed 1000 processes in 49.890909 seconds. $ ./fork-benchmark.exe 1000 Forked, executed and destroyed 1000 processes in 61.657031 seconds.
Optimising cygwin fork performance
Hi, So I know it's been mentioned a lot that fork is slow on Cygwin, but compared to other people's machines, eg when building, it seems way slower for me. First I'd like to know if there's a good way to measure this that anyone has found, because I'm not sure how to measure it. If I print multiple lines with echo in a script, I can see it printing maybe 2-3 a second - it's very slow. I think this might be because I'm using a Virtual Machine with VirtualBox, and QEMU/KVM might be quicker. I'm using Avira Antivurus, with exceptions for the cygwin install folders (C:\cygwin64, C:\cygwin). It might be nice if we could so some comparisons so I can figure out what's wrong. Hamish 0x87B761FE07F548D6.asc Description: application/pgp-keys signature.asc Description: OpenPGP digital signature