Profiling a dlopened library
Hi, How do i profile a library which is dlopened from an executable. The executable and the library are compiled with -g using gcc . Can somone tell me tool that would profile it at runtime. Regards, Nik ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: Profiling a dlopened library
Hi-- On Apr 18, 2013, at 11:21 AM, Niklaus nikl...@gmail.com wrote: How do i profile a library which is dlopened from an executable. The executable and the library are compiled with -g using gcc . Can somone tell me tool that would profile it at runtime. For gprof-style profiling, you'll need to build with -pg in addition. Regards, -- -Chuck ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: profiling library smaller than non-profiling, while it contains more symbols. Why?
-rw-r--r-- 1 root wheel 6582354 Jul 12 22:56 libslatec.a -rw-r--r-- 1 root wheel 6555122 Jul 12 23:02 libslatec_p.a # profile library or -fpic library? ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: profiling library smaller than non-profiling, while it contains more symbols. Why?
On Thu, Jul 12, 2012 at 09:12:32PM -0500, Robert Bonomi wrote: From owner-freebsd-questi...@freebsd.org Thu Jul 12 17:34:12 2012 Date: Thu, 12 Jul 2012 23:31:31 +0100 From: Anton Shterenlikht me...@bristol.ac.uk To: freebsd-questions@freebsd.org Subject: profiling library smaller than non-profiling, while it contains more symbols. Why? While updating my port (math/slatec) to use the new OPTIONS framework, I did some experiments with the profiling library. I don't know much about this, so what surprised me is that the profiling library is smaller: # ls -al lib*a -rw-r--r-- 1 root wheel 6582354 Jul 12 22:56 libslatec.a -rw-r--r-- 1 root wheel 6555122 Jul 12 23:02 libslatec_p.a # It it possible that libslatac.a has debggingn symbols, and the profiling library does not? Or that the profiling library was compiled with a lower degree of optimization ? (many of the 'higher'-level optimizations cause _larger_, albeit faster, code to be generated) Any other differences in compilation flags? No, the compilation is very straightforward for this library: === Building for slatec-4.1 Warning: Object directory not changed from original /usr/ports/math/slatec/work/src gfortran46 -O -Wl,-rpath=/usr/local/lib/gcc46 -c aa.f *skip ~1400 other files* gfortran46 -O -Wl,-rpath=/usr/local/lib/gcc46 -c zwrsk.f building static slatec library /usr/local/bin/ranlib libslatec.a gfortran46 -pg -O -Wl,-rpath=/usr/local/lib/gcc46 -o aa.po -c aa.f *skip ~1400 other files* gfortran46 -pg -O -Wl,-rpath=/usr/local/lib/gcc46 -o zwrsk.po -c zwrsk.f building profiled slatec library /usr/local/bin/ranlib libslatec_p.a gfortran46 -fpic -DPIC -O -Wl,-rpath=/usr/local/lib/gcc46 -o aa.So -c aa.f *skip ~1400 other files* gfortran46 -fpic -DPIC -O -Wl,-rpath=/usr/local/lib/gcc46 -o zwrsk.So -c zwrsk.f building shared library libslatec.so.1 # That's all there is. As I mentioned in the original email, the only difference, according to nm(1), between the non-profiling and the profiling library, is that the profiling library contains symbol .mcount (or _mcount, depending on the arch) for each object file. All other symbols are identical. -- Anton Shterenlikht Room 2.6, Queen's Building Mech Eng Dept Bristol University University Walk, Bristol BS8 1TR, UK Tel: +44 (0)117 331 5944 Fax: +44 (0)117 929 4423 ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: profiling library smaller than non-profiling, while it contains more symbols. Why?
On Fri, Jul 13, 2012 at 11:15:45AM +0200, Wojciech Puchar wrote: -rw-r--r-- 1 root wheel 6582354 Jul 12 22:56 libslatec.a -rw-r--r-- 1 root wheel 6555122 Jul 12 23:02 libslatec_p.a # profile library or -fpic library? I think profile: === Building for slatec-4.1 Warning: Object directory not changed from original /usr/ports/math/slatec/work/src gfortran46 -O -Wl,-rpath=/usr/local/lib/gcc46 -c aa.f *skip ~1400 other files* gfortran46 -O -Wl,-rpath=/usr/local/lib/gcc46 -c zwrsk.f building static slatec library /usr/local/bin/ranlib libslatec.a gfortran46 -pg -O -Wl,-rpath=/usr/local/lib/gcc46 -o aa.po -c aa.f *skip ~1400 other files* gfortran46 -pg -O -Wl,-rpath=/usr/local/lib/gcc46 -o zwrsk.po -c zwrsk.f building profiled slatec library /usr/local/bin/ranlib libslatec_p.a gfortran46 -fpic -DPIC -O -Wl,-rpath=/usr/local/lib/gcc46 -o aa.So -c aa.f *skip ~1400 other files* gfortran46 -fpic -DPIC -O -Wl,-rpath=/usr/local/lib/gcc46 -o zwrsk.So -c zwrsk.f building shared library libslatec.so.1 -- Anton Shterenlikht Room 2.6, Queen's Building Mech Eng Dept Bristol University University Walk, Bristol BS8 1TR, UK Tel: +44 (0)117 331 5944 Fax: +44 (0)117 929 4423 ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
profiling library smaller than non-profiling, while it contains more symbols. Why?
While updating my port (math/slatec) to use the new OPTIONS framework, I did some experiments with the profiling library. I don't know much about this, so what surprised me is that the profiling library is smaller: # ls -al lib*a -rw-r--r-- 1 root wheel 6582354 Jul 12 22:56 libslatec.a -rw-r--r-- 1 root wheel 6555122 Jul 12 23:02 libslatec_p.a # while it adds .mcount symbol to each object file: # nm libslatec.a nm # nm libslatec_p.a nmp # wc nm nmp 16436 36675 373033 nm 17885 39573 413605 nmp 34321 76248 786638 total # grep -c mcount nmp 1449 # expr 16436 + 1449 17885 # Using diff I can confirm that the only difference between the 2 libs is the .mcount symbol for each object file in the profiling library. So how can the profiling library be smaller? -- Anton Shterenlikht Room 2.6, Queen's Building Mech Eng Dept Bristol University University Walk, Bristol BS8 1TR, UK Tel: +44 (0)117 331 5944 Fax: +44 (0)117 929 4423 ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: profiling library smaller than non-profiling, while it contains more symbols. Why?
On Thu, Jul 12, 2012 at 11:31:31PM +0100, Anton Shterenlikht wrote: While updating my port (math/slatec) to use the new OPTIONS framework, I did some experiments with the profiling library. I don't know much about this, so what surprised me is that the profiling library is smaller: # ls -al lib*a -rw-r--r-- 1 root wheel 6582354 Jul 12 22:56 libslatec.a -rw-r--r-- 1 root wheel 6555122 Jul 12 23:02 libslatec_p.a # while it adds .mcount symbol to each object file: # nm libslatec.a nm # nm libslatec_p.a nmp # wc nm nmp 16436 36675 373033 nm 17885 39573 413605 nmp 34321 76248 786638 total # grep -c mcount nmp 1449 # expr 16436 + 1449 17885 # Using diff I can confirm that the only difference between the 2 libs is the .mcount symbol for each object file in the profiling library. So how can the profiling library be smaller? Also, the library compiled on amd64 has lots more symbols than if compiled on ia64. For example: amd64: zbesy.o: r .LC0 0008 r .LC1 0010 r .LC11 0020 r .LC12 0028 r .LC13 0030 r .LC14 0010 r .LC2 0018 r .LC5 r .LC6 U cos U d1mach_ U exp U i1mach_ U sin U zbesh_ T zbesy_ and ia64: zbesy.o: U cos U d1mach_ U exp U i1mach_ U sin U zbesh_ T zbesy_ Why the difference? -- Anton Shterenlikht Room 2.6, Queen's Building Mech Eng Dept Bristol University University Walk, Bristol BS8 1TR, UK Tel: +44 (0)117 331 5944 Fax: +44 (0)117 929 4423 ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: profiling library smaller than non-profiling, while it contains more symbols. Why?
From owner-freebsd-questi...@freebsd.org Thu Jul 12 17:34:12 2012 Date: Thu, 12 Jul 2012 23:31:31 +0100 From: Anton Shterenlikht me...@bristol.ac.uk To: freebsd-questions@freebsd.org Subject: profiling library smaller than non-profiling, while it contains more symbols. Why? While updating my port (math/slatec) to use the new OPTIONS framework, I did some experiments with the profiling library. I don't know much about this, so what surprised me is that the profiling library is smaller: # ls -al lib*a -rw-r--r-- 1 root wheel 6582354 Jul 12 22:56 libslatec.a -rw-r--r-- 1 root wheel 6555122 Jul 12 23:02 libslatec_p.a # It it possible that libslatac.a has debggingn symbols, and the profiling library does not? Or that the profiling library was compiled with a lower degree of optimization ? (many of the 'higher'-level optimizations cause _larger_, albeit faster, code to be generated) Any other differences in compilation flags? ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: profiling library smaller than non-profiling, while it contains more symbols. Why?
Date: Thu, 12 Jul 2012 23:52:18 +0100 From: Anton Shterenlikht me...@bristol.ac.uk Subject: Re: profiling library smaller than non-profiling, while it contains more symbols. Why? Also, the library compiled on amd64 has lots more symbols than if compiled on ia64. This is _not_ unexpected with different processor architectures, and thus different instruction sets. Speculation: the values lfor the .LC* variables look like bit-masks -- it may be that ia64 has opcodes that allow immediate operannds, obviating the need for a 'data' consntat in memory. Something in the source causes different processor-specific code to be generated -- one processor uses 'data' constants, the other doesn't. ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
RE: kernel profiling: spinlock_exit consumes 36% CPU time.
Forgot to meantion that the test is based on FreeBSD kernel 7.0 2000807 snapshot. The kernel was compiled with a modified version of GENERIC configuration. With SMP and PREEMPTION disabled and kernel profiling enabled. -Original Message- From: Jeremy Chadwick [mailto:[EMAIL PROTECTED] Sent: Tuesday, October 07, 2008 7:58 PM To: Cc: freebsd-questions@freebsd.org; [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: kernel profiling: spinlock_exit consumes 36% CPU time. On Tue, Oct 07, 2008 at 07:44:00PM +0800, wrote: Hi, folks, I did kernel profiling when a single thread client sends UDP packets to a single thread server on the same machine. In the output kernel profile, the first few kernel functions that consumes the most CPU time are listed below: granularity: each sample hit covers 16 byte(s) for 0.01% of 25.68 seconds % cumulative self self total time seconds secondscalls ms/call ms/call name 42.4 10.8810.880 100.00% __mcount [1] 36.1 20.14 9.26 17937541 0.00 0.00 spinlock_exit [4] 4.2 21.22 1.08 3145728 0.00 0.00 in_cksum_skip [40] 1.8 21.68 0.45 7351987 0.00 0.00 generic_copyin [43] 1.1 21.96 0.29 3146028 0.00 0.00 generic_copyout [48] 1.0 22.21 0.24 2108904 0.00 0.00 Xint0x80_syscall [3] 0.8 22.42 0.21 6292131 0.00 0.00 uma_zalloc_arg [46] 0.8 22.62 0.20 1048576 0.00 0.00 soreceive_generic [9] 0.7 22.80 0.19 3145852 0.00 0.00 free [47] 0.6 22.96 0.15 6292172 0.00 0.00 uma_zfree_arg [52] 0.6 23.10 0.14 5243413 0.00 0.00 generic_bzero [53] 0.5 23.23 0.14 1048581 0.00 0.00 ip_output [23] 0.5 23.36 0.13 4221855 0.00 0.00 generic_bcopy [57] 0.4 23.47 0.11 36865859 0.00 0.00 critical_enter [61] 0.4 23.57 0.10 36865859 0.00 0.00 critical_exit [62] 0.4 23.67 0.09 17937541 0.00 0.00 spinlock_enter [63] 0.4 23.76 0.09 1048582 0.00 0.00 udp_input [21] 0.3 23.85 0.09 2108904 0.00 0.00 syscall [5] 0.3 23.93 0.08 1048587 0.00 0.00 ip_input [20] 0.3 24.00 0.07 2097156 0.00 0.00 getsock [65] 0.3 24.07 0.07 1048576 0.00 0.00 udp_send [22] It is very strange that spinlock_exit consumes over 36% CPU time while it seems a very simple function. For clarity, I paste the code of spinlock_exit here: void spinlock_exit(void) { struct thread *td; td = curthread; critical_exit(); td-td_md.md_spinlock_count--; if (td-td_md.md_spinlock_count == 0) intr_restore(td-td_md.md_saved_flags); } Since critical_exit consumes only 0.4% CPU time, does this mean the rest of spinlock_exit consume ~36% CPU time? Am I missing something? Could anybody help me understand this? Many thanks. BTW, the kernel is compiled with SMP and PREEMPTION disabled. The scheduler is ULE. What FreeBSD version, and what build date of the kernel? -- | Jeremy Chadwickjdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
RE: kernel profiling: spinlock_exit consumes 36% CPU time.
Many thanks for the information. Could we say that interrupt handlers consumed ~36% execution time? Is this number too high? Is it possible that we abuse the use of critical sections in kernel? Looking forward to your options. Many thanks. Qiu Jian On Tuesday 07 October 2008 07:44:00 am 邱剑 wrote: Hi, folks, I did kernel profiling when a single thread client sends UDP packets to a single thread server on the same machine. In the output kernel profile, the first few kernel functions that consumes the most CPU time are listed below: granularity: each sample hit covers 16 byte(s) for 0.01% of 25.68 seconds % cumulative self self total time seconds secondscalls ms/call ms/call name 42.4 10.8810.880 100.00% __mcount [1] 36.1 20.14 9.26 17937541 0.00 0.00 spinlock_exit [4] 4.2 21.22 1.08 3145728 0.00 0.00 in_cksum_skip [40] 1.8 21.68 0.45 7351987 0.00 0.00 generic_copyin [43] 1.1 21.96 0.29 3146028 0.00 0.00 generic_copyout [48] 1.0 22.21 0.24 2108904 0.00 0.00 Xint0x80_syscall [3] 0.8 22.42 0.21 6292131 0.00 0.00 uma_zalloc_arg [46] 0.8 22.62 0.20 1048576 0.00 0.00 soreceive_generic [9] It is very strange that spinlock_exit consumes over 36% CPU time while it seems a very simple function. It's because the intr_restore() re-enables interrupts and the resulting time spent executing the handlers for any pending interrupts are attributed to spinlock_exit(). -- John Baldwin ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: kernel profiling: spinlock_exit consumes 36% CPU time.
On Wednesday 08 October 2008 03:51:48 am 邱剑 wrote: Many thanks for the information. Could we say that interrupt handlers consumed ~36% execution time? Is this number too high? Is it possible that we abuse the use of critical sections in kernel? I think whether or not it is high depends on the workload. -- John Baldwin ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: kernel profiling: spinlock_exit consumes 36% CPU time.
On Tue, 7 Oct 2008, John Baldwin wrote: On Tuesday 07 October 2008 07:44:00 am wrote: Hi, folks, I did kernel profiling when a single thread client sends UDP packets to a single thread server on the same machine. In the output kernel profile, the first few kernel functions that consumes the most CPU time are listed below: granularity: each sample hit covers 16 byte(s) for 0.01% of 25.68 seconds % cumulative self self total time seconds secondscalls ms/call ms/call name 42.4 10.8810.880 100.00% __mcount [1] 36.1 20.14 9.26 17937541 0.00 0.00 spinlock_exit [4] 4.2 21.22 1.08 3145728 0.00 0.00 in_cksum_skip [40] 1.8 21.68 0.45 7351987 0.00 0.00 generic_copyin [43] 1.1 21.96 0.29 3146028 0.00 0.00 generic_copyout [48] 1.0 22.21 0.24 2108904 0.00 0.00 Xint0x80_syscall [3] 0.8 22.42 0.21 6292131 0.00 0.00 uma_zalloc_arg [46] 0.8 22.62 0.20 1048576 0.00 0.00 soreceive_generic [9] It is very strange that spinlock_exit consumes over 36% CPU time while it seems a very simple function. It's because the intr_restore() re-enables interrupts and the resulting time spent executing the handlers for any pending interrupts are attributed to spinlock_exit(). This is one of many defects that are not present in high resolution kernel profiling (kgmon -B instead of kgmon -b; availaible on amd64 and i386). However, high resolution kernel profiling doesn't work right with SMP, and was completely broken by gcc-4. Ordinary profiling was less completely broken by gcc-4, and you can recover the old behaviour by turning off new optimizations (mainly -funit-at-a-time and/or -finline-functions-called-once and or all of -O2). Bruce___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: kernel profiling: spinlock_exit consumes 36% CPU time.
On Tue, Oct 07, 2008 at 07:44:00PM +0800, wrote: Hi, folks, I did kernel profiling when a single thread client sends UDP packets to a single thread server on the same machine. In the output kernel profile, the first few kernel functions that consumes the most CPU time are listed below: granularity: each sample hit covers 16 byte(s) for 0.01% of 25.68 seconds % cumulative self self total time seconds secondscalls ms/call ms/call name 42.4 10.8810.880 100.00% __mcount [1] 36.1 20.14 9.26 17937541 0.00 0.00 spinlock_exit [4] 4.2 21.22 1.08 3145728 0.00 0.00 in_cksum_skip [40] 1.8 21.68 0.45 7351987 0.00 0.00 generic_copyin [43] 1.1 21.96 0.29 3146028 0.00 0.00 generic_copyout [48] 1.0 22.21 0.24 2108904 0.00 0.00 Xint0x80_syscall [3] 0.8 22.42 0.21 6292131 0.00 0.00 uma_zalloc_arg [46] 0.8 22.62 0.20 1048576 0.00 0.00 soreceive_generic [9] 0.7 22.80 0.19 3145852 0.00 0.00 free [47] 0.6 22.96 0.15 6292172 0.00 0.00 uma_zfree_arg [52] 0.6 23.10 0.14 5243413 0.00 0.00 generic_bzero [53] 0.5 23.23 0.14 1048581 0.00 0.00 ip_output [23] 0.5 23.36 0.13 4221855 0.00 0.00 generic_bcopy [57] 0.4 23.47 0.11 36865859 0.00 0.00 critical_enter [61] 0.4 23.57 0.10 36865859 0.00 0.00 critical_exit [62] 0.4 23.67 0.09 17937541 0.00 0.00 spinlock_enter [63] 0.4 23.76 0.09 1048582 0.00 0.00 udp_input [21] 0.3 23.85 0.09 2108904 0.00 0.00 syscall [5] 0.3 23.93 0.08 1048587 0.00 0.00 ip_input [20] 0.3 24.00 0.07 2097156 0.00 0.00 getsock [65] 0.3 24.07 0.07 1048576 0.00 0.00 udp_send [22] It is very strange that spinlock_exit consumes over 36% CPU time while it seems a very simple function. For clarity, I paste the code of spinlock_exit here: void spinlock_exit(void) { struct thread *td; td = curthread; critical_exit(); td-td_md.md_spinlock_count--; if (td-td_md.md_spinlock_count == 0) intr_restore(td-td_md.md_saved_flags); } Since critical_exit consumes only 0.4% CPU time, does this mean the rest of spinlock_exit consume ~36% CPU time? Am I missing something? Could anybody help me understand this? Many thanks. BTW, the kernel is compiled with SMP and PREEMPTION disabled. The scheduler is ULE. What FreeBSD version, and what build date of the kernel? -- | Jeremy Chadwickjdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: kernel profiling: spinlock_exit consumes 36% CPU time.
邱剑 wrote: Hi, folks, [...] spinlocks disable interrupts so the profiling interrupt is held off from the moment that the spinlock is entered to the moment it is exited, and all of that time is attributed to spinlock_exit(). so that this tells you that 3% of your time is spent under spinlocks which is a lot. as others have asked, what version? you should look up lock profiling to see WHICH lock is teh ine in question. ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: kernel profiling: spinlock_exit consumes 36% CPU time.
On Tuesday 07 October 2008 07:44:00 am 邱剑 wrote: Hi, folks, I did kernel profiling when a single thread client sends UDP packets to a single thread server on the same machine. In the output kernel profile, the first few kernel functions that consumes the most CPU time are listed below: granularity: each sample hit covers 16 byte(s) for 0.01% of 25.68 seconds % cumulative self self total time seconds secondscalls ms/call ms/call name 42.4 10.8810.880 100.00% __mcount [1] 36.1 20.14 9.26 17937541 0.00 0.00 spinlock_exit [4] 4.2 21.22 1.08 3145728 0.00 0.00 in_cksum_skip [40] 1.8 21.68 0.45 7351987 0.00 0.00 generic_copyin [43] 1.1 21.96 0.29 3146028 0.00 0.00 generic_copyout [48] 1.0 22.21 0.24 2108904 0.00 0.00 Xint0x80_syscall [3] 0.8 22.42 0.21 6292131 0.00 0.00 uma_zalloc_arg [46] 0.8 22.62 0.20 1048576 0.00 0.00 soreceive_generic [9] It is very strange that spinlock_exit consumes over 36% CPU time while it seems a very simple function. It's because the intr_restore() re-enables interrupts and the resulting time spent executing the handlers for any pending interrupts are attributed to spinlock_exit(). -- John Baldwin ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
kernel profiling: spinlock_exit consumes 36% CPU time.
Hi, folks, I did kernel profiling when a single thread client sends UDP packets to a single thread server on the same machine. In the output kernel profile, the first few kernel functions that consumes the most CPU time are listed below: granularity: each sample hit covers 16 byte(s) for 0.01% of 25.68 seconds % cumulative self self total time seconds secondscalls ms/call ms/call name 42.4 10.8810.880 100.00% __mcount [1] 36.1 20.14 9.26 17937541 0.00 0.00 spinlock_exit [4] 4.2 21.22 1.08 3145728 0.00 0.00 in_cksum_skip [40] 1.8 21.68 0.45 7351987 0.00 0.00 generic_copyin [43] 1.1 21.96 0.29 3146028 0.00 0.00 generic_copyout [48] 1.0 22.21 0.24 2108904 0.00 0.00 Xint0x80_syscall [3] 0.8 22.42 0.21 6292131 0.00 0.00 uma_zalloc_arg [46] 0.8 22.62 0.20 1048576 0.00 0.00 soreceive_generic [9] 0.7 22.80 0.19 3145852 0.00 0.00 free [47] 0.6 22.96 0.15 6292172 0.00 0.00 uma_zfree_arg [52] 0.6 23.10 0.14 5243413 0.00 0.00 generic_bzero [53] 0.5 23.23 0.14 1048581 0.00 0.00 ip_output [23] 0.5 23.36 0.13 4221855 0.00 0.00 generic_bcopy [57] 0.4 23.47 0.11 36865859 0.00 0.00 critical_enter [61] 0.4 23.57 0.10 36865859 0.00 0.00 critical_exit [62] 0.4 23.67 0.09 17937541 0.00 0.00 spinlock_enter [63] 0.4 23.76 0.09 1048582 0.00 0.00 udp_input [21] 0.3 23.85 0.09 2108904 0.00 0.00 syscall [5] 0.3 23.93 0.08 1048587 0.00 0.00 ip_input [20] 0.3 24.00 0.07 2097156 0.00 0.00 getsock [65] 0.3 24.07 0.07 1048576 0.00 0.00 udp_send [22] It is very strange that spinlock_exit consumes over 36% CPU time while it seems a very simple function. For clarity, I paste the code of spinlock_exit here: void spinlock_exit(void) { struct thread *td; td = curthread; critical_exit(); td-td_md.md_spinlock_count--; if (td-td_md.md_spinlock_count == 0) intr_restore(td-td_md.md_saved_flags); } Since critical_exit consumes only 0.4% CPU time, does this mean the rest of spinlock_exit consume ~36% CPU time? Am I missing something? Could anybody help me understand this? Many thanks. BTW, the kernel is compiled with SMP and PREEMPTION disabled. The scheduler is ULE. Best regards, Qiu Jian ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
gcc42 and profiling
Whist trying to compile a program using the gcc42 (actually gnat-gcc42) port, using the -pg flag for profiling, I got the following upon linking: cc -pg -c -o prog.o prog.c cc -pg -o prog prog.o /usr/bin/ld: cannot find -lgcc_p I'm assuming this means that profiling libraries for the compiler are missing. What do I need to do to get working libraries from the port? There doesn't appear to be anything to set in the Makefile or any port options whatsoever. Please CC: me as I'm not subscribed. ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: gcc42 and profiling
Please ignore my last post. I somehow neglected to install the proflibs distribution. ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: wall-clock time profiling
In article [EMAIL PROTECTED], Bill Campbell [EMAIL PROTECTED] writes: freebsd On Thu, May 31, 2007, Akihiro KAYAMA wrote: freebsd In article [EMAIL PROTECTED], freebsd Chuck Swiger [EMAIL PROTECTED] writes: freebsd freebsd cswiger Akihiro KAYAMA wrote: freebsd cswiger Hi all. freebsd cswiger freebsd cswiger What is the right way to measure wall-clock time in profiling on FreeBSD? freebsd cswiger freebsd cswiger The time shell builtin command or /usr/bin/time -l _program_? freebsd cswiger freebsd cswiger The latter variant displays the rusage struct (ie, from man getrusage)? freebsd freebsd Thanks for your response. freebsd freebsd Yes, we can know whether the program is I/O bound or CPU bound by freebsd time(1). But it is still unclear which part of the program is really freebsd waiting for I/O. So profiling is needed for tuning, although CPU time freebsd profiling gives me non-distinct result on I/O bound programs. It is freebsd reason why I want wall-clock time profiling. freebsd freebsd Wall clock time doesn't generally tell you anything reliably freebsd useful on a multi-tasking system as it's very dependent on other freebsd system activity. I had many fights back in the mid '70s with freebsd people in accounting who wanted to bill wall-clock time on freebsd Burroughs main frames which generally had 20 programs in the mix freebsd at any time (I guess they were accustomed to IBM 360s that freebsd couldn't walk and chew gum at the same time :-). Yes, I know UNIX was born as TSS in 1970. And today I occupy my FreeBSD box. It is not necessary to share it with my colleagues. Thanks to Moore's Law and so many FreeBSD guys. It is easy to prepare a dedicated machine for debugging purpose and run a test program only. Daemons are still running in background but they are almost asleep (load averages: 0.00) so will not be critical on wall-clock time profiling. Wall-clock time profiling will be useful to know what routines/system calls take time in complex code, especially I/O related routines. -- Akihiro KAYAMA ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: wall-clock time profiling
Akihiro KAYAMA wrote: Hi all. What is the right way to measure wall-clock time in profiling on FreeBSD? The time shell builtin command or /usr/bin/time -l _program_? The latter variant displays the rusage struct (ie, from man getrusage)? -- -Chuck ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: wall-clock time profiling
In article [EMAIL PROTECTED], Chuck Swiger [EMAIL PROTECTED] writes: cswiger Akihiro KAYAMA wrote: cswiger Hi all. cswiger cswiger What is the right way to measure wall-clock time in profiling on FreeBSD? cswiger cswiger The time shell builtin command or /usr/bin/time -l _program_? cswiger cswiger The latter variant displays the rusage struct (ie, from man getrusage)? Thanks for your response. Yes, we can know whether the program is I/O bound or CPU bound by time(1). But it is still unclear which part of the program is really waiting for I/O. So profiling is needed for tuning, although CPU time profiling gives me non-distinct result on I/O bound programs. It is reason why I want wall-clock time profiling. -- Akihiro KAYAMA ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: wall-clock time profiling
On Thu, May 31, 2007, Akihiro KAYAMA wrote: In article [EMAIL PROTECTED], Chuck Swiger [EMAIL PROTECTED] writes: cswiger Akihiro KAYAMA wrote: cswiger Hi all. cswiger cswiger What is the right way to measure wall-clock time in profiling on FreeBSD? cswiger cswiger The time shell builtin command or /usr/bin/time -l _program_? cswiger cswiger The latter variant displays the rusage struct (ie, from man getrusage)? Thanks for your response. Yes, we can know whether the program is I/O bound or CPU bound by time(1). But it is still unclear which part of the program is really waiting for I/O. So profiling is needed for tuning, although CPU time profiling gives me non-distinct result on I/O bound programs. It is reason why I want wall-clock time profiling. Wall clock time doesn't generally tell you anything reliably useful on a multi-tasking system as it's very dependent on other system activity. I had many fights back in the mid '70s with people in accounting who wanted to bill wall-clock time on Burroughs main frames which generally had 20 programs in the mix at any time (I guess they were accustomed to IBM 360s that couldn't walk and chew gum at the same time :-). If I had to measure wall-clock time for proceses, I would probably write a simple wrapper script to execute the command getting the system time before and after running the command I wanted to time: #!/usr/bin/env python import os, time tstart = time.time() os.system('some command') tend = time.time() et = tstart - tend print et Bill -- INTERNET: [EMAIL PROTECTED] Bill Campbell; Celestial Software LLC URL: http://www.celestial.com/ PO Box 820; 6641 E. Mercer Way FAX:(206) 232-9186 Mercer Island, WA 98040-0820; (206) 236-1676 ``Lord, the money we do spend on Government and it's not one bit better than the government we got for one third the money twenty years ago.'' Will Rogers ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
wall-clock time profiling
Hi all. What is the right way to measure wall-clock time in profiling on FreeBSD? Standard profiling method in UNIX like 'gprof' measures CPU time, but it doesn't always offer a good indication for tuning if application is not CPU bound. For example, the below simple program spend most of the time for disk I/O, but gprof doesn't tell me that fsync(2) in output() is the hot-spot which should be removed at first. #include unistd.h #include sys/types.h #include sys/stat.h #include fcntl.h char calc() { int i; char c; for (i = 0; i 1; i++) { c = c * i; } return c; } void output(int fd, char c) { write(fd, c, 1); fsync(fd); /* XXX time consuming system call */ } int main() { int fd; int i; char c; fd = open(testfile, O_CREAT | O_WRONLY); for (i = 0; i 1; i++) { c = calc(i); output(fd, c); } return 0; } gprof's output: % cumulative self self total time seconds secondscalls ms/call ms/call name 84.2 0.56 0.561 0.06 0.06 calc [3] 13.6 0.65 0.091 0.01 0.01 _fsync [5] 1.4 0.66 0.011 0.00 0.00 write [6] 0.7 0.66 0.00 .mcount (12) 0.2 0.66 0.001 0.00 0.01 output [4] Measuring CPU time has been proper for traditional UNIX as TSS, but I think wall-clock time is also useful in these days. It is affected by system load, but it is easy to arrange some dedicated system for profiling purpose. Wall-clock time profiling itself could be implemented by simple kernel hack (i386/i386/trap.c:syscall() and kern/subr_trap.c:userret()), but if both CPU and wall-clock time profiling are available, I don't know how users should change the method. sysctl(8) may be inappropriate. Although I'm not familiar with another analysis tools, like DTrace for FreeBSD, is there something help for me? Thanks. -- Akihiro KAYAMA ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
Profiling debugging on FreeBSD: loadable modules, profiler output, threads
Hi, I need to debug some software, written in Python (with threads), which uses dynamically loadable modules, written in C. FreeBSD version is 5.2.1-RC. All Python modules are compiled with -g -pg, so is the python binary. By default, -g -pg -pthread doesn't seem to link libc to Python binary (modules complain about missing 'strftime' on load). Any other combination of options, than -g -pg -lc_r -lc -pthread causes Python binary to dump core with sig 10 at start (with never ending _pthread_self calls in backtrace). Even if python.gmon is created, it contains only an information about calls to '__fpsetreg'. My sysctl kern.debug.threads is 1. kdump output seems to reveal: 88107 python2p3 NAMI python2p3.gmon 88107 python2p3 RET open 5 ... but only at the *end* of the log (shouldn't it be open some time earlier)? I'd appreciate any help in this matter. I need this working by yesterday. Thanks, -- Michal Pasternak :: http://pasternak.w.lub.pl :: http://winsrc.sf.net Frankly, any company would look bad if their corporate e-mail was available for anyone to read, -- Jim Dose of Ritual Entertainment. pgpOScSnwIlfy.pgp Description: PGP signature
Re: C++ Memory Profiling/Debugging
Hi Lou, Thanks - I'll give it a spin. I read with some reservation on http://dmalloc.com/: Dmalloc is not as good with C++ as C because the dynamic memory routines in C++ are new() and delete() as opposed to malloc() and free(). Since new and delete are usually not used as functions but rather as x = new type, there is no easy way for dmalloc to pass in file and line information unfortunately. The `libdmallocxx.a' library provides the file `dmallocc.cc' which effectively redirects new to the more familiar malloc and delete to the more familiar free. I'll give it a go anyway. Wonder if there is a C++ friendly debug tool out there... Thanks, David On Sat, 21 Feb 2004, Louis LeBlanc wrote: On 02/21/04 12:48 AM, David Carter-Hitchin sat at the `puter and typed: Hi, Does anyone out there know a good C++ memory profiling/debugging tool for FBSD? I'm looking for a tool like valgrind or purify. I grepped around in the ports directory and I found ElectricFence and mprof but these seem to be for C only (as they refer exclusively to malloc free). bohem-gc sounds like the kind of package I'm after - but I thought I ask in case anyone has better ideas... ? devel/dmalloc is pretty good. I'm using it with C on Solaris, but all you do is basically link its library into your process, set a few environment variables, and it will dump a complete list of statistics, based on the values of the environment variables. The most valuable statistic is the origin of every single memory allocation that is not freed. Simply track those made by your code (file name and line number of the malloc are given) and fix them. I found it MUCH easier to integrate and use than Efence or Purify. If your process dynamically allocates memory that isn't intended to be freed, like for internal structure allocation through the life of the process, you might include a routine that frees such pointers in your cleanup process. I have a number of things I have to clear that aren't intended to be freed during the life of the process, so I simply added them within a conditional precompiler block that only gets compiled when I'm building a memory debuggable version. You wouldn't believe the memory leaks I found in (someone else's) production code with this tool. HIGHLY recommended. Good luck. Please cc me on any replies - I had to drop out of this list sometime ago as the sheer volume was killing my mailbox... I know what you mean . . . Lou -- Louis LeBlanc [EMAIL PROTECTED] Fully Funded Hobbyist, KeySlapper Extrordinaire :) http://www.keyslapper.org ԿԬ Unless hours were cups of sack, and minutes capons, and clocks the tongues of bawds, and dials the signs of leaping houses, and the blessed sun himself a fair, hot wench in flame-colored taffeta, I see no reason why thou shouldst be so superfluous to demand the time of the day. I wasted time and now doth time waste me. -- William Shakespeare ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: C++ Memory Profiling/Debugging
On 02/21/04 12:48 AM, David Carter-Hitchin sat at the `puter and typed: Hi, Does anyone out there know a good C++ memory profiling/debugging tool for FBSD? I'm looking for a tool like valgrind or purify. I grepped around in the ports directory and I found ElectricFence and mprof but these seem to be for C only (as they refer exclusively to malloc free). bohem-gc sounds like the kind of package I'm after - but I thought I ask in case anyone has better ideas... ? devel/dmalloc is pretty good. I'm using it with C on Solaris, but all you do is basically link its library into your process, set a few environment variables, and it will dump a complete list of statistics, based on the values of the environment variables. The most valuable statistic is the origin of every single memory allocation that is not freed. Simply track those made by your code (file name and line number of the malloc are given) and fix them. I found it MUCH easier to integrate and use than Efence or Purify. If your process dynamically allocates memory that isn't intended to be freed, like for internal structure allocation through the life of the process, you might include a routine that frees such pointers in your cleanup process. I have a number of things I have to clear that aren't intended to be freed during the life of the process, so I simply added them within a conditional precompiler block that only gets compiled when I'm building a memory debuggable version. You wouldn't believe the memory leaks I found in (someone else's) production code with this tool. HIGHLY recommended. Good luck. Please cc me on any replies - I had to drop out of this list sometime ago as the sheer volume was killing my mailbox... I know what you mean . . . Lou -- Louis LeBlanc [EMAIL PROTECTED] Fully Funded Hobbyist, KeySlapper Extrordinaire :) http://www.keyslapper.org ԿԬ Unless hours were cups of sack, and minutes capons, and clocks the tongues of bawds, and dials the signs of leaping houses, and the blessed sun himself a fair, hot wench in flame-colored taffeta, I see no reason why thou shouldst be so superfluous to demand the time of the day. I wasted time and now doth time waste me. -- William Shakespeare ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
C++ Memory Profiling/Debugging
Hi, Does anyone out there know a good C++ memory profiling/debugging tool for FBSD? I'm looking for a tool like valgrind or purify. I grepped around in the ports directory and I found ElectricFence and mprof but these seem to be for C only (as they refer exclusively to malloc free). bohem-gc sounds like the kind of package I'm after - but I thought I ask in case anyone has better ideas... ? Please cc me on any replies - I had to drop out of this list sometime ago as the sheer volume was killing my mailbox... Thanks, David ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
profiling
I am looking to do profiling of one of my programs. I know about gprof and cachegrind for linux but was wondering if there are other tools that are recommended to have or to use instead. The test systems I have access to are Linux, FreeBSD, and some nice Sun machines. Thank you, Ben Mayer ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: memory profiling
In the last episode (Jun 02), Anurag Chaudhary said: I am porting a daemon form linux to freebsd. It works fine on linux but crashes giving segmentation fault in freebsd can some one suggest me some good memory profiling tool that works fine with freebsd and available in binary format. memory profiling sounds like a tool to optimize memory accesses, which is not what you want. Just make sure you have coredumps enabled and run gdb on the corefile to determine where your program is crashing. -- Dan Nelson [EMAIL PROTECTED] ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to [EMAIL PROTECTED]
Simple question about profiling
I've to confess this my first serious profile session, and i found something really strange (at least for me... =P) [flag@law3 src]$ gprof proto3 [snip] % cumulative self self total time seconds secondscalls ms/call ms/call name 74.4 39.2639.26 .mcount (83) 6.7 42.82 3.56 111575600 0.00 0.00 checkRule2d [4] 6.6 46.30 3.48 1024 3.3912.30 buildTree2d [3] 3.8 48.29 1.99 117334464 0.00 0.00 add_tail [6] 3.7 50.25 1.96 117234464 0.00 0.00 rem [7] 1.8 51.18 0.93 117234464 0.00 0.00 rem_head [5] 0.6 51.51 0.33 164 0.00 0.00 __qdivrem [13] 0.5 51.75 0.24 84 0.00 0.00 __svfscanf [9] 0.4 51.96 0.21 9037 0.02 0.09 buildTree1d [10] 0.3 52.14 0.18 80 0.00 0.00 strtoumax [12] 0.3 52.28 0.14 1078414 0.00 0.00 memset [20] 0.2 52.41 0.12 5658864 0.00 0.00 checkRule1d [21] 0.1 52.47 0.07 250129 0.00 0.00 count [23] 0.1 52.52 0.05 537809 0.00 0.00 malloc_bytes cycle 1 [22 ] 0.1 52.56 0.04 1166299 0.00 0.00 append_lists [25] 0.1 52.59 0.03 2411019 0.00 0.00 new_list [26] 0.1 52.62 0.03 536551 0.00 0.00 calloc [14] 0.0 52.64 0.02 70 0.24 0.42 tree2dHeight [29] [snip] i think this is the beef: what the hell is .mcount?!?! if i read the table correctly, .mcount is the guilty, isn't it? any help or pointer is appreciated, thank you... =) -- Paolo To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-questions in the body of the message
Re: Simple question about profiling
On Thu, Feb 20, 2003 at 03:40:43PM +0100, Paolo Pisati wrote: I've to confess this my first serious profile session, and i found something really strange (at least for me... =P) see my answer to your previous posting. mcount is a function used by profiling. toni -- Terror ist der Krieg der Armen, | [EMAIL PROTECTED] Krieg ist der Terror der Reichen. | Toni Schmidbauer - Sir Peter Ustinov | msg1/pgp0.pgp Description: PGP signature
Re: Simple question about profiling
On Tue, Feb 18, 2003 at 10:36:16PM +0100, Paolo Pisati wrote: I've to confess this my first serious profile session, and i found something really strange (at least for me... =P) 74.4 39.2639.26 .mcount (83) i think this is the beef: what the hell is .mcount?!?! if i read the table correctly, .mcount is the guilty, isn't it? i think mcount is used for profiling, so it doesn't count! see http://www.gnu.org/manual/gprof-2.9.1/html_node/gprof_25.html toni -- Terror ist der Krieg der Armen, | [EMAIL PROTECTED] Krieg ist der Terror der Reichen. | Toni Schmidbauer - Sir Peter Ustinov | msg19923/pgp0.pgp Description: PGP signature
Simple question about profiling
I've to confess this my first serious profile session, and i found something really strange (at least for me... =P) [flag@law3 src]$ gprof proto3 [snip] % cumulative self self total time seconds secondscalls ms/call ms/call name 74.4 39.2639.26 .mcount (83) 6.7 42.82 3.56 111575600 0.00 0.00 checkRule2d [4] 6.6 46.30 3.48 1024 3.3912.30 buildTree2d [3] 3.8 48.29 1.99 117334464 0.00 0.00 add_tail [6] 3.7 50.25 1.96 117234464 0.00 0.00 rem [7] 1.8 51.18 0.93 117234464 0.00 0.00 rem_head [5] 0.6 51.51 0.33 164 0.00 0.00 __qdivrem [13] 0.5 51.75 0.24 84 0.00 0.00 __svfscanf [9] 0.4 51.96 0.21 9037 0.02 0.09 buildTree1d [10] 0.3 52.14 0.18 80 0.00 0.00 strtoumax [12] 0.3 52.28 0.14 1078414 0.00 0.00 memset [20] 0.2 52.41 0.12 5658864 0.00 0.00 checkRule1d [21] 0.1 52.47 0.07 250129 0.00 0.00 count [23] 0.1 52.52 0.05 537809 0.00 0.00 malloc_bytes cycle 1 [22 ] 0.1 52.56 0.04 1166299 0.00 0.00 append_lists [25] 0.1 52.59 0.03 2411019 0.00 0.00 new_list [26] 0.1 52.62 0.03 536551 0.00 0.00 calloc [14] 0.0 52.64 0.02 70 0.24 0.42 tree2dHeight [29] [snip] i think this is the beef: what the hell is .mcount?!?! if i read the table correctly, .mcount is the guilty, isn't it? any help or pointer is appreciated, thank you... =) -- Paolo To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-questions in the body of the message
profiling in makefile
Is there a documentation on this? How, why it can be used. Thanks, Murat To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-questions in the body of the message