RE: kernel profiling: spinlock_exit consumes 36% CPU time.

2008-10-08 Thread 邱剑
 Forgot to meantion that the test is based on FreeBSD kernel 7.0 2000807
snapshot.

The kernel was compiled with a modified version of GENERIC configuration.
With SMP and PREEMPTION disabled and kernel profiling enabled.

-Original Message-
From: Jeremy Chadwick [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, October 07, 2008 7:58 PM
To: 
Cc: freebsd-questions@freebsd.org; [EMAIL PROTECTED];
[EMAIL PROTECTED]
Subject: Re: kernel profiling: spinlock_exit consumes 36% CPU time.

On Tue, Oct 07, 2008 at 07:44:00PM +0800,  wrote:
 Hi, folks,
  
 I did kernel profiling when a single thread client sends UDP packets 
 to a single thread server on the same machine.
  
 In the output kernel profile, the first few kernel functions that 
 consumes the most CPU time are listed below:
  
 granularity: each sample hit covers 16 byte(s) for 0.01% of 25.68 
 seconds
 
   %   cumulative   self  self total
  time   seconds   secondscalls  ms/call  ms/call  name
  42.4  10.8810.880  100.00%   __mcount [1]
  36.1  20.14 9.26 17937541 0.00 0.00  spinlock_exit [4]
   4.2  21.22 1.08  3145728 0.00 0.00  in_cksum_skip [40]
   1.8  21.68 0.45  7351987 0.00 0.00  generic_copyin [43]
   1.1  21.96 0.29  3146028 0.00 0.00  generic_copyout [48]
   1.0  22.21 0.24  2108904 0.00 0.00  Xint0x80_syscall [3]
   0.8  22.42 0.21  6292131 0.00 0.00  uma_zalloc_arg [46]
   0.8  22.62 0.20  1048576 0.00 0.00  soreceive_generic
[9]
   0.7  22.80 0.19  3145852 0.00 0.00  free [47]
   0.6  22.96 0.15  6292172 0.00 0.00  uma_zfree_arg [52]
   0.6  23.10 0.14  5243413 0.00 0.00  generic_bzero [53]
   0.5  23.23 0.14  1048581 0.00 0.00  ip_output [23]
   0.5  23.36 0.13  4221855 0.00 0.00  generic_bcopy [57]
   0.4  23.47 0.11 36865859 0.00 0.00  critical_enter [61]
   0.4  23.57 0.10 36865859 0.00 0.00  critical_exit [62]
   0.4  23.67 0.09 17937541 0.00 0.00  spinlock_enter [63]
   0.4  23.76 0.09  1048582 0.00 0.00  udp_input [21]
   0.3  23.85 0.09  2108904 0.00 0.00  syscall [5]
   0.3  23.93 0.08  1048587 0.00 0.00  ip_input [20]
   0.3  24.00 0.07  2097156 0.00 0.00  getsock [65]
   0.3  24.07 0.07  1048576 0.00 0.00  udp_send [22]
  
 It is very strange that spinlock_exit consumes over 36% CPU time while 
 it seems a very simple function.
  
 For clarity, I paste the code of spinlock_exit here:
  
 void
 spinlock_exit(void)
 {
  struct thread *td;
  
  td = curthread;
  critical_exit();
  td-td_md.md_spinlock_count--;
  if (td-td_md.md_spinlock_count == 0)
   intr_restore(td-td_md.md_saved_flags);
 }
 
 Since critical_exit consumes only 0.4% CPU time, does this mean the 
 rest of spinlock_exit consume ~36% CPU time?
  
 Am I missing something? Could anybody help me understand this? Many
thanks.
  
 BTW, the kernel is compiled with SMP and PREEMPTION disabled. The 
 scheduler is ULE.

What FreeBSD version, and what build date of the kernel?

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


RE: kernel profiling: spinlock_exit consumes 36% CPU time.

2008-10-08 Thread 邱剑
Many thanks for the information.

Could we say that interrupt handlers consumed ~36% execution time?

Is this number too high? Is it possible that we abuse the use of critical
sections in kernel?

Looking forward to your options. Many thanks.

Qiu Jian


On Tuesday 07 October 2008 07:44:00 am 邱剑 wrote:
 Hi, folks,
  
 I did kernel profiling when a single thread client sends UDP packets 
 to a single thread server on the same machine.
  
 In the output kernel profile, the first few kernel functions that 
 consumes the most CPU time are listed below:
  
 granularity: each sample hit covers 16 byte(s) for 0.01% of 25.68 
 seconds
 
   %   cumulative   self  self total
  time   seconds   secondscalls  ms/call  ms/call  name
  42.4  10.8810.880  100.00%   __mcount [1]
  36.1  20.14 9.26 17937541 0.00 0.00  spinlock_exit [4]
   4.2  21.22 1.08  3145728 0.00 0.00  in_cksum_skip [40]
   1.8  21.68 0.45  7351987 0.00 0.00  generic_copyin [43]
   1.1  21.96 0.29  3146028 0.00 0.00  generic_copyout [48]
   1.0  22.21 0.24  2108904 0.00 0.00  Xint0x80_syscall [3]
   0.8  22.42 0.21  6292131 0.00 0.00  uma_zalloc_arg [46]
   0.8  22.62 0.20  1048576 0.00 0.00  soreceive_generic
[9]

 It is very strange that spinlock_exit consumes over 36% CPU time while 
 it seems a very simple function.

It's because the intr_restore() re-enables interrupts and the resulting time
spent executing the handlers for any pending interrupts are attributed to
spinlock_exit().

--
John Baldwin


___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: kernel profiling: spinlock_exit consumes 36% CPU time.

2008-10-08 Thread John Baldwin
On Wednesday 08 October 2008 03:51:48 am 邱剑 wrote:
 Many thanks for the information.
 
 Could we say that interrupt handlers consumed ~36% execution time?
 
 Is this number too high? Is it possible that we abuse the use of critical
 sections in kernel?

I think whether or not it is high depends on the workload.

-- 
John Baldwin
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: kernel profiling: spinlock_exit consumes 36% CPU time.

2008-10-08 Thread Bruce Evans

On Tue, 7 Oct 2008, John Baldwin wrote:


On Tuesday 07 October 2008 07:44:00 am  wrote:

Hi, folks,

I did kernel profiling when a single thread client sends UDP packets to a
single thread server on the same machine.

In the output kernel profile, the first few kernel functions that consumes
the most CPU time are listed below:

granularity: each sample hit covers 16 byte(s) for 0.01% of 25.68 seconds

  %   cumulative   self  self total
 time   seconds   secondscalls  ms/call  ms/call  name
 42.4  10.8810.880  100.00%   __mcount [1]
 36.1  20.14 9.26 17937541 0.00 0.00  spinlock_exit [4]
  4.2  21.22 1.08  3145728 0.00 0.00  in_cksum_skip [40]
  1.8  21.68 0.45  7351987 0.00 0.00  generic_copyin [43]
  1.1  21.96 0.29  3146028 0.00 0.00  generic_copyout [48]
  1.0  22.21 0.24  2108904 0.00 0.00  Xint0x80_syscall [3]
  0.8  22.42 0.21  6292131 0.00 0.00  uma_zalloc_arg [46]
  0.8  22.62 0.20  1048576 0.00 0.00  soreceive_generic [9]

It is very strange that spinlock_exit consumes over 36% CPU time while it
seems a very simple function.


It's because the intr_restore() re-enables interrupts and the resulting time
spent executing the handlers for any pending interrupts are attributed to
spinlock_exit().


This is one of many defects that are not present in high resolution
kernel profiling (kgmon -B instead of kgmon -b; availaible on amd64
and i386).  However, high resolution kernel profiling doesn't work
right with SMP, and was completely broken by gcc-4.  Ordinary profiling
was less completely broken by gcc-4, and you can recover the old
behaviour by turning off new optimizations (mainly -funit-at-a-time
and/or -finline-functions-called-once and or all of -O2).

Bruce___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]

Re: kernel profiling: spinlock_exit consumes 36% CPU time.

2008-10-07 Thread Jeremy Chadwick
On Tue, Oct 07, 2008 at 07:44:00PM +0800,  wrote:
 Hi, folks,
  
 I did kernel profiling when a single thread client sends UDP packets to a
 single thread server on the same machine.
  
 In the output kernel profile, the first few kernel functions that consumes
 the most CPU time are listed below:
  
 granularity: each sample hit covers 16 byte(s) for 0.01% of 25.68 seconds
 
   %   cumulative   self  self total
  time   seconds   secondscalls  ms/call  ms/call  name
  42.4  10.8810.880  100.00%   __mcount [1]
  36.1  20.14 9.26 17937541 0.00 0.00  spinlock_exit [4]
   4.2  21.22 1.08  3145728 0.00 0.00  in_cksum_skip [40]
   1.8  21.68 0.45  7351987 0.00 0.00  generic_copyin [43]
   1.1  21.96 0.29  3146028 0.00 0.00  generic_copyout [48]
   1.0  22.21 0.24  2108904 0.00 0.00  Xint0x80_syscall [3]
   0.8  22.42 0.21  6292131 0.00 0.00  uma_zalloc_arg [46]
   0.8  22.62 0.20  1048576 0.00 0.00  soreceive_generic [9]
   0.7  22.80 0.19  3145852 0.00 0.00  free [47]
   0.6  22.96 0.15  6292172 0.00 0.00  uma_zfree_arg [52]
   0.6  23.10 0.14  5243413 0.00 0.00  generic_bzero [53]
   0.5  23.23 0.14  1048581 0.00 0.00  ip_output [23]
   0.5  23.36 0.13  4221855 0.00 0.00  generic_bcopy [57]
   0.4  23.47 0.11 36865859 0.00 0.00  critical_enter [61]
   0.4  23.57 0.10 36865859 0.00 0.00  critical_exit [62]
   0.4  23.67 0.09 17937541 0.00 0.00  spinlock_enter [63]
   0.4  23.76 0.09  1048582 0.00 0.00  udp_input [21]
   0.3  23.85 0.09  2108904 0.00 0.00  syscall [5]
   0.3  23.93 0.08  1048587 0.00 0.00  ip_input [20]
   0.3  24.00 0.07  2097156 0.00 0.00  getsock [65]
   0.3  24.07 0.07  1048576 0.00 0.00  udp_send [22]
  
 It is very strange that spinlock_exit consumes over 36% CPU time while it
 seems a very simple function.
  
 For clarity, I paste the code of spinlock_exit here:
  
 void
 spinlock_exit(void)
 {
  struct thread *td;
  
  td = curthread;
  critical_exit();
  td-td_md.md_spinlock_count--;
  if (td-td_md.md_spinlock_count == 0)
   intr_restore(td-td_md.md_saved_flags);
 }
 
 Since critical_exit consumes only 0.4% CPU time, does this mean the rest of
 spinlock_exit consume ~36% CPU time?
  
 Am I missing something? Could anybody help me understand this? Many thanks.
  
 BTW, the kernel is compiled with SMP and PREEMPTION disabled. The scheduler
 is ULE.

What FreeBSD version, and what build date of the kernel?

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: kernel profiling: spinlock_exit consumes 36% CPU time.

2008-10-07 Thread Julian Elischer
邱剑 wrote:
 Hi, folks,
[...]

spinlocks disable interrupts so the profiling interrupt is held off
from the
moment that the spinlock is entered to the moment it is exited, and all
of that time is attributed to spinlock_exit().

so that this tells you that 3% of your time is spent under spinlocks
which is  a  lot.

as others have asked, what version?

you should look up lock profiling to see WHICH lock
is teh ine in question.

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: kernel profiling: spinlock_exit consumes 36% CPU time.

2008-10-07 Thread John Baldwin
On Tuesday 07 October 2008 07:44:00 am 邱剑 wrote:
 Hi, folks,
  
 I did kernel profiling when a single thread client sends UDP packets to a
 single thread server on the same machine.
  
 In the output kernel profile, the first few kernel functions that consumes
 the most CPU time are listed below:
  
 granularity: each sample hit covers 16 byte(s) for 0.01% of 25.68 seconds
 
   %   cumulative   self  self total
  time   seconds   secondscalls  ms/call  ms/call  name
  42.4  10.8810.880  100.00%   __mcount [1]
  36.1  20.14 9.26 17937541 0.00 0.00  spinlock_exit [4]
   4.2  21.22 1.08  3145728 0.00 0.00  in_cksum_skip [40]
   1.8  21.68 0.45  7351987 0.00 0.00  generic_copyin [43]
   1.1  21.96 0.29  3146028 0.00 0.00  generic_copyout [48]
   1.0  22.21 0.24  2108904 0.00 0.00  Xint0x80_syscall [3]
   0.8  22.42 0.21  6292131 0.00 0.00  uma_zalloc_arg [46]
   0.8  22.62 0.20  1048576 0.00 0.00  soreceive_generic [9]

 It is very strange that spinlock_exit consumes over 36% CPU time while it
 seems a very simple function.

It's because the intr_restore() re-enables interrupts and the resulting time 
spent executing the handlers for any pending interrupts are attributed to 
spinlock_exit().

-- 
John Baldwin
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]