Hello,
We have deployed a SIP server application on a multi-core SPARC machine. The
number of application processes matches the number of CPUs. All processes wait
on the same socket and do the same thing in a loop: call a blocking recvfrom()
and then process the received message. The transport is UDP and the OS is
Solaris 10.
What we observed is that when the number of CPUs is set larger than 16, severe
performance degradation occurs.
We used lockstat to profile kernel lock contention and cross-referenced the
OpenSolaris source code. What we found is that:
1) The function mutex_vector_enter accounts for 50% of the CPU time.
2) Most calls to mutex_vector_enter come from cv_wait_sig, and the call
graph is: recvfrom -> syscall_trap32 -> recvfrom -> recvit ->
sotpi_recvmsg -> so_lock_read_intr -> cv_wait_sig.
3) so_lock_read_intr appears to serialize kstrgetmsg, which copies data from
kernel space to the user buffer; thus only one reader can perform this copy at
a time. This serialization seems unnecessary for simple UDP processing: the
kernel only needs to hold the lock while dequeuing the packet from the socket
queue, not while copying the data from kernel to user space, which is what
Linux does.
Our question is: is there an alternative path for recvfrom that is simpler
than the current TPI implementation and does not hold the lock while copying
data from kernel to user space? We are hoping for a more direct channel
between the user application and the network stack for UDP, or any patches
that provide one.
Can anyone with knowledge of this area give us a hand? Thanks!
Cheers
Yours
Jia
--------------------------------------------------------------------------
Jia Zou, PhD candidate
Lab of Data Security
Dept. of Computer Science and Technology,
Tsinghua University,
Beijing, 100084, P.R. China
Tel: 86-10-6279 1525
E-mail: [EMAIL PROTECTED]
_______________________________________________
networking-discuss mailing list
[email protected]