Brian:

Some of what you say about LiS efficiencies is good advice.

Much is speculation not supported by measurements.

If you use oprofile to monitor a high-message-rate test involving LiS, you find that LiS spends most of its time in its locking routines.  I now have lock contention tracking code in LiS so that I can see which locks are the hot spots.  The lock with the most contention instances is lis_qhead_lock, accounting for 57% of all lock contention.  Next is lis_msg_lock at about 23%.
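The contention-tracking idea can be sketched in plain C as a userspace model.  All the names here (cc_lock, cc_lock_acquire, and so on) are illustrative, not the actual LiS statistics code -- the point is just to count, per lock, how often a taker finds the lock already held:

```c
#include <stdatomic.h>

/* Userspace sketch of a contention-counting spinlock (hypothetical
 * names; the real LiS lock statistics code is more involved). */
struct cc_lock {
    atomic_flag   locked;
    atomic_ulong  contended;     /* times a taker found the lock held */
    atomic_ulong  acquisitions;  /* total successful acquisitions     */
};

static inline void cc_lock_init(struct cc_lock *l)
{
    atomic_flag_clear(&l->locked);
    atomic_store(&l->contended, 0);
    atomic_store(&l->acquisitions, 0);
}

static inline void cc_lock_acquire(struct cc_lock *l)
{
    /* The first try is free; if it fails, record one contention
     * event and then spin until the lock is released. */
    if (atomic_flag_test_and_set(&l->locked)) {
        atomic_fetch_add(&l->contended, 1);
        while (atomic_flag_test_and_set(&l->locked))
            ;  /* spin */
    }
    atomic_fetch_add(&l->acquisitions, 1);
}

static inline void cc_lock_release(struct cc_lock *l)
{
    atomic_flag_clear(&l->locked);
}
```

Dividing each lock's contended count by the total contention events across all locks is what yields per-lock percentages like the 57% and 23% figures above.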

I have done a fair amount of work just in the last day or so to get lis_qhead_lock contention DOWN to that level.

My tests are also now showing 73% of the time in the kernel and 15% in LiS.  Most of the kernel time is in the idle routine.

Some of the above may be an artifact of my particular test environment -- three machines using 100MB LAN segments to run a data pipeline, rather than a CPU-contained test.

But the measurements just don't support the assertion that pointer checking and the like is what slows down the execution.

Eugene will probably get different results with his tests because they appear to be more CPU bound than mine.  And his single CPU environment should not show the locking routines as the hot spots.

But I am not going to go "fix everywhere" in LiS based on unsupported analysis and/or conjecture.  Let's find the actual hot spots and work on them.  If Eugene gets 40 routines each using 2% of the time then the "everywhere" hypothesis is supported.  If he gets two or three routines accounting for 80% of the time then we will have found "hot spots".

-- Dave
 
At 04:33 PM 4/9/2004, Brian F. G. Bidulock wrote:

Dave,

I'm not surprised by 10:1 on the inet driver.  STREAMS over sockets is
upside down from usual and is horribly inefficient.  On the other hand,
benchmark tests between our SCTP STREAMS driver running on LiS and our
SCTP NET4 socket driver running inside the kernel indicate that the LiS
version (without performance enhancements) of SVR 4.2 STREAMS
has superior performance over the Linux NET4 version of BSD sockets.

But I think everyone knew that 14 years ago.  The sockets *interface*
is kinda nice, but the guts of a BSD sockets stack sucks in performance
in comparison to STREAMS.

So the 10:1 in the inet driver is just an impedance mismatch (a lot of
extra copying, queueing, locking, state tracking, error checking, bounds
checking, scheduling, wait queues, ...).

As for Linux SVR3-style non-STREAMS pipes vs. LiS SVR 4.2 bidirectional
STREAMS-based pipes: the Linux code uses one memory page as a buffer and one
page of source for the write routine.  But it won't pass a file descriptor,
and fattach and connld are out of the question.  But the throughput should
still be better on LiS.

Check the data path.  I see several things.  lis_strputpmsg() should
check for flow control before allocating a buffer and copying data.  A lot of
CPU time is wasted whenever EAGAIN is returned.
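The ordering point can be modeled in a few lines of userspace C.  Everything here is hypothetical (can_put() stands in for the STREAMS canputnext() flow-control test; neither function is the real LiS code) -- it just shows why testing flow control first avoids allocating and copying a message only to free it again and return EAGAIN:

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Userspace model of the ordering issue; hypothetical names. */
struct msgb { char *data; size_t len; };

static int downstream_full;                   /* models flow-control state */
static int can_put(void) { return !downstream_full; }

/* Wasteful order: allocate and copy first, discover flow control last. */
static int put_alloc_first(const char *buf, size_t len)
{
    struct msgb *m = malloc(sizeof *m);
    if (!m) return ENOMEM;
    m->data = malloc(len);
    if (!m->data) { free(m); return ENOMEM; }
    memcpy(m->data, buf, len);
    m->len = len;
    if (!can_put()) {            /* too late: all the work above is wasted */
        free(m->data);
        free(m);
        return EAGAIN;
    }
    free(m->data); free(m);      /* stand-in for passing the message on   */
    return 0;
}

/* Better order: test flow control before doing any work. */
static int put_check_first(const char *buf, size_t len)
{
    if (!can_put())
        return EAGAIN;           /* nothing allocated, nothing copied */
    return put_alloc_first(buf, len);
}
```

In the congested case the second version costs one flag test, while the first costs two allocations, a copy, and two frees for the same EAGAIN.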

lis_strwrite() checks flow control before allocating the message block.
But then it uses PUTNEXT() instead of lis_putnext().  I don't know
why running the queues is correct for lis_strwrite() but not for lis_strputpmsg(),
or vice versa.

With lis_strwrite() the queues will get marked to run.

But look at stuff like lis_head_get_fcn() and lis_head_put_fcn() (head.c about
line 500 to line 720).

In lis_head_get_fcn() an lis_atomic_inc is performed on a freshly allocated
stream head structure that no one else knows about yet.  Is it necessary to lock
the bus just to initialize the value?
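The distinction can be shown with C11 atomics (an illustrative userspace sketch, not the LiS code; struct head and the function names are made up).  On a freshly allocated object that no other thread can reach yet, a plain initializing store is enough; the locked read-modify-write buys nothing:

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical model of the point above. */
struct head { atomic_int refcnt; };

/* What the text describes: a locked RMW on an object nobody shares. */
static struct head *head_alloc_atomic_inc(void)
{
    struct head *hd = malloc(sizeof *hd);
    if (!hd) return NULL;
    atomic_init(&hd->refcnt, 0);
    atomic_fetch_add(&hd->refcnt, 1);  /* locked bus cycle, for nothing:
                                          no other thread can see hd yet */
    return hd;
}

/* Sufficient alternative: a plain initializing store. */
static struct head *head_alloc_plain_init(void)
{
    struct head *hd = malloc(sizeof *hd);
    if (!hd) return NULL;
    atomic_init(&hd->refcnt, 1);       /* object not yet published */
    return hd;
}
```

Atomicity only matters once the pointer has been published where another CPU can reach it; before that, initialization is ordinary single-threaded code.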

lis_head_put_fcn() is the deepest and most contorted atomic_dec_and_test that
I have seen.  IRQ-disabling spin locks are taken around a conditional
atomic_dec().  There is also a race in the function: when it is called by two
threads, the first thread to exit can return NULL and the second one return
non-NULL.
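For contrast, the conventional put side of a refcount is one atomic decrement-and-test, with no spinlock or IRQ disabling, and it returns nothing -- every caller must treat the object as gone once it has dropped its reference, which is exactly what makes the NULL/non-NULL return race impossible.  A userspace sketch with C11 atomics (hypothetical names, not the LiS code; the destroyed counter exists only so the behavior can be observed):

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stream-head model. */
struct shead { atomic_int refcnt; };

static int destroyed;  /* instrumentation for this sketch only */

static void shead_destroy(struct shead *hd)
{
    destroyed++;
    free(hd);
}

/* Conventional put: one atomic RMW decides the release race.
 * atomic_fetch_sub returns the value *before* the subtraction, so
 * exactly one caller observes the transition 1 -> 0 and frees the
 * object.  No lock is needed around the decrement, and nothing is
 * returned: after calling put, the caller's pointer is dead. */
static void shead_put(struct shead *hd)
{
    if (atomic_fetch_sub(&hd->refcnt, 1) == 1)
        shead_destroy(hd);  /* we held the last reference */
}
```

Returning the pointer from a put routine, as lis_head_put_fcn() does, is what opens the race described above; a void return forecloses it by construction.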

So the thing in the data path that slows things down is the paraphernalia
of pointer checking and rechecking (walk through each function and count
how many times q and hd are checked against NULL or QMAGIC) and the
horrendous lis_ locks and abuse of kernel lock functions like the two
functions above.

Profiling won't help you much when lis_ functions waste time everywhere.
You will not see the hot spot because it is hot everywhere.

Well, I'm back to why I started LfS in the first place.  So I'll shut up now...

--brian


On Fri, 09 Apr 2004, Dave Grothe wrote:

>
>    It has always been known (I think) that there is a huge difference in
>    speed between Linux native pipes and STREAMS-based pipes via LiS.
>    I have seen similar results sending UDP datagrams through the Linux
>    loopback driver using native sockets vs the inet driver that Brian
>    publishes.  I get about a 10:1 ratio between sockets and streams.
>    Nothing has ever jumped out at me when I run oprofile during these
>    tests -- except lock contention.  Which is what I am working on now.
>    -- Dave
>    At 11:58 AM 4/9/2004, Eugene LiS User wrote:
>
>      To exclude my module from the picture I have decided to compare
>      data pumping rates for the pipe interface.
>      I googled the pipespeed2 program, downloaded it, and with minor
>      changes got it compiled in 2 versions, with and without LiS.
>      The results are as follows:
>      # ./ps2 2000k 4k
>      ps2 -x 1 2048000 4096    9.163 Seconds --  915.493 MB/sec
>      # ./ps2lis 2000k 4k
>      ps2lis -x 1 2048000 4096   45.602 Seconds --  183.953 MB/sec
>      Clearly there is some overhead in the LiS version of a pipe.
>      [Hopefully [for my module] that overhead is not only pipe related]
>      The program is attached.
>      Compiling with LiS:
>      # cc -I/usr/src/LiS/include -L /usr/src/LiS/libc -lLiS -o ps2lis pipespeed2.c
>      Compiling without LiS:
>      # cc -o ps2 pipespeed2.c

>


--
Brian F. G. Bidulock    | The reasonable man adapts himself to the |
[EMAIL PROTECTED]    | world; the unreasonable one persists in  |
http://www.openss7.org/ | trying  to adapt the  world  to himself. |
                        | Therefore  all  progress  depends on the |
                        | unreasonable man. -- George Bernard Shaw |
_______________________________________________
Linux-streams mailing list
[EMAIL PROTECTED]
http://gsyc.escet.urjc.es/mailman/listinfo/linux-streams


