:As the original author of the cil/cml code I can say I was glad to see that 
:had finally put it to rest. It was a desperate hack made in an attempt to pinch
:a little more performance out of the paradigm without dealing with the whole
:spl() problem set.  I would have done it myself if life hadn't taken me down a 
:path where I have too little time for these things...
:I've been playing with test buildworlds on my server and have concluded that
:we are currently kernel (big giant lock?) bound.  In my tests 3 CPUs actually
:complete buildworld faster than 4.  The major solutions to SMP in the future
:will come from:
: 1: pushdown of the BGL into the read/write routines.
: 2: kernel threads.
: 3: replacement of spl() with mutex() type protection of data regions.
:I am guessing that little of the above will be MFC'd into 4.0.  So the issue
:of the current SMP patch set should be based on its merits alone.  I would
:agree that they in themselves are worthy of MFCing.  Let's just not kid 
:ourselves about major future improvements of SMP in 4.0; the biggies I
:enumerate above just won't happen there.
:Steve Passe    | powered by 
:[EMAIL PROTECTED]    |            Symmetric MultiProcessor FreeBSD

    This is my feeling too.  I think there is a very good chance that we
    would be able to MFC SMP work for the Network stack and VM subsystem,
    for example.  The SMP work under 4.x and 5.x wound up getting stalled in 
    part because there were three or four different versions of each core
    assembly module in #ifdef's to handle all the different option 
    combinations, and it got to the point where I think only three or four 
    people really knew what was going on in there.

    From an algorithmic and testing point of view, what the cml and cil
    changes taught us is that segregating the spin locks doesn't improve
    performance because programs tend to repeat the use of a particular
    entry point over and over again (like calling read() or write()), and
    thus wind up competing for the same spin lock anyway.  Basically it
    told us that region-based locks don't produce significant performance
    improvements and we have to get rid of the high level locks entirely to 
    make things reasonably efficient.
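
    To make the contention argument concrete, here is a minimal user-space
    sketch, not kernel code.  The region and entry-point names are
    hypothetical and only illustrate the idea: if every entry point maps
    to one region lock, a workload that keeps calling read() and write()
    only ever touches a single lock, no matter how many regions exist.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel regions, each imagined to have its own spin lock. */
enum region { REGION_IO, REGION_VM, REGION_NET, REGION_COUNT };

/* Hypothetical entry points; the names are illustrative only. */
enum entry { ENTRY_READ, ENTRY_WRITE, ENTRY_MMAP, ENTRY_SEND, ENTRY_COUNT };

/* Assumed mapping from entry point to the region lock it must take. */
static const enum region entry_region[ENTRY_COUNT] = {
    [ENTRY_READ]  = REGION_IO,
    [ENTRY_WRITE] = REGION_IO,
    [ENTRY_MMAP]  = REGION_VM,
    [ENTRY_SEND]  = REGION_NET,
};

/* Count how many distinct region locks a trace of calls would touch. */
static size_t distinct_locks(const enum entry *trace, size_t n)
{
    int seen[REGION_COUNT] = {0};
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        enum region r = entry_region[trace[i]];
        if (!seen[r]) {
            seen[r] = 1;
            count++;
        }
    }
    return count;
}
```

    With a trace made up only of ENTRY_READ and ENTRY_WRITE calls,
    distinct_locks() reports one lock, so every CPU running that workload
    queues on the same spin lock and the other region locks buy nothing.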

    The network stack and VM system are a particularly good fit to this
    because locking can occur at a much finer grain.  vm_page_t/vm_object
    lookups can almost run MP-safe now with only the addition of a 
    shared lock at the vm_object and VM page cache level to allow lookups.
    Pages are already individually locked and that mechanism need not
    change.  There are a bunch of areas where the VM system is running at
    splvm() in order to be able to look up pages without busying them which
    would have to change, but that isn't a huge deal.

    The network stack is equally easy to make MP-safe.  In this case we
    have a shared lock to look up sockets for host/port combinations and
    then fine-grained exclusive locks within those sockets.  Route table
    and other high level operations could in fact remain BGL'd without
    interfering with the network stack because the network stack *already*
    caches route table lookups.

    The KISS principle applies to SMP big-time.

                                        Matthew Dillon 
                                        <[EMAIL PROTECTED]>
