On 16.08.2013 04:12, Adrian Chadd wrote:
Cool!

I assume you've run this with full witness debugging enabled, to catch
lock ordering issues?

Of course! I've switched between debug and normal builds countless times to test both correctness and performance after each change. But more external testing is welcome.

This is great. I look forward to per-CPU, pinned, completion threads
that I can do interesting things with (like schedule aio-sendfile
completions..)

On 15 August 2013 14:40, Alexander Motin <m...@freebsd.org> wrote:

    Hi.

    Over the last few weeks I've made substantial progress on my CAM
    locking work. In fact, at this point I think I've tied up all the
    loose ends well enough to consider the new design viable and the
    implementation worth further testing and bug fixing. So I would like
    to ask for review of my work from everybody who is interested in CAM
    internals.

    In short, my idea was to split the single per-SIM lock, which creates
    huge congestion under high IOPS, into several smaller ones. The
    design I've finally chosen includes the following locks:
      1) New per-device (per-LUN) locks to protect the state of the
    devices and their respective periphs. In most cases peripheral
    drivers just use that lock instead of the SIM lock used before, so
    code modification is minimal and straightforward.
      2) New per-target lock to protect the list of LUNs fetched from the
    device.
      3) Old single per-SIM lock to protect SIM driver internals, but
    only that. No parts of CAM itself use that lock. Keeping it for SIMs
    allows keeping API and hopefully ABI compatibility. Reducing its
    scope reduces congestion.
      4) New per-SIM lock to protect SIM and device command queues. That
    allows executing queued commands from any context, independently of
    other locks. This lock also serializes access to the sim_action()
    method for most commands, which mostly avoids busy spinning on
    SIM lock collisions.
      5) New per-bus locks to protect target, device and periph
    reference counters. This allows creating and destroying paths
    independently of other locks, in any possible context.
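
For readers who want a picture of the granularity, here is a rough
user-space sketch of the five locks listed above, using pthread mutexes
in place of kernel mutexes. All struct, field and function names are
made up for illustration and are not the names used in the camlock
branch; which object each lock actually lives in there is also an
assumption of this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical layout: one mutex per item in the list above. */
struct device {
    pthread_mutex_t dev_lock;    /* 1) device and periph state */
    int state;
};

struct target {
    pthread_mutex_t luns_lock;   /* 2) list of LUNs of this target */
    int nluns;
};

struct sim {
    pthread_mutex_t sim_lock;    /* 3) SIM driver internals only */
    pthread_mutex_t queue_lock;  /* 4) SIM and device command queues */
    int queued;
};

struct bus {
    pthread_mutex_t ref_lock;    /* 5) target/device/periph refcounts */
    int refs;
};

/* Periph-style state change: only the per-device lock is taken. */
void device_set_state(struct device *d, int state)
{
    assert(d != NULL);
    pthread_mutex_lock(&d->dev_lock);
    d->state = state;
    pthread_mutex_unlock(&d->dev_lock);
}

/* Queue a command: only the queue lock is taken, not the SIM lock. */
int sim_enqueue(struct sim *s)
{
    int n;

    assert(s != NULL);
    pthread_mutex_lock(&s->queue_lock);
    n = ++s->queued;
    pthread_mutex_unlock(&s->queue_lock);
    return n;
}

/* Take a reference: only the per-bus reference lock is taken. */
int bus_acquire(struct bus *b)
{
    int r;

    assert(b != NULL);
    pthread_mutex_lock(&b->ref_lock);
    r = ++b->refs;
    pthread_mutex_unlock(&b->ref_lock);
    return r;
}
```

The point of the sketch is only that work on one device, on the command
queue, or on reference counters does not have to wait for a lock held on
behalf of another device.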

    The numbers above also define the intended lock ordering: while
    holding the per-device lock 1) it is allowed to acquire the SIM lock
    3), but not the reverse. Cases where the opposite is required
    (command completions and async events) are handled by queuing the
    events to several completion threads. The rest of the locks are
    self-contained and are not really meant to be nested.
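
The ordering rule can be sketched in user-space C roughly as follows.
This is an illustration under assumptions, not code from the branch:
the names (submit, sim_done, drain, struct done_queue) are hypothetical,
pthread mutexes stand in for kernel mutexes, and drain() is called
directly here where the real design would run it from a completion
thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct device {
    pthread_mutex_t lock;        /* lock 1): per-device state */
    int completed;               /* completions processed so far */
};

struct sim {
    pthread_mutex_t lock;        /* lock 3): SIM driver internals */
    int submitted;
};

struct completion {
    struct device *dev;
    struct completion *next;
};

struct done_queue {
    pthread_mutex_t lock;        /* leaf lock, protects the list only */
    struct completion *head, *tail;
};

/* Submission: device lock first, then SIM lock (the allowed order). */
void submit(struct device *dev, struct sim *sim)
{
    pthread_mutex_lock(&dev->lock);
    pthread_mutex_lock(&sim->lock);
    sim->submitted++;
    pthread_mutex_unlock(&sim->lock);
    pthread_mutex_unlock(&dev->lock);
}

/*
 * Completion, possibly called with the SIM lock held. Taking the device
 * lock here would invert the order, so the event is only queued.
 */
void sim_done(struct done_queue *q, struct completion *c)
{
    assert(q != NULL && c != NULL);
    c->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL)
        q->tail->next = c;
    else
        q->head = c;
    q->tail = c;
    pthread_mutex_unlock(&q->lock);
}

/*
 * Work a completion thread would do: no SIM lock is held here, so it is
 * safe to take the device lock. Returns the number of events processed.
 */
int drain(struct done_queue *q)
{
    int n = 0;

    for (;;) {
        pthread_mutex_lock(&q->lock);
        struct completion *c = q->head;
        if (c != NULL) {
            q->head = c->next;
            if (q->head == NULL)
                q->tail = NULL;
        }
        pthread_mutex_unlock(&q->lock);
        if (c == NULL)
            break;
        pthread_mutex_lock(&c->dev->lock);
        c->dev->completed++;
        pthread_mutex_unlock(&c->dev->lock);
        n++;
    }
    return n;
}
```

A caller would invoke sim_done() from the SIM side and later let a
separate context call drain(), so the device lock is never requested
while the SIM lock is held.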

    All these changes combined with GEOM direct dispatch (it will be the
    next, separate project) allow doubling system performance in disk
    I/O microbenchmarks compared to present head, as was announced at
    the 2013-05 DevSummit:
    http://people.freebsd.org/~mav/camlock.pdf . Tests without the GEOM
    changes also show a performance improvement, but it is limited to
    5-20% by the heavy bottleneck at the GEOM g_up/g_down threads.

    Project sources can be found in the SVN projects/camlock branch:
    http://svnweb.freebsd.org/base/projects/camlock/ . Many early
    changes from that branch are already integrated into head, so to
    simplify review the remaining patches for changes before r254059
    were manually remade and can be found here:
    http://people.freebsd.org/~mav/camlock_patches/ .

    These changes do not require controller driver modifications,
    keeping KPIs and hopefully KBIs intact, but create a base for later
    work to use the multiqueue capabilities of new controllers.

    This work is sponsored by iXsystems, Inc.


--
Alexander Motin
