Re: New CAM locking preview

2013-08-16 Thread Alexander Motin

On 16.08.2013 04:12, Adrian Chadd wrote:

Cool!

I assume you've run this with full witness debugging enabled, to catch
lock ordering issues?


Of course! I've switched between debug and normal builds countless times 
to test both correctness and performance after each change. But more 
external tests are welcome.



This is great. I look forward to per-CPU, pinned, completion threads
that I can do interesting things with (like schedule aio-sendfile
completions..)

--
Alexander Motin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: New CAM locking preview

2013-08-16 Thread Jeremie Le Hen
Excellent, thanks to both you and iXsystems.  I'm eager to see
everything merged to -CURRENT before the code slush ;).

-- 
Jeremie Le Hen

Scientists say the world is made up of Protons, Neutrons and Electrons.
They forgot to mention Morons.


New CAM locking preview

2013-08-15 Thread Alexander Motin

Hi.

Over the last few weeks I've made substantial progress on my CAM locking 
work. In fact, at this point I think I've tied up all the loose ends well 
enough to consider the new design viable and the implementation worth 
further testing and bug fixing. So I would like to ask for review of my 
work from everybody who is interested in CAM internals.


In short, my idea was to split the single per-SIM lock, which creates huge 
congestion under high IOPS, into several smaller ones. The design I've 
finally chosen includes the following locks:
 1) New per-device (per-LUN) locks to protect the state of the devices and 
their respective periphs. In most cases peripheral drivers just use that 
lock instead of the SIM lock used before, so code modification is minimal 
and straightforward.
 2) New per-target lock to protect the list of LUNs fetched from the device.
 3) Old single per-SIM lock to protect SIM driver internals, but only 
that. No parts of CAM itself use that lock. Keeping it for SIMs allows 
keeping API and hopefully ABI compatibility. Reducing its scope reduces 
congestion.
 4) New per-SIM lock to protect SIM and device command queues. That 
allows executing queued commands from any context, independently of other 
locks. This lock also serializes access to the sim_action() method for 
most commands, which mostly avoids busy spinning on SIM lock 
collisions.
 5) New per-bus locks to protect target, device and periph reference 
counters. They allow creating and destroying paths independently of other 
locks, in any possible context.
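
As a rough illustration only, the lock layout above can be modelled in
userland Python. This is not the CAM code (which is kernel C): the class
names, the fields and the helpers acquire_path(), release_path() and
enqueue() are all hypothetical, and threading.Lock merely stands in for a
kernel mutex. The point is just to show which lock guards which data.

```python
import threading
from collections import deque


class Device:
    """One LUN. Lock 1) guards device and periph state."""

    def __init__(self):
        self.lock = threading.Lock()
        self.state = "idle"
        self.refcount = 0  # guarded by the bus lock 5), not by self.lock


class Target:
    """Lock 2) guards the list of LUNs fetched from the device."""

    def __init__(self):
        self.lun_lock = threading.Lock()
        self.luns = {}  # LUN id -> Device


class Sim:
    def __init__(self):
        self.sim_lock = threading.Lock()    # 3) SIM driver internals only
        self.queue_lock = threading.Lock()  # 4) SIM and device command queues
        self.queue = deque()


class Bus:
    def __init__(self, sim):
        self.sim = sim
        self.ref_lock = threading.Lock()    # 5) reference counters
        self.targets = {}  # target id -> Target


def acquire_path(bus, device):
    # Only the bus lock is needed to take a reference.
    with bus.ref_lock:
        device.refcount += 1
        return device.refcount


def release_path(bus, device):
    with bus.ref_lock:
        device.refcount -= 1
        return device.refcount


def enqueue(sim, ccb):
    # Only the queue lock is needed; the SIM driver lock stays free.
    with sim.queue_lock:
        sim.queue.append(ccb)
        return len(sim.queue)
```

In this model, taking a path reference or queuing a command never touches
the device lock or the SIM driver lock, which is the property items 4) and
5) describe.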


The numbers above also define the intended lock ordering: while holding 
per-device lock 1) it is allowed to request SIM lock 3), but not the 
reverse. Cases where the opposite is required (command completions and 
async events) are handled by queuing the events to several completion 
threads. The rest of the locks are self-contained and are not really 
expected to nest.
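
The ordering rule and the completion hand-off can be sketched the same way.
Again this is a userland Python model under assumptions, not the actual
implementation: the Model class and its method names are hypothetical, a
single worker thread stands in for the several completion threads, and
queue.Queue stands in for whatever hand-off mechanism the kernel code uses.

```python
import queue
import threading


class Model:
    def __init__(self):
        self.device_lock = threading.Lock()  # lock 1)
        self.sim_lock = threading.Lock()     # lock 3)
        self.pending = []     # device state, guarded by device_lock
        self.in_flight = []   # SIM driver state, guarded by sim_lock
        self.completed = []   # device state, guarded by device_lock
        self.done_queue = queue.Queue()  # hand-off to the completion thread

    def submit(self, ccb):
        # Allowed order: device lock first, then SIM lock.
        with self.device_lock:
            self.pending.append(ccb)
            with self.sim_lock:
                self.in_flight.append(ccb)

    def sim_interrupt(self):
        # Runs with the SIM lock held, so it must not take the device lock.
        # It only queues the completions for later processing.
        with self.sim_lock:
            while self.in_flight:
                self.done_queue.put(self.in_flight.pop(0))

    def completion_worker(self):
        # Takes the device lock while holding no SIM lock.
        while True:
            ccb = self.done_queue.get()
            if ccb is None:  # shutdown marker
                return
            with self.device_lock:
                self.pending.remove(ccb)
                self.completed.append(ccb)

    def run(self, ccbs):
        worker = threading.Thread(target=self.completion_worker)
        worker.start()
        for ccb in ccbs:
            self.submit(ccb)
        self.sim_interrupt()
        self.done_queue.put(None)
        worker.join()
        return list(self.completed)
```

No code path here ever requests the device lock while holding the SIM lock,
so the two locks are always taken in the same order and cannot deadlock
against each other.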


All these changes combined with GEOM direct dispatch (it will be the next 
separate project) allow doubling system performance in disk I/O 
microbenchmarks compared to present head, as was announced at the 
2013-05 DevSummit: http://people.freebsd.org/~mav/camlock.pdf . Tests 
without the GEOM changes also show a performance improvement, but it is 
limited to the level of 5-20% by a heavy bottleneck at the GEOM 
g_up/g_down threads.


Project sources can be found in the SVN projects/camlock branch: 
http://svnweb.freebsd.org/base/projects/camlock/ . Many early changes 
from that branch are already integrated into head, so to simplify review 
the remaining patches for changes before r254059 were manually remade and 
can be found here: http://people.freebsd.org/~mav/camlock_patches/ .


These changes do not require controller driver modifications, keeping 
KPIs and hopefully KBIs intact, but create a base for later work to use 
the multiqueue capabilities of new controllers.


This work is sponsored by iXsystems, Inc.

--
Alexander Motin


Re: New CAM locking preview

2013-08-15 Thread Hans Petter Selasky

Sounds very good! I assume you have tested USB mass storage device 
unplug during various file system operations?


--HPS



Re: New CAM locking preview

2013-08-15 Thread Steven Hartland


- Original Message - 
From: Hans Petter Selasky h...@bitfrost.no

To: Alexander Motin m...@freebsd.org
Cc: Scott Long sco...@freebsd.org; Jeff Roberson j...@freebsd.org; ken k...@freebsd.org; freebsd-hackers@FreeBSD.org; 
FreeBSD SCSI freebsd-s...@freebsd.org; Steven Hartland s...@freebsd.org; Justin T. Gibbs gi...@freebsd.org

Sent: Friday, August 16, 2013 1:38 AM
Subject: Re: New CAM locking preview



Sounds very good! I assume you have tested USB mass storage device unplug 
during various file system operations?


It does indeed sound like some very good work, thanks Alexander!

@Hans, if USB mass storage device unplug is something important to you,
then might I suggest it's a good idea to grab the patches, run your own
tests and report any issues you might find, as I'm sure this would be
most appreciated :)

   Regards
   Steve 







Re: New CAM locking preview

2013-08-15 Thread Adrian Chadd
Cool!

I assume you've run this with full witness debugging enabled, to catch lock
ordering issues?

This is great. I look forward to per-CPU, pinned, completion threads that I
can do interesting things with (like schedule aio-sendfile completions..)


-adrian


