Re: svn commit: r291716 - in head: share/man/man4 sys/cam sys/cam/ata sys/cam/scsi sys/dev/md sys/geom sys/kern sys/pc98/include sys/sys usr.sbin usr.sbin/camdd

2016-01-12 Thread Roger Pau Monné
On 11/01/16 at 21:40, Colin Percival wrote:
> On 01/11/16 11:52, Kenneth D. Merry wrote:
>> On Mon, Jan 11, 2016 at 18:29:22 +0100, Roger Pau Monné wrote:
>>> The following patch solves the problem AFAICT, and I would like to 
>>> commit it ASAP:
>>
>> I think this should be fine.
> 
> In light of the "ASAP" and the hour in Roger's part of the world (and the
> fact that this was obstructing other work I want to do today) I committed
> this fix as r293698 after experimental confirmation that it fixes what I
> was seeing.
> 
> Thank you both for the quick investigation!

Thanks both for the fast response, let's hope that this doesn't
introduce any other regressions :).

Roger.

___
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"


Re: svn commit: r291716 - in head: share/man/man4 sys/cam sys/cam/ata sys/cam/scsi sys/dev/md sys/geom sys/kern sys/pc98/include sys/sys usr.sbin usr.sbin/camdd

2016-01-11 Thread Colin Percival
On 01/11/16 11:52, Kenneth D. Merry wrote:
> On Mon, Jan 11, 2016 at 18:29:22 +0100, Roger Pau Monné wrote:
>> The following patch solves the problem AFAICT, and I would like to 
>> commit it ASAP:
> 
> I think this should be fine.

In light of the "ASAP" and the hour in Roger's part of the world (and the
fact that this was obstructing other work I want to do today) I committed
this fix as r293698 after experimental confirmation that it fixes what I
was seeing.

Thank you both for the quick investigation!

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


Re: svn commit: r291716 - in head: share/man/man4 sys/cam sys/cam/ata sys/cam/scsi sys/dev/md sys/geom sys/kern sys/pc98/include sys/sys usr.sbin usr.sbin/camdd

2016-01-11 Thread Roger Pau Monné
On 03/12/15 at 21:54, Kenneth D. Merry wrote:
> Author: ken
> Date: Thu Dec  3 20:54:55 2015
> New Revision: 291716
> URL: https://svnweb.freebsd.org/changeset/base/291716
> 
> Log:
>   Add asynchronous command support to the pass(4) driver, and the new
>   camdd(8) utility.
>   
>   CCBs may be queued to the driver via the new CAMIOQUEUE ioctl, and
>   completed CCBs may be retrieved via the CAMIOGET ioctl.  User
>   processes can use poll(2) or kevent(2) to get notification when
>   I/O has completed.
>   
>   While the existing CAMIOCOMMAND blocking ioctl interface only
>   supports user virtual data pointers in a CCB (generally only
>   one per CCB), the new CAMIOQUEUE ioctl supports user virtual and
>   physical address pointers, as well as user virtual and physical
>   scatter/gather lists.  This allows user applications to have more
>   flexibility in their data handling operations.
>   
>   Kernel memory for data transferred via the queued interface is
>   allocated from the zone allocator in MAXPHYS sized chunks, and user
>   data is copied in and out.  This is likely faster than the
>   vmapbuf()/vunmapbuf() method used by the CAMIOCOMMAND ioctl in
>   configurations with many processors (there are more TLB shootdowns
>   caused by the mapping/unmapping operation) but may not be as fast
>   as running with unmapped I/O.
>   
>   The new memory handling model for user requests also allows
>   applications to send CCBs with request sizes that are larger than
>   MAXPHYS.  The pass(4) driver now limits queued requests to the I/O
>   size listed by the SIM driver in the maxio field in the Path
>   Inquiry (XPT_PATH_INQ) CCB.
>   
>   There are some things that would be good to add:
>   
>   1. Come up with a way to do unmapped I/O on multiple buffers.
>  Currently the unmapped I/O interface operates on a struct bio,
>  which includes only one address and length.  It would be nice
>  to be able to send an unmapped scatter/gather list down to
>  busdma.  This would allow eliminating the copy we currently do
>  for data.
>   
>   2. Add an ioctl to list currently outstanding CCBs in the various
>  queues.
>   
>   3. Add an ioctl to cancel a request, or use the XPT_ABORT CCB to do
>  that.
>   
>   4. Test physical address support.  Virtual pointers and scatter
>  gather lists have been tested, but I have not yet tested
>  physical addresses or scatter/gather lists.
>   
>   5. Investigate multiple queue support.  At the moment there is one
>  queue of commands per pass(4) device.  If multiple processes
>  open the device, they will submit I/O into the same queue and
>  get events for the same completions.  This is probably the right
>  model for most applications, but it is something that could be
>  changed later on.
>   
>   Also, add a new utility, camdd(8) that uses the asynchronous pass(4)
>   driver interface.
>   
>   This utility is intended to be a basic data transfer/copy utility,
>   a simple benchmark utility, and an example of how to use the
>   asynchronous pass(4) interface.
>   
>   It can copy data to and from pass(4) devices using any target queue
>   depth, starting offset and blocksize for the input and output devices.
>   It currently only supports SCSI devices, but could be easily extended
>   to support ATA devices.
>   
>   It can also copy data to and from regular files, block devices, tape
>   devices, pipes, stdin, and stdout.  It does not support queueing
>   multiple commands to any of those targets, since it uses the standard
>   read(2)/write(2)/writev(2)/readv(2) system calls.
>   
>   The I/O is done by two threads, one for the reader and one for the
>   writer.  The reader thread sends completed read requests to the
>   writer thread in strictly sequential order, even if they complete
>   out of order.  That could be modified later on for random I/O patterns
>   or slightly out of order I/O.
>   
>   camdd(8) uses kqueue(2)/kevent(2) to get I/O completion events from
>   the pass(4) driver and also to send request notifications internally.
>   
>   For pass(4) devices, camdd(8) uses a single buffer (CAM_DATA_VADDR)
>   per CAM CCB on the reading side, and a scatter/gather list
>   (CAM_DATA_SG) on the writing side.  In addition to testing both
>   interfaces, this makes any potential reblocking of I/O easier.  No
>   data is copied between the reader and the writer, but rather the
>   reader's buffers are split into multiple I/O requests or combined
>   into a single I/O request depending on the input and output blocksize.
>   
>   For the file I/O path, camdd(8) also uses a single buffer (read(2),
>   write(2), pread(2) or pwrite(2)) on reads, and a scatter/gather list
>   (readv(2), writev(2), preadv(2), pwritev(2)) on writes.
>   
>   Things that would be nice to do for camdd(8) eventually:
>   
>   1.  Add support for I/O pattern generation.  Patterns like all
>   zeros, all ones, LBA-based 

Re: svn commit: r291716 - in head: share/man/man4 sys/cam sys/cam/ata sys/cam/scsi sys/dev/md sys/geom sys/kern sys/pc98/include sys/sys usr.sbin usr.sbin/camdd

2016-01-11 Thread Kenneth D. Merry
On Mon, Jan 11, 2016 at 18:29:22 +0100, Roger Pau Monné wrote:
> On 03/12/15 at 21:54, Kenneth D. Merry wrote:
> > Author: ken
> > Date: Thu Dec  3 20:54:55 2015
> > New Revision: 291716
> > URL: https://svnweb.freebsd.org/changeset/base/291716
> > 
> > Log:
> >   Add asynchronous command support to the pass(4) driver, and the new
> >   camdd(8) utility.
> >   
> >   CCBs may be queued to the driver via the new CAMIOQUEUE ioctl, and
> >   completed CCBs may be retrieved via the CAMIOGET ioctl.  User
> >   processes can use poll(2) or kevent(2) to get notification when
> >   I/O has completed.
> >   
> >   While the existing CAMIOCOMMAND blocking ioctl interface only
> >   supports user virtual data pointers in a CCB (generally only
> >   one per CCB), the new CAMIOQUEUE ioctl supports user virtual and
> >   physical address pointers, as well as user virtual and physical
> >   scatter/gather lists.  This allows user applications to have more
> >   flexibility in their data handling operations.
> >   
> >   Kernel memory for data transferred via the queued interface is
> >   allocated from the zone allocator in MAXPHYS sized chunks, and user
> >   data is copied in and out.  This is likely faster than the
> >   vmapbuf()/vunmapbuf() method used by the CAMIOCOMMAND ioctl in
> >   configurations with many processors (there are more TLB shootdowns
> >   caused by the mapping/unmapping operation) but may not be as fast
> >   as running with unmapped I/O.
> >   
> >   The new memory handling model for user requests also allows
> >   applications to send CCBs with request sizes that are larger than
> >   MAXPHYS.  The pass(4) driver now limits queued requests to the I/O
> >   size listed by the SIM driver in the maxio field in the Path
> >   Inquiry (XPT_PATH_INQ) CCB.
> >   
> >   There are some things that would be good to add:
> >   
> >   1. Come up with a way to do unmapped I/O on multiple buffers.
> >  Currently the unmapped I/O interface operates on a struct bio,
> >  which includes only one address and length.  It would be nice
> >  to be able to send an unmapped scatter/gather list down to
> >  busdma.  This would allow eliminating the copy we currently do
> >  for data.
> >   
> >   2. Add an ioctl to list currently outstanding CCBs in the various
> >  queues.
> >   
> >   3. Add an ioctl to cancel a request, or use the XPT_ABORT CCB to do
> >  that.
> >   
> >   4. Test physical address support.  Virtual pointers and scatter
> >  gather lists have been tested, but I have not yet tested
> >  physical addresses or scatter/gather lists.
> >   
> >   5. Investigate multiple queue support.  At the moment there is one
> >  queue of commands per pass(4) device.  If multiple processes
> >  open the device, they will submit I/O into the same queue and
> >  get events for the same completions.  This is probably the right
> >  model for most applications, but it is something that could be
> >  changed later on.
> >   
> >   Also, add a new utility, camdd(8) that uses the asynchronous pass(4)
> >   driver interface.
> >   
> >   This utility is intended to be a basic data transfer/copy utility,
> >   a simple benchmark utility, and an example of how to use the
> >   asynchronous pass(4) interface.
> >   
> >   It can copy data to and from pass(4) devices using any target queue
> >   depth, starting offset and blocksize for the input and output devices.
> >   It currently only supports SCSI devices, but could be easily extended
> >   to support ATA devices.
> >   
> >   It can also copy data to and from regular files, block devices, tape
> >   devices, pipes, stdin, and stdout.  It does not support queueing
> >   multiple commands to any of those targets, since it uses the standard
> >   read(2)/write(2)/writev(2)/readv(2) system calls.
> >   
> >   The I/O is done by two threads, one for the reader and one for the
> >   writer.  The reader thread sends completed read requests to the
> >   writer thread in strictly sequential order, even if they complete
> >   out of order.  That could be modified later on for random I/O patterns
> >   or slightly out of order I/O.
> >   
> >   camdd(8) uses kqueue(2)/kevent(2) to get I/O completion events from
> >   the pass(4) driver and also to send request notifications internally.
> >   
> >   For pass(4) devices, camdd(8) uses a single buffer (CAM_DATA_VADDR)
> >   per CAM CCB on the reading side, and a scatter/gather list
> >   (CAM_DATA_SG) on the writing side.  In addition to testing both
> >   interfaces, this makes any potential reblocking of I/O easier.  No
> >   data is copied between the reader and the writer, but rather the
> >   reader's buffers are split into multiple I/O requests or combined
> >   into a single I/O request depending on the input and output blocksize.
> >   
> >   For the file I/O path, camdd(8) also uses a single buffer (read(2),
> >   write(2), pread(2) or pwrite(2)) on reads, and 

Re: svn commit: r291716 - in head: share/man/man4 sys/cam sys/cam/ata sys/cam/scsi sys/dev/md sys/geom sys/kern sys/pc98/include sys/sys usr.sbin usr.sbin/camdd

2015-12-05 Thread Ravi Pokala
-----Original Message-----


From: "Kenneth D. Merry" <k...@freebsd.org>
Date: 2015-12-04, Friday at 08:32
To: Ravi Pokala <rpok...@mac.com>
Cc: <src-committ...@freebsd.org>, <svn-src-...@freebsd.org>, 
<svn-src-head@freebsd.org>
Subject: Re: svn commit: r291716 - in head: share/man/man4 sys/cam sys/cam/ata 
sys/cam/scsi sys/dev/md sys/geom sys/kern sys/pc98/include sys/sys usr.sbin 
usr.sbin/camdd

>On Thu, Dec 03, 2015 at 23:55:14 -0800, Ravi Pokala wrote:
>>(a) How does that work? That is, how does the argument get to the ioctl 
>>handler in the kernel?
>> 
>
>In sys_ioctl(), in sys/kern/sys_generic.c, the pointer argument ("data") to
>the ioctl syscall is passed through into kern_ioctl() and then on down
>until it gets into the passioctl() call.  It is passed through even when
>the declared size of the ioctl is 0, as it is for the two new ioctls:
>
>...
>
>The problem is, upon exit from the ioctl, that data is freed.  With a
>queueing interface, we need to keep a copy of the CCB around after the
>ioctl exits.  You have the same problem even after r274017, because that
>just provides a small buffer on the stack.  (And would only help in the
>pointer case.  And we don't need to copyin the pointer.)
>
>So, to avoid that, we don't declare an argument, but we do pass in a
>pointer and copy the user's CCB into a CCB that is allocated inside
>the pass(4) driver.

Clever! I've actually written and modified ioctl handlers many times, but it 
was always with a declared argument (via _IOR | _IOW | _IOWR), and I never had 
to worry about persistence after the handler exits. So, I've never had to pay 
much attention to what happens between the userland call and the handler 
getting invoked.

>> (b) The CCB is large, but the CCB pointer is just a pointer; shouldn't that 
>> be passed in as the arg?
>> 
>
>It is.  Here's what camdd(8) does:

Yeah, I was thrown by the fact that there wasn't a declared arg; sys_ioctl() 
DTRT and figures it out anyway.

Thanks,

Ravi (rpokala@)

>Ken
>-- 
>Kenneth Merry
>k...@freebsd.org



Re: svn commit: r291716 - in head: share/man/man4 sys/cam sys/cam/ata sys/cam/scsi sys/dev/md sys/geom sys/kern sys/pc98/include sys/sys usr.sbin usr.sbin/camdd

2015-12-04 Thread Kenneth D. Merry
On Thu, Dec 03, 2015 at 23:55:14 -0800, Ravi Pokala wrote:
> Hi Ken,
> 
> A few questions:
> 
> > Although these ioctls do not have a declared argument, they
> > both take a union ccb pointer.  If we declare a size here,
> > the ioctl code in sys/kern/sys_generic.c will malloc and free
> > a buffer for either the CCB or the CCB pointer (depending on
> > how it is declared).  Since we have to keep a copy of the
> > CCB (which is fairly large) anyway, having the ioctl malloc
> > and free a CCB for each call is wasteful.
> 
> 
> (a) How does that work? That is, how does the argument get to the ioctl 
> handler in the kernel?
> 

In sys_ioctl(), in sys/kern/sys_generic.c, the pointer argument ("data") to
the ioctl syscall is passed through into kern_ioctl() and then on down
until it gets into the passioctl() call.  It is passed through even when
the declared size of the ioctl is 0, as it is for the two new ioctls:

/*
 * These two ioctls take a union ccb *, but that is not explicitly declared
 * to avoid having the ioctl handling code malloc and free their own copy
 * of the CCB or the CCB pointer.
 */
#define CAMIOQUEUE	_IO(CAM_VERSION, 4)
#define CAMIOGET	_IO(CAM_VERSION, 5)

Here's the code in question:

	if (size > 0) {
		if (com & IOC_VOID) {
			/* Integer argument. */
			arg = (intptr_t)uap->data;
			data = (void *)&arg;
			size = 0;
		} else {
			if (size > SYS_IOCTL_SMALL_SIZE)
				data = malloc((u_long)size, M_IOCTLOPS,
				    M_WAITOK);
			else
				data = smalldata;
		}
	} else
		data = (void *)&uap->data;

So in the size == 0 case, data is just passed through as is.
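
To make the branch structure above concrete, here is a minimal user-space
sketch of how the size encoded in an ioctl command number selects the way
the argument reaches the handler. The X_ macros are hypothetical stand-ins
modelled on the BSD command encoding, X_SMALL_SIZE is an assumed value for
SYS_IOCTL_SMALL_SIZE, and struct big_arg is only a placeholder for a large
argument such as a CCB. None of this is the kernel's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constants modelled on the BSD ioctl command encoding. */
#define X_IOCPARM_MASK	0x1fffUL	/* parameter length lives in 13 bits */
#define X_IOC_VOID	0x20000000UL	/* no declared parameter */
#define X_IOC_OUT	0x40000000UL	/* copy parameter out */
#define X_IOC_IN	0x80000000UL	/* copy parameter in */
#define X_IOC(dir, g, n, len)						\
	((unsigned long)((dir) |					\
	(((unsigned long)(len) & X_IOCPARM_MASK) << 16) |		\
	((unsigned long)(g) << 8) | (unsigned long)(n)))
#define X_IO(g, n)	X_IOC(X_IOC_VOID, (g), (n), 0)
#define X_IOWR(g, n, t)	X_IOC(X_IOC_IN | X_IOC_OUT, (g), (n), sizeof(t))
#define X_IOCPARM_LEN(c) (((c) >> 16) & X_IOCPARM_MASK)

#define X_SMALL_SIZE	128	/* assumed stand-in for SYS_IOCTL_SMALL_SIZE */

/* Placeholder for a large argument; not the real union ccb. */
struct big_arg {
	char bytes[1024];
};

enum arg_source {
	ARG_PASSTHROUGH,	/* size == 0: user pointer handed on as is */
	ARG_INTEGER,		/* void ioctl with a size: integer argument */
	ARG_STACK_COPY,		/* small declared argument: on-stack buffer */
	ARG_HEAP_COPY		/* large declared argument: malloc'd buffer */
};

/* Mirror the if/else ladder quoted above, returning which path is taken. */
static enum arg_source
classify_ioctl(unsigned long com)
{
	size_t size = X_IOCPARM_LEN(com);

	if (size == 0)
		return (ARG_PASSTHROUGH);
	if (com & X_IOC_VOID)
		return (ARG_INTEGER);
	return (size > X_SMALL_SIZE ? ARG_HEAP_COPY : ARG_STACK_COPY);
}
```

With these definitions, a command built with X_IO() classifies as
ARG_PASSTHROUGH, while the same command declared with X_IOWR() and a
pointer or a large structure would take one of the two copying paths.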

Prior to r274017, if the ioctl were declared as _IOWR, there would be a
malloc and copyin of however much data is declared in the ioctl, no matter
what the size.  So, in this case, sizeof(union ccb *) or sizeof(union ccb).

The problem is, upon exit from the ioctl, that data is freed.  With a
queueing interface, we need to keep a copy of the CCB around after the
ioctl exits.  You have the same problem even after r274017, because that
just provides a small buffer on the stack.  (And would only help in the
pointer case.  And we don't need to copyin the pointer.)

So, to avoid that, we don't declare an argument, but we do pass in a
pointer and copy the user's CCB into a CCB that is allocated inside
the pass(4) driver.
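
The ownership idea can be sketched in plain user-space C. The queue below
keeps its own copy of each submitted request, so the caller's buffer may be
reused or freed as soon as the submit call returns. struct fake_ccb,
queue_submit() and queue_fetch() are hypothetical names for illustration,
and memcpy() merely stands in for the copyin/copyout a driver would do.
This is not the pass(4) implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for union ccb. */
struct fake_ccb {
	unsigned int tag;
	char payload[32];
};

/* One queued request; the embedded CCB is owned by the queue. */
struct pending {
	struct fake_ccb ccb;
	struct pending *next;
};

struct ccb_queue {
	struct pending *head;
	struct pending *tail;
};

/* Copy the caller's CCB into queue-owned storage and append it. */
static int
queue_submit(struct ccb_queue *q, const struct fake_ccb *user_ccb)
{
	struct pending *p = malloc(sizeof(*p));

	if (p == NULL)
		return (-1);
	memcpy(&p->ccb, user_ccb, sizeof(p->ccb));	/* stands in for copyin */
	p->next = NULL;
	if (q->tail != NULL)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	return (0);
}

/* Copy the oldest queued CCB back out; returns -1 if the queue is empty. */
static int
queue_fetch(struct ccb_queue *q, struct fake_ccb *out)
{
	struct pending *p = q->head;

	if (p == NULL)
		return (-1);
	memcpy(out, &p->ccb, sizeof(*out));		/* stands in for copyout */
	q->head = p->next;
	if (q->head == NULL)
		q->tail = NULL;
	free(p);
	return (0);
}
```

Because the copy is made at submit time, later changes to the caller's
structure do not affect the queued request.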

> (b) The CCB is large, but the CCB pointer is just a pointer; shouldn't that 
> be passed in as the arg?
> 

It is.  Here's what camdd(8) does:

From camdd_pass_run():

	union ccb *ccb;
	...
	/*
	 * Queue the CCB to the pass(4) driver.
	 */
	if (ioctl(pass_dev->dev->fd, CAMIOQUEUE, ccb) == -1) {

And from camdd_pass_fetch():

	union ccb ccb;
	...
	while ((retval = ioctl(pass_dev->dev->fd, CAMIOGET, &ccb)) != -1) {
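
The shape of that fetch loop can be tried without a pass(4) device. The
sketch below drains a mock completion source until it reports that nothing
is left. mock_fetch(), struct mock_dev and struct done_ccb are hypothetical,
and signalling "no more completed CCBs" as -1 with errno set to ENOENT is an
assumption of this sketch rather than something stated in this thread.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a completed union ccb. */
struct done_ccb {
	int id;
};

/* Mock device holding a fixed list of completed request ids. */
struct mock_dev {
	int completed[8];
	int count;
	int next;
};

/*
 * Mock of the fetch call: returns 0 and fills in *ccb, or returns -1 with
 * errno set to ENOENT (assumed convention) when nothing is completed.
 */
static int
mock_fetch(struct mock_dev *dev, struct done_ccb *ccb)
{
	if (dev->next >= dev->count) {
		errno = ENOENT;
		return (-1);
	}
	ccb->id = dev->completed[dev->next++];
	return (0);
}

/*
 * Drain every available completion, adding each id to *sum.  Returns the
 * number of completions handled, or -1 if the loop ended on an error other
 * than "queue empty".
 */
static int
drain_completions(struct mock_dev *dev, int *sum)
{
	struct done_ccb ccb;
	int handled = 0;

	while (mock_fetch(dev, &ccb) != -1) {
		*sum += ccb.id;
		handled++;
	}
	return (errno == ENOENT ? handled : -1);
}
```

The point of checking errno after the loop is to tell an empty queue apart
from a real failure, which a bare "!= -1" test cannot do on its own.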

Ken
-- 
Kenneth Merry
k...@freebsd.org


Re: svn commit: r291716 - in head: share/man/man4 sys/cam sys/cam/ata sys/cam/scsi sys/dev/md sys/geom sys/kern sys/pc98/include sys/sys usr.sbin usr.sbin/camdd

2015-12-03 Thread Ravi Pokala
Hi Ken,

A few questions:

>   Although these ioctls do not have a declared argument, they
>   both take a union ccb pointer.  If we declare a size here,
>   the ioctl code in sys/kern/sys_generic.c will malloc and free
>   a buffer for either the CCB or the CCB pointer (depending on
>   how it is declared).  Since we have to keep a copy of the
>   CCB (which is fairly large) anyway, having the ioctl malloc
>   and free a CCB for each call is wasteful.


(a) How does that work? That is, how does the argument get to the ioctl handler 
in the kernel?

(b) The CCB is large, but the CCB pointer is just a pointer; shouldn't that be 
passed in as the arg?

Thanks,

Ravi




-----Original Message-----
From:  on behalf of "Kenneth D. Merry" 

Date: 2015-12-03, Thursday at 12:54
To: , , 

Subject: svn commit: r291716 - in head: share/man/man4 sys/cam sys/cam/ata 
sys/cam/scsi sys/dev/md sys/geom sys/kern sys/pc98/include sys/sys usr.sbin 
usr.sbin/camdd

>Author: ken
>Date: Thu Dec  3 20:54:55 2015
>New Revision: 291716
>URL: https://svnweb.freebsd.org/changeset/base/291716
>
>Log:
>  Add asynchronous command support to the pass(4) driver, and the new
>  camdd(8) utility.
>  
>  CCBs may be queued to the driver via the new CAMIOQUEUE ioctl, and
>  completed CCBs may be retrieved via the CAMIOGET ioctl.  User
>  processes can use poll(2) or kevent(2) to get notification when
>  I/O has completed.
>  
>  While the existing CAMIOCOMMAND blocking ioctl interface only
>  supports user virtual data pointers in a CCB (generally only
>  one per CCB), the new CAMIOQUEUE ioctl supports user virtual and
>  physical address pointers, as well as user virtual and physical
>  scatter/gather lists.  This allows user applications to have more
>  flexibility in their data handling operations.
>  
>  Kernel memory for data transferred via the queued interface is
>  allocated from the zone allocator in MAXPHYS sized chunks, and user
>  data is copied in and out.  This is likely faster than the
>  vmapbuf()/vunmapbuf() method used by the CAMIOCOMMAND ioctl in
>  configurations with many processors (there are more TLB shootdowns
>  caused by the mapping/unmapping operation) but may not be as fast
>  as running with unmapped I/O.
>  
>  The new memory handling model for user requests also allows
>  applications to send CCBs with request sizes that are larger than
>  MAXPHYS.  The pass(4) driver now limits queued requests to the I/O
>  size listed by the SIM driver in the maxio field in the Path
>  Inquiry (XPT_PATH_INQ) CCB.
>  
>  There are some things that would be good to add:
>  
>  1. Come up with a way to do unmapped I/O on multiple buffers.
> Currently the unmapped I/O interface operates on a struct bio,
> which includes only one address and length.  It would be nice
> to be able to send an unmapped scatter/gather list down to
> busdma.  This would allow eliminating the copy we currently do
> for data.
>  
>  2. Add an ioctl to list currently outstanding CCBs in the various
> queues.
>  
>  3. Add an ioctl to cancel a request, or use the XPT_ABORT CCB to do
> that.
>  
>  4. Test physical address support.  Virtual pointers and scatter
> gather lists have been tested, but I have not yet tested
> physical addresses or scatter/gather lists.
>  
>  5. Investigate multiple queue support.  At the moment there is one
> queue of commands per pass(4) device.  If multiple processes
> open the device, they will submit I/O into the same queue and
> get events for the same completions.  This is probably the right
> model for most applications, but it is something that could be
> changed later on.
>  
>  Also, add a new utility, camdd(8) that uses the asynchronous pass(4)
>  driver interface.
>  
>  This utility is intended to be a basic data transfer/copy utility,
>  a simple benchmark utility, and an example of how to use the
>  asynchronous pass(4) interface.
>  
>  It can copy data to and from pass(4) devices using any target queue
>  depth, starting offset and blocksize for the input and output devices.
>  It currently only supports SCSI devices, but could be easily extended
>  to support ATA devices.
>  
>  It can also copy data to and from regular files, block devices, tape
>  devices, pipes, stdin, and stdout.  It does not support queueing
>  multiple commands to any of those targets, since it uses the standard
>  read(2)/write(2)/writev(2)/readv(2) system calls.
>  
>  The I/O is done by two threads, one for the reader and one for the
>  writer.  The reader thread sends completed read requests to the
>  writer thread in strictly sequential order, even if they complete
>  out of order.  That could be modified later on for random I/O patterns
>  or slightly out of order I/O.
>  
>  camdd(8) uses kqueue(2)/kevent(2) to get 

Re: svn commit: r291716 - in head: share/man/man4 sys/cam sys/cam/ata sys/cam/scsi sys/dev/md sys/geom sys/kern sys/pc98/include sys/sys usr.sbin usr.sbin/camdd

2015-12-03 Thread Bryan Drewery
On 12/3/15 12:54 PM, Kenneth D. Merry wrote:
> Author: ken
> Date: Thu Dec  3 20:54:55 2015
> New Revision: 291716
> URL: https://svnweb.freebsd.org/changeset/base/291716
> 
> Log:
>   Add asynchronous command support to the pass(4) driver, and the new
>   camdd(8) utility.
>   
>   CCBs may be queued to the driver via the new CAMIOQUEUE ioctl, and
>   completed CCBs may be retrieved via the CAMIOGET ioctl.  User
>   processes can use poll(2) or kevent(2) to get notification when
>   I/O has completed.
>   
>   While the existing CAMIOCOMMAND blocking ioctl interface only
>   supports user virtual data pointers in a CCB (generally only
>   one per CCB), the new CAMIOQUEUE ioctl supports user virtual and
>   physical address pointers, as well as user virtual and physical
>   scatter/gather lists.  This allows user applications to have more
>   flexibility in their data handling operations.
>   
>   Kernel memory for data transferred via the queued interface is
>   allocated from the zone allocator in MAXPHYS sized chunks, and user
>   data is copied in and out.  This is likely faster than the
>   vmapbuf()/vunmapbuf() method used by the CAMIOCOMMAND ioctl in
>   configurations with many processors (there are more TLB shootdowns
>   caused by the mapping/unmapping operation) but may not be as fast
>   as running with unmapped I/O.
>   
>   The new memory handling model for user requests also allows
>   applications to send CCBs with request sizes that are larger than
>   MAXPHYS.  The pass(4) driver now limits queued requests to the I/O
>   size listed by the SIM driver in the maxio field in the Path
>   Inquiry (XPT_PATH_INQ) CCB.
>   
>   There are some things that would be good to add:
>   
>   1. Come up with a way to do unmapped I/O on multiple buffers.
>  Currently the unmapped I/O interface operates on a struct bio,
>  which includes only one address and length.  It would be nice
>  to be able to send an unmapped scatter/gather list down to
>  busdma.  This would allow eliminating the copy we currently do
>  for data.
>   
>   2. Add an ioctl to list currently outstanding CCBs in the various
>  queues.
>   
>   3. Add an ioctl to cancel a request, or use the XPT_ABORT CCB to do
>  that.
>   
>   4. Test physical address support.  Virtual pointers and scatter
>  gather lists have been tested, but I have not yet tested
>  physical addresses or scatter/gather lists.
>   
>   5. Investigate multiple queue support.  At the moment there is one
>  queue of commands per pass(4) device.  If multiple processes
>  open the device, they will submit I/O into the same queue and
>  get events for the same completions.  This is probably the right
>  model for most applications, but it is something that could be
>  changed later on.
>   
>   Also, add a new utility, camdd(8) that uses the asynchronous pass(4)
>   driver interface.
>   
>   This utility is intended to be a basic data transfer/copy utility,
>   a simple benchmark utility, and an example of how to use the
>   asynchronous pass(4) interface.
>   
>   It can copy data to and from pass(4) devices using any target queue
>   depth, starting offset and blocksize for the input and output devices.
>   It currently only supports SCSI devices, but could be easily extended
>   to support ATA devices.
>   
>   It can also copy data to and from regular files, block devices, tape
>   devices, pipes, stdin, and stdout.  It does not support queueing
>   multiple commands to any of those targets, since it uses the standard
>   read(2)/write(2)/writev(2)/readv(2) system calls.
>   
>   The I/O is done by two threads, one for the reader and one for the
>   writer.  The reader thread sends completed read requests to the
>   writer thread in strictly sequential order, even if they complete
>   out of order.  That could be modified later on for random I/O patterns
>   or slightly out of order I/O.
>   
>   camdd(8) uses kqueue(2)/kevent(2) to get I/O completion events from
>   the pass(4) driver and also to send request notifications internally.
>   
>   For pass(4) devices, camdd(8) uses a single buffer (CAM_DATA_VADDR)
>   per CAM CCB on the reading side, and a scatter/gather list
>   (CAM_DATA_SG) on the writing side.  In addition to testing both
>   interfaces, this makes any potential reblocking of I/O easier.  No
>   data is copied between the reader and the writer, but rather the
>   reader's buffers are split into multiple I/O requests or combined
>   into a single I/O request depending on the input and output blocksize.
>   
>   For the file I/O path, camdd(8) also uses a single buffer (read(2),
>   write(2), pread(2) or pwrite(2)) on reads, and a scatter/gather list
>   (readv(2), writev(2), preadv(2), pwritev(2)) on writes.
>   
>   Things that would be nice to do for camdd(8) eventually:
>   
>   1.  Add support for I/O pattern generation.  Patterns like all
>   zeros, all ones, LBA-based patterns, 

Re: svn commit: r291716 - in head: share/man/man4 sys/cam sys/cam/ata sys/cam/scsi sys/dev/md sys/geom sys/kern sys/pc98/include sys/sys usr.sbin usr.sbin/camdd

2015-12-03 Thread Kenneth D. Merry
On Thu, Dec 03, 2015 at 13:13:25 -0800, Bryan Drewery wrote:
> On 12/3/15 12:54 PM, Kenneth D. Merry wrote:
> > Author: ken
> > Date: Thu Dec  3 20:54:55 2015
> > New Revision: 291716
> > URL: https://svnweb.freebsd.org/changeset/base/291716
> > 
> > Log:
> >   Add asynchronous command support to the pass(4) driver, and the new
> >   camdd(8) utility.
> >   
> >   CCBs may be queued to the driver via the new CAMIOQUEUE ioctl, and
> >   completed CCBs may be retrieved via the CAMIOGET ioctl.  User
> >   processes can use poll(2) or kevent(2) to get notification when
> >   I/O has completed.
> >   
> >   While the existing CAMIOCOMMAND blocking ioctl interface only
> >   supports user virtual data pointers in a CCB (generally only
> >   one per CCB), the new CAMIOQUEUE ioctl supports user virtual and
> >   physical address pointers, as well as user virtual and physical
> >   scatter/gather lists.  This allows user applications to have more
> >   flexibility in their data handling operations.
> >   
> >   Kernel memory for data transferred via the queued interface is
> >   allocated from the zone allocator in MAXPHYS sized chunks, and user
> >   data is copied in and out.  This is likely faster than the
> >   vmapbuf()/vunmapbuf() method used by the CAMIOCOMMAND ioctl in
> >   configurations with many processors (there are more TLB shootdowns
> >   caused by the mapping/unmapping operation) but may not be as fast
> >   as running with unmapped I/O.
> >   
> >   The new memory handling model for user requests also allows
> >   applications to send CCBs with request sizes that are larger than
> >   MAXPHYS.  The pass(4) driver now limits queued requests to the I/O
> >   size listed by the SIM driver in the maxio field in the Path
> >   Inquiry (XPT_PATH_INQ) CCB.
> >   
> >   There are some things that would be good to add:
> >   
> >   1. Come up with a way to do unmapped I/O on multiple buffers.
> >  Currently the unmapped I/O interface operates on a struct bio,
> >  which includes only one address and length.  It would be nice
> >  to be able to send an unmapped scatter/gather list down to
> >  busdma.  This would allow eliminating the copy we currently do
> >  for data.
> >   
> >   2. Add an ioctl to list currently outstanding CCBs in the various
> >  queues.
> >   
> >   3. Add an ioctl to cancel a request, or use the XPT_ABORT CCB to do
> >  that.
> >   
> >   4. Test physical address support.  Virtual pointers and scatter
> >  gather lists have been tested, but I have not yet tested
> >  physical addresses or scatter/gather lists.
> >   
> >   5. Investigate multiple queue support.  At the moment there is one
> >  queue of commands per pass(4) device.  If multiple processes
> >  open the device, they will submit I/O into the same queue and
> >  get events for the same completions.  This is probably the right
> >  model for most applications, but it is something that could be
> >  changed later on.
> >   
> >   Also, add a new utility, camdd(8) that uses the asynchronous pass(4)
> >   driver interface.
> >   
> >   This utility is intended to be a basic data transfer/copy utility,
> >   a simple benchmark utility, and an example of how to use the
> >   asynchronous pass(4) interface.
> >   
> >   It can copy data to and from pass(4) devices using any target queue
> >   depth, starting offset and blocksize for the input and output devices.
> >   It currently only supports SCSI devices, but could be easily extended
> >   to support ATA devices.
> >   
> >   It can also copy data to and from regular files, block devices, tape
> >   devices, pipes, stdin, and stdout.  It does not support queueing
> >   multiple commands to any of those targets, since it uses the standard
> >   read(2)/write(2)/writev(2)/readv(2) system calls.
> >   
> >   The I/O is done by two threads, one for the reader and one for the
> >   writer.  The reader thread sends completed read requests to the
> >   writer thread in strictly sequential order, even if they complete
> >   out of order.  That could be modified later on for random I/O patterns
> >   or slightly out of order I/O.
> >   
> >   camdd(8) uses kqueue(2)/kevent(2) to get I/O completion events from
> >   the pass(4) driver and also to send request notifications internally.
> >   
> >   For pass(4) devices, camdd(8) uses a single buffer (CAM_DATA_VADDR)
> >   per CAM CCB on the reading side, and a scatter/gather list
> >   (CAM_DATA_SG) on the writing side.  In addition to testing both
> >   interfaces, this makes any potential reblocking of I/O easier.  No
> >   data is copied between the reader and the writer, but rather the
> >   reader's buffers are split into multiple I/O requests or combined
> >   into a single I/O request depending on the input and output blocksize.
> >   
> >   For the file I/O path, camdd(8) also uses a single buffer (read(2),
> >   write(2), pread(2) or pwrite(2)) on reads, and a