Re: [PATCH 2.6.23-rc7 1/3] async_tx: usage documentation and developer notes

2007-09-21 Thread Bill Davidsen

Dan Williams wrote:

Signed-off-by: Dan Williams <[EMAIL PROTECTED]>
---

 Documentation/crypto/async-tx-api.txt |  217 +
 1 files changed, 217 insertions(+), 0 deletions(-)

diff --git a/Documentation/crypto/async-tx-api.txt 
b/Documentation/crypto/async-tx-api.txt
new file mode 100644
index 000..48d685a
--- /dev/null
+++ b/Documentation/crypto/async-tx-api.txt
@@ -0,0 +1,217 @@
+Asynchronous Transfers/Transforms API
+
+1 INTRODUCTION
+
+2 GENEALOGY
+
+3 USAGE
+3.1 General format of the API
+3.2 Supported operations
+3.2 Descriptor management
+3.3 When does the operation execute?
+3.4 When does the operation complete?
+3.5 Constraints
+3.6 Example
+


This is very readable, and appears extensible to any new forthcoming 
technology I'm aware of. Great job!


--
bill davidsen <[EMAIL PROTECTED]>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2.6.23-rc7 1/3] async_tx: usage documentation and developer notes

2007-09-20 Thread Randy Dunlap
On Thu, 20 Sep 2007 18:27:40 -0700 Dan Williams wrote:

> Signed-off-by: Dan Williams <[EMAIL PROTECTED]>
> ---

Hi Dan,

Looks pretty good and informative.  Thanks.

(nits below :)


>  Documentation/crypto/async-tx-api.txt |  217 
> +
>  1 files changed, 217 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/crypto/async-tx-api.txt 
> b/Documentation/crypto/async-tx-api.txt
> new file mode 100644
> index 000..48d685a
> --- /dev/null
> +++ b/Documentation/crypto/async-tx-api.txt
> @@ -0,0 +1,217 @@
> +  Asynchronous Transfers/Transforms API
> +
> +1 INTRODUCTION
> +
> +2 GENEALOGY
> +
> +3 USAGE
> +3.1 General format of the API
> +3.2 Supported operations
> +3.2 Descriptor management

duplicate 3.2

> +3.3 When does the operation execute?
> +3.4 When does the operation complete?
> +3.5 Constraints
> +3.6 Example
> +
> +4 DRIVER DEVELOPER NOTES
> +4.1 Conformance points
> +4.2 "My application needs finer control of hardware channels"
> +
> +5 SOURCE
> +
> +---
> +
> +1 INTRODUCTION
> +
> +The async_tx api provides methods for describing a chain of asynchronous
> +bulk memory transfers/transforms with support for inter-transactional
> +dependencies.  It is implemented as a dmaengine client that smooths over
> +the details of different hardware offload engine implementations.  Code
> +that is written to the api can optimize for asynchronous operation and
> +the api will fit the chain of operations to the available offload
> +resources.
> +

I would s/api/API/g .

> +2 GENEALOGY
> +
[snip]

> +
> +3 USAGE
> +
> +3.1 General format of the API:
> +struct dma_async_tx_descriptor *
> +async_(,
> +   enum async_tx_flags flags,
> +   struct dma_async_tx_descriptor *dependency,
> +   dma_async_tx_callback callback_routine,
> +   void *callback_parameter);
> +
> +3.2 Supported operations:
> +memcpy   - memory copy between a source and a destination buffer
> +memset   - fill a destination buffer with a byte value
> +xor   - xor a series of source buffers and write the result to a
> +destination buffer
> +xor_zero_sum - xor a series of source buffers and set a flag if the
> +result is zero.  The implementation attempts to prevent
> +writes to memory
> +
> +3.2 Descriptor management:

duplicate 3.2

> +The return value is non-NULL and points to a 'descriptor' when the operation
> +has been queued to execute asynchronously.  Descriptors are recycled
> +resources, under control of the offload engine driver, to be reused as
> +operations complete.  When an application needs to submit a chain of
> +operations it must guarantee that the descriptor is not automatically 
> recycled
> +before the dependency is submitted.  This requires that all descriptors be
> +acknowledged by the application before the offload engine driver is allowed 
> to
> +recycle (or free) the descriptor.  A descriptor can be acked by:

can be acked by any of:   (?)

> +1/ setting the ASYNC_TX_ACK flag if no operations are to be submitted
> +2/ setting the ASYNC_TX_DEP_ACK flag to acknowledge the parent
> +   descriptor of a new operation.
> +3/ calling async_tx_ack() on the descriptor.
> +
> +3.3 When does the operation execute?:

Drop ':'

> +Operations do not immediately issue after return from the
> +async_ call.  Offload engine drivers batch operations to
> +improve performance by reducing the number of mmio cycles needed to
> +manage the channel.  Once a driver specific threshold is met the driver

   driver-specific

> +automatically issues pending operations.  An application can force this
> +event by calling async_tx_issue_pending_all().  This operates on all
> +channels since the application has no knowledge of channel to operation
> +mapping.
> +
> +3.4 When does the operation complete?:

drop ':'

> +There are two methods for an application to learn about the completion
> +of an operation.
> +1/ Call dma_wait_for_async_tx().  This call causes the cpu to spin while

s/cpu/CPU/g

> +   it polls for the completion of the operation.  It handles dependency
> +   chains and issuing pending operations.
> +2/ Specify a completion callback.  The callback routine runs in tasklet
> +   context if the offload engine driver supports interrupts, or it is
> +   called in application context if the operation is carried out
> +   synchronously in software.  The callback can be set in the call to
> +   async_, or when the application needs to submit a chain of
> +   unknown length it can use the async_trigger_callback() routine to set a
> +   completion interrupt/callback at the end of the chain.
> +
> +3.5 Constraints:
> +1/ Calls to async_ are not permitted in irq context.  Other

s/irq/IRQ/g

> +   contexts are permitted provided constraint #2 is not violated.
> +2/ Completion callback routines can not submit new operations.  This

   cannot

> +

[PATCH 2.6.23-rc7 1/3] async_tx: usage documentation and developer notes

2007-09-20 Thread Dan Williams
Signed-off-by: Dan Williams <[EMAIL PROTECTED]>
---

 Documentation/crypto/async-tx-api.txt |  217 +
 1 files changed, 217 insertions(+), 0 deletions(-)

diff --git a/Documentation/crypto/async-tx-api.txt 
b/Documentation/crypto/async-tx-api.txt
new file mode 100644
index 000..48d685a
--- /dev/null
+++ b/Documentation/crypto/async-tx-api.txt
@@ -0,0 +1,217 @@
+Asynchronous Transfers/Transforms API
+
+1 INTRODUCTION
+
+2 GENEALOGY
+
+3 USAGE
+3.1 General format of the API
+3.2 Supported operations
+3.2 Descriptor management
+3.3 When does the operation execute?
+3.4 When does the operation complete?
+3.5 Constraints
+3.6 Example
+
+4 DRIVER DEVELOPER NOTES
+4.1 Conformance points
+4.2 "My application needs finer control of hardware channels"
+
+5 SOURCE
+
+---
+
+1 INTRODUCTION
+
+The async_tx api provides methods for describing a chain of asynchronous
+bulk memory transfers/transforms with support for inter-transactional
+dependencies.  It is implemented as a dmaengine client that smooths over
+the details of different hardware offload engine implementations.  Code
+that is written to the api can optimize for asynchronous operation and
+the api will fit the chain of operations to the available offload
+resources.
+
+2 GENEALOGY
+
+The api was initially designed to offload the memory copy and
+xor-parity-calculations of the md-raid5 driver using the offload engines
+present in the Intel(R) Xscale series of I/O processors.  It also built
+on the 'dmaengine' layer developed for offloading memory copies in the
+network stack using Intel(R) I/OAT engines.  The following design
+features surfaced as a result:
+1/ implicit synchronous path: users of the API do not need to know if
+   the platform they are running on has offload capabilities.  The
+   operation will be offloaded when an engine is available and carried out
+   in software otherwise.
+2/ cross channel dependency chains: the API allows a chain of dependent
+   operations to be submitted, like xor->copy->xor in the raid5 case.  The
+   API automatically handles cases where the transition from one operation
+   to another implies a hardware channel switch.
+3/ dmaengine extensions to support multiple clients and operation types
+   beyond 'memcpy'
+
+3 USAGE
+
+3.1 General format of the API:
+struct dma_async_tx_descriptor *
+async_(,
+ enum async_tx_flags flags,
+ struct dma_async_tx_descriptor *dependency,
+ dma_async_tx_callback callback_routine,
+ void *callback_parameter);
+
+3.2 Supported operations:
+memcpy   - memory copy between a source and a destination buffer
+memset   - fill a destination buffer with a byte value
+xor - xor a series of source buffers and write the result to a
+  destination buffer
+xor_zero_sum - xor a series of source buffers and set a flag if the
+  result is zero.  The implementation attempts to prevent
+  writes to memory
+
+3.2 Descriptor management:
+The return value is non-NULL and points to a 'descriptor' when the operation
+has been queued to execute asynchronously.  Descriptors are recycled
+resources, under control of the offload engine driver, to be reused as
+operations complete.  When an application needs to submit a chain of
+operations it must guarantee that the descriptor is not automatically recycled
+before the dependency is submitted.  This requires that all descriptors be
+acknowledged by the application before the offload engine driver is allowed to
+recycle (or free) the descriptor.  A descriptor can be acked by:
+1/ setting the ASYNC_TX_ACK flag if no operations are to be submitted
+2/ setting the ASYNC_TX_DEP_ACK flag to acknowledge the parent
+   descriptor of a new operation.
+3/ calling async_tx_ack() on the descriptor.
+
+3.3 When does the operation execute?:
+Operations do not immediately issue after return from the
+async_ call.  Offload engine drivers batch operations to
+improve performance by reducing the number of mmio cycles needed to
+manage the channel.  Once a driver specific threshold is met the driver
+automatically issues pending operations.  An application can force this
+event by calling async_tx_issue_pending_all().  This operates on all
+channels since the application has no knowledge of channel to operation
+mapping.
+
+3.4 When does the operation complete?:
+There are two methods for an application to learn about the completion
+of an operation.
+1/ Call dma_wait_for_async_tx().  This call causes the cpu to spin while
+   it polls for the completion of the operation.  It handles dependency
+   chains and issuing pending operations.
+2/ Specify a completion callback.  The callback routine runs in tasklet
+   context if the offload engine driver supports interrupts, or it is
+   called in application context if the operation is carried out
+   synchronously in software.  The callback can be set in the call to
+