On Sun, 16 Oct 2016, Silvio Fricke <silvio.fri...@gmail.com> wrote:
> Only formating changes.
>
> Signed-off-by: Silvio Fricke <silvio.fri...@gmail.com>
> ---
>  Documentation/index.rst     |   1 +-
>  Documentation/workqueue.rst | 394 +++++++++++++++++++++++++++++++++++++-
>  Documentation/workqueue.txt | 388 +------------------------------------
>  MAINTAINERS                 |   2 +-
>  include/linux/workqueue.h   |   4 +-
>  5 files changed, 398 insertions(+), 391 deletions(-)
>  create mode 100644 Documentation/workqueue.rst
>  delete mode 100644 Documentation/workqueue.txt
>
> diff --git a/Documentation/index.rst b/Documentation/index.rst
> index c53d089..f631655 100644
> --- a/Documentation/index.rst
> +++ b/Documentation/index.rst
> @@ -18,6 +18,7 @@ Contents:
>     media/index
>     gpu/index
>     80211/index
> +   workqueue
>  
>  Indices and tables
>  ==================
> diff --git a/Documentation/workqueue.rst b/Documentation/workqueue.rst
> new file mode 100644
> index 0000000..03ac70f
> --- /dev/null
> +++ b/Documentation/workqueue.rst
> @@ -0,0 +1,394 @@
> +====================================
> +Concurrency Managed Workqueue (cmwq)
> +====================================
> +
> +:Date: September, 2010
> +:Authors: * Tejun Heo <t...@kernel.org>;
> +          * Florian Mickler <flor...@mickler.org>;

I'd imagine

:Author: Tejun Heo <t...@kernel.org>
:Author: Florian Mickler <flor...@mickler.org>

will work better.

BR,
Jani.


> +
> +
> +Introduction
> +============
> +
> +There are many cases where an asynchronous process execution context
> +is needed and the workqueue (wq) API is the most commonly used
> +mechanism for such cases.
> +
> +When such an asynchronous execution context is needed, a work item
> +describing which function to execute is put on a queue.  An
> +independent thread serves as the asynchronous execution context.  The
> +queue is called workqueue and the thread is called worker.
> +
> +While there are work items on the workqueue the worker executes the
> +functions associated with the work items one after the other.  When
> +there is no work item left on the workqueue the worker becomes idle.
> +When a new work item gets queued, the worker begins executing again.
> +
> +
> +Why cmwq?
> +=========
> +
> +In the original wq implementation, a multi threaded (MT) wq had one
> +worker thread per CPU and a single threaded (ST) wq had one worker
> +thread system-wide.  A single MT wq needed to keep around the same
> +number of workers as the number of CPUs.  The kernel grew a lot of MT
> +wq users over the years and with the number of CPU cores continuously
> +rising, some systems saturated the default 32k PID space just booting
> +up.
> +
> +Although MT wq wasted a lot of resource, the level of concurrency
> +provided was unsatisfactory.  The limitation was common to both ST and
> +MT wq albeit less severe on MT.  Each wq maintained its own separate
> +worker pool.  A MT wq could provide only one execution context per CPU
> +while a ST wq one for the whole system.  Work items had to compete for
> +those very limited execution contexts leading to various problems
> +including proneness to deadlocks around the single execution context.
> +
> +The tension between the provided level of concurrency and resource
> +usage also forced its users to make unnecessary tradeoffs like libata
> +choosing to use ST wq for polling PIOs and accepting an unnecessary
> +limitation that no two polling PIOs can progress at the same time.  As
> +MT wq don't provide much better concurrency, users which require
> +higher level of concurrency, like async or fscache, had to implement
> +their own thread pool.
> +
> +Concurrency Managed Workqueue (cmwq) is a reimplementation of wq with
> +focus on the following goals.
> +
> +* Maintain compatibility with the original workqueue API.
> +
> +* Use per-CPU unified worker pools shared by all wq to provide
> +  flexible level of concurrency on demand without wasting a lot of
> +  resource.
> +
> +* Automatically regulate worker pool and level of concurrency so that
> +  the API users don't need to worry about such details.
> +
> +
> +The Design
> +==========
> +
> +In order to ease the asynchronous execution of functions a new
> +abstraction, the work item, is introduced.
> +
> +A work item is a simple struct that holds a pointer to the function
> +that is to be executed asynchronously.  Whenever a driver or subsystem
> +wants a function to be executed asynchronously it has to set up a work
> +item pointing to that function and queue that work item on a
> +workqueue.
> +
> +Special purpose threads, called worker threads, execute the functions
> +off of the queue, one after the other.  If no work is queued, the
> +worker threads become idle.  These worker threads are managed in so
> +called worker-pools.
> +
> +The cmwq design differentiates between the user-facing workqueues that
> +subsystems and drivers queue work items on and the backend mechanism
> +which manages worker-pools and processes the queued work items.
> +
> +There are two worker-pools, one for normal work items and the other
> +for high priority ones, for each possible CPU and some extra
> +worker-pools to serve work items queued on unbound workqueues - the
> +number of these backing pools is dynamic.
> +
> +Subsystems and drivers can create and queue work items through special
> +workqueue API functions as they see fit. They can influence some
> +aspects of the way the work items are executed by setting flags on the
> +workqueue they are putting the work item on. These flags include
> +things like CPU locality, concurrency limits, priority and more.  To
> +get a detailed overview refer to the API description of
> +``alloc_workqueue()`` below.
> +
> +When a work item is queued to a workqueue, the target worker-pool is
> +determined according to the queue parameters and workqueue attributes
> +and appended on the shared worklist of the worker-pool.  For example,
> +unless specifically overridden, a work item of a bound workqueue will
> +be queued on the worklist of either normal or highpri worker-pool that
> +is associated to the CPU the issuer is running on.
> +
> +For any worker pool implementation, managing the concurrency level
> +(how many execution contexts are active) is an important issue.  cmwq
> +tries to keep the concurrency at a minimal but sufficient level.
> +Minimal to save resources and sufficient in that the system is used at
> +its full capacity.
> +
> +Each worker-pool bound to an actual CPU implements concurrency
> +management by hooking into the scheduler.  The worker-pool is notified
> +whenever an active worker wakes up or sleeps and keeps track of the
> +number of the currently runnable workers.  Generally, work items are
> +not expected to hog a CPU and consume many cycles.  That means
> +maintaining just enough concurrency to prevent work processing from
> +stalling should be optimal.  As long as there are one or more runnable
> +workers on the CPU, the worker-pool doesn't start execution of a new
> +work, but, when the last running worker goes to sleep, it immediately
> +schedules a new worker so that the CPU doesn't sit idle while there
> +are pending work items.  This allows using a minimal number of workers
> +without losing execution bandwidth.
> +
> +Keeping idle workers around doesn't cost other than the memory space
> +for kthreads, so cmwq holds onto idle ones for a while before killing
> +them.
> +
> +For unbound workqueues, the number of backing pools is dynamic.
> +Unbound workqueue can be assigned custom attributes using
> +``apply_workqueue_attrs()`` and workqueue will automatically create
> +backing worker pools matching the attributes.  The responsibility of
> +regulating concurrency level is on the users.  There is also a flag to
> +mark a bound wq to ignore the concurrency management.  Please refer to
> +the API section for details.
> +
> +Forward progress guarantee relies on that workers can be created when
> +more execution contexts are necessary, which in turn is guaranteed
> +through the use of rescue workers.  All work items which might be used
> +on code paths that handle memory reclaim are required to be queued on
> +wq's that have a rescue-worker reserved for execution under memory
> +pressure.  Else it is possible that the worker-pool deadlocks waiting
> +for execution contexts to free up.
> +
> +
> +Application Programming Interface (API)
> +=======================================
> +
> +``alloc_workqueue()`` allocates a wq.  The original
> +``create_*workqueue()`` functions are deprecated and scheduled for
> +removal.  ``alloc_workqueue()`` takes three arguments - @``name``,
> +``@flags`` and ``@max_active``.  ``@name`` is the name of the wq and
> +also used as the name of the rescuer thread if there is one.
> +
> +A wq no longer manages execution resources but serves as a domain for
> +forward progress guarantee, flush and work item attributes. ``@flags``
> +and ``@max_active`` control how work items are assigned execution
> +resources, scheduled and executed.
> +
> +
> +``flags``
> +---------
> +
> +``WQ_UNBOUND``
> +  Work items queued to an unbound wq are served by the special
> +  worker-pools which host workers which are not bound to any
> +  specific CPU.  This makes the wq behave as a simple execution
> +  context provider without concurrency management.  The unbound
> +  worker-pools try to start execution of work items as soon as
> +  possible.  Unbound wq sacrifices locality but is useful for
> +  the following cases.
> +
> +  * Wide fluctuation in the concurrency level requirement is
> +    expected and using bound wq may end up creating large number
> +    of mostly unused workers across different CPUs as the issuer
> +    hops through different CPUs.
> +
> +  * Long running CPU intensive workloads which can be better
> +    managed by the system scheduler.
> +
> +``WQ_FREEZABLE``
> +  A freezable wq participates in the freeze phase of the system
> +  suspend operations.  Work items on the wq are drained and no
> +  new work item starts execution until thawed.
> +
> +``WQ_MEM_RECLAIM``
> +  All wq which might be used in the memory reclaim paths **MUST**
> +  have this flag set.  The wq is guaranteed to have at least one
> +  execution context regardless of memory pressure.
> +
> +``WQ_HIGHPRI``
> +  Work items of a highpri wq are queued to the highpri
> +  worker-pool of the target cpu.  Highpri worker-pools are
> +  served by worker threads with elevated nice level.
> +
> +  Note that normal and highpri worker-pools don't interact with
> +  each other.  Each maintain its separate pool of workers and
> +  implements concurrency management among its workers.
> +
> +``WQ_CPU_INTENSIVE``
> +  Work items of a CPU intensive wq do not contribute to the
> +  concurrency level.  In other words, runnable CPU intensive
> +  work items will not prevent other work items in the same
> +  worker-pool from starting execution.  This is useful for bound
> +  work items which are expected to hog CPU cycles so that their
> +  execution is regulated by the system scheduler.
> +
> +  Although CPU intensive work items don't contribute to the
> +  concurrency level, start of their executions is still
> +  regulated by the concurrency management and runnable
> +  non-CPU-intensive work items can delay execution of CPU
> +  intensive work items.
> +
> +  This flag is meaningless for unbound wq.
> +
> +Note that the flag ``WQ_NON_REENTRANT`` no longer exists as all
> +workqueues are now non-reentrant - any work item is guaranteed to be
> +executed by at most one worker system-wide at any given time.
> +
> +
> +``max_active``
> +--------------
> +
> +``@max_active`` determines the maximum number of execution contexts
> +per CPU which can be assigned to the work items of a wq.  For example,
> +with ``@max_active`` of 16, at most 16 work items of the wq can be
> +executing at the same time per CPU.
> +
> +Currently, for a bound wq, the maximum limit for ``@max_active`` is
> +512 and the default value used when 0 is specified is 256.  For an
> +unbound wq, the limit is higher of 512 and 4 *
> +``num_possible_cpus()``.  These values are chosen sufficiently high
> +such that they are not the limiting factor while providing protection
> +in runaway cases.
> +
> +The number of active work items of a wq is usually regulated by the
> +users of the wq, more specifically, by how many work items the users
> +may queue at the same time.  Unless there is a specific need for
> +throttling the number of active work items, specifying '0' is
> +recommended.
> +
> +Some users depend on the strict execution ordering of ST wq.  The
> +combination of ``@max_active`` of 1 and ``WQ_UNBOUND`` is used to
> +achieve this behavior.  Work items on such wq are always queued to the
> +unbound worker-pools and only one work item can be active at any given
> +time thus achieving the same ordering property as ST wq.
> +
> +
> +Example Execution Scenarios
> +===========================
> +
> +The following example execution scenarios try to illustrate how cmwq
> +behave under different configurations.
> +
> + Work items w0, w1, w2 are queued to a bound wq q0 on the same CPU.
> + w0 burns CPU for 5ms then sleeps for 10ms then burns CPU for 5ms
> + again before finishing.  w1 and w2 burn CPU for 5ms then sleep for
> + 10ms.
> +
> +Ignoring all other tasks, works and processing overhead, and assuming
> +simple FIFO scheduling, the following is one highly simplified version
> +of possible sequences of events with the original wq. ::
> +
> + TIME IN MSECS       EVENT
> + 0           w0 starts and burns CPU
> + 5           w0 sleeps
> + 15          w0 wakes up and burns CPU
> + 20          w0 finishes
> + 20          w1 starts and burns CPU
> + 25          w1 sleeps
> + 35          w1 wakes up and finishes
> + 35          w2 starts and burns CPU
> + 40          w2 sleeps
> + 50          w2 wakes up and finishes
> +
> +And with cmwq with ``@max_active`` >= 3, ::
> +
> + TIME IN MSECS       EVENT
> + 0           w0 starts and burns CPU
> + 5           w0 sleeps
> + 5           w1 starts and burns CPU
> + 10          w1 sleeps
> + 10          w2 starts and burns CPU
> + 15          w2 sleeps
> + 15          w0 wakes up and burns CPU
> + 20          w0 finishes
> + 20          w1 wakes up and finishes
> + 25          w2 wakes up and finishes
> +
> +If ``@max_active`` == 2, ::
> +
> + TIME IN MSECS       EVENT
> + 0           w0 starts and burns CPU
> + 5           w0 sleeps
> + 5           w1 starts and burns CPU
> + 10          w1 sleeps
> + 15          w0 wakes up and burns CPU
> + 20          w0 finishes
> + 20          w1 wakes up and finishes
> + 20          w2 starts and burns CPU
> + 25          w2 sleeps
> + 35          w2 wakes up and finishes
> +
> +Now, let's assume w1 and w2 are queued to a different wq q1 which has
> +``WQ_CPU_INTENSIVE`` set, ::
> +
> + TIME IN MSECS       EVENT
> + 0           w0 starts and burns CPU
> + 5           w0 sleeps
> + 5           w1 and w2 start and burn CPU
> + 10          w1 sleeps
> + 15          w2 sleeps
> + 15          w0 wakes up and burns CPU
> + 20          w0 finishes
> + 20          w1 wakes up and finishes
> + 25          w2 wakes up and finishes
> +
> +
> +Guidelines
> +==========
> +
> +* Do not forget to use ``WQ_MEM_RECLAIM`` if a wq may process work
> +  items which are used during memory reclaim.  Each wq with
> +  ``WQ_MEM_RECLAIM`` set has an execution context reserved for it.  If
> +  there is dependency among multiple work items used during memory
> +  reclaim, they should be queued to separate wq each with
> +  ``WQ_MEM_RECLAIM``.
> +
> +* Unless strict ordering is required, there is no need to use ST wq.
> +
> +* Unless there is a specific need, using 0 for @max_active is
> +  recommended.  In most use cases, concurrency level usually stays
> +  well under the default limit.
> +
> +* A wq serves as a domain for forward progress guarantee
> +  (``WQ_MEM_RECLAIM``, flush and work item attributes.  Work items
> +  which are not involved in memory reclaim and don't need to be
> +  flushed as a part of a group of work items, and don't require any
> +  special attribute, can use one of the system wq.  There is no
> +  difference in execution characteristics between using a dedicated wq
> +  and a system wq.
> +
> +* Unless work items are expected to consume a huge amount of CPU
> +  cycles, using a bound wq is usually beneficial due to the increased
> +  level of locality in wq operations and work item execution.
> +
> +
> +Debugging
> +=========
> +
> +Because the work functions are executed by generic worker threads
> +there are a few tricks needed to shed some light on misbehaving
> +workqueue users.
> +
> +Worker threads show up in the process list as: ::
> +
> +  root      5671  0.0  0.0      0     0 ?        S    12:07   0:00 
> [kworker/0:1]
> +  root      5672  0.0  0.0      0     0 ?        S    12:07   0:00 
> [kworker/1:2]
> +  root      5673  0.0  0.0      0     0 ?        S    12:12   0:00 
> [kworker/0:0]
> +  root      5674  0.0  0.0      0     0 ?        S    12:13   0:00 
> [kworker/1:0]
> +
> +If kworkers are going crazy (using too much cpu), there are two types
> +of possible problems:
> +
> +     1. Something being scheduled in rapid succession
> +     2. A single work item that consumes lots of cpu cycles
> +
> +The first one can be tracked using tracing: ::
> +
> +     $ echo workqueue:workqueue_queue_work > 
> /sys/kernel/debug/tracing/set_event
> +     $ cat /sys/kernel/debug/tracing/trace_pipe > out.txt
> +     (wait a few secs)
> +     ^C
> +
> +If something is busy looping on work queueing, it would be dominating
> +the output and the offender can be determined with the work item
> +function.
> +
> +For the second type of problems it should be possible to just check
> +the stack trace of the offending worker thread. ::
> +
> +     $ cat /proc/THE_OFFENDING_KWORKER/stack
> +
> +The work item's function should be trivially visible in the stack
> +trace.
> +
> +
> +Kernel Inline Documentations Reference
> +======================================
> +
> +.. kernel-doc:: include/linux/workqueue.h
> diff --git a/Documentation/workqueue.txt b/Documentation/workqueue.txt
> deleted file mode 100644
> index c49e317..0000000
> --- a/Documentation/workqueue.txt
> +++ /dev/null
> @@ -1,388 +0,0 @@
> -
> -Concurrency Managed Workqueue (cmwq)
> -
> -September, 2010              Tejun Heo <t...@kernel.org>
> -                     Florian Mickler <flor...@mickler.org>
> -
> -CONTENTS
> -
> -1. Introduction
> -2. Why cmwq?
> -3. The Design
> -4. Application Programming Interface (API)
> -5. Example Execution Scenarios
> -6. Guidelines
> -7. Debugging
> -
> -
> -1. Introduction
> -
> -There are many cases where an asynchronous process execution context
> -is needed and the workqueue (wq) API is the most commonly used
> -mechanism for such cases.
> -
> -When such an asynchronous execution context is needed, a work item
> -describing which function to execute is put on a queue.  An
> -independent thread serves as the asynchronous execution context.  The
> -queue is called workqueue and the thread is called worker.
> -
> -While there are work items on the workqueue the worker executes the
> -functions associated with the work items one after the other.  When
> -there is no work item left on the workqueue the worker becomes idle.
> -When a new work item gets queued, the worker begins executing again.
> -
> -
> -2. Why cmwq?
> -
> -In the original wq implementation, a multi threaded (MT) wq had one
> -worker thread per CPU and a single threaded (ST) wq had one worker
> -thread system-wide.  A single MT wq needed to keep around the same
> -number of workers as the number of CPUs.  The kernel grew a lot of MT
> -wq users over the years and with the number of CPU cores continuously
> -rising, some systems saturated the default 32k PID space just booting
> -up.
> -
> -Although MT wq wasted a lot of resource, the level of concurrency
> -provided was unsatisfactory.  The limitation was common to both ST and
> -MT wq albeit less severe on MT.  Each wq maintained its own separate
> -worker pool.  A MT wq could provide only one execution context per CPU
> -while a ST wq one for the whole system.  Work items had to compete for
> -those very limited execution contexts leading to various problems
> -including proneness to deadlocks around the single execution context.
> -
> -The tension between the provided level of concurrency and resource
> -usage also forced its users to make unnecessary tradeoffs like libata
> -choosing to use ST wq for polling PIOs and accepting an unnecessary
> -limitation that no two polling PIOs can progress at the same time.  As
> -MT wq don't provide much better concurrency, users which require
> -higher level of concurrency, like async or fscache, had to implement
> -their own thread pool.
> -
> -Concurrency Managed Workqueue (cmwq) is a reimplementation of wq with
> -focus on the following goals.
> -
> -* Maintain compatibility with the original workqueue API.
> -
> -* Use per-CPU unified worker pools shared by all wq to provide
> -  flexible level of concurrency on demand without wasting a lot of
> -  resource.
> -
> -* Automatically regulate worker pool and level of concurrency so that
> -  the API users don't need to worry about such details.
> -
> -
> -3. The Design
> -
> -In order to ease the asynchronous execution of functions a new
> -abstraction, the work item, is introduced.
> -
> -A work item is a simple struct that holds a pointer to the function
> -that is to be executed asynchronously.  Whenever a driver or subsystem
> -wants a function to be executed asynchronously it has to set up a work
> -item pointing to that function and queue that work item on a
> -workqueue.
> -
> -Special purpose threads, called worker threads, execute the functions
> -off of the queue, one after the other.  If no work is queued, the
> -worker threads become idle.  These worker threads are managed in so
> -called worker-pools.
> -
> -The cmwq design differentiates between the user-facing workqueues that
> -subsystems and drivers queue work items on and the backend mechanism
> -which manages worker-pools and processes the queued work items.
> -
> -There are two worker-pools, one for normal work items and the other
> -for high priority ones, for each possible CPU and some extra
> -worker-pools to serve work items queued on unbound workqueues - the
> -number of these backing pools is dynamic.
> -
> -Subsystems and drivers can create and queue work items through special
> -workqueue API functions as they see fit. They can influence some
> -aspects of the way the work items are executed by setting flags on the
> -workqueue they are putting the work item on. These flags include
> -things like CPU locality, concurrency limits, priority and more.  To
> -get a detailed overview refer to the API description of
> -alloc_workqueue() below.
> -
> -When a work item is queued to a workqueue, the target worker-pool is
> -determined according to the queue parameters and workqueue attributes
> -and appended on the shared worklist of the worker-pool.  For example,
> -unless specifically overridden, a work item of a bound workqueue will
> -be queued on the worklist of either normal or highpri worker-pool that
> -is associated to the CPU the issuer is running on.
> -
> -For any worker pool implementation, managing the concurrency level
> -(how many execution contexts are active) is an important issue.  cmwq
> -tries to keep the concurrency at a minimal but sufficient level.
> -Minimal to save resources and sufficient in that the system is used at
> -its full capacity.
> -
> -Each worker-pool bound to an actual CPU implements concurrency
> -management by hooking into the scheduler.  The worker-pool is notified
> -whenever an active worker wakes up or sleeps and keeps track of the
> -number of the currently runnable workers.  Generally, work items are
> -not expected to hog a CPU and consume many cycles.  That means
> -maintaining just enough concurrency to prevent work processing from
> -stalling should be optimal.  As long as there are one or more runnable
> -workers on the CPU, the worker-pool doesn't start execution of a new
> -work, but, when the last running worker goes to sleep, it immediately
> -schedules a new worker so that the CPU doesn't sit idle while there
> -are pending work items.  This allows using a minimal number of workers
> -without losing execution bandwidth.
> -
> -Keeping idle workers around doesn't cost other than the memory space
> -for kthreads, so cmwq holds onto idle ones for a while before killing
> -them.
> -
> -For unbound workqueues, the number of backing pools is dynamic.
> -Unbound workqueue can be assigned custom attributes using
> -apply_workqueue_attrs() and workqueue will automatically create
> -backing worker pools matching the attributes.  The responsibility of
> -regulating concurrency level is on the users.  There is also a flag to
> -mark a bound wq to ignore the concurrency management.  Please refer to
> -the API section for details.
> -
> -Forward progress guarantee relies on that workers can be created when
> -more execution contexts are necessary, which in turn is guaranteed
> -through the use of rescue workers.  All work items which might be used
> -on code paths that handle memory reclaim are required to be queued on
> -wq's that have a rescue-worker reserved for execution under memory
> -pressure.  Else it is possible that the worker-pool deadlocks waiting
> -for execution contexts to free up.
> -
> -
> -4. Application Programming Interface (API)
> -
> -alloc_workqueue() allocates a wq.  The original create_*workqueue()
> -functions are deprecated and scheduled for removal.  alloc_workqueue()
> -takes three arguments - @name, @flags and @max_active.  @name is the
> -name of the wq and also used as the name of the rescuer thread if
> -there is one.
> -
> -A wq no longer manages execution resources but serves as a domain for
> -forward progress guarantee, flush and work item attributes.  @flags
> -and @max_active control how work items are assigned execution
> -resources, scheduled and executed.
> -
> -@flags:
> -
> -  WQ_UNBOUND
> -
> -     Work items queued to an unbound wq are served by the special
> -     worker-pools which host workers which are not bound to any
> -     specific CPU.  This makes the wq behave as a simple execution
> -     context provider without concurrency management.  The unbound
> -     worker-pools try to start execution of work items as soon as
> -     possible.  Unbound wq sacrifices locality but is useful for
> -     the following cases.
> -
> -     * Wide fluctuation in the concurrency level requirement is
> -       expected and using bound wq may end up creating large number
> -       of mostly unused workers across different CPUs as the issuer
> -       hops through different CPUs.
> -
> -     * Long running CPU intensive workloads which can be better
> -       managed by the system scheduler.
> -
> -  WQ_FREEZABLE
> -
> -     A freezable wq participates in the freeze phase of the system
> -     suspend operations.  Work items on the wq are drained and no
> -     new work item starts execution until thawed.
> -
> -  WQ_MEM_RECLAIM
> -
> -     All wq which might be used in the memory reclaim paths _MUST_
> -     have this flag set.  The wq is guaranteed to have at least one
> -     execution context regardless of memory pressure.
> -
> -  WQ_HIGHPRI
> -
> -     Work items of a highpri wq are queued to the highpri
> -     worker-pool of the target cpu.  Highpri worker-pools are
> -     served by worker threads with elevated nice level.
> -
> -     Note that normal and highpri worker-pools don't interact with
> -     each other.  Each maintain its separate pool of workers and
> -     implements concurrency management among its workers.
> -
> -  WQ_CPU_INTENSIVE
> -
> -     Work items of a CPU intensive wq do not contribute to the
> -     concurrency level.  In other words, runnable CPU intensive
> -     work items will not prevent other work items in the same
> -     worker-pool from starting execution.  This is useful for bound
> -     work items which are expected to hog CPU cycles so that their
> -     execution is regulated by the system scheduler.
> -
> -     Although CPU intensive work items don't contribute to the
> -     concurrency level, start of their executions is still
> -     regulated by the concurrency management and runnable
> -     non-CPU-intensive work items can delay execution of CPU
> -     intensive work items.
> -
> -     This flag is meaningless for unbound wq.
> -
> -Note that the flag WQ_NON_REENTRANT no longer exists as all workqueues
> -are now non-reentrant - any work item is guaranteed to be executed by
> -at most one worker system-wide at any given time.
> -
> -@max_active:
> -
> -@max_active determines the maximum number of execution contexts per
> -CPU which can be assigned to the work items of a wq.  For example,
> -with @max_active of 16, at most 16 work items of the wq can be
> -executing at the same time per CPU.
> -
> -Currently, for a bound wq, the maximum limit for @max_active is 512
> -and the default value used when 0 is specified is 256.  For an unbound
> -wq, the limit is higher of 512 and 4 * num_possible_cpus().  These
> -values are chosen sufficiently high such that they are not the
> -limiting factor while providing protection in runaway cases.
> -
> -The number of active work items of a wq is usually regulated by the
> -users of the wq, more specifically, by how many work items the users
> -may queue at the same time.  Unless there is a specific need for
> -throttling the number of active work items, specifying '0' is
> -recommended.
> -
> -Some users depend on the strict execution ordering of ST wq.  The
> -combination of @max_active of 1 and WQ_UNBOUND is used to achieve this
> -behavior.  Work items on such wq are always queued to the unbound
> -worker-pools and only one work item can be active at any given time thus
> -achieving the same ordering property as ST wq.
> -
> -
> -5. Example Execution Scenarios
> -
> -The following example execution scenarios try to illustrate how cmwq
> -behave under different configurations.
> -
> - Work items w0, w1, w2 are queued to a bound wq q0 on the same CPU.
> - w0 burns CPU for 5ms then sleeps for 10ms then burns CPU for 5ms
> - again before finishing.  w1 and w2 burn CPU for 5ms then sleep for
> - 10ms.
> -
> -Ignoring all other tasks, works and processing overhead, and assuming
> -simple FIFO scheduling, the following is one highly simplified version
> -of possible sequences of events with the original wq.
> -
> - TIME IN MSECS       EVENT
> - 0           w0 starts and burns CPU
> - 5           w0 sleeps
> - 15          w0 wakes up and burns CPU
> - 20          w0 finishes
> - 20          w1 starts and burns CPU
> - 25          w1 sleeps
> - 35          w1 wakes up and finishes
> - 35          w2 starts and burns CPU
> - 40          w2 sleeps
> - 50          w2 wakes up and finishes
> -
> -And with cmwq with @max_active >= 3,
> -
> - TIME IN MSECS       EVENT
> - 0           w0 starts and burns CPU
> - 5           w0 sleeps
> - 5           w1 starts and burns CPU
> - 10          w1 sleeps
> - 10          w2 starts and burns CPU
> - 15          w2 sleeps
> - 15          w0 wakes up and burns CPU
> - 20          w0 finishes
> - 20          w1 wakes up and finishes
> - 25          w2 wakes up and finishes
> -
> -If @max_active == 2,
> -
> - TIME IN MSECS       EVENT
> - 0           w0 starts and burns CPU
> - 5           w0 sleeps
> - 5           w1 starts and burns CPU
> - 10          w1 sleeps
> - 15          w0 wakes up and burns CPU
> - 20          w0 finishes
> - 20          w1 wakes up and finishes
> - 20          w2 starts and burns CPU
> - 25          w2 sleeps
> - 35          w2 wakes up and finishes
> -
> -Now, let's assume w1 and w2 are queued to a different wq q1 which has
> -WQ_CPU_INTENSIVE set,
> -
> - TIME IN MSECS       EVENT
> - 0           w0 starts and burns CPU
> - 5           w0 sleeps
> - 5           w1 and w2 start and burn CPU
> - 10          w1 sleeps
> - 15          w2 sleeps
> - 15          w0 wakes up and burns CPU
> - 20          w0 finishes
> - 20          w1 wakes up and finishes
> - 25          w2 wakes up and finishes
> -
> -
> -6. Guidelines
> -
> -* Do not forget to use WQ_MEM_RECLAIM if a wq may process work items
> -  which are used during memory reclaim.  Each wq with WQ_MEM_RECLAIM
> -  set has an execution context reserved for it.  If there is
> -  dependency among multiple work items used during memory reclaim,
> -  they should be queued to separate wq each with WQ_MEM_RECLAIM.
> -
> -* Unless strict ordering is required, there is no need to use ST wq.
> -
> -* Unless there is a specific need, using 0 for @max_active is
> -  recommended.  In most use cases, concurrency level usually stays
> -  well under the default limit.
> -
> -* A wq serves as a domain for forward progress guarantee
> -  (WQ_MEM_RECLAIM, flush and work item attributes.  Work items which
> -  are not involved in memory reclaim and don't need to be flushed as a
> -  part of a group of work items, and don't require any special
> -  attribute, can use one of the system wq.  There is no difference in
> -  execution characteristics between using a dedicated wq and a system
> -  wq.
> -
> -* Unless work items are expected to consume a huge amount of CPU
> -  cycles, using a bound wq is usually beneficial due to the increased
> -  level of locality in wq operations and work item execution.
> -
> -
> -7. Debugging
> -
> -Because the work functions are executed by generic worker threads
> -there are a few tricks needed to shed some light on misbehaving
> -workqueue users.
> -
> -Worker threads show up in the process list as:
> -
> -root      5671  0.0  0.0      0     0 ?        S    12:07   0:00 
> [kworker/0:1]
> -root      5672  0.0  0.0      0     0 ?        S    12:07   0:00 
> [kworker/1:2]
> -root      5673  0.0  0.0      0     0 ?        S    12:12   0:00 
> [kworker/0:0]
> -root      5674  0.0  0.0      0     0 ?        S    12:13   0:00 
> [kworker/1:0]
> -
> -If kworkers are going crazy (using too much cpu), there are two types
> -of possible problems:
> -
> -     1. Something being scheduled in rapid succession
> -     2. A single work item that consumes lots of cpu cycles
> -
> -The first one can be tracked using tracing:
> -
> -     $ echo workqueue:workqueue_queue_work > 
> /sys/kernel/debug/tracing/set_event
> -     $ cat /sys/kernel/debug/tracing/trace_pipe > out.txt
> -     (wait a few secs)
> -     ^C
> -
> -If something is busy looping on work queueing, it would be dominating
> -the output and the offender can be determined with the work item
> -function.
> -
> -For the second type of problems it should be possible to just check
> -the stack trace of the offending worker thread.
> -
> -     $ cat /proc/THE_OFFENDING_KWORKER/stack
> -
> -The work item's function should be trivially visible in the stack
> -trace.
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1cd38a7..777fe69 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -13101,7 +13101,7 @@ T:    git 
> git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git
>  S:   Maintained
>  F:   include/linux/workqueue.h
>  F:   kernel/workqueue.c
> -F:   Documentation/workqueue.txt
> +F:   Documentation/workqueue.rst
>  
>  X-POWERS MULTIFUNCTION PMIC DEVICE DRIVERS
>  M:   Chen-Yu Tsai <w...@csie.org>
> diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
> index 8455869..a7e72ac 100644
> --- a/include/linux/workqueue.h
> +++ b/include/linux/workqueue.h
> @@ -276,7 +276,7 @@ static inline unsigned int work_static(struct work_struct 
> *work) { return 0; }
>  
>  /*
>   * Workqueue flags and constants.  For details, please refer to
> - * Documentation/workqueue.txt.
> + * Documentation/workqueue.rst.
>   */
>  enum {
>       WQ_UNBOUND              = 1 << 1, /* not bound to any cpu */
> @@ -374,7 +374,7 @@ __alloc_workqueue_key(const char *fmt, unsigned int 
> flags, int max_active,
>   * @args...: args for @fmt
>   *
>   * Allocate a workqueue with the specified parameters.  For detailed
> - * information on WQ_* flags, please refer to Documentation/workqueue.txt.
> + * information on WQ_* flags, please refer to Documentation/workqueue.rst.
>   *
>   * The __lock_name macro dance is to guarantee that single lock_class_key
>   * doesn't end up with different namesm, which isn't allowed by lockdep.

-- 
Jani Nikula, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to