On Wed, May 07, 2025 at 03:49:42PM -0600, Uday Shankar wrote:
> +Load balancing
> +--------------
> +
> +A simple approach to designing a ublk server might involve selecting a
> +number of I/O handler threads N, creating devices with N queues, and
> +pairing up I/O handler threads with queues, so that each thread gets a
> +unique qid, and it issues ``FETCH_REQ``s against all tags for that qid.
                             ``FETCH_REQ``\s (escape s)
> +Indeed, before the introduction of the ``UBLK_F_RR_TAGS`` feature, this
> +was essentially the only option (*)

Use reST footnote syntax, i.e.:

---- >8 ----
diff --git a/Documentation/block/ublk.rst b/Documentation/block/ublk.rst
index 440b63be4ea8b6..b1d29fceff4e80 100644
--- a/Documentation/block/ublk.rst
+++ b/Documentation/block/ublk.rst
@@ -325,7 +325,7 @@ number of I/O handler threads N, creating devices with N queues, and
 pairing up I/O handler threads with queues, so that each thread gets a
 unique qid, and it issues ``FETCH_REQ``\s against all tags for that qid.
 Indeed, before the introduction of the ``UBLK_F_RR_TAGS`` feature, this
-was essentially the only option (*)
+was essentially the only option [#]_

 This approach can run into performance issues under imbalanced load.
 This architecture taken together with the `blk-mq architecture
@@ -368,8 +368,8 @@ With this setup, I/O submitted on a CPU which maps to queue 0 will be
 balanced across all threads instead of all landing on the same thread.
 Thus, a potential bottleneck is avoided.

-(*) technically, one I/O handling thread could service multiple queues
-if it wanted to, but that doesn't help with imbalanced load
+.. [#] Technically, one I/O handling thread could service multiple queues
+       if it wanted to, but that doesn't help with imbalanced load

 Zero copy
 ---------

> +
> +This approach can run into performance issues under imbalanced load.
> +This architecture taken together with the `blk-mq architecture
> +<https://docs.kernel.org/block/blk-mq.html>`_ implies that there is a
This architecture, taken together with the
:doc:`blk-mq architecture </block/blk-mq>`, implies that ...
> +fixed mapping from I/O submission CPU to the ublk server thread that
> +handles it. If the workload is CPU-bottlenecked, only allowing one ublk
> +server thread to handle all the I/O generated from a single CPU can
> +limit peak bandwidth.
> +
> <snipped>...
> +With these changes, a ublk server can balance load as follows:
> +
> +- create the device with ``UBLK_F_RR_TAGS`` set in
> +  ``ublksrv_ctrl_dev_info::flags`` when issuing the ``ADD_DEV`` command
> +- issue ``FETCH_REQ``s from ublk server threads to (qid,tag) pairs in
> +  a round-robin manner. For example, for a device configured with
> +  ``nr_hw_queues=2`` and ``queue_depth=4``, and a ublk server having 4
> +  I/O handling threads, ``FETCH_REQ``s could be issued as follows, where
> +  each entry in the table is the pair (``ublksrv_io_cmd::q_id``,
> +  ``ublksrv_io_cmd::tag``) in the payload of the ``FETCH_REQ``.

Same here, in both places: ``FETCH_REQ``s should be ``FETCH_REQ``\s
(escape the s after the closing backticks).

> +
> +  ======== ======== ======== ========
> +  thread 0 thread 1 thread 2 thread 3
> +  ======== ======== ======== ========
> +  (0, 0)   (0, 1)   (0, 2)   (0, 3)
> +  (1, 3)   (1, 0)   (1, 1)   (1, 2)

Add the table's bottom border, i.e.:

---- >8 ----
diff --git a/Documentation/block/ublk.rst b/Documentation/block/ublk.rst
index e9cbabdd69c553..dc6fdfedba9ab4 100644
--- a/Documentation/block/ublk.rst
+++ b/Documentation/block/ublk.rst
@@ -362,6 +362,7 @@ With these changes, a ublk server can balance load as follows:
   ======== ======== ======== ========
   (0, 0)   (0, 1)   (0, 2)   (0, 3)
   (1, 3)   (1, 0)   (1, 1)   (1, 2)
+  ======== ======== ======== ========

 With this setup, I/O submitted on a CPU which maps to queue 0 will be
 balanced across all threads instead of all landing on the same thread.
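
As an aside, the example table is consistent with handing the pair
(q_id, tag) to thread (q_id + tag) % nr_threads. The patch does not state
a formula, so that mapping is only my reading of the table, and the
function name below is hypothetical. A minimal sketch that reproduces the
table under that assumption:

```python
def fetch_assignment(nr_hw_queues, queue_depth, nr_threads):
    """Return, per thread, the (q_id, tag) pairs it would FETCH_REQ.

    Assumption (inferred from the example table, not from the patch):
    pair (q_id, tag) goes to thread (q_id + tag) % nr_threads.
    """
    per_thread = [[] for _ in range(nr_threads)]
    for q_id in range(nr_hw_queues):
        for tag in range(queue_depth):
            per_thread[(q_id + tag) % nr_threads].append((q_id, tag))
    return per_thread


if __name__ == "__main__":
    # Same parameters as the documented example:
    # nr_hw_queues=2, queue_depth=4, 4 I/O handling threads.
    for thread, pairs in enumerate(fetch_assignment(2, 4, 4)):
        print(f"thread {thread}: {pairs}")
```

With these parameters every thread ends up holding one tag from each
queue, which is the property the paragraph after the table relies on.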

Thanks.

-- 
An old man doll... just what I always wanted! - Clara
