Commit 1ae399382512 ("Btrfs: do not use async submit for small DIO io's")
benefits small DIO io's.

However, if we're owning the SINGLE profile, this also affects large DIO io's
since in that case, map_length is (chunk_length - bio's offset_in_chunk),
it's farily large so that it's very likely to be larger than a large bio's
size, which avoids asynchronous submit.
For instance, if we have a 512k bio, the efforts of calculating (512k/4k=128)
checksums will be taken by the DIO task.

Test results with fio (tested on a hard disk, not tested on ssd, 4cpu, 8g 
memory)

bs      async   sync            async   sync
        bw      bw(KB/S)        iops    iop
4k      115312  115480          28827.6 28869.6
8k      114381  115586          14297.4 14447.6
16k     115393  116290          7211.4  7267.6
32k     114268  116589          3570.4  3643
64k     115421  113417          1803    1771.8  <-----ASYNC wins here
128k    115545  112585          902     879
256k    115178  111521          449.2   435
512k    115874  111620          226     217.6

This adds a limit 'BTRFS_STRIPE_LEN(64k)' to decide if it's small enough to 
avoid
asynchronous submit.

Still, in this case we don't need to split the bio and can submit it directly.

Signed-off-by: Liu Bo <[email protected]>
---
The job in the test,

[global]
rw=write
ioengine=libaio
direct=1
iodepth=64
iodepth_batch=64
iodepth_batch_complete=64
iodepth_low=64
bs=4k
size=8g
sync=0
group_reporting
fallocate=posix
invalidate=1
runtime=30

[dio]
directory=/mnt/btrfs
filename=foobar

 fs/btrfs/inode.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index e687bb0..c640d7e 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7792,6 +7792,7 @@ static int btrfs_submit_direct_hook(int rw, struct 
btrfs_dio_private *dip,
        int nr_pages = 0;
        int ret;
        int async_submit = 0;
+       u64 alloc_profile;
 
        map_length = orig_bio->bi_iter.bi_size;
        ret = btrfs_map_block(root->fs_info, rw, start_sector << 9,
@@ -7799,15 +7800,26 @@ static int btrfs_submit_direct_hook(int rw, struct 
btrfs_dio_private *dip,
        if (ret)
                return -EIO;
 
+       alloc_profile = btrfs_get_alloc_profile(root, 1);
+
        if (map_length >= orig_bio->bi_iter.bi_size) {
                bio = orig_bio;
                dip->flags |= BTRFS_DIO_ORIG_BIO_SUBMITTED;
+
+               /*
+                * In the case of 'single' profile, the above check is very
+                * likely to be true as map_length is (chunk_length - offset),
+                * so checking BTRFS_STRIPE_LEN here.
+                */
+               if ((alloc_profile & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 &&
+                   orig_bio->bi_iter.bi_size >= BTRFS_STRIPE_LEN)
+                       async_submit = 1;
+
                goto submit;
        }
 
        /* async crcs make it difficult to collect full stripe writes. */
-       if (btrfs_get_alloc_profile(root, 1) &
-           (BTRFS_BLOCK_GROUP_RAID5 | BTRFS_BLOCK_GROUP_RAID6))
+       if (alloc_profile & (BTRFS_BLOCK_GROUP_RAID5 | BTRFS_BLOCK_GROUP_RAID6))
                async_submit = 0;
        else
                async_submit = 1;
-- 
1.8.1.4

--
To unsubscribe from this list: send the line "unsubscribe fstests" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to