Re: [Qemu-devel] [PATCH v5 0/2] block: enforce minimal 4096 alignment in qemu_blockalign
Paolo Bonzini pbonz...@redhat.com writes: -BEGIN PGP SIGNED MESSAGE- Hash: SHA256 On 01/06/2015 12:34, Dmitry Monakhov wrote: Yes. Improvement is not huge, but it can be detected for old qemu unpatched kernel: 728 MiB/sec ± 20Mb patched kernel : 748 MiB/sec ± 10Mb Ok, so about 3-4%. What does the blktrace look like with a patched kernel, and what is the performance with old QEMU? new qemu looks perfect 259,0 31 62 0.711100207 10768 D WS 29364992 + 256 [qemu-io] 259,0 31 63 0.711128098 10768 Q WS 29365248 + 256 [qemu-io] 259,0 31 64 0.711128509 10768 G WS 29365248 + 256 [qemu-io] 259,0 31 65 0.711130463 10768 D WS 29365248 + 256 [qemu-io] 259,0 31 66 0.711134222 10768 C WS 29364480 + 256 [0] 259,0 31 67 0.711158858 10768 Q WS 29365504 + 256 [qemu-io] 259,0 31 68 0.711159226 10768 G WS 29365504 + 256 [qemu-io] 259,0 31 69 0.711161104 10768 D WS 29365504 + 256 [qemu-io] 259,0 31 70 0.711171863 10768 C WS 29364736 + 256 [0] old qemu on patched kernel stil issues two bios Max_sz-1page + 1page, such bios are not merged and completed separately 259,0 31 385 0.719283423 10729 Q WS 29376775 + 248 [qemu-io] 259,0 31 386 0.719283694 10729 G WS 29376775 + 248 [qemu-io] 259,0 31 387 0.719285909 10729 D WS 29376775 + 248 [qemu-io] 259,0 31 388 0.719287600 10729 Q WS 29377023 + 8 [qemu-io] 259,0 31 389 0.719287810 10729 G WS 29377023 + 8 [qemu-io] 259,0 31 390 0.719288666 10729 D WS 29377023 + 8 [qemu-io] 259,0 31 391 0.719315193 10729 Q WS 29377031 + 248 [qemu-io] 259,0 31 392 0.719315400 10729 G WS 29377031 + 248 [qemu-io] 259,0 31 393 0.719317411 10729 D WS 29377031 + 248 [qemu-io] 259,0 31 394 0.719319179 10729 Q WS 29377279 + 8 [qemu-io] 259,0 31 395 0.719319366 10729 G WS 29377279 + 8 [qemu-io] 259,0 31 396 0.719320014 10729 D WS 29377279 + 8 [qemu-io] . 259,0 31 916 0.722132787 10729 C WS 29377031 + 248 [0] 259,0 31 995 0.722549469 10729 C WS 29376775 + 248 [0] 259,0 31 4260 0.748148953 10729 C WS 29377023 + 8 [0] . 259,0 31 4274 0.748217568 10729 C WS 29377279 + 8 [0] I'm not quite sure why each bio dispatched to device that soon, AFAIU plug/unplug scope whole do_direct_IO() loop, So bio it looks like bio submit to request path should be modified. Paolo IMHO patch is not sufficient. We have to correct bio construction inside direct-io code. I'll be back with the patch.
Re: [Qemu-devel] [PATCH v5 0/2] block: enforce minimal 4096 alignment in qemu_blockalign
#secure method=pgpmime mode=sign Paolo Bonzini pbonz...@redhat.com writes: On 01/06/2015 13:16, Dmitry Monakhov wrote: 259,0 31 385 0.719283423 10729 Q WS 29376775 + 248 [qemu-io] 259,0 31 388 0.719287600 10729 Q WS 29377023 + 8 [qemu-io] 259,0 31 391 0.719315193 10729 Q WS 29377031 + 248 [qemu-io] 259,0 31 394 0.719319179 10729 Q WS 29377279 + 8 [qemu-io] Compared to the old one: 9,0 111 0.0 11151 Q WS 312737792 + 1023 [qemu-img] 9,0 112 0.07938 11151 Q WS 312738815 + 8 [qemu-img] 9,0 113 0.30735 11151 Q WS 312738823 + 1016 [qemu-img] 9,0 114 0.32482 11151 Q WS 312739839 + 8 [qemu-img] 9,0 115 0.41379 11151 Q WS 312739847 + 1016 [qemu-img] 9,0 116 0.42818 11151 Q WS 312740863 + 8 [qemu-img] 9,0 117 0.51236 11151 Q WS 312740871 + 1017 [qemu-img] I suspect the difference is that my patch avoids RMW operations on disks that have 512-byte logical blocks and 4K physical blocks. Probably. I've checked bio/request submission path for my nvme device and it is appeared that merge is merges are disabled for that device. /sys/block/nvme0n1/queue/nomerges - 2 (QUEUE_FLAG_NOMERGES) so we follow this trace -blk_mq_make_request(bio) if(!blk_mq_merge_queue_io) blk_mq_run_hw_queue(data.hctx, !is_sync || is_flush_fua); So the only thing we can in order to improve performance is to is to modify fs/direct-io.c - bio submission path by constructing bios with maximum size. Paolo
Re: [Qemu-devel] [PATCH v5 0/2] block: enforce minimal 4096 alignment in qemu_blockalign
Paolo Bonzini pbonz...@redhat.com writes: On 13/05/2015 18:46, Denis V. Lunev wrote: I agree with this. Kernel guys are aware and may be we will have the fix after a while... I have heard (not tested) that performance loss over multi-queue SSD is around 30%. I came up with this patch... can you test it with your test case (and old QEMU) to see if it works for you? I don't see much improvement, but neither do I see it with new QEMU. Yes. Improvement is not huge, but it can be detected for old qemu unpatched kernel: 728 MiB/sec +/- 20Mb patched kernel : 748 MiB/sec +/- 10Mb IMHO patch is not sufficient. We have to correct bio construction inside direct-io code. I'll be back with the patch. Paolo diff --git a/block/bio.c b/block/bio.c index f66a4eae16ee..df5bde7ebded 100644 --- a/block/bio.c +++ b/block/bio.c @@ -705,6 +705,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page { int retried_segments = 0; struct bio_vec *bvec; + int speculative_len; /* * cloned bio must not modify vec list @@ -712,7 +713,16 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page if (unlikely(bio_flagged(bio, BIO_CLONED))) return 0; - if (((bio-bi_iter.bi_size + len) 9) max_sectors) + /* + * If the bio is not page-aligned, stop if we cannot fit this entire + * page in the vector. This is a very large write, so we'd like + * to split it so as to keep the remaining bios aligned. + */ + speculative_len = len; + if (bio-bi_iter.bi_size (PAGE_SIZE - 1) offset == 0) + speculative_len = PAGE_SIZE; + + if (((bio-bi_iter.bi_size + speculative_len) 9) max_sectors) return 0; /* signature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH v5 0/2] block: enforce minimal 4096 alignment in qemu_blockalign
On 01/06/2015 13:16, Dmitry Monakhov wrote: 259,0 31 385 0.719283423 10729 Q WS 29376775 + 248 [qemu-io] 259,0 31 388 0.719287600 10729 Q WS 29377023 + 8 [qemu-io] 259,0 31 391 0.719315193 10729 Q WS 29377031 + 248 [qemu-io] 259,0 31 394 0.719319179 10729 Q WS 29377279 + 8 [qemu-io] Compared to the old one: 9,0 111 0.0 11151 Q WS 312737792 + 1023 [qemu-img] 9,0 112 0.07938 11151 Q WS 312738815 + 8 [qemu-img] 9,0 113 0.30735 11151 Q WS 312738823 + 1016 [qemu-img] 9,0 114 0.32482 11151 Q WS 312739839 + 8 [qemu-img] 9,0 115 0.41379 11151 Q WS 312739847 + 1016 [qemu-img] 9,0 116 0.42818 11151 Q WS 312740863 + 8 [qemu-img] 9,0 117 0.51236 11151 Q WS 312740871 + 1017 [qemu-img] I suspect the difference is that my patch avoids RMW operations on disks that have 512-byte logical blocks and 4K physical blocks. Paolo
Re: [Qemu-devel] [PATCH v5 0/2] block: enforce minimal 4096 alignment in qemu_blockalign
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 On 01/06/2015 12:34, Dmitry Monakhov wrote: Yes. Improvement is not huge, but it can be detected for old qemu unpatched kernel: 728 MiB/sec ± 20Mb patched kernel : 748 MiB/sec ± 10Mb Ok, so about 3-4%. What does the blktrace look like with a patched kernel, and what is the performance with old QEMU? Paolo IMHO patch is not sufficient. We have to correct bio construction inside direct-io code. I'll be back with the patch.
Re: [Qemu-devel] [PATCH v5 0/2] block: enforce minimal 4096 alignment in qemu_blockalign
On 13/05/2015 18:46, Denis V. Lunev wrote: I agree with this. Kernel guys are aware and may be we will have the fix after a while... I have heard (not tested) that performance loss over multi-queue SSD is around 30%. I came up with this patch... can you test it with your test case (and old QEMU) to see if it works for you? I don't see much improvement, but neither do I see it with new QEMU. Paolo diff --git a/block/bio.c b/block/bio.c index f66a4eae16ee..df5bde7ebded 100644 --- a/block/bio.c +++ b/block/bio.c @@ -705,6 +705,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page { int retried_segments = 0; struct bio_vec *bvec; + int speculative_len; /* * cloned bio must not modify vec list @@ -712,7 +713,16 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page if (unlikely(bio_flagged(bio, BIO_CLONED))) return 0; - if (((bio-bi_iter.bi_size + len) 9) max_sectors) + /* +* If the bio is not page-aligned, stop if we cannot fit this entire +* page in the vector. This is a very large write, so we'd like +* to split it so as to keep the remaining bios aligned. +*/ + speculative_len = len; + if (bio-bi_iter.bi_size (PAGE_SIZE - 1) offset == 0) + speculative_len = PAGE_SIZE; + + if (((bio-bi_iter.bi_size + speculative_len) 9) max_sectors) return 0; /*