On 14.9.2017 at 00:39, Dale Stephenson wrote:

On Sep 13, 2017, at 4:19 PM, Zdenek Kabelac <zkabe...@redhat.com> wrote:

On 13.9.2017 at 17:33, Dale Stephenson wrote:
Distribution: centos-release-7-3.1611.el7.centos.x86_64
Kernel: Linux 3.10.0-514.26.2.el7.x86_64
LVM: 2.02.166(2)-RHEL7 (2016-11-16)
Volume group consisted of an 8-drive SSD (500G drives) array, plus an 
additional SSD of the same size.  The array had 64k stripes.
Thin pool had the -Zn option and a 512k chunksize (full stripe), size 3T with 
a 16G metadata volume.  Data was entirely on the 8-drive RAID, metadata was 
entirely on the 9th drive.
Virtual volume “thin” was 300 GB.  I also filled it with dd so that it would be 
fully provisioned before the test.
Volume “thick” was also 300GB, just an ordinary volume also entirely on the 
8-drive array.
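
For context, a layout like the one just described could be assembled roughly as 
follows; the lvconvert-based metadata placement is an assumption, not necessarily 
the exact commands used (PV names are taken from the lvs output further down):

  # pvcreate /dev/md127 /dev/sdb4
  # vgcreate volgr0 /dev/md127 /dev/sdb4
  # lvcreate -L 3T -n thinpool volgr0 /dev/md127      (data on the 8-drive RAID 0)
  # lvcreate -L 16G -n thinmeta volgr0 /dev/sdb4      (metadata on the 9th SSD)
  # lvconvert --type thin-pool -Zn --chunksize 512k \
              --poolmetadata volgr0/thinmeta volgr0/thinpool
  # lvcreate -V 300G -T volgr0/thinpool -n thin
  # lvcreate -L 300G -n thick volgr0 /dev/md127
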
Four tests were run directly against each volume using fio-2.2.8: random read, 
random write, sequential read, and sequential write.  Single thread, 4k blocksize, 
90s run time.
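
A single run of one of those tests could be invoked roughly like this; the 
ioengine, iodepth and job name are assumptions on top of the stated 
single-thread / 4k / 90s parameters:

  # fio --name=randwrite-thin --filename=/dev/volgr0/thin --direct=1 \
        --rw=randwrite --bs=4k --ioengine=libaio --iodepth=1 --numjobs=1 \
        --runtime=90 --time_based

Repeating the same command with --rw=randread, read and write, and with 
--filename=/dev/volgr0/thick, covers the eight combinations described above.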

Hi

Can you please provide output of:

lvs -a -o+stripes,stripesize,seg_pe_ranges

so we can see how your stripes are placed on devices?

Sure, thank you for your help:
# lvs -a -o+stripes,stripesize,seg_pe_ranges
   LV               VG     Attr       LSize   Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert #Str Stripe PE Ranges
   [lvol0_pmspare]  volgr0 ewi-------  16.00g                                                             1     0  /dev/md127:867328-871423
   thick            volgr0 -wi-a----- 300.00g                                                             1     0  /dev/md127:790528-867327
   thin             volgr0 Vwi-a-t--- 300.00g thinpool        100.00                                      0     0
   thinpool         volgr0 twi-aot---   3.00t                 9.77   0.13                                 1     0  thinpool_tdata:0-786431
   [thinpool_tdata] volgr0 Twi-ao----   3.00t                                                             1     0  /dev/md127:0-786431
   [thinpool_tmeta] volgr0 ewi-ao----  16.00g                                                             1     0  /dev/sdb4:0-4095

md127 is an 8-drive RAID 0

As you can see, there’s no lvm striping; I rely on the software RAID underneath 
for that.  Both thick and thin lvols are on the same PV.
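
For reference, an 8-drive RAID 0 with 64k chunks like md127 would typically have 
been built along these lines (the member device names are assumptions):

  # mdadm --create /dev/md127 --level=0 --raid-devices=8 --chunk=64 \
          /dev/sd[c-j]1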

SSDs typically work best when writes are done in 512K chunks.

I could create the md to use 512k chunks for RAID 0, but I wouldn’t expect that 
to have any impact on a single threaded test using 4k request size.  Is there a 
hidden relationship that I’m unaware of?


Yep - it seems the setup in this case is the best fit.

If you can reevaluate different setups you may possibly get much higher throughput.

My guess would be that the best target layout is probably striping across no more than 2-3 disks, using a bigger stripe block.

And then just 'join' the smaller arrays together in lvm2 into one big LV.



(something like  'lvcreate -LXXX -i8 -I512k vgname’)
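
A rough sketch of the 'stripe fewer disks, then join the smaller arrays in LVM' 
idea above might look like this; device names, chunk size and the reuse of the 
volgr0 name are purely illustrative assumptions:

  # mdadm --create /dev/md1 --level=0 --raid-devices=2 --chunk=512 /dev/sdc1 /dev/sdd1
  # mdadm --create /dev/md2 --level=0 --raid-devices=2 --chunk=512 /dev/sde1 /dev/sdf1
  # mdadm --create /dev/md3 --level=0 --raid-devices=2 --chunk=512 /dev/sdg1 /dev/sdh1
  # mdadm --create /dev/md4 --level=0 --raid-devices=2 --chunk=512 /dev/sdi1 /dev/sdj1
  # pvcreate /dev/md1 /dev/md2 /dev/md3 /dev/md4
  # vgcreate volgr0 /dev/md1 /dev/md2 /dev/md3 /dev/md4
  # lvcreate -L 3T -n pooldata volgr0      (linear, i.e. concatenated across the four PVs)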

Would making lvm stripe on top of an md that already stripes confer any 
performance benefit in general, or for small (4k) requests in particular?

Rule #1 - try to avoid 'over-combining' things together.
 - measure performance from the 'bottom' upward in your device stack.
If the underlying devices give poor speed - you can't make it better by any super-smart disk layout on top of it.
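
In practice that means running the same fio job against each layer, e.g. first 
the raw md device and then the LV on top of it (read-only tests shown here so 
nothing is overwritten; the exact flags are assumptions):

  # fio --name=md-raw --filename=/dev/md127 --readonly --direct=1 \
        --rw=randread --bs=4k --runtime=90 --time_based
  # fio --name=lv-thin --filename=/dev/volgr0/thin --readonly --direct=1 \
        --rw=randread --bs=4k --runtime=90 --time_based

If the two results are close, the bottleneck is below LVM; if they diverge, the LVM layer is worth a closer look.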



Wouldn't it be 'faster' to just concatenate 8 disks together instead of striping - 
or stripe only across 2 disks - and then concatenate 4 such striped areas…

For sustained throughput I would expect striping of 8 disks to blow away 
concatenation; however, for small requests I wouldn't expect any advantage.  
On a non-redundant array, I would expect a single-threaded test using 4k 
requests to end up reading/writing data from exactly one disk 
regardless of whether the underlying drives are concatenated or striped.

It always depends on which kind of load you expect the most.

I suspect spreading 4K blocks across 8 SSDs is likely very far from an ideal layout.

Any SSD is typically very bad with 4K blocks - if you want to 'spread' the load over more SSDs, do not use less than a 64K stripe chunk per SSD - this gives you an (8*64) 512K full stripe size.

As for the thin-pool chunksize - if you plan to use lots of snapshots, keep the value as low as possible - a 64K or 128K thin-pool chunksize.
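
The chunk size is fixed when the pool is created and cannot be changed afterwards, 
so it has to be chosen up front; something along these lines (sizes are only 
illustrative):

  # lvcreate --type thin-pool -L 3T --chunksize 64k --poolmetadatasize 16G \
             -n thinpool volgr0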

But I'd still suggest reevaluating/benchmarking a setup where you use a much lower number of SSDs for load spreading - and bigger stripe chunks per device. This should nicely improve performance for 'bigger' writes
without slowing things down all that much for 4K loads....


What is the best choice for handling 4k request sizes?

Possibly NVMe can do a better job here.

Regards

Zdenek

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
