Re: [ceph-users] Disabling write cache on SATA HDDs reduces write latency 7 times

Marc Roos Sun, 11 Nov 2018 02:56:14 -0800


I just did a very short test and don’t see any difference with this 
cache on or off, so I am leaving it on for now.






-----Original Message-----
From: Ashley Merrick [mailto:singap...@amerrick.co.uk] 
Sent: Sunday, 11 November 2018 11:43
To: Marc Roos
Cc: ceph-users; vitalif
Subject: Re: [ceph-users] Disabling write cache on SATA HDDs reduces 
write latency 7 times

I don’t have any SSDs in the cluster to test.

Also, without knowing exactly why having it enabled has such a 
negative effect, I wouldn’t be sure the same would hold on SSDs.

On Sun, 11 Nov 2018 at 6:41 PM, Marc Roos <m.r...@f1-outsourcing.eu> 
wrote:


         
        
        Does it make sense to test disabling this on hdd cluster only?
        
        
        -----Original Message-----
        From: Ashley Merrick [mailto:singap...@amerrick.co.uk] 
        Sent: Sunday, 11 November 2018 6:24
        To: vita...@yourcmc.ru
        Cc: ceph-users@lists.ceph.com
        Subject: Re: [ceph-users] Disabling write cache on SATA HDDs
        reduces write latency 7 times
        
        I've just worked out I had the same issue; I've been trying to
        work out the cause for the past few days!
        
        However, I am using brand new enterprise Toshiba drives with 256MB
        write cache, and was seeing I/O wait peaks of 40% even during a
        small write operation to Ceph, with commit/apply latencies of
        40ms+.
        
        I just went through and disabled the write cache on each drive and
        ran a few tests: write performance is exactly the same, but I/O
        wait is below 1% and commit/apply latencies are 1-3ms at most.
        
        Something somewhere definitely doesn't seem to like the write cache
        being enabled on the disks. This is an EC pool on the latest Mimic
        version.
        
        On Sun, Nov 11, 2018 at 5:34 AM Vitaliy Filippov
        <vita...@yourcmc.ru> wrote:
        
        
                Hi
        
                A weird thing happens in my test cluster made from desktop
                hardware.
        
                The command `for i in /dev/sd?; do hdparm -W 0 $i; done`
                increases single-thread write iops (reduces latency) 7 times!
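The loop above touches every /dev/sd? device, SSDs included. A minimal sketch of restricting it to spinning disks, assuming the kernel's `queue/rotational` flag in sysfs is a reliable HDD indicator on the hardware in question. The function names and the `SYSFS_BLOCK` and `DRY_RUN` variables are made up for this illustration; by default the sketch only prints the commands it would run.

```shell
#!/bin/sh
# List rotational (HDD) sd* block devices by reading sysfs.
# SYSFS_BLOCK is a hypothetical override so the sketch can be tried
# against a fake directory tree; it defaults to the real /sys/block.
list_rotational_disks() {
    sysfs="${SYSFS_BLOCK:-/sys/block}"
    for dev in "$sysfs"/sd*; do
        # Skip entries without a readable flag (also covers "no match").
        [ -r "$dev/queue/rotational" ] || continue
        if [ "$(cat "$dev/queue/rotational")" = "1" ]; then
            echo "/dev/$(basename "$dev")"
        fi
    done
}

# Disable the volatile write cache on each HDD found above.
# DRY_RUN=1 (the default) only prints the hdparm commands.
disable_hdd_write_cache() {
    for disk in $(list_rotational_disks); do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "hdparm -W 0 $disk"
        else
            hdparm -W 0 "$disk"
        fi
    done
}
```

Calling `disable_hdd_write_cache` prints the planned commands; `DRY_RUN=0 disable_hdd_write_cache` as root applies them.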
        
                It is a 3-node cluster with Ryzen 2700 CPUs. Each host has
                3x SATA 7200rpm HDDs, 1x SATA desktop SSD for the system
                and ceph-mon, and 1x SATA server SSD for block.db/wal.
                Hosts are linked by 10Gbit Ethernet (not the fastest one
                though; average RTT according to flood-ping is 0.098ms).
                Ceph and OpenNebula are installed on the same hosts, and
                OSDs are prepared with ceph-volume and bluestore with
                default options. SSDs have capacitors ('power-loss
                protection'), and write cache has been turned off for them
                since the very beginning (hdparm -W 0 /dev/sdb). They're
                quite old, but each of them is capable of delivering
                ~22000 iops in journal mode (fio -sync=1 -direct=1
                -iodepth=1 -bs=4k -rw=write).
        
                However, the RBD single-threaded random-write benchmark
                originally gave awful results. When testing with `fio
                -ioengine=libaio -size=10G -sync=1 -direct=1 -name=test
                -bs=4k -iodepth=1 -rw=randwrite -runtime=60
                -filename=./testfile` from inside a VM, the result was
                only 58 iops on average (17ms latency). This was not what
                I expected from the HDD+SSD setup.
        
                But today I tried playing with cache settings for the data
                disks, and I was really surprised to discover that just
                disabling the HDD write cache (hdparm -W 0 /dev/sdX for
                all HDD devices) increases single-threaded performance ~7
                times! The result from the same VM (without even rebooting
                it) is iops=405, avg lat=2.47ms. That's almost an order of
                magnitude faster, and in fact 2.5ms seems like roughly the
                expected number.
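The two results are consistent with each other: at -iodepth=1 only one write is in flight at a time, so IOPS and average latency are reciprocals. A minimal sketch of that arithmetic, using only the figures quoted above (the function names are made up for illustration):

```shell
#!/bin/sh
# Average latency in milliseconds implied by an IOPS figure at queue depth 1.
iops_to_latency_ms() {
    awk -v iops="$1" 'BEGIN { printf "%.2f\n", 1000 / iops }'
}

# Speedup factor between two IOPS figures (before, after).
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", after / before }'
}

iops_to_latency_ms 58    # prints 17.24 (write cache on)
iops_to_latency_ms 405   # prints 2.47  (write cache off)
speedup 58 405           # prints 7.0
```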
        
                As I understand it, 4k writes are always deferred at the
                default setting of prefer_deferred_size_hdd=32768, which
                means they should only have to be written to the journal
                device before the OSD acks the write operation.
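The reasoning above can be sketched as a simple size comparison. This only illustrates the understanding stated in the message and is not BlueStore's actual code: the function names are made up, and treating a write exactly at the threshold as deferred is an assumption (it makes no difference for 4096 versus 32768).

```shell
#!/bin/sh
# Succeed when a write of SIZE bytes would be deferred under THRESHOLD
# (default 32768, the prefer_deferred_size_hdd value quoted above).
# Assumption: writes at or below the threshold are deferred; the exact
# boundary check in BlueStore may differ.
is_deferred_write() {
    size="$1"
    threshold="${2:-32768}"
    [ "$size" -le "$threshold" ]
}

# Print a one-line verdict for a write size, for quick experiments.
describe_write() {
    if is_deferred_write "$1" "${2:-32768}"; then
        echo "$1 bytes: deferred"
    else
        echo "$1 bytes: not deferred"
    fi
}
```

With the default threshold, `describe_write 4096` reports the 4k benchmark writes as deferred, which is why only the journal device would be expected to matter for commit latency.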
        
                So my question is WHY? Why does the HDD write cache affect
                commit latency with the WAL on an SSD?
        
                I would also appreciate it if anybody with a similar setup
                (HDD+SSD with desktop SATA controllers or an HBA) could
                test the same thing...
        
                -- 
                With best regards,
                   Vitaliy Filippov
                _______________________________________________
                ceph-users mailing list
                ceph-users@lists.ceph.com
                http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
        
        
        
        


