On Thu, May 09, 2013 at 10:33:16AM +0800, Mia Lueng wrote:
> # sysctl -a|grep dirty
> vm.dirty_background_ratio = 10
> vm.dirty_background_bytes = 0
> vm.dirty_ratio = 20
> vm.dirty_bytes = 0
> vm.dirty_writeback_centisecs = 500
> vm.dirty_expire_centisecs = 3000
> 
> bandwidth is 100 Mbit/s

With 100 Mbit/s, you can replicate around 10 to 12 MByte per second.
To avoid long "write-out stalls" when flushing caches,
you should not allow more than about 20 MByte of dirty data
(roughly two seconds worth of flushing at that rate),
and start write-out much earlier:

vm.dirty_bytes=20100100
vm.dirty_background_bytes=500100
vm.dirty_writeback_centisecs=97
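
For example, to apply these at runtime:

        sysctl -w vm.dirty_bytes=20100100
        sysctl -w vm.dirty_background_bytes=500100
        sysctl -w vm.dirty_writeback_centisecs=97

Put the same three lines into /etc/sysctl.conf to have them
survive a reboot.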

A ratio of 20 % of available RAM may well mean several GB.
How much RAM do you have?
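
As a worked example: with, say, 16 GiB of RAM, vm.dirty_ratio=20
allows about 3.2 GiB of dirty data, and flushing that over a
10 MByte/s link takes well over 300 seconds -- the same order of
magnitude as your 600 second umount.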

Depending on the usage patterns and data characteristics
you actually have in production, you may want to try drbd-proxy,
or check with LINBIT what other options you have.

> 2013/5/9 Lars Ellenberg <[email protected]>
> 
> > On Thu, May 09, 2013 at 12:16:56AM +0800, Mia Lueng wrote:
> > > In drbd 8.4.3, I ran the following test:
> > >
> > > [root@kvm3 drbd.d]# drbdadm dump drbd0
> > > # resource drbd0 on kvm3: not ignored, not stacked
> > > # defined at /etc/drbd.d/drbd0.res:1
> > > resource drbd0 {
> > >     on kvm3 {
> > >         device           /dev/drbd0 minor 0;
> > >         disk             /dev/vg_kvm3/drbd0;
> > >         meta-disk        internal;
> > >         address          ipv4 192.168.10.6:7700;
> > >     }
> > >     on kvm4 {
> > >         device           /dev/drbd0 minor 0;
> > >         disk             /dev/vg_kvm4/drbd0;
> > >         meta-disk        internal;
> > >         address          ipv4 192.168.10.7:7700;
> > >     }
> > >     net {
> > >         protocol           A;
> > >         csums-alg        md5;
> > >         verify-alg       md5;
> > >         ping-timeout      30;
> > >         ping-int          30;
> > >         max-epoch-size   8192;
> > >         max-buffers      8912;
> > >         unplug-watermark 131072;
> > >     }
> > >     disk {
> > >         on-io-error      pass_on;
> > >         disk-barrier      no;
> > >         disk-flushes      no;
> > >         resync-rate      100M;
> > >         c-plan-ahead      20;
> > >         c-delay-target   100;
> > >         c-max-rate       400M;
> > >         c-min-rate        2M;
> > >         al-extents       601;
> > >     }
> > > }
> > >
> > > [root@kvm3 oradata]# dd if=t1 of=t2 bs=1M
> > > 5585+1 records in
> > > 5585+1 records out
> > > 5856305152 bytes (5.9 GB) copied, 286.119 s, 20.5 MB/s
> >
> > That writes to the page cache, and from there to the block device.
> >
> > No fsync, no sync: there will still be a few GB in the cache (RAM only).
> >
> > > [root@kvm3 oradata]# cd
> > > [root@kvm3 ~]# umount /oradata
> > >
> > >
> > > it takes a lot of time (up to 600 seconds) to umount the drbd mount point.
> >
> > On umount, the filesystem obviously has to flush all dirty pages first.
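> >
> > To make the wait explicit, flush before unmounting:
> >
> >         sync            # returns once all dirty data is on stable storage
> >         umount /oradata # then has almost nothing left to flush
> >
> > The total time does not change, but the long wait happens in
> > sync, where you expect it, not in umount.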
> >
> > What is your replication bandwidth?
> >
> > > echo "1" > /proc/sys/vm/block_dump
> > > shows, during the umount:
> > >
> > > [root@kvm3 ~]# dmesg|tail -n 100
> > ...
> > > umount(3958): WRITE block 100925440 on dm-5
> > > umount(3958): WRITE block 100925440 on dm-5
> > > umount(3958): WRITE block 100925440 on dm-5
> > > umount(3958): WRITE block 0 on dm-5
> > > umount(3958): dirtied inode 1053911 (mtab.tmp) on dm-0
> > > umount(3958): dirtied inode 1053911 (mtab.tmp) on dm-0
> > > umount(3958): WRITE block 33845632 on dm-0
> > > umount(3958): dirtied inode 1053912 (?) on dm-0
> > >
> > >
> > > Is this because I use protocol A?
> >
> > No. Protocol A only changes when DRBD considers a write complete;
> > it does not change how much dirty data the page cache may hold.
> >
> > What you do need to understand is caching, and its tunables.
> >
> > Some hints and keywords for a followup search:
> >
> > Check how much "dirty" data (writes not yet on stable storage)
> > is still in RAM:
> > grep Dirty /proc/meminfo
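> > or, to watch it drain while writeback runs, something like:
> > watch -n1 'grep -e Dirty -e Writeback /proc/meminfo'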
> >
> > Tune how much dirty data is "allowed"
> > sysctl
> >         vm.dirty_background_bytes
> >         vm.dirty_bytes
> >         vm.dirty_writeback_centisecs
> >         vm.dirty_expire_centisecs
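> >
> > Note that the _bytes and _ratio variants are counterparts:
> > writing one resets the other to 0, and only the non-zero one
> > is in effect.  You can check what is currently active with:
> > sysctl vm.dirty_bytes vm.dirty_ratio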
> >
> > also compare:
> > time dd if=t1 of=t2 bs=1M; time sync
> > time dd if=t1 of=t2 bs=1M conv=fsync
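> >
> > The first reports the speed of copying into the page cache,
> > and the sync shows how long the real flush takes; with
> > conv=fsync, dd itself waits for the data to reach stable
> > storage, so its reported rate includes the flush.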

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed