For anyone else who has this problem, we have reduced the time required to trim 
a 1.3TB volume from 3 days to 1.5 minutes.

Initially, we used mdraid to build a RAID 0 array with a 32K chunk size. We 
initialized it as a DRBD disk, synced it, built an LVM logical volume on it, 
and created an ext4 filesystem on the volume. Creating the filesystem and 
trimming it took 3 days, every time, across multiple tests.
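
For anyone who wants to follow along, here is a minimal sketch of that 
stack. The device names, DRBD resource name, and volume names are 
placeholders, not taken from our actual configuration:

    # RAID 0 over the six SSDs with a 32K chunk (mdadm's --chunk unit is KB)
    mdadm --create /dev/md0 --level=0 --raid-devices=6 --chunk=32 /dev/sd[b-g]
    # point the DRBD resource's "disk" option at /dev/md0, then bring it up
    drbdadm create-md r0
    drbdadm up r0
    drbdadm primary --force r0        # and let the initial sync finish
    # LVM and the filesystem go on top of the replicated device
    pvcreate /dev/drbd0
    vgcreate vg0 /dev/drbd0
    lvcreate -l 100%FREE -n lv0 vg0
    mkfs.ext4 /dev/vg0/lv0            # the fs creation and subsequent
                                      # fstrim are where the 3 days went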

When running lsblk -D, we noticed that the DISC-MAX value for the array was 
only 32K, compared to 4GB for the SSD drive itself, and that the number 
matched the chunk size. We deleted the array and built a new one with a 4MB 
chunk size, the maximum selectable. The DISC-MAX value changed to 4MB 
accordingly (still far below the other DISC-MAX values shown in lsblk -D). 
Evidently, when using mdadm, the DISC-MAX value ends up matching the array 
chunk size. We theorized that the small DISC-MAX value was responsible for 
the slow trim rate across the DRBD link.
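
The check-and-rebuild sequence was roughly the following (same placeholder 
names as above; the comments give the DISC-MAX values we saw):

    lsblk -D /dev/md0              # DISC-MAX: 32K, matching the chunk size
    mdadm --stop /dev/md0
    mdadm --zero-superblock /dev/sd[b-g]
    mdadm --create /dev/md0 --level=0 --raid-devices=6 --chunk=4096 \
        /dev/sd[b-g]               # 4096K = 4M, the largest chunk allowed
    lsblk -D /dev/md0              # DISC-MAX: 4M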

Instead of using mdadm to build the array, we used LVM to create a striped 
logical volume and made that the backing device for DRBD. lsblk -D then 
showed a DISC-MAX of 128MB, and creating an ext4 filesystem on it and 
trimming it took only 1.5 minutes (again across multiple tests).
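
A sketch of the working layout (placeholders again; -i is the stripe count, 
-I the stripe size in KB, and the -I value here is only illustrative):

    pvcreate /dev/sd[b-g]
    vgcreate vg0 /dev/sd[b-g]
    lvcreate -i 6 -I 512 -l 100%FREE -n lv0 vg0  # stripe across all six PVs
    # make /dev/vg0/lv0 the backing "disk" of the DRBD resource, then:
    drbdadm create-md r0
    drbdadm up r0
    drbdadm primary --force r0
    mkfs.ext4 /dev/drbd0           # the filesystem now sits on DRBD directly
    # after mounting, fstrim completes in about 1.5 minutes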

Somebody knowledgeable may be able to explain exactly how DISC-MAX affects 
trim speed, and why the DISC-MAX value differs when the array is created 
with mdadm versus LVM.
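
For whoever digs into it: the values lsblk -D reports come straight from 
sysfs, so the discard limits can be compared at every layer of the stack 
(device names below are placeholders):

    # largest single discard request the device accepts (lsblk's DISC-MAX)
    cat /sys/block/md0/queue/discard_max_bytes
    cat /sys/block/drbd0/queue/discard_max_bytes
    # discard granularity (lsblk's DISC-GRAN)
    cat /sys/block/drbd0/queue/discard_granularity

If our theory is right, a tiny discard_max_bytes forces fstrim to split the 
trim of the whole volume into an enormous number of small discard requests, 
each of which must be replicated over the DRBD link.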

--
Eric Robinson

> -----Original Message-----
> From: Ulrich Windl [mailto:ulrich.wi...@rz.uni-regensburg.de]
> Sent: Wednesday, August 02, 2017 11:36 PM
> To: users@clusterlabs.org
> Subject: [ClusterLabs] Antw: Re: Antw: DRBD and SSD TRIM - Slow!
> 
> >>> Eric Robinson <eric.robin...@psmnv.com> wrote on 02.08.2017 at
> >>> 23:20 in message
> <DM5PR03MB2729C66CEC1E3B8B9E297185FAB00@DM5PR03MB2729.namprd03.prod.outlook.com>
> 
> > 1) iotop did not show any significant io, just maybe 30k/second of
> > drbd traffic.
> >
> > 2) okay. I've never done that before. I'll give it a shot.
> >
> > 3) I'm not sure what I'm looking at there.
> 
> See /usr/src/linux/Documentation/block/stat.txt ;-) I wrote an NRPE plugin
> to monitor those with performance data and verbose text output, e.g.:
> CFS_VMs-xen: [delta 120s], 1.15086 IO/s read, 60.7789 IO/s write, 0 req/s
> read merges, 0 req/s write merges, 4.53674 sec/s read, 486.231 sec/s write,
> 2.36844 ms/s read wait, 2702.19 ms/s write wait, 0 req in_flight, 115.987 ms/s
> active, 2704.53 ms/s wait
> 
> Regards,
> Ulrich
> 
> >
> > --
> > Eric Robinson
> >
> >> -----Original Message-----
> >> From: Ulrich Windl [mailto:ulrich.wi...@rz.uni-regensburg.de]
> >> Sent: Tuesday, August 01, 2017 11:28 PM
> >> To: users@clusterlabs.org
> >> Subject: [ClusterLabs] Antw: DRBD and SSD TRIM - Slow!
> >>
> >> Hi!
> >>
> >> I know little about trim operations, but you could try one of these:
> >>
> >> 1) iotop to see whether some I/O is done during trimming (assuming
> >> trimming itself is not considered to be I/O)
> >>
> >> 2) Try blktrace on the affected devices to see what's going on. It's
> >> hard to set up and to extract the info you are looking for, but it
> >> provides deep insights.
> >>
> >> 3) Watch /sys/block/$BDEV/stat for performance statistics. I don't
> >> know how well DRBD supports these, however (e.g. MDRAID shows no wait
> >> times and no busy operations, while a multipath map has it all).
> >>
> >> Regards,
> >> Ulrich
> >>
> >> >>> Eric Robinson <eric.robin...@psmnv.com> wrote on 02.08.2017 at
> >> >>> 07:09 in message
> >> <DM5PR03MB27297014DF96DC01FE849A63FAB00@DM5PR03MB2729.namprd03.prod.outlook.com>
> >>
> >> > Does anyone know why trimming a filesystem mounted on a DRBD volume
> >> > takes so long? I mean like three days to trim a 1.2TB filesystem.
> >> >
> >> > Here are some pertinent details:
> >> >
> >> > OS: SLES 12 SP2
> >> > Kernel: 4.4.74-92.29
> >> > Drives: 6 x Samsung SSD 840 Pro 512GB
> >> > RAID: 0 (mdraid)
> >> > DRBD: 9.0.8
> >> > Protocol: C
> >> > Network: Gigabit
> >> > Utilization: 10%
> >> > Latency: < 1ms
> >> > Loss: 0%
> >> > Iperf test: 900 mbits/sec
> >> >
> >> > When I write to a non-DRBD partition, I get 400MB/sec (bypassing caches).
> >> > When I trim a non-DRBD partition, it completes fast.
> >> > When I write to a DRBD volume, I get 80MB/sec.
> >> >
> >> > When I trim a DRBD volume, it takes bloody ages!
> >> >
> >> > --
> >> > Eric Robinson

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
