Re: [linux-lvm] Unsync-ed LVM Mirror

Eric Ren Mon, 05 Feb 2018 06:32:09 -0800

Months ago, I worked on a NULL pointer dereference crash in the dm mirror
target. I worked out two patches to fix the crash, but when I was
submitting them, I found that upstream had "fixed" the crash by
reverting. You can find the discussion here:

   - https://patchwork.kernel.org/patch/9808897/



Zdenek did voice his doubts, but nobody responded:
"""

Which kernel version is this ?

I'd thought we've already fixed this BZ for old mirrors:
https://bugzilla.redhat.com/show_bug.cgi?id=1382382

There is a similar BZ for md-raid based mirrors (--type raid1)
https://bugzilla.redhat.com/show_bug.cgi?id=1416099

My base kernel version is 4.4.68, but with these 2 latest fixes applied:

"""
Revert "dm mirror: use all available legs on multiple failures"


Ohh  - I've -rc6 - while this  'revert' patch went to 4.12-rc7.

I'm now starting to wonder why?

It's been a real fix for a real issue - and 'revert' message states
there is no such problem ??

I'm confused....

Mike  - have you tried the sequence from BZ  ?

Zdenek

"""

I wrongly accepted the facts:

1. the crash does disappear;

2. the "reverting" way of fixing it is likely wrong, but I did not follow it up further because people now mainly use raid1 instead of mirror - my fault for thinking that way.

But I also felt it would be hard to persuade the maintainer to revert the "reverting fixes" and try my fix.

Anyway, why are you using mirror? Why not raid1?

Eric


On 02/05/2018 03:42 PM, Liwei wrote:

Hi Eric,
    Thanks for answering! Here are the details:

# lvm version
  LVM version:     2.02.176(2) (2017-11-03)
  Library version: 1.02.145 (2017-11-03)
  Driver version:  4.37.0

Configuration: ./configure --build=x86_64-linux-gnu --prefix=/usr --includedir=${prefix}/include --mandir=${prefix}/share/man --infodir=${prefix}/share/info --sysconfdir=/etc --localstatedir=/var --disable-silent-rules --libdir=${prefix}/lib/x86_64-linux-gnu --libexecdir=${prefix}/lib/x86_64-linux-gnu --runstatedir=/run --disable-maintainer-mode --disable-dependency-tracking --exec-prefix= --bindir=/bin --libdir=/lib/x86_64-linux-gnu --sbindir=/sbin --with-usrlibdir=/usr/lib/x86_64-linux-gnu --with-optimisation=-O2 --with-cache=internal --with-clvmd=corosync --with-cluster=internal --with-device-uid=0 --with-device-gid=6 --with-device-mode=0660 --with-default-pid-dir=/run --with-default-run-dir=/run/lvm --with-default-locking-dir=/run/lock/lvm --with-thin=internal --with-thin-check=/usr/sbin/thin_check --with-thin-dump=/usr/sbin/thin_dump --with-thin-repair=/usr/sbin/thin_repair --enable-applib --enable-blkid_wiping --enable-cmdlib --enable-cmirrord --enable-dmeventd --enable-dbus-service --enable-lvmetad --enable-lvmlockd-dlm --enable-lvmlockd-sanlock --enable-lvmpolld --enable-notify-dbus --enable-pkgconfig --enable-readline --enable-udev_rules --enable-udev_sync


# uname -a

Linux dataserv 4.14.0-3-amd64 #1 SMP Debian 4.14.13-1 (2018-01-14) x86_64 GNU/Linux


Warm regards,
Liwei

On 5 Feb 2018 15:27, "Eric Ren" <[email protected]> wrote:


    Hi,

    Your LVM version and kernel version please?

    like:
    """
    # lvm version
      LVM version:     2.02.177(2) (2017-12-18)
      Library version: 1.03.01 (2017-12-18)
      Driver version:  4.35.0

    # uname -a
    Linux sle15-c1-n1 4.12.14-9.1-default #1 SMP Fri Jan 19 09:13:51 UTC 2018 (849a2fe) x86_64 x86_64 x86_64 GNU/Linux
    """

    Eric

    On 02/03/2018 05:43 PM, Liwei wrote:

        Hi list,
             I had an LV that I was converting from linear to mirrored
        (not raid1) whose source device failed partway through the
        initial sync.

             I've since recovered the source device, but it seems like the
        mirror is still acting as if some blocks are not readable? I'm
        getting this in my logs, and the FS is full of errors:

        [  +1.613126] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.000278] device-mapper: raid1: Primary mirror (253:25) failed while out-of-sync: Reads may fail.
        [  +0.085916] device-mapper: raid1: Mirror read failed.
        [  +0.196562] device-mapper: raid1: Mirror read failed.
        [  +0.000237] Buffer I/O error on dev dm-27, logical block 5371800560, async page read
        [  +0.592135] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.082882] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.246945] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.107374] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.083344] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.114949] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.085056] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.203929] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.157953] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +3.065247] recovery_complete: 23 callbacks suppressed
        [  +0.000001] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.128064] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.103100] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.107827] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.140871] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.132844] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.124698] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.138502] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.117827] device-mapper: raid1: Unable to read primary mirror during recovery
        [  +0.125705] device-mapper: raid1: Unable to read primary mirror during recovery
        [Feb 3 17:09] device-mapper: raid1: Mirror read failed.
        [  +0.167553] device-mapper: raid1: Mirror read failed.
        [  +0.000268] Buffer I/O error on dev dm-27, logical block 5367765816, async page read
        [  +0.135138] device-mapper: raid1: Mirror read failed.
        [  +0.000238] Buffer I/O error on dev dm-27, logical block 5367765816, async page read
        [  +0.000365] device-mapper: raid1: Mirror read failed.
        [  +0.000315] device-mapper: raid1: Mirror read failed.
        [  +0.000213] Buffer I/O error on dev dm-27, logical block 5367896888, async page read
        [  +0.000276] device-mapper: raid1: Mirror read failed.
        [  +0.000199] Buffer I/O error on dev dm-27, logical block 5367765816, async page read
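
To see which blocks are affected and how often, the "Buffer I/O error"
entries in a log like the one above can be tallied per block. This is
only a minimal sketch in Python, not an LVM tool: the helper name
failed_blocks and the embedded LOG sample (a few entries copied from
the log above) are illustrative assumptions.

```python
import re
from collections import Counter

# A few entries copied from the log above. The first one is split across
# lines the way a mail client might wrap it.
LOG = """\
[  +0.000237] Buffer I/O error on dev dm-27, logical block
5371800560,
async page read
[Feb 3 17:09] device-mapper: raid1: Mirror read failed.
[  +0.000268] Buffer I/O error on dev dm-27, logical block 5367765816, async page read
[  +0.000238] Buffer I/O error on dev dm-27, logical block 5367765816, async page read
[  +0.000213] Buffer I/O error on dev dm-27, logical block 5367896888, async page read
[  +0.000199] Buffer I/O error on dev dm-27, logical block 5367765816, async page read
"""

# \s+ accepts a space or a line break between "logical block" and the number,
# so wrapped and unwrapped entries are both matched.
PATTERN = re.compile(r"Buffer I/O error on dev ([\w-]+), logical block\s+(\d+)")


def failed_blocks(log_text):
    """Count Buffer I/O errors per (device, logical block) pair."""
    return Counter((dev, int(block)) for dev, block in PATTERN.findall(log_text))


counts = failed_blocks(LOG)
for (dev, block), n in sorted(counts.items()):
    print(f"{dev} block {block}: {n} error(s)")
```

In practice the text would come from the kernel log rather than an
embedded string; the point is only that repeated errors on a small set
of blocks stand out once they are grouped.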

             However, if I take down the destination device and restart
        the LV with --activationmode partial, I can read my data and
        everything checks out.

             My theory (and what I observed) is that lvm continued the
        initial sync even after the source drive stopped responding, and
        has now mapped the blocks that it 'synced' as dead. How can I make
        lvm retry those blocks?

             In fact, I don't trust the mirror anymore. Is there a way I
        can conduct a scrub of the mirror after the initial sync is done?
        I read about --syncaction check, but it seems to only report the
        number of inconsistencies. Can I have lvm re-mirror the
        inconsistencies from the source to the destination device? I trust
        the source device because we ran a btrfs scrub on it and it
        reported that all checksums are valid.

             It took months for the mirror sync to get to this stage
        (actually, why does it take months to mirror 20TB?), and I don't
        want to start all over again.
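
As a rough sanity check on that sync time, the average rate implied by
the figures above can be worked out. This is a sketch under stated
assumptions: only the 20TB size comes from the message, while the
90-day duration is an assumed stand-in for "months".

```python
SIZE_BYTES = 20 * 10**12  # 20 TB (decimal), the size given in the message
ASSUMED_DAYS = 90         # assumption: "months" taken as roughly three months


def avg_throughput_mb_s(size_bytes, days):
    """Average rate in MB/s needed to copy size_bytes in the given number of days."""
    seconds = days * 86400
    return size_bytes / seconds / 10**6


rate = avg_throughput_mb_s(SIZE_BYTES, ASSUMED_DAYS)
print(f"{rate:.2f} MB/s")
```

Under that assumption the average works out to a few MB/s, far below
what a disk normally sustains for sequential copying, so the sync was
more likely throttled or repeatedly stalled than limited by raw device
speed.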

        Warm regards,
        Liwei

        _______________________________________________
        linux-lvm mailing list
        [email protected]
        https://www.redhat.com/mailman/listinfo/linux-lvm
        read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

