Nope. If I enable write-behind the corruption happens every time.


On 19/01/2018 08:26, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:

After further testing (I'm trying to convince myself of gluster's reliability :-) I've found that with

performance.write-behind off

the VM works without problems. Now I'll try with write-behind on and flush-behind on too.
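For reference, a sketch of how those toggles map to gluster CLI commands (assuming the volume name gvtest used later in this thread):

```shell
# disable the write-behind translator (the setting that avoids the corruption here)
gluster volume set gvtest performance.write-behind off

# for the next test: write-behind back on, together with flush-behind
gluster volume set gvtest performance.write-behind on
gluster volume set gvtest performance.flush-behind on
```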



On 18/01/2018 13:30, Krutika Dhananjay wrote:
Thanks for that input. Adding Niels since the issue is reproducible only with libgfapi.

-Krutika

On Thu, Jan 18, 2018 at 1:39 PM, Ing. Luca Lazzeroni - Trend Servizi Srl <[email protected]> wrote:

    Another update.

    I've set up a replica 3 volume without sharding and tried to
    install a VM on a qcow2 volume on that device; however, the result
    is the same and the VM image gets corrupted, at exactly the
    same point.

    Here's the volume info of the created volume:

    Volume Name: gvtest
    Type: Replicate
    Volume ID: e2ddf694-ba46-4bc7-bc9c-e30803374e9d
    Status: Started
    Snapshot Count: 0
    Number of Bricks: 1 x 3 = 3
    Transport-type: tcp
    Bricks:
    Brick1: gluster1:/bricks/brick1/gvtest
    Brick2: gluster2:/bricks/brick1/gvtest
    Brick3: gluster3:/bricks/brick1/gvtest
    Options Reconfigured:
    user.cifs: off
    features.shard: off
    cluster.shd-wait-qlength: 10000
    cluster.shd-max-threads: 8
    cluster.locking-scheme: granular
    cluster.data-self-heal-algorithm: full
    cluster.server-quorum-type: server
    cluster.quorum-type: auto
    cluster.eager-lock: enable
    network.remote-dio: enable
    performance.low-prio-threads: 32
    performance.io-cache: off
    performance.read-ahead: off
    performance.quick-read: off
    transport.address-family: inet
    nfs.disable: on
    performance.client-io-threads: off


    On 17/01/2018 14:51, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:

    Hi,

    after our IRC chat I've rebuilt a virtual machine with a FUSE-based
    virtual disk. Everything worked flawlessly.

    Now I'm sending you the output of the requested getfattr command
    on the disk image:

    # file: TestFUSE-vda.qcow2
    trusted.afr.dirty=0x000000000000000000000000
    trusted.gfid=0x40ffafbbe987445692bb31295fa40105
    trusted.gfid2path.dc9dde61f0b77eab=0x31326533323631662d373839332d346262302d383738632d3966623765306232336263652f54657374465553452d7664612e71636f7732
    trusted.glusterfs.shard.block-size=0x0000000004000000
    trusted.glusterfs.shard.file-size=0x00000000c15300000000000000000000000000000060be900000000000000000
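    Those shard xattrs are big-endian hex integers; a quick sketch of decoding the two values from the dump above in the shell (the reading of the first 8 bytes of shard.file-size as the logical file size is an assumption on my part):

    ```shell
    # shard block-size: 64-bit big-endian integer
    printf '%d\n' 0x0000000004000000    # 67108864 bytes = 64 MiB
    # first 8 bytes of trusted.glusterfs.shard.file-size = logical file size
    printf '%d\n' 0x00000000c1530000    # 3243442176 bytes, about 3.02 GiB
    ```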

    Hope this helps.



    On 17/01/2018 11:37, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:

    I actually use FUSE and it works. If I try to use the "libgfapi"
    direct interface to gluster in qemu-kvm, the problem appears.



    On 17/01/2018 11:35, Krutika Dhananjay wrote:
    Really? Then which protocol exactly do you see this issue
    with? libgfapi? NFS?

    -Krutika

    On Wed, Jan 17, 2018 at 3:59 PM, Ing. Luca Lazzeroni - Trend
    Servizi Srl <[email protected]> wrote:

        Of course. Here's the full log. Please note that in FUSE
        mode everything apparently works without problems. I've
        installed 4 VMs and updated them without problems.



        On 17/01/2018 11:00, Krutika Dhananjay wrote:


        On Tue, Jan 16, 2018 at 10:47 PM, Ing. Luca Lazzeroni -
        Trend Servizi Srl <[email protected]> wrote:

            I've run the test with the raw image format
            (preallocated too) and the corruption problem is
            still there (but without errors in the bricks' log files).

            What does the "link" error in the bricks' log files mean?

            I've looked through the source code for the lines where
            it happens, and it seems to be a warning (it doesn't
            imply a failure).


        Indeed, it only represents a transient state when the
        shards are created for the first time and does not
        indicate a failure.
        Could you also get the logs of the gluster fuse mount
        process? It should be under /var/log/glusterfs of your
        client machine with the filename as a hyphenated mount
        point path.

        For example, if your volume was mounted at
        /mnt/glusterfs, then your log file would be named
        mnt-glusterfs.log.
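        That naming rule can be sketched in one line (variable names here are hypothetical):

        ```shell
        mnt=/mnt/glusterfs
        # strip the leading slash, turn the remaining slashes into hyphens
        log="$(echo "${mnt#/}" | tr '/' '-').log"
        echo "$log"   # mnt-glusterfs.log
        ```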

        -Krutika



            On 16/01/2018 17:39, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:

            An update:

            I've tried, for my tests, to create the vm volume as

            qemu-img create -f qcow2 -o preallocation=full
            gluster://gluster1/Test/Test-vda.img 20G

            et voila !

            No errors at all, neither in the bricks' log files (the
            "link failed" message disappeared) nor in the VM
            (no corruption, and it installed successfully).

            I'll do another test with a fully preallocated raw
            image.
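            Summarizing the image-creation variants tried so far (a sketch; file names as in the messages above):

            ```shell
            # no preallocation (corruption reproduces)
            qemu-img create -f qcow2 Test-vda2.qcow2 20G
            # metadata preallocation (corruption appears later)
            qemu-img create -f qcow2 -o preallocation=metadata Test-vda2.qcow2 20G
            # full preallocation over libgfapi (no corruption observed)
            qemu-img create -f qcow2 -o preallocation=full gluster://gluster1/Test/Test-vda.img 20G
            ```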



            On 16/01/2018 16:31, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:

            I've just gone through all the steps to reproduce the problem.

            The VM volume was created via "qemu-img create
            -f qcow2 Test-vda2.qcow2 20G" on the gluster volume
            mounted via FUSE. I've also tried creating the
            volume with preallocated metadata, which only delays
            the onset of the problem. The volume is a
            replica 3 arbiter 1 volume hosted on XFS bricks.

            Here is the information:

            [root@ovh-ov1 bricks]# gluster volume info gv2a2

            Volume Name: gv2a2
            Type: Replicate
            Volume ID: 83c84774-2068-4bfc-b0b9-3e6b93705b9f
            Status: Started
            Snapshot Count: 0
            Number of Bricks: 1 x (2 + 1) = 3
            Transport-type: tcp
            Bricks:
            Brick1: gluster1:/bricks/brick2/gv2a2
            Brick2: gluster3:/bricks/brick3/gv2a2
            Brick3: gluster2:/bricks/arbiter_brick_gv2a2/gv2a2
            (arbiter)
            Options Reconfigured:
            storage.owner-gid: 107
            storage.owner-uid: 107
            user.cifs: off
            features.shard: on
            cluster.shd-wait-qlength: 10000
            cluster.shd-max-threads: 8
            cluster.locking-scheme: granular
            cluster.data-self-heal-algorithm: full
            cluster.server-quorum-type: server
            cluster.quorum-type: auto
            cluster.eager-lock: enable
            network.remote-dio: enable
            performance.low-prio-threads: 32
            performance.io-cache: off
            performance.read-ahead: off
            performance.quick-read: off
            transport.address-family: inet
            nfs.disable: off
            performance.client-io-threads: off

            /var/log/glusterfs/glusterd.log:

            [2018-01-15 14:17:50.196228] I [MSGID: 106488]
            [glusterd-handler.c:1548:__glusterd_handle_cli_get_volume]
            0-management: Received get vol req
            [2018-01-15 14:25:09.555214] I [MSGID: 106488]
            [glusterd-handler.c:1548:__glusterd_handle_cli_get_volume]
            0-management: Received get vol req

            (empty because today it's 2018-01-16)

            /var/log/glusterfs/glustershd.log:

            [2018-01-14 02:23:02.731245] I
            [glusterfsd-mgmt.c:1821:mgmt_getspec_cbk]
            0-glusterfs: No change in volfile,continuing

            (empty too)

            /var/log/glusterfs/bricks/brick-brick2-gv2a2.log
            (the interested volume):

            [2018-01-16 15:14:37.809965] I [MSGID: 115029]
            [server-handshake.c:793:server_setvolume]
            0-gv2a2-server: accepted client from
            ovh-ov1-10302-2018/01/16-15:14:37:790306-gv2a2-client-0-0-0
            (version: 3.12.4)
            [2018-01-16 15:16:41.471751] E [MSGID: 113020]
            [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting
            gfid on
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.4
            failed
            [2018-01-16 15:16:41.471745] W [MSGID: 113096]
            [posix-handle.c:770:posix_handle_hard]
            0-gv2a2-posix: link
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.4
            ->
            
/bricks/brick2/gv2a2/.glusterfs/a0/14/a0144df3-8d89-4aed-872e-5fef141e9e1efailed
            [File exists]
            [2018-01-16 15:16:42.593392] W [MSGID: 113096]
            [posix-handle.c:770:posix_handle_hard]
            0-gv2a2-posix: link
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.5
            ->
            
/bricks/brick2/gv2a2/.glusterfs/eb/04/eb044e6e-3a23-40a4-9ce1-f13af148eb67failed
            [File exists]
            [2018-01-16 15:16:42.593426] E [MSGID: 113020]
            [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting
            gfid on
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.5
            failed
            [2018-01-16 15:17:04.129593] W [MSGID: 113096]
            [posix-handle.c:770:posix_handle_hard]
            0-gv2a2-posix: link
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.8
            ->
            
/bricks/brick2/gv2a2/.glusterfs/dc/92/dc92bd0a-0d46-4826-a4c9-d073a924dd8dfailed
            [File exists]
            The message "W [MSGID: 113096]
            [posix-handle.c:770:posix_handle_hard]
            0-gv2a2-posix: link
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.8
            ->
            
/bricks/brick2/gv2a2/.glusterfs/dc/92/dc92bd0a-0d46-4826-a4c9-d073a924dd8dfailed
            [File exists]" repeated 5 times between [2018-01-16
            15:17:04.129593] and [2018-01-16 15:17:04.129593]
            [2018-01-16 15:17:04.129661] E [MSGID: 113020]
            [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting
            gfid on
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.8
            failed
            [2018-01-16 15:17:08.279162] W [MSGID: 113096]
            [posix-handle.c:770:posix_handle_hard]
            0-gv2a2-posix: link
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.9
            ->
            
/bricks/brick2/gv2a2/.glusterfs/c9/b7/c9b71b00-a09f-4df1-b874-041820ca8241failed
            [File exists]
            [2018-01-16 15:17:08.279162] W [MSGID: 113096]
            [posix-handle.c:770:posix_handle_hard]
            0-gv2a2-posix: link
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.9
            ->
            
/bricks/brick2/gv2a2/.glusterfs/c9/b7/c9b71b00-a09f-4df1-b874-041820ca8241failed
            [File exists]
            The message "W [MSGID: 113096]
            [posix-handle.c:770:posix_handle_hard]
            0-gv2a2-posix: link
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.9
            ->
            
/bricks/brick2/gv2a2/.glusterfs/c9/b7/c9b71b00-a09f-4df1-b874-041820ca8241failed
            [File exists]" repeated 2 times between [2018-01-16
            15:17:08.279162] and [2018-01-16 15:17:08.279162]

            [2018-01-16 15:17:08.279177] E [MSGID: 113020]
            [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting
            gfid on
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.9
            failed
            The message "W [MSGID: 113096]
            [posix-handle.c:770:posix_handle_hard]
            0-gv2a2-posix: link
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.4
            ->
            
/bricks/brick2/gv2a2/.glusterfs/a0/14/a0144df3-8d89-4aed-872e-5fef141e9e1efailed
            [File exists]" repeated 6 times between [2018-01-16
            15:16:41.471745] and [2018-01-16 15:16:41.471807]
            The message "W [MSGID: 113096]
            [posix-handle.c:770:posix_handle_hard]
            0-gv2a2-posix: link
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.5
            ->
            
/bricks/brick2/gv2a2/.glusterfs/eb/04/eb044e6e-3a23-40a4-9ce1-f13af148eb67failed
            [File exists]" repeated 2 times between [2018-01-16
            15:16:42.593392] and [2018-01-16 15:16:42.593430]
            [2018-01-16 15:17:32.229689] W [MSGID: 113096]
            [posix-handle.c:770:posix_handle_hard]
            0-gv2a2-posix: link
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.14
            ->
            
/bricks/brick2/gv2a2/.glusterfs/53/04/530449fa-d698-4928-a262-9a0234232323failed
            [File exists]
            [2018-01-16 15:17:32.229720] E [MSGID: 113020]
            [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting
            gfid on
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.14
            failed
            [2018-01-16 15:18:07.154330] W [MSGID: 113096]
            [posix-handle.c:770:posix_handle_hard]
            0-gv2a2-posix: link
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.17
            ->
            
/bricks/brick2/gv2a2/.glusterfs/81/96/8196dd19-84bc-4c3d-909f-8792e9b4929dfailed
            [File exists]
            [2018-01-16 15:18:07.154375] E [MSGID: 113020]
            [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting
            gfid on
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.17
            failed
            The message "W [MSGID: 113096]
            [posix-handle.c:770:posix_handle_hard]
            0-gv2a2-posix: link
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.14
            ->
            
/bricks/brick2/gv2a2/.glusterfs/53/04/530449fa-d698-4928-a262-9a0234232323failed
            [File exists]" repeated 7 times between [2018-01-16
            15:17:32.229689] and [2018-01-16 15:17:32.229806]
            The message "W [MSGID: 113096]
            [posix-handle.c:770:posix_handle_hard]
            0-gv2a2-posix: link
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.17
            ->
            
/bricks/brick2/gv2a2/.glusterfs/81/96/8196dd19-84bc-4c3d-909f-8792e9b4929dfailed
            [File exists]" repeated 3 times between [2018-01-16
            15:18:07.154330] and [2018-01-16 15:18:07.154357]
            [2018-01-16 15:19:23.618794] W [MSGID: 113096]
            [posix-handle.c:770:posix_handle_hard]
            0-gv2a2-posix: link
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.21
            ->
            
/bricks/brick2/gv2a2/.glusterfs/6d/02/6d02bd98-83de-43e8-a7af-b1d5f5160403failed
            [File exists]
            [2018-01-16 15:19:23.618827] E [MSGID: 113020]
            [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting
            gfid on
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.21
            failed
            The message "W [MSGID: 113096]
            [posix-handle.c:770:posix_handle_hard]
            0-gv2a2-posix: link
            /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.21
            ->
            
/bricks/brick2/gv2a2/.glusterfs/6d/02/6d02bd98-83de-43e8-a7af-b1d5f5160403failed
            [File exists]" repeated 3 times between [2018-01-16
            15:19:23.618794] and [2018-01-16 15:19:23.618794]

            Thank you,


            On 16/01/2018 11:40, Krutika Dhananjay wrote:
            Also to help isolate the component, could you
            answer these:

            1. on a different volume with shard not enabled,
            do you see this issue?
            2. on a plain 3-way replicated volume (no
            arbiter), do you see this issue?



            On Tue, Jan 16, 2018 at 4:03 PM, Krutika Dhananjay
            <[email protected]>
            wrote:

                Please share the volume-info output and the
                logs under /var/log/glusterfs/ from all your
                nodes for investigating the issue.
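                One way to gather that on each node (a sketch; the archive name is hypothetical):

                ```shell
                # capture the volume configuration and bundle the logs for sharing
                gluster volume info
                tar czf /tmp/gluster-logs-$(hostname).tar.gz /var/log/glusterfs/
                ```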

                -Krutika

                On Tue, Jan 16, 2018 at 1:30 PM, Ing. Luca
                Lazzeroni - Trend Servizi Srl <[email protected]>
                wrote:

                    Hi everyone.

                    I've got a strange problem with a gluster
                    setup: 3 nodes with CentOS 7.4, Gluster
                    3.12.4 from the CentOS/Gluster repositories,
                    and QEMU-KVM version 2.9.0 (compiled from RHEL
                    sources).

                    I'm running volumes in replica 3 arbiter 1
                    mode (but I've got a volume in "pure"
                    replica 3 mode too). I've applied the
                    "virt" group settings to my volumes since
                    they host VM images.

                    If I try to install something (e.g. Ubuntu
                    Server 16.04.3) on a VM (and so generate
                    a bit of I/O inside it) and configure KVM
                    to access the gluster volume directly (via
                    libvirt), the install fails after a while
                    because the disk content is corrupted. If
                    I inspect the blocks inside the disk (by
                    accessing the image directly from outside)
                    I can find many files filled with "^@" (NUL bytes).
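                    "^@" is how pagers and editors display NUL bytes; a quick sketch for counting them, demonstrated on a throwaway file rather than the real image:

                    ```shell
                    # tr deletes every byte except NUL; wc counts what is left
                    printf 'ab\0\0\0\0cd' > /tmp/nul-demo
                    tr -d -c '\0' < /tmp/nul-demo | wc -c   # prints 4
                    # against a suspect file: tr -d -c '\0' < some-file | wc -c
                    ```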


            Also, what exactly do you mean by accessing the
            image directly from outside? Was it from the brick
            directories directly? Was it from the mount point
            of the volume? Could you elaborate? Which files
            exactly did you check?

            -Krutika


                    If, instead, I configure KVM to access VM
                    images via a FUSE mount, everything seems
                    to work correctly.

                    Note that the install problem occurs 100% of
                    the time with QCOW2 images, while with RAW
                    disk images it appears only later.

                    Has anyone experienced the same
                    problem?

                    Thank you,


                    --
                    Ing. Luca Lazzeroni
                    Responsabile Ricerca e Sviluppo (Head of Research and Development)
                    Trend Servizi Srl
                    Tel: 0376/631761
                    Web: https://www.trendservizi.it

                    _______________________________________________
                    Gluster-users mailing list
                    [email protected]
                    http://lists.gluster.org/mailman/listinfo/gluster-users

















