Roman,
The file went into split-brain. I think we should do these tests
with 3.5.2, where monitoring the heals is easier. Let me also put
together a document on how to do the testing you are trying to do.
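(Reading the changelog xattrs from your mail below, as a rough sketch: in
the usual AFR layout the first 8 hex digits of each trusted.afr.* value
count pending data operations. Assuming the first block is from stor1 and
the second from stor2, stor1 blames the other brick with
client-1=0x00000132... (306 pending data operations) while stor2 blames
back with client-0=0x00000004... (4 pending). Each brick accusing the
other is exactly the split-brain signature. On 3.5.2 the same state
should be visible directly with:
gluster volume heal HA-fast-150G-PVE1 info split-brain
taking the volume name from your logs.)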
Humble/Niels,
Do we have debs available for 3.5.2? In 3.5.1 there was a packaging
issue where /usr/bin/glfsheal was not shipped along with the deb. I
think that should be fixed now as well?
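A quick way to verify the packaging, just a sketch assuming you have the
built .deb at hand (the exact filename below is a guess):
dpkg -c glusterfs-server_3.5.2-1_amd64.deb | grep glfsheal
should list /usr/bin/glfsheal if the fix made it in.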
Pranith
On 08/06/2014 11:52 AM, Roman wrote:
Good morning,
root@stor1:~# getfattr -d -m. -e hex
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
getfattr: Removing leading '/' from absolute path names
# file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
trusted.afr.HA-fast-150G-PVE1-client-1=0x000001320000000000000000
trusted.gfid=0x23c79523075a4158bea38078da570449
getfattr: Removing leading '/' from absolute path names
# file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000040000000000000000
trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
trusted.gfid=0x23c79523075a4158bea38078da570449
2014-08-06 9:20 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
On 08/06/2014 11:30 AM, Roman wrote:
Also, this time files are not the same!
root@stor1:~# md5sum
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
32411360c53116b96a059f17306caeda
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
root@stor2:~# md5sum
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
65b8a6031bcb6f5fb3a11cb1e8b1c9c9
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
What is the getfattr output?
Pranith
2014-08-05 16:33 GMT+03:00 Roman <[email protected]>:
Nope, it is not working. But this time it went a bit differently.
root@gluster-client:~# dmesg
Segmentation fault
I was not even able to start the VM after I was done with the tests:
Could not read qcow2 header: Operation not permitted
And it seems it never starts to sync the files after the first
disconnect. The VM survives the first disconnect, but not the second (I
waited around 30 minutes). Also, I've got network.ping-timeout: 2 in
the volume settings, but the logs reacted to the first disconnect only
after around 30 seconds; the second was faster, 2 seconds.
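For reference, the setting can be checked with something like this
(volume name as in my logs), and it does report network.ping-timeout: 2:
gluster volume info HA-fast-150G-PVE1 | grep ping-timeout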
The reaction was also different.
The slower one:
[2014-08-05 13:26:19.558435] W [socket.c:514:__socket_rwv]
0-glusterfs: readv failed (Connection timed out)
[2014-08-05 13:26:19.558485] W
[socket.c:1962:__socket_proto_state_machine] 0-glusterfs:
reading from socket failed. Error (Connection timed out),
peer (10.250.0.1:24007)
[2014-08-05 13:26:21.281426] W [socket.c:514:__socket_rwv]
0-HA-fast-150G-PVE1-client-0: readv failed (Connection timed out)
[2014-08-05 13:26:21.281474] W
[socket.c:1962:__socket_proto_state_machine]
0-HA-fast-150G-PVE1-client-0: reading from socket failed.
Error (Connection timed out), peer (10.250.0.1:49153)
[2014-08-05 13:26:21.281507] I
[client.c:2098:client_rpc_notify]
0-HA-fast-150G-PVE1-client-0: disconnected
The fast one:
[2014-08-05 12:52:44.607389] C
[client-handshake.c:127:rpc_client_ping_timer_expired]
0-HA-fast-150G-PVE1-client-1: server 10.250.0.2:49153
has not responded in the last 2 seconds, disconnecting.
[2014-08-05 12:52:44.607491] W [socket.c:514:__socket_rwv]
0-HA-fast-150G-PVE1-client-1: readv failed (No data available)
[2014-08-05 12:52:44.607585] E
[rpc-clnt.c:368:saved_frames_unwind]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)
[0x7fcb1b4b0558]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)
[0x7fcb1b4aea63]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x7fcb1b4ae97e]))) 0-HA-fast-150G-PVE1-client-1: forced
unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at
2014-08-05 12:52:42.463881 (xid=0x381883x)
[2014-08-05 12:52:44.607604] W
[client-rpc-fops.c:2624:client3_3_lookup_cbk]
0-HA-fast-150G-PVE1-client-1: remote operation failed:
Transport endpoint is not connected. Path: /
(00000000-0000-0000-0000-000000000001)
[2014-08-05 12:52:44.607736] E
[rpc-clnt.c:368:saved_frames_unwind]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)
[0x7fcb1b4b0558]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)
[0x7fcb1b4aea63]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x7fcb1b4ae97e]))) 0-HA-fast-150G-PVE1-client-1: forced
unwinding frame type(GlusterFS Handshake) op(PING(3)) called
at 2014-08-05 12:52:42.463891 (xid=0x381884x)
[2014-08-05 12:52:44.607753] W
[client-handshake.c:276:client_ping_cbk]
0-HA-fast-150G-PVE1-client-1: timer must have expired
[2014-08-05 12:52:44.607776] I
[client.c:2098:client_rpc_notify]
0-HA-fast-150G-PVE1-client-1: disconnected
I've got SSD disks (just for info).
Should I go and give 3.5.2 a try?
2014-08-05 13:06 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
Please reply along with gluster-users :-). Maybe you are
hitting 'reply' instead of 'reply all'?
Pranith
On 08/05/2014 03:35 PM, Roman wrote:
To make sure and start clean, I've created another VM with raw
format and am going to repeat those steps. So now I've got
two VMs: one with qcow2 format and the other with raw
format. I will send another e-mail shortly.
2014-08-05 13:01 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
On 08/05/2014 03:07 PM, Roman wrote:
Really, it seems like the same file:
stor1:
a951641c5230472929836f9fcede6b04
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
stor2:
a951641c5230472929836f9fcede6b04
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
One thing I've seen from the logs: it looks like Proxmox
VE is connecting to the servers with the wrong version?
[2014-08-05 09:23:45.218550] I
[client-handshake.c:1659:select_server_supported_programs]
0-HA-fast-150G-PVE1-client-0: Using Program
GlusterFS 3.3, Num (1298437), Version (330)
It is the RPC (over-the-network data structures)
version, which has not changed at all since 3.3, so
that is not a problem. So what is the conclusion? Is
your test case working now or not?
Pranith
But if I issue:
root@pve1:~# glusterfs -V
glusterfs 3.4.4 built on Jun 28 2014 03:44:57
it seems OK. The servers use 3.4.4 meanwhile:
[2014-08-05 09:23:45.117875] I
[server-handshake.c:567:server_setvolume]
0-HA-fast-150G-PVE1-server: accepted client from
stor1-9004-2014/08/05-09:23:45:93538-HA-fast-150G-PVE1-client-1-0
(version: 3.4.4)
[2014-08-05 09:23:49.103035] I
[server-handshake.c:567:server_setvolume]
0-HA-fast-150G-PVE1-server: accepted client from
stor1-8998-2014/08/05-09:23:45:89883-HA-fast-150G-PVE1-client-0-0
(version: 3.4.4)
If this could be the reason, of course. I did restart the
Proxmox VE yesterday (just for information).
2014-08-05 12:30 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
On 08/05/2014 02:33 PM, Roman wrote:
I've waited long enough for now; still different
sizes and no logs about healing :(
stor1
# file:
exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
root@stor1:~# du -sh
/exports/fast-test/150G/images/127/
1.2G /exports/fast-test/150G/images/127/
stor2
# file:
exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
root@stor2:~# du -sh
/exports/fast-test/150G/images/127/
1.4G /exports/fast-test/150G/images/127/
According to the changelogs, the file doesn't
need any healing. Could you stop the operations
on the VMs and take md5sum on both these machines?
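For example, the same command you used earlier, run on both bricks once
the VM is idle:
md5sum /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2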
Pranith
2014-08-05 11:49 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
On 08/05/2014 02:06 PM, Roman wrote:
Well, it seems like it doesn't see that changes were
made to the volume? I created two files, 200 MB and
100 MB (from /dev/zero), after I disconnected the first
brick. Then I connected it back and got these logs:
[2014-08-05 08:30:37.830150] I
[glusterfsd-mgmt.c:1584:mgmt_getspec_cbk]
0-glusterfs: No change in volfile, continuing
[2014-08-05 08:30:37.830207] I
[rpc-clnt.c:1676:rpc_clnt_reconfig]
0-HA-fast-150G-PVE1-client-0: changing
port to 49153 (from 0)
[2014-08-05 08:30:37.830239] W
[socket.c:514:__socket_rwv]
0-HA-fast-150G-PVE1-client-0: readv
failed (No data available)
[2014-08-05 08:30:37.831024] I
[client-handshake.c:1659:select_server_supported_programs]
0-HA-fast-150G-PVE1-client-0: Using
Program GlusterFS 3.3, Num (1298437),
Version (330)
[2014-08-05 08:30:37.831375] I
[client-handshake.c:1456:client_setvolume_cbk]
0-HA-fast-150G-PVE1-client-0: Connected
to 10.250.0.1:49153, attached to
remote volume '/exports/fast-test/150G'.
[2014-08-05 08:30:37.831394] I
[client-handshake.c:1468:client_setvolume_cbk]
0-HA-fast-150G-PVE1-client-0: Server and
Client lk-version numbers are not same,
reopening the fds
[2014-08-05 08:30:37.831566] I
[client-handshake.c:450:client_set_lk_version_cbk]
0-HA-fast-150G-PVE1-client-0: Server lk
version = 1
[2014-08-05 08:30:37.830150] I
[glusterfsd-mgmt.c:1584:mgmt_getspec_cbk]
0-glusterfs: No change in volfile, continuing
This line seems weird to me, tbh.
I do not see any traffic on the switch interfaces
between the gluster servers, which means there is no
syncing between them. I tried to ls -l the files on the
client and servers to trigger the healing, but
seemingly with no success. Should I wait more?
Yes, it should take around 10-15 minutes.
Could you provide 'getfattr -d -m. -e hex
<file-on-brick>' on both the bricks.
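For example, on both stor1 and stor2, taking the brick
path from your earlier mails:
getfattr -d -m. -e hex /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2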
Pranith
2014-08-05 11:25 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
On 08/05/2014 01:10 PM, Roman wrote:
Aha! For some reason I was not able to start the VM
anymore; Proxmox VE told me that it was not able to
read the qcow2 header because permission was denied
for some reason. So I just deleted that file and
created a new VM. And the next message I've got was
this:
Seems like these are the messages from when you took
down the bricks before the self-heal finished. Could
you restart the run, waiting for self-heals to
complete before taking down the next brick?
Pranith
[2014-08-05 07:31:25.663412] E
[afr-self-heal-common.c:197:afr_sh_print_split_brain_log]
0-HA-fast-150G-PVE1-replicate-0:
Unable to self-heal contents of
'/images/124/vm-124-disk-1.qcow2'
(possible split-brain). Please
delete the file from all but the
preferred subvolume.- Pending
matrix: [ [ 0 60 ] [ 11 0 ] ]
[2014-08-05 07:31:25.663955] E
[afr-self-heal-common.c:2262:afr_self_heal_completion_cbk]
0-HA-fast-150G-PVE1-replicate-0:
background data self-heal failed on
/images/124/vm-124-disk-1.qcow2
2014-08-05 10:13 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
I just responded to your earlier mail about what the
log looks like. The log appears in the mount's logfile.
Pranith
On 08/05/2014 12:41 PM, Roman wrote:
OK, so I've waited enough, I think. There was no
traffic at all on the switch ports between the
servers. I could not find any suitable log message
about a completed self-heal (I waited about 30
minutes). This time I pulled out the other server's
UTP cable and got into the same situation:
root@gluster-test1:~# cat /var/log/dmesg
-bash: /bin/cat: Input/output error
brick logs:
[2014-08-05 07:09:03.005474] I
[server.c:762:server_rpc_notify]
0-HA-fast-150G-PVE1-server:
disconnecting connectionfrom
pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
[2014-08-05 07:09:03.005530] I
[server-helpers.c:729:server_connection_put]
0-HA-fast-150G-PVE1-server:
Shutting down connection
pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
[2014-08-05 07:09:03.005560] I
[server-helpers.c:463:do_fd_cleanup]
0-HA-fast-150G-PVE1-server: fd
cleanup on
/images/124/vm-124-disk-1.qcow2
[2014-08-05 07:09:03.005797] I
[server-helpers.c:617:server_connection_destroy]
0-HA-fast-150G-PVE1-server:
destroyed connection of
pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
2014-08-05 9:53 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
Do you think it is possible for you to do these tests
on the latest version, 3.5.2? 'gluster volume heal
<volname> info' would give you that information in
versions > 3.5.1. Otherwise you will have to check it
either from the logs (there will be a self-heal
completed message in the mount logs) or by observing
'getfattr -d -m. -e hex <image-file-on-bricks>'.
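For the log route, something along these lines on the
client should find it (the log filename is derived from
the mount point, and the exact message wording differs a
bit between versions, so treat this as a sketch):
grep -i 'self-heal completed' /var/log/glusterfs/<mount-point>.log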
Pranith
On 08/05/2014 12:09 PM, Roman wrote:
OK, I understand. I will try this shortly.
How can I be sure that the healing process is done,
if I am not able to see its status?
2014-08-05 9:30 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
Mounts will do the healing, not the self-heal-daemon.
The point, I feel, is that whichever process does the
healing must have the latest information about the
good bricks in this use case. Since for the VM use
case the mounts should have the latest information,
we should let the mounts do the healing. If the mount
accesses the VM image, either through someone doing
operations inside the VM or through an explicit stat
on the file, it should do the healing.
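So to kick it off by hand, a stat on the image from the
client mount should be enough. Just a sketch; the mount
path depends on how Proxmox mounted the volume:
stat /mnt/pve/<storage-id>/images/124/vm-124-disk-1.qcow2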
Pranith.
On 08/05/2014 10:39 AM, Roman wrote:
Hmmm, you told me to turn it off. Did I misunderstand
something? After I issued the command you sent me, I
was not able to watch the healing process; it said the
file won't be healed, because healing is turned off.
2014-08-05 5:39 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
You didn't mention anything about self-healing. Did
you wait until the self-heal was complete?
Pranith
On 08/04/2014 05:49 PM, Roman wrote:
Hi!
The result is pretty much the same. I set the switch
port down for the 1st server; it was OK. Then I set it
back up and set the other server's port off, and that
triggered an IO error on two virtual machines: one
with a local root FS but network-mounted storage, and
the other with a network root FS. The 1st gave an
error on copying to or from the mounted network disk;
the other gave me an error even for reading log files:
cat: /var/log/alternatives.log: Input/output error
Then I reset the KVM VM and it told me there is no
boot device. Next I virtually powered it off and then
back on, and it booted.
By the way, did I have to start/stop the volume?
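(I mean 'gluster volume stop HA-fast-150G-PVE1' and then
'gluster volume start HA-fast-150G-PVE1', with the volume
name as in my logs.)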
>> Could you do the following and test it again?
>> gluster volume set <volname> cluster.self-heal-daemon off
>> Pranith
2014-08-04 14:10 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
On 08/04/2014 03:33 PM, Roman wrote:
Hello!
I'm facing the same problem as mentioned here:
http://supercolony.gluster.org/pipermail/gluster-users/2014-April/039959.html
My setup is up and running, so I'm ready to help you
back with feedback.
Setup: a Proxmox server as the client, and 2 physical
gluster servers. Both the server side and the client
side are currently running glusterfs 3.4.4 from the
gluster repo.
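For completeness, the volume is a 2-brick replica; it was
created with something like this (brick paths as they
appear in my logs; the exact command may have differed):
gluster volume create HA-fast-150G-PVE1 replica 2 stor1:/exports/fast-test/150G stor2:/exports/fast-test/150G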
The problem is:
1. Created replica bricks.
2. Mounted them in Proxmox (tried both Proxmox ways: via
the GUI and via fstab with a backup volume line; see the
sketch after this list). Btw, while mounting via fstab
I'm unable to launch a VM without cache, even though
direct-io-mode is enabled in the fstab line.
3. Installed a VM.
4. Brought one volume down - OK.
5. Brought it back up and waited for the sync to finish.
6. Brought the other volume down - I get IO errors on
the VM guest and am not able to restore the VM after I
reset it via the host. It says (no bootable media).
After I shut it down (forced) and bring it back up, it
boots.
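The fstab line I mean looks roughly like this (a sketch,
with server names as in my setup, not my exact line):
stor1:/HA-fast-150G-PVE1 /mnt/pve/gluster glusterfs defaults,backupvolfile-server=stor2,direct-io-mode=enable 0 0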
Could you do the following and test it again?
gluster volume set <volname> cluster.self-heal-daemon off
Pranith
Need help. Tried 3.4.3 and 3.4.4. Still missing
packages for 3.4.5 and 3.5.2 for Debian (3.5.1 always
gives a healing error for some reason).
--
Best regards,
Roman.
_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users