Hi,
the log does not show anything like that. Also, I'm using ext4 on the
bricks.
The log only contains entries like these:
[Fri Nov 25 14:23:27 2016] INFO: task gpu_graphene_bv:4476 blocked for more than 120 seconds.
[Fri Nov 25 14:23:27 2016] Tainted: P OE 3.19.0-25-generic #26~14.04.1-Ubuntu
[Fri Nov 25 14:23:27 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Nov 25 14:23:27 2016] gpu_graphene_bv D ffff8804aa39be08 0 4476 4461 0x00000000
[Fri Nov 25 14:23:27 2016] ffff8804aa39be08 ffff8804ad0febf0 0000000000013e80 ffff8804aa39bfd8
[Fri Nov 25 14:23:27 2016] 0000000000013e80 ffff8804ad403110 ffff8804ad0febf0 ffff8804aa39be18
[Fri Nov 25 14:23:27 2016] ffff8804aa2c87d0 ffff88049df2e000 ffff8804aa39be30 ffff8804aa2c88a0
[Fri Nov 25 14:23:27 2016] Call Trace:
[Fri Nov 25 14:23:27 2016] [<ffffffff817b22e9>] schedule+0x29/0x70
[Fri Nov 25 14:23:27 2016] [<ffffffff812dc06d>] __fuse_request_send+0x11d/0x290
[Fri Nov 25 14:23:27 2016] [<ffffffff810b4e10>] ? prepare_to_wait_event+0x110/0x110
[Fri Nov 25 14:23:27 2016] [<ffffffff812dc1f2>] fuse_request_send+0x12/0x20
[Fri Nov 25 14:23:27 2016] [<ffffffff812e576d>] fuse_flush+0x12d/0x180
[Fri Nov 25 14:23:27 2016] [<ffffffff811e9973>] filp_close+0x33/0x80
[Fri Nov 25 14:23:27 2016] [<ffffffff8120a152>] __close_fd+0x82/0xa0
[Fri Nov 25 14:23:27 2016] [<ffffffff811e99e3>] SyS_close+0x23/0x50
[Fri Nov 25 14:23:27 2016] [<ffffffff817b668d>] system_call_fastpath+0x16/0x1b
I guess this is due to the file system not responding.
Since I switched the mounts from FUSE to NFS, occasionally I also see:
[Wed Dec 14 23:42:47 2016] nfs: server giant2 not responding, still trying
[Wed Dec 14 23:43:12 2016] nfs: server giant2 not responding, still trying
[Wed Dec 14 23:45:04 2016] nfs: server giant2 OK
[Wed Dec 14 23:45:04 2016] nfs: server giant2 OK
In another post you asked for logfiles with TRACE log level; I'll provide
them shortly.
Best regards and thanks,
Micha
On 19.12.2016 at 16:09, Mohammed Rafi K C wrote:
Hi Micha,
Can you please also check if there are any error messages in dmesg?
Basically I'm trying to see whether you're hitting the issues described in
https://bugzilla.kernel.org/show_bug.cgi?id=73831 .
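For example, something along these lines should surface them (just a
rough sketch; the exact patterns may differ on your kernel):

# look for hung-task and I/O related messages in the kernel log
dmesg -T | grep -iE 'blocked for more than|hung_task|i/o error'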
Regards
Rafi KC
On 12/19/2016 11:58 AM, Mohammed Rafi K C wrote:
Hi Micha,
Sorry for the late reply. I was busy with some other things.
If you still have the setup available, can you enable the TRACE log level
[1], [2] and see if you can find any log entries when the network starts
disconnecting? Basically I'm trying to find out whether any disconnect
occurred other than the ping timer expiry issue.
[1] : gluster volume set <volname> diagnostics.brick-log-level TRACE
[2] : gluster volume set <volname> diagnostics.client-log-level TRACE
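The entries should land in the usual log locations (assuming default
paths; the brick log name is derived from the brick path, the client log
name from the mount point):

# on the servers:
tail -f /var/log/glusterfs/bricks/<brick-path>.log
# on the clients (fuse mount):
tail -f /var/log/glusterfs/<mount-point>.log

TRACE is very verbose, so remember to reset it afterwards:

gluster volume reset <volname> diagnostics.brick-log-level
gluster volume reset <volname> diagnostics.client-log-level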
Regards
Rafi KC
On 12/08/2016 07:59 PM, Atin Mukherjee wrote:
On Thu, Dec 8, 2016 at 4:37 PM, Micha Ober <[email protected]> wrote:
Hi Rafi,
thank you for your support. It is greatly appreciated.
Just some more thoughts from my side:
There have been no reports from other users in *this* thread
until now, but I have found at least one user with a very similar
problem in an older thread:
https://www.gluster.org/pipermail/gluster-users/2014-November/019637.html
He is also reporting disconnects with no apparent reason,
although his setup is a bit more complicated, also involving a
firewall. In our setup, all servers/clients are connected via 1
GbE with no firewall or anything else that might block or throttle
traffic. Also, we are using exactly the same software versions
on all nodes.
I can also find some reports in the bug tracker when searching
for "rpc_client_ping_timer_expired" and
"rpc_clnt_ping_timer_expired" (it looks like the spelling changed
between versions).
https://bugzilla.redhat.com/show_bug.cgi?id=1096729
Just FYI, this is a different issue; there, GlusterD fails to handle
the volume of incoming requests in time because MT-epoll is not
enabled.
https://bugzilla.redhat.com/show_bug.cgi?id=1370683
But both reports involve heavy traffic/load on the bricks/disks,
which is not the case for our setup.
To give a ballpark figure: over three days, 30 GiB were written,
and the data was not written all at once, but continuously over the
whole period.
Just to be sure, I have checked the logfiles of one of the other
clusters right now, which is sitting in the same building, in
the same rack, even on the same switch, running the same jobs,
but with glusterfs 3.4.2, and I can see no disconnects in the
logfiles. So I can definitely rule out our infrastructure as the
problem.
Regards,
Micha
On 07.12.2016 at 18:08, Mohammed Rafi K C wrote:
Hi Micha,
This is great. I will provide you a debug build which has two
fixes that I suspect as possible causes of the frequent disconnect
issue, though I don't have much data to validate my theory. So I
will take one more day to dig into that.
Thanks for your support, and opensource++
Regards
Rafi KC
On 12/07/2016 05:02 AM, Micha Ober wrote:
Hi,
thank you for your answer and even more for the question!
Until now, I was using FUSE. Today I changed all mounts to NFS
using the same 3.7.17 version.
But the problem is still the same. Now, the NFS logfile
contains lines like these:
[2016-12-06 15:12:29.006325] C [rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired] 0-gv0-client-7: server X.X.18.62:49153 has not responded in the last 42 seconds, disconnecting.
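(As an aside: the 42 seconds should correspond to the
network.ping-timeout volume option, which defaults to 42. If I read the
docs correctly, it could be raised for testing with something like

gluster volume set gv0 network.ping-timeout 60

but that would only hide the symptom, of course.)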
Interestingly enough, the IP address X.X.18.62 is the same
machine! As I wrote earlier, each node serves both as a server
and a client, as each node contributes bricks to the volume.
Every server is connecting to itself via its hostname. For
example, the fstab on the node "giant2" looks like:
#giant2:/gv0  /shared_data   glusterfs  defaults,noauto          0 0
#giant2:/gv2  /shared_slurm  glusterfs  defaults,noauto          0 0
giant2:/gv0   /shared_data   nfs        defaults,_netdev,vers=3  0 0
giant2:/gv2   /shared_slurm  nfs        defaults,_netdev,vers=3  0 0
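For completeness: the built-in Gluster NFS service on each node can be
checked with something like the following (gv0/giant2 as above):

gluster volume status gv0 nfs
showmount -e giant2
rpcinfo -p giant2 | grep nfs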
So I understand the disconnects even less.
I don't know if it's possible to create a dummy cluster which
exhibits the same behaviour, because the disconnects only
happen when there are compute jobs running on those nodes -
and they are GPU compute jobs, so that's something which
cannot easily be emulated in a VM.
As we have more clusters (which are running fine with an
ancient 3.4 version :-)) and we are currently not dependent on
this particular cluster (which may stay like this for this
month, I think), I should be able to deploy the debug build on
the "real" cluster, if you can provide one.
Regards and thanks,
Micha
On 06.12.2016 at 08:15, Mohammed Rafi K C wrote:
On 12/03/2016 12:56 AM, Micha Ober wrote:
** Update: ** I have downgraded from 3.8.6 to 3.7.17 now,
but the problem still exists.
Client log: http://paste.ubuntu.com/23569065/
Brick log: http://paste.ubuntu.com/23569067/
Please note that each server has two bricks.
According to the logs, one brick loses the
connection to all other hosts:
[2016-12-02 18:38:53.703301] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.219:49121 failed (Broken pipe)
[2016-12-02 18:38:53.703381] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.62:49118 failed (Broken pipe)
[2016-12-02 18:38:53.703380] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.107:49121 failed (Broken pipe)
[2016-12-02 18:38:53.703424] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.206:49120 failed (Broken pipe)
[2016-12-02 18:38:53.703359] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.58:49121 failed (Broken pipe)
The SECOND brick on the SAME host is NOT affected, i.e. no disconnects!
As I said, the network connection is fine and the disks are idle.
The CPU always has 2 free cores.
It looks like I have to downgrade to 3.4 now in order for the disconnects
to stop.
Hi Micha,
Thanks for the update, and sorry for the trouble with the
newer Gluster versions. I can understand the need to
downgrade, as it is a production setup.
Can you tell me which client is used here? Is it
FUSE, NFS, NFS-Ganesha, SMB, or libgfapi?
Since I'm not able to reproduce the issue (I have been trying
for the last 3 days) and the logs are not very helpful here (we
don't have many logs in the socket layer), could you please
create a dummy cluster and try to reproduce the issue? Then
we could play with that volume, and I could provide a
debug build which we can use for further debugging.
If you don't have bandwidth for this, please leave it ;).
Regards
Rafi KC
- Micha
On 30.11.2016 at 06:57, Mohammed Rafi K C wrote:
Hi Micha,
I have changed the thread and subject so that your original
thread remains the same for your query. Let's try to fix the
problem you observed with 3.8.4, so I have started a
new thread to discuss the frequent disconnect problem.
*If anyone else has experienced the same problem, please
respond to the mail.*
It would be very helpful if you could give us some more
logs from clients and bricks. Also, any reproducible steps
will surely help to chase the problem further.
Regards
Rafi KC
On 11/30/2016 04:44 AM, Micha Ober wrote:
I had opened another thread on this mailing list (Subject:
"After upgrade from 3.4.2 to 3.8.5 - High CPU usage
resulting in disconnects and split-brain").
The title may be a bit misleading now, as I am no longer
observing high CPU usage after upgrading to 3.8.6, but the
disconnects are still happening and the number of files in
split-brain is growing.
Setup: 6 compute nodes, each serving as a glusterfs server
and client, Ubuntu 14.04, two bricks per node,
distribute-replicate
I have two gluster volumes set up (one for scratch data,
one for the slurm scheduler). Only the scratch data volume
shows critical errors "[...] has not responded in the last
42 seconds, disconnecting.". So I can rule out network
problems, the gigabit link between the nodes is not
saturated at all. The disks are almost idle (<10%).
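For reference, this is roughly how I track both symptoms (assuming
default log locations; gv0 and the mount points are shown below):

# count ping-timeout disconnects in the client log for /shared_data
grep -c 'has not responded in the last' /var/log/glusterfs/shared_data.log
# list files currently in split-brain
gluster volume heal gv0 info split-brain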
I have glusterfs 3.4.2 on Ubuntu 12.04 on another
compute cluster, running fine since it was deployed.
I had glusterfs 3.4.2 on Ubuntu 14.04 on this cluster,
running fine for almost a year.
After upgrading to 3.8.5, the problems (as described)
started. I would like to use some of the new features of
the newer versions (like bitrot), but the users can't run
their compute jobs right now because the result files are
garbled.
There also seems to be a bug report with a similar
problem (but no progress):
https://bugzilla.redhat.com/show_bug.cgi?id=1370683
For me, ALL servers are affected (not isolated to one or
two servers).
I also see messages like "INFO: task gpu_graphene_bv:4476
blocked for more than 120 seconds." in the syslog.
For completeness (gv0 is the scratch volume, gv2 the slurm
volume):
[root@giant2: ~]# gluster v info
Volume Name: gv0
Type: Distributed-Replicate
Volume ID: 993ec7c9-e4bc-44d0-b7c4-2d977e622e86
Status: Started
Snapshot Count: 0
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: giant1:/gluster/sdc/gv0
Brick2: giant2:/gluster/sdc/gv0
Brick3: giant3:/gluster/sdc/gv0
Brick4: giant4:/gluster/sdc/gv0
Brick5: giant5:/gluster/sdc/gv0
Brick6: giant6:/gluster/sdc/gv0
Brick7: giant1:/gluster/sdd/gv0
Brick8: giant2:/gluster/sdd/gv0
Brick9: giant3:/gluster/sdd/gv0
Brick10: giant4:/gluster/sdd/gv0
Brick11: giant5:/gluster/sdd/gv0
Brick12: giant6:/gluster/sdd/gv0
Options Reconfigured:
auth.allow: X.X.X.*,127.0.0.1
nfs.disable: on
Volume Name: gv2
Type: Replicate
Volume ID: 30c78928-5f2c-4671-becc-8deaee1a7a8d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: giant1:/gluster/sdd/gv2
Brick2: giant2:/gluster/sdd/gv2
Options Reconfigured:
auth.allow: X.X.X.*,127.0.0.1
cluster.granular-entry-heal: on
cluster.locking-scheme: granular
nfs.disable: on
2016-11-29 18:53 GMT+01:00 Atin Mukherjee
<[email protected]>:
Would you be able to share what is not working
for you in 3.8.x (mention the exact version)?
3.4 is quite old, and falling back to an
unsupported version doesn't look like a feasible
option.
On Tue, 29 Nov 2016 at 17:01, Micha Ober
<[email protected]> wrote:
Hi,
I was using gluster 3.4 and upgraded to
3.8, but that version proved to be
unusable for me. I now need to downgrade.
I'm running Ubuntu 14.04. As upgrades of
the op version are irreversible, I guess I
have to delete all gluster volumes and
re-create them with the downgraded version.
0. Backup data
1. Unmount all gluster volumes
2. apt-get purge glusterfs-server glusterfs-client
3. Remove PPA for 3.8
4. Add PPA for older version
5. apt-get install glusterfs-server glusterfs-client
6. Create volumes
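In terms of concrete commands, roughly this (just a sketch; the PPA
names are from memory and need to be checked against the version I
settle on):

umount /shared_data /shared_slurm
apt-get purge glusterfs-server glusterfs-client glusterfs-common
add-apt-repository --remove ppa:gluster/glusterfs-3.8
add-apt-repository ppa:gluster/glusterfs-<older-version>
apt-get update
apt-get install glusterfs-server glusterfs-client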
Is "purge" enough to delete all
configuration files of the currently
installed version or do I need to
manually clear some residues before
installing an older version?
Thanks.
--
~ Atin (atinm)
--
~ Atin (atinm)
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users