On 09/04/2017 07:35 PM, Atin Mukherjee wrote:
Ravi/Karthick,

If one of the self-heal daemons is down, will the statistics heal-count command work?

No, it doesn't seem to: the glusterd stage-op phase fails because shd is down on that node, and we error out. FWIW, the error message "Gathering crawl statistics on volume GROUP-WORK has been unsuccessful on bricks that are down. Please check if all brick processes are running." is misleading, and once https://review.gluster.org/#/c/15724/ gets merged, you will get the correct error message, like so:
[root@vm2 glusterfs]# gluster v heal testvol statistics
Gathering crawl statistics on volume testvol has been unsuccessful:
Staging failed on vm1. Error: Self-heal daemon is not running. Check self-heal daemon log file.
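
For completeness, a quick way to chase what that message points at is to ask gluster for the shd status directly and then look at its log (the log path below is the stock default; adjust it if your install logs elsewhere):

# is shd listed as online for this volume?
$ gluster volume status testvol shd
# inspect the self-heal daemon's log on the affected node
$ tail -n 50 /var/log/glusterfs/glustershd.log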

-Ravi

On Mon, Sep 4, 2017 at 7:24 PM, lejeczek <pelj...@yahoo.co.uk> wrote:

    1) One peer, out of four, got separated from the rest of the
    cluster by a network failure.
    2) While it was unavailable, that peer was detached with the
    "gluster peer detach" command, which succeeded, so the cluster
    now comprises three peers.
    3) The self-heal daemon (for some reason) does not start, even
    after an attempt to restart glusterd, on the peer which had
    probed that fourth peer.
    4) The fourth, unavailable peer is still up & running but is
    inaccessible to the other peers because the network is
    disconnected, segmented. That peer's own peer status shows it
    as still in the cluster.
    5) So the fourth peer's gluster stack (and its other processes)
    did not fail or crash; only the network got, and remains,
    disconnected.
    6) peer status shows OK & connected for the current three peers.

    This is the third time this has happened to me, in the very same
    way: each time, once the net-disjointed peer was brought back
    online, statistics & details worked again.

    Can you not reproduce it? A rough sketch is below.
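
    For reference, one way to simulate the partition without pulling
    a cable (<peer4-ip> is a placeholder for the node that drops off
    the network, not one of my actual hosts) might be:

    # on the surviving peers, blackhole the fourth peer's address
    $ iptables -A INPUT -s <peer4-ip> -j DROP
    $ iptables -A OUTPUT -d <peer4-ip> -j DROP
    # while it is unreachable, detach it from one surviving peer
    # (add "force" if your version refuses while the peer is down)
    $ gluster peer detach <peer4-ip>
    # then compare: info succeeds, statistics fails
    $ gluster vol heal QEMU-VMs info
    $ gluster vol heal QEMU-VMs statistics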

    $ gluster vol info QEMU-VMs

    Volume Name: QEMU-VMs
    Type: Replicate
    Volume ID: 8709782a-daa5-4434-a816-c4e0aef8fef2
    Status: Started
    Snapshot Count: 0
    Number of Bricks: 1 x 3 = 3
    Transport-type: tcp
    Bricks:
    Brick1: 10.5.6.32:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-QEMU-VMs
    Brick2: 10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-QEMU-VMs
    Brick3: 10.5.6.100:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-QEMU-VMs
    Options Reconfigured:
    transport.address-family: inet
    nfs.disable: on
    storage.owner-gid: 107
    storage.owner-uid: 107
    performance.readdir-ahead: on
    geo-replication.indexing: on
    geo-replication.ignore-pid-check: on
    changelog.changelog: on

    $ gluster vol status QEMU-VMs
    Status of volume: QEMU-VMs
    Gluster process                             TCP Port  RDMA Port  Online  Pid
    ------------------------------------------------------------------------------
    Brick 10.5.6.32:/__.aLocalStorages/0/0-GLUS
    TERs/0GLUSTER-QEMU-VMs                      49156     0          Y       9302
    Brick 10.5.6.49:/__.aLocalStorages/0/0-GLUS
    TERs/0GLUSTER-QEMU-VMs                      49156     0          Y       7610
    Brick 10.5.6.100:/__.aLocalStorages/0/0-GLU
    STERs/0GLUSTER-QEMU-VMs                     49156     0          Y       11013
    Self-heal Daemon on localhost               N/A       N/A        Y       3069276
    Self-heal Daemon on 10.5.6.32               N/A       N/A        Y       3315870
    Self-heal Daemon on 10.5.6.49               N/A       N/A        N       N/A   <--- HERE
    Self-heal Daemon on 10.5.6.17               N/A       N/A        Y       5163

    Task Status of Volume QEMU-VMs
    ------------------------------------------------------------------------------
    There are no active volume tasks

    $ gluster vol heal QEMU-VMs statistics heal-count
    Gathering count of entries to be healed on volume QEMU-VMs has
    been unsuccessful on bricks that are down. Please check if all
    brick processes are running.
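
    FWIW, since the status output shows the self-heal daemon down on
    10.5.6.49, a forced volume start is supposed to respawn missing
    daemons (shd included) without touching running bricks; assuming
    the dead shd is the only problem here, it would be worth checking
    whether heal-count works after:

    $ gluster volume start QEMU-VMs force
    $ gluster vol heal QEMU-VMs statistics heal-count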



    On 04/09/17 11:47, Atin Mukherjee wrote:

        Please provide the output of gluster volume info, gluster
        volume status and gluster peer status.

        On Mon, Sep 4, 2017 at 4:07 PM, lejeczek <pelj...@yahoo.co.uk> wrote:

            hi all

            this:
            $ gluster vol heal $_vol info
            outputs OK and the exit code is 0.
            But if I want to see statistics:
            $ gluster vol heal $_vol statistics
            Gathering crawl statistics on volume GROUP-WORK has
            been unsuccessful on bricks that are down. Please
            check if all brick processes are running.

            I suspect gluster's inability to cope with a situation
            where one peer (which does not even host a brick for a
            single vol on the cluster!) is inaccessible to the rest
            of the cluster.
            I have not played with any other variations of this
            case, e.g. more than one peer going down, etc.
            But I hope someone could try to replicate this simple
            test case.

            When something like this happens, the cluster and vols
            seem accessible, and as such "all" works, except when
            you want more details.
            This also fails:
            $ gluster vol status $_vol detail
            Error : Request timed out
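
            When a query times out like that, the CLI log and the
            glusterd log on each peer usually show which node the
            request got stuck on (default locations shown; the
            glusterd log file name varies by version):

            $ tail -f /var/log/glusterfs/cli.log
            $ tail -f /var/log/glusterfs/glusterd.log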

            My gluster (3.10.5-1.el7.x86_64) exhibits these
            symptoms every time at least one peer goes out of the
            others' reach.

            maybe @devel can comment?

            many thanks, L.

_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
