On 09/27/2016 10:29 PM, Dennis Michael wrote:


[root@fs4 bricks]# gluster volume info
Volume Name: cees-data
Type: Distribute
Volume ID: 27d2a59c-bdac-4f66-bcd8-e6124e53a4a2
Status: Started
Number of Bricks: 4
Transport-type: tcp,rdma
Bricks:
Brick1: fs1:/data/brick
Brick2: fs2:/data/brick
Brick3: fs3:/data/brick
Brick4: fs4:/data/brick
Options Reconfigured:
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
[root@fs4 bricks]# gluster volume status
Status of volume: cees-data
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick fs1:/data/brick                       49152     49153  Y       1878
Brick fs2:/data/brick                       49152     0      Y       1707
Brick fs3:/data/brick                       49152     0      Y       4696
Brick fs4:/data/brick                       N/A       N/A    N       N/A
NFS Server on localhost                     2049      0      Y       13808
Quota Daemon on localhost                   N/A       N/A    Y       13813
NFS Server on fs1                           2049      0      Y       6722
Quota Daemon on fs1                         N/A       N/A    Y       6730
NFS Server on fs3                           2049      0      Y       12553
Quota Daemon on fs3                         N/A       N/A    Y       12561
NFS Server on fs2                           2049      0      Y       11702
Quota Daemon on fs2                         N/A       N/A    Y       11710
Task Status of Volume cees-data
------------------------------------------------------------------------------
There are no active volume tasks
[root@fs4 bricks]# ps auxww | grep gluster
root 13791 0.0 0.0 701472 19768 ? Ssl 09:06 0:00 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
root 13808 0.0 0.0 560236 41420 ? Ssl 09:07 0:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/gluster/01c61523374369658a62b75c582b5ac2.socket
root 13813 0.0 0.0 443164 17908 ? Ssl 09:07 0:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/quotad -p /var/lib/glusterd/quotad/run/quotad.pid -l /var/log/glusterfs/quotad.log -S /var/run/gluster/3753def90f5c34f656513dba6a544f7d.socket --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off
root 13874 0.0 0.0 1200472 31700 ? Ssl 09:16 0:00 /usr/sbin/glusterfsd -s fs4 --volfile-id cees-data.fs4.data-brick -p /var/lib/glusterd/vols/cees-data/run/fs4-data-brick.pid -S /var/run/gluster/5203ab38be21e1d37c04f6bdfee77d4a.socket --brick-name /data/brick -l /var/log/glusterfs/bricks/data-brick.log --xlator-option *-posix.glusterd-uuid=f04b231e-63f8-4374-91ae-17c0c623f165 --brick-port 49152 49153 --xlator-option cees-data-server.transport.rdma.listen-port=49153 --xlator-option cees-data-server.listen-port=49152 --volfile-server-transport=socket,rdma
root 13941 0.0 0.0 112648 976 pts/0 S+ 09:50 0:00 grep --color=auto gluster

[root@fs4 bricks]# systemctl restart glusterfsd glusterd

[root@fs4 bricks]# ps auxww | grep gluster
root 13808 0.0 0.0 560236 41420 ? Ssl 09:07 0:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/gluster/01c61523374369658a62b75c582b5ac2.socket
root 13813 0.0 0.0 443164 17908 ? Ssl 09:07 0:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/quotad -p /var/lib/glusterd/quotad/run/quotad.pid -l /var/log/glusterfs/quotad.log -S /var/run/gluster/3753def90f5c34f656513dba6a544f7d.socket --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off
root 13953 0.1 0.0 570740 14988 ? Ssl 09:51 0:00 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
root 13965 0.0 0.0 112648 976 pts/0 S+ 09:51 0:00 grep --color=auto gluster

[root@fs4 bricks]# gluster volume info
Volume Name: cees-data
Type: Distribute
Volume ID: 27d2a59c-bdac-4f66-bcd8-e6124e53a4a2
Status: Started
Number of Bricks: 3
Transport-type: tcp,rdma
Bricks:
Brick1: fs1:/data/brick
Brick2: fs2:/data/brick
Brick3: fs3:/data/brick
Options Reconfigured:
performance.readdir-ahead: on
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on


I'm not sure what's going on here. Restarting glusterd seems to have changed the output of `gluster volume info`? I also see you are using RDMA. Not sure why the RDMA ports for fs2 and fs3 show up as 0 in the volume status output. CC'ing some glusterd/rdma devs for pointers.
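If it helps while triaging, the per-brick ports in that status table can be checked mechanically. This is just an illustrative helper (`check_bricks` is our name, not a gluster command) that assumes the column layout shown above:

```shell
# Illustrative helper (not part of gluster): flag bricks whose RDMA
# port is missing or which are offline. Assumes the `gluster volume
# status` columns above: name, TCP port, RDMA port, Online, Pid.
check_bricks() {
    awk '/^Brick / {
        if ($4 == "0" || $4 == "N/A") printf "%s: RDMA port not registered\n", $2
        if ($5 == "N")                printf "%s: offline\n", $2
    }'
}

# Against the rows from this thread (fs2 lacks an RDMA port; fs4 is down):
check_bricks <<'EOF'
Brick fs1:/data/brick                       49152     49153  Y       1878
Brick fs2:/data/brick                       49152     0      Y       1707
Brick fs4:/data/brick                       N/A       N/A    N       N/A
EOF
```

On a live node you would pipe the real output in: `gluster volume status | check_bricks`.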

-Ravi


[root@fs4 bricks]# gluster volume status
Status of volume: cees-data
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick fs1:/data/brick                       49152     49153  Y       1878
Brick fs2:/data/brick                       49152     0      Y       1707
Brick fs3:/data/brick                       49152     0      Y       4696
NFS Server on localhost                     2049      0      Y       13968
Quota Daemon on localhost                   N/A       N/A    Y       13976
NFS Server on fs2                           2049      0      Y       11702
Quota Daemon on fs2                         N/A       N/A    Y       11710
NFS Server on fs3                           2049      0      Y       12553
Quota Daemon on fs3                         N/A       N/A    Y       12561
NFS Server on fs1                           2049      0      Y       6722
Task Status of Volume cees-data
------------------------------------------------------------------------------
There are no active volume tasks

[root@fs4 bricks]# gluster peer status
Number of Peers: 3

Hostname: fs1
Uuid: ddc0a23e-05e5-48f7-993e-a37e43b21605
State: Peer in Cluster (Connected)

Hostname: fs2
Uuid: e37108f8-d2f1-4f28-adc8-0b3d3401df29
State: Peer in Cluster (Connected)

Hostname: fs3
Uuid: 19a42201-c932-44db-b1a7-8b5b1af32a36
State: Peer in Cluster (Connected)

Dennis


On Tue, Sep 27, 2016 at 9:40 AM, Ravishankar N <[email protected]> wrote:

    On 09/27/2016 09:53 PM, Dennis Michael wrote:
    Yes, you are right.  I mixed up the logs.  I just ran the
    add-brick command again after cleaning up fs4 and re-installing
    gluster.  This is the complete fs4 data-brick.log.

    [root@fs1 ~]# gluster volume add-brick cees-data fs4:/data/brick
    volume add-brick: failed: Commit failed on fs4. Please check log
    file for details.

    [root@fs4 bricks]# pwd
    /var/log/glusterfs/bricks
    [root@fs4 bricks]# cat data-brick.log
    [2016-09-27 16:16:28.095661] I [MSGID: 100030]
    [glusterfsd.c:2338:main] 0-/usr/sbin/glusterfsd: Started running
    /usr/sbin/glusterfsd version 3.7.14 (args: /usr/sbin/glusterfsd
    -s fs4 --volfile-id cees-data.fs4.data-brick -p
    /var/lib/glusterd/vols/cees-data/run/fs4-data-brick.pid -S
    /var/run/gluster/5203ab38be21e1d37c04f6bdfee77d4a.socket
    --brick-name /data/brick -l
    /var/log/glusterfs/bricks/data-brick.log --xlator-option
    *-posix.glusterd-uuid=f04b231e-63f8-4374-91ae-17c0c623f165
    --brick-port 49152 --xlator-option
    cees-data-server.transport.rdma.listen-port=49153 --xlator-option
    cees-data-server.listen-port=49152
    --volfile-server-transport=socket,rdma)
    [2016-09-27 16:16:28.101547] I [MSGID: 101190]
    [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started
    thread with index 1
    [2016-09-27 16:16:28.104637] I
    [graph.c:269:gf_add_cmdline_options] 0-cees-data-server: adding
    option 'listen-port' for volume 'cees-data-server' with value '49152'
    [2016-09-27 16:16:28.104646] I
    [graph.c:269:gf_add_cmdline_options] 0-cees-data-server: adding
    option 'transport.rdma.listen-port' for volume 'cees-data-server'
    with value '49153'
    [2016-09-27 16:16:28.104662] I
    [graph.c:269:gf_add_cmdline_options] 0-cees-data-posix: adding
    option 'glusterd-uuid' for volume 'cees-data-posix' with value
    'f04b231e-63f8-4374-91ae-17c0c623f165'
    [2016-09-27 16:16:28.104808] I [MSGID: 115034]
    [server.c:403:_check_for_auth_option] 0-/data/brick: skip format
    check for non-addr auth option auth.login./data/brick.allow
    [2016-09-27 16:16:28.104814] I [MSGID: 115034]
    [server.c:403:_check_for_auth_option] 0-/data/brick: skip format
    check for non-addr auth option
    auth.login.18ddaf4c-ad98-4155-9372-717eae718b4c.password
    [2016-09-27 16:16:28.104883] I [MSGID: 101190]
    [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started
    thread with index 2
    [2016-09-27 16:16:28.105479] I
    [rpcsvc.c:2196:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service:
    Configured rpc.outstanding-rpc-limit with value 64
    [2016-09-27 16:16:28.105532] W [MSGID: 101002]
    [options.c:957:xl_opt_validate] 0-cees-data-server: option
    'listen-port' is deprecated, preferred is
    'transport.socket.listen-port', continuing with correction
    [2016-09-27 16:16:28.109456] W [socket.c:3665:reconfigure]
    0-cees-data-quota: NBIO on -1 failed (Bad file descriptor)
    [2016-09-27 16:16:28.489255] I [MSGID: 121050]
    [ctr-helper.c:259:extract_ctr_options] 0-gfdbdatastore: CTR
    Xlator is disabled.
    [2016-09-27 16:16:28.489272] W [MSGID: 101105]
    [gfdb_sqlite3.h:239:gfdb_set_sql_params]
    0-cees-data-changetimerecorder: Failed to retrieve
    sql-db-pagesize from params.Assigning default value: 4096
    [2016-09-27 16:16:28.489278] W [MSGID: 101105]
    [gfdb_sqlite3.h:239:gfdb_set_sql_params]
    0-cees-data-changetimerecorder: Failed to retrieve
    sql-db-journalmode from params.Assigning default value: wal
    [2016-09-27 16:16:28.489284] W [MSGID: 101105]
    [gfdb_sqlite3.h:239:gfdb_set_sql_params]
    0-cees-data-changetimerecorder: Failed to retrieve sql-db-sync
    from params.Assigning default value: off
    [2016-09-27 16:16:28.489288] W [MSGID: 101105]
    [gfdb_sqlite3.h:239:gfdb_set_sql_params]
    0-cees-data-changetimerecorder: Failed to retrieve
    sql-db-autovacuum from params.Assigning default value: none
    [2016-09-27 16:16:28.490431] I [trash.c:2412:init]
    0-cees-data-trash: no option specified for 'eliminate', using NULL
    [2016-09-27 16:16:28.672814] W
    [graph.c:357:_log_if_unknown_option] 0-cees-data-server: option
    'rpc-auth.auth-glusterfs' is not recognized
    [2016-09-27 16:16:28.672854] W
    [graph.c:357:_log_if_unknown_option] 0-cees-data-server: option
    'rpc-auth.auth-unix' is not recognized
    [2016-09-27 16:16:28.672872] W
    [graph.c:357:_log_if_unknown_option] 0-cees-data-server: option
    'rpc-auth.auth-null' is not recognized
    [2016-09-27 16:16:28.672924] W
    [graph.c:357:_log_if_unknown_option] 0-cees-data-quota: option
    'timeout' is not recognized
    [2016-09-27 16:16:28.672955] W
    [graph.c:357:_log_if_unknown_option] 0-cees-data-trash: option
    'brick-path' is not recognized
    Final graph:
    
+------------------------------------------------------------------------------+
      1: volume cees-data-posix
      2:     type storage/posix
      3:     option glusterd-uuid f04b231e-63f8-4374-91ae-17c0c623f165
      4:     option directory /data/brick
      5:     option volume-id 27d2a59c-bdac-4f66-bcd8-e6124e53a4a2
      6:     option update-link-count-parent on
      7: end-volume
      8:
      9: volume cees-data-trash
     10:     type features/trash
     11:     option trash-dir .trashcan
     12:     option brick-path /data/brick
     13:     option trash-internal-op off
     14:     subvolumes cees-data-posix
     15: end-volume
     16:
     17: volume cees-data-changetimerecorder
     18:     type features/changetimerecorder
     19:     option db-type sqlite3
     20:     option hot-brick off
     21:     option db-name brick.db
     22:     option db-path /data/brick/.glusterfs/
     23:     option record-exit off
     24:     option ctr_link_consistency off
     25:     option ctr_lookupheal_link_timeout 300
     26:     option ctr_lookupheal_inode_timeout 300
     27:     option record-entry on
     28:     option ctr-enabled off
     29:     option record-counters off
     30:     option ctr-record-metadata-heat off
     31:     option sql-db-cachesize 1000
     32:     option sql-db-wal-autocheckpoint 1000
     33:     subvolumes cees-data-trash
     34: end-volume
     35:
     36: volume cees-data-changelog
     37:     type features/changelog
     38:     option changelog-brick /data/brick
     39:     option changelog-dir /data/brick/.glusterfs/changelogs
     40:     option changelog-barrier-timeout 120
     41:     subvolumes cees-data-changetimerecorder
     42: end-volume
     43:
     44: volume cees-data-bitrot-stub
     45:     type features/bitrot-stub
     46:     option export /data/brick
     47:     subvolumes cees-data-changelog
     48: end-volume
     49:
     50: volume cees-data-access-control
     51:     type features/access-control
     52:     subvolumes cees-data-bitrot-stub
     53: end-volume
     54:
     55: volume cees-data-locks
     56:     type features/locks
     57:     subvolumes cees-data-access-control
     58: end-volume
     59:
     60: volume cees-data-upcall
     61:     type features/upcall
     62:     option cache-invalidation off
     63:     subvolumes cees-data-locks
     64: end-volume
     65:
     66: volume cees-data-io-threads
     67:     type performance/io-threads
     68:     subvolumes cees-data-upcall
     69: end-volume
     70:
     71: volume cees-data-marker
     72:     type features/marker
     73:     option volume-uuid 27d2a59c-bdac-4f66-bcd8-e6124e53a4a2
     74:     option timestamp-file
    /var/lib/glusterd/vols/cees-data/marker.tstamp
     75:     option quota-version 1
     76:     option xtime off
     77:     option gsync-force-xtime off
     78:     option quota on
     79:     option inode-quota on
     80:     subvolumes cees-data-io-threads
     81: end-volume
     82:
     83: volume cees-data-barrier
     84:     type features/barrier
     85:     option barrier disable
     86:     option barrier-timeout 120
     87:     subvolumes cees-data-marker
     88: end-volume
     89:
     90: volume cees-data-index
     91:     type features/index
     92:     option index-base /data/brick/.glusterfs/indices
     93:     subvolumes cees-data-barrier
     94: end-volume
     95:
     96: volume cees-data-quota
     97:     type features/quota
     98:     option transport.socket.connect-path
    /var/run/gluster/quotad.socket
     99:     option transport-type socket
    100:     option transport.address-family unix
    101:     option volume-uuid cees-data
    102:     option server-quota on
    103:     option timeout 0
    104:     option deem-statfs on
    105:     subvolumes cees-data-index
    106: end-volume
    107:
    108: volume cees-data-worm
    109:     type features/worm
    110:     option worm off
    111:     subvolumes cees-data-quota
    112: end-volume
    113:
    114: volume cees-data-read-only
    115:     type features/read-only
    116:     option read-only off
    117:     subvolumes cees-data-worm
    118: end-volume
    119:
    120: volume /data/brick
    121:     type debug/io-stats
    122:     option log-level INFO
    123:     option latency-measurement off
    124:     option count-fop-hits off
    125:     subvolumes cees-data-read-only
    126: end-volume
    127:
    128: volume cees-data-server
    129:     type protocol/server
    130:     option transport.socket.listen-port 49152
    131:     option rpc-auth.auth-glusterfs on
    132:     option rpc-auth.auth-unix on
    133:     option rpc-auth.auth-null on
    134:     option rpc-auth-allow-insecure on
    135:     option transport.rdma.listen-port 49153
    136:     option transport-type tcp,rdma
    137:     option auth.login./data/brick.allow
    18ddaf4c-ad98-4155-9372-717eae718b4c
    138:     option
    auth.login.18ddaf4c-ad98-4155-9372-717eae718b4c.password
    9e913e92-7de0-47f9-94ed-d08cbb130d23
    139:     option auth.addr./data/brick.allow *
    140:     subvolumes /data/brick
    141: end-volume
    142:
    
+------------------------------------------------------------------------------+
    [2016-09-27 16:16:30.079541] I [login.c:81:gf_auth] 0-auth/login:
    allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
    [2016-09-27 16:16:30.079567] I [MSGID: 115029]
    [server-handshake.c:690:server_setvolume] 0-cees-data-server:
    accepted client from
    fs3-12560-2016/09/27-16:16:30:47674-cees-data-client-3-0-0
    (version: 3.7.14)
    [2016-09-27 16:16:30.081487] I [login.c:81:gf_auth] 0-auth/login:
    allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
    [2016-09-27 16:16:30.081505] I [MSGID: 115029]
    [server-handshake.c:690:server_setvolume] 0-cees-data-server:
    accepted client from
    fs2-11709-2016/09/27-16:16:30:50047-cees-data-client-3-0-0
    (version: 3.7.14)
    [2016-09-27 16:16:30.111091] I [login.c:81:gf_auth] 0-auth/login:
    allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
    [2016-09-27 16:16:30.111113] I [MSGID: 115029]
    [server-handshake.c:690:server_setvolume] 0-cees-data-server:
    accepted client from
    fs2-11701-2016/09/27-16:16:29:24060-cees-data-client-3-0-0
    (version: 3.7.14)
    [2016-09-27 16:16:30.112822] I [login.c:81:gf_auth] 0-auth/login:
    allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
    [2016-09-27 16:16:30.112836] I [MSGID: 115029]
    [server-handshake.c:690:server_setvolume] 0-cees-data-server:
    accepted client from
    fs3-12552-2016/09/27-16:16:29:23041-cees-data-client-3-0-0
    (version: 3.7.14)
    [2016-09-27 16:16:31.950978] I [login.c:81:gf_auth] 0-auth/login:
    allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
    [2016-09-27 16:16:31.950998] I [MSGID: 115029]
    [server-handshake.c:690:server_setvolume] 0-cees-data-server:
    accepted client from
    fs1-6721-2016/09/27-16:16:26:939991-cees-data-client-3-0-0
    (version: 3.7.14)
    [2016-09-27 16:16:31.981977] I [login.c:81:gf_auth] 0-auth/login:
    allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
    [2016-09-27 16:16:31.981994] I [MSGID: 115029]
    [server-handshake.c:690:server_setvolume] 0-cees-data-server:
    accepted client from
    fs1-6729-2016/09/27-16:16:27:971228-cees-data-client-3-0-0
    (version: 3.7.14)


    Hmm, this shows the brick has started.
    Does gluster volume info on fs4 show all 4 bricks? (I guess it
    does, based on your first email.)
    Does gluster volume status on fs4 (or ps aux | grep glusterfsd)
    show the brick as running?
    Does gluster peer status on all nodes list the other 3 nodes as
    connected?

    If yes, you could try `service glusterd restart` on fs4 and see if
    it brings up the brick. I'm just shooting in the dark here for
    possible clues.
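For what it's worth, those checks are easy to script against the CLI output. A minimal sketch (`count_connected` is our helper name, not a gluster command), shown against the peer-status format used in this thread:

```shell
# Illustrative helper: count peers reported as connected in
# `gluster peer status` output read from stdin.
count_connected() {
    grep -c 'State: Peer in Cluster (Connected)'
}

# On the peer status output from this thread this prints 3; on a live
# node you would run: gluster peer status | count_connected
count_connected <<'EOF'
Hostname: fs1
State: Peer in Cluster (Connected)
Hostname: fs2
State: Peer in Cluster (Connected)
Hostname: fs3
State: Peer in Cluster (Connected)
EOF
```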
    -Ravi

    On Tue, Sep 27, 2016 at 8:46 AM, Ravishankar N <[email protected]> wrote:

        On 09/27/2016 09:06 PM, Dennis Michael wrote:
        Yes, the brick log /var/log/glusterfs/bricks/data-brick.log
        is created on fs4, and the snippets showing the errors were
        from that log.

        Unless I'm missing something, the snippet below is from
        glusterd's log, not the brick's, as is evident from the
        function names.
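One quick way to tell the two logs apart is the source file embedded in each line: glusterd-*.c frames can only come from the management daemon. A rough filter, assuming the log format shown in this thread (`is_glusterd_line` is just an illustrative name):

```shell
# Illustrative filter: pass through only lines logged from glusterd's
# own source files (glusterd-utils.c, glusterd-brick-ops.c, ...).
is_glusterd_line() {
    grep -E '\[glusterd-[a-z-]+\.c:[0-9]+'
}

# The error Dennis posted matches, so it came from glusterd's log,
# not from the brick log:
is_glusterd_line <<'EOF'
[2016-09-26 22:44:39.254921] E [MSGID: 106005] [glusterd-utils.c:4771:glusterd_brick_start] 0-management: Unable to start brick fs4:/data/brick
EOF
```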
        -Ravi
        Dennis

        On Mon, Sep 26, 2016 at 5:58 PM, Ravishankar N <[email protected]> wrote:

            On 09/27/2016 05:25 AM, Dennis Michael wrote:

                [2016-09-26 22:44:39.254921] E [MSGID: 106005]
                [glusterd-utils.c:4771:glusterd_brick_start]
                0-management: Unable to start brick fs4:/data/brick
                [2016-09-26 22:44:39.254949] E [MSGID: 106074]
                [glusterd-brick-ops.c:2372:glusterd_op_add_brick]
                0-glusterd: Unable to add bricks


            Is the brick log created on fs4? Does it contain
            warnings/errors?

            -Ravi







_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
