On 09/27/2016 10:29 PM, Dennis Michael wrote:
[root@fs4 bricks]# gluster volume info
Volume Name: cees-data
Type: Distribute
Volume ID: 27d2a59c-bdac-4f66-bcd8-e6124e53a4a2
Status: Started
Number of Bricks: 4
Transport-type: tcp,rdma
Bricks:
Brick1: fs1:/data/brick
Brick2: fs2:/data/brick
Brick3: fs3:/data/brick
Brick4: fs4:/data/brick
Options Reconfigured:
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
[root@fs4 bricks]# gluster volume status
Status of volume: cees-data
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick fs1:/data/brick                        49152     49153      Y       1878
Brick fs2:/data/brick                        49152     0          Y       1707
Brick fs3:/data/brick                        49152     0          Y       4696
Brick fs4:/data/brick                        N/A       N/A        N       N/A
NFS Server on localhost                      2049      0          Y       13808
Quota Daemon on localhost                    N/A       N/A        Y       13813
NFS Server on fs1                            2049      0          Y       6722
Quota Daemon on fs1                          N/A       N/A        Y       6730
NFS Server on fs3                            2049      0          Y       12553
Quota Daemon on fs3                          N/A       N/A        Y       12561
NFS Server on fs2                            2049      0          Y       11702
Quota Daemon on fs2                          N/A       N/A        Y       11710
Task Status of Volume cees-data
------------------------------------------------------------------------------
There are no active volume tasks
[root@fs4 bricks]# ps auxww | grep gluster
root 13791 0.0 0.0 701472 19768 ? Ssl 09:06 0:00
/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
root 13808 0.0 0.0 560236 41420 ? Ssl 09:07 0:00
/usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p
/var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S
/var/run/gluster/01c61523374369658a62b75c582b5ac2.socket
root 13813 0.0 0.0 443164 17908 ? Ssl 09:07 0:00
/usr/sbin/glusterfs -s localhost --volfile-id gluster/quotad -p
/var/lib/glusterd/quotad/run/quotad.pid -l
/var/log/glusterfs/quotad.log -S
/var/run/gluster/3753def90f5c34f656513dba6a544f7d.socket
--xlator-option *replicate*.data-self-heal=off --xlator-option
*replicate*.metadata-self-heal=off --xlator-option
*replicate*.entry-self-heal=off
root 13874 0.0 0.0 1200472 31700 ? Ssl 09:16 0:00
/usr/sbin/glusterfsd -s fs4 --volfile-id cees-data.fs4.data-brick -p
/var/lib/glusterd/vols/cees-data/run/fs4-data-brick.pid -S
/var/run/gluster/5203ab38be21e1d37c04f6bdfee77d4a.socket --brick-name
/data/brick -l /var/log/glusterfs/bricks/data-brick.log
--xlator-option
*-posix.glusterd-uuid=f04b231e-63f8-4374-91ae-17c0c623f165
--brick-port 49152 49153 --xlator-option
cees-data-server.transport.rdma.listen-port=49153 --xlator-option
cees-data-server.listen-port=49152 --volfile-server-transport=socket,rdma
root 13941 0.0 0.0 112648 976 pts/0 S+ 09:50 0:00 grep
--color=auto gluster
[root@fs4 bricks]# systemctl restart glusterfsd glusterd
[root@fs4 bricks]# ps auxww | grep gluster
root 13808 0.0 0.0 560236 41420 ? Ssl 09:07 0:00
/usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p
/var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S
/var/run/gluster/01c61523374369658a62b75c582b5ac2.socket
root 13813 0.0 0.0 443164 17908 ? Ssl 09:07 0:00
/usr/sbin/glusterfs -s localhost --volfile-id gluster/quotad -p
/var/lib/glusterd/quotad/run/quotad.pid -l
/var/log/glusterfs/quotad.log -S
/var/run/gluster/3753def90f5c34f656513dba6a544f7d.socket
--xlator-option *replicate*.data-self-heal=off --xlator-option
*replicate*.metadata-self-heal=off --xlator-option
*replicate*.entry-self-heal=off
root 13953 0.1 0.0 570740 14988 ? Ssl 09:51 0:00
/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
root 13965 0.0 0.0 112648 976 pts/0 S+ 09:51 0:00 grep
--color=auto gluster
[root@fs4 bricks]# gluster volume info
Volume Name: cees-data
Type: Distribute
Volume ID: 27d2a59c-bdac-4f66-bcd8-e6124e53a4a2
Status: Started
Number of Bricks: 3
Transport-type: tcp,rdma
Bricks:
Brick1: fs1:/data/brick
Brick2: fs2:/data/brick
Brick3: fs3:/data/brick
Options Reconfigured:
performance.readdir-ahead: on
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on
I'm not sure what's going on here. Restarting glusterd on fs4 seems to change
the output of gluster volume info (the fs4 brick is no longer listed)? I also
see you are using RDMA. I'm not sure why the RDMA ports for fs2 and fs3 are
not shown in the volume status output. CC'ing some glusterd/rdma devs for
pointers.
-Ravi
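One way to cross-check the per-brick transport ports on each node (a sketch; the `detail` output format varies across 3.7.x releases, and the volfile path assumes the default /var/lib/glusterd layout shown elsewhere in this thread):

```shell
# Per-brick port and status detail for the volume
gluster volume status cees-data detail

# What RDMA listen-port each brick volfile was generated with
grep -r 'transport.rdma.listen-port' /var/lib/glusterd/vols/cees-data/
```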
[root@fs4 bricks]# gluster volume status
Status of volume: cees-data
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick fs1:/data/brick                        49152     49153      Y       1878
Brick fs2:/data/brick                        49152     0          Y       1707
Brick fs3:/data/brick                        49152     0          Y       4696
NFS Server on localhost                      2049      0          Y       13968
Quota Daemon on localhost                    N/A       N/A        Y       13976
NFS Server on fs2                            2049      0          Y       11702
Quota Daemon on fs2                          N/A       N/A        Y       11710
NFS Server on fs3                            2049      0          Y       12553
Quota Daemon on fs3                          N/A       N/A        Y       12561
NFS Server on fs1                            2049      0          Y       6722
Task Status of Volume cees-data
------------------------------------------------------------------------------
There are no active volume tasks
[root@fs4 bricks]# gluster peer status
Number of Peers: 3
Hostname: fs1
Uuid: ddc0a23e-05e5-48f7-993e-a37e43b21605
State: Peer in Cluster (Connected)
Hostname: fs2
Uuid: e37108f8-d2f1-4f28-adc8-0b3d3401df29
State: Peer in Cluster (Connected)
Hostname: fs3
Uuid: 19a42201-c932-44db-b1a7-8b5b1af32a36
State: Peer in Cluster (Connected)
Dennis
On Tue, Sep 27, 2016 at 9:40 AM, Ravishankar N <[email protected]> wrote:
On 09/27/2016 09:53 PM, Dennis Michael wrote:
Yes, you are right. I mixed up the logs. I just ran the
add-brick command again after cleaning up fs4 and re-installing
gluster. This is the complete fs4 data-brick.log.
[root@fs1 ~]# gluster volume add-brick cees-data fs4:/data/brick
volume add-brick: failed: Commit failed on fs4. Please check log
file for details.
[root@fs4 bricks]# pwd
/var/log/glusterfs/bricks
[root@fs4 bricks]# cat data-brick.log
[2016-09-27 16:16:28.095661] I [MSGID: 100030]
[glusterfsd.c:2338:main] 0-/usr/sbin/glusterfsd: Started running
/usr/sbin/glusterfsd version 3.7.14 (args: /usr/sbin/glusterfsd
-s fs4 --volfile-id cees-data.fs4.data-brick -p
/var/lib/glusterd/vols/cees-data/run/fs4-data-brick.pid -S
/var/run/gluster/5203ab38be21e1d37c04f6bdfee77d4a.socket
--brick-name /data/brick -l
/var/log/glusterfs/bricks/data-brick.log --xlator-option
*-posix.glusterd-uuid=f04b231e-63f8-4374-91ae-17c0c623f165
--brick-port 49152 --xlator-option
cees-data-server.transport.rdma.listen-port=49153 --xlator-option
cees-data-server.listen-port=49152
--volfile-server-transport=socket,rdma)
[2016-09-27 16:16:28.101547] I [MSGID: 101190]
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started
thread with index 1
[2016-09-27 16:16:28.104637] I
[graph.c:269:gf_add_cmdline_options] 0-cees-data-server: adding
option 'listen-port' for volume 'cees-data-server' with value '49152'
[2016-09-27 16:16:28.104646] I
[graph.c:269:gf_add_cmdline_options] 0-cees-data-server: adding
option 'transport.rdma.listen-port' for volume 'cees-data-server'
with value '49153'
[2016-09-27 16:16:28.104662] I
[graph.c:269:gf_add_cmdline_options] 0-cees-data-posix: adding
option 'glusterd-uuid' for volume 'cees-data-posix' with value
'f04b231e-63f8-4374-91ae-17c0c623f165'
[2016-09-27 16:16:28.104808] I [MSGID: 115034]
[server.c:403:_check_for_auth_option] 0-/data/brick: skip format
check for non-addr auth option auth.login./data/brick.allow
[2016-09-27 16:16:28.104814] I [MSGID: 115034]
[server.c:403:_check_for_auth_option] 0-/data/brick: skip format
check for non-addr auth option
auth.login.18ddaf4c-ad98-4155-9372-717eae718b4c.password
[2016-09-27 16:16:28.104883] I [MSGID: 101190]
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started
thread with index 2
[2016-09-27 16:16:28.105479] I
[rpcsvc.c:2196:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service:
Configured rpc.outstanding-rpc-limit with value 64
[2016-09-27 16:16:28.105532] W [MSGID: 101002]
[options.c:957:xl_opt_validate] 0-cees-data-server: option
'listen-port' is deprecated, preferred is
'transport.socket.listen-port', continuing with correction
[2016-09-27 16:16:28.109456] W [socket.c:3665:reconfigure]
0-cees-data-quota: NBIO on -1 failed (Bad file descriptor)
[2016-09-27 16:16:28.489255] I [MSGID: 121050]
[ctr-helper.c:259:extract_ctr_options] 0-gfdbdatastore: CTR
Xlator is disabled.
[2016-09-27 16:16:28.489272] W [MSGID: 101105]
[gfdb_sqlite3.h:239:gfdb_set_sql_params]
0-cees-data-changetimerecorder: Failed to retrieve
sql-db-pagesize from params.Assigning default value: 4096
[2016-09-27 16:16:28.489278] W [MSGID: 101105]
[gfdb_sqlite3.h:239:gfdb_set_sql_params]
0-cees-data-changetimerecorder: Failed to retrieve
sql-db-journalmode from params.Assigning default value: wal
[2016-09-27 16:16:28.489284] W [MSGID: 101105]
[gfdb_sqlite3.h:239:gfdb_set_sql_params]
0-cees-data-changetimerecorder: Failed to retrieve sql-db-sync
from params.Assigning default value: off
[2016-09-27 16:16:28.489288] W [MSGID: 101105]
[gfdb_sqlite3.h:239:gfdb_set_sql_params]
0-cees-data-changetimerecorder: Failed to retrieve
sql-db-autovacuum from params.Assigning default value: none
[2016-09-27 16:16:28.490431] I [trash.c:2412:init]
0-cees-data-trash: no option specified for 'eliminate', using NULL
[2016-09-27 16:16:28.672814] W
[graph.c:357:_log_if_unknown_option] 0-cees-data-server: option
'rpc-auth.auth-glusterfs' is not recognized
[2016-09-27 16:16:28.672854] W
[graph.c:357:_log_if_unknown_option] 0-cees-data-server: option
'rpc-auth.auth-unix' is not recognized
[2016-09-27 16:16:28.672872] W
[graph.c:357:_log_if_unknown_option] 0-cees-data-server: option
'rpc-auth.auth-null' is not recognized
[2016-09-27 16:16:28.672924] W
[graph.c:357:_log_if_unknown_option] 0-cees-data-quota: option
'timeout' is not recognized
[2016-09-27 16:16:28.672955] W
[graph.c:357:_log_if_unknown_option] 0-cees-data-trash: option
'brick-path' is not recognized
Final graph:
+------------------------------------------------------------------------------+
1: volume cees-data-posix
2: type storage/posix
3: option glusterd-uuid f04b231e-63f8-4374-91ae-17c0c623f165
4: option directory /data/brick
5: option volume-id 27d2a59c-bdac-4f66-bcd8-e6124e53a4a2
6: option update-link-count-parent on
7: end-volume
8:
9: volume cees-data-trash
10: type features/trash
11: option trash-dir .trashcan
12: option brick-path /data/brick
13: option trash-internal-op off
14: subvolumes cees-data-posix
15: end-volume
16:
17: volume cees-data-changetimerecorder
18: type features/changetimerecorder
19: option db-type sqlite3
20: option hot-brick off
21: option db-name brick.db
22: option db-path /data/brick/.glusterfs/
23: option record-exit off
24: option ctr_link_consistency off
25: option ctr_lookupheal_link_timeout 300
26: option ctr_lookupheal_inode_timeout 300
27: option record-entry on
28: option ctr-enabled off
29: option record-counters off
30: option ctr-record-metadata-heat off
31: option sql-db-cachesize 1000
32: option sql-db-wal-autocheckpoint 1000
33: subvolumes cees-data-trash
34: end-volume
35:
36: volume cees-data-changelog
37: type features/changelog
38: option changelog-brick /data/brick
39: option changelog-dir /data/brick/.glusterfs/changelogs
40: option changelog-barrier-timeout 120
41: subvolumes cees-data-changetimerecorder
42: end-volume
43:
44: volume cees-data-bitrot-stub
45: type features/bitrot-stub
46: option export /data/brick
47: subvolumes cees-data-changelog
48: end-volume
49:
50: volume cees-data-access-control
51: type features/access-control
52: subvolumes cees-data-bitrot-stub
53: end-volume
54:
55: volume cees-data-locks
56: type features/locks
57: subvolumes cees-data-access-control
58: end-volume
59:
60: volume cees-data-upcall
61: type features/upcall
62: option cache-invalidation off
63: subvolumes cees-data-locks
64: end-volume
65:
66: volume cees-data-io-threads
67: type performance/io-threads
68: subvolumes cees-data-upcall
69: end-volume
70:
71: volume cees-data-marker
72: type features/marker
73: option volume-uuid 27d2a59c-bdac-4f66-bcd8-e6124e53a4a2
74: option timestamp-file
/var/lib/glusterd/vols/cees-data/marker.tstamp
75: option quota-version 1
76: option xtime off
77: option gsync-force-xtime off
78: option quota on
79: option inode-quota on
80: subvolumes cees-data-io-threads
81: end-volume
82:
83: volume cees-data-barrier
84: type features/barrier
85: option barrier disable
86: option barrier-timeout 120
87: subvolumes cees-data-marker
88: end-volume
89:
90: volume cees-data-index
91: type features/index
92: option index-base /data/brick/.glusterfs/indices
93: subvolumes cees-data-barrier
94: end-volume
95:
96: volume cees-data-quota
97: type features/quota
98: option transport.socket.connect-path
/var/run/gluster/quotad.socket
99: option transport-type socket
100: option transport.address-family unix
101: option volume-uuid cees-data
102: option server-quota on
103: option timeout 0
104: option deem-statfs on
105: subvolumes cees-data-index
106: end-volume
107:
108: volume cees-data-worm
109: type features/worm
110: option worm off
111: subvolumes cees-data-quota
112: end-volume
113:
114: volume cees-data-read-only
115: type features/read-only
116: option read-only off
117: subvolumes cees-data-worm
118: end-volume
119:
120: volume /data/brick
121: type debug/io-stats
122: option log-level INFO
123: option latency-measurement off
124: option count-fop-hits off
125: subvolumes cees-data-read-only
126: end-volume
127:
128: volume cees-data-server
129: type protocol/server
130: option transport.socket.listen-port 49152
131: option rpc-auth.auth-glusterfs on
132: option rpc-auth.auth-unix on
133: option rpc-auth.auth-null on
134: option rpc-auth-allow-insecure on
135: option transport.rdma.listen-port 49153
136: option transport-type tcp,rdma
137: option auth.login./data/brick.allow
18ddaf4c-ad98-4155-9372-717eae718b4c
138: option
auth.login.18ddaf4c-ad98-4155-9372-717eae718b4c.password
9e913e92-7de0-47f9-94ed-d08cbb130d23
139: option auth.addr./data/brick.allow *
140: subvolumes /data/brick
141: end-volume
142:
+------------------------------------------------------------------------------+
[2016-09-27 16:16:30.079541] I [login.c:81:gf_auth] 0-auth/login:
allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
[2016-09-27 16:16:30.079567] I [MSGID: 115029]
[server-handshake.c:690:server_setvolume] 0-cees-data-server:
accepted client from
fs3-12560-2016/09/27-16:16:30:47674-cees-data-client-3-0-0
(version: 3.7.14)
[2016-09-27 16:16:30.081487] I [login.c:81:gf_auth] 0-auth/login:
allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
[2016-09-27 16:16:30.081505] I [MSGID: 115029]
[server-handshake.c:690:server_setvolume] 0-cees-data-server:
accepted client from
fs2-11709-2016/09/27-16:16:30:50047-cees-data-client-3-0-0
(version: 3.7.14)
[2016-09-27 16:16:30.111091] I [login.c:81:gf_auth] 0-auth/login:
allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
[2016-09-27 16:16:30.111113] I [MSGID: 115029]
[server-handshake.c:690:server_setvolume] 0-cees-data-server:
accepted client from
fs2-11701-2016/09/27-16:16:29:24060-cees-data-client-3-0-0
(version: 3.7.14)
[2016-09-27 16:16:30.112822] I [login.c:81:gf_auth] 0-auth/login:
allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
[2016-09-27 16:16:30.112836] I [MSGID: 115029]
[server-handshake.c:690:server_setvolume] 0-cees-data-server:
accepted client from
fs3-12552-2016/09/27-16:16:29:23041-cees-data-client-3-0-0
(version: 3.7.14)
[2016-09-27 16:16:31.950978] I [login.c:81:gf_auth] 0-auth/login:
allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
[2016-09-27 16:16:31.950998] I [MSGID: 115029]
[server-handshake.c:690:server_setvolume] 0-cees-data-server:
accepted client from
fs1-6721-2016/09/27-16:16:26:939991-cees-data-client-3-0-0
(version: 3.7.14)
[2016-09-27 16:16:31.981977] I [login.c:81:gf_auth] 0-auth/login:
allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
[2016-09-27 16:16:31.981994] I [MSGID: 115029]
[server-handshake.c:690:server_setvolume] 0-cees-data-server:
accepted client from
fs1-6729-2016/09/27-16:16:27:971228-cees-data-client-3-0-0
(version: 3.7.14)
Hmm, this shows the brick has started.
Does gluster volume info on fs4 show all 4 bricks? (I guess it
does, based on your first email.)
Does gluster volume status on fs4 (or ps aux | grep glusterfsd)
show the brick as running?
Does gluster peer status on all nodes list the other 3 nodes as
connected?
If yes, you could try `service glusterd restart` on fs4 and see if
it brings up the brick. I'm just shooting in the dark here for
possible clues.
-Ravi
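The checks suggested above can be run as one quick sequence on fs4 (a sketch; volume and host names are taken from the outputs in this thread, and the systemd unit name may differ from the `service` invocation depending on the distro packaging):

```shell
# On fs4: confirm the cluster view and whether the brick process is up
gluster volume info cees-data        # should list all 4 bricks
gluster volume status cees-data      # is Brick fs4:/data/brick Online?
ps aux | grep '[g]lusterfsd'         # is the brick daemon actually running?
gluster peer status                  # are the other 3 peers Connected?

# If the view looks sane but the brick is still down, restart glusterd
systemctl restart glusterd           # or: service glusterd restart
gluster volume status cees-data      # re-check the brick afterwards
```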
On Tue, Sep 27, 2016 at 8:46 AM, Ravishankar N <[email protected]> wrote:
On 09/27/2016 09:06 PM, Dennis Michael wrote:
Yes, the brick log /var/log/glusterfs/bricks/data-brick.log
is created on fs4, and the snippets showing the errors were
from that log.
Unless I'm missing something, the snippet below is from
glusterd's log and not the brick's, as is evident from the
function names.
-Ravi
Dennis
On Mon, Sep 26, 2016 at 5:58 PM, Ravishankar N <[email protected]> wrote:
On 09/27/2016 05:25 AM, Dennis Michael wrote:
[2016-09-26 22:44:39.254921] E [MSGID: 106005]
[glusterd-utils.c:4771:glusterd_brick_start]
0-management: Unable to start brick fs4:/data/brick
[2016-09-26 22:44:39.254949] E [MSGID: 106074]
[glusterd-brick-ops.c:2372:glusterd_op_add_brick]
0-glusterd: Unable to add bricks
Is the brick log created on fs4? Does it contain
warnings/errors?
-Ravi
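To pull just the warnings and errors out of the logs on fs4 rather than reading them end to end, something like this could help (a sketch; the brick log path is the one quoted in this thread, while the glusterd log filename is the usual 3.7-era default and may differ on other installs):

```shell
# Warnings (W) and errors (E) from the brick log
grep -E '\] (E|W) \[' /var/log/glusterfs/bricks/data-brick.log

# Recent errors from glusterd itself (filename assumed from the default layout)
grep -E '\] E \[' /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | tail -n 20
```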
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users