On 11/28/2013 09:30 PM, Patrick Haley wrote:
Hi Ravi,

I'm pretty sure the clients use fuse mounts.  The relevant line from /etc/fstab 
is

mseas-data:/gdata       /gdata           glusterfs  defaults,_netdev     0 0


gluster-data sees the other bricks as connected.  The other bricks see each
other as connected but gluster-data as disconnected:

---------------
gluster-data:
---------------
[root@mseas-data ~]# gluster peer status
Number of Peers: 2

Hostname: gluster-0-1
Uuid: 393fc4a6-1573-4564-971e-1b1aec434167
State: Peer in Cluster (Connected)

Hostname: gluster-0-0
Uuid: 3619440a-4ca3-4151-b62e-d4d6bf2e0c03
State: Peer in Cluster (Connected)

-------------
gluster-0-0:
--------------
[root@nas-0-0 ~]# gluster peer status
Number of Peers: 2

Hostname: gluster-data
Uuid: 22f1102a-08e6-482d-ad23-d8e063cf32ed
State: Peer in Cluster (Disconnected)

Hostname: gluster-0-1
Uuid: 393fc4a6-1573-4564-971e-1b1aec434167
State: Peer in Cluster (Connected)

-------------
gluster-0-1:
--------------
[root@nas-0-1 ~]# gluster peer status
Number of Peers: 2

Hostname: gluster-data
Uuid: 22f1102a-08e6-482d-ad23-d8e063cf32ed
State: Peer in Cluster (Disconnected)

Hostname: gluster-0-0
Uuid: 3619440a-4ca3-4151-b62e-d4d6bf2e0c03
State: Peer in Cluster (Connected)

Does any of this suggest what I need to look at next?
Hi Patrick,
If gluster-data is pingable from the other bricks, you could try detaching and reattaching it from gluster-0-0 or gluster-0-1.
1) On gluster-0-0:
    `gluster peer detach gluster-data`; if that fails, `gluster peer detach gluster-data force`
2) On gluster-data:
    `rm -rf /var/lib/glusterd`
    `service glusterd restart`
3) Again on gluster-0-0:
    `gluster peer probe gluster-data`

Now check if things work.
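
One way to run that check, as a minimal sketch: the `disconnected_peers` helper below is a hypothetical name (it is not part of gluster), and it assumes `gluster peer status` prints the `Hostname:` / `State:` layout shown earlier in this thread. It prints the hostname of every peer reported as Disconnected and prints nothing when all peers are connected.

```shell
# Hypothetical helper (not part of gluster): reads `gluster peer status`
# output on stdin and prints the hostname of each disconnected peer.
disconnected_peers() {
    awk '
        /^Hostname:/ { host = $2 }                      # remember the current peer
        /^State:/ && /\(Disconnected\)/ { print host }  # report it if disconnected
    '
}

# Usage on each brick (requires the gluster CLI):
#   gluster peer status | disconnected_peers
```

Running it on all three bricks after the probe should give empty output everywhere if the reattach worked.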
PS: You should really do a 'reply-to-all' so that your queries reach a wider audience and get you faster responses from the community. It also serves as a double-check in case I goof up :)

I'm off to sleep now.

Thanks.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email:  [email protected]
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301


________________________________________
From: Ravishankar N [[email protected]]
Sent: Thursday, November 28, 2013 2:48 AM
To: Patrick Haley
Cc: [email protected]
Subject: Re: [Gluster-users] After reboot, one brick is not being seen by clients

On 11/28/2013 12:52 PM, Patrick Haley wrote:
Hi Ravi,

Thanks for the reply.  If I interpret the output of gluster volume status
correctly, glusterfsd was running

[root@mseas-data ~]# gluster volume status
Status of volume: gdata
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick gluster-0-0:/mseas-data-0-0                       24009   Y       27006
Brick gluster-0-1:/mseas-data-0-1                       24009   Y       7063
Brick gluster-data:/data                                24009   Y       2897
NFS Server on localhost                                 38467   Y       2903
NFS Server on gluster-0-1                               38467   Y       7069
NFS Server on gluster-0-0                               38467   Y       27012

For completeness, I tried both "service glusterd restart" and
"gluster volume start gdata force".  Neither solved the problem.
Note that after "gluster volume start gdata force" the gluster volume status
failed

[root@mseas-data ~]# gluster volume status
operation failed

Failed to get names of volumes

Doing another "service glusterd restart"  let the "gluster volume status"
command work, but the clients still don't see the files on mseas-data.
Are your clients using fuse mounts or NFS mounts?
A second piece of data: on the other bricks, "gluster volume status" does not
show gluster-data:/data:
Hmm, could you check if all 3 bricks are connected? `gluster peer
status` on each brick should show the others as connected.
[root@nas-0-0 ~]# gluster volume status
Status of volume: gdata
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick gluster-0-0:/mseas-data-0-0                       24009   Y       27006
Brick gluster-0-1:/mseas-data-0-1                       24009   Y       7063
NFS Server on localhost                                 38467   Y       27012
NFS Server on gluster-0-1                               38467   Y       8051

Any thoughts on what I should look at next?
I also noticed that the NFS server process on gluster-0-1 (on which I guess no
commands were run) seems to have changed its pid from 7069 to 8051.
FWIW, I am able to observe a similar bug
(https://bugzilla.redhat.com/show_bug.cgi?id=1035586) which needs to be
investigated.
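
To compare the brick lists that two nodes report, as with the mseas-data and nas-0-0 outputs above, something along these lines may help. This is only a sketch: `missing_bricks` is a hypothetical helper name, and it assumes each node's `gluster volume status` output has been saved to a file and that brick rows start with the word `Brick`, as they do in this thread.

```shell
# Hypothetical helper (not part of gluster).
# $1: saved `gluster volume status` output from the reference node
# $2: saved output from the node being compared
# Prints each brick (host:/path) that appears in $1 but not in $2.
missing_bricks() {
    awk '
        # First file given to awk is the node being compared: record its bricks.
        FILENAME == ARGV[1] { if ($1 == "Brick") seen[$2] = 1; next }
        # Second file is the reference: print bricks the other node lacks.
        $1 == "Brick" && !($2 in seen) { print $2 }
    ' "$2" "$1"
}

# Usage (run the first command on each node and copy the files together):
#   gluster volume status > status-$(hostname).txt
#   missing_bricks status-mseas-data.txt status-nas-0-0.txt
```

With the two outputs quoted above, this would be expected to print gluster-data:/data, the brick that nas-0-0 does not list.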

Thanks,
Ravi
Thanks again.



________________________________________
From: Ravishankar N [[email protected]]
Sent: Wednesday, November 27, 2013 11:21 PM
To: Patrick Haley; [email protected]
Subject: Re: [Gluster-users] After reboot, one brick is not being seen by clients

On 11/28/2013 03:12 AM, Pat Haley wrote:
Hi,

We are currently using gluster with 3 bricks.  We just
rebooted one of the bricks (mseas-data, also identified
as gluster-data) which is actually the main server.  After
rebooting this brick, our client machine (mseas) only sees
the files on the other 2 bricks.  Note that if I mount
the gluster filespace (/gdata) on the brick I rebooted,
it sees the entire space.

The last time I had this problem, there was an error in
one of our /etc/hosts file.  This does not seem to be the
case now.

What else can I look at to debug this problem?

Some information I have from the gluster server

[root@mseas-data ~]# gluster --version
glusterfs 3.3.1 built on Oct 11 2012 22:01:05

[root@mseas-data ~]# gluster volume info

Volume Name: gdata
Type: Distribute
Volume ID: eccc3a90-212d-4563-ae8d-10a77758738d
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: gluster-0-0:/mseas-data-0-0
Brick2: gluster-0-1:/mseas-data-0-1
Brick3: gluster-data:/data

[root@mseas-data ~]# ps -ef | grep gluster

root      2781     1  0 15:16 ?        00:00:00 /usr/sbin/glusterd -p /var/run/glusterd.pid
root      2897     1  0 15:16 ?        00:00:00 /usr/sbin/glusterfsd -s localhost --volfile-id gdata.gluster-data.data -p /var/lib/glusterd/vols/gdata/run/gluster-data-data.pid -S /tmp/e3eac7ce95e786a3d909b8fc65ed2059.socket --brick-name /data -l /var/log/glusterfs/bricks/data.log --xlator-option *-posix.glusterd-uuid=22f1102a-08e6-482d-ad23-d8e063cf32ed --brick-port 24009 --xlator-option gdata-server.listen-port=24009
root      2903     1  0 15:16 ?        00:00:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /tmp/d5c892de43c28a1ee7481b780245b789.socket
root      4258     1  0 15:52 ?        00:00:00 /usr/sbin/glusterfs --volfile-id=/gdata --volfile-server=mseas-data /gdata
root      4475  4033  0 16:35 pts/0    00:00:00 grep gluster

From the ps output, the brick process (glusterfsd) doesn't seem to be
running on the gluster-data server. Run `gluster volume status` and
check if that is indeed the case. If yes, you could either restart
glusterd on the brick node (`service glusterd restart`) or restart the
entire volume (`gluster volume start gdata force`), which should bring
the brick process back online.

I'm not sure why glusterd did not start the brick process when you
rebooted the machine in the first place. You could perhaps check the
glusterd log for clues.

Hope this helps,
Ravi

_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users
