Re: [Gluster-users] Failure while upgrading gluster to 3.10.1

2017-07-03 Thread Pawan Alwandi
Hi Kaleb,

Thanks, this refers to 3.11.x

On Mon, Jul 3, 2017 at 4:16 PM, Kaleb S. KEITHLEY <kkeit...@redhat.com>
wrote:

> On 07/03/2017 06:29 AM, Kaleb S. KEITHLEY wrote:
>
>> On 07/03/2017 04:34 AM, Atin Mukherjee wrote:
>>
>>>
>>> On Mon, 3 Jul 2017 at 12:28, Pawan Alwandi <pa...@platform.sh> wrote:
>>>
>>> Hello Atin,
>>>
>>> I've gotten around to this and was able to get the upgrade done using
>>> 3.7.0 before moving to 3.11.  For some reason 3.7.9 wasn't working
>>> well.
>>>
>>> On 3.11, though, I notice that gluster/nfs has been made optional and
>>> nfs-ganesha is recommended.  We have plans to switch to
>>> nfs-ganesha on new clusters but would like to have glusterfs-gnfs on
>>> existing clusters so a seamless upgrade without downtime is possible.
>>>
>>> [2017-07-03 06:43:25.511893] I [MSGID: 106600]
>>> [glusterd-nfs-svc.c:82:glusterd_nfssvc_manager] 0-management:
>>> nfs/server.so xlator is not installed
>>>
>>> I was looking for the glusterfs-gnfs package and noticed that the
>>> .deb is missing -
>>> https://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/8/apt/pool/main/g/glusterfs/
>>> (FWIW, only the RPM is available).  Is it possible that
>>> glusterfs-gnfs be made available for Debian too?
>>>
>>>
>>> Kaleb - can you please help answering to this query?
>>>
>>> The Debian packages aren't split up like the Fedora/RHEL/CentOS RPMs are.
>>
>> I'll respin the Debian packages.
>>
>> Wait. 3.10.x still has gnfs enabled by default.
>
> Are we talking about 3.10.x or 3.11.x? The Subject says 3.10.1.
>
> --
>
> Kaleb
>
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Failure while upgrading gluster to 3.10.1

2017-07-03 Thread Pawan Alwandi
Hello Atin,

I've gotten around to this and was able to get the upgrade done using 3.7.0
before moving to 3.11.  For some reason 3.7.9 wasn't working well.

On 3.11, though, I notice that gluster/nfs has been made optional and
nfs-ganesha is recommended.  We have plans to switch to nfs-ganesha
on new clusters but would like to have glusterfs-gnfs on existing clusters
so a seamless upgrade without downtime is possible.

[2017-07-03 06:43:25.511893] I [MSGID: 106600]
[glusterd-nfs-svc.c:82:glusterd_nfssvc_manager] 0-management: nfs/server.so
xlator is not installed

I was looking for the glusterfs-gnfs package and noticed that the .deb is
missing -
https://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/8/apt/pool/main/g/glusterfs/
(FWIW, only the RPM is available).  Is it possible that glusterfs-gnfs be
made available for Debian too?

Thanks,
Pawan
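
A quick way to confirm whether the gnfs xlator is installed on a node is to
look for nfs/server.so under the xlator directory (a minimal check, assuming
the Debian library path that appears in the logs later in this thread):

# ls /usr/lib/x86_64-linux-gnu/glusterfs/*/xlator/nfs/server.so

If the file is absent, glusterd logs the "nfs/server.so xlator is not
installed" message quoted above.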


On Wed, May 31, 2017 at 5:26 PM, Atin Mukherjee <amukh...@redhat.com> wrote:

>
>
> On Wed, May 31, 2017 at 3:53 PM, Pawan Alwandi <pa...@platform.sh> wrote:
>
>> Hello Atin,
>>
>> Sure.  A note though, we are running gluster on Debian Jessie/Wheezy
>> hosts, but if you let me know what info you would need I'll work to collect
>> that and send across.
>>
>
> Basically I need glusterd log file (starting from last restart) along with
> the brick logs collected from all the nodes.
>
>
>> Pawan
>>
>> On Wed, May 31, 2017 at 2:10 PM, Atin Mukherjee <amukh...@redhat.com>
>> wrote:
>>
>>> Pawan,
>>>
>>> I'd need the sosreport from all the nodes to debug and figure out what's
>>> going wrong. You'd have to give me some time as I have some critical
>>> backlog items to work on.
>>>
>>> On Wed, 31 May 2017 at 11:30, Pawan Alwandi <pa...@platform.sh> wrote:
>>>
>>>> Hello Atin,
>>>>
>>>> I've tried restarting glusterd on the nodes one after another, but still see the same
>>>> result.
>>>>
>>>>
>>>> On Tue, May 30, 2017 at 10:40 AM, Atin Mukherjee <amukh...@redhat.com>
>>>> wrote:
>>>>
>>>>> Pawan - I couldn't reach any conclusive analysis so far. But,
>>>>> looking at the client (nfs) & glusterd log files, it does look like
>>>>> there is an issue w.r.t. peer connections. Does restarting all the
>>>>> glusterd instances one by one solve this?
>>>>>
>>>>> On Mon, May 29, 2017 at 4:50 PM, Pawan Alwandi <pa...@platform.sh>
>>>>> wrote:
>>>>>
>>>>>> Sorry for the big attachment in the previous mail... the last 1000 lines
>>>>>> of those logs are attached now.
>>>>>>
>>>>>> On Mon, May 29, 2017 at 4:44 PM, Pawan Alwandi <pa...@platform.sh>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, May 25, 2017 at 9:54 PM, Atin Mukherjee <amukh...@redhat.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, 25 May 2017 at 19:11, Pawan Alwandi <pa...@platform.sh>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello Atin,
>>>>>>>>>
>>>>>>>>> Yes, the glusterd instances on the other hosts are up and running.  Below is the
>>>>>>>>> requested output on all the three hosts.
>>>>>>>>>
>>>>>>>>> Host 1
>>>>>>>>>
>>>>>>>>> # gluster peer status
>>>>>>>>> Number of Peers: 2
>>>>>>>>>
>>>>>>>>> Hostname: 192.168.0.7
>>>>>>>>> Uuid: 5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>>>> State: Peer in Cluster (Disconnected)
>>>>>>>>>
>>>>>>>>
>>>>>>>> Glusterd is disconnected here.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hostname: 192.168.0.6
>>>>>>>>> Uuid: 83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>>> State: Peer in Cluster (Disconnected)
>>>>>>>>>
>>>>>>>>
>>>>>>>> Same as above
>>>>>>>>
>>>>>>>> Can you please check what the glusterd log has to say here about
>>>>>>>> these disconnects?

Re: [Gluster-users] Failure while upgrading gluster to 3.10.1

2017-05-31 Thread Pawan Alwandi
Hello Atin,

Sure.  A note though, we are running gluster on Debian Jessie/Wheezy hosts,
but if you let me know what info you would need I'll work to collect that
and send across.

Pawan
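
A minimal collection sketch for the glusterd and brick logs requested below,
assuming the default log directory /var/log/glusterfs (brick logs normally
live under its bricks/ subdirectory):

# tar czf gluster-logs-$(hostname).tar.gz /var/log/glusterfs

Run on each node, this covers both the glusterd log and the brick logs.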

On Wed, May 31, 2017 at 2:10 PM, Atin Mukherjee <amukh...@redhat.com> wrote:

> Pawan,
>
> I'd need the sosreport from all the nodes to debug and figure out what's
> going wrong. You'd have to give me some time as I have some critical
> backlog items to work on.
>
> On Wed, 31 May 2017 at 11:30, Pawan Alwandi <pa...@platform.sh> wrote:
>
>> Hello Atin,
>>
>> I've tried restarting glusterd on the nodes one after another, but still see the same
>> result.
>>
>>
>> On Tue, May 30, 2017 at 10:40 AM, Atin Mukherjee <amukh...@redhat.com>
>> wrote:
>>
>>> Pawan - I couldn't reach any conclusive analysis so far. But, looking
>>> at the client (nfs) & glusterd log files, it does look like there is
>>> an issue w.r.t. peer connections. Does restarting all the glusterd
>>> instances one by one solve this?
>>>
>>> On Mon, May 29, 2017 at 4:50 PM, Pawan Alwandi <pa...@platform.sh>
>>> wrote:
>>>
>>>> Sorry for the big attachment in the previous mail... the last 1000 lines
>>>> of those logs are attached now.
>>>>
>>>> On Mon, May 29, 2017 at 4:44 PM, Pawan Alwandi <pa...@platform.sh>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Thu, May 25, 2017 at 9:54 PM, Atin Mukherjee <amukh...@redhat.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> On Thu, 25 May 2017 at 19:11, Pawan Alwandi <pa...@platform.sh>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello Atin,
>>>>>>>
>>>>>>> Yes, the glusterd instances on the other hosts are up and running.  Below is the
>>>>>>> requested output on all the three hosts.
>>>>>>>
>>>>>>> Host 1
>>>>>>>
>>>>>>> # gluster peer status
>>>>>>> Number of Peers: 2
>>>>>>>
>>>>>>> Hostname: 192.168.0.7
>>>>>>> Uuid: 5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>> State: Peer in Cluster (Disconnected)
>>>>>>>
>>>>>>
>>>>>> Glusterd is disconnected here.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hostname: 192.168.0.6
>>>>>>> Uuid: 83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>> State: Peer in Cluster (Disconnected)
>>>>>>>
>>>>>>
>>>>>> Same as above
>>>>>>
>>>>>> Can you please check what the glusterd log has to say here about
>>>>>> these disconnects?
>>>>>>
>>>>>
>>>>> glusterd keeps logging this every 3s
>>>>>
>>>>> [2017-05-29 11:04:52.182782] W [socket.c:852:__socket_keepalive]
>>>>> 0-socket: failed to set keep idle -1 on socket 5, Invalid argument
>>>>> [2017-05-29 11:04:52.182808] E [socket.c:2966:socket_connect]
>>>>> 0-management: Failed to set keep-alive: Invalid argument
>>>>> [2017-05-29 11:04:52.183032] W [socket.c:852:__socket_keepalive]
>>>>> 0-socket: failed to set keep idle -1 on socket 20, Invalid argument
>>>>> [2017-05-29 11:04:52.183052] E [socket.c:2966:socket_connect]
>>>>> 0-management: Failed to set keep-alive: Invalid argument
>>>>> [2017-05-29 11:04:52.183622] E [rpc-clnt.c:362:saved_frames_unwind]
>>>>> (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483]
>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af]
>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce]
>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e]
>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8]
>>>>> ) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1))
>>>>> called at 2017-05-29 11:04:52.183210 (xid=0x23419)
>>>>> [2017-05-29 11:04:52.183735] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock]
>>>>> (-->/usr/lib/x86_64-linu

Re: [Gluster-users] Failure while upgrading gluster to 3.10.1

2017-05-31 Thread Pawan Alwandi
Hello Atin,

I've tried restarting glusterd on the nodes one after another, but still see the same
result.
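
For reference, restarting glusterd one node at a time on Debian typically
looks like this (a sketch; the Debian packages ship the service as
glusterfs-server, but the name may differ by packaging):

# service glusterfs-server restart
# gluster peer status

checking peer status in between before moving on to the next node.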


On Tue, May 30, 2017 at 10:40 AM, Atin Mukherjee <amukh...@redhat.com>
wrote:

> Pawan - I couldn't reach any conclusive analysis so far. But, looking
> at the client (nfs) & glusterd log files, it does look like there is
> an issue w.r.t. peer connections. Does restarting all the glusterd
> instances one by one solve this?
>
> On Mon, May 29, 2017 at 4:50 PM, Pawan Alwandi <pa...@platform.sh> wrote:
>
>> Sorry for the big attachment in the previous mail... the last 1000 lines of
>> those logs are attached now.
>>
>> On Mon, May 29, 2017 at 4:44 PM, Pawan Alwandi <pa...@platform.sh> wrote:
>>
>>>
>>>
>>> On Thu, May 25, 2017 at 9:54 PM, Atin Mukherjee <amukh...@redhat.com>
>>> wrote:
>>>
>>>>
>>>> On Thu, 25 May 2017 at 19:11, Pawan Alwandi <pa...@platform.sh> wrote:
>>>>
>>>>> Hello Atin,
>>>>>
>>>>> Yes, the glusterd instances on the other hosts are up and running.  Below is the
>>>>> requested output on all the three hosts.
>>>>>
>>>>> Host 1
>>>>>
>>>>> # gluster peer status
>>>>> Number of Peers: 2
>>>>>
>>>>> Hostname: 192.168.0.7
>>>>> Uuid: 5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>> State: Peer in Cluster (Disconnected)
>>>>>
>>>>
>>>> Glusterd is disconnected here.
>>>>
>>>>>
>>>>>
>>>>> Hostname: 192.168.0.6
>>>>> Uuid: 83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>> State: Peer in Cluster (Disconnected)
>>>>>
>>>>
>>>> Same as above
>>>>
>>>> Can you please check what the glusterd log has to say here about
>>>> these disconnects?
>>>>
>>>
>>> glusterd keeps logging this every 3s
>>>
>>> [2017-05-29 11:04:52.182782] W [socket.c:852:__socket_keepalive]
>>> 0-socket: failed to set keep idle -1 on socket 5, Invalid argument
>>> [2017-05-29 11:04:52.182808] E [socket.c:2966:socket_connect]
>>> 0-management: Failed to set keep-alive: Invalid argument
>>> [2017-05-29 11:04:52.183032] W [socket.c:852:__socket_keepalive]
>>> 0-socket: failed to set keep idle -1 on socket 20, Invalid argument
>>> [2017-05-29 11:04:52.183052] E [socket.c:2966:socket_connect]
>>> 0-management: Failed to set keep-alive: Invalid argument
>>> [2017-05-29 11:04:52.183622] E [rpc-clnt.c:362:saved_frames_unwind]
>>> (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483]
>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af]
>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce]
>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e]
>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8]
>>> ) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1))
>>> called at 2017-05-29 11:04:52.183210 (xid=0x23419)
>>> [2017-05-29 11:04:52.183735] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock]
>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7f767734dffb]
>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x14a) [0x7f7677357c6a]
>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7f76773f0ef3] )
>>> 0-management: Lock for vol shared not held
>>> [2017-05-29 11:04:52.183928] E [rpc-clnt.c:362:saved_frames_unwind]
>>> (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483]
>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af]
>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce]
>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e]
>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8]
>>> ) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1))

Re: [Gluster-users] Failure while upgrading gluster to 3.10.1

2017-05-25 Thread Pawan Alwandi
Hello Atin,

Yes, the glusterd instances on the other hosts are up and running.  Below is the
requested output on all the three hosts.

Host 1

# gluster peer status
Number of Peers: 2

Hostname: 192.168.0.7
Uuid: 5ec54b4f-f60c-48c6-9e55-95f2bb58f633
State: Peer in Cluster (Disconnected)

Hostname: 192.168.0.6
Uuid: 83e9a0b9-6bd5-483b-8516-d8928805ed95
State: Peer in Cluster (Disconnected)

# gluster volume status
Status of volume: shared
Gluster process                          TCP Port  RDMA Port  Online  Pid
--------------------------------------------------------------------------
Brick 192.168.0.5:/data/exports/shared   49152     0          Y       2105
NFS Server on localhost                  2049      0          Y       2089
Self-heal Daemon on localhost            N/A       N/A        Y       2097

Task Status of Volume shared
--------------------------------------------------------------------------
There are no active volume tasks

Host 2

# gluster peer status
Number of Peers: 2

Hostname: 192.168.0.7
Uuid: 5ec54b4f-f60c-48c6-9e55-95f2bb58f633
State: Peer in Cluster (Connected)

Hostname: 192.168.0.5
Uuid: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
State: Peer in Cluster (Connected)


# gluster volume status
Status of volume: shared
Gluster process                          Port   Online  Pid
------------------------------------------------------------
Brick 192.168.0.5:/data/exports/shared   49152  Y       2105
Brick 192.168.0.6:/data/exports/shared   49152  Y       2188
Brick 192.168.0.7:/data/exports/shared   49152  Y       2453
NFS Server on localhost                  2049   Y       2194
Self-heal Daemon on localhost            N/A    Y       2199
NFS Server on 192.168.0.5                2049   Y       2089
Self-heal Daemon on 192.168.0.5          N/A    Y       2097
NFS Server on 192.168.0.7                2049   Y       2458
Self-heal Daemon on 192.168.0.7          N/A    Y       2463

Task Status of Volume shared
------------------------------------------------------------
There are no active volume tasks

Host 3

# gluster peer status
Number of Peers: 2

Hostname: 192.168.0.5
Uuid: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
State: Peer in Cluster (Connected)

Hostname: 192.168.0.6
Uuid: 83e9a0b9-6bd5-483b-8516-d8928805ed95
State: Peer in Cluster (Connected)

# gluster volume status
Status of volume: shared
Gluster process                          Port   Online  Pid
------------------------------------------------------------
Brick 192.168.0.5:/data/exports/shared   49152  Y       2105
Brick 192.168.0.6:/data/exports/shared   49152  Y       2188
Brick 192.168.0.7:/data/exports/shared   49152  Y       2453
NFS Server on localhost                  2049   Y       2458
Self-heal Daemon on localhost            N/A    Y       2463
NFS Server on 192.168.0.6                2049   Y       2194
Self-heal Daemon on 192.168.0.6          N/A    Y       2199
NFS Server on 192.168.0.5                2049   Y       2089
Self-heal Daemon on 192.168.0.5          N/A    Y       2097

Task Status of Volume shared
------------------------------------------------------------
There are no active volume tasks






On Wed, May 24, 2017 at 8:32 PM, Atin Mukherjee <amukh...@redhat.com> wrote:

> Are the other glusterd instances up? Output of gluster peer status &
> gluster volume status, please?
>
> On Wed, May 24, 2017 at 4:20 PM, Pawan Alwandi <pa...@platform.sh> wrote:
>
>> Thanks Atin,
>>
>> So I got gluster downgraded to 3.7.9 on host 1 and now have the glusterfs
>> and glusterfsd processes coming up.  But I see the volume is mounted
>> read-only.
>>
>> I see these being logged every 3s:
>>
>> [2017-05-24 10:45:44.440435] W [socket.c:852:__socket_keepalive]
>> 0-socket: failed to set keep idle -1 on socket 17, Invalid argument
>> [2017-05-24 10:45:44.440475] E [socket.c:2966:socket_connect]
>> 0-management: Failed to set keep-alive: Invalid argument
>> [2017-05-24 10:45:44.440734] W [socket.c:852:__socket_keepalive]
>> 0-socket: failed to set keep idle -1 on socket 20, Invalid argument
>> [2017-05-24 10:45:44.440754] E [socket.c:2966:socket_connect]
>> 0-management: Failed to set keep-alive: Invalid argument
>> [2017-05-24 10:45:44.441354] E [rpc-clnt.c:362:saved_frames_unwind]
>> (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483]
>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af]
>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce]
>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e]
>> (--> 
>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.

Re: [Gluster-users] Failure while upgrading gluster to 3.10.1

2017-05-24 Thread Pawan Alwandi
Thanks Atin,

So I got gluster downgraded to 3.7.9 on host 1 and now have the glusterfs
and glusterfsd processes coming up.  But I see the volume is mounted
read-only.

I see these being logged every 3s:

[2017-05-24 10:45:44.440435] W [socket.c:852:__socket_keepalive] 0-socket:
failed to set keep idle -1 on socket 17, Invalid argument
[2017-05-24 10:45:44.440475] E [socket.c:2966:socket_connect] 0-management:
Failed to set keep-alive: Invalid argument
[2017-05-24 10:45:44.440734] W [socket.c:852:__socket_keepalive] 0-socket:
failed to set keep idle -1 on socket 20, Invalid argument
[2017-05-24 10:45:44.440754] E [socket.c:2966:socket_connect] 0-management:
Failed to set keep-alive: Invalid argument
[2017-05-24 10:45:44.441354] E [rpc-clnt.c:362:saved_frames_unwind]
(--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483]
(--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af]
(--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce]
(--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e]
(--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8]
) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-24 10:45:44.440945 (xid=0xbf)
[2017-05-24 10:45:44.441505] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock]
(-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7f767734dffb]
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x14a) [0x7f7677357c6a]
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7f76773f0ef3] )
0-management: Lock for vol shared not held
[2017-05-24 10:45:44.441660] E [rpc-clnt.c:362:saved_frames_unwind]
(--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483]
(--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af]
(--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce]
(--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e]
(--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8]
) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-24 10:45:44.441086 (xid=0xbf)
[2017-05-24 10:45:44.441790] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock]
(-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7f767734dffb]
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x14a) [0x7f7677357c6a]
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7f76773f0ef3] )
0-management: Lock for vol shared not held

The heal info says this:

# gluster volume heal shared info
Brick 192.168.0.5:/data/exports/shared
Number of entries: 0

Brick 192.168.0.6:/data/exports/shared
Status: Transport endpoint is not connected

Brick 192.168.0.7:/data/exports/shared
Status: Transport endpoint is not connected

Any idea what's up here?

Pawan
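
One basic thing to rule out with "Transport endpoint is not connected" is
plain TCP reachability of the management port between the peers (a sketch,
using the glusterd port 24007 that appears in the logs above):

# nc -zv 192.168.0.6 24007
# nc -zv 192.168.0.7 24007

run from host 1, plus the equivalent checks from the other two hosts.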

On Mon, May 22, 2017 at 9:42 PM, Atin Mukherjee <amukh...@redhat.com> wrote:

>
>
> On Mon, May 22, 2017 at 9:05 PM, Pawan Alwandi <pa...@platform.sh> wrote:
>
>>
>> On Mon, May 22, 2017 at 8:36 PM, Atin Mukherjee <amukh...@redhat.com>
>> wrote:
>>
>>>
>>>
>>> On Mon, May 22, 2017 at 7:51 PM, Atin Mukherjee <amukh...@redhat.com>
>>> wrote:
>>>
>>>> Sorry Pawan, I did miss the other part of the attachments. So looking
>>>> from the glusterd.info file from all the hosts, it looks like host2
>>>> and host3 do not have the correct op-version. Can you please set the
>>>> op-version as "operating-version=30702" in host2 and host3 and restart
>>>> glusterd instance one by one on all the nodes?
>>>>
>>>
>>> Please ensure that all the hosts are upgraded to the same bits before
>>> doing this change.
>>>
>>
>> Having to upgrade all 3 hosts to a newer version before gluster can work
>> successfully on any of them means application downtime.  The applications
>> running on these hosts are expected to be highly available.  So with the
>> way things are right now, is an online upgrade possible?  My upgrade
>> steps are: (1) stop the applications (2) umount the gluster volume, and
>> then (3) upgrade gluster one host at a time.
>>
>
> One of the ways to mitigate this is to first do an online upgrade to
> glusterfs-3.7.9 (op-version:30707) given this bug was introduced in 3.7.10
> an

Re: [Gluster-users] Failure while upgrading gluster to 3.10.1

2017-05-22 Thread Pawan Alwandi
On Mon, May 22, 2017 at 8:36 PM, Atin Mukherjee <amukh...@redhat.com> wrote:

>
>
> On Mon, May 22, 2017 at 7:51 PM, Atin Mukherjee <amukh...@redhat.com>
> wrote:
>
>> Sorry Pawan, I did miss the other part of the attachments. So looking
>> from the glusterd.info file from all the hosts, it looks like host2 and
>> host3 do not have the correct op-version. Can you please set the op-version
>> as "operating-version=30702" in host2 and host3 and restart glusterd
>> instance one by one on all the nodes?
>>
>
> Please ensure that all the hosts are upgraded to the same bits before
> doing this change.
>

Having to upgrade all 3 hosts to a newer version before gluster can work
successfully on any of them means application downtime.  The applications
running on these hosts are expected to be highly available.  So with the
way the things are right now, is an online upgrade possible?  My upgrade
steps are: (1) stop the applications (2) umount the gluster volume, and
then (3) upgrade gluster one host at a time.

Our goal is to get gluster upgraded from 3.6.9 to 3.11, and to make this an
online upgrade we are okay taking two steps: 3.6.9 -> 3.7 and then 3.7 ->
3.11.
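
Spelling out the workaround described above as a concrete sequence (a sketch,
using the glusterd.info path shown elsewhere in this thread): on host2 and
host3, edit /var/lib/glusterd/glusterd.info so that it reads

operating-version=30702

and then restart glusterd one node at a time:

# service glusterfs-server restart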


>
>
>>
>> It looks like you have uncovered a bug. During peer handshaking, if one of
>> the glusterd instances is running old bits, the uuid received while
>> validating the handshake request can be blank. That used to be ignored,
>> but patch http://review.gluster.org/13519 added changes that always
>> examine this field and do extra checks, and those cause the handshake to
>> fail. For now, the above workaround should suffice. I'll be sending a
>> patch pretty soon.
>>
>
> Posted a patch https://review.gluster.org/#/c/17358 .
>
>
>>
>>
>>
>> On Mon, May 22, 2017 at 11:35 AM, Pawan Alwandi <pa...@platform.sh>
>> wrote:
>>
>>> Hello Atin,
>>>
>>> The tars contain `/var/lib/glusterd` for all 3 nodes too;
>>> please check again.
>>>
>>> Thanks
>>>
>>> On Mon, May 22, 2017 at 11:32 AM, Atin Mukherjee <amukh...@redhat.com>
>>> wrote:
>>>
>>>> Pawan,
>>>>
>>>> I see you have provided the log files from the nodes, however it'd be
>>>> really helpful if you can provide me the content of /var/lib/glusterd from
>>>> all the nodes to get to the root cause of this issue.
>>>>
>>>> On Fri, May 19, 2017 at 12:09 PM, Pawan Alwandi <pa...@platform.sh>
>>>> wrote:
>>>>
>>>>> Hello Atin,
>>>>>
>>>>> Thanks for continued support.  I've attached requested files from all
>>>>> 3 nodes.
>>>>>
>>>>> (I think we already verified the UUIDs to be correct, anyway let us
>>>>> know if you find any more info in the logs)
>>>>>
>>>>> Pawan
>>>>>
>>>>> On Thu, May 18, 2017 at 11:45 PM, Atin Mukherjee <amukh...@redhat.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> On Thu, 18 May 2017 at 23:40, Atin Mukherjee <amukh...@redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Wed, 17 May 2017 at 12:47, Pawan Alwandi <pa...@platform.sh>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hello Atin,
>>>>>>>>
>>>>>>>> I realized that these http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>>>>>>> instructions only work
>>>>>>>> for upgrades from 3.7, while we are running 3.6.2.  Are there
>>>>>>>> instructions/suggestion you have for us to upgrade from 3.6 version?
>>>>>>>>
>>>>>>>> I believe upgrade from 3.6 to 3.7 and then to 3.10 would work, but
>>>>>>>> I see similar errors reported when I upgraded to 3.7 too.
>>>>>>>>
>>>>>>>> For what it's worth, I was able to set the op-version (gluster v set
>>>>>>>> all cluster.op-version 30702) but that doesn't seem to help.
>>>>>>>>
>>>>>>>> [2017-05-17 06:48:33.700014] I [MSGID: 100030]
>>>>>>>> [glusterfsd.c:2338:main] 0-/usr/sbin/glusterd: Started running
>>>>>>>> /usr/sbin/glusterd version 3.7.20 (arg

Re: [Gluster-users] Failure while upgrading gluster to 3.10.1

2017-05-22 Thread Pawan Alwandi
Hello Atin,

The tars contain `/var/lib/glusterd` for all 3 nodes too;
please check again.

Thanks
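
For reference, the pieces of /var/lib/glusterd that matter for this
peer-handshake debugging (the same files quoted later in this thread):

# cat /var/lib/glusterd/glusterd.info
# cat /var/lib/glusterd/peers/*

glusterd.info carries the node's own UUID and operating-version; the peers/
files carry the uuid, state, and hostname recorded for each other node.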

On Mon, May 22, 2017 at 11:32 AM, Atin Mukherjee <amukh...@redhat.com>
wrote:

> Pawan,
>
> I see you have provided the log files from the nodes, however it'd be
> really helpful if you can provide me the content of /var/lib/glusterd from
> all the nodes to get to the root cause of this issue.
>
> On Fri, May 19, 2017 at 12:09 PM, Pawan Alwandi <pa...@platform.sh> wrote:
>
>> Hello Atin,
>>
>> Thanks for continued support.  I've attached requested files from all 3
>> nodes.
>>
>> (I think we already verified the UUIDs to be correct, anyway let us know
>> if you find any more info in the logs)
>>
>> Pawan
>>
>> On Thu, May 18, 2017 at 11:45 PM, Atin Mukherjee <amukh...@redhat.com>
>> wrote:
>>
>>>
>>> On Thu, 18 May 2017 at 23:40, Atin Mukherjee <amukh...@redhat.com>
>>> wrote:
>>>
>>>> On Wed, 17 May 2017 at 12:47, Pawan Alwandi <pa...@platform.sh> wrote:
>>>>
>>>>> Hello Atin,
>>>>>
>>>>> I realized that these http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>>>> instructions only work for
>>>>> upgrades from 3.7, while we are running 3.6.2.  Are there
>>>>> instructions/suggestion you have for us to upgrade from 3.6 version?
>>>>>
>>>>> I believe upgrade from 3.6 to 3.7 and then to 3.10 would work, but I
>>>>> see similar errors reported when I upgraded to 3.7 too.
>>>>>
>>>>> For what it's worth, I was able to set the op-version (gluster v set
>>>>> all cluster.op-version 30702) but that doesn't seem to help.
>>>>>
>>>>> [2017-05-17 06:48:33.700014] I [MSGID: 100030]
>>>>> [glusterfsd.c:2338:main] 0-/usr/sbin/glusterd: Started running
>>>>> /usr/sbin/glusterd version 3.7.20 (args: /usr/sbin/glusterd -p
>>>>> /var/run/glusterd.pid)
>>>>> [2017-05-17 06:48:33.703808] I [MSGID: 106478] [glusterd.c:1383:init]
>>>>> 0-management: Maximum allowed open file descriptors set to 65536
>>>>> [2017-05-17 06:48:33.703836] I [MSGID: 106479] [glusterd.c:1432:init]
>>>>> 0-management: Using /var/lib/glusterd as working directory
>>>>> [2017-05-17 06:48:33.708866] W [MSGID: 103071]
>>>>> [rdma.c:4594:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm
>>>>> event channel creation failed [No such device]
>>>>> [2017-05-17 06:48:33.709011] W [MSGID: 103055] [rdma.c:4901:init]
>>>>> 0-rdma.management: Failed to initialize IB Device
>>>>> [2017-05-17 06:48:33.709033] W [rpc-transport.c:359:rpc_transport_load]
>>>>> 0-rpc-transport: 'rdma' initialization failed
>>>>> [2017-05-17 06:48:33.709088] W [rpcsvc.c:1642:rpcsvc_create_listener]
>>>>> 0-rpc-service: cannot create listener, initing the transport failed
>>>>> [2017-05-17 06:48:33.709105] E [MSGID: 106243] [glusterd.c:1656:init]
>>>>> 0-management: creation of 1 listeners failed, continuing with succeeded
>>>>> transport
>>>>> [2017-05-17 06:48:35.480043] I [MSGID: 106513]
>>>>> [glusterd-store.c:2068:glusterd_restore_op_version] 0-glusterd:
>>>>> retrieved op-version: 30600
>>>>> [2017-05-17 06:48:35.605779] I [MSGID: 106498]
>>>>> [glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo]
>>>>> 0-management: connect returned 0
>>>>> [2017-05-17 06:48:35.607059] I [rpc-clnt.c:1046:rpc_clnt_connection_init]
>>>>> 0-management: setting frame-timeout to 600
>>>>> [2017-05-17 06:48:35.607670] I [rpc-clnt.c:1046:rpc_clnt_connection_init]
>>>>> 0-management: setting frame-timeout to 600
>>>>> [2017-05-17 06:48:35.607025] I [MSGID: 106498]
>>>>> [glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo]
>>>>> 0-management: connect returned 0
>>>>> [2017-05-17 06:48:35.608125] I [MSGID: 106544]
>>>>> [glusterd.c:159:glusterd_uuid_init] 0-management: retrieved UUID:
>>>>> 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>
>>>>
>>>>> Final graph:
>>>>> +------------------------------------------------------------------------------+
>>>>>   1: volume management
>>>>>   2: type mgmt/glusterd
>>>>>   3: op

Re: [Gluster-users] Failure while upgrading gluster to 3.10.1

2017-05-19 Thread Pawan Alwandi
Hello Atin,

Thanks for continued support.  I've attached requested files from all 3
nodes.

(I think we already verified that the UUIDs are correct; anyway, let us know
if you find any more info in the logs.)

Pawan
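
The UUID cross-check amounts to comparing each host's own UUID against the
entries its peers hold (a sketch using the files shown later in this thread):

# grep UUID= /var/lib/glusterd/glusterd.info
# grep uuid= /var/lib/glusterd/peers/*

Each host's UUID= value should show up as a uuid= entry on the other two
hosts.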

On Thu, May 18, 2017 at 11:45 PM, Atin Mukherjee <amukh...@redhat.com>
wrote:

>
> On Thu, 18 May 2017 at 23:40, Atin Mukherjee <amukh...@redhat.com> wrote:
>
>> On Wed, 17 May 2017 at 12:47, Pawan Alwandi <pa...@platform.sh> wrote:
>>
>>> Hello Atin,
>>>
>>> I realized that these http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>> instructions only work for
>>> upgrades from 3.7, while we are running 3.6.2.  Are there
>>> instructions/suggestion you have for us to upgrade from 3.6 version?
>>>
>>> I believe upgrade from 3.6 to 3.7 and then to 3.10 would work, but I see
>>> similar errors reported when I upgraded to 3.7 too.
>>>
>>> For what it's worth, I was able to set the op-version (gluster v set all
>>> cluster.op-version 30702) but that doesn't seem to help.
>>>
>>> [2017-05-17 06:48:33.700014] I [MSGID: 100030] [glusterfsd.c:2338:main]
>>> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.20
>>> (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
>>> [2017-05-17 06:48:33.703808] I [MSGID: 106478] [glusterd.c:1383:init]
>>> 0-management: Maximum allowed open file descriptors set to 65536
>>> [2017-05-17 06:48:33.703836] I [MSGID: 106479] [glusterd.c:1432:init]
>>> 0-management: Using /var/lib/glusterd as working directory
>>> [2017-05-17 06:48:33.708866] W [MSGID: 103071]
>>> [rdma.c:4594:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
>>> channel creation failed [No such device]
>>> [2017-05-17 06:48:33.709011] W [MSGID: 103055] [rdma.c:4901:init]
>>> 0-rdma.management: Failed to initialize IB Device
>>> [2017-05-17 06:48:33.709033] W [rpc-transport.c:359:rpc_transport_load]
>>> 0-rpc-transport: 'rdma' initialization failed
>>> [2017-05-17 06:48:33.709088] W [rpcsvc.c:1642:rpcsvc_create_listener]
>>> 0-rpc-service: cannot create listener, initing the transport failed
>>> [2017-05-17 06:48:33.709105] E [MSGID: 106243] [glusterd.c:1656:init]
>>> 0-management: creation of 1 listeners failed, continuing with succeeded
>>> transport
>>> [2017-05-17 06:48:35.480043] I [MSGID: 106513] 
>>> [glusterd-store.c:2068:glusterd_restore_op_version]
>>> 0-glusterd: retrieved op-version: 30600
>>> [2017-05-17 06:48:35.605779] I [MSGID: 106498] [glusterd-handler.c:3640:
>>> glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>> [2017-05-17 06:48:35.607059] I [rpc-clnt.c:1046:rpc_clnt_connection_init]
>>> 0-management: setting frame-timeout to 600
>>> [2017-05-17 06:48:35.607670] I [rpc-clnt.c:1046:rpc_clnt_connection_init]
>>> 0-management: setting frame-timeout to 600
>>> [2017-05-17 06:48:35.607025] I [MSGID: 106498] [glusterd-handler.c:3640:
>>> glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>> [2017-05-17 06:48:35.608125] I [MSGID: 106544]
>>> [glusterd.c:159:glusterd_uuid_init] 0-management: retrieved UUID:
>>> 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>
>>
>>> Final graph:
>>> +------------------------------------------------------------------------------+
>>>   1: volume management
>>>   2: type mgmt/glusterd
>>>   3: option rpc-auth.auth-glusterfs on
>>>   4: option rpc-auth.auth-unix on
>>>   5: option rpc-auth.auth-null on
>>>   6: option rpc-auth-allow-insecure on
>>>   7: option transport.socket.listen-backlog 128
>>>   8: option event-threads 1
>>>   9: option ping-timeout 0
>>>  10: option transport.socket.read-fail-log off
>>>  11: option transport.socket.keepalive-interval 2
>>>  12: option transport.socket.keepalive-time 10
>>>  13: option transport-type rdma
>>>  14: option working-directory /var/lib/glusterd
>>>  15: end-volume
>>>  16:
>>> +------------------------------------------------------------------------------+
>>> [2017-05-17 06:48:35.609868] I [MSGID: 101190] 
>>> [event-epoll.c:632:event_dispatch_epoll_worker]
>>> 0-epoll: Started thread with index 1
>>> [2017-05-17 06:48:35.610839] W [socket.c:596:__socket_rwv] 0-management:
>>> readv on 192.168.0.7:24007 failed (No data available)
>>> [2017-05-17 06:48:35.611907] E [rpc-clnt.c:3

Re: [Gluster-users] Failure while upgrading gluster to 3.10.1

2017-05-17 Thread Pawan Alwandi
Lock for vol shared not held
[2017-05-17 06:48:35.612039] W [MSGID: 106118]
[glusterd-handler.c:5223:__glusterd_peer_rpc_notify] 0-management: Lock not
released for shared
[2017-05-17 06:48:35.612079] W [socket.c:596:__socket_rwv] 0-management:
readv on 192.168.0.6:24007 failed (No data available)
[2017-05-17 06:48:35.612179] E [rpc-clnt.c:370:saved_frames_unwind]
(--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fd6c2d70bb3]
(--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7fd6c2b3a2df]
(--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd6c2b3a3fe]
(--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7fd6c2b3ba39]
(--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x160)[0x7fd6c2b3c380]
) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1))
called at 2017-05-17 06:48:35.610007 (xid=0x1)
[2017-05-17 06:48:35.612197] E [MSGID: 106167]
[glusterd-handshake.c:2091:__glusterd_peer_dump_version_cbk] 0-management:
Error through RPC layer, retry again later
[2017-05-17 06:48:35.612211] I [MSGID: 106004]
[glusterd-handler.c:5201:__glusterd_peer_rpc_notify] 0-management: Peer
<192.168.0.6> (<83e9a0b9-6bd5-483b-8516-d8928805ed95>), in state , has disconnected from glusterd.
[2017-05-17 06:48:35.612292] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock]
(-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7fd6bdc4912b]
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x160) [0x7fd6bdc52dd0]
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7fd6bdcef1b3] )
0-management: Lock for vol shared not held
[2017-05-17 06:48:35.613432] W [MSGID: 106118]
[glusterd-handler.c:5223:__glusterd_peer_rpc_notify] 0-management: Lock not
released for shared
[2017-05-17 06:48:35.614317] E [MSGID: 106170]
[glusterd-handshake.c:1051:gd_validate_mgmt_hndsk_req] 0-management:
Request from peer 192.168.0.6:991 has an entry in peerinfo, but uuid does
not match



On Mon, May 15, 2017 at 10:31 PM, Atin Mukherjee <amukh...@redhat.com>
wrote:

>
> On Mon, 15 May 2017 at 11:58, Pawan Alwandi <pa...@platform.sh> wrote:
>
>> Hi Atin,
>>
>> I see the below error.  Do I need gluster to be upgraded on all 3 hosts
>> for this to work?  Right now I have host 1 running 3.10.1 and host 2 & 3
>> running 3.6.2
>>
>> # gluster v set all cluster.op-version 31001
>> volume set: failed: Required op_version (31001) is not supported
>>
>
> Yes, you should, given that the 3.6 series is EOLed.
>
>>
>>
>>
>> On Mon, May 15, 2017 at 3:32 AM, Atin Mukherjee <amukh...@redhat.com>
>> wrote:
>>
>>> On Sun, 14 May 2017 at 21:43, Atin Mukherjee <amukh...@redhat.com>
>>> wrote:
>>>
>>>> Allright, I see that you haven't bumped up the op-version. Can you
>>>> please execute:
>>>>
>>>> gluster v set all cluster.op-version 30101  and then restart glusterd
>>>> on all the nodes and check the brick status?
>>>>
>>>
>>> s/30101/31001
>>>
>>>
>>>>
>>>> On Sun, May 14, 2017 at 8:55 PM, Pawan Alwandi <pa...@platform.sh>
>>>> wrote:
>>>>
>>>>> Hello Atin,
>>>>>
>>>>> Thanks for looking at this.  Below is the output you requested for.
>>>>>
>>>>> Again, I'm seeing those errors after upgrading gluster on host 1.
>>>>>
>>>>> Host 1
>>>>>
>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>> UUID=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>> operating-version=30600
>>>>>
>>>>> # cat /var/lib/glusterd/peers/*
>>>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>> state=3
>>>>> hostname1=192.168.0.7
>>>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>> state=3
>>>>> hostname1=192.168.0.6
>>>>>
>>>>> # gluster --version
>>>>> glusterfs 3.10.1
>>>>>
>>>>> Host 2
>>>>>
>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>> UUID=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>> operating-version=30600
>>>>>
>>>>> # cat /var/lib/glusterd/peers/*
>>>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>> state=3
>>>>> hostname1=192.168.0.7
>>>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>

Re: [Gluster-users] Failure while upgrading gluster to 3.10.1

2017-05-15 Thread Pawan Alwandi
Hi Atin,

I see the below error.  Do I need gluster to be upgraded on all 3 hosts for
this to work?  Right now I have host 1 running 3.10.1 and host 2 & 3
running 3.6.2

# gluster v set all cluster.op-version 31001
volume set: failed: Required op_version (31001) is not supported
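
Each node's installed version and current op-version can be confirmed with
commands already used in this thread:

# gluster --version
# cat /var/lib/glusterd/glusterd.info

The cluster op-version can only be raised to a value that every node's
installed version supports, which is why the set fails while hosts 2 and 3
are still on 3.6.2.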


On Mon, May 15, 2017 at 3:32 AM, Atin Mukherjee <amukh...@redhat.com> wrote:

> On Sun, 14 May 2017 at 21:43, Atin Mukherjee <amukh...@redhat.com> wrote:
>
>> Allright, I see that you haven't bumped up the op-version. Can you please
>> execute:
>>
>> gluster v set all cluster.op-version 30101  and then restart glusterd on
>> all the nodes and check the brick status?
>>
>
> s/30101/31001
>
>
>>
>> On Sun, May 14, 2017 at 8:55 PM, Pawan Alwandi <pa...@platform.sh> wrote:
>>
>>> Hello Atin,
>>>
>>> Thanks for looking at this.  Below is the output you requested for.
>>>
>>> Again, I'm seeing those errors after upgrading gluster on host 1.
>>>
>>> Host 1
>>>
>>> # cat /var/lib/glusterd/glusterd.info
>>> UUID=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>> operating-version=30600
>>>
>>> # cat /var/lib/glusterd/peers/*
>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>> state=3
>>> hostname1=192.168.0.7
>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>> state=3
>>> hostname1=192.168.0.6
>>>
>>> # gluster --version
>>> glusterfs 3.10.1
>>>
>>> Host 2
>>>
>>> # cat /var/lib/glusterd/glusterd.info
>>> UUID=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>> operating-version=30600
>>>
>>> # cat /var/lib/glusterd/peers/*
>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>> state=3
>>> hostname1=192.168.0.7
>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>> state=3
>>> hostname1=192.168.0.5
>>>
>>> # gluster --version
>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44
>>>
>>> Host 3
>>>
>>> # cat /var/lib/glusterd/glusterd.info
>>> UUID=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>> operating-version=30600
>>>
>>> # cat /var/lib/glusterd/peers/*
>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>> state=3
>>> hostname1=192.168.0.5
>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>> state=3
>>> hostname1=192.168.0.6
>>>
>>> # gluster --version
>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44
>>>
>>>
>>>
>>> On Sat, May 13, 2017 at 6:28 PM, Atin Mukherjee <amukh...@redhat.com>
>>> wrote:
>>>
>>>> I have already asked for the following earlier:
>>>>
>>>> Can you please provide output of following from all the nodes:
>>>>
>>>> cat /var/lib/glusterd/glusterd.info
>>>> cat /var/lib/glusterd/peers/*
>>>>
>>>> On Sat, 13 May 2017 at 12:22, Pawan Alwandi <pa...@platform.sh> wrote:
>>>>
>>>>> Hello folks,
>>>>>
>>>>> Does anyone have any idea what's going on here?
>>>>>
>>>>> Thanks,
>>>>> Pawan
>>>>>
>>>>> On Wed, May 10, 2017 at 5:02 PM, Pawan Alwandi <pa...@platform.sh>
>>>>> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I'm trying to upgrade gluster from 3.6.2 to 3.10.1 but don't see the
>>>>>> glusterfsd and glusterfs processes coming up.
>>>>>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>>>>> is the process that I'm trying to follow.
>>>>>>
>>>>>> This is a 3 node server setup with a replicated volume having replica
>>>>>> count of 3.
>>>>>>
>>>>>> Logs below:
>>>>>>
>>>>>> [2017-05-10 09:07:03.507959] I [MSGID: 100030]
>>>>>> [glusterfsd.c:2460:main] 0-/usr/sbin/glusterd: Started running
>>>>>> /usr/sbin/glusterd version 3.10.1 (args: /usr/sbin/glusterd -p
>>>>>> /var/run/glusterd.pid)
>>>>>> [2017-05-10 09:07:03.512827] I [MSGID: 106478] [glusterd.c:1449:init]
>>>>>> 0-management: Maximum allowed open file descriptors set to 65536
>>>>>> [2017-05-10 09:07:03.512855] I [MSGID: 106479] [glusterd.c:1496:init]
>>>>>> 0-management: Using /var/lib/glusterd as working directory
>>>>

Re: [Gluster-users] Failure while upgrading gluster to 3.10.1

2017-05-13 Thread Pawan Alwandi
Hello folks,

Does anyone have any idea what's going on here?

Thanks,
Pawan

On Wed, May 10, 2017 at 5:02 PM, Pawan Alwandi <pa...@platform.sh> wrote:

> Hello,
>
> I'm trying to upgrade gluster from 3.6.2 to 3.10.1 but don't see the
> glusterfsd and glusterfs processes coming up.
> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/ is
> the process that I'm trying to follow.
>
> This is a 3 node server setup with a replicated volume having replica
> count of 3.
>
> Logs below:
>
> [2017-05-10 09:07:03.507959] I [MSGID: 100030] [glusterfsd.c:2460:main]
> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.10.1
> (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
> [2017-05-10 09:07:03.512827] I [MSGID: 106478] [glusterd.c:1449:init]
> 0-management: Maximum allowed open file descriptors set to 65536
> [2017-05-10 09:07:03.512855] I [MSGID: 106479] [glusterd.c:1496:init]
> 0-management: Using /var/lib/glusterd as working directory
> [2017-05-10 09:07:03.520426] W [MSGID: 103071] 
> [rdma.c:4590:__gf_rdma_ctx_create]
> 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
> [2017-05-10 09:07:03.520452] W [MSGID: 103055] [rdma.c:4897:init]
> 0-rdma.management: Failed to initialize IB Device
> [2017-05-10 09:07:03.520465] W [rpc-transport.c:350:rpc_transport_load]
> 0-rpc-transport: 'rdma' initialization failed
> [2017-05-10 09:07:03.520518] W [rpcsvc.c:1661:rpcsvc_create_listener]
> 0-rpc-service: cannot create listener, initing the transport failed
> [2017-05-10 09:07:03.520534] E [MSGID: 106243] [glusterd.c:1720:init]
> 0-management: creation of 1 listeners failed, continuing with succeeded
> transport
> [2017-05-10 09:07:04.931764] I [MSGID: 106513] 
> [glusterd-store.c:2197:glusterd_restore_op_version]
> 0-glusterd: retrieved op-version: 30600
> [2017-05-10 09:07:04.964354] I [MSGID: 106544]
> [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID:
> 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
> [2017-05-10 09:07:04.993944] I [MSGID: 106498] [glusterd-handler.c:3669:
> glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
> [2017-05-10 09:07:04.995864] I [MSGID: 106498] [glusterd-handler.c:3669:
> glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
> [2017-05-10 09:07:04.995879] W [MSGID: 106062] [glusterd-handler.c:3466:
> glusterd_transport_inet_options_build] 0-glusterd: Failed to get
> tcp-user-timeout
> [2017-05-10 09:07:04.995903] I [rpc-clnt.c:1059:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2017-05-10 09:07:04.996325] I [rpc-clnt.c:1059:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> Final graph:
> +------------------------------------------------------------------------------+
>   1: volume management
>   2: type mgmt/glusterd
>   3: option rpc-auth.auth-glusterfs on
>   4: option rpc-auth.auth-unix on
>   5: option rpc-auth.auth-null on
>   6: option rpc-auth-allow-insecure on
>   7: option transport.socket.listen-backlog 128
>   8: option event-threads 1
>   9: option ping-timeout 0
>  10: option transport.socket.read-fail-log off
>  11: option transport.socket.keepalive-interval 2
>  12: option transport.socket.keepalive-time 10
>  13: option transport-type rdma
>  14: option working-directory /var/lib/glusterd
>  15: end-volume
>  16:
> +------------------------------------------------------------------------------+
> [2017-05-10 09:07:04.996310] W [MSGID: 106062] [glusterd-handler.c:3466:
> glusterd_transport_inet_options_build] 0-glusterd: Failed to get
> tcp-user-timeout
> [2017-05-10 09:07:05.000461] I [MSGID: 101190] 
> [event-epoll.c:629:event_dispatch_epoll_worker]
> 0-epoll: Started thread with index 1
> [2017-05-10 09:07:05.001493] W [socket.c:593:__socket_rwv] 0-management:
> readv on 192.168.0.7:24007 failed (No data available)
> [2017-05-10 09:07:05.001513] I [MSGID: 106004] 
> [glusterd-handler.c:5882:__glusterd_peer_rpc_notify]
> 0-management: Peer <192.168.0.7> (<5ec54b4f-f60c-48c6-9e55-95f2bb58f633>),
> in state , has disconnected from glusterd.
> [2017-05-10 09:07:05.001677] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x20559) [0x7f0bf9d74559]
> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x29cf0) [0x7f0bf9d7dcf0]
> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0xd5ba3) [0x7f0bf9e29ba3] )
> 0-management: Lock for vol shared not held
> [2017-05-10 09:07:05.001696] W [MSGID: 106118] 
> [glusterd-handler.c:5907:__glusterd_peer_rpc_notify