Re: [ovirt-users] [Gluster-users] op-version for reset-brick (Was: Re: Upgrading HC from 4.0 to 4.1)

2017-07-10 Thread Atin Mukherjee
On Fri, Jul 7, 2017 at 2:23 PM, Gianluca Cecchi 
wrote:

> On Thu, Jul 6, 2017 at 3:22 PM, Gianluca Cecchi  > wrote:
>
>> On Thu, Jul 6, 2017 at 2:16 PM, Atin Mukherjee 
>> wrote:
>>
>>>
>>>
>>> On Thu, Jul 6, 2017 at 5:26 PM, Gianluca Cecchi <
>>> gianluca.cec...@gmail.com> wrote:
>>>
 On Thu, Jul 6, 2017 at 8:38 AM, Gianluca Cecchi <
 gianluca.cec...@gmail.com> wrote:

>
> Eventually I can destroy and recreate this "export" volume again with
> the old names (ovirt0N.localdomain.local) if you give me the sequence of
> commands, then enable debug and retry the reset-brick command
>
> Gianluca
>


 So it seems I was able to destroy and re-create.
 Now I see that the volume creation uses the new IP by default, so I
 reversed the hostname roles in the commands after putting glusterd in
 debug mode on the host where I run the reset-brick command (do I have
 to set debug on the other nodes too?)

>>>
>>> You have to set the log level to debug for the glusterd instance where the
>>> commit fails and share the glusterd log of that particular node.
>>>
>>>
>>
>> Ok, done.
>>
>> Command executed on ovirt01 with timestamp "2017-07-06 13:04:12" in
>> glusterd log files
>>
>> [root@ovirt01 export]# gluster volume reset-brick export
>> gl01.localdomain.local:/gluster/brick3/export start
>> volume reset-brick: success: reset-brick start operation successful
>>
>> [root@ovirt01 export]# gluster volume reset-brick export
>> gl01.localdomain.local:/gluster/brick3/export
>> ovirt01.localdomain.local:/gluster/brick3/export commit force
>> volume reset-brick: failed: Commit failed on ovirt02.localdomain.local.
>> Please check log file for details.
>> Commit failed on ovirt03.localdomain.local. Please check log file for
>> details.
>> [root@ovirt01 export]#
>>
>> See glusterd log files for the 3 nodes in debug mode here:
>> ovirt01: https://drive.google.com/file/d/0BwoPbcrMv8mvY1RTTGp3RUhScm8/view?usp=sharing
>> ovirt02: https://drive.google.com/file/d/0BwoPbcrMv8mvSVpJUHNhMzhMSU0/view?usp=sharing
>> ovirt03: https://drive.google.com/file/d/0BwoPbcrMv8mvT2xiWEdQVmJNb0U/view?usp=sharing
>>
>> HIH debugging
>> Gianluca
>>
>>
> Hi Atin,
> did you have time to see the logs?
> Comparing debug enabled messages with previous ones, I see these added
> lines on nodes where commit failed after running the commands
>
> gluster volume reset-brick export 
> gl01.localdomain.local:/gluster/brick3/export
> start
> gluster volume reset-brick export 
> gl01.localdomain.local:/gluster/brick3/export
> ovirt01.localdomain.local:/gluster/brick3/export commit force
>
>
> [2017-07-06 13:04:30.221872] D [MSGID: 0] 
> [glusterd-peer-utils.c:674:gd_peerinfo_find_from_hostname]
> 0-management: Friend ovirt01.localdomain.local found.. state: 3
> [2017-07-06 13:04:30.221882] D [MSGID: 0] 
> [glusterd-peer-utils.c:167:glusterd_hostname_to_uuid]
> 0-management: returning 0
> [2017-07-06 13:04:30.221888] D [MSGID: 0] 
> [glusterd-utils.c:1039:glusterd_resolve_brick]
> 0-management: Returning 0
> [2017-07-06 13:04:30.221908] D [MSGID: 0] 
> [glusterd-utils.c:998:glusterd_brickinfo_new]
> 0-management: Returning 0
> [2017-07-06 13:04:30.221915] D [MSGID: 0] 
> [glusterd-utils.c:1195:glusterd_brickinfo_new_from_brick]
> 0-management: Returning 0
> [2017-07-06 13:04:30.222187] D [MSGID: 0] 
> [glusterd-peer-utils.c:167:glusterd_hostname_to_uuid]
> 0-management: returning 0
> [2017-07-06 13:04:30.01] D [MSGID: 0] 
> [glusterd-utils.c:1486:glusterd_volume_brickinfo_get]
> 0-management: Returning -1
>

The above log entry is the reason for the failure. GlusterD is unable to
find the old brick (src_brick) in its volinfo structure. FWIW, would you
be able to share the 'gluster get-state' output and the 'gluster volume
info' output after running reset-brick start? I'd need to check why glusterd
is unable to find the old brick's details in its volinfo after
reset-brick start.
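
(For reference, a minimal sketch of how that state could be collected right
after the failing start phase; the output directory and file name are just
examples, and this assumes the get-state syntax of the glusterfs 3.10 CLI:

# run on the node where glusterd returned -1, right after "reset-brick ... start"
gluster get-state glusterd odir /var/tmp file glusterd-state-after-reset-brick.txt
gluster volume info export
)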



> [2017-07-06 13:04:30.07] D [MSGID: 0] 
> [store.c:459:gf_store_handle_destroy]
> 0-: Returning 0
> [2017-07-06 13:04:30.42] D [MSGID: 0] [glusterd-utils.c:1512:gluster
> d_volume_brickinfo_get_by_brick] 0-glusterd: Returning -1
> [2017-07-06 13:04:30.50] D [MSGID: 0] [glusterd-replace-brick.c:416:
> glusterd_op_perform_replace_brick] 0-glusterd: Returning -1
> [2017-07-06 13:04:30.57] C [MSGID: 106074]
> [glusterd-reset-brick.c:372:glusterd_op_reset_brick] 0-management: Unable
> to add dst-brick: ovirt01.localdomain.local:/gluster/brick3/export to
> volume: export
>
>
> Does that shed any more light?
>
> Thanks,
> Gianluca
>


Re: [ovirt-users] [Gluster-users] op-version for reset-brick (Was: Re: Upgrading HC from 4.0 to 4.1)

2017-07-07 Thread Atin Mukherjee
You'd need to allow me some more time to dig into the logs. I'll try to get
back on this by Monday.

On Fri, Jul 7, 2017 at 2:23 PM, Gianluca Cecchi 
wrote:

> On Thu, Jul 6, 2017 at 3:22 PM, Gianluca Cecchi  > wrote:
>
>> On Thu, Jul 6, 2017 at 2:16 PM, Atin Mukherjee 
>> wrote:
>>
>>>
>>>
>>> On Thu, Jul 6, 2017 at 5:26 PM, Gianluca Cecchi <
>>> gianluca.cec...@gmail.com> wrote:
>>>
 On Thu, Jul 6, 2017 at 8:38 AM, Gianluca Cecchi <
 gianluca.cec...@gmail.com> wrote:

>
> Eventually I can destroy and recreate this "export" volume again with
> the old names (ovirt0N.localdomain.local) if you give me the sequence of
> commands, then enable debug and retry the reset-brick command
>
> Gianluca
>


 So it seems I was able to destroy and re-create.
 Now I see that the volume creation uses the new IP by default, so I
 reversed the hostname roles in the commands after putting glusterd in
 debug mode on the host where I run the reset-brick command (do I have
 to set debug on the other nodes too?)

>>>
>>> You have to set the log level to debug for the glusterd instance where the
>>> commit fails and share the glusterd log of that particular node.
>>>
>>>
>>
>> Ok, done.
>>
>> Command executed on ovirt01 with timestamp "2017-07-06 13:04:12" in
>> glusterd log files
>>
>> [root@ovirt01 export]# gluster volume reset-brick export
>> gl01.localdomain.local:/gluster/brick3/export start
>> volume reset-brick: success: reset-brick start operation successful
>>
>> [root@ovirt01 export]# gluster volume reset-brick export
>> gl01.localdomain.local:/gluster/brick3/export
>> ovirt01.localdomain.local:/gluster/brick3/export commit force
>> volume reset-brick: failed: Commit failed on ovirt02.localdomain.local.
>> Please check log file for details.
>> Commit failed on ovirt03.localdomain.local. Please check log file for
>> details.
>> [root@ovirt01 export]#
>>
>> See glusterd log files for the 3 nodes in debug mode here:
>> ovirt01: https://drive.google.com/file/d/0BwoPbcrMv8mvY1RTTGp3RUhScm8/view?usp=sharing
>> ovirt02: https://drive.google.com/file/d/0BwoPbcrMv8mvSVpJUHNhMzhMSU0/view?usp=sharing
>> ovirt03: https://drive.google.com/file/d/0BwoPbcrMv8mvT2xiWEdQVmJNb0U/view?usp=sharing
>>
>> HIH debugging
>> Gianluca
>>
>>
> Hi Atin,
> did you have time to see the logs?
> Comparing debug enabled messages with previous ones, I see these added
> lines on nodes where commit failed after running the commands
>
> gluster volume reset-brick export 
> gl01.localdomain.local:/gluster/brick3/export
> start
> gluster volume reset-brick export 
> gl01.localdomain.local:/gluster/brick3/export
> ovirt01.localdomain.local:/gluster/brick3/export commit force
>
>
> [2017-07-06 13:04:30.221872] D [MSGID: 0] 
> [glusterd-peer-utils.c:674:gd_peerinfo_find_from_hostname]
> 0-management: Friend ovirt01.localdomain.local found.. state: 3
> [2017-07-06 13:04:30.221882] D [MSGID: 0] 
> [glusterd-peer-utils.c:167:glusterd_hostname_to_uuid]
> 0-management: returning 0
> [2017-07-06 13:04:30.221888] D [MSGID: 0] 
> [glusterd-utils.c:1039:glusterd_resolve_brick]
> 0-management: Returning 0
> [2017-07-06 13:04:30.221908] D [MSGID: 0] 
> [glusterd-utils.c:998:glusterd_brickinfo_new]
> 0-management: Returning 0
> [2017-07-06 13:04:30.221915] D [MSGID: 0] [glusterd-utils.c:1195:
> glusterd_brickinfo_new_from_brick] 0-management: Returning 0
> [2017-07-06 13:04:30.222187] D [MSGID: 0] 
> [glusterd-peer-utils.c:167:glusterd_hostname_to_uuid]
> 0-management: returning 0
> [2017-07-06 13:04:30.01] D [MSGID: 0] 
> [glusterd-utils.c:1486:glusterd_volume_brickinfo_get]
> 0-management: Returning -1
> [2017-07-06 13:04:30.07] D [MSGID: 0] 
> [store.c:459:gf_store_handle_destroy]
> 0-: Returning 0
> [2017-07-06 13:04:30.42] D [MSGID: 0] [glusterd-utils.c:1512:
> glusterd_volume_brickinfo_get_by_brick] 0-glusterd: Returning -1
> [2017-07-06 13:04:30.50] D [MSGID: 0] [glusterd-replace-brick.c:416:
> glusterd_op_perform_replace_brick] 0-glusterd: Returning -1
> [2017-07-06 13:04:30.57] C [MSGID: 106074] 
> [glusterd-reset-brick.c:372:glusterd_op_reset_brick]
> 0-management: Unable to add dst-brick: 
> ovirt01.localdomain.local:/gluster/brick3/export
> to volume: export
>
>
> Does that shed any more light?
>
> Thanks,
> Gianluca
>


Re: [ovirt-users] [Gluster-users] op-version for reset-brick (Was: Re: Upgrading HC from 4.0 to 4.1)

2017-07-07 Thread Gianluca Cecchi
On Thu, Jul 6, 2017 at 3:22 PM, Gianluca Cecchi 
wrote:

> On Thu, Jul 6, 2017 at 2:16 PM, Atin Mukherjee 
> wrote:
>
>>
>>
>> On Thu, Jul 6, 2017 at 5:26 PM, Gianluca Cecchi <
>> gianluca.cec...@gmail.com> wrote:
>>
>>> On Thu, Jul 6, 2017 at 8:38 AM, Gianluca Cecchi <
>>> gianluca.cec...@gmail.com> wrote:
>>>

 Eventually I can destroy and recreate this "export" volume again with
 the old names (ovirt0N.localdomain.local) if you give me the sequence of
 commands, then enable debug and retry the reset-brick command

 Gianluca

>>>
>>>
>>> So it seems I was able to destroy and re-create.
>>> Now I see that the volume creation uses the new IP by default, so I
>>> reversed the hostname roles in the commands after putting glusterd in
>>> debug mode on the host where I run the reset-brick command (do I have
>>> to set debug on the other nodes too?)
>>>
>>
>> You have to set the log level to debug for the glusterd instance where the
>> commit fails and share the glusterd log of that particular node.
>>
>>
>
> Ok, done.
>
> Command executed on ovirt01 with timestamp "2017-07-06 13:04:12" in
> glusterd log files
>
> [root@ovirt01 export]# gluster volume reset-brick export
> gl01.localdomain.local:/gluster/brick3/export start
> volume reset-brick: success: reset-brick start operation successful
>
> [root@ovirt01 export]# gluster volume reset-brick export
> gl01.localdomain.local:/gluster/brick3/export 
> ovirt01.localdomain.local:/gluster/brick3/export
> commit force
> volume reset-brick: failed: Commit failed on ovirt02.localdomain.local.
> Please check log file for details.
> Commit failed on ovirt03.localdomain.local. Please check log file for
> details.
> [root@ovirt01 export]#
>
> See glusterd log files for the 3 nodes in debug mode here:
> ovirt01: https://drive.google.com/file/d/0BwoPbcrMv8mvY1RTTGp3RUhScm8/view?usp=sharing
> ovirt02: https://drive.google.com/file/d/0BwoPbcrMv8mvSVpJUHNhMzhMSU0/view?usp=sharing
> ovirt03: https://drive.google.com/file/d/0BwoPbcrMv8mvT2xiWEdQVmJNb0U/view?usp=sharing
>
> HIH debugging
> Gianluca
>
>
Hi Atin,
did you have time to see the logs?
Comparing debug enabled messages with previous ones, I see these added
lines on nodes where commit failed after running the commands

gluster volume reset-brick export
gl01.localdomain.local:/gluster/brick3/export start
gluster volume reset-brick export
gl01.localdomain.local:/gluster/brick3/export
ovirt01.localdomain.local:/gluster/brick3/export commit force


[2017-07-06 13:04:30.221872] D [MSGID: 0]
[glusterd-peer-utils.c:674:gd_peerinfo_find_from_hostname] 0-management:
Friend ovirt01.localdomain.local found.. state: 3
[2017-07-06 13:04:30.221882] D [MSGID: 0]
[glusterd-peer-utils.c:167:glusterd_hostname_to_uuid] 0-management:
returning 0
[2017-07-06 13:04:30.221888] D [MSGID: 0]
[glusterd-utils.c:1039:glusterd_resolve_brick] 0-management: Returning 0
[2017-07-06 13:04:30.221908] D [MSGID: 0]
[glusterd-utils.c:998:glusterd_brickinfo_new] 0-management: Returning 0
[2017-07-06 13:04:30.221915] D [MSGID: 0]
[glusterd-utils.c:1195:glusterd_brickinfo_new_from_brick] 0-management:
Returning 0
[2017-07-06 13:04:30.222187] D [MSGID: 0]
[glusterd-peer-utils.c:167:glusterd_hostname_to_uuid] 0-management:
returning 0
[2017-07-06 13:04:30.01] D [MSGID: 0]
[glusterd-utils.c:1486:glusterd_volume_brickinfo_get] 0-management:
Returning -1
[2017-07-06 13:04:30.07] D [MSGID: 0]
[store.c:459:gf_store_handle_destroy] 0-: Returning 0
[2017-07-06 13:04:30.42] D [MSGID: 0]
[glusterd-utils.c:1512:glusterd_volume_brickinfo_get_by_brick] 0-glusterd:
Returning -1
[2017-07-06 13:04:30.50] D [MSGID: 0]
[glusterd-replace-brick.c:416:glusterd_op_perform_replace_brick]
0-glusterd: Returning -1
[2017-07-06 13:04:30.57] C [MSGID: 106074]
[glusterd-reset-brick.c:372:glusterd_op_reset_brick] 0-management: Unable
to add dst-brick: ovirt01.localdomain.local:/gluster/brick3/export to
volume: export


Does that shed any more light?

Thanks,
Gianluca


Re: [ovirt-users] [Gluster-users] op-version for reset-brick (Was: Re: Upgrading HC from 4.0 to 4.1)

2017-07-06 Thread Gianluca Cecchi
On Thu, Jul 6, 2017 at 2:16 PM, Atin Mukherjee  wrote:

>
>
> On Thu, Jul 6, 2017 at 5:26 PM, Gianluca Cecchi  > wrote:
>
>> On Thu, Jul 6, 2017 at 8:38 AM, Gianluca Cecchi <
>> gianluca.cec...@gmail.com> wrote:
>>
>>>
>>> Eventually I can destroy and recreate this "export" volume again with
>>> the old names (ovirt0N.localdomain.local) if you give me the sequence of
>>> commands, then enable debug and retry the reset-brick command
>>>
>>> Gianluca
>>>
>>
>>
>> So it seems I was able to destroy and re-create.
>> Now I see that the volume creation uses the new IP by default, so I
>> reversed the hostname roles in the commands after putting glusterd in
>> debug mode on the host where I run the reset-brick command (do I have
>> to set debug on the other nodes too?)
>>
>
> You have to set the log level to debug for the glusterd instance where the
> commit fails and share the glusterd log of that particular node.
>
>

Ok, done.

Command executed on ovirt01 with timestamp "2017-07-06 13:04:12" in
glusterd log files

[root@ovirt01 export]# gluster volume reset-brick export
gl01.localdomain.local:/gluster/brick3/export start
volume reset-brick: success: reset-brick start operation successful

[root@ovirt01 export]# gluster volume reset-brick export
gl01.localdomain.local:/gluster/brick3/export
ovirt01.localdomain.local:/gluster/brick3/export commit force
volume reset-brick: failed: Commit failed on ovirt02.localdomain.local.
Please check log file for details.
Commit failed on ovirt03.localdomain.local. Please check log file for
details.
[root@ovirt01 export]#

See glusterd log files for the 3 nodes in debug mode here:
ovirt01:
https://drive.google.com/file/d/0BwoPbcrMv8mvY1RTTGp3RUhScm8/view?usp=sharing
ovirt02:
https://drive.google.com/file/d/0BwoPbcrMv8mvSVpJUHNhMzhMSU0/view?usp=sharing
ovirt03:
https://drive.google.com/file/d/0BwoPbcrMv8mvT2xiWEdQVmJNb0U/view?usp=sharing

HIH debugging
Gianluca


Re: [ovirt-users] [Gluster-users] op-version for reset-brick (Was: Re: Upgrading HC from 4.0 to 4.1)

2017-07-06 Thread Atin Mukherjee
On Thu, Jul 6, 2017 at 5:26 PM, Gianluca Cecchi 
wrote:

> On Thu, Jul 6, 2017 at 8:38 AM, Gianluca Cecchi  > wrote:
>
>>
>> Eventually I can destroy and recreate this "export" volume again with the
>> old names (ovirt0N.localdomain.local) if you give me the sequence of
>> commands, then enable debug and retry the reset-brick command
>>
>> Gianluca
>>
>
>
> So it seems I was able to destroy and re-create.
> Now I see that the volume creation uses the new IP by default, so I
> reversed the hostname roles in the commands after putting glusterd in
> debug mode on the host where I run the reset-brick command (do I have
> to set debug on the other nodes too?)
>

You have to set the log level to debug for the glusterd instance where the
commit fails and share the glusterd log of that particular node.


>
>
> [root@ovirt01 ~]# gluster volume reset-brick export
> gl01.localdomain.local:/gluster/brick3/export start
> volume reset-brick: success: reset-brick start operation successful
>
> [root@ovirt01 ~]# gluster volume reset-brick export
> gl01.localdomain.local:/gluster/brick3/export 
> ovirt01.localdomain.local:/gluster/brick3/export
> commit force
> volume reset-brick: failed: Commit failed on ovirt02.localdomain.local.
> Please check log file for details.
> Commit failed on ovirt03.localdomain.local. Please check log file for
> details.
> [root@ovirt01 ~]#
>
> See here the glusterd.log in zip format:
> https://drive.google.com/file/d/0BwoPbcrMv8mvYmlRLUgyV0pFN0k/view?usp=sharing
>
> The time of the reset-brick operation in the logfile is 2017-07-06 11:42.
> (BTW: can I have the log timestamps in local time instead of UTC, since my
> system uses CEST?)
>
> I see a difference, because the brick doesn't seem isolated as it was before...
>
> [root@ovirt01 glusterfs]# gluster volume info export
>
> Volume Name: export
> Type: Replicate
> Volume ID: e278a830-beed-4255-b9ca-587a630cbdbf
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: ovirt01.localdomain.local:/gluster/brick3/export
> Brick2: 10.10.2.103:/gluster/brick3/export
> Brick3: 10.10.2.104:/gluster/brick3/export (arbiter)
>
> [root@ovirt02 ~]# gluster volume info export
>
> Volume Name: export
> Type: Replicate
> Volume ID: e278a830-beed-4255-b9ca-587a630cbdbf
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: ovirt01.localdomain.local:/gluster/brick3/export
> Brick2: 10.10.2.103:/gluster/brick3/export
> Brick3: 10.10.2.104:/gluster/brick3/export (arbiter)
>
> And also in oVirt I see all 3 bricks online
>
> Gianluca
>
>


Re: [ovirt-users] [Gluster-users] op-version for reset-brick (Was: Re: Upgrading HC from 4.0 to 4.1)

2017-07-06 Thread Gianluca Cecchi
On Thu, Jul 6, 2017 at 8:38 AM, Gianluca Cecchi 
wrote:

>
> Eventually I can destroy and recreate this "export" volume again with the
> old names (ovirt0N.localdomain.local) if you give me the sequence of
> commands, then enable debug and retry the reset-brick command
>
> Gianluca
>


So it seems I was able to destroy and re-create.
Now I see that the volume creation uses the new IP by default, so I
reversed the hostname roles in the commands after putting glusterd in
debug mode on the host where I run the reset-brick command (do I have
to set debug on the other nodes too?)


[root@ovirt01 ~]# gluster volume reset-brick export
gl01.localdomain.local:/gluster/brick3/export start
volume reset-brick: success: reset-brick start operation successful

[root@ovirt01 ~]# gluster volume reset-brick export
gl01.localdomain.local:/gluster/brick3/export
ovirt01.localdomain.local:/gluster/brick3/export commit force
volume reset-brick: failed: Commit failed on ovirt02.localdomain.local.
Please check log file for details.
Commit failed on ovirt03.localdomain.local. Please check log file for
details.
[root@ovirt01 ~]#

See here the glusterd.log in zip format:
https://drive.google.com/file/d/0BwoPbcrMv8mvYmlRLUgyV0pFN0k/view?usp=sharing

The time of the reset-brick operation in the logfile is 2017-07-06 11:42.
(BTW: can I have the log timestamps in local time instead of UTC, since my
system uses CEST?)

I see a difference, because the brick doesn't seem isolated as it was before...

[root@ovirt01 glusterfs]# gluster volume info export

Volume Name: export
Type: Replicate
Volume ID: e278a830-beed-4255-b9ca-587a630cbdbf
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ovirt01.localdomain.local:/gluster/brick3/export
Brick2: 10.10.2.103:/gluster/brick3/export
Brick3: 10.10.2.104:/gluster/brick3/export (arbiter)

[root@ovirt02 ~]# gluster volume info export

Volume Name: export
Type: Replicate
Volume ID: e278a830-beed-4255-b9ca-587a630cbdbf
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ovirt01.localdomain.local:/gluster/brick3/export
Brick2: 10.10.2.103:/gluster/brick3/export
Brick3: 10.10.2.104:/gluster/brick3/export (arbiter)

And also in oVirt I see all 3 bricks online

Gianluca


Re: [ovirt-users] [Gluster-users] op-version for reset-brick (Was: Re: Upgrading HC from 4.0 to 4.1)

2017-07-05 Thread Gianluca Cecchi
On Thu, Jul 6, 2017 at 6:55 AM, Atin Mukherjee  wrote:

>
>
>>
> You can switch back to info mode the moment this is hit one more time with
> the debug log enabled. What I'd need here is the glusterd log (with debug
> mode) to figure out the exact cause of the failure.
>
>
>>
>> Let me know,
>> thanks
>>
>>
>
Yes, but with the volume in its current state I cannot run the reset-brick
command.
I have another volume, named "iso", that I could use, but I would like to
keep it clean until I understand the problem with the "export" volume.
Currently the "export" volume in fact shows this:

[root@ovirt01 ~]# gluster volume info export

Volume Name: export
Type: Replicate
Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
Status: Started
Snapshot Count: 0
Number of Bricks: 0 x (2 + 1) = 1
Transport-type: tcp
Bricks:
Brick1: gl01.localdomain.local:/gluster/brick3/export
Options Reconfigured:
...

While on the other two nodes

[root@ovirt02 ~]#  gluster volume info export

Volume Name: export
Type: Replicate
Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
Status: Started
Snapshot Count: 0
Number of Bricks: 0 x (2 + 1) = 2
Transport-type: tcp
Bricks:
Brick1: ovirt02.localdomain.local:/gluster/brick3/export
Brick2: ovirt03.localdomain.local:/gluster/brick3/export
Options Reconfigured:


[root@ovirt03 ~]# gluster volume info export

Volume Name: export
Type: Replicate
Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
Status: Started
Snapshot Count: 0
Number of Bricks: 0 x (2 + 1) = 2
Transport-type: tcp
Bricks:
Brick1: ovirt02.localdomain.local:/gluster/brick3/export
Brick2: ovirt03.localdomain.local:/gluster/brick3/export
Options Reconfigured:
...

Eventually I can destroy and recreate this "export" volume again with the
old names (ovirt0N.localdomain.local) if you give me the sequence of
commands, then enable debug and retry the reset-brick command

Gianluca
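
(For reference, a minimal sketch of such a destroy/re-create sequence,
assuming the bricks hold no data that needs to be preserved, that the brick
directories can simply be wiped on every node, and that the inconsistent
volume state still accepts a plain stop/delete; "force" may be needed:

gluster volume stop export
gluster volume delete export
# on each of the three nodes, clear the old brick directory so it can be reused
rm -rf /gluster/brick3/export && mkdir -p /gluster/brick3/export
# re-create the 2+1 arbiter volume with the old hostnames, then start it
gluster volume create export replica 3 arbiter 1 \
    ovirt01.localdomain.local:/gluster/brick3/export \
    ovirt02.localdomain.local:/gluster/brick3/export \
    ovirt03.localdomain.local:/gluster/brick3/export
gluster volume start export
)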


Re: [ovirt-users] [Gluster-users] op-version for reset-brick (Was: Re: Upgrading HC from 4.0 to 4.1)

2017-07-05 Thread Atin Mukherjee
On Thu, Jul 6, 2017 at 3:47 AM, Gianluca Cecchi 
wrote:

> On Wed, Jul 5, 2017 at 6:39 PM, Atin Mukherjee 
> wrote:
>
>> OK, so the log just hints to the following:
>>
>> [2017-07-05 15:04:07.178204] E [MSGID: 106123]
>> [glusterd-mgmt.c:1532:glusterd_mgmt_v3_commit] 0-management: Commit
>> failed for operation Reset Brick on local node
>> [2017-07-05 15:04:07.178214] E [MSGID: 106123]
>> [glusterd-replace-brick.c:649:glusterd_mgmt_v3_initiate_replace_brick_cmd_phases]
>> 0-management: Commit Op Failed
>>
>> While going through the code, glusterd_op_reset_brick () failed, resulting
>> in these logs. I don't see any error logs generated from
>> glusterd_op_reset_brick (), which makes me think we failed at a place where
>> the failure is only logged at debug level. Would you be able to restart the
>> glusterd service in debug log mode, rerun this test, and share the log?
>>
>>
> Do you mean to run the reset-brick command for another volume or for the
> same? Can I run it against this "now broken" volume?
>
> Or perhaps can I modify /usr/lib/systemd/system/glusterd.service and
> change in [service] section
>
> from
> Environment="LOG_LEVEL=INFO"
>
> to
> Environment="LOG_LEVEL=DEBUG"
>
> and then
> systemctl daemon-reload
> systemctl restart glusterd
>

Yes, that's how you can run glusterd in debug log mode.

>
> I think it would be better to keep gluster in debug mode for as little time
> as possible, as there are other volumes active right now, and I want to
> avoid filling up the filesystem that holds the log files.
> It would be best to put only some components in debug mode, if possible, as
> in the example commands above.
>

You can switch back to info mode the moment this is hit one more time with
the debug log enabled. What I'd need here is the glusterd log (with debug
mode) to figure out the exact cause of the failure.


>
> Let me know,
> thanks
>
>


Re: [ovirt-users] [Gluster-users] op-version for reset-brick (Was: Re: Upgrading HC from 4.0 to 4.1)

2017-07-05 Thread Gianluca Cecchi
On Wed, Jul 5, 2017 at 6:39 PM, Atin Mukherjee  wrote:

> OK, so the log just hints to the following:
>
> [2017-07-05 15:04:07.178204] E [MSGID: 106123] 
> [glusterd-mgmt.c:1532:glusterd_mgmt_v3_commit]
> 0-management: Commit failed for operation Reset Brick on local node
> [2017-07-05 15:04:07.178214] E [MSGID: 106123]
> [glusterd-replace-brick.c:649:glusterd_mgmt_v3_initiate_replace_brick_cmd_phases]
> 0-management: Commit Op Failed
>
> While going through the code, glusterd_op_reset_brick () failed, resulting
> in these logs. I don't see any error logs generated from
> glusterd_op_reset_brick (), which makes me think we failed at a place where
> the failure is only logged at debug level. Would you be able to restart the
> glusterd service in debug log mode, rerun this test, and share the log?
>
>
What's the best way to put glusterd in debug mode?
And can I set it on this volume and work on it, even though it is now compromised?

I ask because I have tried this:

[root@ovirt01 ~]# gluster volume get export diagnostics.brick-log-level
Option                                  Value
------                                  -----
diagnostics.brick-log-level             INFO


[root@ovirt01 ~]# gluster volume set export diagnostics.brick-log-level
DEBUG
volume set: failed: Error, Validation Failed
[root@ovirt01 ~]#

While on another volume that is in good state, I can run

[root@ovirt01 ~]# gluster volume set iso diagnostics.brick-log-level DEBUG
volume set: success
[root@ovirt01 ~]#

[root@ovirt01 ~]# gluster volume get iso diagnostics.brick-log-level
Option                                  Value
------                                  -----
diagnostics.brick-log-level             DEBUG

[root@ovirt01 ~]# gluster volume set iso diagnostics.brick-log-level INFO
volume set: success
[root@ovirt01 ~]#

[root@ovirt01 ~]# gluster volume get iso diagnostics.brick-log-level
Option                                  Value
------                                  -----
diagnostics.brick-log-level             INFO
[root@ovirt01 ~]#

Do you mean to run the reset-brick command for another volume or for the
same? Can I run it against this "now broken" volume?

Or perhaps can I modify /usr/lib/systemd/system/glusterd.service and change
in [service] section

from
Environment="LOG_LEVEL=INFO"

to
Environment="LOG_LEVEL=DEBUG"

and then
systemctl daemon-reload
systemctl restart glusterd
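
(A sketch of the same change done as a systemd drop-in instead of editing the
packaged unit file, assuming the unit reads LOG_LEVEL as shown above; the
drop-in survives package updates and is easy to revert:

mkdir -p /etc/systemd/system/glusterd.service.d
printf '[Service]\nEnvironment="LOG_LEVEL=DEBUG"\n' > /etc/systemd/system/glusterd.service.d/debug.conf
systemctl daemon-reload && systemctl restart glusterd
# once the failure has been reproduced, revert:
rm /etc/systemd/system/glusterd.service.d/debug.conf
systemctl daemon-reload && systemctl restart glusterd
)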

I think it would be better to keep gluster in debug mode for as little time
as possible, as there are other volumes active right now, and I want to
avoid filling up the filesystem that holds the log files.
It would be best to put only some components in debug mode, if possible, as
in the example commands above.

Let me know,
thanks


Re: [ovirt-users] [Gluster-users] op-version for reset-brick (Was: Re: Upgrading HC from 4.0 to 4.1)

2017-07-05 Thread Atin Mukherjee
OK, so the log just hints to the following:

[2017-07-05 15:04:07.178204] E [MSGID: 106123]
[glusterd-mgmt.c:1532:glusterd_mgmt_v3_commit] 0-management: Commit failed
for operation Reset Brick on local node
[2017-07-05 15:04:07.178214] E [MSGID: 106123]
[glusterd-replace-brick.c:649:glusterd_mgmt_v3_initiate_replace_brick_cmd_phases]
0-management: Commit Op Failed

While going through the code, glusterd_op_reset_brick () failed, resulting
in these logs. I don't see any error logs generated from
glusterd_op_reset_brick (), which makes me think we failed at a place where
the failure is only logged at debug level. Would you be able to restart the
glusterd service in debug log mode, rerun this test, and share the log?


On Wed, Jul 5, 2017 at 9:12 PM, Gianluca Cecchi 
wrote:

>
>
> On Wed, Jul 5, 2017 at 5:22 PM, Atin Mukherjee 
> wrote:
>
>> And what does glusterd log indicate for these failures?
>>
>
>
> See here in gzip format
>
> https://drive.google.com/file/d/0BwoPbcrMv8mvYmlRLUgyV0pFN0k/view?usp=sharing
>
> It seems that on each host the peer files have been updated with a new
> entry "hostname2":
>
> [root@ovirt01 ~]# cat /var/lib/glusterd/peers/*
> uuid=b89311fe-257f-4e44-8e15-9bff6245d689
> state=3
> hostname1=ovirt02.localdomain.local
> hostname2=10.10.2.103
> uuid=ec81a04c-a19c-4d31-9d82-7543cefe79f3
> state=3
> hostname1=ovirt03.localdomain.local
> hostname2=10.10.2.104
> [root@ovirt01 ~]#
>
> [root@ovirt02 ~]# cat /var/lib/glusterd/peers/*
> uuid=e9717281-a356-42aa-a579-a4647a29a0bc
> state=3
> hostname1=ovirt01.localdomain.local
> hostname2=10.10.2.102
> uuid=ec81a04c-a19c-4d31-9d82-7543cefe79f3
> state=3
> hostname1=ovirt03.localdomain.local
> hostname2=10.10.2.104
> [root@ovirt02 ~]#
>
> [root@ovirt03 ~]# cat /var/lib/glusterd/peers/*
> uuid=b89311fe-257f-4e44-8e15-9bff6245d689
> state=3
> hostname1=ovirt02.localdomain.local
> hostname2=10.10.2.103
> uuid=e9717281-a356-42aa-a579-a4647a29a0bc
> state=3
> hostname1=ovirt01.localdomain.local
> hostname2=10.10.2.102
> [root@ovirt03 ~]#
>
>
> But not the gluster volume info on the second and third nodes, which have
> lost the ovirt01/gl01 host brick information...
>
> Eg on ovirt02
>
>
> [root@ovirt02 peers]# gluster volume info export
>
> Volume Name: export
> Type: Replicate
> Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 0 x (2 + 1) = 2
> Transport-type: tcp
> Bricks:
> Brick1: ovirt02.localdomain.local:/gluster/brick3/export
> Brick2: ovirt03.localdomain.local:/gluster/brick3/export
> Options Reconfigured:
> transport.address-family: inet
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: off
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-uid: 36
> storage.owner-gid: 36
> features.shard: on
> features.shard-block-size: 512MB
> performance.low-prio-threads: 32
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-wait-qlength: 1
> cluster.shd-max-threads: 6
> network.ping-timeout: 30
> user.cifs: off
> nfs.disable: on
> performance.strict-o-direct: on
> [root@ovirt02 peers]#
>
> And on ovirt03
>
> [root@ovirt03 ~]# gluster volume info export
>
> Volume Name: export
> Type: Replicate
> Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 0 x (2 + 1) = 2
> Transport-type: tcp
> Bricks:
> Brick1: ovirt02.localdomain.local:/gluster/brick3/export
> Brick2: ovirt03.localdomain.local:/gluster/brick3/export
> Options Reconfigured:
> transport.address-family: inet
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: off
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-uid: 36
> storage.owner-gid: 36
> features.shard: on
> features.shard-block-size: 512MB
> performance.low-prio-threads: 32
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-wait-qlength: 1
> cluster.shd-max-threads: 6
> network.ping-timeout: 30
> user.cifs: off
> nfs.disable: on
> performance.strict-o-direct: on
> [root@ovirt03 ~]#
>
> While on ovirt01 it seems isolated...
>
> [root@ovirt01 ~]# gluster volume info export
>
> Volume Name: export
> Type: Replicate
> Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 0 x (2 + 1) = 1
> Transport-type: tcp
> Bricks:
> Brick1: gl01.localdomain.local:/gluster/brick3/export
> Options Reconfigured:
> transport.address-family: inet
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: off
> cl

Re: [ovirt-users] [Gluster-users] op-version for reset-brick (Was: Re: Upgrading HC from 4.0 to 4.1)

2017-07-05 Thread Gianluca Cecchi
On Wed, Jul 5, 2017 at 5:22 PM, Atin Mukherjee  wrote:

> And what does glusterd log indicate for these failures?
>


See here in gzip format

https://drive.google.com/file/d/0BwoPbcrMv8mvYmlRLUgyV0pFN0k/view?usp=sharing


It seems that on each host the peer files have been updated with a new
entry "hostname2":

[root@ovirt01 ~]# cat /var/lib/glusterd/peers/*
uuid=b89311fe-257f-4e44-8e15-9bff6245d689
state=3
hostname1=ovirt02.localdomain.local
hostname2=10.10.2.103
uuid=ec81a04c-a19c-4d31-9d82-7543cefe79f3
state=3
hostname1=ovirt03.localdomain.local
hostname2=10.10.2.104
[root@ovirt01 ~]#

[root@ovirt02 ~]# cat /var/lib/glusterd/peers/*
uuid=e9717281-a356-42aa-a579-a4647a29a0bc
state=3
hostname1=ovirt01.localdomain.local
hostname2=10.10.2.102
uuid=ec81a04c-a19c-4d31-9d82-7543cefe79f3
state=3
hostname1=ovirt03.localdomain.local
hostname2=10.10.2.104
[root@ovirt02 ~]#

[root@ovirt03 ~]# cat /var/lib/glusterd/peers/*
uuid=b89311fe-257f-4e44-8e15-9bff6245d689
state=3
hostname1=ovirt02.localdomain.local
hostname2=10.10.2.103
uuid=e9717281-a356-42aa-a579-a4647a29a0bc
state=3
hostname1=ovirt01.localdomain.local
hostname2=10.10.2.102
[root@ovirt03 ~]#
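
(As a quick cross-check of the same peer view from the CLI, both of these are
standard gluster commands and can be run on any node:

gluster peer status
gluster pool list
)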


But not the gluster volume info on the second and third nodes, which have
lost the ovirt01/gl01 host brick information...

Eg on ovirt02


[root@ovirt02 peers]# gluster volume info export

Volume Name: export
Type: Replicate
Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
Status: Started
Snapshot Count: 0
Number of Bricks: 0 x (2 + 1) = 2
Transport-type: tcp
Bricks:
Brick1: ovirt02.localdomain.local:/gluster/brick3/export
Brick2: ovirt03.localdomain.local:/gluster/brick3/export
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: off
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-wait-qlength: 1
cluster.shd-max-threads: 6
network.ping-timeout: 30
user.cifs: off
nfs.disable: on
performance.strict-o-direct: on
[root@ovirt02 peers]#

And on ovirt03

[root@ovirt03 ~]# gluster volume info export

Volume Name: export
Type: Replicate
Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
Status: Started
Snapshot Count: 0
Number of Bricks: 0 x (2 + 1) = 2
Transport-type: tcp
Bricks:
Brick1: ovirt02.localdomain.local:/gluster/brick3/export
Brick2: ovirt03.localdomain.local:/gluster/brick3/export
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: off
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-wait-qlength: 1
cluster.shd-max-threads: 6
network.ping-timeout: 30
user.cifs: off
nfs.disable: on
performance.strict-o-direct: on
[root@ovirt03 ~]#

While on ovirt01 it seems isolated...

[root@ovirt01 ~]# gluster volume info export

Volume Name: export
Type: Replicate
Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
Status: Started
Snapshot Count: 0
Number of Bricks: 0 x (2 + 1) = 1
Transport-type: tcp
Bricks:
Brick1: gl01.localdomain.local:/gluster/brick3/export
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: off
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-wait-qlength: 1
cluster.shd-max-threads: 6
network.ping-timeout: 30
user.cifs: off
nfs.disable: on
performance.strict-o-direct: on
[root@ovirt01 ~]#


Re: [ovirt-users] [Gluster-users] op-version for reset-brick (Was: Re: Upgrading HC from 4.0 to 4.1)

2017-07-05 Thread Atin Mukherjee
And what does glusterd log indicate for these failures?

On Wed, Jul 5, 2017 at 8:43 PM, Gianluca Cecchi 
wrote:

>
>
> On Wed, Jul 5, 2017 at 5:02 PM, Sahina Bose  wrote:
>
>>
>>
>> On Wed, Jul 5, 2017 at 8:16 PM, Gianluca Cecchi <
>> gianluca.cec...@gmail.com> wrote:
>>
>>>
>>>
>>> On Wed, Jul 5, 2017 at 7:42 AM, Sahina Bose  wrote:
>>>


> ...
>
> then the commands I need to run would be:
>
> gluster volume reset-brick export 
> ovirt01.localdomain.local:/gluster/brick3/export
> start
> gluster volume reset-brick export 
> ovirt01.localdomain.local:/gluster/brick3/export
> gl01.localdomain.local:/gluster/brick3/export commit force
>
> Correct?
>

 Yes, correct. gl01.localdomain.local should resolve correctly on all 3
 nodes.

>>>
>>>
>>> It fails at first step:
>>>
>>>  [root@ovirt01 ~]# gluster volume reset-brick export
>>> ovirt01.localdomain.local:/gluster/brick3/export start
>>> volume reset-brick: failed: Cannot execute command. The cluster is
>>> operating at version 30712. reset-brick command reset-brick start is
>>> unavailable in this version.
>>> [root@ovirt01 ~]#
>>>
>>> It seems somehow related to this upgrade note for the commercial Red Hat
>>> Gluster Storage solution:
>>> https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Installation_Guide/chap-Upgrading_Red_Hat_Storage.html
>>>
>>> So it seems I have to run some command of the type:
>>>
>>> gluster volume set all cluster.op-version X
>>>
>>> with X > 30712
>>>
>>> It seems that the latest version of the commercial Red Hat Gluster Storage
>>> is 3.1 and its op-version is indeed 30712.
>>>
>>> So the question is which particular op-version I have to set, and whether
>>> the command can be run online without causing disruption.
>>>
>>
>> It should have worked with the glusterfs 3.10 version from Centos repo.
>> Adding gluster-users for help on the op-version
>>
>>
>>>
>>> Thanks,
>>> Gianluca
>>>
>>
>>
>
> It seems op-version is not updated automatically by default, so that it
> can manage mixed versions while you update one by one...
>
> I followed what described here:
> https://gluster.readthedocs.io/en/latest/Upgrade-Guide/op_version/
>
>
> - Get current version:
>
> [root@ovirt01 ~]# gluster volume get all cluster.op-version
> Option                                  Value
> ------                                  -----
> cluster.op-version                      30712
>
> [root@ovirt01 ~]#
>
>
> - Get maximum version I can set for current setup:
>
> [root@ovirt01 ~]# gluster volume get all cluster.max-op-version
> Option                                  Value
> ------                                  -----
> cluster.max-op-version                  31000
>
> [root@ovirt01 ~]#
>
>
> - Get op version information for all the connected clients:
>
> [root@ovirt01 ~]# gluster volume status all clients | grep ":49" | awk
> '{print $4}' | sort | uniq -c
>  72 31000
> [root@ovirt01 ~]#
>
> --> ok
>
>
> - Update op-version
>
> [root@ovirt01 ~]# gluster volume set all cluster.op-version 31000
> volume set: success
> [root@ovirt01 ~]#
>
>
> - Verify:
> [root@ovirt01 ~]# gluster volume get all cluster.op-versionOption
>  Value
> --  -
>
> cluster.op-version  31000
>
> [root@ovirt01 ~]#
>
> --> ok
>
> [root@ovirt01 ~]# gluster volume reset-brick export
> ovirt01.localdomain.local:/gluster/brick3/export start
> volume reset-brick: success: reset-brick start operation successful
>
> [root@ovirt01 ~]# gluster volume reset-brick export
> ovirt01.localdomain.local:/gluster/brick3/export 
> gl01.localdomain.local:/gluster/brick3/export
> commit force
> volume reset-brick: failed: Commit failed on ovirt02.localdomain.local.
> Please check log file for details.
> Commit failed on ovirt03.localdomain.local. Please check log file for
> details.
> [root@ovirt01 ~]#
>
> [root@ovirt01 bricks]# gluster volume info export
>
> Volume Name: export
> Type: Replicate
> Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: gl01.localdomain.local:/gluster/brick3/export
> Brick2: ovirt02.localdomain.local:/gluster/brick3/export
> Brick3: ovirt03.localdomain.local:/gluster/brick3/export (arbiter)
> Options Reconfigured:
> transport.address-family: inet
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: off
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-uid: 36
> storage.owner-gid: 36
> features.shard: on
> features.shard-block-size: 512MB
> performance.low-prio-threads: 32
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-wait-qlength: 1
> cluster

Re: [ovirt-users] [Gluster-users] op-version for reset-brick (Was: Re: Upgrading HC from 4.0 to 4.1)

2017-07-05 Thread Atin Mukherjee
On Wed, Jul 5, 2017 at 8:32 PM, Sahina Bose  wrote:

>
>
> On Wed, Jul 5, 2017 at 8:16 PM, Gianluca Cecchi  > wrote:
>
>>
>>
>> On Wed, Jul 5, 2017 at 7:42 AM, Sahina Bose  wrote:
>>
>>>
>>>
 ...

 then the commands I need to run would be:

 gluster volume reset-brick export 
 ovirt01.localdomain.local:/gluster/brick3/export
 start
 gluster volume reset-brick export 
 ovirt01.localdomain.local:/gluster/brick3/export
 gl01.localdomain.local:/gluster/brick3/export commit force

 Correct?

>>>
>>> Yes, correct. gl01.localdomain.local should resolve correctly on all 3
>>> nodes.
>>>
>>
>>
>> It fails at first step:
>>
>>  [root@ovirt01 ~]# gluster volume reset-brick export
>> ovirt01.localdomain.local:/gluster/brick3/export start
>> volume reset-brick: failed: Cannot execute command. The cluster is
>> operating at version 30712. reset-brick command reset-brick start is
>> unavailable in this version.
>> [root@ovirt01 ~]#
>>
>> It seems somehow related to this upgrade note for the commercial Red Hat
>> Gluster Storage solution:
>> https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Installation_Guide/chap-Upgrading_Red_Hat_Storage.html
>>
>> So it seems I have to run some command of the type:
>>
>> gluster volume set all cluster.op-version X
>>
>> with X > 30712
>>
>> It seems that the latest version of the commercial Red Hat Gluster Storage
>> is 3.1 and its op-version is indeed 30712.
>>
>> So the question is which particular op-version I have to set, and whether
>> the command can be run online without causing disruption.
>>
>
> It should have worked with the glusterfs 3.10 version from Centos repo.
> Adding gluster-users for help on the op-version
>

This definitely means your cluster op-version is running < 3.9.0

    if (conf->op_version < GD_OP_VERSION_3_9_0 &&
        strcmp (cli_op, "GF_REPLACE_OP_COMMIT_FORCE")) {
            snprintf (msg, sizeof (msg), "Cannot execute command. The "
                      "cluster is operating at version %d. reset-brick "
                      "command %s is unavailable in this version.",
                      conf->op_version, gd_rb_op_to_str (cli_op));
            ret = -1;
            goto out;
    }

What version of the gluster bits are you running across the gluster
cluster? Please note that cluster.op-version is not exactly the same as the
rpm version, and with every upgrade it's recommended to bump up the op-version.
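
(To illustrate, a short sketch of checking and bumping the op-version;
GD_OP_VERSION_3_9_0 in the snippet above should correspond to op-version
30900, and the value to set must be whatever max-op-version your own cluster
reports:

gluster volume get all cluster.op-version       # current cluster op-version
gluster volume get all cluster.max-op-version   # highest op-version this set of peers supports
gluster volume set all cluster.op-version 31000 # bump to the reported max-op-version
)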


>
>>
>> Thanks,
>> Gianluca
>>
>
>