[ovirt-users] Re: unable to bring up gluster bricks after 4.5 upgrade

2022-08-29 Thread Jayme
It appears that I may have resolved the issue after putting the host into
maintenance again and rebooting a second time. I'm really not sure why, but
all bricks are up now.
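
For anyone hitting the same thing, brick state can be double-checked from any
host with something like this (the volume name here is from my setup):

  gluster volume status engine
  gluster volume heal engine info summary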

On Mon, Aug 29, 2022 at 3:45 PM Jayme  wrote:

> A bit more info from the host's brick log
>
> [2022-08-29 18:43:44.251198 +] D [MSGID: 0]
> [options.c:1113:xlator_reconfigure_rec] 0-engine-barrier: reconfigured
> [2022-08-29 18:43:44.251203 +] D [MSGID: 0]
> [options.c:1133:xlator_reconfigure_rec] 0-engine-index: No reconfigure()
> found
> [2022-08-29 18:43:44.251207 +] D [MSGID: 0]
> [options.c:1113:xlator_reconfigure_rec] 0-engine-index: reconfigured
> [2022-08-29 18:43:44.251214 +] I [MSGID: 0]
> [options.c:1251:xlator_option_reconf_bool] 0-engine-quota: option
> deem-statfs using set value off
> [2022-08-29 18:43:44.251221 +] I [MSGID: 0]
> [options.c:1251:xlator_option_reconf_bool] 0-engine-quota: option
> server-quota using set value off
> [2022-08-29 18:43:44.251248 +] D [MSGID: 0]
> [options.c:1113:xlator_reconfigure_rec] 0-engine-quota: reconfigured
> [2022-08-29 18:44:04.899452 +] E [MSGID: 113072]
> [posix-inode-fd-ops.c:2087:posix_writev] 0-engine-posix: write failed:
> offset 0, [Invalid argument]
> [2022-08-29 18:44:04.899542 +] E [MSGID: 115067]
> [server-rpc-fops_v2.c:1324:server4_writev_cbk] 0-engine-server: WRITE info
> [{frame=358765}, {WRITEV_fd_no=5},
> {uuid_utoa=c816cdf3-12e6-45c0-ae0f-2cf03e0f7299},
> {client=CTX_ID:11b78775-07c9-47ff-b426-b44f3f88a3f7-GRAPH_ID:0-PID:25622-HOST:host1.x-PC_NAME:engine-client-0-RECON_NO:-2},
> {error-xlator=engine-posix}, {errno=22}, {error=Invalid argument}]
> [2022-08-29 18:44:14.876436 +] E [MSGID: 113002]
> [posix-entry-ops.c:769:posix_mkdir] 0-engine-posix: gfid is null for (null)
> [Invalid argument]
> [2022-08-29 18:44:14.876503 +] E [MSGID: 115056]
> [server-rpc-fops_v2.c:497:server4_mkdir_cbk] 0-engine-server: MKDIR info
> [{frame=359508}, {MKDIR_path=},
> {uuid_utoa=----0001}, {bname=},
> {client=CTX_ID:37199949-8cf2-4bbe-938e-e9ef3bd98486-GRAPH_ID:3-PID:2473-HOST:host0.x-PC_NAME:engine-client-0-RECON_NO:-0},
> {error-xlator=engine-posix}, {errno=22}, {error=Invalid argument}]
>
> On Mon, Aug 29, 2022 at 3:18 PM Jayme  wrote:
>
>> Hello All,
>>
>> I've been struggling with a few issues upgrading my 3-node HCI cluster
>> from 4.4 to 4.5.
>>
>> At present the self-hosted engine VM is properly running oVirt 4.5 on
>> CentOS Stream 8.
>>
>> I set the first host node into maintenance and installed the new node-ng
>> image. I ran into an issue with rescue mode on boot which appears to have
>> been related to the LVM devices bug. I was able to work past that and get
>> the node to boot.
>>
>> The node running the 4.5.2 image is booting properly and the gluster/LVM
>> mounts all look good. I am able to activate the host and run VMs on it;
>> however, oVirt is showing that all bricks on the host are DOWN.
>>
>> I was unable to get the bricks back up even after doing a force start of
>> the volumes.
>>
>> Here is the glusterd log from the host in question when I try a force
>> start on the engine volume (the other volumes are similar):
>>
>> ==> glusterd.log <==
>> The message "I [MSGID: 106568]
>> [glusterd-svc-mgmt.c:266:glusterd_svc_stop] 0-management: bitd service is
>> stopped" repeated 2 times between [2022-08-29 18:09:56.027147 +] and
>> [2022-08-29 18:10:34.694144 +]
>> [2022-08-29 18:10:34.695348 +] I [MSGID: 106618]
>> [glusterd-svc-helper.c:909:glusterd_attach_svc] 0-glusterd: adding svc
>> glustershd (volume=engine) to existing process with pid 2473
>> [2022-08-29 18:10:34.695669 +] I [MSGID: 106131]
>> [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: scrub already
>> stopped
>> [2022-08-29 18:10:34.695691 +] I [MSGID: 106568]
>> [glusterd-svc-mgmt.c:266:glusterd_svc_stop] 0-management: scrub service is
>> stopped
>> [2022-08-29 18:10:34.695832 +] I [MSGID: 106617]
>> [glusterd-svc-helper.c:698:glusterd_svc_attach_cbk] 0-management: svc
>> glustershd of volume engine attached successfully to pid 2473
>> [2022-08-29 18:10:34.703718 +] E [MSGID: 106115]
>> [glusterd-mgmt.c:119:gd_mgmt_v3_collate_errors] 0-management: Post commit
>> failed on gluster2.x. Please check log file for details.
>> [2022-08-29 18:10:34.703774 +] E [MSGID: 106115]
>> [glusterd-mgmt.c:119:gd_mgmt_v3_collate_errors] 0-management: Post commit
>> failed on gluster1.x. Please check log file for details.
>> [2022-08-29 18:10:34.703797 +] E [MSGID: 106664]
>> [glusterd-mgmt.c:1969:glusterd_mgmt_v3_post_commit] 0-management: Post
>> commit failed on peers
>> [2022-08-29 18:10:34.703800 +] E [MSGID: 106664]
>> [glusterd-mgmt.c:2664:glusterd_mgmt_v3_initiate_all_phases] 0-management:
>> Post commit Op Failed
>>
>> If I run the start command manually on the host CLI:
>>
>>  gluster volume start engine force
>> volume start: engine: failed: Post commit failed on gluster1.. Please
>> check log file for details.
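
For reference, a rough sketch of one workaround for the LVM devices issue
mentioned in the quoted message above, assuming the problem is the gluster
brick PVs missing from /etc/lvm/devices/system.devices after the node-ng
upgrade (the device and VG names below are examples only, not from my logs):

  lvmdevices                       # list what is currently in system.devices
  lvmdevices --adddev /dev/sdb     # add the missing brick PV (example device)
  vgchange -ay gluster_vg_sdb      # activate the brick VG (example name)
  vdsm-tool config-lvm-filter      # have vdsm regenerate its LVM filter

The brick mounts should then come up normally on the next boot.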

[ovirt-users] Re: unable to bring up gluster bricks after 4.5 upgrade

2022-08-29 Thread Jayme
A bit more info from the host's brick log

[2022-08-29 18:43:44.251198 +] D [MSGID: 0]
[options.c:1113:xlator_reconfigure_rec] 0-engine-barrier: reconfigured
[2022-08-29 18:43:44.251203 +] D [MSGID: 0]
[options.c:1133:xlator_reconfigure_rec] 0-engine-index: No reconfigure()
found
[2022-08-29 18:43:44.251207 +] D [MSGID: 0]
[options.c:1113:xlator_reconfigure_rec] 0-engine-index: reconfigured
[2022-08-29 18:43:44.251214 +] I [MSGID: 0]
[options.c:1251:xlator_option_reconf_bool] 0-engine-quota: option
deem-statfs using set value off
[2022-08-29 18:43:44.251221 +] I [MSGID: 0]
[options.c:1251:xlator_option_reconf_bool] 0-engine-quota: option
server-quota using set value off
[2022-08-29 18:43:44.251248 +] D [MSGID: 0]
[options.c:1113:xlator_reconfigure_rec] 0-engine-quota: reconfigured
[2022-08-29 18:44:04.899452 +] E [MSGID: 113072]
[posix-inode-fd-ops.c:2087:posix_writev] 0-engine-posix: write failed:
offset 0, [Invalid argument]
[2022-08-29 18:44:04.899542 +] E [MSGID: 115067]
[server-rpc-fops_v2.c:1324:server4_writev_cbk] 0-engine-server: WRITE info
[{frame=358765}, {WRITEV_fd_no=5},
{uuid_utoa=c816cdf3-12e6-45c0-ae0f-2cf03e0f7299},
{client=CTX_ID:11b78775-07c9-47ff-b426-b44f3f88a3f7-GRAPH_ID:0-PID:25622-HOST:host1.x-PC_NAME:engine-client-0-RECON_NO:-2},
{error-xlator=engine-posix}, {errno=22}, {error=Invalid argument}]
[2022-08-29 18:44:14.876436 +] E [MSGID: 113002]
[posix-entry-ops.c:769:posix_mkdir] 0-engine-posix: gfid is null for (null)
[Invalid argument]
[2022-08-29 18:44:14.876503 +] E [MSGID: 115056]
[server-rpc-fops_v2.c:497:server4_mkdir_cbk] 0-engine-server: MKDIR info
[{frame=359508}, {MKDIR_path=},
{uuid_utoa=----0001}, {bname=},
{client=CTX_ID:37199949-8cf2-4bbe-938e-e9ef3bd98486-GRAPH_ID:3-PID:2473-HOST:host0.x-PC_NAME:engine-client-0-RECON_NO:-0},
{error-xlator=engine-posix}, {errno=22}, {error=Invalid argument}]
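
For anyone following along, brick logs live under /var/log/glusterfs/bricks/
on the host and are named after the brick path, so the log above should be
from something like:

  tail -f /var/log/glusterfs/bricks/gluster_bricks-engine-engine.log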

On Mon, Aug 29, 2022 at 3:18 PM Jayme  wrote:

> Hello All,
>
> I've been struggling with a few issues upgrading my 3-node HCI cluster from
> 4.4 to 4.5.
>
> At present the self-hosted engine VM is properly running oVirt 4.5 on
> CentOS Stream 8.
>
> I set the first host node into maintenance and installed the new node-ng
> image. I ran into an issue with rescue mode on boot which appears to have
> been related to the LVM devices bug. I was able to work past that and get
> the node to boot.
>
> The node running the 4.5.2 image is booting properly and the gluster/LVM
> mounts all look good. I am able to activate the host and run VMs on it;
> however, oVirt is showing that all bricks on the host are DOWN.
>
> I was unable to get the bricks back up even after doing a force start of
> the volumes.
>
> Here is the glusterd log from the host in question when I try a force
> start on the engine volume (the other volumes are similar):
>
> ==> glusterd.log <==
> The message "I [MSGID: 106568] [glusterd-svc-mgmt.c:266:glusterd_svc_stop]
> 0-management: bitd service is stopped" repeated 2 times between [2022-08-29
> 18:09:56.027147 +] and [2022-08-29 18:10:34.694144 +]
> [2022-08-29 18:10:34.695348 +] I [MSGID: 106618]
> [glusterd-svc-helper.c:909:glusterd_attach_svc] 0-glusterd: adding svc
> glustershd (volume=engine) to existing process with pid 2473
> [2022-08-29 18:10:34.695669 +] I [MSGID: 106131]
> [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: scrub already
> stopped
> [2022-08-29 18:10:34.695691 +] I [MSGID: 106568]
> [glusterd-svc-mgmt.c:266:glusterd_svc_stop] 0-management: scrub service is
> stopped
> [2022-08-29 18:10:34.695832 +] I [MSGID: 106617]
> [glusterd-svc-helper.c:698:glusterd_svc_attach_cbk] 0-management: svc
> glustershd of volume engine attached successfully to pid 2473
> [2022-08-29 18:10:34.703718 +] E [MSGID: 106115]
> [glusterd-mgmt.c:119:gd_mgmt_v3_collate_errors] 0-management: Post commit
> failed on gluster2.x. Please check log file for details.
> [2022-08-29 18:10:34.703774 +] E [MSGID: 106115]
> [glusterd-mgmt.c:119:gd_mgmt_v3_collate_errors] 0-management: Post commit
> failed on gluster1.x. Please check log file for details.
> [2022-08-29 18:10:34.703797 +] E [MSGID: 106664]
> [glusterd-mgmt.c:1969:glusterd_mgmt_v3_post_commit] 0-management: Post
> commit failed on peers
> [2022-08-29 18:10:34.703800 +] E [MSGID: 106664]
> [glusterd-mgmt.c:2664:glusterd_mgmt_v3_initiate_all_phases] 0-management:
> Post commit Op Failed
>
> If I run the start command manually on the host CLI:
>
>  gluster volume start engine force
> volume start: engine: failed: Post commit failed on gluster1.. Please
> check log file for details.
> Post commit failed on gluster2.. Please check log file for details.
>
> I feel like this may be an issue with the difference in major GlusterFS
> versions between the nodes, but I am unsure. The other nodes are running
> ovirt-node-ng-4.4.6.3
>
> At this point I am afraid to bring down any other node to attempt
> upgrading it without the
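
In case it helps with the mixed-version theory above, the installed GlusterFS
version and the cluster op-version can be compared across the peers with
something like:

  gluster --version
  gluster volume get all cluster.op-version       # op-version the cluster currently runs at
  gluster volume get all cluster.max-op-version   # highest op-version all peers support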