[ovirt-devel] Re: hc-basic-suite-master fails due to missing glusterfs firewalld services

2021-06-21 Thread Yedidyah Bar David
On Thu, Jun 17, 2021 at 6:27 PM Marcin Sobczyk  wrote:
>
>
>
> On 6/17/21 1:44 PM, Yedidyah Bar David wrote:
> > On Wed, Jun 16, 2021 at 1:23 PM Yedidyah Bar David  wrote:
> >> Hi,
> >>
> >> I now tried running locally hc-basic-suite-master with a patched OST,
> >> and it failed due to $subject. I checked and see that this also
> >> happened on CI, e.g. [1], before it started failing due to an unrelated
> >> reason later:
> >>
> >> E   TASK [gluster.infra/roles/firewall_config : Add/Delete
> >> services to firewalld rules] ***
> >> E   failed: [lago-hc-basic-suite-master-host-0]
> >> (item=glusterfs) => {"ansible_loop_var": "item", "changed": false,
> >> "item": "glusterfs", "msg": "ERROR: Exception caught:
> >> org.fedoraproject.FirewallD1.Exception: INVALID_SERVICE: 'glusterfs'
> >> not among existing services Permanent and Non-Permanent(immediate)
> >> operation, Services are defined by port/tcp relationship and named as
> >> they are in /etc/services (on most systems)"}
> >> E   failed: [lago-hc-basic-suite-master-host-2]
> >> (item=glusterfs) => {"ansible_loop_var": "item", "changed": false,
> >> "item": "glusterfs", "msg": "ERROR: Exception caught:
> >> org.fedoraproject.FirewallD1.Exception: INVALID_SERVICE: 'glusterfs'
> >> not among existing services Permanent and Non-Permanent(immediate)
> >> operation, Services are defined by port/tcp relationship and named as
> >> they are in /etc/services (on most systems)"}
> >> E   failed: [lago-hc-basic-suite-master-host-1]
> >> (item=glusterfs) => {"ansible_loop_var": "item", "changed": false,
> >> "item": "glusterfs", "msg": "ERROR: Exception caught:
> >> org.fedoraproject.FirewallD1.Exception: INVALID_SERVICE: 'glusterfs'
> >> not among existing services Permanent and Non-Permanent(immediate)
> >> operation, Services are defined by port/tcp relationship and named as
> >> they are in /etc/services (on most systems)"}
> >>
> >> This seems similar to [2], and indeed I can't see the package
> >> 'glusterfs-server' installed locally on host-0. Any idea?
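[Editor's note: the INVALID_SERVICE error quoted above just means firewalld has no service definition named 'glusterfs'; that definition is shipped by the glusterfs-server package. A minimal sketch of the check (hypothetical helper, not the ansible role's actual code) over `firewall-cmd --get-services` output:]

```python
# Hypothetical illustration of what INVALID_SERVICE means: firewalld only
# accepts service names that are already defined on the host, and the
# 'glusterfs' definition is shipped by glusterfs-server.
def has_firewalld_service(get_services_output: str, name: str) -> bool:
    # `firewall-cmd --get-services` prints one space-separated line of names.
    return name in get_services_output.split()

# Without glusterfs-server installed, 'glusterfs' is absent and any attempt
# to add it to a zone fails with INVALID_SERVICE:
defined = "cockpit dhcpv6-client ssh"
missing = not has_firewalld_service(defined, "glusterfs")
```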
> > I think I understand:
> >
> > It seems like the deployment of hc relied on the order of running the deploy
> > scripts as written in lagoinitfile. With the new deploy code, all of them run
> > in parallel. Does this make sense?
> The scripts run in parallel as in "on all VMs at the same time", but
> sequentially as in "one script at a time on each VM" - this is the same
> behavior we had with lago deployment.

Well, I do not think it works as intended, then. When running locally,
I logged into host-0, and after it failed, I had:

# dnf history
ID | Command line                                                                                                                            | Date and time    | Action(s) | Altered
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 4 | install -y --nogpgcheck ansible gluster-ansible-roles ovirt-hosted-engine-setup ovirt-ansible-hosted-engine-setup ovirt-ansible-reposit | 2021-06-17 11:54 | I, U      |       8
 3 | -y --nogpgcheck install ovirt-host python3-coverage vdsm-hook-vhostmd                                                                   | 2021-06-08 02:15 | Install   |  493 EE
 2 | install -y dnf-utils https://resources.ovirt.org/pub/yum-repo/ovirt-release-master.rpm                                                  | 2021-06-08 02:14 | Install   |       1
 1 |                                                                                                                                         | 2021-06-08 02:06 | Install   |  511 EE

Meaning, it already ran setup_first_host.sh (and failed there), but
never ran hc_setup_host.sh, even though hc_setup_host.sh is listed
before setup_first_host.sh in the lagoinitfile.
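[Editor's note: the scheme Marcin describes - parallel across VMs, sequential per VM - could be sketched as below. This is a hypothetical illustration, not OST's actual code; note that per-VM sequencing only helps if each VM's script list is correct to begin with, which is the point of contention here:]

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sketch: each VM gets its own worker that runs that VM's deploy
# scripts strictly in list order, while different VMs proceed in parallel.
# If a host's list is wrongly ordered (or a script is missing from it),
# per-VM sequential execution alone cannot repair that.
def run_deploy_scripts(plan, execute):
    """plan: {vm_name: [script, ...]}; execute(vm, script) runs one script."""
    def run_on_vm(vm):
        for script in plan[vm]:       # sequential on each VM: order preserved
            execute(vm, script)

    with ThreadPoolExecutor(max_workers=len(plan)) as pool:
        list(pool.map(run_on_vm, plan))  # parallel across VMs

# Example: record the execution order observed on each host.
log = {}
plan = {
    "host-0": ["hc_setup_host.sh", "setup_first_host.sh"],
    "host-1": ["hc_setup_host.sh"],
}
run_deploy_scripts(plan, lambda vm, s: log.setdefault(vm, []).append(s))
```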

If you check [1], which is a build that failed due to this reason
(unlike the later ones), you see there:

-- Captured log setup --
2021-06-07 01:58:38+,594 INFO
[ost_utils.pytest.fixtures.deployment] Waiting for SSH on the VMs
(deployment:40)
2021-06-07 01:59:11+,947 INFO
[ost_utils.deployment_utils.package_mgmt] oVirt packages used on VMs:
(package_mgmt:133)
2021-06-07 01:59:11+,948 INFO
[ost_utils.deployment_utils.package_mgmt]
vdsm-4.40.70.2-1.git34cdc8884.el8.x86_64 (package_mgmt:135)
2021-06-07 01:59:11+,950 INFO
[ost_utils.deployment_utils.scripts] Running
/home/jenkins/workspace/ovirt-system-tests_hc-basic-suite-master/ovirt-system-tests/common/deploy-scripts/setup_host.sh
on lago-hc-basic-suite-master-host-1 (scripts:36)
2021-06-07 01:59:11+,950 INFO
[ost_utils.deployment_utils.scripts] Running
/home/jenkins/workspace/ovirt-system-tests_hc-basic-suite-master/ovirt-system-tests/common/deploy-scripts/setup_host.sh
on lago-hc-basic-suite-master-host-2 (scripts:36)
2021-06-07 01:59:11+,952 INFO
[ost_utils.deployment_utils.scripts] Running
/home/jenkins/workspace/ovirt-system-tests_hc-basic-suite-master/ovirt-system-tests/common/deploy-scripts/setup_host.sh
on lago-hc-basic-suite-master-host-0 (scripts:36)
2021-06-07 01:59:13+,260 INFO
[ost_utils.deployment_utils.scripts] Running
/home/jenkins/workspace/ovirt-system-tests_hc-basic-suite-master/ovirt-system-tests/hc-basic-suite-master/hc_setup_host.sh
on lago-hc-basic-suite-master-host-1

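[Editor's note: to spot this kind of ordering problem, the captured setup log can be reduced to the per-host sequence of started scripts. A hypothetical helper (not part of ost_utils) - the regex tolerates the line wrapping in the excerpt above, since `\s` matches newlines:]

```python
import re

# Extract "Running <script> on <host>" pairs from the captured setup log and
# group the script basenames per host, in the order they were started.
LINE_RE = re.compile(r"Running\s+(\S+)\s+on\s+(\S+)")

def scripts_per_host(log_text: str) -> dict:
    order = {}
    for script, host in LINE_RE.findall(log_text):
        order.setdefault(host, []).append(script.rsplit("/", 1)[-1])
    return order

# Example on a minimal two-line log:
order = scripts_per_host(
    "Running /deploy-scripts/setup_host.sh on host-0 (scripts:36)\n"
    "Running /deploy-scripts/hc_setup_host.sh on host-0 (scripts:36)\n"
)
```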
[ovirt-devel] Re: hc-basic-suite-master fails due to missing glusterfs firewalld services

2021-06-21 Thread Marcin Sobczyk



[ovirt-devel] Re: hc-basic-suite-master fails due to missing glusterfs firewalld services

2021-06-21 Thread Yedidyah Bar David
On Wed, Jun 16, 2021 at 1:23 PM Yedidyah Bar David  wrote:
>
> Hi,
>
> I now tried running locally hc-basic-suite-master with a patched OST,
> and it failed due to $subject. I checked and see that this also
> happened on CI, e.g. [1], before it started failing due to an unrelated
> reason later:
>
> E   TASK [gluster.infra/roles/firewall_config : Add/Delete
> services to firewalld rules] ***
> E   failed: [lago-hc-basic-suite-master-host-0]
> (item=glusterfs) => {"ansible_loop_var": "item", "changed": false,
> "item": "glusterfs", "msg": "ERROR: Exception caught:
> org.fedoraproject.FirewallD1.Exception: INVALID_SERVICE: 'glusterfs'
> not among existing services Permanent and Non-Permanent(immediate)
> operation, Services are defined by port/tcp relationship and named as
> they are in /etc/services (on most systems)"}
> E   failed: [lago-hc-basic-suite-master-host-2]
> (item=glusterfs) => {"ansible_loop_var": "item", "changed": false,
> "item": "glusterfs", "msg": "ERROR: Exception caught:
> org.fedoraproject.FirewallD1.Exception: INVALID_SERVICE: 'glusterfs'
> not among existing services Permanent and Non-Permanent(immediate)
> operation, Services are defined by port/tcp relationship and named as
> they are in /etc/services (on most systems)"}
> E   failed: [lago-hc-basic-suite-master-host-1]
> (item=glusterfs) => {"ansible_loop_var": "item", "changed": false,
> "item": "glusterfs", "msg": "ERROR: Exception caught:
> org.fedoraproject.FirewallD1.Exception: INVALID_SERVICE: 'glusterfs'
> not among existing services Permanent and Non-Permanent(immediate)
> operation, Services are defined by port/tcp relationship and named as
> they are in /etc/services (on most systems)"}
>
> This seems similar to [2], and indeed I can't see the package
> 'glusterfs-server' installed locally on host-0. Any idea?

I think I understand:

It seems like the deployment of hc relied on the order of running the deploy
scripts as written in lagoinitfile. With the new deploy code, all of them run
in parallel. Does this make sense?


>
> Thanks and best regards,
>
> [1] 
> https://jenkins.ovirt.org/job/ovirt-system-tests_hc-basic-suite-master/2088/
>
> [2] https://github.com/oVirt/ovirt-ansible/issues/124
> --
> Didi



--
Didi
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/HQ37ENFRGYJK4H3INGAAR5FYWK33WAH4/


[ovirt-devel] Re: hc-basic-suite-master fails due to missing glusterfs firewalld services

2021-06-21 Thread Marcin Sobczyk



The scripts run in parallel as in "on all VMs at the same time", but
sequentially as in "one script at a time on each VM" - this is the same
behavior we had with lago deployment.


Regards, Marcin






[ovirt-devel] Re: hc-basic-suite-master fails due to missing glusterfs firewalld services

2021-06-21 Thread Yedidyah Bar David
On Fri, Jun 18, 2021 at 10:18 AM Marcin Sobczyk  wrote: