[ OST Failure Report ] [ oVirt Master (vdsm) ] [ 22-02-2018 ] [ 002_bootstrap.verify_add_hosts + 002_bootstrap.add_hosts ]

2018-02-22 Thread Dafna Ron
Hi,

We had two failed tests reported in the vdsm project last evening; the
reported patch seems to be related to the issue.

Link and headline of suspected patches:
momIF: change the way we connect to MOM - https://gerrit.ovirt.org/#/c/87944/

Link to Job:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5823/

Link to all logs:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5823/artifacts

(Relevant) error snippet from the log:

2018-02-21 14:15:47,576-0500 INFO  (MainThread) [vdsm.api] FINISH prepareForShutdown return=None from=internal, task_id=7d37a33b-0215-40c0-a821-9b94707caca6 (api:52)
2018-02-21 14:15:47,576-0500 ERROR (MainThread) [vds] Exception raised (vdsmd:158)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/vdsmd.py", line 156, in run
    serve_clients(log)
  File "/usr/lib/python2.7/site-packages/vdsm/vdsmd.py", line 103, in serve_clients
    cif = clientIF.getInstance(irs, log, scheduler)
  File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 251, in getInstance
    cls._instance = clientIF(irs, log, scheduler)
  File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 121, in __init__
    self.mom = MomClient(config.get("mom", "socket_path"))
  File "/usr/lib/python2.7/site-packages/vdsm/momIF.py", line 51, in __init__
    raise MomNotAvailableError()
MomNotAvailableError

___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


[CQ]: 87311, 2 (ovirt-host) failed "ovirt-master" system tests, but isn't the failure root cause

2018-02-22 Thread oVirt Jenkins
A system test invoked by the "ovirt-master" change queue including change
87311,2 (ovirt-host) failed. However, this change seems not to be the root
cause for this failure. Change 85421,2 (ovirt-host) that this change depends on
or is based on, was detected as the cause of the testing failures.

This change has been removed from the testing queue. Artifacts built from this
change will not be released until either change 85421,2 (ovirt-host) is fixed
and this change is updated to refer to or rebased on the fixed version, or this
change is modified to no longer depend on it.

For further details about the change see:
https://gerrit.ovirt.org/#/c/87311/2

For further details about the change that seems to be the root cause behind the
testing failures see:
https://gerrit.ovirt.org/#/c/85421/2

For failed test results see:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5827/


Re: [ovirt-devel] [ OST Failure Report ] [ oVirt Master (ovirt-engine-metrics) ] [ 22-02-2018 ] [ 003_00_metrics_bootstrap.metrics_and_log_collector ]

2018-02-22 Thread Yaniv Kaul
On Thu, Feb 22, 2018 at 2:46 PM, Dafna Ron  wrote:

> hi,
>
> We are failing test 003_00_metrics_bootstrap.metrics_and_log_collector
> for the basic suite.
>
> Link and headline of suspected patches:
> ansible: End playbook based on initial validations -
> https://gerrit.ovirt.org/#/c/88062/
>
> Link to Job:
> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5829/
>
> Link to all logs:
> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5829/artifacts
>
> (Relevant) error snippet from the log:
>
> /var/tmp:
> drwxr-x--x. root abrt system_u:object_r:abrt_var_cache_t:s0 abrt
> -rw-------. root root unconfined_u:object_r:user_tmp_t:s0 rpm-tmp.aLitM7
> -rw-------. root root unconfined_u:object_r:user_tmp_t:s0 rpm-tmp.G2r7IM
> -rw-------. root root unconfined_u:object_r:user_tmp_t:s0 rpm-tmp.kVymZE
> -rw-------. root root unconfined_u:object_r:user_tmp_t:s0 rpm-tmp.uPDvvU
> drwx------. root root system_u:object_r:tmp_t:s0   systemd-private-cd49c74726d5463f8d6f6502380e5e12-chronyd.service-i1T5IE
> drwx------. root root system_u:object_r:tmp_t:s0   systemd-private-cd49c74726d5463f8d6f6502380e5e12-systemd-timedated.service-lhoUsS
>
> /var/tmp/abrt:
> -rw-------. root root system_u:object_r:abrt_var_cache_t:s0 last-via-server
>
> /var/tmp/systemd-private-cd49c74726d5463f8d6f6502380e5e12-chronyd.service-i1T5IE:
> drwxrwxrwt. root root system_u:object_r:tmp_t:s0   tmp
>
> /var/tmp/systemd-private-cd49c74726d5463f8d6f6502380e5e12-chronyd.service-i1T5IE/tmp:
>
> /var/tmp/systemd-private-cd49c74726d5463f8d6f6502380e5e12-systemd-timedated.service-lhoUsS:
> drwxrwxrwt. root root system_u:object_r:tmp_t:s0   tmp
>
> /var/tmp/systemd-private-cd49c74726d5463f8d6f6502380e5e12-systemd-timedated.service-lhoUsS/tmp:
>
> /var/yp:
> )
> 2018-02-22 07:24:05::DEBUG::__main__::251::root:: STDERR(/bin/ls: cannot open 
> directory 
> /rhev/data-center/mnt/blockSD/6babba93-09c8-4846-9ccb-07728f72eecb/master/tasks/bd563276-5092-4d28-86c4-63aa6c0b4344.temp:
>  No such file or directory
> )
> 2018-02-22 07:24:05::ERROR::__main__::832::root:: Failed to collect logs 
> from: lago-basic-suite-master-host-0; /bin/ls: cannot open directory 
> /rhev/data-center/mnt/blockSD/6babba93-09c8-4846-9ccb-07728f72eecb/master/tasks/bd563276-5092-4d28-86c4-63aa6c0b4344.temp:
>  No such file or directory
>
>
Is that reproducible? It's a log collector bug anyway, but I assume it's a
race between some task (for example, downloading images from Glance) and
the log collector collecting logs.
Can you open a bug on the log collector?
TIA,
Y.
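
For illustration only, here is a minimal Python sketch of how a directory
listing can tolerate the kind of race described above, where a task directory
(such as the *.temp one in the log) disappears between being discovered and
being opened. This is not the log collector's actual code; list_tree is a
hypothetical helper name, and the only assumption is that a vanished
directory is safe to skip.

```python
import errno
import os


def list_tree(root):
    """Return every path under root, skipping directories that vanish mid-walk."""
    found = []

    def ignore_vanished(err):
        # os.walk passes directory-scan failures to this callback. A directory
        # removed by a concurrent task shows up as ENOENT and is skipped;
        # any other error is still raised.
        if err.errno != errno.ENOENT:
            raise err

    for dirpath, dirnames, filenames in os.walk(root, onerror=ignore_vanished):
        for name in dirnames + filenames:
            found.append(os.path.join(dirpath, name))
    return found
```

The point of the sketch is that only "No such file or directory" is swallowed,
so real permission or I/O problems would still fail the collection.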


>
> ___
> Devel mailing list
> de...@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/devel
>


[CQ]: 84106, 6 (vdsm) failed "ovirt-master" system tests, but isn't the failure root cause

2018-02-22 Thread oVirt Jenkins
A system test invoked by the "ovirt-master" change queue including change
84106,6 (vdsm) failed. However, this change seems not to be the root cause for
this failure. Change 87944,3 (vdsm) that this change depends on or is based on,
was detected as the cause of the testing failures.

This change has been removed from the testing queue. Artifacts built from this
change will not be released until either change 87944,3 (vdsm) is fixed and
this change is updated to refer to or rebased on the fixed version, or this
change is modified to no longer depend on it.

For further details about the change see:
https://gerrit.ovirt.org/#/c/84106/6

For further details about the change that seems to be the root cause behind the
testing failures see:
https://gerrit.ovirt.org/#/c/87944/3

For failed test results see:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5830/


[ OST Failure Report ] [ oVirt Master (ovirt-host) ] [ 22-02-2018 ] [002_bootstrap.add_hosts ]

2018-02-22 Thread Dafna Ron
hi,

We have a failed test for ovirt-host in the upgrade suite.

Link and headline of suspected patches:
Require collectd-virt plugin - https://gerrit.ovirt.org/#/c/87311/

Link to Job:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5827/

Link to all logs:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5827/artifacts

(Relevant) error snippet from the log:

2018-02-22 05:38:47,587-05 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(VdsDeploy) [546df98d] EVENT_ID: VDS_INSTALL_IN_PROGRESS(509),
Installing Host lago-upgrade-from-release-suite-master-host0. Setting
kernel arguments.
2018-02-22 05:38:47,858-05 ERROR
[org.ovirt.engine.core.uutils.ssh.SSHDialog]
(EE-ManagedThreadFactory-engine-Thread-1) [546df98d] Swallowing
exception as preferring stderr
2018-02-22 05:38:47,859-05 ERROR
[org.ovirt.engine.core.uutils.ssh.SSHDialog]
(EE-ManagedThreadFactory-engine-Thread-1) [546df98d] SSH error running
command root@lago-upgrade-from-release-suite-master-host0:'umask 0077;
MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XX)";
trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr
\"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning=no-timestamp -C
"${MYTMP}" -x &&  "${MYTMP}"/ovirt-host-deploy
DIALOG/dialect=str:machine DIALOG/customization=bool:True':
RuntimeException: Unexpected error during execution: bash: line 1:
1395 Segmentation fault  "${MYTMP}"/ovirt-host-deploy
DIALOG/dialect=str:machine DIALOG/customization=bool:True

2018-02-22 05:38:47,859-05 ERROR
[org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy)
[546df98d] Error during deploy dialog
2018-02-22 05:38:47,860-05 ERROR
[org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase]
(EE-ManagedThreadFactory-engine-Thread-1) [546df98d] Error during host
lago-upgrade-from-release-suite-master-host0 install
2018-02-22 05:38:47,864-05 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engine-Thread-1) [546df98d] EVENT_ID:
VDS_INSTALL_IN_PROGRESS_ERROR(511), An error has occurred during
installation of Host lago-upgrade-from-release-suite-master-host0:
Unexpected error during execution: bash: line 1:  1395 Segmentation
fault  "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine
DIALOG/customization=bool:True
.
2018-02-22 05:38:47,864-05 ERROR
[org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase]
(EE-ManagedThreadFactory-engine-Thread-1) [546df98d] Error during host
lago-upgrade-from-release-suite-master-host0 install, preferring first
exception: Unexpected connection termination
2018-02-22 05:38:47,864-05 ERROR
[org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand]
(EE-ManagedThreadFactory-engine-Thread-1) [546df98d] Host installation
failed for host '65c048e5-a97a-422d-88b8-abe1fd925602',
'lago-upgrade-from-release-suite-master-host0': Unexpected connection
termination
2018-02-22 05:38:47,869-05 INFO
[org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
(EE-ManagedThreadFactory-engine-Thread-1) [546df98d] START,
SetVdsStatusVDSCommand(HostName =
lago-upgrade-from-release-suite-master-host0,
SetVdsStatusVDSCommandParameters:{hostId='65c048e5-a97a-422d-88b8-abe1fd925602',
status='InstallFailed', nonOperationalReason='NONE',
stopSpmFailureLogged='false', maintenanceReason='null'}), log id:
1973b09c
2018-02-22 05:38:47,898-05 INFO
[org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
(EE-ManagedThreadFactory-engine-Thread-1) [546df98d] FINISH,
SetVdsStatusVDSCommand, log id: 1973b09c
2018-02-22 05:38:47,911-05 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engine-Thread-1) [546df98d] EVENT_ID:
VDS_INSTALL_FAILED(505), Host
lago-upgrade-from-release-suite-master-host0 installation failed.
Unexpected connection termination.
2018-02-22 05:38:47,920-05 INFO
[org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand]
(EE-ManagedThreadFactory-engine-Thread-1) [546df98d] Lock freed to
object 'EngineLock:{exclusiveLocks='[65c048e5-a97a-422d-88b8-abe1fd925602=VDS]',
sharedLocks=''}'



Re: Change in ovirt-engine[master]: packaging: setup: postgres95: Fixes

2018-02-22 Thread Yedidyah Bar David
On Tue, Feb 20, 2018 at 12:04 PM, Code Review  wrote:
> Jenkins CI posted comments on this change.
>
> View Change
>
> Patch set 9:
>
> Build Failed
>
> http://jenkins.ovirt.org/job/ovirt-engine_master_check-merged-fc27-x86_64/159/
> : FAILURE
>
> http://jenkins.ovirt.org/job/ovirt-engine_master_build-artifacts-el7-x86_64/6668/
> : FAILURE

09:07:32  > git clean -fdx # timeout=10
09:07:36 ERROR: Error fetching remote repo 'origin'
09:07:36 hudson.plugins.git.GitException: Failed to fetch from
git://gerrit.ovirt.org/ovirt-engine.git
...

09:07:36 stderr: warning: failed to remove exported-artifacts/tests:
Permission denied
09:07:36 warning: failed to remove
output/ovirt-engine-4.3.0-0.0.master.20180220072827.git1987ee4.el7.centos.src.rpm:
Permission denied
09:07:36 warning: failed to remove rpmbuild/SPECS/ovirt-engine.spec:
Permission denied
09:07:36 warning: failed to remove
rpmbuild/SOURCES/ovirt-engine-4.3.0_master.tar.gz: Permission denied
09:07:36 warning: failed to remove
rpmbuild/BUILD/ovirt-engine-4.3.0/.gitignore: Permission denied
09:07:36 warning: failed to remove
rpmbuild/BUILD/ovirt-engine-4.3.0/.gitreview: Permission denied
...

Known issue? Bad permissions in some slave(s)?

>
> http://jenkins.ovirt.org/job/ovirt-engine_master_build-artifacts-fcraw-x86_64/110/
> : SUCCESS
>
> http://jenkins.ovirt.org/job/ovirt-engine_master_check-merged-el7-x86_64/7564/
> : SUCCESS
>
> http://jenkins.ovirt.org/job/ovirt-engine_master_build-artifacts-fc27-x86_64/159/
> : SUCCESS
>
> http://jenkins.ovirt.org/job/ovirt-engine_master_check-merged-fcraw-x86_64/111/
> : SUCCESS
>
> http://jenkins.ovirt.org/job/standard-enqueue/10362/ :
> This change was successfully submitted to the change queue(s) for system
> testing.
>
> To view, visit change 85543. To unsubscribe, visit settings.
>
> Gerrit-Project: ovirt-engine
> Gerrit-Branch: master
> Gerrit-MessageType: comment
> Gerrit-Change-Id: I5b4430bfe79df9a4c0e0ffc1696c821e20584eb3
> Gerrit-Change-Number: 85543
> Gerrit-PatchSet: 9
> Gerrit-Owner: Yedidyah Bar David 
> Gerrit-Reviewer: Ido Rosenzwig 
> Gerrit-Reviewer: Jenkins CI
> Gerrit-Reviewer: Lev Veyde 
> Gerrit-Reviewer: Rafael Martins 
> Gerrit-Reviewer: Sandro Bonazzola 
> Gerrit-Reviewer: Simone Tiraboschi 
> Gerrit-Reviewer: Yedidyah Bar David 
> Gerrit-Reviewer: gerrit-hooks 
> Gerrit-Comment-Date: Tue, 20 Feb 2018 10:04:20 +0000
> Gerrit-HasComments: No



-- 
Didi


[ OST Failure Report ] [ oVirt Master (ovirt-engine-metrics) ] [ 22-02-2018 ] [ 003_00_metrics_bootstrap.metrics_and_log_collector ]

2018-02-22 Thread Dafna Ron
hi,

We are failing test 003_00_metrics_bootstrap.metrics_and_log_collector for
the basic suite.

Link and headline of suspected patches:
ansible: End playbook based on initial validations -
https://gerrit.ovirt.org/#/c/88062/

Link to Job:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5829/

Link to all logs:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5829/artifacts

(Relevant) error snippet from the log:

/var/tmp:
drwxr-x--x. root abrt system_u:object_r:abrt_var_cache_t:s0 abrt
-rw-------. root root unconfined_u:object_r:user_tmp_t:s0 rpm-tmp.aLitM7
-rw-------. root root unconfined_u:object_r:user_tmp_t:s0 rpm-tmp.G2r7IM
-rw-------. root root unconfined_u:object_r:user_tmp_t:s0 rpm-tmp.kVymZE
-rw-------. root root unconfined_u:object_r:user_tmp_t:s0 rpm-tmp.uPDvvU
drwx------. root root system_u:object_r:tmp_t:s0   systemd-private-cd49c74726d5463f8d6f6502380e5e12-chronyd.service-i1T5IE
drwx------. root root system_u:object_r:tmp_t:s0   systemd-private-cd49c74726d5463f8d6f6502380e5e12-systemd-timedated.service-lhoUsS

/var/tmp/abrt:
-rw-------. root root system_u:object_r:abrt_var_cache_t:s0 last-via-server

/var/tmp/systemd-private-cd49c74726d5463f8d6f6502380e5e12-chronyd.service-i1T5IE:
drwxrwxrwt. root root system_u:object_r:tmp_t:s0   tmp

/var/tmp/systemd-private-cd49c74726d5463f8d6f6502380e5e12-chronyd.service-i1T5IE/tmp:

/var/tmp/systemd-private-cd49c74726d5463f8d6f6502380e5e12-systemd-timedated.service-lhoUsS:
drwxrwxrwt. root root system_u:object_r:tmp_t:s0   tmp

/var/tmp/systemd-private-cd49c74726d5463f8d6f6502380e5e12-systemd-timedated.service-lhoUsS/tmp:

/var/yp:
)
2018-02-22 07:24:05::DEBUG::__main__::251::root:: STDERR(/bin/ls:
cannot open directory
/rhev/data-center/mnt/blockSD/6babba93-09c8-4846-9ccb-07728f72eecb/master/tasks/bd563276-5092-4d28-86c4-63aa6c0b4344.temp:
No such file or directory
)
2018-02-22 07:24:05::ERROR::__main__::832::root:: Failed to collect
logs from: lago-basic-suite-master-host-0; /bin/ls: cannot open
directory 
/rhev/data-center/mnt/blockSD/6babba93-09c8-4846-9ccb-07728f72eecb/master/tasks/bd563276-5092-4d28-86c4-63aa6c0b4344.temp:
No such file or directory




[CQ]: 88062, 6 (ovirt-engine-metrics) failed "ovirt-master" system tests

2018-02-22 Thread oVirt Jenkins
Change 88062,6 (ovirt-engine-metrics) is probably the reason behind recent
system test failures in the "ovirt-master" change queue and needs to be fixed.

This change has been removed from the testing queue. Artifacts built from this
change will not be released until it is fixed.

For further details about the change see:
https://gerrit.ovirt.org/#/c/88062/6

For failed test results see:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5829/


[CQ]: 87861, 7 (vdsm) failed "ovirt-master" system tests, but isn't the failure root cause

2018-02-22 Thread oVirt Jenkins
A system test invoked by the "ovirt-master" change queue including change
87861,7 (vdsm) failed. However, this change seems not to be the root cause for
this failure. Change 87944,3 (vdsm) that this change depends on or is based on,
was detected as the cause of the testing failures.

This change has been removed from the testing queue. Artifacts built from this
change will not be released until either change 87944,3 (vdsm) is fixed and
this change is updated to refer to or rebased on the fixed version, or this
change is modified to no longer depend on it.

For further details about the change see:
https://gerrit.ovirt.org/#/c/87861/7

For further details about the change that seems to be the root cause behind the
testing failures see:
https://gerrit.ovirt.org/#/c/87944/3

For failed test results see:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5837/


[oVirt Jenkins] ovirt-system-tests_he-basic-suite-4.1 - Build # 194 - Fixed!

2018-02-22 Thread jenkins
Project: http://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-4.1/ 
Build: http://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-4.1/194/
Build Number: 194
Build Status:  Fixed
Triggered By: Started by timer

-
Changes Since Last Success:
-
Changes for Build #193
[Eyal Edri] 4.2: replace python-ioprocess with python2-ioprocess

[Daniel Belenky] Add production pipeline jobs for Jenkins project

[Daniel Belenky] Add data normalizer param to nested config

[Eyal Edri] remove vdsm 3.6 build-artifacts

[Barak Korren] Fix '.git' bug in `pushe.py`

[Barak Korren] Make OST timed jobs congifure Git user and Email


Changes for Build #194
[Your Name] networking: Introducing mac pools and overlap range usage tests

[Barak Korren] Install/Update mock from global_setup.sh

[Sandro Bonazzola] ovirt-image-uploader: drop master jobs




-
Failed Tests:
-
All tests passed


Re: [ovirt-devel] [ OST Failure Report ] [ oVirt Master (vdsm) ] [ 22-02-2018 ] [ 002_bootstrap.verify_add_hosts + 002_bootstrap.add_hosts ]

2018-02-22 Thread Dafna Ron
Thanks Dan!


On Thu, Feb 22, 2018 at 7:15 PM, Dan Kenigsberg  wrote:

> On Thu, Feb 22, 2018 at 12:59 PM, Dafna Ron  wrote:
> > Hi,
> >
> > We had two failed tests reported in vdsm project last evening  the patch
> > reported seems to be related to the issue.
> >
> >
> > Link and headline of suspected patches:
> >
> > momIF: change the way we connect to MOM -
> > https://gerrit.ovirt.org/#/c/87944/
> >
> >
> > Link to Job:
> >
> > http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5823/
> >
> > Link to all logs:
> >
> > http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5823/artifacts
> >
> > (Relevant) error snippet from the log:
> >
> > 
> >
> >
> >
> > 2018-02-21 14:15:47,576-0500 INFO  (MainThread) [vdsm.api] FINISH
> > prepareForShutdown return=None from=internal,
> > task_id=7d37a33b-0215-40c0-a821-9b94707caca6 (api:52)
> > 2018-02-21 14:15:47,576-0500 ERROR (MainThread) [vds] Exception raised
> > (vdsmd:158)
> > Traceback (most recent call last):
> >   File "/usr/lib/python2.7/site-packages/vdsm/vdsmd.py", line 156, in
> run
> > serve_clients(log)
> >   File "/usr/lib/python2.7/site-packages/vdsm/vdsmd.py", line 103, in
> > serve_clients
> > cif = clientIF.getInstance(irs, log, scheduler)
> >   File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 251, in
> > getInstance
> > cls._instance = clientIF(irs, log, scheduler)
> >   File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 121, in
> > __init__
> > self.mom = MomClient(config.get("mom", "socket_path"))
> >   File "/usr/lib/python2.7/site-packages/vdsm/momIF.py", line 51, in
> > __init__
> > raise MomNotAvailableError()
> > MomNotAvailableError
> >
> > 
> >
>
> this smells like a race between mom and vdsm startups (that
> bidirectional dependency is wonderful!). I am sure that Francesco can
> fix it quickly, but until then I've posted a revert of the offending
> patch
> http://jenkins.ovirt.org/job/ovirt-system-tests_manual/2225/
>


[JIRA] (OVIRT-1909) GetBadges is not working anymore

2018-02-22 Thread sbonazzo (oVirt JIRA)
sbonazzo created OVIRT-1909:
---

 Summary: GetBadges is not working anymore
 Key: OVIRT-1909
 URL: https://ovirt-jira.atlassian.net/browse/OVIRT-1909
 Project: oVirt - virtualization made easy
  Issue Type: By-EMAIL
Reporter: sbonazzo
Assignee: infra


Not sure what happened 3 months ago but GetBadges is not receiving data
anymore.



--
This message was sent by Atlassian Jira
(v1001.0.0-SNAPSHOT#100080)


The importance of fixing failed build-artifacts jobs

2018-02-22 Thread Dafna Ron
Hi All,

We have been seeing a large number of changes that are not deployed into
tested lately because of failed build-artifacts jobs, so we decided to
explain the importance of fixing a failed build-artifacts job.

If a change fails a build-artifacts job, no matter which platform/arch it
failed on, the change will not be deployed to tested.

Here is an example of a change that will not be added to tested:

[image: Inline image 1]

As you can see, only one of the build-artifacts jobs failed, but since the
project specifies that it requires all of these arches/platforms, the change
will not be added to tested until all of the jobs are fixed.
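
To make the rule concrete, here is a small Python sketch of the "all jobs
must pass" gate described above. It is only an illustration of the rule, not
the change queue's real code; deployable_to_tested and the job names in the
example are made up for this purpose.

```python
def deployable_to_tested(job_results):
    """Decide whether a change can be deployed to tested.

    job_results maps each required build-artifacts job name to its result
    string, for example "SUCCESS" or "FAILURE".
    Returns (ok, failed_jobs): ok is True only when no job failed.
    """
    failed = sorted(
        name for name, status in job_results.items() if status != "SUCCESS"
    )
    return (not failed, failed)


# Illustrative job names only: one failing platform blocks the whole change.
results = {
    "build-artifacts-el7-x86_64": "SUCCESS",
    "build-artifacts-fc27-x86_64": "FAILURE",
}
ok, failed = deployable_to_tested(results)
print(ok, failed)
```

A single failed platform is enough to hold the change back, which is why
removing a platform you do not yet support is better than leaving it broken.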

So what can we do?

1. Add the code which builds artifacts to 'check-patch' so you'll get a -1
if a build fails (assuming you will not merge with a -1 from CI).
2. Post merge, look for emails about failed artifacts on your change (you
will have to fix the job and then re-trigger the change).
3. You can see all currently failing build-artifacts jobs in Jenkins under the
'unstable critical' view [1], so you will know if your project is being
deployed.
4. Remove the broken OS from your project (either from Jenkins or from
your automation dir if you're using V2). Ask us for help! This should be
an easy patch.
5. Don't add new OS builds until you're absolutely sure they work (you can
add check-patch to keep testing them, but don't add build-artifacts until it's
stable).

Please contact me or anyone else from the CI team for assistance or
questions; we will be happy to help.

[1] http://jenkins.ovirt.org/

Thank you,

Dafna


[CQ]: 87909, 3 (vdsm) failed "ovirt-master" system tests, but isn't the failure root cause

2018-02-22 Thread oVirt Jenkins
A system test invoked by the "ovirt-master" change queue including change
87909,3 (vdsm) failed. However, this change seems not to be the root cause for
this failure. Change 87944,3 (vdsm) that this change depends on or is based on,
was detected as the cause of the testing failures.

This change has been removed from the testing queue. Artifacts built from this
change will not be released until either change 87944,3 (vdsm) is fixed and
this change is updated to refer to or rebased on the fixed version, or this
change is modified to no longer depend on it.

For further details about the change see:
https://gerrit.ovirt.org/#/c/87909/3

For further details about the change that seems to be the root cause behind the
testing failures see:
https://gerrit.ovirt.org/#/c/87944/3

For failed test results see:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5838/


Re: [ovirt-devel] [ OST Failure Report ] [ oVirt Master (vdsm) ] [ 22-02-2018 ] [ 002_bootstrap.verify_add_hosts + 002_bootstrap.add_hosts ]

2018-02-22 Thread Dan Kenigsberg
On Thu, Feb 22, 2018 at 12:59 PM, Dafna Ron  wrote:
> Hi,
>
> We had two failed tests reported in vdsm project last evening  the patch
> reported seems to be related to the issue.
>
>
> Link and headline of suspected patches:
>
> momIF: change the way we connect to MOM -
> https://gerrit.ovirt.org/#/c/87944/
>
>
> Link to Job:
>
> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5823/
>
> Link to all logs:
>
> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5823/artifacts
>
> (Relevant) error snippet from the log:
>
> 
>
>
>
> 2018-02-21 14:15:47,576-0500 INFO  (MainThread) [vdsm.api] FINISH
> prepareForShutdown return=None from=internal,
> task_id=7d37a33b-0215-40c0-a821-9b94707caca6 (api:52)
> 2018-02-21 14:15:47,576-0500 ERROR (MainThread) [vds] Exception raised
> (vdsmd:158)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/vdsmd.py", line 156, in run
> serve_clients(log)
>   File "/usr/lib/python2.7/site-packages/vdsm/vdsmd.py", line 103, in
> serve_clients
> cif = clientIF.getInstance(irs, log, scheduler)
>   File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 251, in
> getInstance
> cls._instance = clientIF(irs, log, scheduler)
>   File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 121, in
> __init__
> self.mom = MomClient(config.get("mom", "socket_path"))
>   File "/usr/lib/python2.7/site-packages/vdsm/momIF.py", line 51, in
> __init__
> raise MomNotAvailableError()
> MomNotAvailableError
>
> 
>

this smells like a race between mom and vdsm startups (that
bidirectional dependency is wonderful!). I am sure that Francesco can
fix it quickly, but until then I've posted a revert of the offending
patch
http://jenkins.ovirt.org/job/ovirt-system-tests_manual/2225/
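
For readers unfamiliar with this kind of startup race: a common way to
tolerate a peer service that is not up yet is to retry the connection a few
times before giving up. The Python sketch below shows that general pattern
only. It is not the vdsm or MOM code and not necessarily how the issue will
be fixed; NotAvailable and connect_with_retry are hypothetical names, with
NotAvailable standing in for an error like MomNotAvailableError.

```python
import time


class NotAvailable(Exception):
    """Stand-in for an error such as MomNotAvailableError (hypothetical)."""


def connect_with_retry(connect, attempts=5, delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    connect is any callable that returns a client object or raises
    NotAvailable while the peer service is still starting.
    """
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except NotAvailable:
            if attempt == attempts:
                # Out of attempts: let the caller see the original error.
                raise
            sleep(delay)
```

The sleep parameter is injectable so the retry loop can be exercised in tests
without real waiting.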


Re: [ovirt-devel] The importance of fixing failed build-artifacts jobs

2018-02-22 Thread Yaniv Kaul
I think there's a rush to add FC27 and S390 (unrelated?) to the build. If
either fails right now, I don't think we should be too concerned.
In the very near future we should be, though.
Y.

On Thu, Feb 22, 2018 at 9:06 PM, Dafna Ron  wrote:

> Hi All,
>
> We have been seeing a large number of changes that are not deployed into
> tested lately because of failed build-artifacts jobs, so we decided to
> explain the importance of fixing a failed build-artifacts job.
>
> If a change failed a build-artifacts job, no matter what platform/arch it
> failed in, the change will not be deployed to tested.
>
> Here is an example of a change that will not be added to tested:
>
> [image: Inline image 1]
>
> As you can see, only one of the build-artifacts jobs failed, but since the
> project specifies that it requires all of these arches/platforms, the change
> will not be added to tested until all of the jobs are fixed.
>
> So what can we do?
>
> 1. Add the code which builds artifacts to 'check-patch' so you'll get a -1
> if a build fails (assuming you will not merge with a -1 from CI).
> 2. Post merge, look for emails about failed artifacts on your change (you
> will have to fix the job and then re-trigger the change).
> 3. You can see all currently failing build-artifacts jobs in Jenkins under the
> 'unstable critical' view [1], so you will know if your project is being
> deployed.
> 4. Remove the broken OS from your project (either from Jenkins or from
> your automation dir if you're using V2). Ask us for help! This should be
> an easy patch.
> 5. Don't add new OS builds until you're absolutely sure they work (you can
> add check-patch to keep testing them, but don't add build-artifacts until it's
> stable).
>
> Please contact me or anyone else from the CI team for assistance or
> questions; we will be happy to help.
>
> [1] http://jenkins.ovirt.org/
>
> Thank you,
>
> Dafna


[CQ]: 87428, 2 (vdsm) failed "ovirt-master" system tests, but isn't the failure root cause

2018-02-22 Thread oVirt Jenkins
A system test invoked by the "ovirt-master" change queue including change
87428,2 (vdsm) failed. However, this change seems not to be the root cause for
this failure. Change 87944,3 (vdsm) that this change depends on or is based on,
was detected as the cause of the testing failures.

This change has been removed from the testing queue. Artifacts built from this
change will not be released until either change 87944,3 (vdsm) is fixed and
this change is updated to refer to or rebased on the fixed version, or this
change is modified to no longer depend on it.

For further details about the change see:
https://gerrit.ovirt.org/#/c/87428/2

For further details about the change that seems to be the root cause behind the
testing failures see:
https://gerrit.ovirt.org/#/c/87944/3

For failed test results see:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5841/


Is CI broken on 3.6 branch?

2018-02-22 Thread Ala Hino
Hi,

I have a patch on the 3.6 branch and CI seems to be broken:

CI: http://jenkins.ovirt.org/job/vdsm_3.6_check-patch-el7-x86_64/292/
Patch: https://gerrit.ovirt.org/88045

Another CI run was triggered yesterday by Francesco but has not completed yet.

Please advise.


Re: [oVirt Jenkins] ovirt-system-tests_he-basic-ansible-suite-master - Build # 47 - Failure!

2018-02-22 Thread Sandro Bonazzola
2018-02-23 3:29 GMT+01:00 :

> Project: http://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-ansible-suite-master/
> Build: http://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-ansible-suite-master/47/



This fails on:

*03:28:49* [ INFO  ] TASK [Wait for the engine to come up on the target VM]

*03:29:11* [ ERROR ] fatal: [localhost]: FAILED! => {"msg": "The
conditional check 'health_result.rc == 0 and
health_result.stdout|from_json|json_query('*.\"engine-status\".\"health\"')|first==\"good\"'
failed. The error was: error while evaluating conditional
(health_result.rc == 0 and
health_result.stdout|from_json|json_query('*.\"engine-status\".\"health\"')|first==\"good\"):
No first item, sequence was empty."}

*03:29:11* [ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
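
The "No first item, sequence was empty" part means the query for
"engine-status"."health" matched nothing, so taking the first element failed
instead of evaluating to false. As a rough illustration, the same check in
Python with a guard for the empty case could look like the sketch below. The
JSON layout is assumed purely from the query expression in the error (top-level
entries that may contain an "engine-status" dict with a "health" key), and
engine_health is a hypothetical helper, not part of hosted-engine or the
playbook.

```python
import json


def engine_health(vm_status_json):
    """Return the first reported engine health, or None if none is reported yet."""
    status = json.loads(vm_status_json)
    healths = []
    for host in status.values():
        # Skip top-level entries that are not per-host dicts.
        engine_status = host.get("engine-status") if isinstance(host, dict) else None
        if isinstance(engine_status, dict) and "health" in engine_status:
            healths.append(engine_status["health"])
    # An empty list means "not ready yet", which a retry loop can treat as
    # a failed condition rather than as an error.
    return healths[0] if healths else None
```

With a guard like this, a retry loop would simply keep waiting while no host
reports its health, and only a reported value other than "good" would count
against the engine.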


Messages also show some errors, probably unrelated to this one, in a vdsm
hook:

Feb 22 21:28:48 lago-he-basic-ansible-suite-master-host0 systemd:
Started oVirt Hosted Engine High Availability Monitoring Agent.
Feb 22 21:28:48 lago-he-basic-ansible-suite-master-host0 systemd:
Starting oVirt Hosted Engine High Availability Monitoring Agent...
Feb 22 21:28:49 lago-he-basic-ansible-suite-master-host0 python:
ansible-command Invoked with warn=True executable=None
_uses_shell=False _raw_params=hosted-engine --vm-status --json
removes=None creates=None chdir=None stdin=None
Feb 22 21:28:50 lago-he-basic-ansible-suite-master-host0 python:
detected unhandled Python exception in
'/usr/libexec/vdsm/hooks/openstacknet-get-config'
Feb 22 21:28:50 lago-he-basic-ansible-suite-master-host0 abrt-server:
Duplicate: core backtrace
Feb 22 21:28:50 lago-he-basic-ansible-suite-master-host0 abrt-server:
DUP_OF_DIR: /var/tmp/abrt/Python-2018-02-22-21:22:17-5163
Feb 22 21:28:50 lago-he-basic-ansible-suite-master-host0 abrt-server:
Deleting problem directory Python-2018-02-22-21:28:50-8927 (dup of
Python-2018-02-22-21:22:17-5163)
Feb 22 21:28:54 lago-he-basic-ansible-suite-master-host0 python:
ansible-command Invoked with warn=True executable=None
_uses_shell=False _raw_params=hosted-engine --vm-status --json
removes=None creates=None chdir=None stdin=None
Feb 22 21:29:00 lago-he-basic-ansible-suite-master-host0 python:
ansible-command Invoked with warn=True executable=None
_uses_shell=False _raw_params=hosted-engine --vm-status --json
removes=None creates=None chdir=None stdin=None
Feb 22 21:29:00 lago-he-basic-ansible-suite-master-host0 python:
detected unhandled Python exception in
'/usr/libexec/vdsm/hooks/openstacknet-get-config'
Feb 22 21:29:00 lago-he-basic-ansible-suite-master-host0 abrt-server:
Not saving repeating crash in
'/usr/libexec/vdsm/hooks/openstacknet-get-config'
Feb 22 21:29:05 lago-he-basic-ansible-suite-master-host0 python:
ansible-command Invoked with warn=True executable=None
_uses_shell=False _raw_params=hosted-engine --vm-status --json
removes=None creates=None chdir=None stdin=None
Feb 22 21:29:08 lago-he-basic-ansible-suite-master-host0 vdsm[3767]:
WARN Worker blocked:  at
0x3cdf4d0> timeout=15, duration=15 at 0x3cc9410> task#=84 at
0x3cc90d0>, traceback:
File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
  self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
  self.run()
File: "/usr/lib64/python2.7/threading.py", line 765, in run
  self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 194, in run
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
  self._execute_task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
  task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
  self._callable()
File: "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 232, in __call__
  self._func()
File: "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 578, in __call__
  stats = hostapi.get_stats(self._cif, self._samples.stats())
File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 77, in get_stats
  ret['haStats'] = _getHaInfo()
File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 182, in _getHaInfo
  stats = instance.get_all_stats()
File: "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 92, in get_all_stats
  stats = broker.get_stats_from_storage()
File: "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 135, in get_stats_from_storage
  result = self._proxy.get_stats()
File: "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
  return self.__send(self.__name, args)
File: "/usr/lib64/python2.7/xmlrpclib.py", line 1587, in __request
  verbose=self.__verbose
File: "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request

Re: Is CI broken on 3.6 branch?

2018-02-22 Thread Sandro Bonazzola
2018-02-23 6:58 GMT+01:00 Ala Hino :

> Hi,
>
> I have a patch on 3.6 branch and CI seems to be broken:
>
> CI: http://jenkins.ovirt.org/job/vdsm_3.6_check-patch-el7-x86_64/292/
> Patch: https://gerrit.ovirt.org/88045
>
> Another CI run was triggered yesterday by Francesco but has not completed yet.
>

This is failing on:

*17:09:47* 
http://mirror.centos.org/centos/7/virt/x86_64/ovirt-3.6/repodata/repomd.xml:
[Errno 14] HTTP Error 404 - Not Found


For oVirt 3.6 you can't use CentOS 7.4. According to
https://ovirt.org/documentation/install-guide/chap-Introduction_to_Hypervisor_Hosts/#host-compatibility-matrix
the latest supported release is CentOS 7.2.
You need to change the automation configuration to consume
http://vault.centos.org/7.2.1511/virt/x86_64/ovirt-3.6/ instead of
http://mirror.centos.org/centos/7/virt/x86_64/ovirt-3.6
since 7.2 moved to Vault once 7.4 was released in August 2017. On a
side note, since oVirt 3.6 has been EOL since 4.0 GA, I can't guarantee
this will work at all.
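
As a rough illustration of the change suggested above, the following sketch rewrites the mirror base URL to the Vault one in a piece of repository configuration text. The two base URLs and the `repodata/repomd.xml` path come from this thread; the helper names and the sample configuration line are hypothetical and do not reflect the actual layout of the vdsm automation files.

```python
# Base URLs quoted in this thread.
MIRROR_BASEURL = "http://mirror.centos.org/centos/7/virt/x86_64/ovirt-3.6"
VAULT_BASEURL = "http://vault.centos.org/7.2.1511/virt/x86_64/ovirt-3.6/"


def repomd_url(baseurl):
    """Return the repository metadata URL (the one that returned 404 above)."""
    return baseurl.rstrip("/") + "/repodata/repomd.xml"


def switch_to_vault(config_text):
    """Point every reference to the mirror repo at the Vault repo instead.

    Trailing slashes are stripped from both URLs first, so a line keeps
    whatever trailing slash (or lack of one) it had originally.
    """
    mirror = MIRROR_BASEURL.rstrip("/")
    vault = VAULT_BASEURL.rstrip("/")
    return config_text.replace(mirror, vault)


# Hypothetical repo definition line, for illustration only.
SAMPLE_LINE = "centos-ovirt36," + MIRROR_BASEURL + "/"
UPDATED_LINE = switch_to_vault(SAMPLE_LINE)
```

Fetching `repomd_url(VAULT_BASEURL)` by hand before changing the job configuration is a cheap way to confirm that the Vault repository still serves its metadata.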




>
> Please advise.
>


-- 

SANDRO BONAZZOLA

ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R

Red Hat EMEA 

TRIED. TESTED. TRUSTED. 


[oVirt Jenkins] ovirt-system-tests_he-basic-ansible-suite-master - Build # 47 - Failure!

2018-02-22 Thread jenkins
Project: 
http://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-ansible-suite-master/ 
Build: 
http://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-ansible-suite-master/47/
Build Number: 47
Build Status:  Failure
Triggered By: Started by timer

-
Changes Since Last Success:
-
Changes for Build #47
[Your Name] networking: Introducing mac pools and overlap range usage tests

[Barak Korren] Install/Update mock from global_setup.sh

[Sandro Bonazzola] ovirt-image-uploader: drop master jobs




-
Failed Tests:
-
No tests ran.


[oVirt Jenkins] ovirt-system-tests_performance-suite-master - Build # 107 - Fixed!

2018-02-22 Thread jenkins
Project: 
http://jenkins.ovirt.org/job/ovirt-system-tests_performance-suite-master/ 
Build: 
http://jenkins.ovirt.org/job/ovirt-system-tests_performance-suite-master/107/
Build Number: 107
Build Status:  Fixed
Triggered By: Started by timer

-
Changes Since Last Success:
-
Changes for Build #105
[Gal Ben Haim] master: Update reposync-config

[Daniel Belenky] std enqueue: Add env.GERRIT_TRIGGER_CI_VOTE_LABEL

[Daniel Belenky] cleanup: Fix race when unmounting dirs in chroot

[Daniel Belenky] Add a tool to check if a change is merged

[Sandro Bonazzola] ovirt-provider-ovn: drop manual jobs

[Dafna Ron] jenkins: adding infra@ovirt.org to the alert list

[Barak Korren] Run `pusher.py` on non-Gerrit events

[Daniel Belenky] Utilize stdci_dsl api in standard-stage


Changes for Build #106
[Eyal Edri] 4.2: replace python-ioprocess with python2-ioprocess

[Daniel Belenky] Add production pipeline jobs for Jenkins project

[Daniel Belenky] Add data normalizer param to nested config

[Eyal Edri] remove vdsm 3.6 build-artifacts

[Barak Korren] Fix '.git' bug in `pushe.py`

[Barak Korren] Make OST timed jobs congifure Git user and Email


Changes for Build #107
[Your Name] networking: Introducing mac pools and overlap range usage tests

[Barak Korren] Install/Update mock from global_setup.sh

[Sandro Bonazzola] ovirt-image-uploader: drop master jobs




-
Failed Tests:
-
All tests passed


[oVirt Jenkins] ovirt-appliance_master_build-artifacts-el7-x86_64 - Build # 713 - Failure!

2018-02-22 Thread jenkins
Project: 
http://jenkins.ovirt.org/job/ovirt-appliance_master_build-artifacts-el7-x86_64/ 
Build: 
http://jenkins.ovirt.org/job/ovirt-appliance_master_build-artifacts-el7-x86_64/713/
Build Number: 713
Build Status:  Failure
Triggered By: Started by timer

-
Changes Since Last Success:
-
Changes for Build #713
[dfodor] Added Dafna Hirschfeld to jenkins whitelist

[Barak Korren] Make OST timed jobs congifure Git user and Email

[Barak Korren] Install/Update mock from global_setup.sh

[Sandro Bonazzola] ovirt-image-uploader: drop master jobs

[Yuval Turgeman] appliance-report: try to mount /var




-
Failed Tests:
-
No tests ran.