Re: [ovirt-devel] [VDSM] stuck tests in ci

2016-05-31 Thread Piotr Kliczewski
All,

I just noticed one more build [1] which got stuck with:

15:46:40 Traceback (most recent call last):
15:46:40   File "/usr/lib64/python2.7/threading.py", line 804, in __bootstrap_inner
15:46:40   File "/usr/lib64/python2.7/threading.py", line 757, in run
15:46:40   File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 181, in _communicate
15:46:40 : 'NoneType' object has no attribute 'close'

Thanks,
Piotr

[1] http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc23-x86_64/2380/console

On Sat, May 21, 2016 at 8:28 PM, Nir Soffer  wrote:
> The issue is a non-daemon thread blocking the Python process during
> shutdown of the tests.
>
> The current ioprocess does not create such a thread, but we still see this
> issue today:
> http://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/1841/console
>
> If the builds are using the latest ioprocess build (0.16.0-1), built after
> Sun May 15 21:29:24 2016 +0300, this is probably not related to ioprocess.
>
> To understand this issue we need to get a stack trace from the stuck Python
> process.
>
> See the relevant log below.
>
>
> Nir
>
> 
>
> 11:49:30 
> 
> 11:49:30 TOTAL  40672  21072  48%
> 11:49:30 
> --
> 11:49:30 Ran 2182 tests in 147.661s
> 11:49:30
> 11:49:30 OK (SKIP=94)
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'udev_unref'" in  object at 0x7312610>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x5435350>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x5420850>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x7269910>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x5420f90>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x5419c10>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x610e610>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x6dc4d50>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x610f390>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x54195d0>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x5432510>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x6110250>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x5414750>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x5414150>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x6a33390>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x543d590>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x6ba8390>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x5432d50>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x5435a90>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x503f350>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x5420250>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x5442e90>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x503f950>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x610e7d0>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x5414d90>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x543d450>> ignored
> 11:49:30 Exception in thread ioprocess communication (8008) (most
> likely raised during interpreter shutdown):
> 11:49:30 Traceback (most recent call last):
> 11:49:30   File "/usr/lib64/python2.7/threading.py", line 811, in
> __bootstrap_inner
> 11:49:30   File "/usr/lib64/python2.7/threading.py", line 764, in run
> 11:49:30   File
> 

Re: [ovirt-devel] [VDSM] stuck tests in ci

2016-05-20 Thread Piotr Kliczewski
Eyal,

This was an ioprocess issue that still occurred after the fix was provided.
I haven't seen it since build #1389.

Thanks,
Piotr

On Thu, May 19, 2016 at 3:00 PM, Eyal Edri  wrote:
> Was that resolved?
> Was it an infra issue, or a problem with the tests?
>
> On Mon, May 16, 2016 at 3:27 PM, Piotr Kliczewski
>  wrote:
>>
>> and one more:
>>
>>
>> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc23-x86_64/1389/console
>>
>> On Mon, May 16, 2016 at 1:46 PM, Piotr Kliczewski
>>  wrote:
>> > One more occurrence of the issue [1]
>> >
>> >
>> > [1]
>> > http://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/1359/console
>> >
>> > On Sun, May 15, 2016 at 8:37 PM, Nir Soffer  wrote:
>> >> The ioprocess issue is fixed in https://gerrit.ovirt.org/57473
>> >>
>> >> It will be merged soon and be available via ovirt-release-master.
>> >>
>> >> Nir
>> >>
>> >> On Sun, May 15, 2016 at 7:45 PM, Nir Soffer  wrote:
>> >>> Hi all,
>> >>>
>> >>> I found another stuck build today:
>> >>>
>> >>> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc23-x86_64/1151/console
>> >>>
>> >>> 11:27:18
>> >>> 
>> >>> 11:27:18 TOTAL  40513  21121  48%
>> >>> 11:27:18
>> >>> --
>> >>> 11:27:18 Ran 2169 tests in 145.934s
>> >>> 11:27:18
>> >>> 11:27:18 OK (SKIP=88)
>> >>> 11:27:18 Exception AttributeError: "'NoneType' object has no attribute
>> >>> 'write'" in > >>> object at 0x7fd7c9f2d3d0>> ignored
>> >>> [...]
>> >>> 11:27:18 Exception AttributeError: "'NoneType' object has no attribute
>> >>> 'write'" in > >>> object at 0x7fd7c9f15550>> ignored
>> >>> 11:27:18 Exception in thread ioprocess communication (6533) (most
>> >>> likely raised during interpreter shutdown):
>> >>> 11:27:18 Traceback (most recent call last):
>> >>> 11:27:18   File "/usr/lib64/python2.7/threading.py", line 804, in
>> >>> __bootstrap_inner
>> >>> 11:27:18   File "/usr/lib64/python2.7/threading.py", line 757, in run
>> >>> 11:27:18   File
>> >>> "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 180, in
>> >>> _communicate
>> >>> 11:27:18 : 'NoneType' object has no
>> >>> attribute 'close'
>> >>>
>> >>> This smells like a non-daemon thread started by some code,
>> >>> blocking the test process.
>> >>>
>> >>> I suspect ioprocess is starting such a thread; I am looking into it.
>> >>>
>> >>> Meanwhile, please:
>> >>> - verify that all threads in actual code and in the tests are daemon
>> >>> threads
>> >>> - convert your threads to use vdsm.concurrent.thread instead of
>> >>> threading.Thread (daemon by default)
>> >>> - watch your builds and abort stuck builds
>> >>>
>> >>> David, we need a timeout in the ci, aborting the job after a project
>> >>> based timeout, maybe
>> >>> defined in the project yaml.
>> >>>
>> >>> Cheers,
>> >>> Nir
>> >> ___
>> >> Devel mailing list
>> >> Devel@ovirt.org
>> >> http://lists.ovirt.org/mailman/listinfo/devel
>>
>>
>
>
>
> --
> Eyal Edri
> Associate Manager
> RHEV DevOps
> EMEA ENG Virtualization R
> Red Hat Israel
>
> phone: +972-9-7692018
> irc: eedri (on #tlv #rhev-dev #rhev-integ)


Re: [ovirt-devel] [VDSM] stuck tests in ci

2016-05-19 Thread Eyal Edri
Was that resolved?
Was it an infra issue, or a problem with the tests?

On Mon, May 16, 2016 at 3:27 PM, Piotr Kliczewski <
piotr.kliczew...@gmail.com> wrote:

> and one more:
>
>
> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc23-x86_64/1389/console
>
> On Mon, May 16, 2016 at 1:46 PM, Piotr Kliczewski
>  wrote:
> > One more occurrence of the issue [1]
> >
> >
> > [1]
> http://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/1359/console
> >
> > On Sun, May 15, 2016 at 8:37 PM, Nir Soffer  wrote:
> >> The ioprocess issue is fixed in https://gerrit.ovirt.org/57473
> >>
> >> It will be merged soon and be available via ovirt-release-master.
> >>
> >> Nir
> >>
> >> On Sun, May 15, 2016 at 7:45 PM, Nir Soffer  wrote:
> >>> Hi all,
> >>>
> >>> I found another stuck build today:
> >>>
> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc23-x86_64/1151/console
> >>>
> >>> 11:27:18
> 
> >>> 11:27:18 TOTAL  40513  21121  48%
> >>> 11:27:18
> --
> >>> 11:27:18 Ran 2169 tests in 145.934s
> >>> 11:27:18
> >>> 11:27:18 OK (SKIP=88)
> >>> 11:27:18 Exception AttributeError: "'NoneType' object has no attribute
> >>> 'write'" in  >>> object at 0x7fd7c9f2d3d0>> ignored
> >>> [...]
> >>> 11:27:18 Exception AttributeError: "'NoneType' object has no attribute
> >>> 'write'" in  >>> object at 0x7fd7c9f15550>> ignored
> >>> 11:27:18 Exception in thread ioprocess communication (6533) (most
> >>> likely raised during interpreter shutdown):
> >>> 11:27:18 Traceback (most recent call last):
> >>> 11:27:18   File "/usr/lib64/python2.7/threading.py", line 804, in
> >>> __bootstrap_inner
> >>> 11:27:18   File "/usr/lib64/python2.7/threading.py", line 757, in run
> >>> 11:27:18   File
> >>> "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 180, in
> >>> _communicate
> >>> 11:27:18 : 'NoneType' object has no
> >>> attribute 'close'
> >>>
> >>> This smells like a non-daemon thread started by some code,
> >>> blocking the test process.
> >>>
> >>> I suspect ioprocess is starting such a thread; I am looking into it.
> >>>
> >>> Meanwhile, please:
> >>> - verify that all threads in actual code and in the tests are daemon
> threads
> >>> - convert your threads to use vdsm.concurrent.thread instead of
> >>> threading.Thread (daemon by default)
> >>> - watch your builds and abort stuck builds
> >>>
> >>> David, we need a timeout in the ci, aborting the job after a project
> >>> based timeout, maybe
> >>> defined in the project yaml.
> >>>
> >>> Cheers,
> >>> Nir
>
>
>


-- 
Eyal Edri
Associate Manager
RHEV DevOps
EMEA ENG Virtualization R
Red Hat Israel

phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)

Re: [ovirt-devel] [VDSM] stuck tests in ci

2016-05-16 Thread Piotr Kliczewski
and one more:

http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc23-x86_64/1389/console

On Mon, May 16, 2016 at 1:46 PM, Piotr Kliczewski
 wrote:
> One more occurrence of the issue [1]
>
>
> [1] 
> http://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/1359/console
>
> On Sun, May 15, 2016 at 8:37 PM, Nir Soffer  wrote:
>> The ioprocess issue is fixed in https://gerrit.ovirt.org/57473
>>
>> It will be merged soon and be available via ovirt-release-master.
>>
>> Nir
>>
>> On Sun, May 15, 2016 at 7:45 PM, Nir Soffer  wrote:
>>> Hi all,
>>>
>>> I found another stuck build today:
>>> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc23-x86_64/1151/console
>>>
>>> 11:27:18 
>>> 
>>> 11:27:18 TOTAL  40513  21121  48%
>>> 11:27:18 
>>> --
>>> 11:27:18 Ran 2169 tests in 145.934s
>>> 11:27:18
>>> 11:27:18 OK (SKIP=88)
>>> 11:27:18 Exception AttributeError: "'NoneType' object has no attribute
>>> 'write'" in >> object at 0x7fd7c9f2d3d0>> ignored
>>> [...]
>>> 11:27:18 Exception AttributeError: "'NoneType' object has no attribute
>>> 'write'" in >> object at 0x7fd7c9f15550>> ignored
>>> 11:27:18 Exception in thread ioprocess communication (6533) (most
>>> likely raised during interpreter shutdown):
>>> 11:27:18 Traceback (most recent call last):
>>> 11:27:18   File "/usr/lib64/python2.7/threading.py", line 804, in
>>> __bootstrap_inner
>>> 11:27:18   File "/usr/lib64/python2.7/threading.py", line 757, in run
>>> 11:27:18   File
>>> "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 180, in
>>> _communicate
>>> 11:27:18 : 'NoneType' object has no
>>> attribute 'close'
>>>
>>> This smells like a non-daemon thread started by some code,
>>> blocking the test process.
>>>
>>> I suspect ioprocess is starting such a thread; I am looking into it.
>>>
>>> Meanwhile, please:
>>> - verify that all threads in actual code and in the tests are daemon threads
>>> - convert your threads to use vdsm.concurrent.thread instead of
>>> threading.Thread (daemon by default)
>>> - watch your builds and abort stuck builds
>>>
>>> David, we need a timeout in the ci, aborting the job after a project
>>> based timeout, maybe
>>> defined in the project yaml.
>>>
>>> Cheers,
>>> Nir


Re: [ovirt-devel] [VDSM] stuck tests in ci

2016-05-16 Thread Piotr Kliczewski
One more occurrence of the issue [1]


[1] http://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/1359/console

On Sun, May 15, 2016 at 8:37 PM, Nir Soffer  wrote:
> The ioprocess issue is fixed in https://gerrit.ovirt.org/57473
>
> It will be merged soon and be available via ovirt-release-master.
>
> Nir
>
> On Sun, May 15, 2016 at 7:45 PM, Nir Soffer  wrote:
>> Hi all,
>>
>> I found another stuck build today:
>> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc23-x86_64/1151/console
>>
>> 11:27:18 
>> 
>> 11:27:18 TOTAL  40513  21121  48%
>> 11:27:18 
>> --
>> 11:27:18 Ran 2169 tests in 145.934s
>> 11:27:18
>> 11:27:18 OK (SKIP=88)
>> 11:27:18 Exception AttributeError: "'NoneType' object has no attribute
>> 'write'" in > object at 0x7fd7c9f2d3d0>> ignored
>> [...]
>> 11:27:18 Exception AttributeError: "'NoneType' object has no attribute
>> 'write'" in > object at 0x7fd7c9f15550>> ignored
>> 11:27:18 Exception in thread ioprocess communication (6533) (most
>> likely raised during interpreter shutdown):
>> 11:27:18 Traceback (most recent call last):
>> 11:27:18   File "/usr/lib64/python2.7/threading.py", line 804, in
>> __bootstrap_inner
>> 11:27:18   File "/usr/lib64/python2.7/threading.py", line 757, in run
>> 11:27:18   File
>> "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 180, in
>> _communicate
>> 11:27:18 : 'NoneType' object has no
>> attribute 'close'
>>
>> This smells like a non-daemon thread started by some code,
>> blocking the test process.
>>
>> I suspect ioprocess is starting such a thread; I am looking into it.
>>
>> Meanwhile, please:
>> - verify that all threads in actual code and in the tests are daemon threads
>> - convert your threads to use vdsm.concurrent.thread instead of
>> threading.Thread (daemon by default)
>> - watch your builds and abort stuck builds
>>
>> David, we need a timeout in the ci, aborting the job after a project
>> based timeout, maybe
>> defined in the project yaml.
>>
>> Cheers,
>> Nir


Re: [ovirt-devel] [VDSM] stuck tests in ci

2016-05-15 Thread Nir Soffer
The ioprocess issue is fixed in https://gerrit.ovirt.org/57473

It will be merged soon and be available via ovirt-release-master.

Nir

On Sun, May 15, 2016 at 7:45 PM, Nir Soffer  wrote:
> Hi all,
>
> I found another stuck build today:
> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc23-x86_64/1151/console
>
> 11:27:18 
> 
> 11:27:18 TOTAL  40513  21121  48%
> 11:27:18 
> --
> 11:27:18 Ran 2169 tests in 145.934s
> 11:27:18
> 11:27:18 OK (SKIP=88)
> 11:27:18 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x7fd7c9f2d3d0>> ignored
> [...]
> 11:27:18 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in  object at 0x7fd7c9f15550>> ignored
> 11:27:18 Exception in thread ioprocess communication (6533) (most
> likely raised during interpreter shutdown):
> 11:27:18 Traceback (most recent call last):
> 11:27:18   File "/usr/lib64/python2.7/threading.py", line 804, in
> __bootstrap_inner
> 11:27:18   File "/usr/lib64/python2.7/threading.py", line 757, in run
> 11:27:18   File
> "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 180, in
> _communicate
> 11:27:18 : 'NoneType' object has no
> attribute 'close'
>
> This smells like a non-daemon thread started by some code,
> blocking the test process.
>
> I suspect ioprocess is starting such a thread; I am looking into it.
>
> Meanwhile, please:
> - verify that all threads in actual code and in the tests are daemon threads
> - convert your threads to use vdsm.concurrent.thread instead of
> threading.Thread (daemon by default)
> - watch your builds and abort stuck builds
>
> David, we need a timeout in the ci, aborting the job after a project
> based timeout, maybe
> defined in the project yaml.
>
> Cheers,
> Nir


[ovirt-devel] [VDSM] stuck tests in ci

2016-05-15 Thread Nir Soffer
Hi all,

I found another stuck build today:
http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc23-x86_64/1151/console

11:27:18 

11:27:18 TOTAL  40513  21121  48%
11:27:18 --
11:27:18 Ran 2169 tests in 145.934s
11:27:18
11:27:18 OK (SKIP=88)
11:27:18 Exception AttributeError: "'NoneType' object has no attribute
'write'" in > ignored
[...]
11:27:18 Exception AttributeError: "'NoneType' object has no attribute
'write'" in > ignored
11:27:18 Exception in thread ioprocess communication (6533) (most likely raised during interpreter shutdown):
11:27:18 Traceback (most recent call last):
11:27:18   File "/usr/lib64/python2.7/threading.py", line 804, in __bootstrap_inner
11:27:18   File "/usr/lib64/python2.7/threading.py", line 757, in run
11:27:18   File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 180, in _communicate
11:27:18 : 'NoneType' object has no attribute 'close'

This smells like a non-daemon thread started by some code,
blocking the test process.

I suspect ioprocess is starting such a thread; I am looking into it.

Meanwhile, please:
- verify that all threads in actual code and in the tests are daemon threads
- convert your threads to use vdsm.concurrent.thread instead of
threading.Thread (daemon by default)
- watch your builds and abort stuck builds
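
To make the first check in the list above concrete, here is a minimal
sketch using only the standard threading module. It is not VDSM code: the
helper name find_non_daemon_threads and the thread names are made up for
illustration, and it assumes the helper is called from the main thread.
Any thread it reports would keep the interpreter from exiting once the
tests are done.

```python
import threading


def find_non_daemon_threads():
    """Return live non-daemon threads other than the calling thread.

    Assumption: this is called from the main thread, so the caller itself
    is excluded from the result.
    """
    current = threading.current_thread()
    return [t for t in threading.enumerate()
            if t is not current and not t.daemon]


# Demonstration with two hypothetical workers that wait on the same event.
stop = threading.Event()

blocking = threading.Thread(target=stop.wait, name="blocking-worker")
blocking.start()

background = threading.Thread(target=stop.wait, name="daemon-worker")
background.daemon = True  # a daemon thread does not block interpreter exit
background.start()

# Only the non-daemon worker should be reported here.
names = [t.name for t in find_non_daemon_threads()]
print(names)

# Release both workers so this example itself exits cleanly.
stop.set()
blocking.join()
background.join()
```

Calling such a helper at the end of a test run, for example from a test
teardown, should show which thread is keeping the process alive.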

David, we need a timeout in the ci, aborting the job after a project
based timeout, maybe
defined in the project yaml.
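
Until the ci has such a timeout, the same idea can be sketched as a small
wrapper around the test command. This is only an illustration under stated
assumptions: run_with_timeout is a hypothetical helper, the commands below
are stand-ins for the real test run, and it needs Python 3.3 or later for
the timeout argument of Popen.wait. A real ci timeout would more likely be
configured in the job definition.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout):
    """Run cmd and return its exit code, or None if it was killed.

    The child is killed when it does not finish within `timeout` seconds.
    """
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()  # reap the killed child
        return None


# A stand-in for a stuck test process: sleeps much longer than the timeout.
stuck = run_with_timeout(
    [sys.executable, "-c", "import time; time.sleep(30)"], timeout=1)

# A stand-in for a healthy test process: exits immediately.
finished = run_with_timeout([sys.executable, "-c", "pass"], timeout=30)

print(stuck, finished)
```

Returning None for a killed run lets the caller tell a timeout apart from
an ordinary test failure, which has a non-zero exit code.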

Cheers,
Nir