The issue is a non-daemon thread blocking the Python process during shutdown of the tests.
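
A minimal sketch of how such a thread could be spotted from inside the test process, assuming it is called from the main thread at the very end of the run. The helper names non_daemon_threads and format_thread_stacks are hypothetical (they are not part of vdsm or ioprocess), and sys._current_frames() is a CPython-specific call. This only helps before the process gets stuck; it is not a replacement for inspecting an already stuck process.

```python
import sys
import threading
import traceback


def non_daemon_threads():
    """Return live non-daemon threads, excluding the calling thread.

    Assumes it is called from the main thread, for example at the very
    end of a test run.
    """
    current = threading.current_thread()
    return [t for t in threading.enumerate()
            if t is not current and not t.daemon]


def format_thread_stacks(threads):
    """Return a text report with the current stack of each given thread."""
    # sys._current_frames() maps thread ident -> topmost frame (CPython).
    frames = sys._current_frames()
    chunks = []
    for t in threads:
        chunks.append("Thread %r (ident=%s)\n" % (t.name, t.ident))
        frame = frames.get(t.ident)
        if frame is not None:
            chunks.extend(traceback.format_stack(frame))
    return "".join(chunks)


if __name__ == "__main__":
    # Simulate the problem: a non-daemon thread that never finishes on its own.
    stop = threading.Event()
    worker = threading.Thread(target=stop.wait, name="stuck-worker")
    worker.daemon = False
    worker.start()

    sys.stderr.write(format_thread_stacks(non_daemon_threads()))

    # Release the thread so this demo itself can exit.
    stop.set()
    worker.join()
```

Any thread reported here would keep the interpreter alive after the tests finish, and the printed stack shows where it is waiting.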
Current ioprocess does not create such a thread, but we still see this issue today:

http://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/1841/console

If the builds are using the latest ioprocess build (0.16.0-1), built after
Sun May 15 21:29:24 2016 +0300, this is probably not related to ioprocess.

To understand this issue we need to get a stack trace from the stuck Python process.

See the relevant log below.

Nir

----

11:49:30 --------------------------------------------------------------------------------------------------------------------------------------------
11:49:30 TOTAL 40672 21072 48%
11:49:30 ----------------------------------------------------------------------
11:49:30 Ran 2182 tests in 147.661s
11:49:30
11:49:30 OK (SKIP=94)
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'udev_unref'" in <bound method Context.__del__ of <pyudev.core.Context object at 0x7312610>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x5435350>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x5420850>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x7269910>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x5420f90>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x5419c10>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x610e610>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x6dc4d50>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x610f390>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x54195d0>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x5432510>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x6110250>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x5414750>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x5414150>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x6a33390>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x543d590>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x6ba8390>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x5432d50>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x5435a90>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x503f350>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x5420250>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x5442e90>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x503f950>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x610e7d0>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x5414d90>> ignored
11:49:30 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x543d450>> ignored
11:49:30 Exception in thread ioprocess communication (8008) (most likely raised during interpreter shutdown):
11:49:30 Traceback (most recent call last):
11:49:30 File "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
11:49:30 File "/usr/lib64/python2.7/threading.py", line 764, in run
11:49:30 File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 180, in _communicate
11:49:30 <type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'close'
17:42:17 Build timed out (after 360 minutes). Marking the build as failed.
17:42:17 Build was aborted

On Fri, May 20, 2016 at 10:30 AM, Piotr Kliczewski <[email protected]> wrote:
> Eyal,
>
> This was ioprocess issue occurring after the fix was provided. I
> haven't seen it since build #1389.
>
> Thanks,
> Piotr
>
> On Thu, May 19, 2016 at 3:00 PM, Eyal Edri <[email protected]> wrote:
>> was that resolved?
>> any infra issue or was it problems with the tests?
>>
>> On Mon, May 16, 2016 at 3:27 PM, Piotr Kliczewski
>> <[email protected]> wrote:
>>>
>>> and one more:
>>>
>>> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc23-x86_64/1389/console
>>>
>>> On Mon, May 16, 2016 at 1:46 PM, Piotr Kliczewski
>>> <[email protected]> wrote:
>>> > One more occurrence of the issue [1]
>>> >
>>> > [1]
>>> > http://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/1359/console
>>> >
>>> > On Sun, May 15, 2016 at 8:37 PM, Nir Soffer <[email protected]> wrote:
>>> >> The ioprocess issue is fixed in https://gerrit.ovirt.org/57473
>>> >>
>>> >> Will be merged soon and available via ovirt-release-master.
>>> >>
>>> >> Nir
>>> >>
>>> >> On Sun, May 15, 2016 at 7:45 PM, Nir Soffer <[email protected]> wrote:
>>> >>> Hi all,
>>> >>>
>>> >>> I found another stuck build today:
>>> >>>
>>> >>> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc23-x86_64/1151/console
>>> >>>
>>> >>> 11:27:18 ------------------------------------------------------------------------------------------------------------------------------------------------
>>> >>> 11:27:18 TOTAL 40513 21121 48%
>>> >>> 11:27:18 ----------------------------------------------------------------------
>>> >>> 11:27:18 Ran 2169 tests in 145.934s
>>> >>> 11:27:18
>>> >>> 11:27:18 OK (SKIP=88)
>>> >>> 11:27:18 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x7fd7c9f2d3d0>> ignored
>>> >>> [...]
>>> >>> 11:27:18 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x7fd7c9f15550>> ignored
>>> >>> 11:27:18 Exception in thread ioprocess communication (6533) (most likely raised during interpreter shutdown):
>>> >>> 11:27:18 Traceback (most recent call last):
>>> >>> 11:27:18 File "/usr/lib64/python2.7/threading.py", line 804, in __bootstrap_inner
>>> >>> 11:27:18 File "/usr/lib64/python2.7/threading.py", line 757, in run
>>> >>> 11:27:18 File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 180, in _communicate
>>> >>> 11:27:18 <type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'close'
>>> >>>
>>> >>> This smells like a non-daemon thread started by some code,
>>> >>> blocking the test process.
>>> >>>
>>> >>> I suspect ioprocess is starting such a thread; looking into it.
>>> >>>
>>> >>> Meanwhile, please:
>>> >>> - verify that all threads in actual code and in the tests are daemon
>>> >>>   threads
>>> >>> - convert your threads to use vdsm.concurrent.thread instead of
>>> >>>   threading.Thread (daemon by default)
>>> >>> - watch your builds and abort stuck builds
>>> >>>
>>> >>> David, we need a timeout in the CI, aborting the job after a
>>> >>> project-based timeout, maybe defined in the project yaml.
>>> >>>
>>> >>> Cheers,
>>> >>> Nir
>>> >> _______________________________________________
>>> >> Devel mailing list
>>> >> [email protected]
>>> >> http://lists.ovirt.org/mailman/listinfo/devel
>>>
>>
>> --
>> Eyal Edri
>> Associate Manager
>> RHEV DevOps
>> EMEA ENG Virtualization R&D
>> Red Hat Israel
>>
>> phone: +972-9-7692018
>> irc: eedri (on #tlv #rhev-dev #rhev-integ)
