On Sun, May 29, 2016 at 2:10 AM, Nir Soffer <[email protected]> wrote:
> It looks like this when the tests time out:
>
> 23:04:44 miscTests.EventTests
> 23:04:44 testEmit OK
> 23:04:44 testEmitCallbackException OK
> 23:04:49 testEmitStale OK
> 23:04:49 testInstanceMethod OK
> 23:04:50 testInstanceMethodDead OK
> 23:04:55 testOneShot
> 23:04:55 ========================================================================
> 23:04:55 = Timeout completing tests - extracting stacktrace =
> 23:04:55 ========================================================================
> 23:04:55
> 23:04:55 attach: No such file or directory.
> 23:04:55 [New LWP 7887]
> 23:04:55 [New LWP 7880]
> 23:04:55 [New LWP 7873]
> 23:04:55 [New LWP 7866]
> 23:04:55 [New LWP 7859]
> 23:04:55 [New LWP 7852]
> 23:04:55 [New LWP 7845]
> 23:04:55 [Thread debugging using libthread_db enabled]
> 23:04:55 Using host libthread_db library "/lib64/libthread_db.so.1".
> 23:04:56 0x00007f17f0a1fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
> 23:04:56
> 23:04:56 Thread 8 (Thread 0x7f17df860700 (LWP 7845)):
> 23:04:56 Undefined command: "py-bt". Try "help".
> 23:04:56 OK
> 23:04:56 testUnregister
> 23:04:56 ========================================================================
> 23:04:56 = Aborting tests =
> 23:04:56 ========================================================================
> 23:04:56 ../tests/run_tests_local.sh: line 35: 7743 Killed "$PYTHON_EXE" ../tests/testrunner.py --local-modules $@
>
> On Sun, May 29, 2016 at 2:07 AM, Nir Soffer <[email protected]> wrote:
>> On Thu, May 26, 2016 at 11:08 PM, Nir Soffer <[email protected]> wrote:
>>> Hi all,
>>>
>>> We had 2 issues causing vdsm check-patch and check-merge jobs to get stuck.
>>>
>>> I fixed the one that caused most trouble:
>>> https://gerrit.ovirt.org/57993
>>>
>>> The other issue may be related to ioprocess; I fixed a related issue:
>>> https://gerrit.ovirt.org/57473
>>>
>>> But I have seen stuck jobs after this change, so the issue may not
>>> be fixed yet.
>>>
>>> If you see a stuck vdsm job - a job that runs more than 15 minutes - please
>>> get me a backtrace:
>>>
>>> 1. locate the test_runner process pid:
>>>
>>>     $ ps aux | grep testrunner.py | grep -v grep
>>>     nsoffer 26297 82.6 0.9 389592 111144 pts/3 R+ 22:52 0:02 /usr/bin/python ../tests/testrunner.py ...
>>>
>>> 2. save a backtrace:
>>>
>>>     gdb attach 26297 --batch -ex "thread apply all py-bt" > py-bt.out
>>
>> This requires the python-debuginfo package, typically installed using:
>>
>>     dnf debuginfo-install python
>>
>> I sent this patch, which detects stuck vdsm tests, prints a backtrace, and
>> kills the stuck process:
>> https://gerrit.ovirt.org/58212
>>
>> It works, but we don't get a backtrace, since python-debuginfo is not
>> installed although I require it - probably we need to add the fedora-debug
>> repository to check-patch.repos. I tried to use the urls from
>> /etc/yum.repos.d/fedora.repo, but none of them work.
>>
>> I will need help from infra to get it working.
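
For anyone who wants to try the same idea locally, here is a minimal sketch
of the watchdog flow described above: run the tests, and if they do not
finish in time, try to extract a backtrace and then kill the process. This is
not the code in https://gerrit.ovirt.org/58212; the names run_with_watchdog
and gdb_backtrace_cmd are hypothetical, and it assumes Python 3 on a POSIX
host. It attaches with "gdb -p <pid>" rather than the "gdb attach <pid>" form
quoted above, which is my substitution.

```python
import subprocess
import sys


def gdb_backtrace_cmd(pid):
    """Build a gdb command line that prints a Python backtrace of all threads.

    The py-bt command is only available when python-debuginfo is installed.
    """
    return ["gdb", "-p", str(pid), "--batch", "-ex", "thread apply all py-bt"]


def run_with_watchdog(cmd, timeout, dump_cmd=None):
    """Run cmd and wait up to timeout seconds for it to finish.

    On timeout, run the command returned by dump_cmd(pid) (if given) to
    extract a backtrace, then kill the stuck process.
    Returns a (returncode, timed_out) tuple.
    """
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout), False
    except subprocess.TimeoutExpired:
        pass

    if dump_cmd is not None:
        try:
            subprocess.call(dump_cmd(proc.pid))
        except OSError as e:
            # For example, gdb is not installed on this host.
            print("cannot extract backtrace: %s" % e, file=sys.stderr)

    # Kill the stuck process and reap it so we do not leave a zombie.
    proc.kill()
    return proc.wait(), True
```

With the 15 minute limit mentioned above, a call could look like
run_with_watchdog([sys.executable, "../tests/testrunner.py"], timeout=900,
dump_cmd=gdb_backtrace_cmd). Keeping the dump command as a separate hook makes
it easy to swap gdb for another tool on hosts without python-debuginfo.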

I also sent this patch, which should fix the issue on Jenkins, but I cannot
test it on Jenkins:
https://gerrit.ovirt.org/58213

Nir
