On Mon, Apr 9, 2018 at 3:35 AM, Benny Zlotnik <bzlot...@redhat.com> wrote:

> $ gdb -p 13024 -batch -ex "thread apply all bt"
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> 0x00007f98275cfaff in ppoll () from /lib64/libc.so.6
>
> Thread 1 (Thread 0x7f983e30ab00 (LWP 13024)):
> #0  0x00007f98275cfaff in ppoll () from /lib64/libc.so.6
> #1  0x000055b55cf59d69 in qemu_poll_ns ()
> #2  0x000055b55cf5ba45 in aio_poll ()
> #3  0x000055b55ceedc0f in bdrv_get_block_status_above ()
> #4  0x000055b55cea3611 in convert_iteration_sectors ()
> #5  0x000055b55cea4352 in img_convert ()
> #6  0x000055b55ce9d819 in main ()


My team hit this issue too after switching to CentOS 7.4 with qemu-img 2.9.0.
gdb shows exactly the same backtrace when the convert gets stuck, and we are
on NFS.

Later we found the following:
1. The hang can happen on local storage, too.
2. Replacing qemu-img 2.9.0 with 2.6.0 makes everything work smoothly again.

BTW, we use "qemu-img convert" to merge a qcow2 image and its backing files
into a single qcow2 image.
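
For anyone who wants to reproduce this, here is a rough Python sketch of how such a convert could be invoked with a watchdog timeout, so that a hung run is killed instead of spinning forever. The helper names and file names are made up for illustration and are not our production code. The only assumption about qemu-img is that "convert", when no -B option is given, reads through the backing chain and writes a standalone image.

```python
import subprocess


def flatten_cmd(top_image, output_image):
    """Build a qemu-img command line that merges a qcow2 image and its
    backing files into one standalone qcow2 image (hypothetical helper)."""
    # Without -B, convert follows the backing chain of top_image, so the
    # output image does not depend on any backing file.
    return ["qemu-img", "convert", "-f", "qcow2", "-O", "qcow2",
            top_image, output_image]


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it did not finish
    within timeout_s seconds (hypothetical helper)."""
    try:
        # subprocess.run kills the child and waits for it before raising
        # TimeoutExpired, so a hung process is not left behind.
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return None
```

Calling run_with_timeout(flatten_cmd("top.qcow2", "flat.qcow2"), 3600) would then return None for a run that hangs like the one described above. The one-hour limit is arbitrary and would need tuning to the image size.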


> On Sun, Apr 8, 2018 at 10:28 PM, Nir Soffer <nir...@gmail.com> wrote:
>
> > On Sun, Apr 8, 2018 at 9:27 PM Benny Zlotnik <bzlot...@redhat.com> wrote:
> >
> >> Hi,
> >>
> >> A copy operation initiated by rhev has been stuck for more than a day
> >> and is consuming plenty of CPU:
> >> vdsm     13024  3117 99 Apr07 ?        1-06:58:43 /usr/bin/qemu-img convert -p -t none -T none -f qcow2
> >> /rhev/data-center/bb422fac-81c5-4fea-8782-3498bb5c8a59/26989331-2c39-4b34-a7ed-d7dd7703646c/images/597e12b6-19f5-45bd-868f-767600c7115e/62a5492e-e120-4c25-898e-9f5f5629853e
> >> -O raw /rhev/data-center/mnt/mantis-nfs-lif1.lab.eng.tlv2.redhat.com:_vol__service/26989331-2c39-4b34-a7ed-d7dd7703646c/images/9ece9408-9ca6-48cd-992a-6f590c710672/06d6d3c0-beb8-4b6b-ab00-56523df185da
> >>
> >> The target image appears to have no data yet:
> >> qemu-img info 06d6d3c0-beb8-4b6b-ab00-56523df185da
> >> image: 06d6d3c0-beb8-4b6b-ab00-56523df185da
> >> file format: raw
> >> virtual size: 120G (128849018880 bytes)
> >> disk size: 0
> >>
> >> strace -p 13024 -tt -T -f shows only:
> >> ...
> >> 21:13:01.309382 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0, 0}, NULL, 8) = 0 (Timeout) <0.000010>
> >> 21:13:01.309411 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0, 0}, NULL, 8) = 0 (Timeout) <0.000009>
> >> 21:13:01.309440 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0, 0}, NULL, 8) = 0 (Timeout) <0.000009>
> >> 21:13:01.309468 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0, 0}, NULL, 8) = 0 (Timeout) <0.000010>
> >>
> >> version: qemu-img-rhev-2.9.0-16.el7_4.13.x86_64
> >>
> >> What could cause this? I'll provide any additional information needed
> >>
> >
> > A backtrace may help, try:
> >
> > gdb -p 13024 -batch -ex "thread apply all bt"
> >
> > Also adding Kevin and qemu-block.
> >
> > Nir
> >
>


-- 
Thanks,
Li Qun
