Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-18 Thread David Lee
On Wed, Apr 18, 2018 at 4:44 PM, Fam Zheng  wrote:
>
> qemu-img hangs because the convert_iteration_sectors loop cannot make any
> progress when it reaches the end of the base image. It is a bug (implicitly?)
> fixed by Eric Blake (Cc'ed) 's BDRV_BLOCK_EOF patches on upstream, backporting
> them to the above downstream version fixes the problem for me:
>
> commit c61e684e44272f2acb2bef34cf2aa234582a73a9
> Author: Eric Blake 
>
> block: Exploit BDRV_BLOCK_EOF for larger zero blocks
>
> commit fb0d8654ffc3ea1494067327fc4c4da5d0872724
> Author: Eric Blake 
>
> block: Add BDRV_BLOCK_EOF to bdrv_get_block_status()
>
> Fam

Fam,

Thanks for the info.

-- 
Thanks,
Li Qun



Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-18 Thread Fam Zheng
On Wed, 04/18 15:58, Fam Zheng wrote:
> On Wed, 04/18 15:42, David Lee wrote:
> > On Thu, Apr 12, 2018 at 11:57 PM, David Lee  wrote:
> > >>>
> > >>> We tested qemu-kvm-ev-2.9.0-16.el7_4.14.1 - where from the source RPM we
> > >>> verified it does contain ef6dada8b44e1e7c4bec5c1115903af9af415b50
> > >>>
> > >>> But the issue still exists.  The convert got stuck if one of the old
> > >>> active overlay
> > >>> had been 'vol-resize'd  with qemu monitor command to a larger size.  
> > >>> This looks
> > >>> like a prerequisite but not sufficient condition to trigger this 
> > >>> badness.
> > >>
> > >> So it is a separate issue. Did you try upstream master as well?
> > >>
> > >> Fam
> > >
> > > Not yet.
> > 
> > Stefan & FAM,
> > 
> > Here are the steps to reproduce this issue reliably:
> > 
> > # qemu-img create -f qcow2 test.qcow2 100m
> > ... omitted
> > # qemu-img create -F qcow2 -f qcow2 -b test.qcow2 overlay.qcow2
> > ... omitted
> > # qemu-img resize overlay.qcow2 +20m
> > Image resized.
> > # qemu-img create -F qcow2 -f qcow2 -b overlay.qcow2 overlay2.qcow2
> > ... omitted
> > # qemu-img convert overlay2.qcow2 -f qcow2 -O qcow2 combined.qcow2
> > [hang]
> > 
> > 
> > # qemu-img --version
> > qemu-img version 2.9.0(qemu-kvm-ev-2.9.0-16.el7_4.14.1)
> 
> Thanks, I can reproduce this but not on master. I will take a look.

qemu-img hangs because the convert_iteration_sectors loop cannot make any
progress when it reaches the end of the base image. It is a bug that was
(implicitly?) fixed upstream by the BDRV_BLOCK_EOF patches from Eric Blake
(Cc'ed). Backporting them to the above downstream version fixes the problem
for me:

commit c61e684e44272f2acb2bef34cf2aa234582a73a9
Author: Eric Blake 

block: Exploit BDRV_BLOCK_EOF for larger zero blocks

commit fb0d8654ffc3ea1494067327fc4c4da5d0872724
Author: Eric Blake 

block: Add BDRV_BLOCK_EOF to bdrv_get_block_status()
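
A minimal sketch, assuming a git checkout derived from upstream QEMU, for
checking whether the two commits above are already part of the tree. QEMU_SRC
and the has_commit helper are hypothetical names used only for illustration.
A downstream tree that carries backports will have different hashes, so a
"missing" result there does not prove the fix is absent.

```shell
#!/bin/sh
# QEMU_SRC is an assumed path to a QEMU git checkout (defaults to cwd).
QEMU_SRC=${QEMU_SRC:-.}

# Succeeds if commit $2 is an ancestor of HEAD in the repository at $1.
# Fails if the commit is unknown, is not an ancestor, or $1 is not a repo.
has_commit() {
    git -C "$1" merge-base --is-ancestor "$2" HEAD 2>/dev/null
}

# The two commit hashes quoted in the message above.
for c in fb0d8654ffc3ea1494067327fc4c4da5d0872724 \
         c61e684e44272f2acb2bef34cf2aa234582a73a9; do
    if has_commit "$QEMU_SRC" "$c"; then
        echo "$c present"
    else
        echo "$c missing"
    fi
done
```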

Fam



Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-18 Thread Fam Zheng
On Wed, 04/18 15:42, David Lee wrote:
> On Thu, Apr 12, 2018 at 11:57 PM, David Lee  wrote:
> >>>
> >>> We tested qemu-kvm-ev-2.9.0-16.el7_4.14.1 - where from the source RPM we
> >>> verified it does contain ef6dada8b44e1e7c4bec5c1115903af9af415b50
> >>>
> >>> But the issue still exists.  The convert got stuck if one of the old
> >>> active overlay
> >>> had been 'vol-resize'd  with qemu monitor command to a larger size.  This 
> >>> looks
> >>> like a prerequisite but not sufficient condition to trigger this badness.
> >>
> >> So it is a separate issue. Did you try upstream master as well?
> >>
> >> Fam
> >
> > Not yet.
> 
> Stefan & FAM,
> 
> Here are the steps to reproduce this issue reliably:
> 
> # qemu-img create -f qcow2 test.qcow2 100m
> ... omitted
> # qemu-img create -F qcow2 -f qcow2 -b test.qcow2 overlay.qcow2
> ... omitted
> # qemu-img resize overlay.qcow2 +20m
> Image resized.
> # qemu-img create -F qcow2 -f qcow2 -b overlay.qcow2 overlay2.qcow2
> ... omitted
> # qemu-img convert overlay2.qcow2 -f qcow2 -O qcow2 combined.qcow2
> [hang]
> 
> 
> # qemu-img --version
> qemu-img version 2.9.0(qemu-kvm-ev-2.9.0-16.el7_4.14.1)

Thanks, I can reproduce this but not on master. I will take a look.

Fam



Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-18 Thread David Lee
On Thu, Apr 12, 2018 at 11:57 PM, David Lee  wrote:
>>>
>>> We tested qemu-kvm-ev-2.9.0-16.el7_4.14.1 - where from the source RPM we
>>> verified it does contain ef6dada8b44e1e7c4bec5c1115903af9af415b50
>>>
>>> But the issue still exists.  The convert got stuck if one of the old
>>> active overlay
>>> had been 'vol-resize'd  with qemu monitor command to a larger size.  This 
>>> looks
>>> like a prerequisite but not sufficient condition to trigger this badness.
>>
>> So it is a separate issue. Did you try upstream master as well?
>>
>> Fam
>
> Not yet.

Stefan & FAM,

Here are the steps to reproduce this issue reliably:

# qemu-img create -f qcow2 test.qcow2 100m
... omitted
# qemu-img create -F qcow2 -f qcow2 -b test.qcow2 overlay.qcow2
... omitted
# qemu-img resize overlay.qcow2 +20m
Image resized.
# qemu-img create -F qcow2 -f qcow2 -b overlay.qcow2 overlay2.qcow2
... omitted
# qemu-img convert overlay2.qcow2 -f qcow2 -O qcow2 combined.qcow2
[hang]


# qemu-img --version
qemu-img version 2.9.0(qemu-kvm-ev-2.9.0-16.el7_4.14.1)
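
The steps above can be wrapped in a small script so that a hang is reported
instead of blocking the terminal. This is only a sketch: the LIMIT variable,
the scratch directory and the classify/reproduce helpers are assumptions for
illustration. It relies on timeout(1) from GNU coreutils, which exits with
status 124 when it has to kill the command.

```shell
#!/bin/sh
# LIMIT (seconds) is an assumed cut-off; a 100m image normally converts
# well within it.
LIMIT=${LIMIT:-30}

# Map the exit status of "timeout ... qemu-img convert" to a verdict.
classify() {
    case "$1" in
        0)   echo "ok" ;;
        124) echo "hang" ;;    # timeout(1) killed the command
        *)   echo "error" ;;
    esac
}

# Replay the reproduction steps in a scratch directory and print a verdict.
reproduce() {
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        qemu-img create -f qcow2 test.qcow2 100m &&
        qemu-img create -F qcow2 -f qcow2 -b test.qcow2 overlay.qcow2 &&
        qemu-img resize overlay.qcow2 +20m &&
        qemu-img create -F qcow2 -f qcow2 -b overlay.qcow2 overlay2.qcow2 ||
            exit 1
        timeout "$LIMIT" \
            qemu-img convert overlay2.qcow2 -f qcow2 -O qcow2 combined.qcow2
    ) >/dev/null 2>&1
    status=$?
    rm -rf "$dir"
    classify "$status"
}

if command -v qemu-img >/dev/null 2>&1; then
    reproduce
else
    echo "qemu-img not found; skipping" >&2
fi
```

Whether this prints "hang" or "ok" depends on the qemu-img build under test.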

-- 
Thanks,
Li Qun



Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-16 Thread Stefan Hajnoczi
On Mon, Apr 09, 2018 at 10:38:54AM +0300, Benny Zlotnik wrote:
> source: qcow2 on NFS
> target: raw on NFS

Have you tried on a local file system with the same source file
contents?

Which NFS protocol version is being used?
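
One way to answer this on a Linux host is to read the vers= option from the
mount table. The sketch below assumes the /proc/mounts field layout (device,
mount point, type, options) and that the kernel reports the version as
vers=N; nfs_versions is a hypothetical helper name.

```shell
#!/bin/sh
# Read mount entries in /proc/mounts format on stdin and print
# "<mount point> <version>" for every nfs or nfs4 mount.
nfs_versions() {
    awk '$3 ~ /^nfs4?$/ {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^vers=/)
                print $2, substr(opts[i], 6)   # text after "vers="
    }'
}

if [ -r /proc/mounts ]; then
    nfs_versions < /proc/mounts
fi
```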

Stefan




Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-12 Thread David Lee
On Thu, Apr 12, 2018 at 10:23 PM, Fam Zheng  wrote:
> On Thu, 04/12 21:45, David Lee wrote:
>> On Thu, Apr 12, 2018 at 10:16 AM, David Lee  wrote:
>> >> > My team caught this issue too after switching to CentOS 7.4 with 
>> >> > qemu-img
>> >> > 2.9.0
>> >> > gdb shows exactly the same backtrace when the convert stuck, and we are 
>> >> > on
>> >> > NFS.
>> >> >
>> >> > Later we found the following:
>> >> > 1. The stuck can happen on local storage, too.
>> >> > 2. Replace qemu-img 2.9.0 with 2.6.0 and everything works smoothly 
>> >> > again.
>> >> >
>> >> > BTW, we use "qemu-img convert" to convert qcow2 and its backing files 
>> >> > into
>> >> > a single qcow2 image.
>> >>
>> >> Maybe it is RHBZ 1508886?
>> >>
>> >> Fam
>> >
>> >
>> >
>> > Thanks, Fam.  We just tracked down to this BZ too and are about to trying
>> > the commit ef6dada8b44e1e7c4bec5c1115903af9af415b50
>>
>> We tested qemu-kvm-ev-2.9.0-16.el7_4.14.1 - where from the source RPM we
>> verified it does contain ef6dada8b44e1e7c4bec5c1115903af9af415b50
>>
>> But the issue still exists.  The convert got stuck if one of the old
>> active overlay
>> had been 'vol-resize'd  with qemu monitor command to a larger size.  This 
>> looks
>> like a prerequisite but not sufficient condition to trigger this badness.
>
> So it is a separate issue. Did you try upstream master as well?
>
> Fam

Not yet.

-- 
Thanks,
Li Qun



Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-12 Thread Fam Zheng
On Thu, 04/12 21:45, David Lee wrote:
> On Thu, Apr 12, 2018 at 10:16 AM, David Lee  wrote:
> >> > My team caught this issue too after switching to CentOS 7.4 with qemu-img
> >> > 2.9.0
> >> > gdb shows exactly the same backtrace when the convert stuck, and we are 
> >> > on
> >> > NFS.
> >> >
> >> > Later we found the following:
> >> > 1. The stuck can happen on local storage, too.
> >> > 2. Replace qemu-img 2.9.0 with 2.6.0 and everything works smoothly again.
> >> >
> >> > BTW, we use "qemu-img convert" to convert qcow2 and its backing files 
> >> > into
> >> > a single qcow2 image.
> >>
> >> Maybe it is RHBZ 1508886?
> >>
> >> Fam
> >
> >
> >
> > Thanks, Fam.  We just tracked down to this BZ too and are about to trying
> > the commit ef6dada8b44e1e7c4bec5c1115903af9af415b50
> 
> We tested qemu-kvm-ev-2.9.0-16.el7_4.14.1 - where from the source RPM we
> verified it does contain ef6dada8b44e1e7c4bec5c1115903af9af415b50
> 
> But the issue still exists.  The convert got stuck if one of the old
> active overlay
> had been 'vol-resize'd  with qemu monitor command to a larger size.  This 
> looks
> like a prerequisite but not sufficient condition to trigger this badness.

So it is a separate issue. Did you try upstream master as well?

Fam



Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-12 Thread David Lee
On Thu, Apr 12, 2018 at 10:16 AM, David Lee  wrote:
>> > My team caught this issue too after switching to CentOS 7.4 with qemu-img
>> > 2.9.0
>> > gdb shows exactly the same backtrace when the convert stuck, and we are on
>> > NFS.
>> >
>> > Later we found the following:
>> > 1. The stuck can happen on local storage, too.
>> > 2. Replace qemu-img 2.9.0 with 2.6.0 and everything works smoothly again.
>> >
>> > BTW, we use "qemu-img convert" to convert qcow2 and its backing files into
>> > a single qcow2 image.
>>
>> Maybe it is RHBZ 1508886?
>>
>> Fam
>
>
>
> Thanks, Fam.  We just tracked down to this BZ too and are about to trying
> the commit ef6dada8b44e1e7c4bec5c1115903af9af415b50

We tested qemu-kvm-ev-2.9.0-16.el7_4.14.1 and verified from the source RPM
that it does contain ef6dada8b44e1e7c4bec5c1115903af9af415b50

But the issue still exists.  The convert gets stuck if one of the old
active overlays had been 'vol-resize'd to a larger size with a qemu monitor
command.  This looks like a necessary but not sufficient condition for
triggering the hang.

-- 
Thanks,
Li Qun



Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-11 Thread David Lee
On Thu, Apr 12, 2018 at 10:03 AM, Fam Zheng  wrote:

> On Thu, 04/12 09:51, David Lee wrote:
> > On Mon, Apr 9, 2018 at 3:35 AM, Benny Zlotnik 
> wrote:
> >
> > > $ gdb -p 13024 -batch -ex "thread apply all bt"
> > > [Thread debugging using libthread_db enabled]
> > > Using host libthread_db library "/lib64/libthread_db.so.1".
> > > 0x7f98275cfaff in ppoll () from /lib64/libc.so.6
> > >
> > > Thread 1 (Thread 0x7f983e30ab00 (LWP 13024)):
> > > #0  0x7f98275cfaff in ppoll () from /lib64/libc.so.6
> > > #1  0x55b55cf59d69 in qemu_poll_ns ()
> > > #2  0x55b55cf5ba45 in aio_poll ()
> > > #3  0x55b55ceedc0f in bdrv_get_block_status_above ()
> > > #4  0x55b55cea3611 in convert_iteration_sectors ()
> > > #5  0x55b55cea4352 in img_convert ()
> > > #6  0x55b55ce9d819 in main ()
> >
> >
> > My team caught this issue too after switching to CentOS 7.4 with qemu-img
> > 2.9.0
> > gdb shows exactly the same backtrace when the convert stuck, and we are
> on
> > NFS.
> >
> > Later we found the following:
> > 1. The stuck can happen on local storage, too.
> > 2. Replace qemu-img 2.9.0 with 2.6.0 and everything works smoothly again.
> >
> > BTW, we use "qemu-img convert" to convert qcow2 and its backing files
> into
> > a single qcow2 image.
>
> Maybe it is RHBZ 1508886?
>
> Fam
>


Thanks, Fam.  We just tracked this down to the same BZ and are about to try
the commit ef6dada8b44e1e7c4bec5c1115903af9af415b50


-- 
Thanks,
Li Qun


Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-11 Thread Fam Zheng
On Thu, 04/12 09:51, David Lee wrote:
> On Mon, Apr 9, 2018 at 3:35 AM, Benny Zlotnik  wrote:
> 
> > $ gdb -p 13024 -batch -ex "thread apply all bt"
> > [Thread debugging using libthread_db enabled]
> > Using host libthread_db library "/lib64/libthread_db.so.1".
> > 0x7f98275cfaff in ppoll () from /lib64/libc.so.6
> >
> > Thread 1 (Thread 0x7f983e30ab00 (LWP 13024)):
> > #0  0x7f98275cfaff in ppoll () from /lib64/libc.so.6
> > #1  0x55b55cf59d69 in qemu_poll_ns ()
> > #2  0x55b55cf5ba45 in aio_poll ()
> > #3  0x55b55ceedc0f in bdrv_get_block_status_above ()
> > #4  0x55b55cea3611 in convert_iteration_sectors ()
> > #5  0x55b55cea4352 in img_convert ()
> > #6  0x55b55ce9d819 in main ()
> 
> 
> My team caught this issue too after switching to CentOS 7.4 with qemu-img
> 2.9.0
> gdb shows exactly the same backtrace when the convert stuck, and we are on
> NFS.
> 
> Later we found the following:
> 1. The stuck can happen on local storage, too.
> 2. Replace qemu-img 2.9.0 with 2.6.0 and everything works smoothly again.
> 
> BTW, we use "qemu-img convert" to convert qcow2 and its backing files into
> a single qcow2 image.

Maybe it is RHBZ 1508886?

Fam



Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-11 Thread David Lee
On Mon, Apr 9, 2018 at 3:35 AM, Benny Zlotnik  wrote:

> $ gdb -p 13024 -batch -ex "thread apply all bt"
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> 0x7f98275cfaff in ppoll () from /lib64/libc.so.6
>
> Thread 1 (Thread 0x7f983e30ab00 (LWP 13024)):
> #0  0x7f98275cfaff in ppoll () from /lib64/libc.so.6
> #1  0x55b55cf59d69 in qemu_poll_ns ()
> #2  0x55b55cf5ba45 in aio_poll ()
> #3  0x55b55ceedc0f in bdrv_get_block_status_above ()
> #4  0x55b55cea3611 in convert_iteration_sectors ()
> #5  0x55b55cea4352 in img_convert ()
> #6  0x55b55ce9d819 in main ()


My team hit this issue too after switching to CentOS 7.4 with qemu-img
2.9.0. gdb shows exactly the same backtrace when the convert gets stuck, and
we are on NFS.

Later we found the following:
1. The hang can happen on local storage, too.
2. Replacing qemu-img 2.9.0 with 2.6.0 makes everything work smoothly again.

BTW, we use "qemu-img convert" to convert qcow2 and its backing files into
a single qcow2 image.


> On Sun, Apr 8, 2018 at 10:28 PM, Nir Soffer  wrote:
>
> > On Sun, Apr 8, 2018 at 9:27 PM Benny Zlotnik 
> wrote:
> >
> >> Hi,
> >>
> >> As part of copy operation initiated by rhev got stuck for more than a
> day
> >> and consumes plenty of CPU
> >> vdsm 13024  3117 99 Apr07 ?1-06:58:43 /usr/bin/qemu-img
> >> convert
> >> -p -t none -T none -f qcow2
> >> /rhev/data-center/bb422fac-81c5-4fea-8782-3498bb5c8a59/
> >> 26989331-2c39-4b34-a7ed-d7dd7703646c/images/597e12b6-
> >> 19f5-45bd-868f-767600c7115e/62a5492e-e120-4c25-898e-9f5f5629853e
> >> -O raw /rhev/data-center/mnt/mantis-nfs-lif1.lab.eng.tlv2.redhat.com:
> >> _vol__service/26989331-2c39-4b34-a7ed-d7dd7703646c/images/
> >> 9ece9408-9ca6-48cd-992a-6f590c710672/06d6d3c0-beb8-
> 4b6b-ab00-56523df185da
> >>
> >> The target image appears to have no data yet:
> >> qemu-img info 06d6d3c0-beb8-4b6b-ab00-56523df185da"
> >> image: 06d6d3c0-beb8-4b6b-ab00-56523df185da
> >> file format: raw
> >> virtual size: 120G (128849018880 bytes)
> >> disk size: 0
> >>
> >> strace -p 13024 -tt -T -f shows only:
> >> ...
> >> 21:13:01.309382 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
> >> 0},
> >> NULL, 8) = 0 (Timeout) <0.10>
> >> 21:13:01.309411 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
> >> 0},
> >> NULL, 8) = 0 (Timeout) <0.09>
> >> 21:13:01.309440 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
> >> 0},
> >> NULL, 8) = 0 (Timeout) <0.09>
> >> 21:13:01.309468 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
> >> 0},
> >> NULL, 8) = 0 (Timeout) <0.10>
> >>
> >> version: qemu-img-rhev-2.9.0-16.el7_4.13.x86_64
> >>
> >> What could cause this? I'll provide any additional information needed
> >>
> >
> > A backtrace may help, try:
> >
> > gdb -p 13024 -batch -ex "thread apply all bt"
> >
> > Also adding Kevin and qemu-block.
> >
> > Nir
> >
>


-- 
Thanks,
Li Qun


Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-11 Thread Max Reitz
On 2018-04-09 08:04, Stefan Hajnoczi wrote:
> On Sun, Apr 08, 2018 at 10:35:16PM +0300, Benny Zlotnik wrote:
> 
> What type of storage are the source and destination images?  (e.g.
> source is a local qcow2 file on xfs, destination is a raw file on NFS)
> 
>> $ gdb -p 13024 -batch -ex "thread apply all bt"
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib64/libthread_db.so.1".
>> 0x7f98275cfaff in ppoll () from /lib64/libc.so.6
>>
>> Thread 1 (Thread 0x7f983e30ab00 (LWP 13024)):
>> #0  0x7f98275cfaff in ppoll () from /lib64/libc.so.6
>> #1  0x55b55cf59d69 in qemu_poll_ns ()
>> #2  0x55b55cf5ba45 in aio_poll ()
>> #3  0x55b55ceedc0f in bdrv_get_block_status_above ()
>> #4  0x55b55cea3611 in convert_iteration_sectors ()
> 
> CCing Max Reitz in case this is familiar.

Hmm, not really, no...

The culprit I know of (sensing block status outside of qemu) would block
in lseek64() under find_allocation().

I didn't have any luck reproducing the issue either...

Whenever I had some hang in ppoll(), it was usually during a drain, but
that doesn't seem to be the case here either.  So I have no idea.

Maybe I'll test some other configurations at another time, but so far I
didn't experience any hangs and I have no idea what could be provoking
them (other than some network issue outside of qemu, but well...).

Max

>> #5  0x55b55cea4352 in img_convert ()
>> #6  0x55b55ce9d819 in main ()
>>
>>
>> On Sun, Apr 8, 2018 at 10:28 PM, Nir Soffer  wrote:
>>
>>> On Sun, Apr 8, 2018 at 9:27 PM Benny Zlotnik  wrote:
>>>
>>>> Hi,
>>>>
>>>> As part of copy operation initiated by rhev got stuck for more than a day
>>>> and consumes plenty of CPU
>>>> vdsm 13024  3117 99 Apr07 ?1-06:58:43 /usr/bin/qemu-img
>>>> convert
>>>> -p -t none -T none -f qcow2
>>>> /rhev/data-center/bb422fac-81c5-4fea-8782-3498bb5c8a59/
>>>> 26989331-2c39-4b34-a7ed-d7dd7703646c/images/597e12b6-
>>>> 19f5-45bd-868f-767600c7115e/62a5492e-e120-4c25-898e-9f5f5629853e
>>>> -O raw /rhev/data-center/mnt/mantis-nfs-lif1.lab.eng.tlv2.redhat.com:
>>>> _vol__service/26989331-2c39-4b34-a7ed-d7dd7703646c/images/
>>>> 9ece9408-9ca6-48cd-992a-6f590c710672/06d6d3c0-beb8-4b6b-ab00-56523df185da
>>>>
>>>> The target image appears to have no data yet:
>>>> qemu-img info 06d6d3c0-beb8-4b6b-ab00-56523df185da"
>>>> image: 06d6d3c0-beb8-4b6b-ab00-56523df185da
>>>> file format: raw
>>>> virtual size: 120G (128849018880 bytes)
>>>> disk size: 0
>>>>
>>>> strace -p 13024 -tt -T -f shows only:
>>>> ...
>>>> 21:13:01.309382 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
>>>> 0},
>>>> NULL, 8) = 0 (Timeout) <0.10>
>>>> 21:13:01.309411 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
>>>> 0},
>>>> NULL, 8) = 0 (Timeout) <0.09>
>>>> 21:13:01.309440 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
>>>> 0},
>>>> NULL, 8) = 0 (Timeout) <0.09>
>>>> 21:13:01.309468 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
>>>> 0},
>>>> NULL, 8) = 0 (Timeout) <0.10>
>>>>
>>>> version: qemu-img-rhev-2.9.0-16.el7_4.13.x86_64
>>>>
>>>> What could cause this? I'll provide any additional information needed
>>>>
>>>
>>> A backtrace may help, try:
>>>
>>> gdb -p 13024 -batch -ex "thread apply all bt"
>>>
>>> Also adding Kevin and qemu-block.
>>>
>>> Nir
>>>






Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-09 Thread Benny Zlotnik
source: qcow2 on NFS
target: raw on NFS


source:
$ qemu-img info
/rhev/data-center/bb422fac-81c5-4fea-8782-3498bb5c8a59/26989331-2c39-4b34-a7ed-d7dd7703646c/images/597e12b6-19f5-45bd-868f-767600c7115e/62a5492e-e120-4c25-898e-9f5f5629853e
image:
/rhev/data-center/bb422fac-81c5-4fea-8782-3498bb5c8a59/26989331-2c39-4b34-a7ed-d7dd7703646c/images/597e12b6-19f5-45bd-868f-767600c7115e/62a5492e-e120-4c25-898e-9f5f5629853e
file format: qcow2
virtual size: 120G (128849018880 bytes)
disk size: 63G
cluster_size: 65536
backing file: 950926cc-aac6-42fd-a719-6386d4202897 (actual path:
/rhev/data-center/bb422fac-81c5-4fea-8782-3498bb5c8a59/26989331-2c39-4b34-a7ed-d7dd7703646c/images/597e12b6-19f5-45bd-868f-767600c7115e/950926cc-aac6-42fd-a719-6386d4202897)
backing file format: qcow2
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false

target:
$ qemu-img info /rhev/data-center/mnt/bb422fac-81c5-4fea-8782-3498bb5c8a59
/26989331-2c39-4b34-a7ed-d7dd7703646c/images/9ece9408-9ca6-48cd-992a-6f590c710672/06d6d3c0-beb8-4b6b-ab00-56523df185da
image:
bb422fac-81c5-4fea-8782-3498bb5c8a59/26989331-2c39-4b34-a7ed-d7dd7703646c/images/9ece9408-9ca6-48cd-992a-6f590c710672/06d6d3c0-beb8-4b6b-ab00-56523df185da
file format: raw
virtual size: 120G (128849018880 bytes)
disk size: 0


On Mon, Apr 9, 2018 at 9:04 AM, Stefan Hajnoczi  wrote:

> On Sun, Apr 08, 2018 at 10:35:16PM +0300, Benny Zlotnik wrote:
>
> What type of storage are the source and destination images?  (e.g.
> source is a local qcow2 file on xfs, destination is a raw file on NFS)
>
> > $ gdb -p 13024 -batch -ex "thread apply all bt"
> > [Thread debugging using libthread_db enabled]
> > Using host libthread_db library "/lib64/libthread_db.so.1".
> > 0x7f98275cfaff in ppoll () from /lib64/libc.so.6
> >
> > Thread 1 (Thread 0x7f983e30ab00 (LWP 13024)):
> > #0  0x7f98275cfaff in ppoll () from /lib64/libc.so.6
> > #1  0x55b55cf59d69 in qemu_poll_ns ()
> > #2  0x55b55cf5ba45 in aio_poll ()
> > #3  0x55b55ceedc0f in bdrv_get_block_status_above ()
> > #4  0x55b55cea3611 in convert_iteration_sectors ()
>
> CCing Max Reitz in case this is familiar.
>
> > #5  0x55b55cea4352 in img_convert ()
> > #6  0x55b55ce9d819 in main ()
> >
> >
> > On Sun, Apr 8, 2018 at 10:28 PM, Nir Soffer  wrote:
> >
> > > On Sun, Apr 8, 2018 at 9:27 PM Benny Zlotnik 
> wrote:
> > >
> > >> Hi,
> > >>
> > >> As part of copy operation initiated by rhev got stuck for more than a
> day
> > >> and consumes plenty of CPU
> > >> vdsm 13024  3117 99 Apr07 ?1-06:58:43 /usr/bin/qemu-img
> > >> convert
> > >> -p -t none -T none -f qcow2
> > >> /rhev/data-center/bb422fac-81c5-4fea-8782-3498bb5c8a59/
> > >> 26989331-2c39-4b34-a7ed-d7dd7703646c/images/597e12b6-
> > >> 19f5-45bd-868f-767600c7115e/62a5492e-e120-4c25-898e-9f5f5629853e
> > >> -O raw /rhev/data-center/mnt/mantis-nfs-lif1.lab.eng.tlv2.redhat.com:
> > >> _vol__service/26989331-2c39-4b34-a7ed-d7dd7703646c/images/
> > >> 9ece9408-9ca6-48cd-992a-6f590c710672/06d6d3c0-beb8-
> 4b6b-ab00-56523df185da
> > >>
> > >> The target image appears to have no data yet:
> > >> qemu-img info 06d6d3c0-beb8-4b6b-ab00-56523df185da"
> > >> image: 06d6d3c0-beb8-4b6b-ab00-56523df185da
> > >> file format: raw
> > >> virtual size: 120G (128849018880 bytes)
> > >> disk size: 0
> > >>
> > >> strace -p 13024 -tt -T -f shows only:
> > >> ...
> > >> 21:13:01.309382 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1,
> {0,
> > >> 0},
> > >> NULL, 8) = 0 (Timeout) <0.10>
> > >> 21:13:01.309411 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1,
> {0,
> > >> 0},
> > >> NULL, 8) = 0 (Timeout) <0.09>
> > >> 21:13:01.309440 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1,
> {0,
> > >> 0},
> > >> NULL, 8) = 0 (Timeout) <0.09>
> > >> 21:13:01.309468 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1,
> {0,
> > >> 0},
> > >> NULL, 8) = 0 (Timeout) <0.10>
> > >>
> > >> version: qemu-img-rhev-2.9.0-16.el7_4.13.x86_64
> > >>
> > >> What could cause this? I'll provide any additional information needed
> > >>
> > >
> > > A backtrace may help, try:
> > >
> > > gdb -p 13024 -batch -ex "thread apply all bt"
> > >
> > > Also adding Kevin and qemu-block.
> > >
> > > Nir
> > >
>


Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-09 Thread Stefan Hajnoczi
On Sun, Apr 08, 2018 at 10:35:16PM +0300, Benny Zlotnik wrote:

What type of storage are the source and destination images?  (e.g.
source is a local qcow2 file on xfs, destination is a raw file on NFS)

> $ gdb -p 13024 -batch -ex "thread apply all bt"
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> 0x7f98275cfaff in ppoll () from /lib64/libc.so.6
> 
> Thread 1 (Thread 0x7f983e30ab00 (LWP 13024)):
> #0  0x7f98275cfaff in ppoll () from /lib64/libc.so.6
> #1  0x55b55cf59d69 in qemu_poll_ns ()
> #2  0x55b55cf5ba45 in aio_poll ()
> #3  0x55b55ceedc0f in bdrv_get_block_status_above ()
> #4  0x55b55cea3611 in convert_iteration_sectors ()

CCing Max Reitz in case this is familiar.

> #5  0x55b55cea4352 in img_convert ()
> #6  0x55b55ce9d819 in main ()
> 
> 
> On Sun, Apr 8, 2018 at 10:28 PM, Nir Soffer  wrote:
> 
> > On Sun, Apr 8, 2018 at 9:27 PM Benny Zlotnik  wrote:
> >
> >> Hi,
> >>
> >> As part of copy operation initiated by rhev got stuck for more than a day
> >> and consumes plenty of CPU
> >> vdsm 13024  3117 99 Apr07 ?1-06:58:43 /usr/bin/qemu-img
> >> convert
> >> -p -t none -T none -f qcow2
> >> /rhev/data-center/bb422fac-81c5-4fea-8782-3498bb5c8a59/
> >> 26989331-2c39-4b34-a7ed-d7dd7703646c/images/597e12b6-
> >> 19f5-45bd-868f-767600c7115e/62a5492e-e120-4c25-898e-9f5f5629853e
> >> -O raw /rhev/data-center/mnt/mantis-nfs-lif1.lab.eng.tlv2.redhat.com:
> >> _vol__service/26989331-2c39-4b34-a7ed-d7dd7703646c/images/
> >> 9ece9408-9ca6-48cd-992a-6f590c710672/06d6d3c0-beb8-4b6b-ab00-56523df185da
> >>
> >> The target image appears to have no data yet:
> >> qemu-img info 06d6d3c0-beb8-4b6b-ab00-56523df185da"
> >> image: 06d6d3c0-beb8-4b6b-ab00-56523df185da
> >> file format: raw
> >> virtual size: 120G (128849018880 bytes)
> >> disk size: 0
> >>
> >> strace -p 13024 -tt -T -f shows only:
> >> ...
> >> 21:13:01.309382 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
> >> 0},
> >> NULL, 8) = 0 (Timeout) <0.10>
> >> 21:13:01.309411 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
> >> 0},
> >> NULL, 8) = 0 (Timeout) <0.09>
> >> 21:13:01.309440 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
> >> 0},
> >> NULL, 8) = 0 (Timeout) <0.09>
> >> 21:13:01.309468 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
> >> 0},
> >> NULL, 8) = 0 (Timeout) <0.10>
> >>
> >> version: qemu-img-rhev-2.9.0-16.el7_4.13.x86_64
> >>
> >> What could cause this? I'll provide any additional information needed
> >>
> >
> > A backtrace may help, try:
> >
> > gdb -p 13024 -batch -ex "thread apply all bt"
> >
> > Also adding Kevin and qemu-block.
> >
> > Nir
> >




Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-08 Thread Benny Zlotnik
$ gdb -p 13024 -batch -ex "thread apply all bt"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x7f98275cfaff in ppoll () from /lib64/libc.so.6

Thread 1 (Thread 0x7f983e30ab00 (LWP 13024)):
#0  0x7f98275cfaff in ppoll () from /lib64/libc.so.6
#1  0x55b55cf59d69 in qemu_poll_ns ()
#2  0x55b55cf5ba45 in aio_poll ()
#3  0x55b55ceedc0f in bdrv_get_block_status_above ()
#4  0x55b55cea3611 in convert_iteration_sectors ()
#5  0x55b55cea4352 in img_convert ()
#6  0x55b55ce9d819 in main ()


On Sun, Apr 8, 2018 at 10:28 PM, Nir Soffer  wrote:

> On Sun, Apr 8, 2018 at 9:27 PM Benny Zlotnik  wrote:
>
>> Hi,
>>
>> As part of copy operation initiated by rhev got stuck for more than a day
>> and consumes plenty of CPU
>> vdsm 13024  3117 99 Apr07 ?1-06:58:43 /usr/bin/qemu-img
>> convert
>> -p -t none -T none -f qcow2
>> /rhev/data-center/bb422fac-81c5-4fea-8782-3498bb5c8a59/
>> 26989331-2c39-4b34-a7ed-d7dd7703646c/images/597e12b6-
>> 19f5-45bd-868f-767600c7115e/62a5492e-e120-4c25-898e-9f5f5629853e
>> -O raw /rhev/data-center/mnt/mantis-nfs-lif1.lab.eng.tlv2.redhat.com:
>> _vol__service/26989331-2c39-4b34-a7ed-d7dd7703646c/images/
>> 9ece9408-9ca6-48cd-992a-6f590c710672/06d6d3c0-beb8-4b6b-ab00-56523df185da
>>
>> The target image appears to have no data yet:
>> qemu-img info 06d6d3c0-beb8-4b6b-ab00-56523df185da"
>> image: 06d6d3c0-beb8-4b6b-ab00-56523df185da
>> file format: raw
>> virtual size: 120G (128849018880 bytes)
>> disk size: 0
>>
>> strace -p 13024 -tt -T -f shows only:
>> ...
>> 21:13:01.309382 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
>> 0},
>> NULL, 8) = 0 (Timeout) <0.10>
>> 21:13:01.309411 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
>> 0},
>> NULL, 8) = 0 (Timeout) <0.09>
>> 21:13:01.309440 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
>> 0},
>> NULL, 8) = 0 (Timeout) <0.09>
>> 21:13:01.309468 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
>> 0},
>> NULL, 8) = 0 (Timeout) <0.10>
>>
>> version: qemu-img-rhev-2.9.0-16.el7_4.13.x86_64
>>
>> What could cause this? I'll provide any additional information needed
>>
>
> A backtrace may help, try:
>
> gdb -p 13024 -batch -ex "thread apply all bt"
>
> Also adding Kevin and qemu-block.
>
> Nir
>


Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-08 Thread Nir Soffer
On Sun, Apr 8, 2018 at 9:27 PM Benny Zlotnik  wrote:

> Hi,
>
> A qemu-img convert started as part of a copy operation initiated by rhev
> has been stuck for more than a day and is consuming plenty of CPU:
> vdsm 13024  3117 99 Apr07 ?1-06:58:43 /usr/bin/qemu-img convert
> -p -t none -T none -f qcow2
>
> /rhev/data-center/bb422fac-81c5-4fea-8782-3498bb5c8a59/26989331-2c39-4b34-a7ed-d7dd7703646c/images/597e12b6-19f5-45bd-868f-767600c7115e/62a5492e-e120-4c25-898e-9f5f5629853e
> -O raw /rhev/data-center/mnt/mantis-nfs-lif1.lab.eng.tlv2.redhat.com:
>
> _vol__service/26989331-2c39-4b34-a7ed-d7dd7703646c/images/9ece9408-9ca6-48cd-992a-6f590c710672/06d6d3c0-beb8-4b6b-ab00-56523df185da
>
> The target image appears to have no data yet:
> qemu-img info 06d6d3c0-beb8-4b6b-ab00-56523df185da"
> image: 06d6d3c0-beb8-4b6b-ab00-56523df185da
> file format: raw
> virtual size: 120G (128849018880 bytes)
> disk size: 0
>
> strace -p 13024 -tt -T -f shows only:
> ...
> 21:13:01.309382 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0, 0},
> NULL, 8) = 0 (Timeout) <0.10>
> 21:13:01.309411 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0, 0},
> NULL, 8) = 0 (Timeout) <0.09>
> 21:13:01.309440 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0, 0},
> NULL, 8) = 0 (Timeout) <0.09>
> 21:13:01.309468 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0, 0},
> NULL, 8) = 0 (Timeout) <0.10>
>
> version: qemu-img-rhev-2.9.0-16.el7_4.13.x86_64
>
> What could cause this? I'll provide any additional information needed
>

A backtrace may help, try:

gdb -p 13024 -batch -ex "thread apply all bt"

Also adding Kevin and qemu-block.

Nir