[Qemu-block] [PULL 1/5] vhost-user-blk: set config ops before vhost-user init

2018-04-09 Thread Michael S. Tsirkin
From: Maxime Coquelin 

As soon as vhost-user init is done, the backend may send
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG, so let's set the
notification callback before it.

Also, it will be used to know whether the device supports
the config feature, and thus whether to advertise it.

Signed-off-by: Maxime Coquelin 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Changpeng Liu 
---
 hw/block/vhost-user-blk.c | 4 ++--
 hw/virtio/vhost.c | 1 -
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index f840f07..262baca 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -259,6 +259,8 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
     s->dev.vq_index = 0;
     s->dev.backend_features = 0;
 
+    vhost_dev_set_config_notifier(&s->dev, &blk_ops);
+
     ret = vhost_dev_init(&s->dev, &s->chardev, VHOST_BACKEND_TYPE_USER, 0);
     if (ret < 0) {
         error_setg(errp, "vhost-user-blk: vhost initialization failed: %s",
@@ -277,8 +279,6 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
         s->blkcfg.num_queues = s->num_queues;
     }
 
-    vhost_dev_set_config_notifier(&s->dev, &blk_ops);
-
     return;
 
 vhost_err:
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 250f886..b6c314e 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1451,7 +1451,6 @@ int vhost_dev_set_config(struct vhost_dev *hdev, const uint8_t *data,
 void vhost_dev_set_config_notifier(struct vhost_dev *hdev,
                                    const VhostDevConfigOps *ops)
 {
-    assert(hdev->vhost_ops);
     hdev->config_ops = ops;
 }
 
-- 
MST
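
The ordering problem described in the commit message can be illustrated with a small Python model. This is only a toy sketch and not QEMU code: `ToyVhostDev`, its methods, and the `"CONFIG_CHANGE"` string are hypothetical names, and the model assumes the backend may deliver a config-change message as soon as init has finished.

```python
class ToyVhostDev:
    """Hypothetical stand-in for a vhost device; not the QEMU struct."""

    def __init__(self):
        self.config_ops = None   # notification callback, unset by default
        self.delivered = []      # messages that reached the callback
        self.dropped = []        # messages that arrived with no callback set

    def set_config_notifier(self, ops):
        self.config_ops = ops

    def handle_backend_message(self, msg):
        # A config-change message can only be acted on if a callback exists.
        if self.config_ops is None:
            self.dropped.append(msg)
        else:
            self.config_ops(self, msg)

    def init(self, backend_messages):
        # Assumption: the backend may send messages right after init finishes.
        for msg in backend_messages:
            self.handle_backend_message(msg)


def on_config_change(dev, msg):
    dev.delivered.append(msg)


# Old order: init first, notifier afterwards. The early message is lost.
late = ToyVhostDev()
late.init(["CONFIG_CHANGE"])
late.set_config_notifier(on_config_change)

# New order: notifier first, then init. The early message is handled.
early = ToyVhostDev()
early.set_config_notifier(on_config_change)
early.init(["CONFIG_CHANGE"])
```

In this model the `late` device records the message in `dropped`, while the `early` device records it in `delivered`, which is the behaviour the patch is after.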




Re: [Qemu-block] block-stream/commit and mixing internal and external snapshots

2018-04-09 Thread Eric Blake
On 04/09/2018 04:11 AM, Kevin Wolf wrote:
> On 07.04.2018 at 00:16, Eric Blake wrote:
>> Perhaps others have already known this, but I just realized that if you
>> mix internal and external snapshots, you can set yourself up for massive
>> failures when trying to use block-stream or block-commit to consolidate
>> data across the external backing chain, without also thinking about the
>> internal snapshots.
> 
> Yeah, internal and external snapshots don't mix well. Basically, the
> only thing that will work reliably is having a qcow2 image with internal
> snapshots at the top, and then an immutable backing chain without
> internal snapshots below it.

I may try to tackle some safety valve additions in 2.13; but as the
problem is pre-existing and not a late regression in 2.12, it is not a
candidate for rushing anything into -rc3.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-block] [PATCH 3/3] iotests: blacklist bochs and cloop for 205 and 208

2018-04-09 Thread Kevin Wolf
On 09.04.2018 at 13:30, Vladimir Sementsov-Ogievskiy wrote:
> 03.04.2018 16:36, Kevin Wolf wrote:
> > On 30.03.2018 at 17:16, Vladimir Sementsov-Ogievskiy wrote:
> > > Blacklist these formats, as they don't support image creation, as they
> > > say:
> > >  > ./qemu-img create -f bochs x 1m
> > >  qemu-img: x: Format driver 'bochs' does not support image creation
> > > 
> > >  > ./qemu-img create -f cloop x 1m
> > >  qemu-img: x: Format driver 'cloop' does not support image creation
> > > 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > We can take this for now, but I think I would actually prefer a solution
> > like in the bash tests, where the $IMGFMT_GENERIC environment variable
> > is checked for "_supported_fmt generic".
> > 
> > I suppose in Python test cases, we can assume that generic is meant when
> > neither supported_fmts nor unsupported_fmts are given (or both are empty
> > lists).
> > 
> > Kevin
> 
> It may be OK for verify_image_format, as we can either call it or not
> (to support all formats).
> 
> But the iotests main function always calls verify_image_format, so this
> would skip bochs and cloop for all iotests which call main() without a
> format restriction.

Yes, but that's what we want. Read-only formats can only be tested with
test cases made specifically for the respective format, because they
need to use a binary image from sample_images/.

I don't think there is a case where we really want to run the test for
all possible formats. Can you think of one?

> So, I think it is safer to directly mimic bash tests behavior - allow
> 'generic' as a member of supported_fmts.

That works, too, but I think it's not quite as nice.

Kevin



[Qemu-block] [PATCH v2 1/2] iotests.py: improve verify_image_format helper

2018-04-09 Thread Vladimir Sementsov-Ogievskiy
Support "generic" formats like in bash tests with their
   _supported_fmt generic
A test that supports "generic" formats runs if IMGFMT_GENERIC is
true, which is the default for every format except bochs and cloop.
However, you can use verify_image_format(['generic', 'bochs']), which
runs for all formats except cloop (for the moment).

Also, add an assert (we don't want to set both arguments) and remove
duplication.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/iotests.py | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index b5d7945af8..1a2b83893c 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -532,9 +532,17 @@ def notrun(reason):
     sys.exit(0)
 
 def verify_image_format(supported_fmts=[], unsupported_fmts=[]):
-    if supported_fmts and (imgfmt not in supported_fmts):
-        notrun('not suitable for this image format: %s' % imgfmt)
-    if unsupported_fmts and (imgfmt in unsupported_fmts):
+    assert not (supported_fmts and unsupported_fmts)
+
+    if 'generic' in supported_fmts and \
+            os.environ.get('IMGFMT_GENERIC', 'true') == 'true':
+        # similar to
+        #   _supported_fmt generic
+        # for bash tests
+        return
+
+    not_sup = supported_fmts and (imgfmt not in supported_fmts)
+    if not_sup or (imgfmt in unsupported_fmts):
         notrun('not suitable for this image format: %s' % imgfmt)
 
 def verify_platform(supported_oses=['linux']):
-- 
2.11.1
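
The decision logic of the patched helper can be tried out in isolation with the sketch below. It is not the code from iotests.py: `should_run` is a hypothetical pure function that returns a boolean instead of calling `notrun()`, and it takes the environment as a plain dict so that no real IMGFMT_GENERIC variable is needed.

```python
def should_run(imgfmt, supported_fmts=(), unsupported_fmts=(), env=None):
    """Return True if a test should run for imgfmt, False if it is skipped."""
    env = {} if env is None else env
    # Mirrors the assert in the patch: only one of the two lists may be given.
    assert not (supported_fmts and unsupported_fmts)

    # 'generic' accepts the format when IMGFMT_GENERIC is 'true' (the default).
    if 'generic' in supported_fmts and \
            env.get('IMGFMT_GENERIC', 'true') == 'true':
        return True

    not_sup = bool(supported_fmts) and imgfmt not in supported_fmts
    return not (not_sup or imgfmt in unsupported_fmts)


# Per the commit message, IMGFMT_GENERIC is not true for bochs and cloop.
read_only_env = {'IMGFMT_GENERIC': 'false'}
qcow2_generic = should_run('qcow2', ['generic'])
bochs_generic = should_run('bochs', ['generic'], env=read_only_env)
bochs_listed = should_run('bochs', ['generic', 'bochs'], env=read_only_env)
cloop_not_listed = should_run('cloop', ['generic', 'bochs'], env=read_only_env)
```

With these inputs, qcow2 runs under `['generic']`, bochs is skipped unless it is listed explicitly, and cloop stays skipped when only bochs is listed.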




[Qemu-block] [PATCH v2 2/2] iotests: blacklist bochs and cloop for 205 and 208

2018-04-09 Thread Vladimir Sementsov-Ogievskiy
Blacklist these formats, as they don't support image creation, as they
say:
> ./qemu-img create -f bochs x 1m
qemu-img: x: Format driver 'bochs' does not support image creation

> ./qemu-img create -f cloop x 1m
qemu-img: x: Format driver 'cloop' does not support image creation

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/205 | 2 +-
 tests/qemu-iotests/208 | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/205 b/tests/qemu-iotests/205
index e7b2eae51d..31b2f5707a 100755
--- a/tests/qemu-iotests/205
+++ b/tests/qemu-iotests/205
@@ -153,4 +153,4 @@ class TestNbdServerRemove(iotests.QMPTestCase):
 
 
 if __name__ == '__main__':
-iotests.main()
+iotests.main(supported_fmts=['generic'])
diff --git a/tests/qemu-iotests/208 b/tests/qemu-iotests/208
index 18f59ada94..1e202388dc 100755
--- a/tests/qemu-iotests/208
+++ b/tests/qemu-iotests/208
@@ -22,6 +22,8 @@
 
 import iotests
 
+iotests.verify_image_format(supported_fmts=['generic'])
+
 with iotests.FilePath('disk.img') as disk_img_path, \
  iotests.FilePath('disk-snapshot.img') as disk_snapshot_img_path, \
  iotests.FilePath('nbd.sock') as nbd_sock_path, \
-- 
2.11.1




[Qemu-block] [PATCH v2 0/2] iotests: blacklist bochs and cloop for 205 and 208

2018-04-09 Thread Vladimir Sementsov-Ogievskiy
v2: move from unsupported_fmts to support "generic", like in bash tests.

Vladimir Sementsov-Ogievskiy (2):
  iotests.py: improve verify_image_format helper
  iotests: blacklist bochs and cloop for 205 and 208

 tests/qemu-iotests/205|  2 +-
 tests/qemu-iotests/208|  2 ++
 tests/qemu-iotests/iotests.py | 14 +++---
 3 files changed, 14 insertions(+), 4 deletions(-)

-- 
2.11.1




Re: [Qemu-block] [PATCH 3/3] iotests: blacklist bochs and cloop for 205 and 208

2018-04-09 Thread Vladimir Sementsov-Ogievskiy

03.04.2018 16:36, Kevin Wolf wrote:
> On 30.03.2018 at 17:16, Vladimir Sementsov-Ogievskiy wrote:
> > Blacklist these formats, as they don't support image creation, as they
> > say:
> >  > ./qemu-img create -f bochs x 1m
> >  qemu-img: x: Format driver 'bochs' does not support image creation
> > 
> >  > ./qemu-img create -f cloop x 1m
> >  qemu-img: x: Format driver 'cloop' does not support image creation
> > 
> > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> We can take this for now, but I think I would actually prefer a solution
> like in the bash tests, where the $IMGFMT_GENERIC environment variable
> is checked for "_supported_fmt generic".
> 
> I suppose in Python test cases, we can assume that generic is meant when
> neither supported_fmts nor unsupported_fmts are given (or both are empty
> lists).
> 
> Kevin

It may be OK for verify_image_format, as we can either call it or not
(to support all formats).

But the iotests main function always calls verify_image_format, so this
would skip bochs and cloop for all iotests which call main() without a
format restriction.

So, I think it is safer to directly mimic bash tests behavior - allow 
'generic' as a member of supported_fmts.


--
Best regards,
Vladimir




Re: [Qemu-block] [PATCH for-2.12] hw/block/pflash_cfi: fix off-by-one error

2018-04-09 Thread Kevin Wolf
On 05.04.2018 at 01:32, Philippe Mathieu-Daudé wrote:
> ASAN reported:
> 
> hw/block/pflash_cfi02.c:245:33: runtime error: index 82 out of bounds for 
> type 'uint8_t [82]'
> 
> Since the 'cfi_len' member is not used, remove it to keep the code safer.
> 
> Reported-by: AddressSanitizer
> Signed-off-by: Philippe Mathieu-Daudé 

Cc: qemu-sta...@nongnu.org

Thanks, applied to the block branch.

Kevin



Re: [Qemu-block] [PATCH] iotests: Split 214 off of 122

2018-04-09 Thread Alberto Garcia
On Fri 06 Apr 2018 06:41:08 PM CEST, Max Reitz  wrote:
> Commit abd3622cc03cf41ed542126a540385f30a4c0175 added a case to 122
> regarding how the qcow2 driver handles an incorrect compressed data
> length value.  This does not really fit into 122, as that file is
> supposed to contain qemu-img convert test cases, which this case is not.
> So this patch splits it off into its own file; maybe we will even get
> more qcow2-only compression tests in the future.
>
> Also, that test case does not work with refcount_bits=1, so mark that
> option as unsupported.
>
> Signed-off-by: Max Reitz 

Looks good to me

> I was a bit lost what to do about the copyright text, since this test
> case was written by Berto.  I figured I'd drop the "owner" variable
> (it isn't used anyway), but I put "Red Hat" into the copyright line --
> currently every test has copyright information, so I decided it'd be
> difficult to leave that out, and I figured I simply cannot claim
> copyright for Igalia.  So, here we go.

The new file contains only my test, right? You can use

Copyright (C) 2018 Igalia, S.L.
Author: Alberto Garcia 

and add

Signed-off-by: Alberto Garcia 

Berto



Re: [Qemu-block] block-stream/commit and mixing internal and external snapshots

2018-04-09 Thread Kevin Wolf
On 07.04.2018 at 00:16, Eric Blake wrote:
> Perhaps others have already known this, but I just realized that if you
> mix internal and external snapshots, you can set yourself up for massive
> failures when trying to use block-stream or block-commit to consolidate
> data across the external backing chain, without also thinking about the
> internal snapshots.

Yeah, internal and external snapshots don't mix well. Basically, the
only thing that will work reliably is having a qcow2 image with internal
snapshots at the top, and then an immutable backing chain without
internal snapshots below it.

> Here's a quick demonstration:
> [...]
> 
> The root cause to all of this is that right now, ALL internal snapshots
> share the same backing file information in the file header; but
> block-stream operations only modify the active snapshot.  The actions of
> changing the backing file or of rewriting the clusters in the backing
> file don't break the active snapshot, but DO bleed through to the
> internal snapshots, for any cluster where the internal snapshot was
> relying on the backing file.
> 
> Does this mean we should make it harder to perform external block
> operations on a qcow2 file that has internal snapshots (either refuse
> outright, or at least require a 'force' flag to let the user acknowledge
> the risk)?  Similarly, should it be harder to create an internal
> snapshot when an image already has an external backing file, and/or
> should we improve the qcow2 specification of internal snapshot
> descriptors to record a per-snapshot backing file rather than the
> current approach that all snapshots share the same backing file?
> Whether or not we track a per-snapshot backing file, should the presence
> of internal snapshots be used to request op-blockers for read
> consistency on backing files?

Op blockers can't really protect a node against itself. As far as the
backing file node is concerned, nothing bad has happened. It is still
fully consistent and it hasn't been written to. It just isn't used any
more by its parent node.

Possibly we can use a blocker to enforce that the backing file child
isn't changed, but that would be something like a BLK_PERM_GRAPH_MOD
permission that we failed to define precisely so far.

Other than that, if you want to make the merge of the external snapshots
fail, maybe the only thing you could do is return an error when
trying to change the backing file link in qcow2_change_backing_file()
while there are internal snapshots. I'm not sure that this will result
in a good state, though, and it is only called at the very end of the
block job (i.e. all data is already copied), so it's not a nice failure
mode.

Kevin
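
The "bleed through" effect Eric describes can be reproduced in a toy Python model. This is a deliberate simplification and not the qcow2 format: `ToyImage` and its methods are hypothetical, clusters are plain dictionary entries, and the only property carried over from the discussion is that all internal snapshots share the single backing link of the image while a stream operation touches only the active layer.

```python
class ToyImage:
    """Toy copy-on-write image; a simplification, not the qcow2 format."""

    def __init__(self, backing=None):
        self.backing = backing    # one backing link shared by all snapshots
        self.active = {}          # cluster index -> data for the active layer
        self.snapshots = {}       # name -> frozen copy of a cluster table

    def _read(self, table, idx):
        if idx in table:
            return table[idx]
        if self.backing is not None:
            return self.backing.read(idx)
        return None               # unallocated and no backing file

    def read(self, idx):
        return self._read(self.active, idx)

    def read_snapshot(self, name, idx):
        # Snapshots keep their own cluster table but use the shared link.
        return self._read(self.snapshots[name], idx)

    def snapshot(self, name):
        self.snapshots[name] = dict(self.active)

    def stream(self, indexes):
        # Like block-stream in spirit: pull backing data into the active
        # layer only, then drop the backing link in the shared header.
        for idx in indexes:
            if idx not in self.active:
                self.active[idx] = self.read(idx)
        self.backing = None


base = ToyImage()
base.active[0] = "base-data"

top = ToyImage(backing=base)
top.snapshot("snap1")             # snap1 relies on the backing file
before = top.read_snapshot("snap1", 0)

top.stream([0])
after_active = top.read(0)
after_snapshot = top.read_snapshot("snap1", 0)
```

In the model, the active layer still reads its data after the stream, but the internal snapshot no longer sees the cluster it used to get from the backing file.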




Re: [Qemu-block] [RFC PATCH 4/8] file-posix: Implement bdrv_co_copy_range

2018-04-09 Thread Fam Zheng
On Wed, 04/04 14:20, Stefan Hajnoczi wrote:
> On Thu, Mar 29, 2018 at 07:09:10PM +0800, Fam Zheng wrote:
> > +static ssize_t handle_aiocb_copy_range(RawPosixAIOData *aiocb)
> > +{
> > +#ifndef HAS_COPY_FILE_RANGE
> > +return -ENOTSUP;
> > +#else
> > +uint64_t bytes = aiocb->aio_nbytes;
> > +off_t in_off = aiocb->aio_offset;
> > +off_t out_off = aiocb->offset2;
> > +
> > +while (bytes) {
> > +ssize_t ret = copy_file_range(aiocb->aio_fildes, &in_off,
> > +  aiocb->fd2, &out_off,
> > +  bytes, 0);
> > +if (ret < 0) {
> > +return -errno;
> > +}
> 
> EINTR should retry.

Will add (it is not listed in the manpage so I wasn't sure if it is necessary.)

Fam
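
The retry-on-EINTR pattern Stefan asks for can be sketched in Python as follows. This is not the QEMU patch: `copy_all` and `make_flaky_copier` are hypothetical helpers, and the copy function is injected so the loop can be exercised without a real copy_file_range() call. In Python, a system call failing with EINTR surfaces as `InterruptedError`.

```python
import errno


def copy_all(copy_fn, nbytes):
    """Call copy_fn(remaining) until nbytes are copied, retrying on EINTR.

    copy_fn is a hypothetical callable that returns the number of bytes it
    copied and may raise InterruptedError (errno EINTR).
    """
    remaining = nbytes
    while remaining > 0:
        try:
            done = copy_fn(remaining)
        except InterruptedError:
            continue              # interrupted by a signal: retry the call
        if done == 0:
            break                 # nothing copied, e.g. end of the source
        remaining -= done
    return nbytes - remaining


def make_flaky_copier(chunk, interruptions):
    """Build a fake copy function that fails with EINTR a few times."""
    state = {"left": interruptions}

    def copy_fn(remaining):
        if state["left"] > 0:
            state["left"] -= 1
            raise InterruptedError(errno.EINTR, "interrupted")
        return min(chunk, remaining)

    return copy_fn


copied = copy_all(make_flaky_copier(4096, 2), 10000)
```

The loop also handles short copies, since each call may transfer fewer bytes than requested; any other error is left to propagate to the caller.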



Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-09 Thread Benny Zlotnik
source: qcow2 on NFS
target: raw on NFS


source:
$ qemu-img info
/rhev/data-center/bb422fac-81c5-4fea-8782-3498bb5c8a59/26989331-2c39-4b34-a7ed-d7dd7703646c/images/597e12b6-19f5-45bd-868f-767600c7115e/62a5492e-e120-4c25-898e-9f5f5629853e
image:
/rhev/data-center/bb422fac-81c5-4fea-8782-3498bb5c8a59/26989331-2c39-4b34-a7ed-d7dd7703646c/images/597e12b6-19f5-45bd-868f-767600c7115e/62a5492e-e120-4c25-898e-9f5f5629853e
file format: qcow2
virtual size: 120G (128849018880 bytes)
disk size: 63G
cluster_size: 65536
backing file: 950926cc-aac6-42fd-a719-6386d4202897 (actual path:
/rhev/data-center/bb422fac-81c5-4fea-8782-3498bb5c8a59/26989331-2c39-4b34-a7ed-d7dd7703646c/images/597e12b6-19f5-45bd-868f-767600c7115e/950926cc-aac6-42fd-a719-6386d4202897)
backing file format: qcow2
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false

target:
$ qemu-img info /rhev/data-center/mnt/bb422fac-81c5-4fea-8782-3498bb5c8a59
/26989331-2c39-4b34-a7ed-d7dd7703646c/images/9ece9408-9ca6-48cd-992a-6f590c710672/06d6d3c0-beb8-4b6b-ab00-56523df185da
image:
bb422fac-81c5-4fea-8782-3498bb5c8a59/26989331-2c39-4b34-a7ed-d7dd7703646c/images/9ece9408-9ca6-48cd-992a-6f590c710672/06d6d3c0-beb8-4b6b-ab00-56523df185da
file format: raw
virtual size: 120G (128849018880 bytes)
disk size: 0


On Mon, Apr 9, 2018 at 9:04 AM, Stefan Hajnoczi  wrote:

> On Sun, Apr 08, 2018 at 10:35:16PM +0300, Benny Zlotnik wrote:
>
> What type of storage are the source and destination images?  (e.g.
> source is a local qcow2 file on xfs, destination is a raw file on NFS)
>
> > $ gdb -p 13024 -batch -ex "thread apply all bt"
> > [Thread debugging using libthread_db enabled]
> > Using host libthread_db library "/lib64/libthread_db.so.1".
> > 0x7f98275cfaff in ppoll () from /lib64/libc.so.6
> >
> > Thread 1 (Thread 0x7f983e30ab00 (LWP 13024)):
> > #0  0x7f98275cfaff in ppoll () from /lib64/libc.so.6
> > #1  0x55b55cf59d69 in qemu_poll_ns ()
> > #2  0x55b55cf5ba45 in aio_poll ()
> > #3  0x55b55ceedc0f in bdrv_get_block_status_above ()
> > #4  0x55b55cea3611 in convert_iteration_sectors ()
>
> CCing Max Reitz in case this is familiar.
>
> > #5  0x55b55cea4352 in img_convert ()
> > #6  0x55b55ce9d819 in main ()
> >
> >
> > On Sun, Apr 8, 2018 at 10:28 PM, Nir Soffer  wrote:
> >
> > > On Sun, Apr 8, 2018 at 9:27 PM Benny Zlotnik 
> wrote:
> > >
> > >> Hi,
> > >>
> > >> A copy operation initiated by rhev got stuck for more than a day
> > >> and is consuming plenty of CPU:
> > >> vdsm 13024  3117 99 Apr07 ?1-06:58:43 /usr/bin/qemu-img
> > >> convert
> > >> -p -t none -T none -f qcow2
> > >> /rhev/data-center/bb422fac-81c5-4fea-8782-3498bb5c8a59/
> > >> 26989331-2c39-4b34-a7ed-d7dd7703646c/images/597e12b6-
> > >> 19f5-45bd-868f-767600c7115e/62a5492e-e120-4c25-898e-9f5f5629853e
> > >> -O raw /rhev/data-center/mnt/mantis-nfs-lif1.lab.eng.tlv2.redhat.com:
> > >> _vol__service/26989331-2c39-4b34-a7ed-d7dd7703646c/images/
> > >> 9ece9408-9ca6-48cd-992a-6f590c710672/06d6d3c0-beb8-
> 4b6b-ab00-56523df185da
> > >>
> > >> The target image appears to have no data yet:
> > >> qemu-img info 06d6d3c0-beb8-4b6b-ab00-56523df185da"
> > >> image: 06d6d3c0-beb8-4b6b-ab00-56523df185da
> > >> file format: raw
> > >> virtual size: 120G (128849018880 bytes)
> > >> disk size: 0
> > >>
> > >> strace -p 13024 -tt -T -f shows only:
> > >> ...
> > >> 21:13:01.309382 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1,
> {0,
> > >> 0},
> > >> NULL, 8) = 0 (Timeout) <0.10>
> > >> 21:13:01.309411 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1,
> {0,
> > >> 0},
> > >> NULL, 8) = 0 (Timeout) <0.09>
> > >> 21:13:01.309440 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1,
> {0,
> > >> 0},
> > >> NULL, 8) = 0 (Timeout) <0.09>
> > >> 21:13:01.309468 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1,
> {0,
> > >> 0},
> > >> NULL, 8) = 0 (Timeout) <0.10>
> > >>
> > >> version: qemu-img-rhev-2.9.0-16.el7_4.13.x86_64
> > >>
> > >> What could cause this? I'll provide any additional information needed
> > >>
> > >
> > > A backtrace may help, try:
> > >
> > > gdb -p 13024 -batch -ex "thread apply all bt"
> > >
> > > Also adding Kevin and qemu-block.
> > >
> > > Nir
> > >
>


Re: [Qemu-block] [PATCH for-2.12 v2] qemu-iotests: update 185 output

2018-04-09 Thread Stefan Hajnoczi
On Wed, Apr 04, 2018 at 06:16:12PM +0200, Max Reitz wrote:
> On 2018-04-04 17:01, Stefan Hajnoczi wrote:
> > Commit 4486e89c219c0d1b9bd8dfa0b1dd5b0d51ff2268 ("vl: introduce
> > vm_shutdown()") added a bdrv_drain_all() call.  As a side-effect of the
> > drain operation the block job iterates one more time than before.  The
> > 185 output no longer matches and the test is failing now.
> > 
> > It may be possible to avoid the superfluous block job iteration, but
> > that type of patch is not suitable late in the QEMU 2.12 release cycle.
> > 
> > This patch simply updates the 185 output file.  The new behavior is
> > correct, just not optimal, so make the test pass again.
> > 
> > Fixes: 4486e89c219c0d1b9bd8dfa0b1dd5b0d51ff2268 ("vl: introduce 
> > vm_shutdown()")
> > Cc: Kevin Wolf 
> > Cc: QingFeng Hao 
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> >  tests/qemu-iotests/185 | 10 ++
> >  tests/qemu-iotests/185.out | 12 +++-
> >  2 files changed, 13 insertions(+), 9 deletions(-)
> 
> On tmpfs, this isn't enough to let the test pass.  There, the active
> commit job finishes before the quit is sent, resulting in this diff:
> 
> --- tests/qemu-iotests/185.out  2018-04-04 18:10:02.015935435 +0200
> +++ tests/qemu-iotests/185.out.bad  2018-04-04 18:10:21.045473817 +0200
> @@ -26,9 +26,9 @@
> 
>  {"return": {}}
>  {"return": {}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> "event": "BLOCK_JOB_READY", "data": {"device": "disk", "len": 4194304,
> "offset": 4194304, "speed": 65536, "type": "commit"}}
>  {"return": {}}
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> "event": "SHUTDOWN", "data": {"guest": false}}
> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> "event": "BLOCK_JOB_READY", "data": {"device": "disk", "len": 4194304,
> "offset": 4194304, "speed": 65536, "type": "commit"}}
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> "event": "BLOCK_JOB_COMPLETED", "data": {"device": "disk", "len":
> 4194304, "offset": 4194304, "speed": 65536, "type": "commit"}}
> 
>  === Start mirror job and exit qemu ===
> 
> This seems to be independent of whether there is actually data on
> TEST_IMG (the commit source), so something doesn't seem quite right with
> the block job throttling here...?

That is a race condition.  I can reproduce it on XFS too.

Will see if I can figure it out...

Stefan




Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-09 Thread Stefan Hajnoczi
On Sun, Apr 08, 2018 at 10:35:16PM +0300, Benny Zlotnik wrote:

What type of storage are the source and destination images?  (e.g.
source is a local qcow2 file on xfs, destination is a raw file on NFS)

> $ gdb -p 13024 -batch -ex "thread apply all bt"
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> 0x7f98275cfaff in ppoll () from /lib64/libc.so.6
> 
> Thread 1 (Thread 0x7f983e30ab00 (LWP 13024)):
> #0  0x7f98275cfaff in ppoll () from /lib64/libc.so.6
> #1  0x55b55cf59d69 in qemu_poll_ns ()
> #2  0x55b55cf5ba45 in aio_poll ()
> #3  0x55b55ceedc0f in bdrv_get_block_status_above ()
> #4  0x55b55cea3611 in convert_iteration_sectors ()

CCing Max Reitz in case this is familiar.

> #5  0x55b55cea4352 in img_convert ()
> #6  0x55b55ce9d819 in main ()
> 
> 
> On Sun, Apr 8, 2018 at 10:28 PM, Nir Soffer  wrote:
> 
> > On Sun, Apr 8, 2018 at 9:27 PM Benny Zlotnik  wrote:
> >
> >> Hi,
> >>
> >> A copy operation initiated by rhev got stuck for more than a day
> >> and is consuming plenty of CPU:
> >> vdsm 13024  3117 99 Apr07 ?1-06:58:43 /usr/bin/qemu-img
> >> convert
> >> -p -t none -T none -f qcow2
> >> /rhev/data-center/bb422fac-81c5-4fea-8782-3498bb5c8a59/
> >> 26989331-2c39-4b34-a7ed-d7dd7703646c/images/597e12b6-
> >> 19f5-45bd-868f-767600c7115e/62a5492e-e120-4c25-898e-9f5f5629853e
> >> -O raw /rhev/data-center/mnt/mantis-nfs-lif1.lab.eng.tlv2.redhat.com:
> >> _vol__service/26989331-2c39-4b34-a7ed-d7dd7703646c/images/
> >> 9ece9408-9ca6-48cd-992a-6f590c710672/06d6d3c0-beb8-4b6b-ab00-56523df185da
> >>
> >> The target image appears to have no data yet:
> >> qemu-img info 06d6d3c0-beb8-4b6b-ab00-56523df185da"
> >> image: 06d6d3c0-beb8-4b6b-ab00-56523df185da
> >> file format: raw
> >> virtual size: 120G (128849018880 bytes)
> >> disk size: 0
> >>
> >> strace -p 13024 -tt -T -f shows only:
> >> ...
> >> 21:13:01.309382 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
> >> 0},
> >> NULL, 8) = 0 (Timeout) <0.10>
> >> 21:13:01.309411 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
> >> 0},
> >> NULL, 8) = 0 (Timeout) <0.09>
> >> 21:13:01.309440 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
> >> 0},
> >> NULL, 8) = 0 (Timeout) <0.09>
> >> 21:13:01.309468 ppoll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, {0,
> >> 0},
> >> NULL, 8) = 0 (Timeout) <0.10>
> >>
> >> version: qemu-img-rhev-2.9.0-16.el7_4.13.x86_64
> >>
> >> What could cause this? I'll provide any additional information needed
> >>
> >
> > A backtrace may help, try:
> >
> > gdb -p 13024 -batch -ex "thread apply all bt"
> >
> > Also adding Kevin and qemu-block.
> >
> > Nir
> >

