[PATCH v3] block/gluster: correctly set max_pdiscard

2022-05-20 Thread Fabian Ebner
On 64-bit platforms, assigning SIZE_MAX to the int64_t max_pdiscard
results in a negative value, and the following assertion would trigger
down the line (it's not the same max_pdiscard, but computed from the
other one):
qemu-system-x86_64: ../block/io.c:3166: bdrv_co_pdiscard: Assertion
`max_pdiscard >= bs->bl.request_alignment' failed.

On 32-bit platforms, it's fine to keep using SIZE_MAX.

The assertion in qemu_gluster_co_pdiscard() is checking that the value
of 'bytes' can safely be passed to glfs_discard_async(), which takes a
size_t for the argument in question, so it is kept as is. And since
max_pdiscard is still <= SIZE_MAX, relying on max_pdiscard is still
fine.
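
To make the failure mode concrete, here is a minimal sketch of how a negative driver limit ends up as the zero seen by the assertion. It assumes an alignment of 512 bytes and uses simplified stand-ins for QEMU's MIN_NON_ZERO and QEMU_ALIGN_DOWN macros; effective_max_pdiscard() is a hypothetical helper for illustration, not a function in block/io.c.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the QEMU macros; assumptions for illustration. */
#define MIN_NON_ZERO(a, b) \
    ((a) == 0 ? (b) : ((b) == 0 || (a) < (b)) ? (a) : (b))
#define QEMU_ALIGN_DOWN(n, m) ((n) / (m) * (m))

/*
 * Hypothetical helper mirroring how bdrv_co_pdiscard() derives its local
 * max_pdiscard from the driver-provided limit.
 */
static int64_t effective_max_pdiscard(int64_t bl_max_pdiscard, int64_t align)
{
    assert(align > 0);
    return QEMU_ALIGN_DOWN(MIN_NON_ZERO(bl_max_pdiscard, (int64_t)INT64_MAX),
                           align);
}
```

With a limit of -1 (what SIZE_MAX typically becomes when stored in an int64_t on 64-bit platforms), the helper returns 0, which is below any request alignment. With a limit of 0 or INT64_MAX it returns INT64_MAX rounded down to the alignment.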

Fixes: 0c8022876f ("block: use int64_t instead of int in driver discard handlers")
Cc: qemu-sta...@nongnu.org
Signed-off-by: Fabian Ebner 
---

v2 -> v3:
* Keep assertion in qemu_gluster_co_pdiscard() as is.
* Improve commit message.

v1 -> v2:
* Use an expression that works for both 64-bit and 32-bit platforms.

 block/gluster.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/gluster.c b/block/gluster.c
index 398976bc66..b60213ab80 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -891,7 +891,7 @@ out:
 static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     bs->bl.max_transfer = GLUSTER_MAX_TRANSFER;
-    bs->bl.max_pdiscard = SIZE_MAX;
+    bs->bl.max_pdiscard = MIN(SIZE_MAX, INT64_MAX);
 }
 
 static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
-- 
2.30.2





Re: [PATCH v2] block/gluster: correctly set max_pdiscard

2022-05-13 Thread Fabian Ebner
On 12.05.22 at 18:05, Stefano Garzarella wrote:
> On Thu, May 12, 2022 at 05:44:13PM +0200, Stefano Garzarella wrote:
>> On Thu, May 12, 2022 at 12:30:48PM +0200, Fabian Ebner wrote:
>>> On 64-bit platforms, SIZE_MAX is too large for max_pdiscard, which is
>>
>> The main problem is that SIZE_MAX for an int64_t is a negative value.
>>

Yes, I should've stated that directly.

>>> int64_t, and the following assertion would be triggered:
>>> qemu-system-x86_64: ../block/io.c:3166: bdrv_co_pdiscard: Assertion
>>> `max_pdiscard >= bs->bl.request_alignment' failed.
>>>
>>> Fixes: 0c8022876f ("block: use int64_t instead of int in driver
>>> discard handlers")
>>> Cc: qemu-sta...@nongnu.org
>>> Signed-off-by: Fabian Ebner 
>>> ---
>>> block/gluster.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/block/gluster.c b/block/gluster.c
>>> index 398976bc66..f711bf0bd6 100644
>>> --- a/block/gluster.c
>>> +++ b/block/gluster.c
>>> @@ -891,7 +891,7 @@ out:
>>> static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error
>>> **errp)
>>> {
>>>    bs->bl.max_transfer = GLUSTER_MAX_TRANSFER;
>>> -    bs->bl.max_pdiscard = SIZE_MAX;
>>> +    bs->bl.max_pdiscard = MIN(SIZE_MAX, INT64_MAX);
>>
>> What would be the problem if we use INT64_MAX?
> 
> Okay, I just saw Eric's answer to v1 and I think this is right.
> 

Sorry for not mentioning the changes from v1.

> Please explain it in the commit message and also the initial problem
> that is SIZE_MAX on a 64-bit platform is a negative number for int64_t,
> so the assert fails.
> 

I'll try and improve the commit message for v3.

> Thanks,
> Stefano
> 
>> (I guess the intention of the original patch was to set the maximum
>> value in drivers that do not have a specific maximum).
>>
>> Or we can set to 0, since in block/io.c we have this code:
>>
>>    max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard,
>> INT64_MAX),
>>   align);
>>    assert(max_pdiscard >= bs->bl.request_alignment);
>>
>> Where `max_pdiscard` is set to INT64_MAX (and aligned) if
>> bs->bl.max_pdiscard is 0.
>>
>>> }
>>>
>>> static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
>>> @@ -1304,7 +1304,7 @@ static coroutine_fn int
>>> qemu_gluster_co_pdiscard(BlockDriverState *bs,
>>>    GlusterAIOCB acb;
>>>    BDRVGlusterState *s = bs->opaque;
>>>
>>> -    assert(bytes <= SIZE_MAX); /* rely on max_pdiscard */
>>> +    assert(bytes <= MIN(SIZE_MAX, INT64_MAX)); /* rely on
>>> max_pdiscard */
>>
>> Can we use bs->bl.max_pdiscard directly here?
>>

Now I'm thinking that the assert is actually for checking that the value
can be passed to glfs_discard_async(), which takes a size_t for the
argument in question. So maybe it's best to keep assert(bytes <=
SIZE_MAX) as is?
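
A minimal sketch of that kind of check, assuming the goal is only to ensure that an int64_t byte count survives the conversion to size_t. Both helpers are hypothetical names for illustration and are not part of block/gluster.c; the explicit test for negative values is an addition of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: can 'bytes' be passed as a size_t without loss? */
static bool fits_in_size_t(int64_t bytes)
{
    return bytes >= 0 && (uint64_t)bytes <= SIZE_MAX;
}

/* Hypothetical wrapper showing where such a check would sit before a call
 * that takes a size_t length. */
static size_t to_discard_len(int64_t bytes)
{
    assert(fits_in_size_t(bytes));
    return (size_t)bytes;
}
```

On a 32-bit platform the upper bound is what matters; on a 64-bit platform every non-negative int64_t already fits.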

>> Thanks,
>> Stefano
> 
> 
> 




[PATCH v2] block/gluster: correctly set max_pdiscard

2022-05-12 Thread Fabian Ebner
On 64-bit platforms, SIZE_MAX is too large for max_pdiscard, which is
int64_t, and the following assertion would be triggered:
qemu-system-x86_64: ../block/io.c:3166: bdrv_co_pdiscard: Assertion
`max_pdiscard >= bs->bl.request_alignment' failed.

Fixes: 0c8022876f ("block: use int64_t instead of int in driver discard handlers")
Cc: qemu-sta...@nongnu.org
Signed-off-by: Fabian Ebner 
---
 block/gluster.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index 398976bc66..f711bf0bd6 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -891,7 +891,7 @@ out:
 static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     bs->bl.max_transfer = GLUSTER_MAX_TRANSFER;
-    bs->bl.max_pdiscard = SIZE_MAX;
+    bs->bl.max_pdiscard = MIN(SIZE_MAX, INT64_MAX);
 }
 
 static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
@@ -1304,7 +1304,7 @@ static coroutine_fn int 
qemu_gluster_co_pdiscard(BlockDriverState *bs,
     GlusterAIOCB acb;
     BDRVGlusterState *s = bs->opaque;
 
-    assert(bytes <= SIZE_MAX); /* rely on max_pdiscard */
+    assert(bytes <= MIN(SIZE_MAX, INT64_MAX)); /* rely on max_pdiscard */
 
     acb.size = 0;
     acb.ret = 0;
-- 
2.30.2





Re: [PATCH] block/gluster: correctly set max_pdiscard which is int64_t

2022-05-09 Thread Fabian Ebner
On 06.05.22 at 17:39, Eric Blake wrote:
> On Thu, May 05, 2022 at 10:31:24AM +0200, Fabian Ebner wrote:
>> Previously, max_pdiscard would be zero in the following assertion:
>> qemu-system-x86_64: ../block/io.c:3166: bdrv_co_pdiscard: Assertion
>> `max_pdiscard >= bs->bl.request_alignment' failed.
>>
>> Fixes: 0c8022876f ("block: use int64_t instead of int in driver discard 
>> handlers")
>> Cc: qemu-sta...@nongnu.org
>> Signed-off-by: Fabian Ebner 
>> ---
>>  block/gluster.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/block/gluster.c b/block/gluster.c
>> index 398976bc66..592e71b22a 100644
>> --- a/block/gluster.c
>> +++ b/block/gluster.c
>> @@ -891,7 +891,7 @@ out:
>>  static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
>>  {
>>      bs->bl.max_transfer = GLUSTER_MAX_TRANSFER;
>> -    bs->bl.max_pdiscard = SIZE_MAX;
>> +    bs->bl.max_pdiscard = INT64_MAX;
> 
> SIZE_MAX is unsigned, but can differ between 32- and 64-bit platforms.
> Blindly setting max_pdiscard to a signed 64-bit value seems wrong if
> glfs_discard_async() takes a size_t and you are on a 32-bit platform.
> 

Sorry, I did not consider this.

> Is the real issue that SIZE_MAX on a 64-bit platform is too large,

Yes, there it's too big for max_pdiscard which is int64_t.

> where we want min(SIZE_MAX,INT_MAX) as our real cap?
> 

Why not min(SIZE_MAX,INT64_MAX)? Since the constraint is to fit in both
size_t and int64_t. That would also preserve the current value on 32-bit
platforms.
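
As a rough illustration of why this expression works on both kinds of platform, the sketch below uses a simplified stand-in for QEMU's MIN macro; discard_cap() is a hypothetical helper name. Where size_t is 64 bits wide, the comparison picks INT64_MAX; where size_t is 32 bits wide, it picks SIZE_MAX, so the value fits in both types either way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for QEMU's MIN macro; an assumption for illustration. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper: largest discard size representable in both size_t
 * and int64_t. */
static int64_t discard_cap(void)
{
    return MIN(SIZE_MAX, INT64_MAX);
}
```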

>>  }
>>  
>>  static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
>> @@ -1304,7 +1304,7 @@ static coroutine_fn int 
>> qemu_gluster_co_pdiscard(BlockDriverState *bs,
>>      GlusterAIOCB acb;
>>      BDRVGlusterState *s = bs->opaque;
>>  
>> -    assert(bytes <= SIZE_MAX); /* rely on max_pdiscard */
>> +    assert(bytes <= INT64_MAX); /* rely on max_pdiscard */
>>  
>>      acb.size = 0;
>>      acb.ret = 0;
>> -- 
>> 2.30.2
>>
>>
>>
> 




[PATCH] block/gluster: correctly set max_pdiscard which is int64_t

2022-05-05 Thread Fabian Ebner
Previously, max_pdiscard would be zero in the following assertion:
qemu-system-x86_64: ../block/io.c:3166: bdrv_co_pdiscard: Assertion
`max_pdiscard >= bs->bl.request_alignment' failed.

Fixes: 0c8022876f ("block: use int64_t instead of int in driver discard handlers")
Cc: qemu-sta...@nongnu.org
Signed-off-by: Fabian Ebner 
---
 block/gluster.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index 398976bc66..592e71b22a 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -891,7 +891,7 @@ out:
 static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     bs->bl.max_transfer = GLUSTER_MAX_TRANSFER;
-    bs->bl.max_pdiscard = SIZE_MAX;
+    bs->bl.max_pdiscard = INT64_MAX;
 }
 
 static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
@@ -1304,7 +1304,7 @@ static coroutine_fn int 
qemu_gluster_co_pdiscard(BlockDriverState *bs,
     GlusterAIOCB acb;
     BDRVGlusterState *s = bs->opaque;
 
-    assert(bytes <= SIZE_MAX); /* rely on max_pdiscard */
+    assert(bytes <= INT64_MAX); /* rely on max_pdiscard */
 
     acb.size = 0;
     acb.ret = 0;
-- 
2.30.2





Re: [PATCH 0/4] Make qemu-img dd more flexible

2022-02-15 Thread Fabian Ebner
On 11.02.22 at 17:42, Hanna Reitz wrote:
> On 11.02.22 17:31, Eric Blake wrote:
>> On Thu, Feb 10, 2022 at 02:31:19PM +0100, Fabian Ebner wrote:
>>> Adds support for reading from stdin and writing to stdout (when raw
>>> format is used), as well as overriding the size of the output and
>>> input image/stream.
>>>
>>> Additionally, the options -n for skipping output image creation and -l
>>> for loading a snapshot are made available like for convert.
>> Without looking at the series itself, I want to refer back to earlier
>> times that someone proposed improving 'qemu-img dd':
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg00636.html
>> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg02618.html
>>
>> As well as the observation that when we originally allowed 'qemu-img
>> dd' to be added, the end goal was that if 'qemu-img dd' can't operate
>> as a thin wrapper around 'qemu-img convert', then 'qemu-img convert'
>> needs to be made more powerful first.  Every time we diverge on what
>> the two uses can do, rather than keeping dd as a thin wrapper, we add
>> to our maintenance burden.

I'm wondering why it's not actually implemented as a thin wrapper then?
The fact that it isn't is (part of) the reason why dd was chosen, as
mentioned in the first patch:

"While dd and convert have overlapping use cases, `dd` is a
simple read/write loop while convert is much more
sophisticated and has ways of dealing with holes and blocks
of zeroes.
Since these typically can't be detected in pipes via
SEEK_DATA/HOLE or skipped while writing, dd seems to be the
better choice for implementing stdin/stdout streams."

Adding the same feature to convert seems much more involved.

>>
>> Sadly, there is a lot of technical debt in this area ('qemu-img dd
>> skip= count=' is STILL broken, more than 4 years after I first
>> proposed a potential patch), where no one has spent the necessary time
>> to improve the situation.
> 
> Note that by now (in contrast to 2018), we have FUSE disk exports, and I
> even have a script that uses them to let you run dd on any image:
> 
> https://gitlab.com/hreitz/qemu-scripts/-/blob/main/qemu-dd.py
> 
> Which is nice, because it gives you feature parity with dd, because it
> simply runs dd.

Thank you for the suggestion. It's definitely worth considering,
although it does add a bit of complexity and we currently don't have
FUSE support enabled in our builds.

> 
> (The main problem with the script is that it lives in that personal repo
> of mine and so nobody but me knows about it.  Suggestions to improve
> that situation are more than welcome.)
> 
> Now, the qemu storage daemon does not support loading qcow2 snapshots
> (as far as I’m aware), which is proposed in patch 4 of this series.  But
> I think that just means that it would be nice if the QSD could support
> that.

I suppose adding a snapshot-load QMP command would be a natural way to
add it?

> 
> Hanna
> 
> 




[PATCH 3/4] qemu-img: dd: add -n option (skip target volume creation)

2022-02-10 Thread Fabian Ebner
From: Alexandre Derumier 

Same rationale as in
b2e10493c7 ("add qemu-img convert -n option (skip target volume creation)")

Originally-by: Alexandre Derumier 
Signed-off-by: Thomas Lamprecht 
[FE: avoid wrong colon in getopt's optstring
 add documentation + commit message]
Signed-off-by: Fabian Ebner 
---
 docs/tools/qemu-img.rst |  6 +-
 qemu-img-cmds.hx        |  4 ++--
 qemu-img.c              | 23 ++-
 3 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index 43328fe108..9b022d9363 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -210,6 +210,10 @@ Parameters to dd subcommand:
 
 .. program:: qemu-img-dd
 
+.. option:: -n
+
+  Skip the creation of the target volume
+
 .. option:: bs=BLOCK_SIZE
 
   Defines the block size
@@ -496,7 +500,7 @@ Command description:
   it doesn't need to be specified separately in this case.
 
 
-.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [bs=BLOCK_SIZE] 
[count=BLOCKS] [skip=BLOCKS] [isize=INPUT_SIZE] [osize=OUTPUT_SIZE] [if=INPUT] 
[of=OUTPUT]
+.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] 
[bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] [isize=INPUT_SIZE] 
[osize=OUTPUT_SIZE] [if=INPUT] [of=OUTPUT]
 
   dd copies from *INPUT* file (default: STDIN) to *OUTPUT* file (default:
   STDOUT) converting it from *FMT* format to *OUTPUT_FMT* format.
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 50993e6c47..97e750623f 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -58,9 +58,9 @@ SRST
 ERST
 
 DEF("dd", img_dd,
-"dd [--image-opts] [-U] [-f fmt] [-O output_fmt] [bs=block_size] 
[count=blocks] [skip=blocks] [isize=input_size] [osize=output_size] [if=input] 
[of=output]")
+"dd [--image-opts] [-U] [-f fmt] [-O output_fmt] [-n] [bs=block_size] 
[count=blocks] [skip=blocks] [isize=input_size] [osize=output_size] [if=input] 
[of=output]")
 SRST
-.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [bs=BLOCK_SIZE] 
[count=BLOCKS] [skip=BLOCKS] [isize=INPUT_SIZE] [osize=OUTPUT_SIZE] [if=INPUT] 
[of=OUTPUT]
+.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] 
[bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] [isize=INPUT_SIZE] 
[osize=OUTPUT_SIZE] [if=INPUT] [of=OUTPUT]
 ERST
 
 DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index 630928773d..89bf6fd087 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4944,7 +4944,7 @@ static int img_dd(int argc, char **argv)
     const char *fmt = NULL;
     int64_t size = 0, readsize = 0;
     int64_t block_count = 0, out_pos, in_pos;
-    bool force_share = false;
+    bool force_share = false, skip_create = false;
     struct DdInfo dd = {
         .flags = 0,
         .count = 0,
@@ -4982,7 +4982,7 @@ static int img_dd(int argc, char **argv)
         { 0, 0, 0, 0 }
     };
 
-    while ((c = getopt_long(argc, argv, ":hf:O:U", long_options, NULL))) {
+    while ((c = getopt_long(argc, argv, ":hf:O:Un", long_options, NULL))) {
         if (c == EOF) {
             break;
         }
@@ -5002,6 +5002,9 @@ static int img_dd(int argc, char **argv)
         case 'h':
             help();
             break;
+        case 'n':
+            skip_create = true;
+            break;
         case 'U':
             force_share = true;
             break;
@@ -5144,13 +5147,15 @@ static int img_dd(int argc, char **argv)
                             size - in.bsz * in.offset, &error_abort);
     }
 
-    ret = bdrv_create(drv, out.filename, opts, &local_err);
-    if (ret < 0) {
-        error_reportf_err(local_err,
-                          "%s: error while creating output image: ",
-                          out.filename);
-        ret = -1;
-        goto out;
+    if (!skip_create) {
+        ret = bdrv_create(drv, out.filename, opts, &local_err);
+        if (ret < 0) {
+            error_reportf_err(local_err,
+                              "%s: error while creating output image: ",
+                              out.filename);
+            ret = -1;
+            goto out;
+        }
     }
 
     /* TODO, we can't honour --image-opts for the target,
-- 
2.30.2





[PATCH 1/4] qemu-img: dd: add osize and read from/to stdin/stdout

2022-02-10 Thread Fabian Ebner
From: Wolfgang Bumiller 

Neither convert nor dd were previously able to write to or
read from a pipe. Particularly serializing an image file
into a raw stream or vice versa can be useful, but using
`qemu-img convert -f qcow2 -O raw foo.qcow2 /dev/stdout` in
a pipe will fail trying to seek.

While dd and convert have overlapping use cases, `dd` is a
simple read/write loop while convert is much more
sophisticated and has ways of dealing with holes and blocks
of zeroes.
Since these typically can't be detected in pipes via
SEEK_DATA/HOLE or skipped while writing, dd seems to be the
better choice for implementing stdin/stdout streams.

This patch causes "if" and "of" to default to stdin and
stdout respectively, allowing only the "raw" format to be
used in these cases.
Since the input can now be a pipe we have no way of
detecting the size of the output image to create. Since we
also want to support images with a size not matching the
dd command's "bs" parameter (which, together with "count"
could be used to calculate the desired size, and is already
used to limit it), the "osize" option is added to explicitly
override the output file's size.
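
The "simple read/write loop" idea with an explicit byte limit can be sketched as follows. This is a standalone illustration using plain POSIX read() and write() on file descriptors, not the code in qemu-img.c; copy_stream() and its parameters are hypothetical names, with 'limit' playing the role that osize plays in the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy from in_fd to out_fd in chunks of at most
 * 'bsz' bytes, stopping after 'limit' bytes or at end of input.
 * Returns the number of bytes copied, or -1 on error.
 */
static int64_t copy_stream(int in_fd, int out_fd, size_t bsz, int64_t limit)
{
    char buf[4096];
    int64_t total = 0;

    assert(bsz > 0 && bsz <= sizeof(buf));

    while (total < limit) {
        size_t want = bsz;
        if ((int64_t)want > limit - total) {
            want = (size_t)(limit - total);
        }

        ssize_t got = read(in_fd, buf, want);
        if (got < 0) {
            if (errno == EINTR) {
                continue;
            }
            return -1;
        }
        if (got == 0) {
            break; /* end of input */
        }

        /* write() may be partial, so loop until the whole chunk is out. */
        ssize_t done = 0;
        while (done < got) {
            ssize_t put = write(out_fd, buf + done, (size_t)(got - done));
            if (put < 0) {
                if (errno == EINTR) {
                    continue;
                }
                return -1;
            }
            done += put;
        }
        total += got;
    }
    return total;
}
```

Because the loop never seeks, it works the same on pipes and regular files, which is the property the commit message relies on.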

Signed-off-by: Wolfgang Bumiller 
Signed-off-by: Thomas Lamprecht 
[FE: add documentation
 avoid error when osize is larger than input image's size
 fail if both count and osize are specified
 fail if skip is specified when reading from stdin]
Signed-off-by: Fabian Ebner 
---
 docs/tools/qemu-img.rst |  17 +++-
 qemu-img-cmds.hx        |   4 +-
 qemu-img.c              | 201 ++--
 3 files changed, 146 insertions(+), 76 deletions(-)

diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index 8885ea11cf..775eaf3097 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -220,16 +220,20 @@ Parameters to dd subcommand:
 
 .. option:: if=INPUT
 
-  Sets the input file
+  Sets the input file (defaults to STDIN)
 
 .. option:: of=OUTPUT
 
-  Sets the output file
+  Sets the output file (defaults to STDOUT)
 
 .. option:: skip=BLOCKS
 
   Sets the number of input blocks to skip
 
+.. option:: osize=OUTPUT_SIZE
+
+  Sets the output image's size
+
 Parameters to snapshot subcommand:
 
 .. program:: qemu-img-snapshot
@@ -488,10 +492,10 @@ Command description:
   it doesn't need to be specified separately in this case.
 
 
-.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [bs=BLOCK_SIZE] 
[count=BLOCKS] [skip=BLOCKS] if=INPUT of=OUTPUT
+.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [bs=BLOCK_SIZE] 
[count=BLOCKS] [skip=BLOCKS] [osize=OUTPUT_SIZE] [if=INPUT] [of=OUTPUT]
 
-  dd copies from *INPUT* file to *OUTPUT* file converting it from
-  *FMT* format to *OUTPUT_FMT* format.
+  dd copies from *INPUT* file (default: STDIN) to *OUTPUT* file (default:
+  STDOUT) converting it from *FMT* format to *OUTPUT_FMT* format.
 
   The data is by default read and written using blocks of 512 bytes but can be
   modified by specifying *BLOCK_SIZE*. If count=\ *BLOCKS* is specified
@@ -499,6 +503,9 @@ Command description:
 
   The size syntax is similar to :manpage:`dd(1)`'s size syntax.
 
+  The output image will be created with size *OUTPUT_SIZE* and at most this 
many
+  bytes will be copied.
+
 .. option:: info [--object OBJECTDEF] [--image-opts] [-f FMT] [--output=OFMT] 
[--backing-chain] [-U] FILENAME
 
   Give information about the disk image *FILENAME*. Use it in
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 1b1dab5b17..e4935365c9 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -58,9 +58,9 @@ SRST
 ERST
 
 DEF("dd", img_dd,
-"dd [--image-opts] [-U] [-f fmt] [-O output_fmt] [bs=block_size] 
[count=blocks] [skip=blocks] if=input of=output")
+"dd [--image-opts] [-U] [-f fmt] [-O output_fmt] [bs=block_size] 
[count=blocks] [skip=blocks] [osize=output_size] [if=input] [of=output]")
 SRST
-.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [bs=BLOCK_SIZE] 
[count=BLOCKS] [skip=BLOCKS] if=INPUT of=OUTPUT
+.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [bs=BLOCK_SIZE] 
[count=BLOCKS] [skip=BLOCKS] [osize=OUTPUT_SIZE] [if=INPUT] [of=OUTPUT]
 ERST
 
 DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index 6fe2466032..ea488fd190 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4819,10 +4819,12 @@ static int img_bitmap(int argc, char **argv)
 #define C_IF      04
 #define C_OF      010
 #define C_SKIP    020
+#define C_OSIZE   040
 
 struct DdInfo {
     unsigned int flags;
     int64_t count;
+    int64_t osize;
 };
 
 struct DdIo {
@@ -4898,6 +4900,19 @@ static int img_dd_skip(const char *arg,
     return 0;
 }
 
+static int img_dd_osize(const char *arg,
+                        struct DdIo *in, struct DdIo *out,
+                        struct DdInfo *dd)
+{
+    dd->osize = cvtnum("osize", arg);
+
+    if (dd->osize < 0) {
+ret

[PATCH 0/4] Make qemu-img dd more flexible

2022-02-10 Thread Fabian Ebner
Adds support for reading from stdin and writing to stdout (when raw
format is used), as well as overriding the size of the output and
input image/stream.

Additionally, the options -n for skipping output image creation and -l
for loading a snapshot are made available like for convert.

Alexandre Derumier (1):
  qemu-img: dd: add -n option (skip target volume creation)

Fabian Ebner (1):
  qemu-img: dd: add -l option for loading a snapshot

Wolfgang Bumiller (2):
  qemu-img: dd: add osize and read from/to stdin/stdout
  qemu-img: dd: add isize parameter

 docs/tools/qemu-img.rst |  28 -
 qemu-img-cmds.hx        |   4 +-
 qemu-img.c              | 261 +---
 3 files changed, 215 insertions(+), 78 deletions(-)

-- 
2.30.2





[PATCH 2/4] qemu-img: dd: add isize parameter

2022-02-10 Thread Fabian Ebner
From: Wolfgang Bumiller 

This is useful for writing small images from stdin to bigger ones.

It makes it possible to distinguish between an actually unexpected
and an expected end of input.
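
One possible reading of the isize rules, as a minimal sketch. The flag values are the ones used by the dd option parser in this series; effective_input_size() and early_eof_is_error() are hypothetical helpers, 'image_len' stands in for the result of blk_getlength(), and the end-of-input rule is taken from the documentation text rather than from the actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as used by the dd option parser in this series. */
#define C_IF    04
#define C_ISIZE 0100

/*
 * Hypothetical helper: which size governs the copy?
 * A positive isize wins; otherwise the input image's length is used.
 * Returns 0 when neither source provides a size (sketch-only convention).
 */
static int64_t effective_input_size(unsigned flags, int64_t isize,
                                    int64_t image_len)
{
    if ((flags & C_ISIZE) && isize > 0) {
        return isize;
    }
    if (flags & C_IF) {
        return image_len;
    }
    return 0;
}

/*
 * Hypothetical helper: per the documentation, isize=0 while reading from
 * stdin means a premature end of input is not treated as an error.
 */
static bool early_eof_is_error(unsigned flags, int64_t isize)
{
    bool from_stdin = !(flags & C_IF);
    bool tolerate = from_stdin && (flags & C_ISIZE) && isize == 0;
    return !tolerate;
}
```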

Signed-off-by: Wolfgang Bumiller 
Signed-off-by: Thomas Lamprecht 
[FE: override size earlier
 use flag to detect parameter
 add documentation]
Signed-off-by: Fabian Ebner 
---
 docs/tools/qemu-img.rst | 10 --
 qemu-img-cmds.hx        |  4 ++--
 qemu-img.c              | 24 +++-
 3 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index 775eaf3097..43328fe108 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -230,6 +230,10 @@ Parameters to dd subcommand:
 
   Sets the number of input blocks to skip
 
+.. option:: isize=INPUT_SIZE
+
+  Treat the input image or stream as if it had this size
+
 .. option:: osize=OUTPUT_SIZE
 
   Sets the output image's size
@@ -492,7 +496,7 @@ Command description:
   it doesn't need to be specified separately in this case.
 
 
-.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [bs=BLOCK_SIZE] 
[count=BLOCKS] [skip=BLOCKS] [osize=OUTPUT_SIZE] [if=INPUT] [of=OUTPUT]
+.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [bs=BLOCK_SIZE] 
[count=BLOCKS] [skip=BLOCKS] [isize=INPUT_SIZE] [osize=OUTPUT_SIZE] [if=INPUT] 
[of=OUTPUT]
 
   dd copies from *INPUT* file (default: STDIN) to *OUTPUT* file (default:
   STDOUT) converting it from *FMT* format to *OUTPUT_FMT* format.
@@ -504,7 +508,9 @@ Command description:
   The size syntax is similar to :manpage:`dd(1)`'s size syntax.
 
   The output image will be created with size *OUTPUT_SIZE* and at most this 
many
-  bytes will be copied.
+  bytes will be copied. When *INPUT_SIZE* is positive, it overrides the input
+  image's size for the copy operation. When *INPUT_SIZE* is zero and reading
+  from STDIN, do not treat premature end of the input stream as an error.
 
 .. option:: info [--object OBJECTDEF] [--image-opts] [-f FMT] [--output=OFMT] 
[--backing-chain] [-U] FILENAME
 
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index e4935365c9..50993e6c47 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -58,9 +58,9 @@ SRST
 ERST
 
 DEF("dd", img_dd,
-"dd [--image-opts] [-U] [-f fmt] [-O output_fmt] [bs=block_size] 
[count=blocks] [skip=blocks] [osize=output_size] [if=input] [of=output]")
+"dd [--image-opts] [-U] [-f fmt] [-O output_fmt] [bs=block_size] 
[count=blocks] [skip=blocks] [isize=input_size] [osize=output_size] [if=input] 
[of=output]")
 SRST
-.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [bs=BLOCK_SIZE] 
[count=BLOCKS] [skip=BLOCKS] [osize=OUTPUT_SIZE] [if=INPUT] [of=OUTPUT]
+.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [bs=BLOCK_SIZE] 
[count=BLOCKS] [skip=BLOCKS] [isize=INPUT_SIZE] [osize=OUTPUT_SIZE] [if=INPUT] 
[of=OUTPUT]
 ERST
 
 DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index ea488fd190..630928773d 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4820,11 +4820,13 @@ static int img_bitmap(int argc, char **argv)
 #define C_OF      010
 #define C_SKIP    020
 #define C_OSIZE   040
+#define C_ISIZE   0100
 
 struct DdInfo {
     unsigned int flags;
     int64_t count;
     int64_t osize;
+    int64_t isize;
 };
 
 struct DdIo {
@@ -4913,6 +4915,19 @@ static int img_dd_osize(const char *arg,
     return 0;
 }
 
+static int img_dd_isize(const char *arg,
+                        struct DdIo *in, struct DdIo *out,
+                        struct DdInfo *dd)
+{
+    dd->isize = cvtnum("isize", arg);
+
+    if (dd->isize < 0) {
+        return 1;
+    }
+
+    return 0;
+}
+
 static int img_dd(int argc, char **argv)
 {
     int ret = 0;
@@ -4934,6 +4949,7 @@ static int img_dd(int argc, char **argv)
         .flags = 0,
         .count = 0,
         .osize = 0,
+        .isize = 0,
     };
     struct DdIo in = {
         .bsz = 512, /* Block size is by default 512 bytes */
@@ -4955,6 +4971,7 @@ static int img_dd(int argc, char **argv)
         { "of", img_dd_of, C_OF },
         { "skip", img_dd_skip, C_SKIP },
         { "osize", img_dd_osize, C_OSIZE },
+        { "isize", img_dd_isize, C_ISIZE },
         { NULL, NULL, 0 }
     };
     const struct option long_options[] = {
@@ -5061,7 +5078,9 @@ static int img_dd(int argc, char **argv)
         }
     }
 
-    if (dd.flags & C_IF) {
+    if (dd.flags & C_ISIZE && dd.isize > 0) {
+        size = dd.isize;
+    } else if (dd.flags & C_IF) {
         size = blk_getlength(blk1);
         if (size < 0) {
             error_report("Failed to get size for '%s'", in.filename);
@@ -5174,6 +5193,9 @@ static int img_dd(int argc, char **argv)
 } else {
 in_ret = read(STDIN_FILENO, in.buf, in_bsz);
 if (in_ret == 0) {
+if (dd.fla

[PATCH 4/4] qemu-img: dd: add -l option for loading a snapshot

2022-02-10 Thread Fabian Ebner
Signed-off-by: Fabian Ebner 
---
 docs/tools/qemu-img.rst |  7 ---
 qemu-img-cmds.hx        |  4 ++--
 qemu-img.c              | 33 +++--
 3 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index 9b022d9363..b2333d7b04 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -500,10 +500,11 @@ Command description:
   it doesn't need to be specified separately in this case.
 
 
-.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] 
[bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] [isize=INPUT_SIZE] 
[osize=OUTPUT_SIZE] [if=INPUT] [of=OUTPUT]
+.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] [-l 
SNAPSHOT_PARAM] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] [isize=INPUT_SIZE] 
[osize=OUTPUT_SIZE] [if=INPUT] [of=OUTPUT]
 
-  dd copies from *INPUT* file (default: STDIN) to *OUTPUT* file (default:
-  STDOUT) converting it from *FMT* format to *OUTPUT_FMT* format.
+  dd copies from *INPUT* file (default: STDIN) or snapshot *SNAPSHOT_PARAM* to
+  *OUTPUT* file (default: STDOUT) converting it from *FMT* format to
+  *OUTPUT_FMT* format.
 
   The data is by default read and written using blocks of 512 bytes but can be
   modified by specifying *BLOCK_SIZE*. If count=\ *BLOCKS* is specified
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 97e750623f..2f527306b0 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -58,9 +58,9 @@ SRST
 ERST
 
 DEF("dd", img_dd,
-"dd [--image-opts] [-U] [-f fmt] [-O output_fmt] [-n] [bs=block_size] 
[count=blocks] [skip=blocks] [isize=input_size] [osize=output_size] [if=input] 
[of=output]")
+"dd [--image-opts] [-U] [-f fmt] [-O output_fmt] [-n] [-l snapshot_param] 
[bs=block_size] [count=blocks] [skip=blocks] [isize=input_size] 
[osize=output_size] [if=input] [of=output]")
 SRST
-.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] 
[bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] [isize=INPUT_SIZE] 
[osize=OUTPUT_SIZE] [if=INPUT] [of=OUTPUT]
+.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] [-l 
SNAPSHOT_PARAM] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] [isize=INPUT_SIZE] 
[osize=OUTPUT_SIZE] [if=INPUT] [of=OUTPUT]
 ERST
 
 DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index 89bf6fd087..28b6430800 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4936,6 +4936,7 @@ static int img_dd(int argc, char **argv)
     BlockDriver *drv = NULL, *proto_drv = NULL;
     BlockBackend *blk1 = NULL, *blk2 = NULL;
     QemuOpts *opts = NULL;
+    QemuOpts *sn_opts = NULL;
     QemuOptsList *create_opts = NULL;
     Error *local_err = NULL;
     bool image_opts = false;
@@ -4945,6 +4946,7 @@ static int img_dd(int argc, char **argv)
     int64_t size = 0, readsize = 0;
     int64_t block_count = 0, out_pos, in_pos;
     bool force_share = false, skip_create = false;
+    const char *snapshot_name = NULL;
     struct DdInfo dd = {
         .flags = 0,
         .count = 0,
@@ -4982,7 +4984,7 @@ static int img_dd(int argc, char **argv)
         { 0, 0, 0, 0 }
     };
 
-    while ((c = getopt_long(argc, argv, ":hf:O:Un", long_options, NULL))) {
+    while ((c = getopt_long(argc, argv, ":hf:O:l:Un", long_options, NULL))) {
         if (c == EOF) {
             break;
         }
@@ -5005,6 +5007,19 @@ static int img_dd(int argc, char **argv)
         case 'n':
             skip_create = true;
             break;
+        case 'l':
+            if (strstart(optarg, SNAPSHOT_OPT_BASE, NULL)) {
+                sn_opts = qemu_opts_parse_noisily(&internal_snapshot_opts,
+                                                  optarg, false);
+                if (!sn_opts) {
+                    error_report("Failed in parsing snapshot param '%s'",
+                                 optarg);
+                    goto out;
+                }
+            } else {
+                snapshot_name = optarg;
+            }
+            break;
         case 'U':
             force_share = true;
             break;
@@ -5074,11 +5089,24 @@ static int img_dd(int argc, char **argv)
     if (dd.flags & C_IF) {
         blk1 = img_open(image_opts, in.filename, fmt, 0, false, false,
                         force_share);
-
         if (!blk1) {
             ret = -1;
             goto out;
         }
+        if (sn_opts) {
+            bdrv_snapshot_load_tmp(blk_bs(blk1),
+                                   qemu_opt_get(sn_opts, SNAPSHOT_OPT_ID),
+                                   qemu_opt_get(sn_opts, SNAPSHOT_OPT_NAME),
+                                   &local_err);
+        } else if (snapshot_name != NULL) {
+            bdrv_snapshot_load_tmp_by_id_or_name(blk_bs(blk1), snapshot_name,
+                                                 &local_err);
+        }
+        if (local_err) {
+            error_reportf_err(local_err, "Failed to load snapshot: ");
+            ret = -1;
+ 

Re: [PULL 18/20] block/nbd: drop connection_co

2022-02-03 Thread Fabian Ebner
On 02.02.22 at 15:21, Hanna Reitz wrote:
> On 02.02.22 14:53, Eric Blake wrote:
>> On Wed, Feb 02, 2022 at 12:49:36PM +0100, Fabian Ebner wrote:
>>> On 27.09.21 at 23:55, Eric Blake wrote:
>>>> From: Vladimir Sementsov-Ogievskiy 
>>>>
>>>> OK, that's a big rewrite of the logic.
>>>>
>>>> Pre-patch we have an always running coroutine - connection_co. It does
>>>> reply receiving and reconnecting. And it leads to a lot of difficult
>>>> and unobvious code around drained sections and context switch. We also
>>>> abuse bs->in_flight counter which is increased for connection_co and
>>>> temporary decreased in points where we want to allow drained section to
>>>> begin. One of these place is in another file: in nbd_read_eof() in
>>>> nbd/client.c.
>>>>
>>>> We also cancel reconnect and requests waiting for reconnect on drained
>>>> begin which is not correct. And this patch fixes that.
>>>>
>>>> Let's finally drop this always running coroutine and go another way:
>>>> do both reconnect and receiving in request coroutines.
>>>>
>>> Hi,
>>>
>>> while updating our stack to 6.2, one of our live-migration tests stopped
>>> working (backtrace is below) and bisecting led me to this patch.
>>>
>>> The VM has a single qcow2 disk (converting to raw doesn't make a
>>> difference) and the issue only appears when using iothread (for both
>>> virtio-scsi-pci and virtio-block-pci).
>>>
>>> Reverting 1af7737871fb3b66036f5e520acb0a98fc2605f7 (which lives on top)
>>> and 4ddb5d2fde6f22b2cf65f314107e890a7ca14fcf (the commit corresponding
>>> to this patch) in v6.2.0 makes the migration work again.
>>>
>>> Backtrace:
>>>
>>> Thread 1 (Thread 0x7f9d93458fc0 (LWP 56711) "kvm"):
>>> #0  __GI_raise (sig=sig@entry=6) at
>>> ../sysdeps/unix/sysv/linux/raise.c:50
>>> #1  0x7f9d9d6bc537 in __GI_abort () at abort.c:79
>>> #2  0x7f9d9d6bc40f in __assert_fail_base (fmt=0x7f9d9d825128
>>> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5579153763f8
>>> "qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)",
>>> file=0x5579153764f9 "../io/channel.c", line=483, function=<optimized
>>> out>) at assert.c:92
>> Given that this assertion is about which aio context is set, I wonder
>> if the conversation at
>> https://lists.gnu.org/archive/html/qemu-devel/2022-02/msg00096.html is
>> relevant; if so, Vladimir may already be working on the patch.
> 
> It should be exactly that patch:
> 
> https://lists.gnu.org/archive/html/qemu-devel/2022-01/msg06222.html
> 
> (From the discussion it appears that for v1 I need to ensure the
> reconnection timer is deleted immediately once reconnecting succeeds,
> and then that should be good to move out of the RFC state.)

Thanks for the quick responses and happy to hear you're already working
on it! With the RFC, the issue is gone for me.

> 
> Basically, I expect qemu to crash every time that you try to use an NBD
> block device in an I/O thread (unless you don’t do any I/O), for example
> this is the simplest reproducer I know of:
> 
> $ qemu-nbd --fork -k /tmp/nbd.sock -f raw null-co://
> 
> $ qemu-system-x86_64 \
>     -object iothread,id=iothr0 \
>     -device virtio-scsi,id=vscsi,iothread=iothr0 \
>     -blockdev '{
>     "driver": "nbd",
>     "node-name": "nbd",
>     "server": {
>     "type": "unix",
>     "path": "/tmp/nbd.sock"
>     } }' \
>     -device scsi-hd,bus=vscsi.0,drive=nbd
> qemu-system-x86_64: ../qemu-6.2.0/io/channel.c:483:
> qio_channel_restart_read: Assertion `qemu_get_current_aio_context() ==
> qemu_coroutine_get_aio_context(co)' failed.
> qemu-nbd: Disconnect client, due to: Unable to read from socket:
> Connection reset by peer
> [1]    108747 abort (core dumped)  qemu-system-x86_64 -object
> iothread,id=iothr0 -device  -blockdev  -device
> 
> 

Interestingly, the reproducer didn't crash the very first time I tried
it. I did get the same error after ^C-ing though, and on subsequent
tries it mostly crashed immediately, but very occasionally it didn't.
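
For readers following the backtrace: the failed assertion says that the
read handler which re-enters a coroutine must run in the AioContext the
coroutine belongs to. A toy model of that check is sketched below. The
names ToyContext, ToyCoroutine and toy_can_restart_in() are made-up
stand-ins for illustration and are not QEMU API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical); QEMU's real AioContext and Coroutine
 * types carry far more state than this. */
typedef struct ToyContext {
    const char *name;
} ToyContext;

typedef struct ToyCoroutine {
    ToyContext *ctx; /* context the coroutine is bound to */
} ToyCoroutine;

/*
 * Mirrors the shape of the failed check: a coroutine may only be
 * re-entered from the context it is bound to.
 */
static bool toy_can_restart_in(const ToyContext *current,
                               const ToyCoroutine *co)
{
    assert(current != NULL && co != NULL);
    return current == co->ctx;
}
```

With a coroutine bound to an iothread's context and the handler
dispatched from the main loop, which is what the backtrace appears to
show, the check returns false. That corresponds to the abort reported
above.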




Re: [PULL 18/20] block/nbd: drop connection_co

2022-02-02 Thread Fabian Ebner
Am 27.09.21 um 23:55 schrieb Eric Blake:
> From: Vladimir Sementsov-Ogievskiy 
> 
> OK, that's a big rewrite of the logic.
> 
> Pre-patch we have an always running coroutine - connection_co. It does
> reply receiving and reconnecting. And it leads to a lot of difficult
> and unobvious code around drained sections and context switch. We also
> abuse bs->in_flight counter which is increased for connection_co and
> temporarily decreased at points where we want to allow a drained section
> to begin. One of these places is in another file: in nbd_read_eof() in
> nbd/client.c.
> 
> We also cancel reconnect and requests waiting for reconnect on drained
> begin which is not correct. And this patch fixes that.
> 
> Let's finally drop this always running coroutine and go another way:
> do both reconnect and receiving in request coroutines.
>

Hi,

while updating our stack to 6.2, one of our live-migration tests stopped
working (backtrace is below) and bisecting led me to this patch.

The VM has a single qcow2 disk (converting to raw doesn't make a
difference) and the issue only appears when using iothread (for both
virtio-scsi-pci and virtio-block-pci).

Reverting 1af7737871fb3b66036f5e520acb0a98fc2605f7 (which lives on top)
and 4ddb5d2fde6f22b2cf65f314107e890a7ca14fcf (the commit corresponding
to this patch) in v6.2.0 makes the migration work again.

Backtrace:

Thread 1 (Thread 0x7f9d93458fc0 (LWP 56711) "kvm"):
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x7f9d9d6bc537 in __GI_abort () at abort.c:79
#2  0x7f9d9d6bc40f in __assert_fail_base (fmt=0x7f9d9d825128
"%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5579153763f8
"qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)",
file=0x5579153764f9 "../io/channel.c", line=483, function=<optimized out>) at assert.c:92
#3  0x7f9d9d6cb662 in __GI___assert_fail
(assertion=assertion@entry=0x5579153763f8
"qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)",
file=file@entry=0x5579153764f9 "../io/channel.c", line=line@entry=483,
function=function@entry=0x557915376570 <__PRETTY_FUNCTION__.2>
"qio_channel_restart_read") at assert.c:101
#4  0x5579150c351c in qio_channel_restart_read (opaque=<optimized out>) at ../io/channel.c:483
#5  qio_channel_restart_read (opaque=<optimized out>) at ../io/channel.c:477
#6  0x55791520182a in aio_dispatch_handler
(ctx=ctx@entry=0x557916908c60, node=0x7f9d8400f800) at
../util/aio-posix.c:329
#7  0x557915201f62 in aio_dispatch_handlers (ctx=0x557916908c60) at
../util/aio-posix.c:372
#8  aio_dispatch (ctx=0x557916908c60) at ../util/aio-posix.c:382
#9  0x5579151ea74e in aio_ctx_dispatch (source=<optimized out>,
callback=<optimized out>, user_data=<optimized out>) at ../util/async.c:311
#10 0x7f9d9e647e6b in g_main_context_dispatch () from
/lib/x86_64-linux-gnu/libglib-2.0.so.0
#11 0x557915203030 in glib_pollfds_poll () at ../util/main-loop.c:232
#12 os_host_main_loop_wait (timeout=992816) at ../util/main-loop.c:255
#13 main_loop_wait (nonblocking=nonblocking@entry=0) at
../util/main-loop.c:531
#14 0x5579150539c1 in qemu_main_loop () at ../softmmu/runstate.c:726
#15 0x557914ce8ebe in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:50





[PATCH v2] block/io_uring: resubmit when result is -EAGAIN

2021-07-29 Thread Fabian Ebner
Linux SCSI can throw spurious -EAGAIN in some corner cases in its
completion path, which will end up being the result in the completed
io_uring request.

Resubmitting such requests should allow block jobs to complete, even
if such spurious errors are encountered.

Co-authored-by: Stefan Hajnoczi 
Reviewed-by: Stefano Garzarella 
Signed-off-by: Fabian Ebner 
---

Changes from v1:
* Focus on what's relevant for the patch itself in the commit
  message.
* Add Stefan's comment.
* Add Stefano's R-b tag (I hope that's fine, since there was no
  change code-wise).

 block/io_uring.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/block/io_uring.c b/block/io_uring.c
index 00a3ee9fb8..dfa475cc87 100644
--- a/block/io_uring.c
+++ b/block/io_uring.c
@@ -165,7 +165,21 @@ static void luring_process_completions(LuringState *s)
 total_bytes = ret + luringcb->total_read;
 
 if (ret < 0) {
-if (ret == -EINTR) {
+/*
+ * Only writev/readv/fsync requests on regular files or host block
+ * devices are submitted. Therefore -EAGAIN is not expected but it's
+ * known to happen sometimes with Linux SCSI. Submit again and hope
+ * the request completes successfully.
+ *
+ * For more information, see:
+ * https://lore.kernel.org/io-uring/20210727165811.284510-3-ax...@kernel.dk/T/#u
+ *
+ * If the code is changed to submit other types of requests in the
+ * future, then this workaround may need to be extended to deal with
+ * genuine -EAGAIN results that should not be resubmitted
+ * immediately.
+ */
+if (ret == -EINTR || ret == -EAGAIN) {
 luring_resubmit(s, luringcb);
 continue;
 }
-- 
2.30.2
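
The resubmission rule from the patch can be tried out in isolation with
a small sketch like the one below. It is not the code from
block/io_uring.c: completion_needs_resubmit() and
complete_with_resubmit() are hypothetical helpers, and the array of
results only simulates what successive completions of one request
might return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper: results treated as transient, mirroring the
 * condition "ret == -EINTR || ret == -EAGAIN" from the patch. */
static bool completion_needs_resubmit(int ret)
{
    return ret == -EINTR || ret == -EAGAIN;
}

/*
 * Toy completion loop (an assumption for illustration). results[] holds
 * the simulated outcome of each submission of a single request. The
 * request is resubmitted while the outcome is transient and another
 * simulated outcome is available. Returns the final result and stores
 * the number of resubmissions in *n_resubmits.
 */
static int complete_with_resubmit(const int *results, int n_results,
                                  int *n_resubmits)
{
    assert(n_results > 0);
    *n_resubmits = 0;

    for (int i = 0; i < n_results; i++) {
        if (completion_needs_resubmit(results[i]) && i + 1 < n_results) {
            (*n_resubmits)++; /* stands in for luring_resubmit() */
            continue;
        }
        return results[i];
    }
    return results[n_results - 1]; /* not reached, keeps compilers quiet */
}
```

Feeding it the sequence -EAGAIN, -EINTR, 4096 yields 4096 after two
resubmissions, while a single -EIO is passed through unchanged. As the
comment in the patch notes, retrying immediately is only reasonable
while -EAGAIN is known to be spurious for the request types being
submitted.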