Re: [Qemu-block] [Qemu-devel] [PATCH] qemu-doc: Do not hard-code the name of the QEMU binary

2019-08-28 Thread Thomas Huth
On 28/08/2019 21.18, John Snow wrote:
> 
> 
> On 8/28/19 5:34 AM, Thomas Huth wrote:
>> In our documentation, we use a mix of "$QEMU", "qemu-system-i386" and
>> "qemu-system-x86_64" when we give examples to the users how to run
>> QEMU. Some more consistency would be good here. Also some distributions
>> use different names for the QEMU binary (e.g. "qemu-kvm" in RHEL), so
>> providing more flexibility here would also be good. Thus let's define
>> some variables for the names of the QEMU command and use those in the
>> documentation instead: @value{qemu_system} for generic examples, and
>> @value{qemu_system_x86} for examples that only work with the x86
>> binaries.
>>
>> Signed-off-by: Thomas Huth 
> 
> Makes sense to me, but do we want a definitions.texi or similar that can
> be used globally (and is easy to find and edit by e.g. distro
> packagers), or is it better to re-define them per-each file as you've done?

Hmm, as long as it's just one or two variables, it seems a little bit
excessive to me, but if we'd have more config knobs, that would
certainly be the right way to go ... but currently we do not have any
more variables, do we?

 Thomas



Re: [Qemu-block] [Qemu-devel] [PATCH v3 0/4] iotests: use python logging

2019-08-28 Thread John Snow
Gentle ping. This should be fairly easy to review, I hope; the worst of
it is making sure that no tests remain that don't engage an entry point
in iotests.py anymore.

On 8/20/19 7:52 PM, John Snow wrote:
> This series uses python logging to enable output conditionally on
> iotests.log(). We unify an initialization call (which also enables
> debugging output for those tests with -d) and then make the switch
> inside of iotests.
> 
> It will help alleviate the need to create logged/unlogged versions
> of all the various helpers we have made.
> 
> V3:
>  - Rebased for 4.1+; now based on main branch.
> 
> V2:
>  - Added all of the other python tests I missed to use script_initialize
>  - Refactored the common setup as per Ehabkost's suggestion
>  - Added protocol arguments to common initialization,
>but this isn't strictly required.
> 
> John Snow (4):
>   iotests: add script_initialize
>   iotest 258: use script_main
>   iotests: add protocol support to initialization info
>   iotests: use python logging for iotests.log()
> 
>  tests/qemu-iotests/030|   4 +-
>  tests/qemu-iotests/149|   3 +-
>  tests/qemu-iotests/194|   3 +-
>  tests/qemu-iotests/202|   3 +-
>  tests/qemu-iotests/203|   3 +-
>  tests/qemu-iotests/206|   2 +-
>  tests/qemu-iotests/207|   4 +-
>  tests/qemu-iotests/208|   2 +-
>  tests/qemu-iotests/209|   2 +-
>  tests/qemu-iotests/210|   4 +-
>  tests/qemu-iotests/211|   4 +-
>  tests/qemu-iotests/212|   4 +-
>  tests/qemu-iotests/213|   4 +-
>  tests/qemu-iotests/216|   3 +-
>  tests/qemu-iotests/218|   2 +-
>  tests/qemu-iotests/219|   2 +-
>  tests/qemu-iotests/222|   5 +-
>  tests/qemu-iotests/224|   3 +-
>  tests/qemu-iotests/228|   3 +-
>  tests/qemu-iotests/234|   3 +-
>  tests/qemu-iotests/235|   4 +-
>  tests/qemu-iotests/236|   2 +-
>  tests/qemu-iotests/237|   2 +-
>  tests/qemu-iotests/238|   2 +
>  tests/qemu-iotests/242|   2 +-
>  tests/qemu-iotests/245|   1 +
>  tests/qemu-iotests/245.out|  24 
>  tests/qemu-iotests/246|   2 +-
>  tests/qemu-iotests/248|   2 +-
>  tests/qemu-iotests/254|   2 +-
>  tests/qemu-iotests/255|   2 +-
>  tests/qemu-iotests/256|   2 +-
>  tests/qemu-iotests/258|   8 +--
>  tests/qemu-iotests/262|   3 +-
>  tests/qemu-iotests/iotests.py | 108 ++
>  35 files changed, 124 insertions(+), 105 deletions(-)
> 



Re: [Qemu-block] [Qemu-devel] [PATCH v2] job: drop job_drain

2019-08-28 Thread John Snow



On 8/23/19 10:32 AM, Vladimir Sementsov-Ogievskiy wrote:
> In job_finish_sync job_enter should be enough for a job to make some
> progress and draining is a wrong tool for it. So use job_enter directly
> here and drop job_drain with all related staff not used more.
> 
> Suggested-by: Kevin Wolf 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
> 
> v2: drop drain from tests/test-*, which I missed in v1

This patch has rotted a little, but not badly enough that --3way can't
handle it.

> 
>  include/block/blockjob_int.h | 19 ---
>  include/qemu/job.h   | 13 -
>  block/backup.c   | 19 +--
>  block/commit.c   |  1 -
>  block/mirror.c   | 28 +++-
>  block/stream.c   |  1 -
>  blockjob.c   | 13 -
>  job.c| 12 +---
>  tests/test-bdrv-drain.c  |  2 --
>  tests/test-block-iothread.c  |  1 -
>  tests/test-blockjob-txn.c|  1 -
>  tests/test-blockjob.c|  2 --
>  12 files changed, 5 insertions(+), 107 deletions(-)
> 
> diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
> index e4a318dd15..e2824a36a8 100644
> --- a/include/block/blockjob_int.h
> +++ b/include/block/blockjob_int.h
> @@ -52,17 +52,6 @@ struct BlockJobDriver {
>   * besides job->blk to the new AioContext.
>   */
>  void (*attached_aio_context)(BlockJob *job, AioContext *new_context);
> -
> -/*
> - * If the callback is not NULL, it will be invoked when the job has to be
> - * synchronously cancelled or completed; it should drain 
> BlockDriverStates
> - * as required to ensure progress.
> - *
> - * Block jobs must use the default implementation for job_driver.drain,
> - * which will in turn call this callback after doing generic block job
> - * stuff.
> - */
> -void (*drain)(BlockJob *job);
>  };
>  
>  /**
> @@ -107,14 +96,6 @@ void block_job_free(Job *job);
>   */
>  void block_job_user_resume(Job *job);
>  
> -/**
> - * block_job_drain:
> - * Callback to be used for JobDriver.drain in all block jobs. Drains the main
> - * block node associated with the block jobs and calls BlockJobDriver.drain 
> for
> - * job-specific actions.
> - */
> -void block_job_drain(Job *job);
> -
>  /**
>   * block_job_ratelimit_get_delay:
>   *
> diff --git a/include/qemu/job.h b/include/qemu/job.h
> index 9e7cd1e4a0..09739b8dd9 100644
> --- a/include/qemu/job.h
> +++ b/include/qemu/job.h
> @@ -220,13 +220,6 @@ struct JobDriver {
>   */
>  void (*complete)(Job *job, Error **errp);
>  
> -/*
> - * If the callback is not NULL, it will be invoked when the job has to be
> - * synchronously cancelled or completed; it should drain any activities
> - * as required to ensure progress.
> - */
> -void (*drain)(Job *job);
> -
>  /**
>   * If the callback is not NULL, prepare will be invoked when all the jobs
>   * belonging to the same transaction complete; or upon this job's 
> completion
> @@ -470,12 +463,6 @@ bool job_user_paused(Job *job);
>   */
>  void job_user_resume(Job *job, Error **errp);
>  
> -/*
> - * Drain any activities as required to ensure progress. This can be called 
> in a
> - * loop to synchronously complete a job.
> - */
> -void job_drain(Job *job);
> -
>  /**
>   * Get the next element from the list of block jobs after @job, or the
>   * first one if @job is %NULL.
> diff --git a/block/backup.c b/block/backup.c
> index 715e1d3be8..d1ecdfa9aa 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -320,21 +320,6 @@ void backup_do_checkpoint(BlockJob *job, Error **errp)
>  hbitmap_set(backup_job->copy_bitmap, 0, backup_job->len);
>  }
>  
> -static void backup_drain(BlockJob *job)
> -{
> -BackupBlockJob *s = container_of(job, BackupBlockJob, common);
> -
> -/* Need to keep a reference in case blk_drain triggers execution
> - * of backup_complete...
> - */
> -if (s->target) {
> -BlockBackend *target = s->target;
> -blk_ref(target);
> -blk_drain(target);
> -blk_unref(target);
> -}
> -}
> -
>  static BlockErrorAction backup_error_action(BackupBlockJob *job,
>  bool read, int error)
>  {
> @@ -488,13 +473,11 @@ static const BlockJobDriver backup_job_driver = {
>  .job_type   = JOB_TYPE_BACKUP,
>  .free   = block_job_free,
>  .user_resume= block_job_user_resume,
> -.drain  = block_job_drain,
>  .run= backup_run,
>  .commit = backup_commit,
>  .abort  = backup_abort,
>  .clean  = backup_clean,
> -},
> -.drain  = backup_drain,
> +}
>  };
>  
>  static int64_t backup_calculate_cluster_size(BlockDriverState *target,
> diff --git a/block/com

Re: [Qemu-block] [Qemu-devel] [PATCH v6 1/6] iotests: allow Valgrind checking all QEMU processes

2019-08-28 Thread Eric Blake
On 8/28/19 5:58 PM, John Snow wrote:

>> +++ b/tests/qemu-iotests/common.rc
>> @@ -60,61 +60,132 @@ if ! . ./common.config
>>  exit 1
>>  fi
>>  
>> +# Unset the variables to turn Valgrind off for specific processes, e.g.

That's not unsetting, that's setting to the empty string.

>> +# $ VALGRIND_QEMU_IO= ./check -qcow2 -valgrind 015
>> +
>> +: ${VALGRIND_QEMU_VM='y'}
>> +: ${VALGRIND_QEMU_IMG='y'}
>> +: ${VALGRIND_QEMU_IO='y'}
>> +: ${VALGRIND_QEMU_NBD='y'}
>> +: ${VALGRIND_QEMU_VXHS='y'}
>> +
> 
> I have to admit to you that I'm not familiar with this trick. I'm
> looking it up and I see := documented, but not = alone.

It's been a repeated complaint to the bash developer that the manual is
doing a disservice to its users by not documenting ${var=val} in an
easily searchable form.  It IS documented, but only by virtue of
${var:=val} occurring under a section header that states:

   When not performing substring expansion,  using  the  forms
documented
   below  (e.g.,  :-),  bash  tests for a parameter that is unset or
null.
   Omitting the colon results in a test  only  for  a  parameter
that  is
   unset.

So the choice is whether you want to special case a variable set to an
empty string the same as an unset variable, or the same as a variable
with a non-empty value.

> 
> It doesn't seem documented here at all:
> https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html
> 
> I see it here, though:
> https://www.tldp.org/LDP/abs/html/parameter-substitution.html
> 
> And it seems to work, but I'm not sure if this works with BSD or OSX's
> sh. I see Eric comment on that compatibility a lot, so maybe I'll let
> him chime in.

It's quite portable; POSIX requires it, and autoconf relies on it.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v6 6/6] iotests: extend sleeping time under Valgrind

2019-08-28 Thread John Snow



On 8/26/19 11:50 AM, Andrey Shinkevich wrote:
> To synchronize the time when QEMU is running longer under the Valgrind,
> increase the sleeping time in the test 247.
> 
> Signed-off-by: Andrey Shinkevich 
> Reviewed-by: Vladimir Sementsov-Ogievskiy 

Probably fine, as discussed. We can work on eliminating sleeps and other
pauses of indeterminate length as an ongoing process.

Reviewed-by: John Snow 



Re: [Qemu-block] [Qemu-devel] [PATCH v6 2/6] iotests: exclude killed processes from running under Valgrind

2019-08-28 Thread John Snow



On 8/26/19 11:50 AM, Andrey Shinkevich wrote:
>  The Valgrind tool fails to manage its termination in multi-threaded
>  processes when they raise the signal SIGKILL. The bug has been reported
>  to the Valgrind maintainers and was registered as the bug #409141:
>  https://bugs.kde.org/show_bug.cgi?id=409141
>  Let's exclude such test cases from running under the Valgrind until a
>  new version with the bug fix is released because checking for the
>  memory issues is covered by  other test cases.
> 
> Suggested-by: John Snow 
> Signed-off-by: Andrey Shinkevich 

Seems clear now, thanks.

Reviewed-by: John Snow 



Re: [Qemu-block] [Qemu-devel] [PATCH v6 1/6] iotests: allow Valgrind checking all QEMU processes

2019-08-28 Thread John Snow



On 8/26/19 11:50 AM, Andrey Shinkevich wrote:
> With the '-valgrind' option, let all the QEMU processes be run under
> the Valgrind tool. The Valgrind own parameters may be set with its
> environment variable VALGRIND_OPTS, e.g.
> $ VALGRIND_OPTS="--leak-check=yes" ./check -valgrind 
> or they may be listed in the Valgrind checked file ./.valgrindrc or
> ~/.valgrindrc like
> --memcheck:leak-check=no
> --memcheck:track-origins=yes
> To exclude a specific process from running under the Valgrind, the
> corresponding environment variable VALGRIND_QEMU_ is to be unset:
> $ VALGRIND_QEMU_IO= ./check -valgrind 
> When QEMU-IO process is being killed, the shell report refers to the
> text of the command in _qemu_io_wrapper(), which was modified with this
> patch. So, the benchmark output for the tests 039, 061 and 137 is to be
> changed also.
> 
> Signed-off-by: Andrey Shinkevich 
> ---
>  tests/qemu-iotests/039.out   |  30 ++--
>  tests/qemu-iotests/061.out   |  12 +
>  tests/qemu-iotests/137.out   |   6 +--
>  tests/qemu-iotests/common.rc | 107 
> +++
>  4 files changed, 97 insertions(+), 58 deletions(-)
> 
> diff --git a/tests/qemu-iotests/039.out b/tests/qemu-iotests/039.out
> index 724d7b2..66d2159 100644
> --- a/tests/qemu-iotests/039.out
> +++ b/tests/qemu-iotests/039.out
> @@ -11,11 +11,7 @@ No errors were found on the image.
>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
>  wrote 512/512 bytes at offset 0
>  512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> -./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; then
> -exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
> "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
> -else
> -exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
> -fi )
> +./common.rc: Killed  ( VALGRIND_QEMU="${VALGRIND_ON}" 
> _qemu_proc_exec "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
>  incompatible_features 0x1
>  ERROR cluster 5 refcount=0 reference=1
>  ERROR OFLAG_COPIED data cluster: l2_entry=8005 refcount=0
> @@ -50,11 +46,7 @@ read 512/512 bytes at offset 0
>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
>  wrote 512/512 bytes at offset 0
>  512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> -./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; then
> -exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
> "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
> -else
> -exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
> -fi )
> +./common.rc: Killed  ( VALGRIND_QEMU="${VALGRIND_ON}" 
> _qemu_proc_exec "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
>  incompatible_features 0x1
>  ERROR cluster 5 refcount=0 reference=1
>  Rebuilding refcount structure
> @@ -68,11 +60,7 @@ incompatible_features 0x0
>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
>  wrote 512/512 bytes at offset 0
>  512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> -./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; then
> -exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
> "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
> -else
> -exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
> -fi )
> +./common.rc: Killed  ( VALGRIND_QEMU="${VALGRIND_ON}" 
> _qemu_proc_exec "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
>  incompatible_features 0x0
>  No errors were found on the image.
>  
> @@ -91,11 +79,7 @@ No errors were found on the image.
>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
>  wrote 512/512 bytes at offset 0
>  512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> -./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; then
> -exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
> "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
> -else
> -exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
> -fi )
> +./common.rc: Killed  ( VALGRIND_QEMU="${VALGRIND_ON}" 
> _qemu_proc_exec "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
>  incompatible_features 0x1
>  ERROR cluster 5 refcount=0 reference=1
>  ERROR OFLAG_COPIED data cluster: l2_entry=8005 refcount=0
> @@ -105,11 +89,7 @@ Data may be corrupted, or further writes to the image may 
> corrupt it.
>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
>  wrote 512/512 bytes at offset 0
>  512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> -./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; then
> -exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
> "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
> -else
> -exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
> -fi )
> +./common.rc: Killed  ( VALGRIND_QEMU="${VALGRIND_ON}" 
> _qemu_proc_exec "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
>  incompatible_features

Re: [Qemu-block] [PATCH 2/2] iotests: Test reverse sub-cluster qcow2 writes

2019-08-28 Thread John Snow



On 8/23/19 9:03 AM, Max Reitz wrote:
> This exercises the regression introduced in commit
> 50ba5b2d994853b38fed10e0841b119da0f8b8e5.  On my machine, it has close
> to a 50 % false-negative rate, but that should still be sufficient to
> test the fix.
> 
> Signed-off-by: Max Reitz 
> ---
>  tests/qemu-iotests/265 | 67 ++
>  tests/qemu-iotests/265.out |  6 
>  tests/qemu-iotests/group   |  1 +
>  3 files changed, 74 insertions(+)
>  create mode 100755 tests/qemu-iotests/265
>  create mode 100644 tests/qemu-iotests/265.out
> 
> diff --git a/tests/qemu-iotests/265 b/tests/qemu-iotests/265
> new file mode 100755
> index 00..dce6f77be3
> --- /dev/null
> +++ b/tests/qemu-iotests/265
> @@ -0,0 +1,67 @@
> +#!/usr/bin/env bash
> +#
> +# Test reverse-ordered qcow2 writes on a sub-cluster level
> +#
> +# Copyright (C) 2019 Red Hat, Inc.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see .
> +#
> +
> +seq=$(basename $0)
> +echo "QA output created by $seq"
> +
> +status=1 # failure is the default!
> +
> +_cleanup()
> +{
> +_cleanup_test_img
> +}
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +# get standard environment, filters and checks
> +. ./common.rc
> +. ./common.filter
> +
> +# qcow2-specific test
> +_supported_fmt qcow2
> +_supported_proto file
> +_supported_os Linux
> +
> +echo '--- Writing to the image ---'
> +
> +# Reduce cluster size so we get more and quicker I/O
> +IMGOPTS='cluster_size=4096' _make_test_img 1M
> +(for ((kb = 1024 - 4; kb >= 0; kb -= 4)); do \
> + echo "aio_write -P 42 $((kb + 1))k 2k"; \
> + done) \
> + | $QEMU_IO "$TEST_IMG" > /dev/null
> +
> +echo '--- Verifying its content ---'
> +
> +(for ((kb = 0; kb < 1024; kb += 4)); do \
> +echo "read -P 0 ${kb}k 1k"; \
> +echo "read -P 42 $((kb + 1))k 2k"; \
> +echo "read -P 0 $((kb + 3))k 1k"; \
> + done) \
> + | $QEMU_IO "$TEST_IMG" | _filter_qemu_io | grep 'verification'
> +
> +# Status of qemu-io
> +if [ ${PIPESTATUS[1]} = 0 ]; then
> +echo 'Content verified.'
> +fi
> +
> +# success, all done
> +echo "*** done"
> +rm -f $seq.full
> +status=0
> diff --git a/tests/qemu-iotests/265.out b/tests/qemu-iotests/265.out
> new file mode 100644
> index 00..6eac620f25
> --- /dev/null
> +++ b/tests/qemu-iotests/265.out
> @@ -0,0 +1,6 @@
> +QA output created by 265
> +--- Writing to the image ---
> +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
> +--- Verifying its content ---
> +Content verified.
> +*** done
> diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
> index d95d556414..0c129c1644 100644
> --- a/tests/qemu-iotests/group
> +++ b/tests/qemu-iotests/group
> @@ -274,3 +274,4 @@
>  257 rw
>  258 rw quick
>  262 rw quick migration
> +265 rw auto quick
> 

Tested-by: John Snow 
Reviewed-by: John Snow 



Re: [Qemu-block] [PATCH v9 13/13] block/backup: use backup-top instead of write notifiers

2019-08-28 Thread Max Reitz
On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
> Drop write notifiers and use filter node instead.
> 
> = Changes =
> 
> 1. add filter-node-name argument for backup qmp api. We have to do it
> in this commit, as 257 needs to be fixed.

I feel a bit bad about it not being an implicit node.  But I know you
don’t like them, they’re probably just broken altogether and because
libvirt doesn’t use backup (yet), probably nobody cares.

> 2. there no move write notifiers here, so is_write_notifier parameter

s/there/there are/, I suppose?

> is dropped from block-copy paths.
> 
> 3. Intersecting requests handling changed, now synchronization between
> backup-top, backup and guest writes are all done in block/block-copy.c
> and works as follows:
> 
> On copy operation, we work only with dirty areas. If bits are dirty it
> means that there are no requests intersecting with this area. We clear
> dirty bits and take bdrv range lock (bdrv_co_try_lock) on this area to
> prevent further operations from interaction with guest (only with
> guest, as neither backup nor backup-top will touch non-dirty area). If
> copy-operation failed we set dirty bits back together with releasing
> the lock.
> 
> The actual difference with old scheme is that on guest writes we
> don't lock the whole region but only dirty-parts, and to be more
> precise: only dirty-part we are currently operate on. In old scheme
> guest write to non-dirty area (which may be safely ignored by backup)
> may wait for intersecting request, touching some other area which is
> dirty.
> 
> 4. To sync with in-flight requests at job finish we now have drained
> removing of the filter, we don't need rw-lock.
> 
> = Notes =
> 
> Note the consequence of three objects appearing: backup-top, backup job
> and block-copy-state:
> 
> 1. We want to insert backup-top before job creation, to behave similar
> with mirror and commit, where job is started upon filter.
> 
> 2. We also have to create block-copy-state after filter injection, as
> we don't want it's source child be replaced by fitler. Instead we want

s/it's/its/, s/filter/filter/ (although “fitler” does have an amusing
ring to it)

> to keep BCS.source to be real source node, as we want to use
> bdrv_co_try_lock in CBW operations and it can't be used on filter, as
> on filter we already have in-flight (write) request from upper layer.

Reasonable, even more so as I suppose BCS.source should eventually be a
BdrvChild of backup-top.

What looks wrong is that the sync_bitmap is created on the source (“bs”
in backup_job_create()), but backup_cleanup_sync_bitmap() still assumes
it is on blk_bs(job->common.blk) (which is no longer true).

> So, we firstly create inject backup-top, then create job and BCS. BCS
> is the latest just to not create extra variable for it. Finally we set
> bcs for backup-top filter.
> 
> = Iotest changes =
> 
> 56: op-blocker doesn't shot now, as we set it on source, but then check

s/shot/show/?

> on filter, when trying to start second backup, so error caught in
> test_dismiss_collision is changed. It's OK anyway, as this test-case
> seems to test that after some collision we can dismiss first job and
> successfully start the second one. So, the change is that collision is
> changed from op-blocker to file-posix locks.

Well, but the op blocker belongs to the source, which should be covered
by internal permissions.  The fact that it now shows up as a file-posix
error thus shows that the conflict also moves from the source to the
target.  It’s OK because we actually don’t have a conflict on the source.

But I wonder whether it would be better for test_dismiss_collision() to
do a blockdev-backup instead where we can see the collision on the target.

Hm.  On second thought, why do we even get a conflict on the target?
block-copy does share the WRITE permission for it...

> However, it's obvious now that we'd better drop this op-blocker at all
> and add a test-case for two backups from one node (to different
> destinations) actually works. But not in these series.
> 
> 257: The test wants to emulate guest write during backup. They should
> go to filter node, not to original source node, of course. Therefore we
> need to specify filter node name and use it.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  qapi/block-core.json   |   8 +-
>  include/block/block-copy.h |   2 +-
>  include/block/block_int.h  |   1 +
>  block/backup-top.c |  14 +-
>  block/backup.c | 113 +++---
>  block/block-copy.c |  55 ---
>  block/replication.c|   2 +-
>  blockdev.c |   1 +
>  tests/qemu-iotests/056 |   2 +-
>  tests/qemu-iotests/257 |   7 +-
>  tests/qemu-iotests/257.out | 306 ++---
>  11 files changed, 230 insertions(+), 281 deletions(-)

Nice.

I don’t have much to object (for some reason, I feel a bit uneasy about
dropping all the request waiting code, but I suppose that is indeed
taken ca

Re: [Qemu-block] [PATCH 1/2] block/file-posix: Reduce xfsctl() use

2019-08-28 Thread John Snow



On 8/23/19 9:03 AM, Max Reitz wrote:
> This patch removes xfs_write_zeroes() and xfs_discard().  Both functions
> have been added just before the same feature was present through
> fallocate():
> 
> - fallocate() has supported PUNCH_HOLE for XFS since Linux 2.6.38 (March
>   2011); xfs_discard() was added in December 2010.
> 
> - fallocate() has supported ZERO_RANGE for XFS since Linux 3.15 (June
>   2014); xfs_write_zeroes() was added in November 2013.
> 
> Nowadays, all systems that qemu runs on should support both fallocate()
> features (RHEL 7's kernel does).
> 
> xfsctl() is still useful for getting the request alignment for O_DIRECT,
> so this patch does not remove our dependency on it completely.
> 
> Note that xfs_write_zeroes() had a bug: It calls ftruncate() when the
> file is shorter than the specified range (because ZERO_RANGE does not
> increase the file length).  ftruncate() may yield and then discard data
> that parallel write requests have written past the EOF in the meantime.
> Dropping the function altogether fixes the bug.
> 

And I assume getting rid of discard is just then simply convenient, so
why not.

> Suggested-by: Paolo Bonzini 
> Fixes: 50ba5b2d994853b38fed10e0841b119da0f8b8e5
> Reported-by: Lukáš Doktor 
> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Max Reitz 

Tested-by: John Snow 
Reviewed-by: John Snow 



Re: [Qemu-block] [PATCH v2 2/2] block: Remove unused masks

2019-08-28 Thread John Snow



On 8/27/19 2:59 PM, Nir Soffer wrote:
> Replace confusing usage:
> 
> ~BDRV_SECTOR_MASK
> 
> With more clear:
> 
> (BDRV_SECTOR_SIZE - 1)
> 
> Remove BDRV_SECTOR_MASK and the unused BDRV_BLOCK_OFFSET_MASK which was
> it's last user.
> 

Kind of lateral in my opinion, but if there was only ONE user across TWO
definitions, then for sure it can go, especially because using the
ALIGNED macros is indeed nicer and should be encouraged.

Reviewed-by: John Snow 

> Signed-off-by: Nir Soffer 
> ---
>  include/block/block.h | 2 --
>  migration/block.c | 2 +-
>  2 files changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/include/block/block.h b/include/block/block.h
> index 124ad40809..37c9de7446 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -143,7 +143,6 @@ typedef struct HDGeometry {
>  
>  #define BDRV_SECTOR_BITS   9
>  #define BDRV_SECTOR_SIZE   (1ULL << BDRV_SECTOR_BITS)
> -#define BDRV_SECTOR_MASK   ~(BDRV_SECTOR_SIZE - 1)
>  
>  #define BDRV_REQUEST_MAX_SECTORS MIN(SIZE_MAX >> BDRV_SECTOR_BITS, \
>   INT_MAX >> BDRV_SECTOR_BITS)
> @@ -195,7 +194,6 @@ typedef struct HDGeometry {
>  #define BDRV_BLOCK_ALLOCATED0x10
>  #define BDRV_BLOCK_EOF  0x20
>  #define BDRV_BLOCK_RECURSE  0x40
> -#define BDRV_BLOCK_OFFSET_MASK  BDRV_SECTOR_MASK
>  
>  typedef QSIMPLEQ_HEAD(BlockReopenQueue, BlockReopenQueueEntry) 
> BlockReopenQueue;
>  
> diff --git a/migration/block.c b/migration/block.c
> index aa747b55fa..92c36b68ec 100644
> --- a/migration/block.c
> +++ b/migration/block.c
> @@ -906,7 +906,7 @@ static int block_load(QEMUFile *f, void *opaque, int 
> version_id)
>  do {
>  addr = qemu_get_be64(f);
>  
> -flags = addr & ~BDRV_SECTOR_MASK;
> +flags = addr & (BDRV_SECTOR_SIZE - 1);
>  addr >>= BDRV_SECTOR_BITS;
>  
>  if (flags & BLK_MIG_FLAG_DEVICE_BLOCK) {
> 



Re: [Qemu-block] [PATCH] qemu-doc: Do not hard-code the name of the QEMU binary

2019-08-28 Thread John Snow



On 8/28/19 5:34 AM, Thomas Huth wrote:
> In our documentation, we use a mix of "$QEMU", "qemu-system-i386" and
> "qemu-system-x86_64" when we give examples to the users how to run
> QEMU. Some more consistency would be good here. Also some distributions
> use different names for the QEMU binary (e.g. "qemu-kvm" in RHEL), so
> providing more flexibility here would also be good. Thus let's define
> some variables for the names of the QEMU command and use those in the
> documentation instead: @value{qemu_system} for generic examples, and
> @value{qemu_system_x86} for examples that only work with the x86
> binaries.
> 
> Signed-off-by: Thomas Huth 

Makes sense to me, but do we want a definitions.texi or similar that can
be used globally (and is easy to find and edit by e.g. distro
packagers), or is it better to re-define them per-each file as you've done?

--js



Re: [Qemu-block] [PATCH v2 0/2] Alignment checks cleanup

2019-08-28 Thread John Snow



On 8/27/19 2:59 PM, Nir Soffer wrote:
> While working on 4k support, I noticed that there is lot of code using
> BDRV_SECTOR_SIZE (512) for checking alignment. I wonder how this can work with
> 4k storage.
> 
> Lets start by cleaning up to make the code easier to understand:
> - Use QEMU_IS_ALIGNED macro to check alignment
> - Remove unneeded masks based on BDRV_SECTOR_SIZE
> 
> Nir Soffer (2):
>   block: Use QEMU_IS_ALIGNED
>   block: Remove unused masks
> 
>  block/bochs.c | 4 ++--
>  block/cloop.c | 4 ++--
>  block/dmg.c   | 4 ++--
>  block/io.c| 8 
>  block/qcow2-cluster.c | 4 ++--
>  block/qcow2.c | 4 ++--
>  block/vvfat.c | 8 
>  include/block/block.h | 2 --
>  migration/block.c | 2 +-
>  qemu-img.c| 2 +-
>  10 files changed, 20 insertions(+), 22 deletions(-)
> 

V2 changelog?

(Looks like adding patch 2 as a result of changing away users from the
BDRV_SECTOR_MASK.)

Reviewed-by: John Snow 



Re: [Qemu-block] [PATCH v9 12/13] block: introduce backup-top filter driver

2019-08-28 Thread Max Reitz
On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
> Backup-top filter caches write operations and does copy-before-write
> operations.
> 
> The driver will be used in backup instead of write-notifiers.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/backup-top.h  |  37 +++
>  block/backup-top.c  | 245 
>  block/Makefile.objs |   2 +
>  3 files changed, 284 insertions(+)
>  create mode 100644 block/backup-top.h
>  create mode 100644 block/backup-top.c

[...]

> +static void backup_top_refresh_filename(BlockDriverState *bs)
> +{
> +if (bs->backing == NULL) {
> +/*
> + * we can be here after failed bdrv_attach_child in
> + * bdrv_set_backing_hd
> + */
> +return;
> +}
> +bdrv_refresh_filename(bs->backing->bs);

bdrv_refresh_filename() should have done this already.

> +pstrcpy(bs->exact_filename, sizeof(bs->exact_filename),
> +bs->backing->bs->filename);
> +}

[...]

> +BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
> + const char *filter_node_name,
> + Error **errp)
> +{
[...]
> +assert(!(top->backing->shared_perm & BLK_PERM_WRITE));

Not wrong, but why?

> +
> +return top;
> +}
> +
> +void bdrv_backup_top_set_bcs(BlockDriverState *bs, BlockCopyState 
> *copy_state)
> +{
> +BDRVBackupTopState *s = bs->opaque;
> +
> +assert(blk_bs(copy_state->source) == bs->backing->bs);
> +s->bcs = copy_state;
> +}
> +
> +void bdrv_backup_top_drop(BlockDriverState *bs)
> +{
> +AioContext *aio_context = bdrv_get_aio_context(bs);
> +
> +aio_context_acquire(aio_context);
> +
> +bdrv_drained_begin(bs);
> +
> +bdrv_child_try_set_perm(bs->backing, 0, BLK_PERM_ALL, &error_abort);

I would prefer a state->active = false and bdrv_child_refresh_perms().

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH 2/2] qapi: deprecate implicit filters

2019-08-28 Thread John Snow
(Peter: search for "pkrempa" down below.)

On 8/28/19 5:20 AM, Vladimir Sementsov-Ogievskiy wrote:
> 27.08.2019 23:12, John Snow wrote:
>>
>>
>> On 8/23/19 5:22 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> 14.08.2019 13:07, Vladimir Sementsov-Ogievskiy wrote:
 To get rid of implicit filters related workarounds in future let's
 deprecate them now.
>>>
>>> Interesting, could we deprecate implicit filter without deprecation of 
>>> unnecessity of
>>> parameter? As actually, it's good when this parameter is not necessary, in 
>>> most cases
>>> user is not interested in node-name.
>>>
>>
>> https://en.wiktionary.org/wiki/unnecessity -- I am surprised to learn
>> that this a real word in the language I speak. :)
>>
>> I assume you're referring to making the optional argument mandatory.
> 
> exactly, it's my a bit "google-translate-driven" English)
> 

It teaches me some fun words!

>>
>>> Obviously we can do the following:
>>>
>>> 1. In 4.2 we deprecate unnecessity, which implies deprecation of implicit 
>>> filters
>>> 2. After some releases in 4.x we can drop deprecated functionality, so we 
>>> drop it together with
>>> implicit filters. And, in same release 4.x we return it back (as it's 
>>> compatible change :)
>>> but without implicit filters (so, if filter-node-name not specified, we 
>>> just create
>>> explicit filter with autogenerated node-name)
>>>
>>> So, effectively we just drop "deprecation mark" together with implicit 
>>> filters, which is nice
>>> but actually confusing.
>>>
>>> Instead, we may do
>>> 1. In 4.2 deprecate
>>> 2. In 4.x drop optionality together with implicit filters
>>> 3. In 4.y (y > x of course) return optionality back
>>>
>>
>> Ah, I see what you're digging at here now...
>>
>>> It's a bit safer, but for users who miss releases [4.x, 4.y) it's no 
>>> difference..
>>>
>>> Or we just write in spec, that implicit filters are deprecated? But we have 
>>> nothing about implicit
>>> filters in spec. More over, we directly write that we have filter, and if 
>>> parameter is omitted
>>> it's node-name is autogenerated. So actually, the fact the filter is hidden 
>>> when filter-node-name is
>>> unspecified is _undocumented_.
>>>
>>> So, finally, it looks like nothing to deprecated in specification, we can 
>>> just drop implicit filters :)
>>>
>>> What do you think?
>>>
>>
>> What exactly _IS_ an implicit filter? How does it differ today from an
>> explicit filter? I assumed the only difference was if it was named or
>> not; but I think I must be mistaken now if you're proposing leaving the
>> interface alone entirely.
>>
>> Are they instantiated differently?
>>
> 
> As I understand, the only difference is their BlockDriverState.impicit field, 
> and several places in code
> where we skip implicit filter when trying to find something in a chain 
> starting from a device.
> 

Oh, oh, yes. I see.

> Hmm, OK, let's see:
> 
> 1. the only implicit filters are commit_top and mirror_top if user don't 
> specify filter-node-name.
> 
> Where it make sense, i.e., where implicit field used?
> 

`git grep -E '(->|\.)implicit'` is what I used to find usages.

> 2. bdrv_query_info, bdrv_query_bds_stats, bdrv_block_device_info(only when 
> called from bdrv_query_info), they'll
> report filter as top node if we don't mark it implicit.
> 

So that's a bit of a change, but only visually. The "reality" is still
the same, we just report it more "accurately." libvirt MIGHT need a
heads up here. I'm looping pkrempa back in for comment.


Would libvirt be negatively impacted by the revelation of formerly
internal ("implicit") nodes created by mirror and commit via query block
commands? At the moment, QEMU hides them from you if you do not name them.


> 3. bdrv_refresh_filename, bdrv_reopen_parse_backing, bdrv_drop_intermediate:
>I think it's not a problem, just drop special case for implicit fitlers
>

I'm much less certain about what the impact of this would be and would
need to audit it (and don't have the time to, personally.)

Do you have a POC or RFC patch that demonstrates dropping these special
cases? It might be nice to see as proof that it's safe to deprecate.

> So, seems the only real change is query-block and query-blockstats output 
> when mirror or commit is started
> without specifying filter-node-name (filter would be on top)
> 
> So, how should we deprecate this, or can we just change it?
> 

I'm not sure if it's worth it yet, what does dropping the implicit field
buy us? Conceptually I understand that it's simpler without the notion
of implicit fields, but I imagine there's some cleanup in particular
that motivated this.

I'd say to just change the behavior, we should:

- Give a standard three-release warning that the behavior will change in
an incompatible way
- Demonstrate with an RFC patch that special cases around ->implicit in
block.c can be removed and do not make the code more complex,
- Get blessings from Peter Krempa.

As always: Libvirt is not the end-all be-all o

Re: [Qemu-block] [PATCH v5 6/6] iotests: extend sleeping time under Valgrind

2019-08-28 Thread John Snow



On 8/28/19 11:24 AM, Andrey Shinkevich wrote:
> 
> 
> On 27/08/2019 22:42, John Snow wrote:
>>
>>
>> On 8/23/19 11:27 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> 16.08.2019 4:01, John Snow wrote:


 On 7/19/19 12:30 PM, Andrey Shinkevich wrote:
> To synchronize the time when QEMU is running longer under the Valgrind,
> increase the sleeping time in the test 247.
>
> Signed-off-by: Andrey Shinkevich 
> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> ---
>tests/qemu-iotests/247 | 6 +-
>1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/tests/qemu-iotests/247 b/tests/qemu-iotests/247
> index 546a794..c853b73 100755
> --- a/tests/qemu-iotests/247
> +++ b/tests/qemu-iotests/247
> @@ -57,7 +57,11 @@ TEST_IMG="$TEST_IMG.4" _make_test_img $size
>{"execute":"block-commit",
> "arguments":{"device":"format-4", "top-node": "format-2", 
> "base-node":"format-0", "job-id":"job0"}}
>EOF
> -sleep 1
> +if [ "${VALGRIND_QEMU}" == "y" ]; then
> +sleep 10
> +else
> +sleep 1
> +fi
>echo '{"execute":"quit"}'
>) | $QEMU -qmp stdio -nographic -nodefaults \
>-blockdev 
> file,node-name=file-0,filename=$TEST_IMG.0,auto-read-only=on \
>

 This makes me nervous, though. Won't this race terribly? (Wait, why
 doesn't it race already?)

>>>
>>> Hmm, however it works somehow. I'm afraid that everything with "sleep" is 
>>> definitely racy..
>>> Or what do you mean?
>>>
>>
>> Right -- anything with a sleep is already at risk for racing.
>>
>> What I am picking up on here is that with valgrind, there is an even
>> greater computational overhead that's much harder to predict, so I was
>> wondering how these values were determined.
>>
> 
> I just followed the trend and extended the sleeping time with a grater 
> tolerance so that the test could pass on systems where the 'sleep 1' 
> command helps to pass without Valgrind. We could rewrite the test 247 in 
> Python in a separate series, shall we?
> 

If you have the time, but I don't think anyone will require it for this
series.

Just reviewing "out loud." I'll look at V6 soon.

--js



Re: [Qemu-block] [PATCH v9 11/13] block: add lock/unlock range functions

2019-08-28 Thread Max Reitz
On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
> From: Vladimir Sementsov-Ogievskiy 

Hm. :-)

Do you want to fix that?

> Introduce lock/unlock range functionality, based on serialized
> requests. This is needed to refactor backup, dropping local
> tracked-request-like synchronization.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/block/block_int.h |  4 
>  block/io.c| 44 ++-
>  2 files changed, 47 insertions(+), 1 deletion(-)

Apart from that, I can’t see any changes from v8, so:

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v9 08/13] iotests: 257: drop unused Drive.device field

2019-08-28 Thread Max Reitz
On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
> After previous commit Drive.device is actually unused. Drop it together
> with .name property.  While being here reuse .node in qmp commands
> instead of writing 'drive0' twice.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  tests/qemu-iotests/257 | 37 +++--
>  1 file changed, 15 insertions(+), 22 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v9 09/13] iotests: 257: drop device_add

2019-08-28 Thread Max Reitz
On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
> SCSI devices are unused in test, drop them.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  tests/qemu-iotests/257 |  8 ---
>  tests/qemu-iotests/257.out | 44 --
>  2 files changed, 52 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v9 07/13] iotests: prepare 124 and 257 bitmap querying for backup-top filter

2019-08-28 Thread Max Reitz
On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
> After backup-top filter appearing it's not possible to see dirty
> bitmaps in top node, so use node-name instead.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  tests/qemu-iotests/124|   3 +-
>  tests/qemu-iotests/257|  39 +---
>  tests/qemu-iotests/257.out| 364 +-
>  tests/qemu-iotests/iotests.py |  22 ++
>  4 files changed, 173 insertions(+), 255 deletions(-)
> 
> diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
> index 3440f54781..8b6024769c 100755
> --- a/tests/qemu-iotests/124
> +++ b/tests/qemu-iotests/124
> @@ -749,8 +749,7 @@ class 
> TestIncrementalBackupBlkdebug(TestIncrementalBackupBase):
>  
>  # Bitmap Status Check
>  query = self.vm.qmp('query-block')
> -ret = [bmap for bmap in query['return'][0]['dirty-bitmaps']
> -   if bmap.get('name') == bitmap.name][0]
> +ret = self.vm.get_bitmap(None, drive0['id'], bitmap.name)
>  self.assert_qmp(ret, 'count', 458752)
>  self.assert_qmp(ret, 'granularity', 65536)
>  self.assert_qmp(ret, 'status', 'frozen')

I see a couple more instances of querying the bitmaps through
query-block here.  Wouldn’t it make sense to replace them all with
get_bitmap()?

[...]

> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index 84438e837c..9381964d9f 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -643,6 +643,28 @@ class VM(qtest.QEMUQtestMachine):
>  return x
>  return None
>  
> +def query_bitmaps(self):
> +res = self.qmp("query-named-block-nodes")
> +return {"bitmaps": {device['node-name']: device['dirty-bitmaps']
> +for device in res['return']
> +if 'dirty-bitmaps' in device}}

I’d leave the wrapping in {"bitmaps": x} to the callers, if they need it.

> +
> +def get_bitmap(self, bitmaps, node_name, name, recording=None):
> +"""
> +get a specific bitmap from the object returned by query_bitmaps.
> +:param recording: If specified, filter results by the specified 
> value.
> +"""
> +if bitmaps is None:
> +bitmaps = self.query_bitmaps()
> +
> +for bitmap in bitmaps['bitmaps'][node_name]:
> +if bitmap.get('name', '') == name:
> +if recording is None:
> +return bitmap
> +elif bitmap.get('recording') == recording:
> +return bitmap

Maybe add a “break” or “return None” here?

(Yes, yes, you just moved existing code.  Still.)

Max

> +return None
> +
>  
>  index_re = re.compile(r'([^\[]+)\[([^\]]+)\]')
>  
> 




signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v9 06/13] block: teach bdrv_debug_breakpoint skip filters with backing

2019-08-28 Thread Max Reitz
On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
> Teach bdrv_debug_breakpoint and bdrv_debug_remove_breakpoint skip
> filters with backing. This is needed to implement and use in backup job
> it's own backup_top filter driver (like mirror already has one), and
> without this improvement, breakpoint removal will fail at least in 55
> iotest.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block.c | 34 ++
>  1 file changed, 26 insertions(+), 8 deletions(-)

FWIW

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v9 05/13] block: move block_copy from block/backup.c to separate file

2019-08-28 Thread Max Reitz
On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
> Split block_copy to separate file, to be cleanly shared with backup-top
> filter driver in further commits.
> 
> It's a clean movement, the only change is drop "static" from interface
> functions.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/block/block-copy.h |  59 +++
>  block/backup.c | 313 +
>  block/block-copy.c | 307 
>  block/Makefile.objs|   1 +
>  block/trace-events |   2 +
>  5 files changed, 370 insertions(+), 312 deletions(-)
>  create mode 100644 include/block/block-copy.h
>  create mode 100644 block/block-copy.c

May change depending on changes to the preceding patches, but FWIW

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v9 04/13] block/backup: adjust block-copy functions style

2019-08-28 Thread Max Reitz
On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
> Fix comment style and reflow arguments in same manner like
> block_copy_state_new.

I like the current function header style better.

Max

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/backup.c | 40 +++-
>  1 file changed, 19 insertions(+), 21 deletions(-)



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v9 03/13] block/backup: introduce BlockCopyState

2019-08-28 Thread Max Reitz
On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
> Split copying code part from backup to "block-copy", including separate
> state structure and function renaming. This is needed to share it with
> backup-top filter driver in further commits.
> 
> Notes:
> 
> 1. As BlockCopyState keeps own BlockBackend objects, remaining

I suppose these should be BdrvChild objects at some point, but doing it
now would just mean effectively duplicating code from block-backend.c.
(“now” = before we have a backup-top filter to attach the children to.)

> job->common.blk users only use it to get bs by blk_bs() call, so clear
> job->commen.blk permissions set in block_job_create.
> 
> 2. Rename s/initializing_bitmap/skip_unallocated/ to sound a bit better
> as interface to BlockCopyState
> 
> 3. Split is not very clean: there left some duplicated fields, backup

Are there any but cluster_size and len (and source, in a sense)?

> code uses some BlockCopyState fields directly, let's postpone it for
> further improvements and keep this comment simpler for review.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/backup.c | 324 +++--
>  block/trace-events |  12 +-
>  2 files changed, 200 insertions(+), 136 deletions(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 13a1d80157..f52ac622e0 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -35,12 +35,35 @@ typedef struct CowRequest {
>  CoQueue wait_queue; /* coroutines blocked on this request */
>  } CowRequest;
>  
> +/*
> + * ProgressCallbackFunc
> + *
> + * Called when some progress is done in context of BlockCopyState:
> + *  1. When some bytes copied, called with @bytes > 0.
> + *  2. When some bytes resetted from copy_bitmap, called with @bytes = 0 
> (user

*reset

> + * may recalculate remaining bytes from copy_bitmap dirty count.
> + */
> +typedef void (*ProgressCallbackFunc)(int64_t bytes, void *opaque);

Maybe there should be two callbacks instead, one for “We’ve actively
made progress” (bytes > 0) and one for “The expected length has changed”
(bytes == 0)?

> +typedef struct BlockCopyState {
> +BlockBackend *source;
> +BlockBackend *target;
> +BdrvDirtyBitmap *copy_bitmap;
> +int64_t cluster_size;
> +bool use_copy_range;
> +int64_t copy_range_size;
> +uint64_t len;
> +
> +BdrvRequestFlags write_flags;
> +bool skip_unallocated;

The rename seems reasonable, although I think this should get a comment,
because it doesn’t mean just to skip unallocated clusters; it also means
to clear unallocated clusters from the bitmap.

> +
> +ProgressCallbackFunc progress_callback;
> +void *progress_opaque;
> +} BlockCopyState;
> +
>  typedef struct BackupBlockJob {
>  BlockJob common;
> -BlockBackend *target;
>  
>  BdrvDirtyBitmap *sync_bitmap;
> -BdrvDirtyBitmap *copy_bitmap;
>  
>  MirrorSyncMode sync_mode;
>  BitmapSyncMode bitmap_mode;

[...]

> @@ -99,9 +118,83 @@ static void cow_request_end(CowRequest *req)
>  qemu_co_queue_restart_all(&req->wait_queue);
>  }
>  
> +static void block_copy_state_free(BlockCopyState *s)
> +{
> +if (!s) {
> +return;
> +}
> +
> +bdrv_release_dirty_bitmap(blk_bs(s->source), s->copy_bitmap);
> +blk_unref(s->source);
> +s->source = NULL;
> +blk_unref(s->target);
> +s->target = NULL;

I’m not quite sure why you NULL these pointers when you free the whole
object next anyway.

> +g_free(s);
> +}
> +
> +static BlockCopyState *block_copy_state_new(
> +BlockDriverState *source, BlockDriverState *target,
> +int64_t cluster_size, BdrvRequestFlags write_flags,
> +ProgressCallbackFunc progress_callback, void *progress_opaque,
> +Error **errp)
> +{
> +BlockCopyState *s;
> +int ret;
> +uint64_t no_resize = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE |
> + BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD;
> +BdrvDirtyBitmap *copy_bitmap =
> +bdrv_create_dirty_bitmap(source, cluster_size, NULL, errp);
> +

This probably were easier to read if you didn’t initialize copy_bitmap
with the bdrv_create_dirty_bitmap() call but instead moved that call
right above the if () here (it still fits on a single line).

> +if (!copy_bitmap) {
> +return NULL;
> +}
> +bdrv_disable_dirty_bitmap(copy_bitmap);
> +
> +s = g_new0(BlockCopyState, 1);

With the following compound literal, you don’t need to allocate
zero-initialized memory here.

> +*s = (BlockCopyState) {
> +.source = blk_new(bdrv_get_aio_context(source),
> +  BLK_PERM_CONSISTENT_READ, no_resize),
> +.target = blk_new(bdrv_get_aio_context(target),
> +  BLK_PERM_WRITE, no_resize),

Maybe we want to assert that source’s and target’s context are the same?

> +.copy_bitmap = copy_bitmap,
> +.cluster_size = cluster_size,
> +.len = bdrv_dir

Re: [Qemu-block] [PATCH v5 6/6] iotests: extend sleeping time under Valgrind

2019-08-28 Thread Andrey Shinkevich


On 27/08/2019 22:42, John Snow wrote:
> 
> 
> On 8/23/19 11:27 AM, Vladimir Sementsov-Ogievskiy wrote:
>> 16.08.2019 4:01, John Snow wrote:
>>>
>>>
>>> On 7/19/19 12:30 PM, Andrey Shinkevich wrote:
 To synchronize the time when QEMU is running longer under the Valgrind,
 increase the sleeping time in the test 247.

 Signed-off-by: Andrey Shinkevich 
 Reviewed-by: Vladimir Sementsov-Ogievskiy 
 ---
tests/qemu-iotests/247 | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)

 diff --git a/tests/qemu-iotests/247 b/tests/qemu-iotests/247
 index 546a794..c853b73 100755
 --- a/tests/qemu-iotests/247
 +++ b/tests/qemu-iotests/247
 @@ -57,7 +57,11 @@ TEST_IMG="$TEST_IMG.4" _make_test_img $size
{"execute":"block-commit",
 "arguments":{"device":"format-4", "top-node": "format-2", 
 "base-node":"format-0", "job-id":"job0"}}
EOF
 -sleep 1
 +if [ "${VALGRIND_QEMU}" == "y" ]; then
 +sleep 10
 +else
 +sleep 1
 +fi
echo '{"execute":"quit"}'
) | $QEMU -qmp stdio -nographic -nodefaults \
-blockdev 
 file,node-name=file-0,filename=$TEST_IMG.0,auto-read-only=on \

>>>
>>> This makes me nervous, though. Won't this race terribly? (Wait, why
>>> doesn't it race already?)
>>>
>>
>> Hmm, however it works somehow. I'm afraid that everything with "sleep" is 
>> definitely racy..
>> Or what do you mean?
>>
> 
> Right -- anything with a sleep is already at risk for racing.
> 
> What I am picking up on here is that with valgrind, there is an even
> greater computational overhead that's much harder to predict, so I was
> wondering how these values were determined.
> 

I just followed the trend and extended the sleeping time with a grater 
tolerance so that the test could pass on systems where the 'sleep 1' 
command helps to pass without Valgrind. We could rewrite the test 247 in 
Python in a separate series, shall we?

Andrey

> (I wouldn't withhold an RB for that alone -- the sleeps are existing
> problems.)
> 
> What I moved on to wondering in particular is why test 247 doesn't
> already have race problems, because it looks quite fragile.
> 
> Neither of these are really Andrey's problems; I was just surprised
> momentarily that I don't see 247 fail more often already, as-is.
> 
> --js
> 

-- 
With the best regards,
Andrey Shinkevich


Re: [Qemu-block] [PATCH v5 4/6] iotests: Valgrind fails with nonexistent directory

2019-08-28 Thread Andrey Shinkevich


On 27/08/2019 22:45, John Snow wrote:
> 
> 
> On 8/25/19 11:24 AM, Andrey Shinkevich wrote:
>>
>>
>> On 16/08/2019 03:55, John Snow wrote:
>>>
>>>
>>> On 7/19/19 12:30 PM, Andrey Shinkevich wrote:
 The Valgrind uses the exported variable TMPDIR and fails if the
 directory does not exist. Let us exclude such a test case from
 being run under the Valgrind and notify the user of it.

 Suggested-by: Kevin Wolf 
 Signed-off-by: Andrey Shinkevich 
 ---
tests/qemu-iotests/051 | 4 
1 file changed, 4 insertions(+)

 diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
 index ce942a5..f8141ca 100755
 --- a/tests/qemu-iotests/051
 +++ b/tests/qemu-iotests/051
 @@ -377,6 +377,10 @@ printf %b "qemu-io $device_id \"write -P 0x33 0 
 4k\"\ncommit $device_id\n" |
$QEMU_IO -c "read -P 0x33 0 4k" "$TEST_IMG" | _filter_qemu_io

# Using snapshot=on with a non-existent TMPDIR
 +if [ "${VALGRIND_QEMU}" == "y" ]; then
 +_casenotrun "Valgrind needs a valid TMPDIR for itself"
 +fi
 +VALGRIND_QEMU="" \
TMPDIR=/nonexistent run_qemu -drive driver=null-co,snapshot=on

# Using snapshot=on together with read-only=on

>>>
>>> The only other way around this would be a complicated mechanism to set
>>> the TMPDIR for valgrind's sub-processes only, with e.g.
>>>
>>> valgrind ... env TMPDIR=/nonexistent qemu ...
>>>
>>> ... It's probably not worth trying to concoct such a thing; but I
>>> suppose it is possible. You'd have to set up a generic layer for setting
>>> environment variables, then in the qemu shim, you could either set them
>>> directly (non-valgrind invocation) or set them as part of the valgrind
>>> command-line.
>>>
>>> Or you could just take my R-B:
>>>
>>> Reviewed-by: John Snow 
>>>
>>
>> Thanks again John for your review and the advice.
>> Probably, it doesn't worth efforts to manage that case because QEMU
>> should fail anyway with the error message "Could not get temporary
>> filename: No such file or directory". So, we would not benefit much from
>> that run. We have other test cases that cover the main functionality.
>> It's just to check the QEMU error path for possible memory issues. Shall we?
>>
>> Andrey
>>
> 
> Yeah, don't bother with this for now. I just have a personal compulsion
> to try to concretely measure how much work it would take to avoid a
> "hack" and then make my decision.
> 
> You're free to just take the R-B :)
> 
> --js
> 

Thank you, John. Done in v6.
Please check the series in the email message
"[PATCH v6 0/6] Allow Valgrind checking all QEMU processes"
by 26.08.2019 with the message ID
<1566834628-485525-1-git-send-email-andrey.shinkev...@virtuozzo.com>

Andrey
-- 
With the best regards,
Andrey Shinkevich


Re: [Qemu-block] [PATCH v5 1/6] iotests: allow Valgrind checking all QEMU processes

2019-08-28 Thread Andrey Shinkevich


On 27/08/2019 22:56, John Snow wrote:
> 
> 
> On 8/25/19 11:26 AM, Andrey Shinkevich wrote:
>>
>>
>> On 16/08/2019 01:49, John Snow wrote:
>>>
>>>
>>> On 7/19/19 12:30 PM, Andrey Shinkevich wrote:
 With the '-valgrind' option, let all the QEMU processes be run under
 the Valgrind tool. The Valgrind own parameters may be set with its
 environment variable VALGRIND_OPTS, e.g.
 VALGRIND_OPTS="--leak-check=yes" ./check -qcow2 -valgrind 
 or they may be listed in the Valgrind checked file ./.valgrindrc or
 ~/.valgrindrc like
 --memcheck:leak-check=no
 --memcheck:track-origins=yes
 When QEMU-IO process is being killed, the shell report refers to the
 text of the command in _qemu_io_wrapper(), which was modified with this
 patch. So, the benchmark output for the tests 039, 061 and 137 is to be
 changed also.

>>>
>>> Oh, weird. "VALGRIND_QEMU=y" actually has just meant ... valgrind
>>> qemu-io. OK.
>>>
 Signed-off-by: Andrey Shinkevich 
 ---
tests/qemu-iotests/039.out   | 30 ---
tests/qemu-iotests/061.out   | 12 ++--
tests/qemu-iotests/137.out   |  6 +---
tests/qemu-iotests/common.rc | 69 
 
4 files changed, 59 insertions(+), 58 deletions(-)

 diff --git a/tests/qemu-iotests/039.out b/tests/qemu-iotests/039.out
 index 724d7b2..972c6c0 100644
 --- a/tests/qemu-iotests/039.out
 +++ b/tests/qemu-iotests/039.out
 @@ -11,11 +11,7 @@ No errors were found on the image.
Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
wrote 512/512 bytes at offset 0
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 -./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; 
 then
 -exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
 "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
 -else
 -exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
 -fi )
 +./common.rc: Killed  ( _qemu_proc_wrapper 
 "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
incompatible_features 0x1
ERROR cluster 5 refcount=0 reference=1
ERROR OFLAG_COPIED data cluster: l2_entry=8005 refcount=0
 @@ -50,11 +46,7 @@ read 512/512 bytes at offset 0
Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
wrote 512/512 bytes at offset 0
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 -./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; 
 then
 -exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
 "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
 -else
 -exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
 -fi )
 +./common.rc: Killed  ( _qemu_proc_wrapper 
 "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
incompatible_features 0x1
ERROR cluster 5 refcount=0 reference=1
Rebuilding refcount structure
 @@ -68,11 +60,7 @@ incompatible_features 0x0
Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
wrote 512/512 bytes at offset 0
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 -./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; 
 then
 -exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
 "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
 -else
 -exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
 -fi )
 +./common.rc: Killed  ( _qemu_proc_wrapper 
 "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
incompatible_features 0x0
No errors were found on the image.

 @@ -91,11 +79,7 @@ No errors were found on the image.
Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
wrote 512/512 bytes at offset 0
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 -./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; 
 then
 -exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
 "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
 -else
 -exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
 -fi )
 +./common.rc: Killed  ( _qemu_proc_wrapper 
 "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
incompatible_features 0x1
ERROR cluster 5 refcount=0 reference=1
ERROR OFLAG_COPIED data cluster: l2_entry=8005 refcount=0
 @@ -105,11 +89,7 @@ Data may be corrupted, or further writes to the image 
 may corrupt it.
Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
wrote 512/512 bytes at offset 0
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 -./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; 
 then
 -exec valgr

Re: [Qemu-block] [PATCH v9 02/13] block/backup: split shareable copying part from backup_do_cow

2019-08-28 Thread Max Reitz
On 28.08.19 16:27, Vladimir Sementsov-Ogievskiy wrote:
> 28.08.2019 17:22, Max Reitz wrote:
>> On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
>>> Split copying logic which will be shared with backup-top filter.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>> ---
>>>   block/backup.c | 47 ---
>>>   1 file changed, 32 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/block/backup.c b/block/backup.c
>>> index 33b144305f..13a1d80157 100644
>>> --- a/block/backup.c
>>> +++ b/block/backup.c
>>> @@ -248,26 +248,18 @@ static int64_t 
>>> backup_bitmap_reset_unallocated(BackupBlockJob *s,
>>>   return ret;
>>>   }
>>>   
>>> -static int coroutine_fn backup_do_cow(BackupBlockJob *job,
>>> -  int64_t offset, uint64_t bytes,
>>> -  bool *error_is_read,
>>> -  bool is_write_notifier)
>>> +static int coroutine_fn backup_do_copy(BackupBlockJob *job,
>>> +   int64_t offset, uint64_t bytes,
>>> +   bool *error_is_read,
>>> +   bool is_write_notifier)
>>>   {
>>> -CowRequest cow_request;
>>>   int ret = 0;
>>> -int64_t start, end; /* bytes */
>>> +int64_t start = offset, end = bytes + offset; /* bytes */
>>
>> Maybe just rename the “offset” parameter to “start”, replace the “bytes”
>> parameter by an “end” parameter, and drop this line?
>>
> 
> I really want final block_copy have more common in block-layer offset+bytes
> interface. So better to refactor a bit the function itself, but I'd prefer
> to do it as a follow-up and keep this patch simpler..

OK, but s/offset/start/ should still be possible.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v9 02/13] block/backup: split shareable copying part from backup_do_cow

2019-08-28 Thread Vladimir Sementsov-Ogievskiy
28.08.2019 17:22, Max Reitz wrote:
> On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
>> Split copying logic which will be shared with backup-top filter.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> ---
>>   block/backup.c | 47 ---
>>   1 file changed, 32 insertions(+), 15 deletions(-)
>>
>> diff --git a/block/backup.c b/block/backup.c
>> index 33b144305f..13a1d80157 100644
>> --- a/block/backup.c
>> +++ b/block/backup.c
>> @@ -248,26 +248,18 @@ static int64_t 
>> backup_bitmap_reset_unallocated(BackupBlockJob *s,
>>   return ret;
>>   }
>>   
>> -static int coroutine_fn backup_do_cow(BackupBlockJob *job,
>> -  int64_t offset, uint64_t bytes,
>> -  bool *error_is_read,
>> -  bool is_write_notifier)
>> +static int coroutine_fn backup_do_copy(BackupBlockJob *job,
>> +   int64_t offset, uint64_t bytes,
>> +   bool *error_is_read,
>> +   bool is_write_notifier)
>>   {
>> -CowRequest cow_request;
>>   int ret = 0;
>> -int64_t start, end; /* bytes */
>> +int64_t start = offset, end = bytes + offset; /* bytes */
> 
> Maybe just rename the “offset” parameter to “start”, replace the “bytes”
> parameter by an “end” parameter, and drop this line?
> 

I really want final block_copy have more common in block-layer offset+bytes
interface. So better to refactor a bit the function itself, but I'd prefer
to do it as a follow-up and keep this patch simpler..

> 
>>   void *bounce_buffer = NULL;
>>   int64_t status_bytes;
>>   
>> -qemu_co_rwlock_rdlock(&job->flush_rwlock);
>> -
>> -start = QEMU_ALIGN_DOWN(offset, job->cluster_size);
>> -end = QEMU_ALIGN_UP(bytes + offset, job->cluster_size);
>> -
>> -trace_backup_do_cow_enter(job, start, offset, bytes);
>> -
>> -wait_for_overlapping_requests(job, start, end);
>> -cow_request_begin(&cow_request, job, start, end);
>> +assert(QEMU_IS_ALIGNED(start, job->cluster_size));
>> +assert(QEMU_IS_ALIGNED(end, job->cluster_size));
>>   
>>   while (start < end) {
>>   int64_t dirty_end;
> 


-- 
Best regards,
Vladimir


Re: [Qemu-block] [PATCH v9 02/13] block/backup: split shareable copying part from backup_do_cow

2019-08-28 Thread Max Reitz
On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
> Split copying logic which will be shared with backup-top filter.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/backup.c | 47 ---
>  1 file changed, 32 insertions(+), 15 deletions(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 33b144305f..13a1d80157 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -248,26 +248,18 @@ static int64_t 
> backup_bitmap_reset_unallocated(BackupBlockJob *s,
>  return ret;
>  }
>  
> -static int coroutine_fn backup_do_cow(BackupBlockJob *job,
> -  int64_t offset, uint64_t bytes,
> -  bool *error_is_read,
> -  bool is_write_notifier)
> +static int coroutine_fn backup_do_copy(BackupBlockJob *job,
> +   int64_t offset, uint64_t bytes,
> +   bool *error_is_read,
> +   bool is_write_notifier)
>  {
> -CowRequest cow_request;
>  int ret = 0;
> -int64_t start, end; /* bytes */
> +int64_t start = offset, end = bytes + offset; /* bytes */

Maybe just rename the “offset” parameter to “start”, replace the “bytes”
parameter by an “end” parameter, and drop this line?

Max

>  void *bounce_buffer = NULL;
>  int64_t status_bytes;
>  
> -qemu_co_rwlock_rdlock(&job->flush_rwlock);
> -
> -start = QEMU_ALIGN_DOWN(offset, job->cluster_size);
> -end = QEMU_ALIGN_UP(bytes + offset, job->cluster_size);
> -
> -trace_backup_do_cow_enter(job, start, offset, bytes);
> -
> -wait_for_overlapping_requests(job, start, end);
> -cow_request_begin(&cow_request, job, start, end);
> +assert(QEMU_IS_ALIGNED(start, job->cluster_size));
> +assert(QEMU_IS_ALIGNED(end, job->cluster_size));
>  
>  while (start < end) {
>  int64_t dirty_end;



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v9 01/13] block/backup: fix backup_cow_with_offload for last cluster

2019-08-28 Thread Max Reitz
On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
> We shouldn't try to copy bytes beyond EOF. Fix it.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/backup.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 2baf7bed65..33b144305f 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -161,7 +161,7 @@ static int coroutine_fn 
> backup_cow_with_offload(BackupBlockJob *job,
>  
>  assert(QEMU_IS_ALIGNED(job->copy_range_size, job->cluster_size));
>  assert(QEMU_IS_ALIGNED(start, job->cluster_size));
> -nbytes = MIN(job->copy_range_size, end - start);
> +nbytes = MIN(job->copy_range_size, MIN(end - start, job->len - start));

Might be easier to read as MIN(end, job->len) - start, but either way:

Reviewed-by: Max Reitz 

>  nr_clusters = DIV_ROUND_UP(nbytes, job->cluster_size);
>  bdrv_reset_dirty_bitmap(job->copy_bitmap, start,
>  job->cluster_size * nr_clusters);
> 




signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH 1/1] protocol: Add NBD_CMD_FLAG_FAST_ZERO

2019-08-28 Thread Vladimir Sementsov-Ogievskiy
28.08.2019 16:04, Eric Blake wrote:
> On 8/28/19 4:57 AM, Vladimir Sementsov-Ogievskiy wrote:
> 
>>> Hence, it is desirable to have a way for clients to specify that a
>>> particular write zero request is being attempted for a fast wipe, and
>>> get an immediate failure if the zero request would otherwise take the
>>> same time as a write.  Conversely, if the client is not performing a
>>> pre-initialization pass, it is still more efficient in terms of
>>> networking traffic to send NBD_CMD_WRITE_ZERO requests where the
>>> server implements the fallback to the slower write, than it is for the
>>> client to have to perform the fallback to send NBD_CMD_WRITE with a
>>> zeroed buffer.
>>
>> How are you going to finally use it in qemu-img convert?
> 
> It's already in use there (in fact, the cover letter shows actual timing
> examples of how qemu-img's use of BDRV_REQ_NO_FALLBACK, which translates
> to NBD_CMD_FLAG_FAST_ZERO, observably affects timing).
> 
>> Ok, we have a loop
>> of sending write-zero requests. And on first ENOTSUP we'll assume that there
>> is no benefit to continue? But what if actually server returns ENOTSUP only
>> once when we have 1000 iterations? Seems we should still do zeroing if we
>> have only a few ENOTSUPs...
> 
> When attempting a bulk zero, you try to wipe the entire device,
> presumably with something that is large and aligned.  Yes, if you have
> to split the write zero request due to size limitations, then you risk
> that the first write zero succeeds but later ones fail, then you didn't
> wipe the entire disk, but you also don't need to redo the work on the
> first half of the image.  But it is much more likely that the first
> write of the bulk zero is representative of the overall operation (and
> so in practice, it only takes one fast zero attempt to learn if bulk
> zeroing is worthwhile, then continue with fast zeroes without issues).
> 
>>
>> I understand that fail-on-first ENOTSUP is OK for raw-without-fallocte vs 
>> qcow2,
>> as first will always return ENOTSUP and second will never fail.. But in such 
>> way
>> we'll OK with simpler extension, which only have one server-advirtised 
>> negotiation
>> flag NBD_FLAG_ZERO_IS_FAST.
> 
> Advertising that a server's zero behavior is always going to be
> successfully fast is a much harder flag to implement.  The justification
> for the semantics I chose (advertising that the server can quickly
> report failure if success is not fast, but not requiring fast zero)
> covers the case when the decision of whether a zero is fast may also
> depend on other factors - for example, if the server knows the image
> starts in an all-zero state, then it can track a boolean: all write zero
> requests while the boolean is set return immediate success (nothing to
> do), but after the first regular write, the boolean is set to false, and
> all further write zero requests fail as being potentially slow; and such
> an implementation is still useful for the qemu-img convert case.

Agreed, thanks for this example)

> 
>>
>> There is not such problem if we have only one iteration, so may be new 
>> command
>> FILL_ZERO, filling the whole device by zeros?
> 
> Or better yet, implement support for 64-bit commands.  Yes, my cover
> letter called out further orthogonal extensions, and implementing 64-bit
> zeroing (so that you can issue a write zero request over the entire
> image in one command), as well as a way for a server to advertise when
> the image begins life in an all-zero state, are also further extensions
> coming down the pipeline.  But as not all servers have to implement all
> of the extensions, each extension that can be orthogonally implemented
> and show an improvement on its own is still worthwhile; and my cover
> letter has shown that fast zeroes on their own make a measurable
> difference to certain workloads.
> 
>>> +If the server advertised `NBD_FLAG_SEND_FAST_ZERO` but
>>> +`NBD_CMD_FLAG_FAST_ZERO` is not set, then the server MUST NOT fail
>>> +with `NBD_ENOTSUP`, even if the operation is no faster than a
>>> +corresponding `NBD_CMD_WRITE`. Conversely, if
>>> +`NBD_CMD_FLAG_FAST_ZERO` is set, the server MUST fail quickly with
>>> +`NBD_ENOTSUP` unless the request can be serviced in less time than
>>> +a corresponding `NBD_CMD_WRITE`, and SHOULD NOT alter the contents
>>> +of the export when returning this failure. The server's
>>> +determination of a fast request MAY depend on a number of factors,
>>> +such as whether the request was suitably aligned, on whether the
>>> +`NBD_CMD_FLAG_NO_HOLE` flag was present, or even on whether a
>>> +previous `NBD_CMD_TRIM` had been performed on the region.  If the
>>> +server did not advertise `NBD_FLAG_SEND_FAST_ZERO`, then it SHOULD
>>> +NOT fail with `NBD_ENOTSUP`, regardless of the speed of servicing
>>> +a request, and SHOULD fail with `NBD_EINVAL` if the
>>> +`NBD_CMD_FLAG_FAST_ZERO` flag was set. A server MAY advertis

Re: [Qemu-block] [Qemu-devel] [PATCH v4 0/3] qcow2: add zstd cluster compression

2019-08-28 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20190828125654.10544-1-dplotni...@virtuozzo.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20190828125654.10544-1-dplotni...@virtuozzo.com
Type: series
Subject: [Qemu-devel] [PATCH v4 0/3] qcow2: add zstd cluster compression

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
9f7c5d7 qcow2: add zstd cluster compression
69dd572 qcow2: rework the cluster compression routine
15831f2 qcow2: introduce compression type feature

=== OUTPUT BEGIN ===
1/3 Checking commit 15831f298ae2 (qcow2: introduce compression type feature)
2/3 Checking commit 69dd5728957c (qcow2: rework the cluster compression routine)
3/3 Checking commit 9f7c5d7dab6e (qcow2: add zstd cluster compression)
ERROR: space prohibited after that open parenthesis '('
#137: FILE: block/qcow2-threads.c:248:
+s_size = be32_to_cpu( *(const uint32_t *) src);

total: 1 errors, 0 warnings, 268 lines checked

Patch 3/3 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20190828125654.10544-1-dplotni...@virtuozzo.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [Qemu-block] [Qemu-devel] [PATCH 1/1] protocol: Add NBD_CMD_FLAG_FAST_ZERO

2019-08-28 Thread Eric Blake
On 8/28/19 4:57 AM, Vladimir Sementsov-Ogievskiy wrote:

>> Hence, it is desirable to have a way for clients to specify that a
>> particular write zero request is being attempted for a fast wipe, and
>> get an immediate failure if the zero request would otherwise take the
>> same time as a write.  Conversely, if the client is not performing a
>> pre-initialization pass, it is still more efficient in terms of
>> networking traffic to send NBD_CMD_WRITE_ZERO requests where the
>> server implements the fallback to the slower write, than it is for the
>> client to have to perform the fallback to send NBD_CMD_WRITE with a
>> zeroed buffer.
> 
> How are you going to finally use it in qemu-img convert?

It's already in use there (in fact, the cover letter shows actual timing
examples of how qemu-img's use of BDRV_REQ_NO_FALLBACK, which translates
to NBD_CMD_FLAG_FAST_ZERO, observably affects timing).

> Ok, we have a loop
> of sending write-zero requests. And on first ENOTSUP we'll assume that there
> is no benefit to continue? But what if actually server returns ENOTSUP only
> once when we have 1000 iterations? Seems we should still do zeroing if we
> have only a few ENOTSUPs...

When attempting a bulk zero, you try to wipe the entire device,
presumably with something that is large and aligned.  Yes, if you have
to split the write zero request due to size limitations, then you risk
that the first write zero succeeds but later ones fail, then you didn't
wipe the entire disk, but you also don't need to redo the work on the
first half of the image.  But it is much more likely that the first
write of the bulk zero is representative of the overall operation (and
so in practice, it only takes one fast zero attempt to learn if bulk
zeroing is worthwhile, then continue with fast zeroes without issues).

> 
> I understand that fail-on-first ENOTSUP is OK for raw-without-fallocte vs 
> qcow2,
> as first will always return ENOTSUP and second will never fail.. But in such 
> way
> we'll OK with simpler extension, which only have one server-advirtised 
> negotiation
> flag NBD_FLAG_ZERO_IS_FAST.

Advertising that a server's zero behavior is always going to be
successfully fast is a much harder flag to implement.  The justification
for the semantics I chose (advertising that the server can quickly
report failure if success is not fast, but not requiring fast zero)
covers the case when the decision of whether a zero is fast may also
depend on other factors - for example, if the server knows the image
starts in an all-zero state, then it can track a boolean: all write zero
requests while the boolean is set return immediate success (nothing to
do), but after the first regular write, the boolean is set to false, and
all further write zero requests fail as being potentially slow; and such
an implementation is still useful for the qemu-img convert case.

> 
> There is not such problem if we have only one iteration, so may be new command
> FILL_ZERO, filling the whole device by zeros?

Or better yet, implement support for 64-bit commands.  Yes, my cover
letter called out further orthogonal extensions, and implementing 64-bit
zeroing (so that you can issue a write zero request over the entire
image in one command), as well as a way for a server to advertise when
the image begins life in an all-zero state, are also further extensions
coming down the pipeline.  But as not all servers have to implement all
of the extensions, each extension that can be orthogonally implemented
and show an improvement on its own is still worthwhile; and my cover
letter has shown that fast zeroes on their own make a measurable
difference to certain workloads.

>> +If the server advertised `NBD_FLAG_SEND_FAST_ZERO` but
>> +`NBD_CMD_FLAG_FAST_ZERO` is not set, then the server MUST NOT fail
>> +with `NBD_ENOTSUP`, even if the operation is no faster than a
>> +corresponding `NBD_CMD_WRITE`. Conversely, if
>> +`NBD_CMD_FLAG_FAST_ZERO` is set, the server MUST fail quickly with
>> +`NBD_ENOTSUP` unless the request can be serviced in less time than
>> +a corresponding `NBD_CMD_WRITE`, and SHOULD NOT alter the contents
>> +of the export when returning this failure. The server's
>> +determination of a fast request MAY depend on a number of factors,
>> +such as whether the request was suitably aligned, on whether the
>> +`NBD_CMD_FLAG_NO_HOLE` flag was present, or even on whether a
>> +previous `NBD_CMD_TRIM` had been performed on the region.  If the
>> +server did not advertise `NBD_FLAG_SEND_FAST_ZERO`, then it SHOULD
>> +NOT fail with `NBD_ENOTSUP`, regardless of the speed of servicing
>> +a request, and SHOULD fail with `NBD_EINVAL` if the
>> +`NBD_CMD_FLAG_FAST_ZERO` flag was set. A server MAY advertise
>> +`NBD_FLAG_SEND_FAST_ZERO` whether or not it can perform fast
>> +zeroing; similarly, a server SHOULD fail with `NBD_ENOTSUP` when
>> +the flag is set if the server cannot quickly dete

[Qemu-block] [PATCH v4 1/3] qcow2: introduce compression type feature

2019-08-28 Thread Denis Plotnikov
The patch adds some preparation parts for incompatible compression type
feature to QCOW2 header that indicates that *all* compressed clusters
must be (de)compressed using a certain compression type.

It is implied that the compression type is set on the image creation and
can be changed only later by image conversion, thus compression type
defines the only compression algorithm used for the image.

The goal of the feature is to add support of other compression algorithms
to qcow2. For example, ZSTD which is more effective on compression than ZLIB.
It works roughly 2x faster than ZLIB providing a comparable compression ratio
and therefore provides a performance advantage in backup scenarios.

The default compression is ZLIB. Images created with ZLIB compression type
are backward compatible with older qemu versions.

Signed-off-by: Denis Plotnikov 
---
 block/qcow2.c | 91 +++
 block/qcow2.h | 26 ---
 docs/interop/qcow2.txt| 19 +++-
 include/block/block_int.h |  1 +
 qapi/block-core.json  | 22 +-
 5 files changed, 149 insertions(+), 10 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 039bdc2f7e..2884b9d9f2 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1197,6 +1197,32 @@ static int qcow2_update_options(BlockDriverState *bs, 
QDict *options,
 return ret;
 }
 
+static int check_compression_type(BDRVQcow2State *s, Error **errp)
+{
+switch (s->compression_type) {
+case QCOW2_COMPRESSION_TYPE_ZLIB:
+break;
+
+default:
+error_setg(errp, "qcow2: unknown compression type: %u",
+   s->compression_type);
+return -ENOTSUP;
+}
+
+/*
+ * if the compression type differs from QCOW2_COMPRESSION_TYPE_ZLIB
+ * the incompatible feature flag must be set
+ */
+
+if (s->compression_type != QCOW2_COMPRESSION_TYPE_ZLIB &&
+!(s->incompatible_features & QCOW2_INCOMPAT_COMPRESSION_TYPE)) {
+error_setg(errp, "qcow2: Invalid compression type setting");
+return -EINVAL;
+}
+
+return 0;
+}
+
 /* Called with s->lock held.  */
 static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
   int flags, Error **errp)
@@ -1312,6 +1338,35 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
*bs, QDict *options,
 s->compatible_features  = header.compatible_features;
 s->autoclear_features   = header.autoclear_features;
 
+/*
+ * Handle compression type
+ * Older qcow2 images don't contain the compression type header.
+ * Distinguish them by the header length and use
+ * the only valid (default) compression type in that case
+ */
+if (header.header_length > offsetof(QCowHeader, compression_type)) {
+/* sanity check that we can read a compression type */
+size_t min_len = offsetof(QCowHeader, compression_type) +
+ sizeof(header.compression_type);
+if (header.header_length < min_len) {
+error_setg(errp,
+   "Could not read compression type, "
+   "qcow2 header is too short");
+ret = -EINVAL;
+goto fail;
+}
+
+header.compression_type = be32_to_cpu(header.compression_type);
+s->compression_type = header.compression_type;
+} else {
+s->compression_type = QCOW2_COMPRESSION_TYPE_ZLIB;
+}
+
+ret = check_compression_type(s, errp);
+if (ret) {
+goto fail;
+}
+
 if (s->incompatible_features & ~QCOW2_INCOMPAT_MASK) {
 void *feature_table = NULL;
 qcow2_read_extensions(bs, header.header_length, ext_end,
@@ -2516,6 +2571,12 @@ int qcow2_update_header(BlockDriverState *bs)
 total_size = bs->total_sectors * BDRV_SECTOR_SIZE;
 refcount_table_clusters = s->refcount_table_size >> (s->cluster_bits - 3);
 
+ret = check_compression_type(s, NULL);
+
+if (ret) {
+goto fail;
+}
+
 *header = (QCowHeader) {
 /* Version 2 fields */
 .magic  = cpu_to_be32(QCOW_MAGIC),
@@ -2538,6 +2599,7 @@ int qcow2_update_header(BlockDriverState *bs)
 .autoclear_features = cpu_to_be64(s->autoclear_features),
 .refcount_order = cpu_to_be32(s->refcount_order),
 .header_length  = cpu_to_be32(header_length),
+.compression_type   = cpu_to_be32(s->compression_type),
 };
 
 /* For older versions, write a shorter header */
@@ -2635,6 +2697,11 @@ int qcow2_update_header(BlockDriverState *bs)
 .bit  = QCOW2_COMPAT_LAZY_REFCOUNTS_BITNR,
 .name = "lazy refcounts",
 },
+{
+.type = QCOW2_FEAT_TYPE_INCOMPATIBLE,
+.bit  = QCOW2_INCOMPAT_COMPRESSION_TYPE_BITNR,
+.name = "compression type",
+},
 };
 
 ret = header_ext_add(buf, QCOW2_

[Qemu-block] [PATCH v4 2/3] qcow2: rework the cluster compression routine

2019-08-28 Thread Denis Plotnikov
The patch allow to process image compression type defined
in the image header and choose an appropriate method for
image clusters (de)compression.

Signed-off-by: Denis Plotnikov 
---
 block/qcow2-threads.c | 78 +++
 1 file changed, 64 insertions(+), 14 deletions(-)

diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
index 3b1e63fe41..14b5bd76fb 100644
--- a/block/qcow2-threads.c
+++ b/block/qcow2-threads.c
@@ -73,8 +73,11 @@ typedef struct Qcow2CompressData {
 Qcow2CompressFunc func;
 } Qcow2CompressData;
 
+
 /*
- * qcow2_compress()
+ * qcow2_zlib_compress()
+ *
+ * Compress @src_size bytes of data using zlib compression method
  *
  * @dest - destination buffer, @dest_size bytes
  * @src - source buffer, @src_size bytes
@@ -83,8 +86,8 @@ typedef struct Qcow2CompressData {
  *  -ENOMEM destination buffer is not enough to store compressed data
  *  -EIOon any other error
  */
-static ssize_t qcow2_compress(void *dest, size_t dest_size,
-  const void *src, size_t src_size)
+static ssize_t qcow2_zlib_compress(void *dest, size_t dest_size,
+   const void *src, size_t src_size)
 {
 ssize_t ret;
 z_stream strm;
@@ -119,19 +122,19 @@ static ssize_t qcow2_compress(void *dest, size_t 
dest_size,
 }
 
 /*
- * qcow2_decompress()
+ * qcow2_zlib_decompress()
  *
  * Decompress some data (not more than @src_size bytes) to produce exactly
- * @dest_size bytes.
+ * @dest_size bytes using zlib compression method
  *
  * @dest - destination buffer, @dest_size bytes
  * @src - source buffer, @src_size bytes
  *
  * Returns: 0 on success
- *  -1 on fail
+ *  -EIO on fail
  */
-static ssize_t qcow2_decompress(void *dest, size_t dest_size,
-const void *src, size_t src_size)
+static ssize_t qcow2_zlib_decompress(void *dest, size_t dest_size,
+ const void *src, size_t src_size)
 {
 int ret = 0;
 z_stream strm;
@@ -144,7 +147,7 @@ static ssize_t qcow2_decompress(void *dest, size_t 
dest_size,
 
 ret = inflateInit2(&strm, -12);
 if (ret != Z_OK) {
-return -1;
+return -EIO;
 }
 
 ret = inflate(&strm, Z_FINISH);
@@ -154,7 +157,7 @@ static ssize_t qcow2_decompress(void *dest, size_t 
dest_size,
  * @src buffer may be processed partly (because in qcow2 we know size 
of
  * compressed data with precision of one sector)
  */
-ret = -1;
+ret = -EIO;
 }
 
 inflateEnd(&strm);
@@ -189,20 +192,67 @@ qcow2_co_do_compress(BlockDriverState *bs, void *dest, 
size_t dest_size,
 return arg.ret;
 }
 
+/*
+ * qcow2_co_compress()
+ *
+ * Compress @src_size bytes of data using the compression
+ * method defined by the image compression type
+ *
+ * @dest - destination buffer, @dest_size bytes
+ * @src - source buffer, @src_size bytes
+ *
+ * Returns: 0 on success
+ *  a negative error code on fail
+ */
 ssize_t coroutine_fn
 qcow2_co_compress(BlockDriverState *bs, void *dest, size_t dest_size,
   const void *src, size_t src_size)
 {
-return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
-qcow2_compress);
+BDRVQcow2State *s = bs->opaque;
+Qcow2CompressFunc fn;
+
+switch (s->compression_type) {
+case QCOW2_COMPRESSION_TYPE_ZLIB:
+fn = qcow2_zlib_compress;
+break;
+
+default:
+return -ENOTSUP;
+}
+
+return qcow2_co_do_compress(bs, dest, dest_size, src, src_size, fn);
 }
 
+/*
+ * qcow2_co_decompress()
+ *
+ * Decompress some data (not more than @src_size bytes) to produce exactly
+ * @dest_size bytes using the compression method defined by the image
+ * compression type
+ *
+ * @dest - destination buffer, @dest_size bytes
+ * @src - source buffer, @src_size bytes
+ *
+ * Returns: 0 on success
+ *  a negative error code on fail
+ */
 ssize_t coroutine_fn
 qcow2_co_decompress(BlockDriverState *bs, void *dest, size_t dest_size,
 const void *src, size_t src_size)
 {
-return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
-qcow2_decompress);
+BDRVQcow2State *s = bs->opaque;
+Qcow2CompressFunc fn;
+
+switch (s->compression_type) {
+case QCOW2_COMPRESSION_TYPE_ZLIB:
+fn = qcow2_zlib_decompress;
+break;
+
+default:
+return -ENOTSUP;
+}
+
+return qcow2_co_do_compress(bs, dest, dest_size, src, src_size, fn);
 }
 
 
-- 
2.17.0




[Qemu-block] [PATCH v4 0/3] qcow2: add zstd cluster compression

2019-08-28 Thread Denis Plotnikov
v4:
* remove not feasible switch case [Vladimir]
* add sanity checks to zstd decompresssion [Vladimir]
* store zstd compressed length in big endian [Max, Kevin]

v3:
* relax the compression type setting requirement when
  the compression type is not zlib [Eric, Kevin]
* add compression type values to the spec [Eric]
* fix wording in the spec and descriptions [Eric]
* fix functions descriptions [Max]
* fix zstd (de)compression functions flaws [Max]
* fix zstd related parts of configure file [Max]
* rebased to v4.1.0-rc5 and chenged the series version aiming to 4.2

v2:
* relax the compression type setting restriction in the spec
* fix qcow2 header size checking
* fix error processing and messaging
* fix qcow2 image specific info reporting
* set Qcow2CompressionType zstd config dependant
* add zstd compressed cluster format description to the spec

v1:
* extend qcow2 header instead of adding a new incompatible extension header
specification re-written accordingly
* enable zstd compression via config
* fix zstd (de)compression functions
* fix comments/description
* fix function naming

---
The goal of the patch-set is to enable qcow2 to use zstd compression for
clusters. ZSTD provides better (de)compression performance than currently
used ZLIB. Using it will improve perforamnce (reduce compression time)
when the compressed clusters is used, e.g backup scenarios.

Also, the patch-set extends qcow2 specification by adding compression_type
feature. The feature enables adding ZSTD and another compression algorithms
in the future.

Here is some measurements ZSTD vs ZLIB:

The test:
Test compresses and decompresses qemu qcow2 image with just
installed rhel-7.6 guest.
Image cluster size: 64K. Image on disk size: 2.2G

The test was conducted with brd disk to reduce the influence
of disk subsystem to the test results.
The results is given in seconds.

compress cmd:
time ./qemu-img convert -O qcow2 -c -o compression_type=[zlib|zstd]
src.img [zlib|zstd]_compressed.img
decompress cmd
time ./qemu-img convert -O qcow2
[zlib|zstd]_compressed.img uncompressed.img


The results:
compression decompression
zlib zstd zlib zstd

real 65.5 16.3 (-75 %) 1.9 1.6 (-16 %)
user 65.0 15.8 5.3 2.5
sys 3.3 0.2 2.0 2.0

Both ZLIB and ZSTD gave the same compression ratio: ~1.5
compressed image size in both cases: ~1.4G

Denis Plotnikov (3):
  qcow2: introduce compression type feature
  qcow2: rework the cluster compression routine
  qcow2: add zstd cluster compression

 block/qcow2-threads.c | 185 +++---
 block/qcow2.c |  98 
 block/qcow2.h |  26 --
 configure |  34 +++
 docs/interop/qcow2.txt|  39 +++-
 include/block/block_int.h |   1 +
 qapi/block-core.json  |  23 -
 7 files changed, 382 insertions(+), 24 deletions(-)

-- 
2.17.0




[Qemu-block] [PATCH v4 3/3] qcow2: add zstd cluster compression

2019-08-28 Thread Denis Plotnikov
zstd significantly reduces cluster compression time.
It provides better compression performance maintaining
the same level of compression ratio in comparison with
zlib, which, at the moment, has been the only compression
method available.

The performance test results:
Test compresses and decompresses qemu qcow2 image with just
installed rhel-7.6 guest.
Image cluster size: 64K. Image on disk size: 2.2G

The test was conducted with brd disk to reduce the influence
of disk subsystem to the test results.
The results is given in seconds.

compress cmd:
  time ./qemu-img convert -O qcow2 -c -o compression_type=[zlib|zstd]
  src.img [zlib|zstd]_compressed.img
decompress cmd
  time ./qemu-img convert -O qcow2
  [zlib|zstd]_compressed.img uncompressed.img

   compression   decompression
 zlib   zstd   zlib zstd

real 65.5   16.3 (-75 %)1.9  1.6 (-16 %)
user 65.0   15.85.3  2.5
sys   3.30.22.0  2.0

Both ZLIB and ZSTD gave the same compression ratio: 1.57
compressed image size in both cases: 1.4G

Signed-off-by: Denis Plotnikov 
---
 block/qcow2-threads.c  | 107 +
 block/qcow2.c  |   7 +++
 configure  |  34 +
 docs/interop/qcow2.txt |  20 
 qapi/block-core.json   |   3 +-
 5 files changed, 170 insertions(+), 1 deletion(-)

diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
index 14b5bd76fb..f6643976f4 100644
--- a/block/qcow2-threads.c
+++ b/block/qcow2-threads.c
@@ -28,6 +28,11 @@
 #define ZLIB_CONST
 #include 
 
+#ifdef CONFIG_ZSTD
+#include 
+#include 
+#endif
+
 #include "qcow2.h"
 #include "block/thread-pool.h"
 #include "crypto.h"
@@ -165,6 +170,98 @@ static ssize_t qcow2_zlib_decompress(void *dest, size_t 
dest_size,
 return ret;
 }
 
+#ifdef CONFIG_ZSTD
+/*
+ * qcow2_zstd_compress()
+ *
+ * Compress @src_size bytes of data using zstd compression method
+ *
+ * @dest - destination buffer, @dest_size bytes
+ * @src - source buffer, @src_size bytes
+ *
+ * Returns: compressed size on success
+ *  -ENOMEM destination buffer is not enough to store compressed data
+ *  -EIOon any other error
+ */
+
+static ssize_t qcow2_zstd_compress(void *dest, size_t dest_size,
+   const void *src, size_t src_size)
+{
+ssize_t ret;
+uint32_t *c_size = dest;
+/* steal some bytes to store compressed chunk size */
+char *d_buf = ((char *) dest) + sizeof(*c_size);
+
+/* snaity check that we can store the compressed data length */
+if (dest_size < sizeof(*c_size)) {
+return -ENOMEM;
+}
+
+dest_size -= sizeof(*c_size);
+
+ret = ZSTD_compress(d_buf, dest_size, src, src_size, 5);
+
+if (ZSTD_isError(ret)) {
+if (ZSTD_getErrorCode(ret) == ZSTD_error_dstSize_tooSmall) {
+return -ENOMEM;
+} else {
+return -EIO;
+}
+}
+
+/* store the compressed chunk size in the very beginning of the buffer */
+*c_size = cpu_to_be32(ret);
+
+return ret + sizeof(*c_size);
+}
+
+/*
+ * qcow2_zstd_decompress()
+ *
+ * Decompress some data (not more than @src_size bytes) to produce exactly
+ * @dest_size bytes using zstd compression method
+ *
+ * @dest - destination buffer, @dest_size bytes
+ * @src - source buffer, @src_size bytes
+ *
+ * Returns: 0 on success
+ *  -EIO on any error
+ */
+
+static ssize_t qcow2_zstd_decompress(void *dest, size_t dest_size,
+ const void *src, size_t src_size)
+{
+ssize_t ret;
+/*
+ * zstd decompress wants to know the exact length of the data
+ * for that purpose, on the compression the length is stored in
+ * the very beginning of the compressed buffer
+ */
+uint32_t s_size;
+const char *s_buf = ((const char *) src) + sizeof(s_size);
+
+/* sanity check that we can read the content length */
+if (src_size < sizeof(s_size)) {
+return -EIO;
+}
+
+s_size = be32_to_cpu( *(const uint32_t *) src);
+
+/* sanity check that the buffer is big enough to read the content */
+if (src_size - sizeof(s_size) < s_size) {
+return -EIO;
+}
+
+ret = ZSTD_decompress(dest, dest_size, s_buf, s_size);
+
+if (ZSTD_isError(ret)) {
+return -EIO;
+}
+
+return 0;
+}
+#endif
+
 static int qcow2_compress_pool_func(void *opaque)
 {
 Qcow2CompressData *data = opaque;
@@ -216,6 +313,11 @@ qcow2_co_compress(BlockDriverState *bs, void *dest, size_t 
dest_size,
 fn = qcow2_zlib_compress;
 break;
 
+#ifdef CONFIG_ZSTD
+case QCOW2_COMPRESSION_TYPE_ZSTD:
+fn = qcow2_zstd_compress;
+break;
+#endif
 default:
 return -ENOTSUP;
 }
@@ -248,6 +350,11 @@ qcow2_co_decompress(BlockDriverState *bs, void *dest, 
s

Re: [Qemu-block] [PATCH v2 2/2] block: Remove unused masks

2019-08-28 Thread Juan Quintela
Nir Soffer  wrote:
> Replace confusing usage:
>
> ~BDRV_SECTOR_MASK
>
> With more clear:
>
> (BDRV_SECTOR_SIZE - 1)
>
> Remove BDRV_SECTOR_MASK and the unused BDRV_BLOCK_OFFSET_MASK which was
> it's last user.
>
> Signed-off-by: Nir Soffer 

Reviewed-by: Juan Quintela 



Re: [Qemu-block] [Qemu-devel] [PATCH 1/1] protocol: Add NBD_CMD_FLAG_FAST_ZERO

2019-08-28 Thread Vladimir Sementsov-Ogievskiy
23.08.2019 17:34, Eric Blake wrote:
> While it may be counterintuitive at first, the introduction of
> NBD_CMD_WRITE_ZEROES and NBD_CMD_BLOCK_STATUS has caused a performance
> regression in qemu [1], when copying a sparse file. When the
> destination file must contain the same contents as the source, but it
> is not known in advance whether the destination started life with all
> zero content, then there are cases where it is faster to request a
> bulk zero of the entire device followed by writing only the portions
> of the device that are to contain data, as that results in fewer I/O
> transactions overall. In fact, there are even situations where
> trimming the entire device prior to writing zeroes may be faster than
> bare write zero request [2]. However, if a bulk zero request ever
> falls back to the same speed as a normal write, a bulk pre-zeroing
> algorithm is actually a pessimization, as it ends up writing portions
> of the disk twice.
> 
> [1] https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg06389.html
> [2] https://github.com/libguestfs/nbdkit/commit/407f8dde
> 
> Hence, it is desirable to have a way for clients to specify that a
> particular write zero request is being attempted for a fast wipe, and
> get an immediate failure if the zero request would otherwise take the
> same time as a write.  Conversely, if the client is not performing a
> pre-initialization pass, it is still more efficient in terms of
> networking traffic to send NBD_CMD_WRITE_ZERO requests where the
> server implements the fallback to the slower write, than it is for the
> client to have to perform the fallback to send NBD_CMD_WRITE with a
> zeroed buffer.

How are you going to finally use it in qemu-img convert? Ok, we have a loop
of sending write-zero requests. And on first ENOTSUP we'll assume that there
is no benefit to continue? But what if actually server returns ENOTSUP only
once when we have 1000 iterations? Seems we should still do zeroing if we
have only a few ENOTSUPs...

I understand that fail-on-first ENOTSUP is OK for raw-without-fallocte vs qcow2,
as first will always return ENOTSUP and second will never fail.. But in such way
we'll OK with simpler extension, which only have one server-advirtised 
negotiation
flag NBD_FLAG_ZERO_IS_FAST.

There is not such problem if we have only one iteration, so may be new command
FILL_ZERO, filling the whole device by zeros?

> 
> Add a protocol flag and corresponding transmission advertisement flag
> to make it easier for clients to inform the server of their intent. If
> the server advertises NBD_FLAG_SEND_FAST_ZERO, then it promises two
> things: to perform a fallback to write when the client does not
> request NBD_CMD_FLAG_FAST_ZERO (so that the client benefits from the
> lower network overhead); and to fail quickly with ENOTSUP, preferably
> without modifying the export, if the client requested the flag but the
> server cannot write zeroes more efficiently than a normal write (so
> that the client is not penalized with the time of writing data areas
> of the disk twice).
> 
> Note that the semantics are chosen so that servers should advertise
> the new flag whether or not they have fast zeroing (that is, this is
> NOT the server advertising that it has fast zeroes, but rather
> advertising that the client can get fast feedback as needed on whether
> zeroing is fast).  It is also intentional that the new advertisement
> includes a new errno value, ENOTSUP, with rules that this error should
> not be returned for any pre-existing behaviors, must not happen when
> the client does not request a fast zero, and must be returned quickly
> if the client requested fast zero but anything other than the error
> would not be fast; while leaving it possible for clients to
> distinguish other errors like EINVAL if alignment constraints are not
> met.  Clients should not send the flag unless the server advertised
> support, but well-behaved servers should already be reporting EINVAL
> to unrecognized flags. If the server does not advertise the new
> feature, clients can safely fall back to assuming that writing zeroes
> is no faster than normal writes (whether or not the assumption
> actually holds).
> 
> Note that the Linux fallocate(2) interface may or may not be powerful
> enough to easily determine if zeroing will be efficient - in
> particular, FALLOC_FL_ZERO_RANGE in isolation does NOT give that
> insight; likewise, for block devices, it is known that
> ioctl(BLKZEROOUT) does NOT have a way for userspace to probe if it is
> efficient or slow.  But with enough demand, the kernel may add another
> FALLOC_FL_ flag to use with FALLOC_FL_ZERO_RANGE, and/or appropriate
> ioctls with guaranteed ENOTSUP failures if a fast path cannot be
> taken.  If a server cannot easily determine if write zeroes will be
> efficient, the server should either fail all NBD_CMD_FLAG_FAST_ZERO
> with ENOTSUP, or else choose to not advertise NBD_FLAG_SEND_FAST_ZERO.
> 
> Signed-off-by: Eric Blake 
>

Re: [Qemu-block] [PATCH v5 0/4] virtio/block: handle zoned backing devices

2019-08-28 Thread Stefan Hajnoczi
On Fri, Aug 23, 2019 at 03:49:23PM -0400, Dmitry Fomichev wrote:
> Dmitry Fomichev (4):
>   block: Add zoned device model property
>   raw: Recognize zoned backing devices
>   block/ide/scsi: Set BLK_PERM_SUPPORT_ZONED
>   raw: Don't open ZBDs if backend can't handle them

The overall approach looks good.

I wonder if bdrv_get_zoned_info() is really needed since zone_model is
part of BlockLimits and is already fetched via bdrv_refresh_limits().
Was it introduced because the block limits haven't been or cannot be
fetched when zone_model is queried the first time?  It would be nice to
get rid of bdrv_get_zoned_info() if possible.


signature.asc
Description: PGP signature


Re: [Qemu-block] [Qemu-devel] [PATCH v5 4/4] raw: Don't open ZBDs if backend can't handle them

2019-08-28 Thread Stefan Hajnoczi
On Fri, Aug 23, 2019 at 03:49:27PM -0400, Dmitry Fomichev wrote:
> diff --git a/block/file-posix.c b/block/file-posix.c
> index d9f2fc5e46..090e7c4d2f 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -2884,6 +2884,20 @@ static int raw_check_perm(BlockDriverState *bs, 
> uint64_t perm, uint64_t shared,
>  goto fail;
>  }
>  }
> +
> +/*
> + * If we are opening a zoned block device, check if the backend
> + * driver can properly handle such devices, abort if not.
> + */
> +if (bdrv_is_zoned(bs) &&
> +(shared & BLK_PERM_SUPPORT_ZONED) &&
> +!(perm & BLK_PERM_SUPPORT_ZONED)) {
> +error_setg(errp,
> +   "block backend driver doesn't support HM zoned devices");

Spelling out "host-managed" would be helpful in the error message.  Web
search results for "hm zoned" and "hm zoned devices" aren't great,
whereas "host-managed zoned" brings up the T10 page.

> +ret = -ENOTSUP;
> +goto fail;
> +}
> +
>  return 0;
>  
>  fail:
> diff --git a/hw/block/fdc.c b/hw/block/fdc.c
> index c5f41b3eb6..673a8b39bc 100644
> --- a/hw/block/fdc.c
> +++ b/hw/block/fdc.c
> @@ -570,7 +570,7 @@ static void floppy_drive_realize(DeviceState *qdev, Error 
> **errp)
>  dev->conf.werror = BLOCKDEV_ON_ERROR_AUTO;
>  
>  if (!blkconf_apply_backend_options(&dev->conf, read_only, false, false,
> -errp)) {
> +   errp)) {

Please squash this whitespace change into the previous patch where the
false argument was first introduced.


signature.asc
Description: PGP signature


Re: [Qemu-block] [PATCH v5 3/4] block/ide/scsi: Set BLK_PERM_SUPPORT_ZONED

2019-08-28 Thread Stefan Hajnoczi
On Fri, Aug 23, 2019 at 03:49:26PM -0400, Dmitry Fomichev wrote:
> Added a new boolean argument to blkconf_apply_backend_options()
> to let the common block code know whether the chosen block
> backend can handle zoned block devices or not.
> 
> blkconf_apply_backend_options() then sets BLK_PERM_SUPPORT_ZONED
> permission accordingly. The raw code can then use this permission
> to allow or deny opening a zone device by a particular block driver.
> 
> Signed-off-by: Dmitry Fomichev 
> Acked-by: Paul Durrant 
> ---
>  hw/block/block.c |  8 ++--
>  hw/block/fdc.c   |  5 +++--
>  hw/block/nvme.c  |  2 +-
>  hw/block/virtio-blk.c|  2 +-
>  hw/block/xen-block.c |  2 +-
>  hw/ide/qdev.c|  2 +-
>  hw/scsi/scsi-disk.c  | 13 +++--
>  hw/scsi/scsi-generic.c   |  2 +-
>  hw/usb/dev-storage.c |  2 +-
>  include/hw/block/block.h |  3 ++-
>  10 files changed, 24 insertions(+), 17 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


[Qemu-block] [PATCH] qemu-doc: Do not hard-code the name of the QEMU binary

2019-08-28 Thread Thomas Huth
In our documentation, we use a mix of "$QEMU", "qemu-system-i386" and
"qemu-system-x86_64" when we give examples to the users how to run
QEMU. Some more consistency would be good here. Also some distributions
use different names for the QEMU binary (e.g. "qemu-kvm" in RHEL), so
providing more flexibility here would also be good. Thus let's define
some variables for the names of the QEMU command and use those in the
documentation instead: @value{qemu_system} for generic examples, and
@value{qemu_system_x86} for examples that only work with the x86
binaries.

Signed-off-by: Thomas Huth 
---
 docs/qemu-block-drivers.texi |  72 ++--
 docs/qemu-cpu-models.texi|  10 +--
 qemu-doc.texi|  81 +++---
 qemu-options.hx  | 128 +--
 4 files changed, 149 insertions(+), 142 deletions(-)

diff --git a/docs/qemu-block-drivers.texi b/docs/qemu-block-drivers.texi
index c02547e28c..2c7ea49c32 100644
--- a/docs/qemu-block-drivers.texi
+++ b/docs/qemu-block-drivers.texi
@@ -2,6 +2,8 @@
 QEMU block driver reference manual
 @c man end
 
+@set qemu_system qemu-system-x86_64
+
 @c man begin DESCRIPTION
 
 @node disk_images_formats
@@ -405,7 +407,7 @@ QEMU can automatically create a virtual FAT disk image from 
a
 directory tree. In order to use it, just type:
 
 @example
-qemu-system-i386 linux.img -hdb fat:/my_directory
+@value{qemu_system} linux.img -hdb fat:/my_directory
 @end example
 
 Then you access access to all the files in the @file{/my_directory}
@@ -415,14 +417,14 @@ them via SAMBA or NFS. The default access is 
@emph{read-only}.
 Floppies can be emulated with the @code{:floppy:} option:
 
 @example
-qemu-system-i386 linux.img -fda fat:floppy:/my_directory
+@value{qemu_system} linux.img -fda fat:floppy:/my_directory
 @end example
 
 A read/write support is available for testing (beta stage) with the
 @code{:rw:} option:
 
 @example
-qemu-system-i386 linux.img -fda fat:floppy:rw:/my_directory
+@value{qemu_system} linux.img -fda fat:floppy:rw:/my_directory
 @end example
 
 What you should @emph{never} do:
@@ -440,14 +442,14 @@ QEMU can access directly to block device exported using 
the Network Block Device
 protocol.
 
 @example
-qemu-system-i386 linux.img -hdb nbd://my_nbd_server.mydomain.org:1024/
+@value{qemu_system} linux.img -hdb nbd://my_nbd_server.mydomain.org:1024/
 @end example
 
 If the NBD server is located on the same host, you can use an unix socket 
instead
 of an inet socket:
 
 @example
-qemu-system-i386 linux.img -hdb nbd+unix://?socket=/tmp/my_socket
+@value{qemu_system} linux.img -hdb nbd+unix://?socket=/tmp/my_socket
 @end example
 
 In this case, the block device must be exported using qemu-nbd:
@@ -464,23 +466,23 @@ qemu-nbd --socket=/tmp/my_socket --share=2 my_disk.qcow2
 @noindent
 and then you can use it with two guests:
 @example
-qemu-system-i386 linux1.img -hdb nbd+unix://?socket=/tmp/my_socket
-qemu-system-i386 linux2.img -hdb nbd+unix://?socket=/tmp/my_socket
+@value{qemu_system} linux1.img -hdb nbd+unix://?socket=/tmp/my_socket
+@value{qemu_system} linux2.img -hdb nbd+unix://?socket=/tmp/my_socket
 @end example
 
 If the nbd-server uses named exports (supported since NBD 2.9.18, or with 
QEMU's
 own embedded NBD server), you must specify an export name in the URI:
 @example
-qemu-system-i386 -cdrom nbd://localhost/debian-500-ppc-netinst
-qemu-system-i386 -cdrom nbd://localhost/openSUSE-11.1-ppc-netinst
+@value{qemu_system} -cdrom nbd://localhost/debian-500-ppc-netinst
+@value{qemu_system} -cdrom nbd://localhost/openSUSE-11.1-ppc-netinst
 @end example
 
 The URI syntax for NBD is supported since QEMU 1.3.  An alternative syntax is
 also available.  Here are some example of the older syntax:
 @example
-qemu-system-i386 linux.img -hdb nbd:my_nbd_server.mydomain.org:1024
-qemu-system-i386 linux2.img -hdb nbd:unix:/tmp/my_socket
-qemu-system-i386 -cdrom nbd:localhost:10809:exportname=debian-500-ppc-netinst
+@value{qemu_system} linux.img -hdb nbd:my_nbd_server.mydomain.org:1024
+@value{qemu_system} linux2.img -hdb nbd:unix:/tmp/my_socket
+@value{qemu_system} -cdrom 
nbd:localhost:10809:exportname=debian-500-ppc-netinst
 @end example
 
 @node disk_images_sheepdog
@@ -505,7 +507,7 @@ qemu-img convert @var{filename} sheepdog:///@var{image}
 
 You can boot from the Sheepdog disk image with the command:
 @example
-qemu-system-i386 sheepdog:///@var{image}
+@value{qemu_system} sheepdog:///@var{image}
 @end example
 
 You can also create a snapshot of the Sheepdog image like qcow2.
@@ -517,7 +519,7 @@ where @var{tag} is a tag name of the newly created snapshot.
 To boot from the Sheepdog snapshot, specify the tag name of the
 snapshot.
 @example
-qemu-system-i386 sheepdog:///@var{image}#@var{tag}
+@value{qemu_system} sheepdog:///@var{image}#@var{tag}
 @end example
 
 You can create a cloned image from the existing snapshot.
@@ -530,14 +532,14 @@ is its tag name.
 You can use an unix socket instead of an inet so

Re: [Qemu-block] [PATCH 0/2] block/file-posix: Reduce xfsctl() use

2019-08-28 Thread Stefano Garzarella
On Fri, Aug 23, 2019 at 03:03:39PM +0200, Max Reitz wrote:
> Hi,
> 
> As suggested by Paolo, this series drops xfsctl() calls where we have
> working fallocate() alternatives.  (And thus replaces “block/file-posix:
> Fix xfs_write_zeroes()”.)
> 
> Unfortunately, we also use xfsctl() to inquire the request alignment for
> O_DIRECT, and this is the only way we currently have to obtain it
> without trying.  Therefore, I didn’t quite like removing that call, too,
> so this series doesn’t get rid of xfsctl() completely.
> 
> (If we did, we could delete 146 lines instead of these measly 76 here.)
> 
> 
> Anyway, dropping xfs_write_zeroes() will also fix the guest corruptions
> Lukáš has reported (for qcow2, but I think it should be possible to see
> similar corruptions with raw, although I haven’t investigated that too
> far).
> 
> 
> Max Reitz (2):
>   block/file-posix: Reduce xfsctl() use
>   iotests: Test reverse sub-cluster qcow2 writes
> 
>  block/file-posix.c | 77 +-
>  tests/qemu-iotests/265 | 67 +
>  tests/qemu-iotests/265.out |  6 +++
>  tests/qemu-iotests/group   |  1 +
>  4 files changed, 75 insertions(+), 76 deletions(-)
>  create mode 100755 tests/qemu-iotests/265
>  create mode 100644 tests/qemu-iotests/265.out

The patch and the test LGTM.

I tried to run the 265 test without the
"block/file-posix: Reduce xfsctl() use" patch and the failure rate is ~30% on
my system.

With the patch applied the failure rate is 0% :-)

Reviewed-by: Stefano Garzarella 
Tested-by: Stefano Garzarella 

Thanks,
Stefano



Re: [Qemu-block] [PATCH v5 2/4] raw: Recognize zoned backing devices

2019-08-28 Thread Stefan Hajnoczi
On Fri, Aug 23, 2019 at 03:49:25PM -0400, Dmitry Fomichev wrote:
> +static int hdev_get_zoned_model(int fd)

Please use the enum:

  static BdrvZonedModel hdev_get_zoned_model(int fd)

> +{
> +#ifdef CONFIG_LINUX
> +char buf[32];
> +int ret;
> +
> +ret = hdev_read_blk_queue_entry(fd, "zoned", buf, sizeof(buf));
> +if (ret < 0) {
> +ret = BLK_ZONED_MODEL_NONE;
> +goto out;
>  }
> -g_free(sysfspath);
> +
> +buf[ret - 1] = 0;

The ret - 1 looks suspicious since sg_get_max_segments() does buf[ret] =
0 instead.  A comment would make it clear why ret - 1 is used here:

  buf[ret - 1] = 0; /* strip newline and NUL-terminate */


signature.asc
Description: PGP signature


Re: [Qemu-block] [PATCH v5 1/4] block: Add zoned device model property

2019-08-28 Thread Stefan Hajnoczi
On Fri, Aug 23, 2019 at 03:49:24PM -0400, Dmitry Fomichev wrote:
> +uint8_t bdrv_is_zoned(BlockDriverState *bs)
> +{
> +/*
> + * Host Aware zone devices are supposed to be able to work
> + * just like regular block devices. Thus, we only consider
> + * Host Managed devices to be zoned here.
> + */
> +return bdrv_get_zoned_model(bs) == BLK_ZONED_MODEL_HM;

This function doesn't say whether bs is a zoned device in general, it
says whether it's Host Managed.  Please rename to
bdrv_is_host_managed_zoned(), bdrv_is_zoned_hm(), or similar.

> +}
> +
>  bool bdrv_is_sg(BlockDriverState *bs)
>  {
>  return bs->sg;
> diff --git a/include/block/block.h b/include/block/block.h
> index 124ad40809..238c0f5ed7 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -271,18 +271,35 @@ enum {
>   */
>  BLK_PERM_GRAPH_MOD  = 0x10,
>  
> +/** This permission is required to open zoned block devices. */
> +BLK_PERM_SUPPORT_ZONED  = 0x20,

Maybe BLK_PERM_SUPPORT_ZONED_HM is clearer?  Let's keep the distinction
between general zoned devices and host-managed zoned devices clear,
otherwise the code will get confusing if we ever want to treat
host-aware specially.

> +
>  BLK_PERM_ALL= 0x1f,
>  
>  DEFAULT_PERM_PASSTHROUGH= BLK_PERM_CONSISTENT_READ
>   | BLK_PERM_WRITE
>   | BLK_PERM_WRITE_UNCHANGED
> - | BLK_PERM_RESIZE,
> + | BLK_PERM_RESIZE
> + | BLK_PERM_SUPPORT_ZONED,
>  
>  DEFAULT_PERM_UNCHANGED  = BLK_PERM_ALL & ~DEFAULT_PERM_PASSTHROUGH,
>  };
>  
>  char *bdrv_perm_names(uint64_t perm);
>  
> +/*
> + * Known zoned device models.
> + *
> + * TODO For a Linux host, it could be preferrable to include
> + * /usr/include/linux/blkzoned.h instead of defining ZBD-specific
> + * values here.

It depends.  If it's necessary to convert back to the Linux enum often
then I agree.  Otherwise the code is cleaner a QEMU enum is defined here
and we can avoid #ifdef __linux__.

> + */
> +enum blk_zoned_model {
> +BLK_ZONED_MODEL_NONE, /* Regular block device */
> +BLK_ZONED_MODEL_HA,   /* Host-aware zoned block device */
> +BLK_ZONED_MODEL_HM,   /* Host-managed zoned block device */
> +};

Please use the same typedef enum approach as the rest of block.h:

  typedef enum {
  BDRV_ZONED_MODEL_NONE, /* Regular block device */
  BDRV_ZONED_MODEL_HA,   /* Host-aware zoned block device */
  BDRV_ZONED_MODEL_HM,   /* Host-managed zoned block device */
  } BdrvZonedModel;

> +
>  /* disk I/O throttling */
>  void bdrv_init(void);
>  void bdrv_init_with_whitelist(void);
> @@ -359,6 +376,8 @@ int64_t bdrv_get_allocated_file_size(BlockDriverState 
> *bs);
>  BlockMeasureInfo *bdrv_measure(BlockDriver *drv, QemuOpts *opts,
> BlockDriverState *in_bs, Error **errp);
>  void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);
> +uint8_t bdrv_get_zoned_model(BlockDriverState *bs);

Please use the enum:

  BdrvZonedModel bdrv_get_zoned_model(BlockDriverState *bs)

> +uint8_t bdrv_is_zoned(BlockDriverState *bs);

Please use bool instead of uint8_t.

>  void bdrv_refresh_limits(BlockDriverState *bs, Error **errp);
>  int bdrv_commit(BlockDriverState *bs);
>  int bdrv_change_backing_file(BlockDriverState *bs,
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index ceec8c2f56..91496e8149 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -415,6 +415,7 @@ struct BlockDriver {
>  bool (*bdrv_debug_is_suspended)(BlockDriverState *bs, const char *tag);
>  
>  void (*bdrv_refresh_limits)(BlockDriverState *bs, Error **errp);
> +void (*bdrv_get_zoned_info)(BlockDriverState *bs);

"get" is strange since this function returns void.  What is it supposed
to do, refresh just the BlockLimits::zoned_model field?  Should this be
called bdrv_refresh_zoned_model() instead?

>  
>  /*
>   * Returns 1 if newly created images are guaranteed to contain only
> @@ -620,6 +621,9 @@ typedef struct BlockLimits {
>  
>  /* maximum number of iovec elements */
>  int max_iov;
> +
> +/* Zoned device model. Zero value indicates a regular block device */
> +uint8_t zoned_model;

Please use the enum.


signature.asc
Description: PGP signature


Re: [Qemu-block] [PATCH 2/2] qapi: deprecate implicit filters

2019-08-28 Thread Vladimir Sementsov-Ogievskiy
27.08.2019 23:12, John Snow wrote:
> 
> 
> On 8/23/19 5:22 AM, Vladimir Sementsov-Ogievskiy wrote:
>> 14.08.2019 13:07, Vladimir Sementsov-Ogievskiy wrote:
>>> To get rid of implicit filters related workarounds in future let's
>>> deprecate them now.
>>
>> Interesting, could we deprecate implicit filter without deprecation of 
>> unnecessity of
>> parameter? As actually, it's good when this parameter is not necessary, in 
>> most cases
>> user is not interested in node-name.
>>
> 
> https://en.wiktionary.org/wiki/unnecessity -- I am surprised to learn
> that this a real word in the language I speak. :)
> 
> I assume you're referring to making the optional argument mandatory.

exactly, it's my a bit "google-translate-driven" English)

> 
>> Obviously we can do the following:
>>
>> 1. In 4.2 we deprecate unnecessity, which implies deprecation of implicit 
>> filters
>> 2. After some releases in 4.x we can drop deprecated functionality, so we 
>> drop it together with
>> implicit filters. And, in same release 4.x we return it back (as it's 
>> compatible change :)
>> but without implicit filters (so, if filter-node-name not specified, we just 
>> create
>> explicit filter with autogenerated node-name)
>>
>> So, effectively we just drop "deprecation mark" together with implicit 
>> filters, which is nice
>> but actually confusing.
>>
>> Instead, we may do
>> 1. In 4.2 deprecate
>> 2. In 4.x drop optionality together with implicit filters
>> 3. In 4.y (y > x of course) return optionality back
>>
> 
> Ah, I see what you're digging at here now...
> 
>> It's a bit safer, but for users who miss releases [4.x, 4.y) it's no 
>> difference..
>>
>> Or we just write in spec, that implicit filters are deprecated? But we have 
>> nothing about implicit
>> filters in spec. More over, we directly write that we have filter, and if 
>> parameter is omitted
>> it's node-name is autogenerated. So actually, the fact the filter is hidden 
>> when filter-node-name is
>> unspecified is _undocumented_.
>>
>> So, finally, it looks like nothing to deprecated in specification, we can 
>> just drop implicit filters :)
>>
>> What do you think?
>>
> 
> What exactly _IS_ an implicit filter? How does it differ today from an
> explicit filter? I assumed the only difference was if it was named or
> not; but I think I must be mistaken now if you're proposing leaving the
> interface alone entirely.
> 
> Are they instantiated differently?
> 

As I understand, the only difference is their BlockDriverState.impicit field, 
and several places in code
where we skip implicit filter when trying to find something in a chain starting 
from a device.

Hmm, OK, let's see:

1. the only implicit filters are commit_top and mirror_top if user don't 
specify filter-node-name.

Where it make sense, i.e., where implicit field used?

2. bdrv_query_info, bdrv_query_bds_stats, bdrv_block_device_info(only when 
called from bdrv_query_info), they'll
report filter as top node if we don't mark it implicit.

3. bdrv_refresh_filename, bdrv_reopen_parse_backing, bdrv_drop_intermediate:
   I think it's not a problem, just drop special case for implicit fitlers

So, seems the only real change is query-block and query-blockstats output when 
mirror or commit is started
without specifying filter-node-name (filter would be on top)

So, how should we deprecate this, or can we just change it?

-- 
Best regards,
Vladimir


Re: [Qemu-block] [Qemu-devel] [PATCH 2/2] block/nvme: add support for discard

2019-08-28 Thread Maxim Levitsky
On Tue, 2019-08-27 at 18:29 -0400, John Snow wrote:
> 
> On 8/25/19 3:15 AM, Maxim Levitsky wrote:
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  block/nvme.c   | 83 ++
> >  block/trace-events |  2 ++
> >  2 files changed, 85 insertions(+)
> > 
> > diff --git a/block/nvme.c b/block/nvme.c
> > index f8bd11e19a..dd041f39c9 100644
> > --- a/block/nvme.c
> > +++ b/block/nvme.c
> > @@ -112,6 +112,7 @@ typedef struct {
> >  bool plugged;
> >  
> >  bool supports_write_zeros;
> > +bool supports_discard;
> >  
> >  CoMutex dma_map_lock;
> >  CoQueue dma_flush_queue;
> > @@ -463,6 +464,7 @@ static void nvme_identify(BlockDriverState *bs, int 
> > namespace, Error **errp)
> >  
> >  oncs = le16_to_cpu(idctrl->oncs);
> >  s->supports_write_zeros = (oncs & NVME_ONCS_WRITE_ZEROS) != 0;
> > +s->supports_discard = (oncs & NVME_ONCS_DSM) != 0;
> 
> Same comment -- checking !!(register & FIELD) is nicer than the
> negative. (I'm actually not sure even the !! is needed, but it seems to
> be a QEMU-ism and I've caught myself using it...)

All right, no problem to use !!

> 
> Rest looks good to me on a skim, but I'm not very well-versed in NVME.
Thanks!


> 
> >  
> >  memset(resp, 0, 4096);
> >  
> > @@ -1153,6 +1155,86 @@ static coroutine_fn int 
> > nvme_co_pwrite_zeroes(BlockDriverState *bs,
> >  }
> >  
> >  
> > +static int coroutine_fn nvme_co_pdiscard(BlockDriverState *bs,
> > + int64_t offset,
> > + int bytes)
> > +{
> > +BDRVNVMeState *s = bs->opaque;
> > +NVMeQueuePair *ioq = s->queues[1];
> > +NVMeRequest *req;
> > +NvmeDsmRange *buf;
> > +QEMUIOVector local_qiov;
> > +int ret;
> > +
> > +NvmeCmd cmd = {
> > +.opcode = NVME_CMD_DSM,
> > +.nsid = cpu_to_le32(s->nsid),
> > +.cdw10 = cpu_to_le32(0), /*number of ranges - 0 based*/
> > +.cdw11 = cpu_to_le32(1 << 2), /*deallocate bit*/
> > +};
> > +
> > +NVMeCoData data = {
> > +.ctx = bdrv_get_aio_context(bs),
> > +.ret = -EINPROGRESS,
> > +};
> > +
> > +if (!s->supports_discard) {
> > +return -ENOTSUP;
> > +}
> > +
> > +assert(s->nr_queues > 1);
> > +
> > +buf = qemu_try_blockalign0(bs, s->page_size);
> > +if (!buf) {
> > +return -ENOMEM;
> > +}
> > +
> > +buf->nlb = cpu_to_le32(bytes >> s->blkshift);
> > +buf->slba = cpu_to_le64(offset >> s->blkshift);
> > +buf->cattr = 0;
> > +
> > +qemu_iovec_init(&local_qiov, 1);
> > +qemu_iovec_add(&local_qiov, buf, 4096);
> > +
> > +req = nvme_get_free_req(ioq);
> > +assert(req);
> > +
> > +qemu_co_mutex_lock(&s->dma_map_lock);
> > +ret = nvme_cmd_map_qiov(bs, &cmd, req, &local_qiov);
> > +qemu_co_mutex_unlock(&s->dma_map_lock);
> > +
> > +if (ret) {
> > +req->busy = false;
> > +goto out;
> > +}
> > +
> > +trace_nvme_dsm(s, offset, bytes);
> > +
> > +nvme_submit_command(s, ioq, req, &cmd, nvme_rw_cb, &data);
> > +
> > +data.co = qemu_coroutine_self();
> > +while (data.ret == -EINPROGRESS) {
> > +qemu_coroutine_yield();
> > +}
> > +
> > +qemu_co_mutex_lock(&s->dma_map_lock);
> > +ret = nvme_cmd_unmap_qiov(bs, &local_qiov);
> > +qemu_co_mutex_unlock(&s->dma_map_lock);
> > +
> > +if (ret) {
> > +goto out;
> > +}
> > +
> > +ret = data.ret;
> > +trace_nvme_dsm_done(s, offset, bytes, ret);
> > +out:
> > +qemu_iovec_destroy(&local_qiov);
> > +qemu_vfree(buf);
> > +return ret;
> > +
> > +}
> > +
> > +
> >  static int nvme_reopen_prepare(BDRVReopenState *reopen_state,
> > BlockReopenQueue *queue, Error **errp)
> >  {
> > @@ -1259,6 +1341,7 @@ static BlockDriver bdrv_nvme = {
> >  .bdrv_co_pwritev  = nvme_co_pwritev,
> >  
> >  .bdrv_co_pwrite_zeroes= nvme_co_pwrite_zeroes,
> > +.bdrv_co_pdiscard = nvme_co_pdiscard,
> >  
> >  .bdrv_co_flush_to_disk= nvme_co_flush,
> >  .bdrv_reopen_prepare  = nvme_reopen_prepare,
> > diff --git a/block/trace-events b/block/trace-events
> > index 8209fbd0c7..7d1d48b502 100644
> > --- a/block/trace-events
> > +++ b/block/trace-events
> > @@ -153,6 +153,8 @@ nvme_write_zeros(void *s, uint64_t offset, uint64_t 
> > bytes, int flags) "s %p offs
> >  nvme_qiov_unaligned(const void *qiov, int n, void *base, size_t size, int 
> > align) "qiov %p n %d base %p size 0x%zx align 0x%x"
> >  nvme_prw_buffered(void *s, uint64_t offset, uint64_t bytes, int niov, int 
> > is_write) "s %p offset %"PRId64" bytes %"PRId64" niov %d is_write %d"
> >  nvme_rw_done(void *s, int is_write, uint64_t offset, uint64_t bytes, int 
> > ret) "s %p is_write %d offset %"PRId64" bytes %"PRId64" ret %d"
> > +nvme_dsm(void *s, uint64_t offset, uint64_t bytes) "s %p offset %"PRId64" 
> > bytes %"PRId64""
> > +nvme_dsm_done(void *s, uint6

Re: [Qemu-block] [Qemu-devel] [PATCH 1/2] block/nvme: add support for write zeros

2019-08-28 Thread Maxim Levitsky
On Tue, 2019-08-27 at 18:22 -0400, John Snow wrote:
> Without a commit message, I have no real hope of reviewing this. I was
> CC'd, though, so I'll give it a blind shot.
> 
> We want to add write_zeroes support for block/nvme, but I can't really
> verify any of that is correct or working without a unit test, a spec, or
> some instructions to help me verify any of this does what it looks like
> it does.

Its a bit problematic to have unit tests for this specific driver,
because it is the first and the only driver which is a driver in the
real sense of the word, that is it works directly with host hardware,
replacing the in-kernel driver.

I guess some unit tests could be done using a nested VM or so.


Yea, I was initially confused as well as what write zeros does, but it
does _exactly_ what it says, just writes zeros to the block you ask
for. So basically just like a write command, but without data.

Plus optionally if the drive  declares that after discard,
the blocks read as zeros, you can enable 'DEALOC' bit in write zeros

command to specify that in addition to filling the block with zeros
it would be nice to discard its flash data as well (this is still a hint
though).


> 
> On 8/25/19 3:15 AM, Maxim Levitsky wrote:
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  block/nvme.c | 72 +++-
> >  block/trace-events   |  1 +
> >  include/block/nvme.h | 19 +++-
> >  3 files changed, 90 insertions(+), 2 deletions(-)
> > 
> > diff --git a/block/nvme.c b/block/nvme.c
> > index 5be3a39b63..f8bd11e19a 100644
> > --- a/block/nvme.c
> > +++ b/block/nvme.c
> > @@ -111,6 +111,8 @@ typedef struct {
> >  uint64_t max_transfer;
> >  bool plugged;
> >  
> > +bool supports_write_zeros;
> > +
> 
> I suppose the spelling of "zeroes" is not as established as I thought it
> was. Actually, what's worse is that the NVME writers apparently couldn't
> decide what to name it themselves either:
> 
> "Write Zeroes" has 23 hits.
> "Write Zeros" has just two, in Figure 114 for the Identify NS Data.
> 
> Oh, in QEMU we're not much better:
> 
> jhuston@probe (review) ~/s/qemu> git grep -i zeros | wc -l
> 265
> jhuston@probe (review) ~/s/qemu> git grep -i zeroes | wc -l
> 747
> 
> I'm going to suggest that we use 'zeroes' as the spelling here, to match
> the existing 'pwrite_zeroes', and then otherwise to match the NVME
> spec's usual spelling.
No problem I didn't notice the two spelling to be honest.


> 
> >  CoMutex dma_map_lock;
> >  CoQueue dma_flush_queue;
> >  
> > @@ -421,6 +423,7 @@ static void nvme_identify(BlockDriverState *bs, int 
> > namespace, Error **errp)
> >  NvmeIdNs *idns;
> >  NvmeLBAF *lbaf;
> >  uint8_t *resp;
> > +uint16_t oncs;
> >  int r;
> >  uint64_t iova;
> >  NvmeCmd cmd = {
> > @@ -458,6 +461,9 @@ static void nvme_identify(BlockDriverState *bs, int 
> > namespace, Error **errp)
> >  s->max_transfer = MIN_NON_ZERO(s->max_transfer,
> >s->page_size / sizeof(uint64_t) * s->page_size);
> >  
> > +oncs = le16_to_cpu(idctrl->oncs);
> > +s->supports_write_zeros = (oncs & NVME_ONCS_WRITE_ZEROS) != 0;
> > +
> 
> For other reviewers: oncs is "Optional NVM Command Support".
> 
> I think it's better to say `!!(oncs & NVME_ONCS_WRITE_ZEROES)` to remove
> doubt over the width of the bitmask.
No problem.


> 
> >  memset(resp, 0, 4096);
> >  
> >  cmd.cdw10 = 0;
> > @@ -470,6 +476,12 @@ static void nvme_identify(BlockDriverState *bs, int 
> > namespace, Error **errp)
> >  s->nsze = le64_to_cpu(idns->nsze);
> >  lbaf = &idns->lbaf[NVME_ID_NS_FLBAS_INDEX(idns->flbas)];
> >  
> > +if (NVME_ID_NS_DLFEAT_WRITE_ZEROS(idns->dlfeat) &&
> > +NVME_ID_NS_DLFEAT_READ_BEHAVIOR(idns->dlfeat) ==
> > +NVME_ID_NS_DLFEAT_READ_BEHAVIOR_ZEROS) {
> > +bs->supported_write_flags |= BDRV_REQ_MAY_UNMAP;
> > +}
> > +
> >  if (lbaf->ms) {
> >  error_setg(errp, "Namespaces with metadata are not yet supported");
> >  goto out;
> > @@ -764,6 +776,8 @@ static int nvme_file_open(BlockDriverState *bs, QDict 
> > *options, int flags,
> >  int ret;
> >  BDRVNVMeState *s = bs->opaque;
> >  
> > +bs->supported_write_flags = BDRV_REQ_FUA;
> > +
> 
> Is this a related change?
Yes. this tells qemu that you can 'force unit access' on write zeros
command, which is a flag (non optional thankfully) that tells
the drive to actually write to the flash, and not just cache the result
in some of its caches.

> 
> >  opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
> >  qemu_opts_absorb_qdict(opts, options, &error_abort);
> >  device = qemu_opt_get(opts, NVME_BLOCK_OPT_DEVICE);
> > @@ -792,7 +806,6 @@ static int nvme_file_open(BlockDriverState *bs, QDict 
> > *options, int flags,
> >  goto fail;
> >  }
> >  }
> > -bs->supported_write_flags = BDRV_REQ_FUA;
> >  return 0;
> >  fai

Re: [Qemu-block] [Qemu-devel] [PATCH 1/2] vhost-user-blk: prevent using uninitialized vqs

2019-08-28 Thread Stefan Hajnoczi
On Thu, Aug 22, 2019 at 11:34:24AM -0700, Raphael Norwitz wrote:
> Same rational as: e6cc11d64fc998c11a4dfcde8fda3fc33a74d844
> 
> Of the 3 virtqueues, seabios only sets cmd, leaving ctrl
> and event without a physical address. This can cause
> vhost_verify_ring_part_mapping to return ENOMEM, causing
> the following logs:
> 
> qemu-system-x86_64: Unable to map available ring for ring 0
> qemu-system-x86_64: Verify ring failure on region 0
> 
> This has already been fixed for vhost scsi devices and was
> recently vhost-user scsi devices. This commit fixes it for
> vhost-user-blk devices.
> 
> Suggested-by: Phillippe Mathieu-Daude 
> Signed-off-by: Raphael Norwitz 
> ---
>  hw/block/vhost-user-blk.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature