Re: [Qemu-block] [PATCH v2 2/2] iotests: Test allocate_first_block() with O_DIRECT

2019-08-25 Thread Nir Soffer
On Mon, Aug 26, 2019 at 1:03 AM Nir Soffer  wrote:

> Using block_resize we can test allocate_first_block() with file
> descriptor opened with O_DIRECT, ensuring that it works for any size
> larger than 4096 bytes.
>
> Testing smaller sizes is tricky as the result depends on the filesystem
> used for testing. For example on NFS any size will work since O_DIRECT
> does not require any alignment.
>

Forgot to add:

Signed-off-by: Nir Soffer 

---
>  tests/qemu-iotests/175 | 25 +
>  tests/qemu-iotests/175.out |  8 
>  2 files changed, 33 insertions(+)
>
> diff --git a/tests/qemu-iotests/175 b/tests/qemu-iotests/175
> index d54cb43c39..60cc251eb2 100755
> --- a/tests/qemu-iotests/175
> +++ b/tests/qemu-iotests/175
> @@ -49,6 +49,23 @@ _filter_blocks()
>  -e "s/blocks=$((extra_blocks + img_size /
> 512))\\(\$\\|[^0-9]\\)/max allocation/"
>  }
>
> +# Resize image using block_resize.
> +# Parameter 1: image path
> +# Parameter 2: new size
> +_block_resize()
> +{
> +local path=$1
> +local size=$2
> +
> +$QEMU -qmp stdio -nographic -nodefaults \
> +-blockdev file,node-name=file,filename=$path,cache.direct=on \
> +< +{'execute': 'qmp_capabilities'}
> +{'execute': 'block_resize', 'arguments': {'node-name': 'file', 'size':
> $size}}
> +{'execute': 'quit'}
> +EOF
> +}
> +
>  # get standard environment, filters and checks
>  . ./common.rc
>  . ./common.filter
> @@ -79,6 +96,14 @@ for mode in off full falloc; do
>  stat -c "size=%s, blocks=%b" $TEST_IMG | _filter_blocks $extra_blocks
> $min_blocks $size
>  done
>
> +for new_size in 4096 1048576; do
> +echo
> +echo "== resize empty image with block_resize =="
> +_make_test_img 0 | _filter_imgfmt
> +_block_resize $TEST_IMG $new_size >/dev/null
> +stat -c "size=%s, blocks=%b" $TEST_IMG | _filter_blocks $extra_blocks
> $min_blocks $new_size
> +done
> +
>  # success, all done
>  echo "*** done"
>  rm -f $seq.full
> diff --git a/tests/qemu-iotests/175.out b/tests/qemu-iotests/175.out
> index 263e521262..39c2ee0f62 100644
> --- a/tests/qemu-iotests/175.out
> +++ b/tests/qemu-iotests/175.out
> @@ -15,4 +15,12 @@ size=1048576, max allocation
>  == creating image with preallocation falloc ==
>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
> preallocation=falloc
>  size=1048576, max allocation
> +
> +== resize empty image with block_resize ==
> +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=0
> +size=4096, min allocation
> +
> +== resize empty image with block_resize ==
> +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=0
> +size=1048576, min allocation
>   *** done
> --
> 2.20.1
>
>


Re: [Qemu-block] [Qemu-devel] [PATCH v2 0/2] Optimize alignment probing

2019-08-25 Thread no-reply
Patchew URL: https://patchew.org/QEMU/20190825220329.7942-1-nsof...@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Subject: [Qemu-devel] [PATCH v2 0/2] Optimize alignment probing
Message-id: 20190825220329.7942-1-nsof...@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update]  
patchew/20190819213755.26175-1-richard.hender...@linaro.org -> 
patchew/20190819213755.26175-1-richard.hender...@linaro.org
 * [new tag] patchew/20190825220329.7942-1-nsof...@redhat.com -> 
patchew/20190825220329.7942-1-nsof...@redhat.com
Submodule 'capstone' (https://git.qemu.org/git/capstone.git) registered for 
path 'capstone'
Submodule 'dtc' (https://git.qemu.org/git/dtc.git) registered for path 'dtc'
Submodule 'roms/QemuMacDrivers' (https://git.qemu.org/git/QemuMacDrivers.git) 
registered for path 'roms/QemuMacDrivers'
Submodule 'roms/SLOF' (https://git.qemu.org/git/SLOF.git) registered for path 
'roms/SLOF'
Submodule 'roms/edk2' (https://git.qemu.org/git/edk2.git) registered for path 
'roms/edk2'
Submodule 'roms/ipxe' (https://git.qemu.org/git/ipxe.git) registered for path 
'roms/ipxe'
Submodule 'roms/openbios' (https://git.qemu.org/git/openbios.git) registered 
for path 'roms/openbios'
Submodule 'roms/openhackware' (https://git.qemu.org/git/openhackware.git) 
registered for path 'roms/openhackware'
Submodule 'roms/opensbi' (https://git.qemu.org/git/opensbi.git) registered for 
path 'roms/opensbi'
Submodule 'roms/qemu-palcode' (https://git.qemu.org/git/qemu-palcode.git) 
registered for path 'roms/qemu-palcode'
Submodule 'roms/seabios' (https://git.qemu.org/git/seabios.git/) registered for 
path 'roms/seabios'
Submodule 'roms/seabios-hppa' (https://git.qemu.org/git/seabios-hppa.git) 
registered for path 'roms/seabios-hppa'
Submodule 'roms/sgabios' (https://git.qemu.org/git/sgabios.git) registered for 
path 'roms/sgabios'
Submodule 'roms/skiboot' (https://git.qemu.org/git/skiboot.git) registered for 
path 'roms/skiboot'
Submodule 'roms/u-boot' (https://git.qemu.org/git/u-boot.git) registered for 
path 'roms/u-boot'
Submodule 'roms/u-boot-sam460ex' (https://git.qemu.org/git/u-boot-sam460ex.git) 
registered for path 'roms/u-boot-sam460ex'
Submodule 'slirp' (https://git.qemu.org/git/libslirp.git) registered for path 
'slirp'
Submodule 'tests/fp/berkeley-softfloat-3' 
(https://git.qemu.org/git/berkeley-softfloat-3.git) registered for path 
'tests/fp/berkeley-softfloat-3'
Submodule 'tests/fp/berkeley-testfloat-3' 
(https://git.qemu.org/git/berkeley-testfloat-3.git) registered for path 
'tests/fp/berkeley-testfloat-3'
Submodule 'ui/keycodemapdb' (https://git.qemu.org/git/keycodemapdb.git) 
registered for path 'ui/keycodemapdb'
Cloning into 'capstone'...
Submodule path 'capstone': checked out 
'22ead3e0bfdb87516656453336160e0a37b066bf'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '88f18909db731a627456f26d779445f84e449536'
Cloning into 'roms/QemuMacDrivers'...
Submodule path 'roms/QemuMacDrivers': checked out 
'90c488d5f4a407342247b9ea869df1c2d9c8e266'
Cloning into 'roms/SLOF'...
Submodule path 'roms/SLOF': checked out 
'7bfe584e321946771692711ff83ad2b5850daca7'
Cloning into 'roms/edk2'...
Submodule path 'roms/edk2': checked out 
'20d2e5a125e34fc8501026613a71549b2a1a3e54'
Submodule 'SoftFloat' (https://github.com/ucb-bar/berkeley-softfloat-3.git) 
registered for path 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
Submodule 'CryptoPkg/Library/OpensslLib/openssl' 
(https://github.com/openssl/openssl) registered for path 
'CryptoPkg/Library/OpensslLib/openssl'
Cloning into 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'...
Submodule path 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3': 
checked out 'b64af41c3276f97f0e181920400ee056b9c88037'
Cloning into 'CryptoPkg/Library/OpensslLib/openssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl': checked out 
'50eaac9f3337667259de725451f201e784599687'
Submodule 'boringssl' (https://boringssl.googlesource.com/boringssl) registered 
for path 'boringssl'
Submodule 'krb5' (https://github.com/krb5/krb5) registered for path 'krb5'
Submodule 'pyca.cryptography' (https://github.com/pyca/cryptography.git) 
registered for path 'pyca-cryptography'
Cloning into 'boringssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/boringssl': 
checked out '2070f8ad9151dc8f3a73bffaa146b5e6937a583f'
Cloning into 'krb5'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/krb5': checked 
out 'b9ad6c49505c96a088326b62a52568e3484f2168'
Cloning into 'pyca-cryptography'...
Submodule path 
'roms/edk2/CryptoPkg/Library/Open

Re: [Qemu-block] [PATCH] block: posix: Always allocate the first block

2019-08-25 Thread Maxim Levitsky
On Sun, 2019-08-25 at 22:51 +0300, Nir Soffer wrote:
> On Sun, Aug 25, 2019 at 10:44 AM Maxim Levitsky  wrote:
> > On Sat, 2019-08-17 at 00:21 +0300, Nir Soffer wrote:
> > > When creating an image with preallocation "off" or "falloc", the first
> > > block of the image is typically not allocated. When using Gluster
> > > storage backed by XFS filesystem, reading this block using direct I/O
> > > succeeds regardless of request length, fooling alignment detection.
> > > 
> > > In this case we fallback to a safe value (4096) instead of the optimal
> > > value (512), which may lead to unneeded data copying when aligning
> > > requests.  Allocating the first block avoids the fallback.
> > > 
> > > When using preallocation=off, we always allocate at least one filesystem
> > > block:
> > > 
> > > $ ./qemu-img create -f raw test.raw 1g
> > > Formatting 'test.raw', fmt=raw size=1073741824
> > > 
> > > $ ls -lhs test.raw
> > > 4.0K -rw-r--r--. 1 nsoffer nsoffer 1.0G Aug 16 23:48 test.raw
> > 
> > Are you sure about this?
> 
> This is the new behaviour with this change...
> 
> > [mlevitsk@maximlenovopc ~/work/test_area/posix-file 0]$ qemu-img create -f 
> > raw test.raw 1g -o preallocation=off
> > Formatting 'test.raw', fmt=raw size=1073741824 preallocation=off
> > [mlevitsk@maximlenovopc ~/work/test_area/posix-file 0]$ls -lhs ./test.raw 
> > 0 -rw-r--r--. 1 mlevitsk mlevitsk 1.0G Aug 25 10:38 ./test.raw
> > 
> > ext4, tested on qemu-4.0.0 and qemu git master.
> 
> And this is the old behavior. I guess the commit message does not make it 
> clear.

Ah, thanks!


> > From what I remember, the only case when posix-raw touches the first block 
> > is to zero it out
> > when running on top of kernel block device, to erase whatever header might 
> > be there, and this
> > is also kind of a backward compat hack which might be one day removed.
> 
> This change is only for file, on block storage we use BLKSSZGET.
>  
> > [...]
> > 
> > Best regards,
> > Maxim Levitsky
> > 
> > 

Best regards,
Maxim Levitsky





[Qemu-block] [PATCH v2 2/2] iotests: Test allocate_first_block() with O_DIRECT

2019-08-25 Thread Nir Soffer
Using block_resize we can test allocate_first_block() with file
descriptor opened with O_DIRECT, ensuring that it works for any size
larger than 4096 bytes.

Testing smaller sizes is tricky as the result depends on the filesystem
used for testing. For example on NFS any size will work since O_DIRECT
does not require any alignment.
---
 tests/qemu-iotests/175 | 25 +
 tests/qemu-iotests/175.out |  8 
 2 files changed, 33 insertions(+)

diff --git a/tests/qemu-iotests/175 b/tests/qemu-iotests/175
index d54cb43c39..60cc251eb2 100755
--- a/tests/qemu-iotests/175
+++ b/tests/qemu-iotests/175
@@ -49,6 +49,23 @@ _filter_blocks()
 -e "s/blocks=$((extra_blocks + img_size / 512))\\(\$\\|[^0-9]\\)/max 
allocation/"
 }
 
+# Resize image using block_resize.
+# Parameter 1: image path
+# Parameter 2: new size
+_block_resize()
+{
+local path=$1
+local size=$2
+
+$QEMU -qmp stdio -nographic -nodefaults \
+-blockdev file,node-name=file,filename=$path,cache.direct=on \
+<

[Qemu-block] [PATCH v2 0/2] Optimize alignment probing

2019-08-25 Thread Nir Soffer
When probing unallocated area on XFS filesystem we cannot detect request
alignment and we fallback to safe value which may not be optimal. Avoid this
fallback by always allocating the first block when creating a new image or
resizing empty image.

I tested v1 only with -raw format, and missed some changes in qcow2
tests creating raw images during the tests. This time I tested with both
-raw and -qcow2.

Changes in v2:
- Support file descriptor opened with O_DIRECT (e.g. in block_resize) (Max)
- Remove unneeded change in 160 (Max)
- Fix block filter in 175 on filesystem allocating extra blocks (Max)
- Comment why we ignore errors in allocte_first_block() (Max)
- Comment why allocate_first_block() is needed in FALLOC mode (Max)
- Clarify commit message about user visible changes (Maxim)
- Fix 178.out.qcow2
- Fix 150.out with -qcow2 by spliting to 150.out.{raw,qcow2}
- Add test for allocate_first_block() with block_resize (Max)
- Drop provisioing tests results since I ran them only once

v1 was here:
https://lists.nongnu.org/archive/html/qemu-block/2019-08/msg00821.html

Nir Soffer (2):
  block: posix: Always allocate the first block
  iotests: Test allocate_first_block() with O_DIRECT

 block/file-posix.c| 43 ++
 tests/qemu-iotests/{150.out => 150.out.qcow2} |  0
 tests/qemu-iotests/150.out.raw| 12 +
 tests/qemu-iotests/175| 44 ---
 tests/qemu-iotests/175.out| 16 +--
 tests/qemu-iotests/178.out.qcow2  |  4 +-
 tests/qemu-iotests/221.out| 12 +++--
 tests/qemu-iotests/253.out| 12 +++--
 8 files changed, 123 insertions(+), 20 deletions(-)
 rename tests/qemu-iotests/{150.out => 150.out.qcow2} (100%)
 create mode 100644 tests/qemu-iotests/150.out.raw

-- 
2.20.1




[Qemu-block] [PATCH v2 1/2] block: posix: Always allocate the first block

2019-08-25 Thread Nir Soffer
When creating an image with preallocation "off" or "falloc", the first
block of the image is typically not allocated. When using Gluster
storage backed by XFS filesystem, reading this block using direct I/O
succeeds regardless of request length, fooling alignment detection.

In this case we fallback to a safe value (4096) instead of the optimal
value (512), which may lead to unneeded data copying when aligning
requests.  Allocating the first block avoids the fallback.

Since we allocate the first block even with preallocation=off, we no
longer create images with zero disk size:

$ ./qemu-img create -f raw test.raw 1g
Formatting 'test.raw', fmt=raw size=1073741824

$ ls -lhs test.raw
4.0K -rw-r--r--. 1 nsoffer nsoffer 1.0G Aug 16 23:48 test.raw

And converting the image requires additional cluster:

$ ./qemu-img measure -f raw -O qcow2 test.raw
required size: 458752
fully allocated size: 1074135040

I did quick performance test for copying disks with qemu-img convert to
new raw target image to Gluster storage with sector size of 512 bytes:

for i in $(seq 10); do
rm -f dst.raw
sleep 10
time ./qemu-img convert -f raw -O raw -t none -T none src.raw dst.raw
done

Here is a table comparing the total time spent:

TypeBefore(s)   After(s)Diff(%)
---
real  530.028469.123  -11.4
user   17.204 10.768  -37.4
sys17.881  7.011  -60.7

We can see very clear improvement in CPU usage.

Signed-off-by: Nir Soffer 
---
 block/file-posix.c| 43 +++
 tests/qemu-iotests/{150.out => 150.out.qcow2} |  0
 tests/qemu-iotests/150.out.raw| 12 ++
 tests/qemu-iotests/175| 19 +---
 tests/qemu-iotests/175.out|  8 ++--
 tests/qemu-iotests/178.out.qcow2  |  4 +-
 tests/qemu-iotests/221.out| 12 --
 tests/qemu-iotests/253.out| 12 --
 8 files changed, 90 insertions(+), 20 deletions(-)
 rename tests/qemu-iotests/{150.out => 150.out.qcow2} (100%)
 create mode 100644 tests/qemu-iotests/150.out.raw

diff --git a/block/file-posix.c b/block/file-posix.c
index fbeb0068db..51688ae3fc 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1749,6 +1749,39 @@ static int handle_aiocb_discard(void *opaque)
 return ret;
 }
 
+/*
+ * Help alignment probing by allocating the first block.
+ *
+ * When reading with direct I/O from unallocated area on Gluster backed by XFS,
+ * reading succeeds regardless of request length. In this case we fallback to
+ * safe alignment which is not optimal. Allocating the first block avoids this
+ * fallback.
+ *
+ * fd may be opened with O_DIRECT, but we don't know the buffer alignment or
+ * request alignment, so we use safe values.
+ *
+ * Returns: 0 on success, -errno on failure. Since this is an optimization,
+ * caller may ignore failures.
+ */
+static int allocate_first_block(int fd, size_t max_size)
+{
+size_t write_size = MIN(MAX_BLOCKSIZE, max_size);
+size_t max_align = MAX(MAX_BLOCKSIZE, getpagesize());
+void *buf;
+ssize_t n;
+
+buf = qemu_memalign(max_align, write_size);
+memset(buf, 0, write_size);
+
+do {
+n = pwrite(fd, buf, write_size, 0);
+} while (n == -1 && errno == EINTR);
+
+qemu_vfree(buf);
+
+return (n == -1) ? -errno : 0;
+}
+
 static int handle_aiocb_truncate(void *opaque)
 {
 RawPosixAIOData *aiocb = opaque;
@@ -1788,6 +1821,13 @@ static int handle_aiocb_truncate(void *opaque)
 /* posix_fallocate() doesn't set errno. */
 error_setg_errno(errp, -result,
  "Could not preallocate new data");
+} else if (current_length == 0) {
+/*
+ * Needed only if posix_fallocate() used fallocate(), but we
+ * don't have a way to detect that. Optimize future alignment
+ * probing; ignore failures.
+ */
+allocate_first_block(fd, offset);
 }
 } else {
 result = 0;
@@ -1849,6 +1889,9 @@ static int handle_aiocb_truncate(void *opaque)
 if (ftruncate(fd, offset) != 0) {
 result = -errno;
 error_setg_errno(errp, -result, "Could not resize file");
+} else if (current_length == 0 && offset > current_length) {
+/* Optimize future alignment probing; ignore failures. */
+allocate_first_block(fd, offset);
 }
 return result;
 default:
diff --git a/tests/qemu-iotests/150.out b/tests/qemu-iotests/150.out.qcow2
similarity index 100%
rename from tests/qemu-iotests/150.out
rename to tests/qemu-iotests/150.out.qcow2
diff --git a/tests/qemu-iotests/150.out.raw b/tests/qemu-iotests/150.out.raw
new file mode 100644
index 00..3cdc7727a5
--- /dev/null
+++ b/tests/qemu-iotests/150.out

Re: [Qemu-block] [PATCH] block: posix: Always allocate the first block

2019-08-25 Thread Nir Soffer
On Sun, Aug 25, 2019 at 10:44 AM Maxim Levitsky  wrote:

> On Sat, 2019-08-17 at 00:21 +0300, Nir Soffer wrote:
> > When creating an image with preallocation "off" or "falloc", the first
> > block of the image is typically not allocated. When using Gluster
> > storage backed by XFS filesystem, reading this block using direct I/O
> > succeeds regardless of request length, fooling alignment detection.
> >
> > In this case we fallback to a safe value (4096) instead of the optimal
> > value (512), which may lead to unneeded data copying when aligning
> > requests.  Allocating the first block avoids the fallback.
> >
> > When using preallocation=off, we always allocate at least one filesystem
> > block:
> >
> > $ ./qemu-img create -f raw test.raw 1g
> > Formatting 'test.raw', fmt=raw size=1073741824
> >
> > $ ls -lhs test.raw
> > 4.0K -rw-r--r--. 1 nsoffer nsoffer 1.0G Aug 16 23:48 test.raw
>
> Are you sure about this?
>

This is the new behaviour with this change...

[mlevitsk@maximlenovopc ~/work/test_area/posix-file 0]$ qemu-img create -f
> raw test.raw 1g -o preallocation=off
> Formatting 'test.raw', fmt=raw size=1073741824 preallocation=off
> [mlevitsk@maximlenovopc ~/work/test_area/posix-file 0]$ls -lhs ./test.raw
> 0 -rw-r--r--. 1 mlevitsk mlevitsk 1.0G Aug 25 10:38 ./test.raw
>
> ext4, tested on qemu-4.0.0 and qemu git master.
>

And this is the old behavior. I guess the commit message does not make it
clear.

>From what I remember, the only case when posix-raw touches the first block
> is to zero it out
> when running on top of kernel block device, to erase whatever header might
> be there, and this
> is also kind of a backward compat hack which might be one day removed.
>

This change is only for file, on block storage we use BLKSSZGET.


>
> [...]
>
> Best regards,
> Maxim Levitsky
>
>
>


Re: [Qemu-block] [Qemu-devel] [PATCH 05/13] qcrypto-luks: clear the masterkey and password before freeing them always

2019-08-25 Thread Maxim Levitsky
On Sun, 2019-08-25 at 18:31 +0300, Maxim Levitsky wrote:
> On Thu, 2019-08-22 at 13:56 +0300, Maxim Levitsky wrote:
> > On Thu, 2019-08-22 at 11:49 +0100, Daniel P. Berrangé wrote:
> > > On Tue, Aug 20, 2019 at 08:12:51PM +0200, Max Reitz wrote:
> > > > On 14.08.19 22:22, Maxim Levitsky wrote:
> > > > > While there are other places where these are still stored in memory,
> > > > > this is still one less key material area that can be sniffed with
> > > > > various side channel attacks
> > > > > 
> > > > > 
> > > > > 
> > > > 
> > > > (Many empty lines here)
> > > > 
> > > > > Signed-off-by: Maxim Levitsky 
> > > > > ---
> > > > >  crypto/block-luks.c | 52 
> > > > > ++---
> > > > >  1 file changed, 44 insertions(+), 8 deletions(-)
> > > > 
> > > > Wouldn’t it make sense to introduce a dedicated function for this?
> > > 
> > > Yes, it would.
> > > 
> > > In fact I have a series pending which bumps min glib and introduces
> > > use of auto-free functions in this code.
> > > 
> > > It would be desirable to have a autp-free func for memset+free
> > > so we can just declare the variable
> > > 
> > >q_autowipefree char *password = NULL;
> > > 
> > > and have it result in memset+free
> > > 
> > 
> > That is perfect.
> > When do you think you could post the series so that I could rebase
> > on top of it?
> 
> 
> I am thinking that I will keep my patch as is, just so that code is
> consistent in memsetting the secrets (even though as Nir pointed out,
> that these will be probably optimized away anyway).
> And then when you send your patch you will just remove all
> of these memsets.
> 
> Is this all right? 

I see that your series actually already got merged.
Can I now implement the 'q_autowipefree', or do I need another glib version bump
for that?

Best regards,
Maxim Levitsky


> 
> Best regards,
>   Maxim Levitsky





Re: [Qemu-block] [Qemu-devel] [PATCH 00/13] RFC: luks/encrypted qcow2 key management

2019-08-25 Thread Maxim Levitsky
On Thu, 2019-08-22 at 12:35 +0100, Daniel P. Berrangé wrote:
> On Wed, Aug 14, 2019 at 11:22:06PM +0300, Maxim Levitsky wrote:
> > Hi!
> > 
> > This patch series implements key management for luks based encryption
> > It supports both raw luks images and qcow2 encrypted images.
> > 
> > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1731898
> > 
> > There are still several issues that need to be figured out,
> > on which the feedback is very welcome, but other than that the code mostly 
> > works.
> > 
> > The main issues are:
> > 
> > 1. Instead of the proposed 
> > blockdev-update-encryption/blockdev-erase-encryption
> > interface, it is probably better to implement 'blockdev-amend-options' in 
> > qmp,
> > and use this both for offline and online key update (with some translation
> > layer to convert the qemu-img 'options' to qmp structures)
> > 
> > This interface already exists for offline qcow2 format options update/
> > 
> > This is an issue that was raised today on IRC with Kevin Wolf. Really thanks
> > for the idea!
> > 
> > We agreed that this new qmp interface should take the same options as
> > blockdev-create does, however since we want to be able to edit the 
> > encryption
> > slots separately, this implies that we sort of need to allow this on 
> > creation
> > time as well.
> > 
> > Also the BlockdevCreateOptions is a union, which is specialized by the 
> > driver name
> > which is great for creation, but for update, the driver name is already 
> > known,
> > and thus the user should not be forced to pass it again.
> > However qmp doesn't seem to support union type guessing based on actual 
> > fields
> > given (this might not be desired either), which complicates this somewhat.
> 
> Given this design question around the integration into blockdev, I'd
> suggest splitting the series into two parts.
> 
> One series should do all the work in crypto/ code to support adding
> and erasing key slots.
> 
> One series should focus on block/ layer QMP/qemu-img integration.
> 
> The block layer QAPI stuff shouldn't leak into the crypto/ code.
> 
> So this will let us get on with reviewing & unit testing the
> crypto code, while we discuss block layer design options in more
> detail.
> 
> Regards,
> Daniel


I think we need 3 series here.


1. All the re-factoring/preparation work I done in luks crypto driver, which 
can be merged
now, pending minor changes from the review.
I think that it at least doesn't make the code worse.

2. Common code for the block layer to support key management this way or 
another,
   can be even added with not a single driver implementing it.

1,2 don't depend on each other mostly.


3. Key management in LUKS, which needs both 1,2, but thankfully is mostly 
implemented,
and won't need to change much from the current implementation.


So I'll send 1 now, and I will star working on 2.

Last week we (I and Daniel) defined a draft of amend interface,
and if time permits we will work on that tomorrow to finalize the
interface.

Best regards,
Maxim Levitsky




Re: [Qemu-block] [Qemu-devel] [PATCH 12/13] qemu-img: implement key management

2019-08-25 Thread Maxim Levitsky
On Thu, 2019-08-22 at 16:42 +0200, Max Reitz wrote:
> On 22.08.19 13:32, Daniel P. Berrangé wrote:
> > On Tue, Aug 20, 2019 at 08:29:55PM +0200, Max Reitz wrote:
> > > On 14.08.19 22:22, Maxim Levitsky wrote:
> > > > Signed-off-by: Maxim Levitsky 
> > > > ---
> > > >  block/crypto.c   |  16 ++
> > > >  block/crypto.h   |   3 +
> > > >  qemu-img-cmds.hx |  13 +
> > > >  qemu-img.c   | 140 +++
> > > >  4 files changed, 172 insertions(+)
> > > 
> > > Yes, this seems a bit weird.  Putting it under amend seems like the
> > > natural thing if that works; if not, I think it should be a single
> > > qemu-img subcommand instead of two.
> > 
> > I'm not convinced by overloading two distinct operations on to one
> > sub-command - doesn't seem to give an obvious benefit to overload
> > them & IME experiance overloading results in harder to understand
> > commands due to having distinct args to each command.
> 
> Because it suits the qemu-img interface we currently have.  For example,
> we have a single subcommand for internal snapshot management (“qemu-img
> snapshot”), so I think it makes sense to have a single subcommand for
> encrypted image management.

I personally don't care, other that I do thing that the best here is to use
the amend interface.

> 
> Max
> 

Best regards,
Maxim Levitsky




Re: [Qemu-block] [Qemu-devel] [PATCH 09/13] qcrypto-luks: implement the encryption key management

2019-08-25 Thread Maxim Levitsky
On Thu, 2019-08-22 at 12:27 +0100, Daniel P. Berrangé wrote:
> On Wed, Aug 14, 2019 at 11:22:15PM +0300, Maxim Levitsky wrote:
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  crypto/block-luks.c | 374 +++-
> >  1 file changed, 373 insertions(+), 1 deletion(-)
> > 
> > diff --git a/crypto/block-luks.c b/crypto/block-luks.c
> > index 1997e92fe1..2c33643b52 100644
> > --- a/crypto/block-luks.c
> > +++ b/crypto/block-luks.c
> > @@ -72,6 +72,8 @@ typedef struct QCryptoBlockLUKSKeySlot 
> > QCryptoBlockLUKSKeySlot;
> >  
> >  #define QCRYPTO_BLOCK_LUKS_DEFAULT_ITER_TIME 2000
> >  
> > +#define QCRYPTO_BLOCK_LUKS_ERASE_ITERATIONS 40
> > +
> >  static const char qcrypto_block_luks_magic[QCRYPTO_BLOCK_LUKS_MAGIC_LEN] = 
> > {
> >  'L', 'U', 'K', 'S', 0xBA, 0xBE
> >  };
> > @@ -221,6 +223,9 @@ struct QCryptoBlockLUKS {
> >  
> >  /* Hash algorithm used in pbkdf2 function */
> >  QCryptoHashAlgorithm hash_alg;
> > +
> > +/* Name of the secret that was used to open the image */
> > +char *secret;
> >  };
> >  
> >  
> > @@ -1121,6 +1126,194 @@ qcrypto_block_luks_find_key(QCryptoBlock *block,
> >  }
> >  
> >  
> > +
> > +/*
> > + * Returns true if a slot i is marked as containing as active
> 
> s/as containing//
> 
> > + * (contains encrypted copy of the master key)
> > + */
> > +
> > +static bool
> > +qcrypto_block_luks_slot_active(QCryptoBlockLUKS *luks, int slot_idx)
> > +{
> > +uint32_t val = luks->header.key_slots[slot_idx].active;
> > +return val ==  QCRYPTO_BLOCK_LUKS_KEY_SLOT_ENABLED;
> > +}
> > +
> > +/*
> > + * Returns the number of slots that are marked as active
> > + * (contains encrypted copy of the master key)
> > + */
> > +
> > +static int
> > +qcrypto_block_luks_count_active_slots(QCryptoBlockLUKS *luks)
> > +{
> > +int i, ret = 0;
> 
> I prefer to see 'size_t' for loop iterators 
Done

> 
> > +
> > +for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
> > +if (qcrypto_block_luks_slot_active(luks, i)) {
> > +ret++;
> > +}
> > +}
> > +return ret;
> > +}
> > +
> > +
> > +/*
> > + * Finds first key slot which is not active
> > + * Returns the key slot index, or -1 if doesn't exist
> > + */
> > +
> > +static int
> > +qcrypto_block_luks_find_free_keyslot(QCryptoBlockLUKS *luks)
> > +{
> > +uint i;
> > +
> > +for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
> > +if (!qcrypto_block_luks_slot_active(luks, i)) {
> > +return i;
> > +}
> > +}
> > +return -1;
> > +
> > +}
> > +
> > +/*
> > + * Erases an keyslot given its index
> > + *
> > + * Returns:
> > + *0 if the keyslot was erased successfully
> > + *   -1 if a error occurred while erasing the keyslot
> > + *
> > + */
> > +
> 
> Redundant blank line
Done
> 
> > +static int
> > +qcrypto_block_luks_erase_key(QCryptoBlock *block,
> > + uint slot_idx,
> > + QCryptoBlockWriteFunc writefunc,
> > + void *opaque,
> > + Error **errp)
> > +{
> > +QCryptoBlockLUKS *luks = block->opaque;
> > +QCryptoBlockLUKSKeySlot *slot = &luks->header.key_slots[slot_idx];
> > +uint8_t *garbagekey = NULL;
> > +size_t splitkeylen = masterkeylen(luks) * slot->stripes;
> > +int i;
> > +int ret = -1;
> > +
> > +assert(slot_idx < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS);
> > +assert(splitkeylen > 0);
> > +
> > +garbagekey = g_malloc0(splitkeylen);
> > +
> > +/* Reset the key slot header */
> > +memset(slot->salt, 0, QCRYPTO_BLOCK_LUKS_SALT_LEN);
> > +slot->iterations = 0;
> > +slot->active = QCRYPTO_BLOCK_LUKS_KEY_SLOT_DISABLED;
> > +
> > +qcrypto_block_luks_store_header(block,  writefunc, opaque, errp);
> > +
> > +/*
> > + * Now try to erase the key material, even if the header
> > + * update failed
> > + */
> > +
> 
> Redundant blank line
Fixed.
> 
> > +for (i = 0 ; i < QCRYPTO_BLOCK_LUKS_ERASE_ITERATIONS ; i++) {
> > +if (qcrypto_random_bytes(garbagekey, splitkeylen, errp) < 0) {
> > +
> 
> Again, many more times beelow.
> 
> > +/*
> > + * If we failed to get the random data, still write
> > + * *something* to the key slot at least once
> > + */
> 
> Specificially  we write all-zeros, since garbagekey was allocated
> with g_malloc0
Clarified.
> 
> > +
> > +if (i > 0) {
> > +goto cleanup;
> > +}
> > +}
> > +
> > +if (writefunc(block, slot->key_offset * 
> > QCRYPTO_BLOCK_LUKS_SECTOR_SIZE,
> > +  garbagekey,
> > +  splitkeylen,
> > +  opaque,
> > +  errp) != splitkeylen) {
> 
> Indent is off - align with '('
Fixed.
> 
> > +goto cleanup;
> > +}
> > +}
> > +
> > +ret = 0;
> > +cleanup:
> > +g_free(garbagekey);
> 

Re: [Qemu-block] [Qemu-devel] [PATCH 07/13] block: add manage-encryption command (qmp and blockdev)

2019-08-25 Thread Maxim Levitsky
On Thu, 2019-08-22 at 16:07 +0200, Markus Armbruster wrote:
> Maxim Levitsky  writes:
> 
> > On Wed, 2019-08-21 at 13:47 +0200, Markus Armbruster wrote:
> > > Maxim Levitsky  writes:
> > > 
> > > > This adds:
> > > > 
> > > > * x-blockdev-update-encryption and x-blockdev-erase-encryption qmp 
> > > > commands
> > > >   Both commands take the QCryptoKeyManageOptions
> > > >   the x-blockdev-update-encryption is meant for non destructive addition
> > > >   of key slots / whatever the encryption driver supports in the future
> > > > 
> > > >   x-blockdev-erase-encryption is meant for destructive encryption key 
> > > > erase,
> > > >   in some cases even without way to recover the data.
> > > > 
> > > > 
> > > > * bdrv_setup_encryption callback in the block driver
> > > >   This callback does both the above functions with 'action' parameter
> > > > 
> > > > * QCryptoKeyManageOptions with set of options that drivers can use for 
> > > > encryption managment
> > > >   Currently it has all the options that LUKS needs, and later it can be 
> > > > extended
> > > >   (via union) to support more encryption drivers if needed
> > > > 
> > > > * blk_setup_encryption / bdrv_setup_encryption - the usual block layer 
> > > > wrappers.
> > > >   Note that bdrv_setup_encryption takes BlockDriverState and not 
> > > > BdrvChild,
> > > >   for the ease of use from the qmp code. It is not expected that this 
> > > > function
> > > >   will be used by anything but qmp and qemu-img code
> > > > 
> > > > 
> > > > Signed-off-by: Maxim Levitsky 
> > > 
> > > [...]
> > > > diff --git a/qapi/block-core.json b/qapi/block-core.json
> > > > index 0d43d4f37c..53ed411eed 100644
> > > > --- a/qapi/block-core.json
> > > > +++ b/qapi/block-core.json
> > > > @@ -5327,3 +5327,39 @@
> > > >'data' : { 'node-name': 'str',
> > > >   'iothread': 'StrOrNull',
> > > >   '*force': 'bool' } }
> > > > +
> > > > +
> > > > +##
> > > > +# @x-blockdev-update-encryption:
> > > > +#
> > > > +# Update the encryption keys for an encrypted block device
> > > > +#
> > > > +# @node-name:Name of the blockdev to operate on
> > > > +# @force: Disable safety checks (use with care)
> > > 
> > > What checks excactly are disabled?
> > 
> > Ability to overwrite an used slot with a different password. 
> > If overwrite fails, the image won't be recoverable.
> > 
> > The safe way is to add a new slot, then erase the old
> > one, but this changes the slot where the password
> > is stored, unless this procedure is used twice
> 
> Would this be a useful addition to the doc comment?
> 
> > > > +# @options:   Driver specific options
> > > > +#
> > > > +
> > > > +# Since: 4.2
> > > > +##
> > > > +{ 'command': 'x-blockdev-update-encryption',
> > > > +  'data': { 'node-name' : 'str',
> > > > +'*force' : 'bool',
> > > > +'options': 'QCryptoEncryptionSetupOptions' } }
> > > > +
> > > > +##
> > > > +# @x-blockdev-erase-encryption:
> > > > +#
> > > > +# Erase the encryption keys for an encrypted block device
> > > > +#
> > > > +# @node-name:Name of the blockdev to operate on
> > > > +# @force: Disable safety checks (use with care)
> > > 
> > > Likewise.
> > 
> > 1. Erase a slot which is already marked as
> > erased. Mostly harmless but pointless as well.
> > 
> > 2. Erase last keyslot. This irreversibly destroys
> > any ability to read the data from the device,
> > unless a backup of the header and the key material is
> > done prior. Still can be useful when it is desired to
> > erase the data fast.
> 
> Would this be a useful addition to the doc comment?
Yea, but since I'll will switch to the amend interface,
I'll leave it like that for now.


[...]


Best regards,
Maxim Levitsky




Re: [Qemu-block] [Qemu-devel] [PATCH 06/13] qcrypto-luks: implement more rigorous header checking

2019-08-25 Thread Maxim Levitsky
On Sun, 2019-08-25 at 18:40 +0300, Maxim Levitsky wrote:
> On Thu, 2019-08-22 at 12:04 +0100, Daniel P. Berrangé wrote:
> > On Wed, Aug 14, 2019 at 11:22:12PM +0300, Maxim Levitsky wrote:
> > > Check that keyslots don't overlap with the data,
> > > and check that keyslots don't overlap with each other.
> > > (this is done using naive O(n^2) nested loops,
> > > but since there are just 8 keyslots, this doens't really matter.
> > > 
> > > Signed-off-by: Maxim Levitsky 
> > > ---
> > >  crypto/block-luks.c | 42 ++
> > >  1 file changed, 42 insertions(+)
> > > 
> > > diff --git a/crypto/block-luks.c b/crypto/block-luks.c
> > > index 336e633df4..1997e92fe1 100644
> > > --- a/crypto/block-luks.c
> > > +++ b/crypto/block-luks.c
> > > @@ -551,6 +551,8 @@ static int
> > >  qcrypto_block_luks_check_header(QCryptoBlockLUKS *luks, Error **errp)
> > >  {
> > >  int ret;
> > > +int i, j;
> > > +
> > >  
> > >  if (memcmp(luks->header.magic, qcrypto_block_luks_magic,
> > > QCRYPTO_BLOCK_LUKS_MAGIC_LEN) != 0) {
> > > @@ -566,6 +568,46 @@ qcrypto_block_luks_check_header(QCryptoBlockLUKS 
> > > *luks, Error **errp)
> > >  goto fail;
> > >  }
> > >  
> > > +/* Check all keyslots for corruption  */
> > > +for (i = 0 ; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS ; i++) {
> > > +
> > > +QCryptoBlockLUKSKeySlot *slot1 = &luks->header.key_slots[i];
> > > +uint start1 = slot1->key_offset;
> > > +uint len1 = splitkeylen_sectors(luks, slot1->stripes);
> > 
> > Using 'uint' is not normal QEMU style.
> > 
> > Either use 'unsigned int'  or if a specific size is needed
> > then one of the 'guintNN' types from glib.
> > 
> > This applies elsewhere in this patch series too, but
> > I'll only comment here & let you find the other cases.
> 
> Fixed. Sorry for the noise.
> 
> > 
> > > +
> > > +if (slot1->stripes == 0 ||
> > > +(slot1->active != QCRYPTO_BLOCK_LUKS_KEY_SLOT_DISABLED &&
> > > +slot1->active != QCRYPTO_BLOCK_LUKS_KEY_SLOT_ENABLED)) {
> > > +
> > 
> > Redundant blank line
> 
> Fixed
> > 
> > > +error_setg(errp, "Keyslot %i is corrupted", i);
> > 
> > I'd do a separate check for stripes and active fields, and then give a
> > specific error message for each. That way if this does ever trigger
> > in practice will immediately understand which check failed.
> > 
> > Also using '%d' rather than '%i' is more common convention
> 
> Done.

Note that I switched i,j to be size_t since you said that you prefer this,
and to print this I apparently need %lu.


[...]

Best regards,
Maxim Levitsky




Re: [Qemu-block] [Qemu-devel] [PATCH 06/13] qcrypto-luks: implement more rigorous header checking

2019-08-25 Thread Maxim Levitsky
On Thu, 2019-08-22 at 12:04 +0100, Daniel P. Berrangé wrote:
> On Wed, Aug 14, 2019 at 11:22:12PM +0300, Maxim Levitsky wrote:
> > Check that keyslots don't overlap with the data,
> > and check that keyslots don't overlap with each other.
> > (this is done using naive O(n^2) nested loops,
> > but since there are just 8 keyslots, this doens't really matter.
> > 
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  crypto/block-luks.c | 42 ++
> >  1 file changed, 42 insertions(+)
> > 
> > diff --git a/crypto/block-luks.c b/crypto/block-luks.c
> > index 336e633df4..1997e92fe1 100644
> > --- a/crypto/block-luks.c
> > +++ b/crypto/block-luks.c
> > @@ -551,6 +551,8 @@ static int
> >  qcrypto_block_luks_check_header(QCryptoBlockLUKS *luks, Error **errp)
> >  {
> >  int ret;
> > +int i, j;
> > +
> >  
> >  if (memcmp(luks->header.magic, qcrypto_block_luks_magic,
> > QCRYPTO_BLOCK_LUKS_MAGIC_LEN) != 0) {
> > @@ -566,6 +568,46 @@ qcrypto_block_luks_check_header(QCryptoBlockLUKS 
> > *luks, Error **errp)
> >  goto fail;
> >  }
> >  
> > +/* Check all keyslots for corruption  */
> > +for (i = 0 ; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS ; i++) {
> > +
> > +QCryptoBlockLUKSKeySlot *slot1 = &luks->header.key_slots[i];
> > +uint start1 = slot1->key_offset;
> > +uint len1 = splitkeylen_sectors(luks, slot1->stripes);
> 
> Using 'uint' is not normal QEMU style.
> 
> Either use 'unsigned int'  or if a specific size is needed
> then one of the 'guintNN' types from glib.
> 
> This applies elsewhere in this patch series too, but
> I'll only comment here & let you find the other cases.

Fixed. Sorry for the noise.

> 
> > +
> > +if (slot1->stripes == 0 ||
> > +(slot1->active != QCRYPTO_BLOCK_LUKS_KEY_SLOT_DISABLED &&
> > +slot1->active != QCRYPTO_BLOCK_LUKS_KEY_SLOT_ENABLED)) {
> > +
> 
> Redundant blank line
Fixed
> 
> > +error_setg(errp, "Keyslot %i is corrupted", i);
> 
> I'd do a separate check for stripes and active fields, and then give a
> specific error message for each. That way if this does ever trigger
> in practice will immediately understand which check failed.
> 
> Also using '%d' rather than '%i' is more common convention
Done.
> 
> 
> > +ret = -EINVAL;
> > +goto fail;
> > +}
> > +
> > +if (start1 + len1 > luks->header.payload_offset) {
> > +error_setg(errp,
> > +   "Keyslot %i is overlapping with the encrypted 
> > payload",
> > +   i);
> > +ret = -EINVAL;
> > +goto fail;
> > +}
> > +
> > +for (j = i + 1 ; j < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS ; j++) {
> > +
> 
> Redundant blank
> 
> > +QCryptoBlockLUKSKeySlot *slot2 = &luks->header.key_slots[j];
> > +uint start2 = slot2->key_offset;
> > +uint len2 = splitkeylen_sectors(luks, slot2->stripes);
> > +
> > +if (start1 + len1 > start2 && start2 + len2 > start1) {
> > +error_setg(errp,
> > +   "Keyslots %i and %i are overlapping in the 
> > header",
> 
> %d
Fixed.
> 
> > +   i, j);
> > +ret = -EINVAL;
> > +goto fail;
> > +}
> > +}
> > +
> > +}
> >  return 0;
> >  fail:
> >  return ret;
> > -- 
> > 2.17.2
> > 
> 
> Regards,
> Daniel


Best regards,
Maxim Levitsky





Re: [Qemu-block] [Qemu-devel] [PATCH 05/13] qcrypto-luks: clear the masterkey and password before freeing them always

2019-08-25 Thread Maxim Levitsky
On Thu, 2019-08-22 at 13:56 +0300, Maxim Levitsky wrote:
> On Thu, 2019-08-22 at 11:49 +0100, Daniel P. Berrangé wrote:
> > On Tue, Aug 20, 2019 at 08:12:51PM +0200, Max Reitz wrote:
> > > On 14.08.19 22:22, Maxim Levitsky wrote:
> > > > While there are other places where these are still stored in memory,
> > > > this is still one less key material area that can be sniffed with
> > > > various side channel attacks
> > > > 
> > > > 
> > > > 
> > > 
> > > (Many empty lines here)
> > > 
> > > > Signed-off-by: Maxim Levitsky 
> > > > ---
> > > >  crypto/block-luks.c | 52 ++---
> > > >  1 file changed, 44 insertions(+), 8 deletions(-)
> > > 
> > > Wouldn’t it make sense to introduce a dedicated function for this?
> > 
> > Yes, it would.
> > 
> > In fact I have a series pending which bumps min glib and introduces
> > use of auto-free functions in this code.
> > 
> > It would be desirable to have a autp-free func for memset+free
> > so we can just declare the variable
> > 
> >q_autowipefree char *password = NULL;
> > 
> > and have it result in memset+free
> > 
> 
> That is perfect.
> When do you think you could post the series so that I could rebase
> on top of it?


I am thinking that I will keep my patch as is, just so that code is
consistent in memsetting the secrets (even though as Nir pointed out,
that these will be probably optimized away anyway).
And then when you send your patch you will just remove all
of these memsets.

Is this all right? 

Best regards,
Maxim Levitsky




Re: [Qemu-block] [PATCH v5 1/6] iotests: allow Valgrind checking all QEMU processes

2019-08-25 Thread Andrey Shinkevich


On 16/08/2019 01:49, John Snow wrote:
> 
> 
> On 7/19/19 12:30 PM, Andrey Shinkevich wrote:
>> With the '-valgrind' option, let all the QEMU processes be run under
>> the Valgrind tool. The Valgrind own parameters may be set with its
>> environment variable VALGRIND_OPTS, e.g.
>> VALGRIND_OPTS="--leak-check=yes" ./check -qcow2 -valgrind 
>> or they may be listed in the Valgrind checked file ./.valgrindrc or
>> ~/.valgrindrc like
>> --memcheck:leak-check=no
>> --memcheck:track-origins=yes
>> When QEMU-IO process is being killed, the shell report refers to the
>> text of the command in _qemu_io_wrapper(), which was modified with this
>> patch. So, the benchmark output for the tests 039, 061 and 137 is to be
>> changed also.
>>
> 
> Oh, weird. "VALGRIND_QEMU=y" actually has just meant ... valgrind
> qemu-io. OK.
> 
>> Signed-off-by: Andrey Shinkevich 
>> ---
>>   tests/qemu-iotests/039.out   | 30 ---
>>   tests/qemu-iotests/061.out   | 12 ++--
>>   tests/qemu-iotests/137.out   |  6 +---
>>   tests/qemu-iotests/common.rc | 69 
>> 
>>   4 files changed, 59 insertions(+), 58 deletions(-)
>>
>> diff --git a/tests/qemu-iotests/039.out b/tests/qemu-iotests/039.out
>> index 724d7b2..972c6c0 100644
>> --- a/tests/qemu-iotests/039.out
>> +++ b/tests/qemu-iotests/039.out
>> @@ -11,11 +11,7 @@ No errors were found on the image.
>>   Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
>>   wrote 512/512 bytes at offset 0
>>   512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> -./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; 
>> then
>> -exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
>> "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
>> -else
>> -exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
>> -fi )
>> +./common.rc: Killed  ( _qemu_proc_wrapper 
>> "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
>>   incompatible_features 0x1
>>   ERROR cluster 5 refcount=0 reference=1
>>   ERROR OFLAG_COPIED data cluster: l2_entry=8005 refcount=0
>> @@ -50,11 +46,7 @@ read 512/512 bytes at offset 0
>>   Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
>>   wrote 512/512 bytes at offset 0
>>   512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> -./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; 
>> then
>> -exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
>> "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
>> -else
>> -exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
>> -fi )
>> +./common.rc: Killed  ( _qemu_proc_wrapper 
>> "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
>>   incompatible_features 0x1
>>   ERROR cluster 5 refcount=0 reference=1
>>   Rebuilding refcount structure
>> @@ -68,11 +60,7 @@ incompatible_features 0x0
>>   Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
>>   wrote 512/512 bytes at offset 0
>>   512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> -./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; 
>> then
>> -exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
>> "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
>> -else
>> -exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
>> -fi )
>> +./common.rc: Killed  ( _qemu_proc_wrapper 
>> "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
>>   incompatible_features 0x0
>>   No errors were found on the image.
>>   
>> @@ -91,11 +79,7 @@ No errors were found on the image.
>>   Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
>>   wrote 512/512 bytes at offset 0
>>   512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> -./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; 
>> then
>> -exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
>> "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
>> -else
>> -exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
>> -fi )
>> +./common.rc: Killed  ( _qemu_proc_wrapper 
>> "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
>>   incompatible_features 0x1
>>   ERROR cluster 5 refcount=0 reference=1
>>   ERROR OFLAG_COPIED data cluster: l2_entry=8005 refcount=0
>> @@ -105,11 +89,7 @@ Data may be corrupted, or further writes to the image 
>> may corrupt it.
>>   Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
>>   wrote 512/512 bytes at offset 0
>>   512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> -./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; 
>> then
>> -exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
>> "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
>> -else
>> -exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
>> -fi )
>> +./common.rc: Killed  ( _qemu_proc_wrapper 
>> "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
>>   incompatible_features 0x0
>>   No errors 

Re: [Qemu-block] [PATCH v5 4/6] iotests: Valgrind fails with nonexistent directory

2019-08-25 Thread Andrey Shinkevich


On 16/08/2019 03:55, John Snow wrote:
> 
> 
> On 7/19/19 12:30 PM, Andrey Shinkevich wrote:
>> The Valgrind uses the exported variable TMPDIR and fails if the
>> directory does not exist. Let us exclude such a test case from
>> being run under the Valgrind and notify the user of it.
>>
>> Suggested-by: Kevin Wolf 
>> Signed-off-by: Andrey Shinkevich 
>> ---
>>   tests/qemu-iotests/051 | 4 
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
>> index ce942a5..f8141ca 100755
>> --- a/tests/qemu-iotests/051
>> +++ b/tests/qemu-iotests/051
>> @@ -377,6 +377,10 @@ printf %b "qemu-io $device_id \"write -P 0x33 0 
>> 4k\"\ncommit $device_id\n" |
>>   $QEMU_IO -c "read -P 0x33 0 4k" "$TEST_IMG" | _filter_qemu_io
>>   
>>   # Using snapshot=on with a non-existent TMPDIR
>> +if [ "${VALGRIND_QEMU}" == "y" ]; then
>> +_casenotrun "Valgrind needs a valid TMPDIR for itself"
>> +fi
>> +VALGRIND_QEMU="" \
>>   TMPDIR=/nonexistent run_qemu -drive driver=null-co,snapshot=on
>>   
>>   # Using snapshot=on together with read-only=on
>>
> 
> The only other way around this would be a complicated mechanism to set
> the TMPDIR for valgrind's sub-processes only, with e.g.
> 
> valgrind ... env TMPDIR=/nonexistent qemu ...
> 
> ... It's probably not worth trying to concoct such a thing; but I
> suppose it is possible. You'd have to set up a generic layer for setting
> environment variables, then in the qemu shim, you could either set them
> directly (non-valgrind invocation) or set them as part of the valgrind
> command-line.
> 
> Or you could just take my R-B:
> 
> Reviewed-by: John Snow 
> 

Thanks again John for your review and the advice.
Probably, it doesn't worth efforts to manage that case because QEMU 
should fail anyway with the error message "Could not get temporary 
filename: No such file or directory". So, we would not benefit much from 
that run. We have other test cases that cover the main functionality. 
It's just to check the QEMU error path for possible memory issues. Shall we?

Andrey
-- 
With the best regards,
Andrey Shinkevich


Re: [Qemu-block] [PATCH 04/13] qcrypto-luks: refactoring: simplify the math used for keyslot locations

2019-08-25 Thread Maxim Levitsky
On Thu, 2019-08-22 at 11:47 +0100, Daniel P. Berrangé wrote:
> On Wed, Aug 14, 2019 at 11:22:10PM +0300, Maxim Levitsky wrote:
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  crypto/block-luks.c | 64 +++--
> >  1 file changed, 38 insertions(+), 26 deletions(-)
> > 
> > diff --git a/crypto/block-luks.c b/crypto/block-luks.c
> > index 6bb369f3b4..e1a4df94b7 100644
> > --- a/crypto/block-luks.c
> > +++ b/crypto/block-luks.c
> > @@ -417,6 +417,33 @@ static int masterkeylen(QCryptoBlockLUKS *luks)
> >  }
> >  
> >  
> > +/*
> > + * Returns number of sectors needed to store the key material
> > + * given number of anti forensic stripes
> > + */
> > +static int splitkeylen_sectors(QCryptoBlockLUKS *luks, int stripes)
> 
> Needs a qcrypto_block_luks_ prefix on method name.
Done.

> 
> I'd also put 'static int' on a separate line from method name
> to reduce too long lines.
Done.
> 
> > +
> > +{
> > +/*
> > + * This calculation doesn't match that shown in the spec,
> > + * but instead follows the cryptsetup implementation.
> > + */
> > +
> > +size_t header_sectors = QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
> > + QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
> 
> Following line indent should only be 4 spaces
I didn't knew that. Fixed.
> 
> > +
> > +size_t splitkeylen = masterkeylen(luks) * stripes;
> > +
> > +/* First align the key material size to block size*/
> > +size_t splitkeylen_sectors =
> > +DIV_ROUND_UP(splitkeylen, QCRYPTO_BLOCK_LUKS_SECTOR_SIZE);
> 
> Again 4 space indent.
> 
> > +
> > +/* Then also align the key material size to the size of the header */
> > +return ROUND_UP(splitkeylen_sectors, header_sectors);
> > +}
> > +
> > +
> > +
> >  /*
> >   * Stores the main LUKS header, taking care of endianess
> >   */
> > @@ -1169,7 +1196,7 @@ qcrypto_block_luks_create(QCryptoBlock *block,
> >  QCryptoBlockCreateOptionsLUKS luks_opts;
> >  Error *local_err = NULL;
> >  uint8_t *masterkey = NULL;
> > -size_t splitkeylen = 0;
> > +size_t next_sector;
> >  size_t i;
> >  char *password;
> >  const char *cipher_alg;
> > @@ -1388,23 +1415,16 @@ qcrypto_block_luks_create(QCryptoBlock *block,
> >  goto error;
> >  }
> >  
> > +/* start with the sector that follows the header*/
> > +next_sector = QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
> > +  QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
> 
> I'd suggest 'post_header_sector'
I called it now header_sectors, and each split key, split_key_sectors.
I hope that this is good enough.
> 
> >  
> > -/* Although LUKS has multiple key slots, we're just going
> > - * to use the first key slot */
> > -splitkeylen = luks->header.key_bytes * QCRYPTO_BLOCK_LUKS_STRIPES;
> >  for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
> > -luks->header.key_slots[i].active = 
> > QCRYPTO_BLOCK_LUKS_KEY_SLOT_DISABLED;
> > -luks->header.key_slots[i].stripes = QCRYPTO_BLOCK_LUKS_STRIPES;
> > -
> > -/* This calculation doesn't match that shown in the spec,
> > - * but instead follows the cryptsetup implementation.
> > - */
> > -luks->header.key_slots[i].key_offset =
> > -(QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
> > - QCRYPTO_BLOCK_LUKS_SECTOR_SIZE) +
> > -(ROUND_UP(DIV_ROUND_UP(splitkeylen, 
> > QCRYPTO_BLOCK_LUKS_SECTOR_SIZE),
> > -  (QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
> > -   QCRYPTO_BLOCK_LUKS_SECTOR_SIZE)) * i);
> > +QCryptoBlockLUKSKeySlot *slot = &luks->header.key_slots[i];
> > +slot->active = QCRYPTO_BLOCK_LUKS_KEY_SLOT_DISABLED;
> > +slot->key_offset = next_sector;
> > +slot->stripes = QCRYPTO_BLOCK_LUKS_STRIPES;
> > +next_sector += splitkeylen_sectors(luks, 
> > QCRYPTO_BLOCK_LUKS_STRIPES);
> 
> I'm not a fan of the next_sector accumulator .
I actually think that accumulator is cleaner here,
but I won't argue about this. Fixed.


> 
> I'd prefer to see the '* i' part done in splitkeylen_sectors, so that
> we have
> 
>   slot->key_offset = post_header_sector +
> splitkeylen_sectors(luks, QCRYPTO_BLOCK_LUKS_STRIPES, i);
> 
> > @@ -1412,17 +1432,9 @@ qcrypto_block_luks_create(QCryptoBlock *block,
> >   * slot headers, rounded up to the nearest sector, combined with
> >   * the size of each master key material region, also rounded up
> >   * to the nearest sector */
> > -luks->header.payload_offset =
> > -(QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
> > - QCRYPTO_BLOCK_LUKS_SECTOR_SIZE) +
> > -(ROUND_UP(DIV_ROUND_UP(splitkeylen, 
> > QCRYPTO_BLOCK_LUKS_SECTOR_SIZE),
> > -  (QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
> > -   QCRYPTO_BLOCK_LUKS_SECTOR_SIZE)) *
> > - QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS);
> > -
> > +luks->header.payload_offset = next_sector;
> 
>   luks->header.payload_offset = po

Re: [Qemu-block] [Qemu-devel] [PATCH 03/13] qcrypto-luks: refactoring: extract load/store/check/parse header functions

2019-08-25 Thread Maxim Levitsky
On Thu, 2019-08-22 at 11:34 +0100, Daniel P. Berrangé wrote:
> On Thu, Aug 22, 2019 at 01:43:05AM +0300, Maxim Levitsky wrote:
> > On Tue, 2019-08-20 at 20:01 +0200, Max Reitz wrote:
> > > On 14.08.19 22:22, Maxim Levitsky wrote:
> > > > With upcoming key management, the header will
> > > > need to be stored after the image is created.
> > > > 
> > > > Extracting load header isn't strictly needed, but
> > > > do this anyway for the symmetry.
> > > > 
> > > > Also I extracted a function that does basic sanity
> > > > checks on the just read header, and a function
> > > > which parses all the crypto format to make the
> > > > code a bit more readable, plus now the code
> > > > doesn't destruct the in-header cipher-mode string,
> > > > so that the header now can be stored many times,
> > > > which is needed for the key management.
> > > > 
> > > > Also this allows to contain the endianess conversions
> > > > in these functions alone
> > > > 
> > > > The header is no longer endian swapped in place,
> > > > to prevent (mostly theoretical races I think)
> > > > races where someone could see the header in the
> > > > process of beeing byteswapped.
> > > 
> > > The formatting looks weird, it doesn’t look quite 72 characters wide...
> > >  (what commit messages normally use)
> > 
> > Could you elaborate on that? I thought that code should not
> > exceed 80 character limit.
> > 
> > > 
> > > > Signed-off-by: Maxim Levitsky 
> > > > ---
> > > >  crypto/block-luks.c | 756 ++--
> > > >  1 file changed, 440 insertions(+), 316 deletions(-)
> > > 
> > > Also, this commit is just too big.
> > 
> > Yea, but it has no functional changes.
> > I can split it further, but that won't help much IMHO.
> 
> I'd find it easier to review if each newly introduced method was a
> separate patch. It makes it easier to see which bit of removed
> code was added to which method.
> 
> Regards,
> Daniel

Done, patch is now split in several patches.
Best regards,
Maxim Levitsky




Re: [Qemu-block] [Qemu-devel] [PATCH 03/13] qcrypto-luks: refactoring: extract load/store/check/parse header functions

2019-08-25 Thread Maxim Levitsky
On Thu, 2019-08-22 at 11:38 +0100, Daniel P. Berrangé wrote:
> On Wed, Aug 14, 2019 at 11:22:09PM +0300, Maxim Levitsky wrote:
> > With upcoming key management, the header will
> > need to be stored after the image is created.
> > 
> > Extracting load header isn't strictly needed, but
> > do this anyway for the symmetry.
> > 
> > Also I extracted a function that does basic sanity
> > checks on the just read header, and a function
> > which parses all the crypto format to make the
> > code a bit more readable, plus now the code
> > doesn't destruct the in-header cipher-mode string,
> > so that the header now can be stored many times,
> > which is needed for the key management.
> > 
> > Also this allows to contain the endianess conversions
> > in these functions alone
> > 
> > The header is no longer endian swapped in place,
> > to prevent (mostly theoretical races I think)
> > races where someone could see the header in the
> > process of beeing byteswapped.
> > 
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  crypto/block-luks.c | 756 ++--
> >  1 file changed, 440 insertions(+), 316 deletions(-)
> >  if (!(flags & QCRYPTO_BLOCK_OPEN_NO_IO)) {
> >  /* Try to find which key slot our password is valid for
> >   * and unlock the master key from that slot.
> >   */
> > -
> >  masterkey = g_new0(uint8_t, masterkeylen(luks));
> >  
> >  if (qcrypto_block_luks_find_key(block,
> > @@ -845,12 +1132,10 @@ qcrypto_block_luks_open(QCryptoBlock *block,
> >  }
> >  
> >  block->sector_size = QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
> > -block->payload_offset = luks->header.payload_offset *
> > -block->sector_size;
> > +block->payload_offset = luks->header.payload_offset * 
> > block->sector_size;
> >  
> >  g_free(masterkey);
> >  g_free(password);
> > -
> >  return 0;
> 
> Smoe unrelated whitespace changes here.
> 
> 
> > +/* populate the slot 0 with the password encrypted master key*/
> > +/* This will also store the header */
> > +if (qcrypto_block_luks_store_key(block,
> > + 0,
> > + password,
> > + masterkey,
> > + luks_opts.iter_time,
> > + writefunc,
> > + opaque,
> > + errp)) {
> >  goto error;
> > -}
> > + }
> 
> Indent is off by 1
> 
> 
> Regards,
> Daniel


Fixed,


Best regards,
Maxim Levitsky




Re: [Qemu-block] [Qemu-devel] [QEMU] [PATCH v5 5/8] bootdevice: Gather LCHS from all relevant devices

2019-08-25 Thread Sam Eiderman via Qemu-block
> Only scsi-hd has the lchs properties, though, so what’s the purpose of
> defining the unrealize function for all other classes?
>
> Max

- shmuel.eider...@oracle.com
+ sam...@google.com

The only purpose is to already have them mapped to the correct existing
function, in case it will be used later on.
I can resubmit without the unrealize for the other classes, WDYT?

Sam




Re: [Qemu-block] [Qemu-devel] [QEMU] [PATCH v5 4/8] scsi: Propagate unrealize() callback to scsi-hd

2019-08-25 Thread Sam Eiderman via Qemu-block
> @@ -213,11 +221,18 @@ static void scsi_qdev_realize(DeviceState *qdev, Error 
> **errp)
>  static void scsi_qdev_unrealize(DeviceState *qdev, Error **errp)
>  {
>  SCSIDevice *dev = SCSI_DEVICE(qdev);
> +Error *local_err = NULL;
>
>  if (dev->vmsentry) {
>  qemu_del_vm_change_state_handler(dev->vmsentry);
>  }
>
> +scsi_device_unrealize(dev, &local_err);
> +if (local_err) {
> +error_propagate(errp, local_err);
> +return;
> +}
> +
>  scsi_device_purge_requests(dev, SENSE_CODE(NO_SENSE));

(I see this code for the first time, but) I suppose I’d put the
scsi_device_unrealize() after scsi_device_purge_requests().

Max

>  blockdev_mark_auto_del(dev->conf.blk);
>  }

- shmuel.eider...@oracle.com
+ sam...@google.com

Sure, I'll resubmit

Sam




Re: [Qemu-block] [PATCH v5 3/6] iotests: Add casenotrun report to bash tests

2019-08-25 Thread Andrey Shinkevich


On 16/08/2019 23:33, Cleber Rosa wrote:
> On Thu, Aug 15, 2019 at 08:44:11PM -0400, John Snow wrote:
>>
>>
>> On 7/19/19 12:30 PM, Andrey Shinkevich wrote:
>>> The new function _casenotrun() is to be invoked if a test case cannot
>>> be run for some reason. The user will be notified by a message passed
>>> to the function.
>>>
>>
>> Oh, I assume this is a sub-test granularity; if we need to skip
>> individual items.
>>
>> I'm good with this, but we should CC Cleber Rosa, who has struggled
>> against this in the past, too.
>>
> 
> The discussion I was involved in was not that much about skipping
> tests per se, but about how to determine if a test should be skipped
> or not.  At that time, we proposed an integration with the build
> system, but the downside (and the reason for not pushing it forward)
> was the requirement to run the iotest outside of a build tree.
> 
>>> Suggested-by: Kevin Wolf 
>>> Signed-off-by: Andrey Shinkevich 
>>> ---
>>>   tests/qemu-iotests/common.rc | 7 +++
>>>   1 file changed, 7 insertions(+)
>>>
>>> diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
>>> index 6e461a1..1089050 100644
>>> --- a/tests/qemu-iotests/common.rc
>>> +++ b/tests/qemu-iotests/common.rc
>>> @@ -428,6 +428,13 @@ _notrun()
>>>   exit
>>>   }
>>>   
>>> +# bail out, setting up .casenotrun file
>>> +#
>>> +_casenotrun()
>>> +{
>>> +echo "[case not run] $*" >>"$OUTPUT_DIR/$seq.casenotrun"
>>> +}
>>> +
>>>   # just plain bail out
>>>   #
>>>   _fail()
>>>
>>
>> seems fine to me otherwise.
>>
>> Reviewed-by: John Snow 
> 
> Yeah, this also LGTM.
> 
> Reviewed-by: Cleber Rosa 
> 

Thank you very much for your reviews. Please note that the function 
_casenotrun() works as a notifier only as it was done for the Python 
based iotests. It is a caller responsibility to skip running a 
particular test with all relevant logic. I will supply the comment in v6 
and will keep your 'Reviewed-by' if there are no objections ))

Andrey
-- 
With the best regards,
Andrey Shinkevich


Re: [Qemu-block] [Qemu-devel] [PATCH 02/13] qcrypto-luks: misc refactoring

2019-08-25 Thread Maxim Levitsky
On Thu, 2019-08-22 at 16:32 +0200, Max Reitz wrote:
> On 22.08.19 01:59, Maxim Levitsky wrote:
> > On Tue, 2019-08-20 at 19:36 +0200, Max Reitz wrote:
> > > On 14.08.19 22:22, Maxim Levitsky wrote:
> > > > This is also a preparation for key read/write/erase functions
> > > > 
> > > > * use master key len from the header
> > > > * prefer to use crypto params in the QCryptoBlockLUKS
> > > >   over passing them as function arguments
> > > > * define QCRYPTO_BLOCK_LUKS_DEFAULT_ITER_TIME
> > > > * Add comments to various crypto parameters in the QCryptoBlockLUKS
> > > > 
> > > > Signed-off-by: Maxim Levitsky 
> > > > ---
> > > >  crypto/block-luks.c | 213 ++--
> > > >  1 file changed, 105 insertions(+), 108 deletions(-)
> 
> 
> [...]
> 
> > > > @@ -410,21 +430,15 @@ 
> > > > qcrypto_block_luks_essiv_cipher(QCryptoCipherAlgorithm cipher,
> > > >   */
> > > >  static int
> > > >  qcrypto_block_luks_load_key(QCryptoBlock *block,
> > > > -QCryptoBlockLUKSKeySlot *slot,
> > > > +uint slot_idx,
> > > 
> > > Did you use uint on purpose or do you mean a plain “unsigned”?
> > 
> > Well there are just 8 slots, but yea I don't mind to make this an unsigned 
> > int.
> 
> My point was that “uint” is not a standard C type.

There are lot of non standard types, like the u8/u8/u32/u64 I used to use in 
the kernel,
so I kind of missed that. Won't use that type anymore :-)


> 
> [...]
> 
> > > > @@ -930,6 +922,17 @@ qcrypto_block_luks_create(QCryptoBlock *block,
> > > >  luks_opts.has_ivgen_hash_alg = true;
> > > >  }
> > > >  }
> > > > +
> > > > +luks = g_new0(QCryptoBlockLUKS, 1);
> > > > +block->opaque = luks;
> > > > +
> > > > +luks->cipher_alg = luks_opts.cipher_alg;
> > > > +luks->cipher_mode = luks_opts.cipher_mode;
> > > > +luks->ivgen_alg = luks_opts.ivgen_alg;
> > > > +luks->ivgen_hash_alg = luks_opts.ivgen_hash_alg;
> > > > +luks->hash_alg = luks_opts.hash_alg;
> > > > +
> > > > +
> > > 
> > > Why did you pull this up?  Now @luks is leaked in both of the next error
> > > paths.
> > 
> > This is because the purpose of these fields changed. As Daniel explained to 
> > me
> > they were initially added after the fact to serve as a cache of into to 
> > present in qemu-img info callback.
> > But now I use these everywhere in the code so I won't them to be correct as 
> > soon as possible to minimize
> > the risk of calling some function that uses these and would read garbage.
> 
> I get that, but I was wondering why you pulled the allocation of @luks
> up above the next two conditional blocks.  Allocating and initializing
> there should have worked just fine.
Yea, I didn't have to, just thought that putting the initialization as above
as possible is a good thing for future.


> 
> Max
> 

Best regards,
Maxim Levitsky




Re: [Qemu-block] [Qemu-devel] [PATCH v5 0/6] Allow Valgrind checking all QEMU processes

2019-08-25 Thread Andrey Shinkevich


On 16/08/2019 23:05, Cleber Rosa wrote:
> On Fri, Jul 19, 2019 at 07:30:10PM +0300, Andrey Shinkevich wrote:
>> In the current implementation of the QEMU bash iotests, only qemu-io
>> processes may be run under the Valgrind, which is a useful tool for
>> finding memory usage issues. Let's allow the common.rc bash script
>> runing all the QEMU processes, such as qemu-kvm, qemu-img, qemu-ndb
>> and qemu-vxhs, under the Valgrind tool.
>>
> 
> FIY, this looks very similar (in purpose) to:
> 
> https://avocado-framework.readthedocs.io/en/71.0/WrapProcess.html
> 
> And in fact Valgrind was one of the original motivations:
> 
> 
> https://github.com/avocado-framework/avocado/blob/master/examples/wrappers/valgrind.sh
> 
> Maybe this can be helpful for the Python based iotests.
> 
> - Cleber.
> 

Thank you Cleber for the advice. That is the way I actually ran Python 
iotests under Valgrind on my host and discovered some issues with them 
already.

Andrey

>> v5:
>>01: The patch "block/nbd: NBDReply is used being uninitialized" was 
>> detached
>>and taken into account in the patch "nbd: Initialize reply on failure"
>>by Eric Blake.
>>
>> v4:
>>01: The patch "iotests: Set read-zeroes on in null block driver for 
>> Valgrind"
>>was extended with new cases and issued as a separate series.
>>02: The new patch "block/nbd: NBDReply is used being uninitialized" was
>>added to resolve the failure of the iotest 083 run under Valgrind.
>>
>> v3:
>>01: The new function _casenotrun() was added to the common.rc bash
>>script to notify the user of test cases dropped for some reason.
>>Suggested by Kevin.
>>Particularly, the notification about the nonexistent TMPDIR in
>>the test 051 was added (noticed by Vladimir).
>>02: The timeout in some test cases was extended for Valgrind because
>>it differs when running on the ramdisk.
>>03: Due to the common.nbd script has been changed with the commit
>>b28f582c, the patch "iotests: amend QEMU NBD process synchronization"
>>is actual no more. Note that QEMU_NBD is launched in the bash nested
>>shell in the _qemu_nbd_wrapper() as it was before in common.rc.
>>04: The patch "iotests: new file to suppress Valgrind errors" was dropped
>>due to my superficial understanding of the work of the function
>>blk_pread_unthrottled(). Special thanks to Kevin who shed the light
>>on the null block driver involved. Now, the parameter 'read-zeroes=on'
>>is passed to the null block driver to initialize the buffer in the
>>function guess_disk_lchs() that the Valgrind was complaining to.
>>
>> v2:
>>01: The patch 2/7 of v1 was merged into the patch 1/7, suggested by 
>> Daniel.
>>02: Another patch 7/7 was added to introduce the Valgrind error 
>> suppression
>>file into the QEMU project.
>>Discussed in the email thread with the message ID:
>><1560276131-683243-1-git-send-email-andrey.shinkev...@virtuozzo.com>
>>
>> Andrey Shinkevich (6):
>>iotests: allow Valgrind checking all QEMU processes
>>iotests: exclude killed processes from running under  Valgrind
>>iotests: Add casenotrun report to bash tests
>>iotests: Valgrind fails with nonexistent directory
>>iotests: extended timeout under Valgrind
>>iotests: extend sleeping time under Valgrind
>>
>>   tests/qemu-iotests/028   |  6 +++-
>>   tests/qemu-iotests/039   |  5 +++
>>   tests/qemu-iotests/039.out   | 30 +++--
>>   tests/qemu-iotests/051   |  4 +++
>>   tests/qemu-iotests/061   |  2 ++
>>   tests/qemu-iotests/061.out   | 12 ++-
>>   tests/qemu-iotests/137   |  1 +
>>   tests/qemu-iotests/137.out   |  6 +---
>>   tests/qemu-iotests/183   |  9 +-
>>   tests/qemu-iotests/192   |  6 +++-
>>   tests/qemu-iotests/247   |  6 +++-
>>   tests/qemu-iotests/common.rc | 76 
>> +---
>>   12 files changed, 101 insertions(+), 62 deletions(-)
>>
>> -- 
>> 1.8.3.1
>>
>>

-- 
With the best regards,
Andrey Shinkevich


Re: [Qemu-block] [PATCH v5 6/6] iotests: extend sleeping time under Valgrind

2019-08-25 Thread Andrey Shinkevich
On 16/08/2019 04:01, John Snow wrote:
> 
> 
> On 7/19/19 12:30 PM, Andrey Shinkevich wrote:
>> To synchronize the time when QEMU is running longer under the Valgrind,
>> increase the sleeping time in the test 247.
>>
>> Signed-off-by: Andrey Shinkevich 
>> Reviewed-by: Vladimir Sementsov-Ogievskiy 
>> ---
>>   tests/qemu-iotests/247 | 6 +-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/tests/qemu-iotests/247 b/tests/qemu-iotests/247
>> index 546a794..c853b73 100755
>> --- a/tests/qemu-iotests/247
>> +++ b/tests/qemu-iotests/247
>> @@ -57,7 +57,11 @@ TEST_IMG="$TEST_IMG.4" _make_test_img $size
>>   {"execute":"block-commit",
>>"arguments":{"device":"format-4", "top-node": "format-2", 
>> "base-node":"format-0", "job-id":"job0"}}
>>   EOF
>> -sleep 1
>> +if [ "${VALGRIND_QEMU}" == "y" ]; then
>> +sleep 10
>> +else
>> +sleep 1
>> +fi
>>   echo '{"execute":"quit"}'
>>   ) | $QEMU -qmp stdio -nographic -nodefaults \
>>   -blockdev file,node-name=file-0,filename=$TEST_IMG.0,auto-read-only=on 
>> \
>>
> 
> This makes me nervous, though. Won't this race terribly? (Wait, why
> doesn't it race already?)
> 
It maybe better to rewrite this test in Python.
To let it work under Valgrind in the current implementation, I reserved 
more seconds. We could have this tolerance just for the test.

Andrey
-- 
With the best regards,
Andrey Shinkevich


Re: [Qemu-block] [PATCH] block: posix: Always allocate the first block

2019-08-25 Thread Maxim Levitsky
On Sat, 2019-08-17 at 00:21 +0300, Nir Soffer wrote:
> When creating an image with preallocation "off" or "falloc", the first
> block of the image is typically not allocated. When using Gluster
> storage backed by XFS filesystem, reading this block using direct I/O
> succeeds regardless of request length, fooling alignment detection.
> 
> In this case we fallback to a safe value (4096) instead of the optimal
> value (512), which may lead to unneeded data copying when aligning
> requests.  Allocating the first block avoids the fallback.
> 
> When using preallocation=off, we always allocate at least one filesystem
> block:
> 
> $ ./qemu-img create -f raw test.raw 1g
> Formatting 'test.raw', fmt=raw size=1073741824
> 
> $ ls -lhs test.raw
> 4.0K -rw-r--r--. 1 nsoffer nsoffer 1.0G Aug 16 23:48 test.raw

Are you sure about this?

[mlevitsk@maximlenovopc ~/work/test_area/posix-file 0]$ qemu-img create -f raw 
test.raw 1g -o preallocation=off
Formatting 'test.raw', fmt=raw size=1073741824 preallocation=off
[mlevitsk@maximlenovopc ~/work/test_area/posix-file 0]$ls -lhs ./test.raw 
0 -rw-r--r--. 1 mlevitsk mlevitsk 1.0G Aug 25 10:38 ./test.raw

ext4, tested on qemu-4.0.0 and qemu git master.


>From what I remember, the only case when posix-raw touches the first block is 
>to zero it out
when running on top of kernel block device, to erase whatever header might be 
there, and this
is also kind of a backward compat hack which might be one day removed.

[...]

Best regards,
Maxim Levitsky




[Qemu-block] [PATCH 1/2] block/nvme: add support for write zeros

2019-08-25 Thread Maxim Levitsky
Signed-off-by: Maxim Levitsky 
---
 block/nvme.c | 72 +++-
 block/trace-events   |  1 +
 include/block/nvme.h | 19 +++-
 3 files changed, 90 insertions(+), 2 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 5be3a39b63..f8bd11e19a 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -111,6 +111,8 @@ typedef struct {
 uint64_t max_transfer;
 bool plugged;
 
+bool supports_write_zeros;
+
 CoMutex dma_map_lock;
 CoQueue dma_flush_queue;
 
@@ -421,6 +423,7 @@ static void nvme_identify(BlockDriverState *bs, int 
namespace, Error **errp)
 NvmeIdNs *idns;
 NvmeLBAF *lbaf;
 uint8_t *resp;
+uint16_t oncs;
 int r;
 uint64_t iova;
 NvmeCmd cmd = {
@@ -458,6 +461,9 @@ static void nvme_identify(BlockDriverState *bs, int 
namespace, Error **errp)
 s->max_transfer = MIN_NON_ZERO(s->max_transfer,
   s->page_size / sizeof(uint64_t) * s->page_size);
 
+oncs = le16_to_cpu(idctrl->oncs);
+s->supports_write_zeros = (oncs & NVME_ONCS_WRITE_ZEROS) != 0;
+
 memset(resp, 0, 4096);
 
 cmd.cdw10 = 0;
@@ -470,6 +476,12 @@ static void nvme_identify(BlockDriverState *bs, int 
namespace, Error **errp)
 s->nsze = le64_to_cpu(idns->nsze);
 lbaf = &idns->lbaf[NVME_ID_NS_FLBAS_INDEX(idns->flbas)];
 
+if (NVME_ID_NS_DLFEAT_WRITE_ZEROS(idns->dlfeat) &&
+NVME_ID_NS_DLFEAT_READ_BEHAVIOR(idns->dlfeat) ==
+NVME_ID_NS_DLFEAT_READ_BEHAVIOR_ZEROS) {
+bs->supported_write_flags |= BDRV_REQ_MAY_UNMAP;
+}
+
 if (lbaf->ms) {
 error_setg(errp, "Namespaces with metadata are not yet supported");
 goto out;
@@ -764,6 +776,8 @@ static int nvme_file_open(BlockDriverState *bs, QDict 
*options, int flags,
 int ret;
 BDRVNVMeState *s = bs->opaque;
 
+bs->supported_write_flags = BDRV_REQ_FUA;
+
 opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
 qemu_opts_absorb_qdict(opts, options, &error_abort);
 device = qemu_opt_get(opts, NVME_BLOCK_OPT_DEVICE);
@@ -792,7 +806,6 @@ static int nvme_file_open(BlockDriverState *bs, QDict 
*options, int flags,
 goto fail;
 }
 }
-bs->supported_write_flags = BDRV_REQ_FUA;
 return 0;
 fail:
 nvme_close(bs);
@@ -1086,6 +1099,60 @@ static coroutine_fn int nvme_co_flush(BlockDriverState 
*bs)
 }
 
 
+static coroutine_fn int nvme_co_pwrite_zeroes(BlockDriverState *bs,
+  int64_t offset,
+  int bytes,
+  BdrvRequestFlags flags)
+{
+BDRVNVMeState *s = bs->opaque;
+NVMeQueuePair *ioq = s->queues[1];
+NVMeRequest *req;
+
+uint32_t cdw12 = ((bytes >> s->blkshift) - 1) & 0x;
+
+if (!s->supports_write_zeros) {
+return -ENOTSUP;
+}
+
+NvmeCmd cmd = {
+.opcode = NVME_CMD_WRITE_ZEROS,
+.nsid = cpu_to_le32(s->nsid),
+.cdw10 = cpu_to_le32((offset >> s->blkshift) & 0x),
+.cdw11 = cpu_to_le32(((offset >> s->blkshift) >> 32) & 0x),
+};
+
+NVMeCoData data = {
+.ctx = bdrv_get_aio_context(bs),
+.ret = -EINPROGRESS,
+};
+
+if (flags & BDRV_REQ_MAY_UNMAP) {
+cdw12 |= (1 << 25);
+}
+
+if (flags & BDRV_REQ_FUA) {
+cdw12 |= (1 << 30);
+}
+
+cmd.cdw12 = cpu_to_le32(cdw12);
+
+trace_nvme_write_zeros(s, offset, bytes, flags);
+assert(s->nr_queues > 1);
+req = nvme_get_free_req(ioq);
+assert(req);
+
+nvme_submit_command(s, ioq, req, &cmd, nvme_rw_cb, &data);
+
+data.co = qemu_coroutine_self();
+while (data.ret == -EINPROGRESS) {
+qemu_coroutine_yield();
+}
+
+trace_nvme_rw_done(s, true, offset, bytes, data.ret);
+return data.ret;
+}
+
+
 static int nvme_reopen_prepare(BDRVReopenState *reopen_state,
BlockReopenQueue *queue, Error **errp)
 {
@@ -1190,6 +1257,9 @@ static BlockDriver bdrv_nvme = {
 
 .bdrv_co_preadv   = nvme_co_preadv,
 .bdrv_co_pwritev  = nvme_co_pwritev,
+
+.bdrv_co_pwrite_zeroes= nvme_co_pwrite_zeroes,
+
 .bdrv_co_flush_to_disk= nvme_co_flush,
 .bdrv_reopen_prepare  = nvme_reopen_prepare,
 
diff --git a/block/trace-events b/block/trace-events
index 04209f058d..8209fbd0c7 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -149,6 +149,7 @@ nvme_submit_command_raw(int c0, int c1, int c2, int c3, int 
c4, int c5, int c6,
 nvme_handle_event(void *s) "s %p"
 nvme_poll_cb(void *s) "s %p"
 nvme_prw_aligned(void *s, int is_write, uint64_t offset, uint64_t bytes, int 
flags, int niov) "s %p is_write %d offset %"PRId64" bytes %"PRId64" flags %d 
niov %d"
+nvme_write_zeros(void *s, uint64_t offset, uint64_t bytes, int flags) "s %p 
offset %"PRId64" bytes %"PRId64" flags %d"
 nvme_qiov_unaligned(const void *qiov, int n, void *base, size_t size, int 

[Qemu-block] [PATCH 2/2] block/nvme: add support for discard

2019-08-25 Thread Maxim Levitsky
Signed-off-by: Maxim Levitsky 
---
 block/nvme.c   | 83 ++
 block/trace-events |  2 ++
 2 files changed, 85 insertions(+)

diff --git a/block/nvme.c b/block/nvme.c
index f8bd11e19a..dd041f39c9 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -112,6 +112,7 @@ typedef struct {
 bool plugged;
 
 bool supports_write_zeros;
+bool supports_discard;
 
 CoMutex dma_map_lock;
 CoQueue dma_flush_queue;
@@ -463,6 +464,7 @@ static void nvme_identify(BlockDriverState *bs, int 
namespace, Error **errp)
 
 oncs = le16_to_cpu(idctrl->oncs);
 s->supports_write_zeros = (oncs & NVME_ONCS_WRITE_ZEROS) != 0;
+s->supports_discard = (oncs & NVME_ONCS_DSM) != 0;
 
 memset(resp, 0, 4096);
 
@@ -1153,6 +1155,86 @@ static coroutine_fn int 
nvme_co_pwrite_zeroes(BlockDriverState *bs,
 }
 
 
+static int coroutine_fn nvme_co_pdiscard(BlockDriverState *bs,
+ int64_t offset,
+ int bytes)
+{
+BDRVNVMeState *s = bs->opaque;
+NVMeQueuePair *ioq = s->queues[1];
+NVMeRequest *req;
+NvmeDsmRange *buf;
+QEMUIOVector local_qiov;
+int ret;
+
+NvmeCmd cmd = {
+.opcode = NVME_CMD_DSM,
+.nsid = cpu_to_le32(s->nsid),
+.cdw10 = cpu_to_le32(0), /*number of ranges - 0 based*/
+.cdw11 = cpu_to_le32(1 << 2), /*deallocate bit*/
+};
+
+NVMeCoData data = {
+.ctx = bdrv_get_aio_context(bs),
+.ret = -EINPROGRESS,
+};
+
+if (!s->supports_discard) {
+return -ENOTSUP;
+}
+
+assert(s->nr_queues > 1);
+
+buf = qemu_try_blockalign0(bs, s->page_size);
+if (!buf) {
+return -ENOMEM;
+}
+
+buf->nlb = cpu_to_le32(bytes >> s->blkshift);
+buf->slba = cpu_to_le64(offset >> s->blkshift);
+buf->cattr = 0;
+
+qemu_iovec_init(&local_qiov, 1);
+qemu_iovec_add(&local_qiov, buf, 4096);
+
+req = nvme_get_free_req(ioq);
+assert(req);
+
+qemu_co_mutex_lock(&s->dma_map_lock);
+ret = nvme_cmd_map_qiov(bs, &cmd, req, &local_qiov);
+qemu_co_mutex_unlock(&s->dma_map_lock);
+
+if (ret) {
+req->busy = false;
+goto out;
+}
+
+trace_nvme_dsm(s, offset, bytes);
+
+nvme_submit_command(s, ioq, req, &cmd, nvme_rw_cb, &data);
+
+data.co = qemu_coroutine_self();
+while (data.ret == -EINPROGRESS) {
+qemu_coroutine_yield();
+}
+
+qemu_co_mutex_lock(&s->dma_map_lock);
+ret = nvme_cmd_unmap_qiov(bs, &local_qiov);
+qemu_co_mutex_unlock(&s->dma_map_lock);
+
+if (ret) {
+goto out;
+}
+
+ret = data.ret;
+trace_nvme_dsm_done(s, offset, bytes, ret);
+out:
+qemu_iovec_destroy(&local_qiov);
+qemu_vfree(buf);
+return ret;
+
+}
+
+
 static int nvme_reopen_prepare(BDRVReopenState *reopen_state,
BlockReopenQueue *queue, Error **errp)
 {
@@ -1259,6 +1341,7 @@ static BlockDriver bdrv_nvme = {
 .bdrv_co_pwritev  = nvme_co_pwritev,
 
 .bdrv_co_pwrite_zeroes= nvme_co_pwrite_zeroes,
+.bdrv_co_pdiscard = nvme_co_pdiscard,
 
 .bdrv_co_flush_to_disk= nvme_co_flush,
 .bdrv_reopen_prepare  = nvme_reopen_prepare,
diff --git a/block/trace-events b/block/trace-events
index 8209fbd0c7..7d1d48b502 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -153,6 +153,8 @@ nvme_write_zeros(void *s, uint64_t offset, uint64_t bytes, 
int flags) "s %p offs
 nvme_qiov_unaligned(const void *qiov, int n, void *base, size_t size, int 
align) "qiov %p n %d base %p size 0x%zx align 0x%x"
 nvme_prw_buffered(void *s, uint64_t offset, uint64_t bytes, int niov, int 
is_write) "s %p offset %"PRId64" bytes %"PRId64" niov %d is_write %d"
 nvme_rw_done(void *s, int is_write, uint64_t offset, uint64_t bytes, int ret) 
"s %p is_write %d offset %"PRId64" bytes %"PRId64" ret %d"
+nvme_dsm(void *s, uint64_t offset, uint64_t bytes) "s %p offset %"PRId64" 
bytes %"PRId64""
+nvme_dsm_done(void *s, uint64_t offset, uint64_t bytes, int ret) "s %p offset 
%"PRId64" bytes %"PRId64" ret %d"
 nvme_dma_map_flush(void *s) "s %p"
 nvme_free_req_queue_wait(void *q) "q %p"
 nvme_cmd_map_qiov(void *s, void *cmd, void *req, void *qiov, int entries) "s 
%p cmd %p req %p qiov %p entries %d"
-- 
2.17.2




[Qemu-block] [PATCH 0/2] block/nvme: add support for write zeros and discard

2019-08-25 Thread Maxim Levitsky
This is the second part of the patches I prepared
for this driver back when I worked on mdev-nvme.

Best regards,
Maxim Levitsky

Maxim Levitsky (2):
  block/nvme: add support for write zeros
  block/nvme: add support for discard

 block/nvme.c | 155 ++-
 block/trace-events   |   3 +
 include/block/nvme.h |  19 +-
 3 files changed, 175 insertions(+), 2 deletions(-)

-- 
2.17.2