Re: [Qemu-devel] block: Review of .has_zero_init use

2013-06-26 Thread Peter Lieven

Am 26.06.2013 um 05:14 schrieb Bharata B Rao bhar...@linux.vnet.ibm.com:

 On Tue, Jun 25, 2013 at 01:39:11PM +0200, Kevin Wolf wrote:
 
 Can you please review for the gluster, rbd, sheepdog and ssh driver
 whether it's safe to assume that the image reads back as zeros after
 bdrv_create?
 
 Gluster supports both file and block backends. While the above is true for
 file backend (which uses ftruncate), the same is not true for
 block backend (which uses lvcreate  lvresize).
 
 So overall it is not safe to assume that an image on GlusterFS volume
 reads back as zeroes after create.

Okay, so for safety we have to return has_zero_init = 0. Erroneously assuming
a device is zero initialized can bring severe filesystem corruption.

Do you see a way to query the information of the underlaying
backend from the storage and return 1 or 0 conditionally?

Peter



Re: [Qemu-devel] block: Review of .has_zero_init use

2013-06-26 Thread Bharata B Rao
On Wed, Jun 26, 2013 at 07:59:27AM +0200, Peter Lieven wrote:
 
 Am 26.06.2013 um 05:14 schrieb Bharata B Rao bhar...@linux.vnet.ibm.com:
 
  On Tue, Jun 25, 2013 at 01:39:11PM +0200, Kevin Wolf wrote:
  
  Can you please review for the gluster, rbd, sheepdog and ssh driver
  whether it's safe to assume that the image reads back as zeros after
  bdrv_create?
  
  Gluster supports both file and block backends. While the above is true for
  file backend (which uses ftruncate), the same is not true for
  block backend (which uses lvcreate  lvresize).
  
  So overall it is not safe to assume that an image on GlusterFS volume
  reads back as zeroes after create.
 
 Okay, so for safety we have to return has_zero_init = 0. Erroneously assuming
 a device is zero initialized can bring severe filesystem corruption.
 
 Do you see a way to query the information of the underlaying
 backend from the storage and return 1 or 0 conditionally?

Right now no. There have been efforts to get the capabilities (file, block etc)
of gluster volume from gluster cmdline, but there haven't been discussions
about providing such capabilities from libgfapi which is the library
QEMU uses to talk with gluster.

I see has_zero_init being used from qemu-img convert path. Is there any
other use of this in QEMU ?

Regards,
Bharata. 




Re: [Qemu-devel] block: Review of .has_zero_init use

2013-06-26 Thread Kevin Wolf
Am 26.06.2013 um 08:46 hat Bharata B Rao geschrieben:
 On Wed, Jun 26, 2013 at 07:59:27AM +0200, Peter Lieven wrote:
  
  Am 26.06.2013 um 05:14 schrieb Bharata B Rao bhar...@linux.vnet.ibm.com:
  
   On Tue, Jun 25, 2013 at 01:39:11PM +0200, Kevin Wolf wrote:
   
   Can you please review for the gluster, rbd, sheepdog and ssh driver
   whether it's safe to assume that the image reads back as zeros after
   bdrv_create?
   
   Gluster supports both file and block backends. While the above is true for
   file backend (which uses ftruncate), the same is not true for
   block backend (which uses lvcreate  lvresize).
   
   So overall it is not safe to assume that an image on GlusterFS volume
   reads back as zeroes after create.
  
  Okay, so for safety we have to return has_zero_init = 0. Erroneously 
  assuming
  a device is zero initialized can bring severe filesystem corruption.
  
  Do you see a way to query the information of the underlaying
  backend from the storage and return 1 or 0 conditionally?
 
 Right now no. There have been efforts to get the capabilities (file, block 
 etc)
 of gluster volume from gluster cmdline, but there haven't been discussions
 about providing such capabilities from libgfapi which is the library
 QEMU uses to talk with gluster.
 
 I see has_zero_init being used from qemu-img convert path. Is there any
 other use of this in QEMU ?

No, it's the only user.

So what this means is that using 'qemu-img convert' with a glusterfs
target corrupts the image if the volume is backed by a block device.

Kevin



[Qemu-devel] block: Review of .has_zero_init use

2013-06-25 Thread Kevin Wolf
Hi all,

while discussing some iscsi patches with Peter, we came to have a look
at which block drivers implement has_zero_init() to return 0, and which
don't (returning 1 is the default).

The meaning of this value is that if has_zero_init != 0, after
bdrv_create() one can assume that the whole image would read back as all
zero. For example, this is true for the traditional image files, but not
for host_device, where the block device isn't really created during
bdrv_create() but only checked for size.

The full list of protocol level block drivers is:

* blkdebug  - doesn't have bdrv_create
* blkverify - doesn't have bdrv_create
* curl  - doesn't have bdrv_create
* gluster   - currently has_zero_init = 1 (is this correct?)
* iscsi - has_zero_init = 0
* nbd   - doesn't have bdrv_create
* file  - has_zero_init = 1
* host_*- has_zero_init = 0
* rbd   - currently has_zero_init = 1 (is this correct?)
* sheepdog  - currently has_zero_init = 1 (is this correct?)
* ssh   - currently has_zero_init = 1 (is this correct?)
* vvfat - doesn't have bdrv_create

Can you please review for the gluster, rbd, sheepdog and ssh driver
whether it's safe to assume that the image reads back as zeros after
bdrv_create?

It might be possible that the correct value depends on the backend on
the server side for some protocols - for example, I think for SSH it
depends on whether you access a regular file or a block device on the
other host (if accessing a block device is even possible). In such
cases, has_zero_init = 0 is the safe default.

We're probably going to change the meaning of unimplemented
has_zero_init to return 0, but it would be good to check if the current
code was actually meant to do what it does today.

Kevin



Re: [Qemu-devel] block: Review of .has_zero_init use

2013-06-25 Thread Richard W.M. Jones
On Tue, Jun 25, 2013 at 01:39:11PM +0200, Kevin Wolf wrote:
 * ssh   - currently has_zero_init = 1 (is this correct?)
[...]
 It might be possible that the correct value depends on the backend on
 the server side for some protocols - for example, I think for SSH it
 depends on whether you access a regular file or a block device on the
 other host (if accessing a block device is even possible).

This seems to depend on the behaviour of O_TRUNC on block devices.
The man page says it's unspecified, but I tested it on Linux, and
Linux ignores it (for logical volumes anyway).

When the ssh driver is asked to do bdrv_create it does on the
remote end:

 - open (filename, O_RDWR|O_CREAT|O_TRUNC, 0644) [1]
 - lseek (fd, size-1, SEEK_SET)  [2]
 - write (fd, '\0', 1)

In other words for regular files, it creates a sparse file.  For block
devices, I tested the sequence above, and it doesn't fail.

So .. I guess that has_zero_init = 0 would be correct?

Unless we fstat the fd after opening it and return some conditional
value from bdrv_has_zero_init eg if it's a block device.  Is that
possible?

Rich.

[1] Mode 0644 is hard-coded :-(

[2] I realize now it's actually possible to use ftruncate, although
it's not obvious.  There is no explicit truncate operation in sftp.
But there is a setattr operation, which if you try to set the size
attr in fact does a truncate at the remote end.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/



Re: [Qemu-devel] block: Review of .has_zero_init use

2013-06-25 Thread Kevin Wolf
Am 25.06.2013 um 14:05 hat Richard W.M. Jones geschrieben:
 On Tue, Jun 25, 2013 at 01:39:11PM +0200, Kevin Wolf wrote:
  * ssh   - currently has_zero_init = 1 (is this correct?)
 [...]
  It might be possible that the correct value depends on the backend on
  the server side for some protocols - for example, I think for SSH it
  depends on whether you access a regular file or a block device on the
  other host (if accessing a block device is even possible).
 
 This seems to depend on the behaviour of O_TRUNC on block devices.
 The man page says it's unspecified, but I tested it on Linux, and
 Linux ignores it (for logical volumes anyway).
 
 When the ssh driver is asked to do bdrv_create it does on the
 remote end:
 
  - open (filename, O_RDWR|O_CREAT|O_TRUNC, 0644) [1]
  - lseek (fd, size-1, SEEK_SET)  [2]
  - write (fd, '\0', 1)
 
 In other words for regular files, it creates a sparse file.  For block
 devices, I tested the sequence above, and it doesn't fail.
 
 So .. I guess that has_zero_init = 0 would be correct?

Yes, that's my understanding.

 Unless we fstat the fd after opening it and return some conditional
 value from bdrv_has_zero_init eg if it's a block device.  Is that
 possible?

.bdrv_has_zero_init is a callback that gets a specific BlockDriverState,
so it's possible to do some checks in there. If you can get the
necessary information using SSH, doing it dynamically is certainly an
option.

Kevin



Re: [Qemu-devel] block: Review of .has_zero_init use

2013-06-25 Thread MORITA Kazutaka
At Tue, 25 Jun 2013 13:39:11 +0200,
Kevin Wolf wrote:
 
 Hi all,
 
 while discussing some iscsi patches with Peter, we came to have a look
 at which block drivers implement has_zero_init() to return 0, and which
 don't (returning 1 is the default).
 
 The meaning of this value is that if has_zero_init != 0, after
 bdrv_create() one can assume that the whole image would read back as all
 zero. For example, this is true for the traditional image files, but not
 for host_device, where the block device isn't really created during
 bdrv_create() but only checked for size.
 
 The full list of protocol level block drivers is:
 
 * blkdebug  - doesn't have bdrv_create
 * blkverify - doesn't have bdrv_create
 * curl  - doesn't have bdrv_create
 * gluster   - currently has_zero_init = 1 (is this correct?)
 * iscsi - has_zero_init = 0
 * nbd   - doesn't have bdrv_create
 * file  - has_zero_init = 1
 * host_*- has_zero_init = 0
 * rbd   - currently has_zero_init = 1 (is this correct?)
 * sheepdog  - currently has_zero_init = 1 (is this correct?)
 * ssh   - currently has_zero_init = 1 (is this correct?)
 * vvfat - doesn't have bdrv_create
 
 Can you please review for the gluster, rbd, sheepdog and ssh driver
 whether it's safe to assume that the image reads back as zeros after
 bdrv_create?

It's safe for Sheepdog.  Sheepdog uses ftruncate or fallocate to
create data blocks and it is guaranteed that the space will be
initialized to zero.

Thanks,

Kazutaka



Re: [Qemu-devel] block: Review of .has_zero_init use

2013-06-25 Thread Kevin Wolf
Am 25.06.2013 um 15:11 hat MORITA Kazutaka geschrieben:
 At Tue, 25 Jun 2013 13:39:11 +0200,
 Kevin Wolf wrote:
  
  Hi all,
  
  while discussing some iscsi patches with Peter, we came to have a look
  at which block drivers implement has_zero_init() to return 0, and which
  don't (returning 1 is the default).
  
  The meaning of this value is that if has_zero_init != 0, after
  bdrv_create() one can assume that the whole image would read back as all
  zero. For example, this is true for the traditional image files, but not
  for host_device, where the block device isn't really created during
  bdrv_create() but only checked for size.
  
  The full list of protocol level block drivers is:
  
  * blkdebug  - doesn't have bdrv_create
  * blkverify - doesn't have bdrv_create
  * curl  - doesn't have bdrv_create
  * gluster   - currently has_zero_init = 1 (is this correct?)
  * iscsi - has_zero_init = 0
  * nbd   - doesn't have bdrv_create
  * file  - has_zero_init = 1
  * host_*- has_zero_init = 0
  * rbd   - currently has_zero_init = 1 (is this correct?)
  * sheepdog  - currently has_zero_init = 1 (is this correct?)
  * ssh   - currently has_zero_init = 1 (is this correct?)
  * vvfat - doesn't have bdrv_create
  
  Can you please review for the gluster, rbd, sheepdog and ssh driver
  whether it's safe to assume that the image reads back as zeros after
  bdrv_create?
 
 It's safe for Sheepdog.  Sheepdog uses ftruncate or fallocate to
 create data blocks and it is guaranteed that the space will be
 initialized to zero.

Note that ftruncate/fallocate don't zero-initialise block devices, only
regular files. Not sure if you can use block devices to back Sheepdog
images?

Kevin



Re: [Qemu-devel] block: Review of .has_zero_init use

2013-06-25 Thread MORITA Kazutaka
At Tue, 25 Jun 2013 15:20:18 +0200,
Kevin Wolf wrote:
 
 Am 25.06.2013 um 15:11 hat MORITA Kazutaka geschrieben:
  At Tue, 25 Jun 2013 13:39:11 +0200,
  Kevin Wolf wrote:
   
   Hi all,
   
   while discussing some iscsi patches with Peter, we came to have a look
   at which block drivers implement has_zero_init() to return 0, and which
   don't (returning 1 is the default).
   
   The meaning of this value is that if has_zero_init != 0, after
   bdrv_create() one can assume that the whole image would read back as all
   zero. For example, this is true for the traditional image files, but not
   for host_device, where the block device isn't really created during
   bdrv_create() but only checked for size.
   
   The full list of protocol level block drivers is:
   
   * blkdebug  - doesn't have bdrv_create
   * blkverify - doesn't have bdrv_create
   * curl  - doesn't have bdrv_create
   * gluster   - currently has_zero_init = 1 (is this correct?)
   * iscsi - has_zero_init = 0
   * nbd   - doesn't have bdrv_create
   * file  - has_zero_init = 1
   * host_*- has_zero_init = 0
   * rbd   - currently has_zero_init = 1 (is this correct?)
   * sheepdog  - currently has_zero_init = 1 (is this correct?)
   * ssh   - currently has_zero_init = 1 (is this correct?)
   * vvfat - doesn't have bdrv_create
   
   Can you please review for the gluster, rbd, sheepdog and ssh driver
   whether it's safe to assume that the image reads back as zeros after
   bdrv_create?
  
  It's safe for Sheepdog.  Sheepdog uses ftruncate or fallocate to
  create data blocks and it is guaranteed that the space will be
  initialized to zero.
 
 Note that ftruncate/fallocate don't zero-initialise block devices, only
 regular files. Not sure if you can use block devices to back Sheepdog
 images?

We cannot do that.  Sheepdog heavily relies on filesystem features and
we will not support block devices for the backend of Sheepdog.

Thanks,

Kazutaka



Re: [Qemu-devel] block: Review of .has_zero_init use

2013-06-25 Thread Kevin Wolf
Am 25.06.2013 um 15:42 hat MORITA Kazutaka geschrieben:
 At Tue, 25 Jun 2013 15:20:18 +0200,
 Kevin Wolf wrote:
  
  Am 25.06.2013 um 15:11 hat MORITA Kazutaka geschrieben:
   At Tue, 25 Jun 2013 13:39:11 +0200,
   Kevin Wolf wrote:

Hi all,

while discussing some iscsi patches with Peter, we came to have a look
at which block drivers implement has_zero_init() to return 0, and which
don't (returning 1 is the default).

The meaning of this value is that if has_zero_init != 0, after
bdrv_create() one can assume that the whole image would read back as all
zero. For example, this is true for the traditional image files, but not
for host_device, where the block device isn't really created during
bdrv_create() but only checked for size.

The full list of protocol level block drivers is:

* blkdebug  - doesn't have bdrv_create
* blkverify - doesn't have bdrv_create
* curl  - doesn't have bdrv_create
* gluster   - currently has_zero_init = 1 (is this correct?)
* iscsi - has_zero_init = 0
* nbd   - doesn't have bdrv_create
* file  - has_zero_init = 1
* host_*- has_zero_init = 0
* rbd   - currently has_zero_init = 1 (is this correct?)
* sheepdog  - currently has_zero_init = 1 (is this correct?)
* ssh   - currently has_zero_init = 1 (is this correct?)
* vvfat - doesn't have bdrv_create

Can you please review for the gluster, rbd, sheepdog and ssh driver
whether it's safe to assume that the image reads back as zeros after
bdrv_create?
   
   It's safe for Sheepdog.  Sheepdog uses ftruncate or fallocate to
   create data blocks and it is guaranteed that the space will be
   initialized to zero.
  
  Note that ftruncate/fallocate don't zero-initialise block devices, only
  regular files. Not sure if you can use block devices to back Sheepdog
  images?
 
 We cannot do that.  Sheepdog heavily relies on filesystem features and
 we will not support block devices for the backend of Sheepdog.

Okay, so Sheepdog is indeed safe. Thanks, Kazutaka!

Kevin



Re: [Qemu-devel] block: Review of .has_zero_init use

2013-06-25 Thread Josh Durgin

On 06/25/2013 04:39 AM, Kevin Wolf wrote:

Can you please review for the gluster, rbd, sheepdog and ssh driver
whether it's safe to assume that the image reads back as zeros after
bdrv_create?


This is always safe for rbd.




Re: [Qemu-devel] block: Review of .has_zero_init use

2013-06-25 Thread Bharata B Rao
On Tue, Jun 25, 2013 at 01:39:11PM +0200, Kevin Wolf wrote:
 
 Can you please review for the gluster, rbd, sheepdog and ssh driver
 whether it's safe to assume that the image reads back as zeros after
 bdrv_create?

Gluster supports both file and block backends. While the above is true for
file backend (which uses ftruncate), the same is not true for
block backend (which uses lvcreate  lvresize).

So overall it is not safe to assume that an image on GlusterFS volume
reads back as zeroes after create.

Regards,
Bharata.