Re: [Qemu-devel] KVM call agenda for October 25

2011-10-26 Thread Kevin Wolf
On 25.10.2011 16:06, Anthony Liguori wrote:
 On 10/25/2011 08:56 AM, Kevin Wolf wrote:
 On 25.10.2011 15:05, Anthony Liguori wrote:
 I'd be much more open to changing the default mode to cache=none FWIW since
 the risk of data loss there is much, much lower.

 I think people said that they'd rather not have cache=none as default
 because O_DIRECT doesn't work everywhere.
 
 Where doesn't it work these days?  I know it doesn't work on tmpfs.  I know
 it works on ext[234], btrfs, nfs.

Besides file systems (and probably OSes) that don't support O_DIRECT,
there's another case: Our defaults don't work on 4k sector disks today.
You need to explicitly specify the logical_block_size qdev property for
cache=none to work on them.
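
For illustration, getting cache=none working on such a disk today means
something along these lines on the command line (a sketch; assumes a raw
4k host block device and a virtio disk, the device path is made up):

  -drive file=/dev/sdb,if=none,id=disk0,format=raw,cache=none \
  -device virtio-blk-pci,drive=disk0,logical_block_size=4096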

And changing this default isn't trivial as the right value doesn't only
depend on the host disk, but it's also guest visible. The only way out
would be bounce buffers, but I'm not sure that doing that silently is a
good idea...

Kevin



Re: [Qemu-devel] KVM call agenda for October 25

2011-10-26 Thread Markus Armbruster
Kevin Wolf kw...@redhat.com writes:

 On 25.10.2011 16:06, Anthony Liguori wrote:
 On 10/25/2011 08:56 AM, Kevin Wolf wrote:
 On 25.10.2011 15:05, Anthony Liguori wrote:
 I'd be much more open to changing the default mode to cache=none FWIW
 since the risk of data loss there is much, much lower.

 I think people said that they'd rather not have cache=none as default
 because O_DIRECT doesn't work everywhere.
 
 Where doesn't it work these days?  I know it doesn't work on tmpfs.  I know
 it works on ext[234], btrfs, nfs.

 Besides file systems (and probably OSes) that don't support O_DIRECT,
 there's another case: Our defaults don't work on 4k sector disks today.
 You need to explicitly specify the logical_block_size qdev property for
 cache=none to work on them.

 And changing this default isn't trivial as the right value doesn't only
 depend on the host disk, but it's also guest visible. The only way out
 would be bounce buffers, but I'm not sure that doing that silently is a
 good idea...

Sector size is a device property.

If the user asks for a 4K sector disk, and the backend can't support
that, we need to reject the configuration.  Just like we reject
read-only backends for read/write disks.

If the backend can only support it by using bounce buffers, I'd say
reject it unless the user explicitly permits bounce buffers.  But that's
debatable.
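
A sketch of what such a check could look like (hypothetical names, not
actual QEMU code, mirroring how the read-only check rejects bad configs):

  if (guest_block_size % host_block_size != 0 && !allow_bounce_buffers) {
      error_report("backend cannot provide %u-byte sectors without "
                   "bounce buffers", guest_block_size);
      return -EINVAL;
  }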

It's okay to default device properties to some backend-dependent value,
if that improves usability.



Re: [Qemu-devel] KVM call agenda for October 25

2011-10-26 Thread Paolo Bonzini

On 10/26/2011 10:48 AM, Markus Armbruster wrote:

Sector size is a device property.

If the user asks for a 4K sector disk, and the backend can't support
that, we need to reject the configuration.  Just like we reject
read-only backends for read/write disks.


Isn't it the other way round, i.e. the user asks for a 512-byte sector 
disk (i.e. the default) with cache=none but the disk has 4k sectors? 
We're basically saying choose between NFS and migration if you have 4k 
sector disks but your guest doesn't support them.  Understandable 
perhaps, but not exactly kind; virtualization is also about 
shielding guests from this kind of hardware dependency, even at the cost 
of performance.  QEMU should just warn about performance degradations; 
erroring out would be a policy decision that should be up to management.



It's okay to default device properties to some backend-dependent value,
if that improves usability.


On the other hand, not all guests support 4k sectors properly.

Paolo



Re: [Qemu-devel] KVM call agenda for October 25

2011-10-26 Thread Daniel P. Berrange
On Wed, Oct 26, 2011 at 10:48:12AM +0200, Markus Armbruster wrote:
 Kevin Wolf kw...@redhat.com writes:
 
  On 25.10.2011 16:06, Anthony Liguori wrote:
  On 10/25/2011 08:56 AM, Kevin Wolf wrote:
  On 25.10.2011 15:05, Anthony Liguori wrote:
  I'd be much more open to changing the default mode to cache=none FWIW
  since the risk of data loss there is much, much lower.
 
  I think people said that they'd rather not have cache=none as default
  because O_DIRECT doesn't work everywhere.
  
  Where doesn't it work these days?  I know it doesn't work on tmpfs.  I know
  it works on ext[234], btrfs, nfs.
 
  Besides file systems (and probably OSes) that don't support O_DIRECT,
  there's another case: Our defaults don't work on 4k sector disks today.
  You need to explicitly specify the logical_block_size qdev property for
  cache=none to work on them.
 
  And changing this default isn't trivial as the right value doesn't only
  depend on the host disk, but it's also guest visible. The only way out
  would be bounce buffers, but I'm not sure that doing that silently is a
  good idea...
 
 Sector size is a device property.
 
 If the user asks for a 4K sector disk, and the backend can't support
 that, we need to reject the configuration.  Just like we reject
 read-only backends for read/write disks.

I don't see why we need to reject a guest disk with 4k sectors,
just because the host disk only has 512 byte sectors. A guest
sector size that's a larger multiple of host sector size should
work just fine. It just means any guest sector write will update
8 host sectors at a time. We only have problems if guest sector
size is not a multiple of host sector size, in which case bounce
buffers are the only option (other than rejecting the config
which is not too nice).
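
In code form the rule is plain divisibility; a minimal sketch (not QEMU
code):

  #include <stdbool.h>
  #include <stdint.h>

  /* A guest sector that is a whole multiple of the host sector can be
   * written directly with O_DIRECT; e.g. one 4k guest write updates
   * eight 512-byte host sectors. Anything else needs bounce buffers. */
  static bool needs_bounce_buffers(uint32_t guest_bs, uint32_t host_bs)
  {
      return guest_bs % host_bs != 0;
  }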

IIUC, current QEMU behaviour is

            Guest 512    Guest 4k
 Host 512   * OK         OK
 Host 4k    * I/O Err    OK

'*' marks defaults

IMHO, QEMU needs to work without I/O errors in all of these
combinations, even if this means having to use bounce buffers
in some of them. That said, IMHO the default should be for
QEMU to avoid bounce buffers, which implies it should either
choose guest sector size to match host sector size, or it
should unconditionally use a 4k guest. IMHO we need the former:

            Guest 512    Guest 4k
 Host 512   * OK         OK
 Host 4k    OK           * OK


Yes, I know there are other weird sector sizes besides 512
and 4k, but the same general principles apply: either one is
a multiple of the other, or you need to use bounce buffers.

 If the backend can only support it by using bounce buffers, I'd say
 reject it unless the user explicitly permits bounce buffers.  But that's
 debatable.

I don't think it really adds value for QEMU to force the user to specify
some extra magic flag in order to make the user's requested config
actually be honoured. If a config needs bounce buffers, QEMU should just
do it, without needing 'use-bounce-buffers=1'. A higher level mgmt app
is in a better position to inform users about the consequences.


Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] KVM call agenda for October 25

2011-10-26 Thread Markus Armbruster
Paolo Bonzini pbonz...@redhat.com writes:

 On 10/26/2011 10:48 AM, Markus Armbruster wrote:
 Sector size is a device property.

 If the user asks for a 4K sector disk, and the backend can't support
 that, we need to reject the configuration.  Just like we reject
 read-only backends for read/write disks.

 Isn't it the other way round, i.e. the user asks for a 512-byte sector
 disk (i.e. the default) with cache=none but the disk has 4k sectors?

Let me rephrase: If the user asks for a FOO disk, and the backend can't
support that, we need to reject the configuration.  Just like we reject
read-only backends for read/write disks.

 We're basically saying choose between NFS and migration if you have
 4k sector disks but your guest doesn't support them.  Understandable
 perhaps, but not exactly kind; virtualization is also about
 shielding guests from this kind of hardware dependency, even at the
 cost of performance.  QEMU should just warn about performance
 degradations; erroring out would be a policy decision that should be
 up to management.

I don't have strong opinions on that.

 It's okay to default device properties to some backend-dependent value,
 if that improves usability.

 On the other hand, not all guests support 4k-sectors properly.

You can't pick perfect defaults for all conceivable guests.  Life's
tough.



Re: [Qemu-devel] KVM call agenda for October 25

2011-10-26 Thread Kevin Wolf
On 26.10.2011 11:57, Daniel P. Berrange wrote:
 On Wed, Oct 26, 2011 at 10:48:12AM +0200, Markus Armbruster wrote:
 Kevin Wolf kw...@redhat.com writes:

 On 25.10.2011 16:06, Anthony Liguori wrote:
 On 10/25/2011 08:56 AM, Kevin Wolf wrote:
 On 25.10.2011 15:05, Anthony Liguori wrote:
 I'd be much more open to changing the default mode to cache=none FWIW
 since the risk of data loss there is much, much lower.

 I think people said that they'd rather not have cache=none as default
 because O_DIRECT doesn't work everywhere.

 Where doesn't it work these days?  I know it doesn't work on tmpfs.  I know
 it works on ext[234], btrfs, nfs.

 Besides file systems (and probably OSes) that don't support O_DIRECT,
 there's another case: Our defaults don't work on 4k sector disks today.
 You need to explicitly specify the logical_block_size qdev property for
 cache=none to work on them.

 And changing this default isn't trivial as the right value doesn't only
 depend on the host disk, but it's also guest visible. The only way out
 would be bounce buffers, but I'm not sure that doing that silently is a
 good idea...

 Sector size is a device property.

 If the user asks for a 4K sector disk, and the backend can't support
 that, we need to reject the configuration.  Just like we reject
 read-only backends for read/write disks.
 
 I don't see why we need to reject a guest disk with 4k sectors,
 just because the host disk only has 512 byte sectors. A guest
 sector size that's a larger multiple of host sector size should
 work just fine. It just means any guest sector write will update
 8 host sectors at a time. We only have problems if guest sector
 size is not a multiple of host sector size, in which case bounce
 buffers are the only option (other than rejecting the config
 which is not too nice).
 
 IIUC, current QEMU behaviour is
 
             Guest 512    Guest 4k
  Host 512   * OK         OK
  Host 4k    * I/O Err    OK
 
 '*' marks defaults
 
 IMHO, QEMU needs to work without I/O errors in all of these
 combinations, even if this means having to use bounce buffers
 in some of them. That said, IMHO the default should be for
 QEMU to avoid bounce buffers, which implies it should either
 choose guest sector size to match host sector size, or it
 should unconditionally use a 4k guest. IMHO we need the former:
 
             Guest 512    Guest 4k
  Host 512   * OK         OK
  Host 4k    OK           * OK

I'm not sure if a 4k host should imply a 4k guest by default. This means
that some guests wouldn't be able to run on a 4k host. On the other
hand, for those guests that can do 4k, it would be the much better option.

So I think this decision is the hard thing about it.

 Yes, I know there are other weird sector sizes besides 512
 and 4k, but the same general principles apply: either one is
 a multiple of the other, or you need to use bounce buffers.
 
 If the backend can only support it by using bounce buffers, I'd say
 reject it unless the user explicitly permits bounce buffers.  But that's
 debatable.
 
 I don't think it really adds value for QEMU to force the user to specify
 some extra magic flag in order to make the user's requested config
 actually be honoured. 

The user's requested config is often enough something like -hda
foo.img. Give me a working disk, I don't care how you do it. (And of
course I don't tell you what sector sizes my guest can cope with.)

 If a config needs bounce buffers, QEMU should just
 do it, without needing 'use-bounce-buffers=1'. A higher level mgmt app
 is in a better position to inform users about the consequences.

A higher level management app doesn't exist in the general case.

Kevin



Re: [Qemu-devel] KVM call agenda for October 25

2011-10-26 Thread Daniel P. Berrange
On Wed, Oct 26, 2011 at 01:23:05PM +0200, Kevin Wolf wrote:
 On 26.10.2011 11:57, Daniel P. Berrange wrote:
  On Wed, Oct 26, 2011 at 10:48:12AM +0200, Markus Armbruster wrote:
  Kevin Wolf kw...@redhat.com writes:
 
  On 25.10.2011 16:06, Anthony Liguori wrote:
  On 10/25/2011 08:56 AM, Kevin Wolf wrote:
  On 25.10.2011 15:05, Anthony Liguori wrote:
  I'd be much more open to changing the default mode to cache=none FWIW
  since the risk of data loss there is much, much lower.
 
  I think people said that they'd rather not have cache=none as default
  because O_DIRECT doesn't work everywhere.
 
  Where doesn't it work these days?  I know it doesn't work on tmpfs.  I know
  it works on ext[234], btrfs, nfs.
 
  Besides file systems (and probably OSes) that don't support O_DIRECT,
  there's another case: Our defaults don't work on 4k sector disks today.
  You need to explicitly specify the logical_block_size qdev property for
  cache=none to work on them.
 
  And changing this default isn't trivial as the right value doesn't only
  depend on the host disk, but it's also guest visible. The only way out
  would be bounce buffers, but I'm not sure that doing that silently is a
  good idea...
 
  Sector size is a device property.
 
  If the user asks for a 4K sector disk, and the backend can't support
  that, we need to reject the configuration.  Just like we reject
  read-only backends for read/write disks.
  
  I don't see why we need to reject a guest disk with 4k sectors,
  just because the host disk only has 512 byte sectors. A guest
  sector size that's a larger multiple of host sector size should
  work just fine. It just means any guest sector write will update
  8 host sectors at a time. We only have problems if guest sector
  size is not a multiple of host sector size, in which case bounce
  buffers are the only option (other than rejecting the config
  which is not too nice).
  
  IIUC, current QEMU behaviour is
  
              Guest 512    Guest 4k
   Host 512   * OK         OK
   Host 4k    * I/O Err    OK
  
  '*' marks defaults
  
  IMHO, QEMU needs to work without I/O errors in all of these
  combinations, even if this means having to use bounce buffers
  in some of them. That said, IMHO the default should be for
  QEMU to avoid bounce buffers, which implies it should either
  choose guest sector size to match host sector size, or it
  should unconditionally use a 4k guest. IMHO we need the former:
  
              Guest 512    Guest 4k
   Host 512   * OK         OK
   Host 4k    OK           * OK
 
 I'm not sure if a 4k host should imply a 4k guest by default. This means
 that some guests wouldn't be able to run on a 4k host. On the other
 hand, for those guests that can do 4k, it would be the much better option.
 
 So I think this decision is the hard thing about it.

I guess it somewhat depends whether we want to strive for

 1. Give the user the fastest working config by default
 2. Give the user a working config by default
 3. Give the user the fastest (possibly broken) config by default

IMHO 3 is not a serious option, but I could see 2 as a reasonable
tradeoff to avoid complexity in choosing QEMU defaults. The user
would have a working config with 512 sectors, but sub-optimal perf
on 4k hosts due to bounce buffering. Ideally libvirt or another
higher-level app would be setting the best block size that a guest
can support by default, so bounce buffers would rarely be needed.
So only people using QEMU directly without setting a block size
would ordinarily suffer the bounce buffer perf hit on a 4k host.

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] KVM call agenda for October 25

2011-10-26 Thread Kevin Wolf
On 26.10.2011 13:39, Daniel P. Berrange wrote:
 On Wed, Oct 26, 2011 at 01:23:05PM +0200, Kevin Wolf wrote:
 On 26.10.2011 11:57, Daniel P. Berrange wrote:
 On Wed, Oct 26, 2011 at 10:48:12AM +0200, Markus Armbruster wrote:
 Kevin Wolf kw...@redhat.com writes:

 On 25.10.2011 16:06, Anthony Liguori wrote:
 On 10/25/2011 08:56 AM, Kevin Wolf wrote:
 On 25.10.2011 15:05, Anthony Liguori wrote:
 I'd be much more open to changing the default mode to cache=none FWIW
 since the risk of data loss there is much, much lower.

 I think people said that they'd rather not have cache=none as default
 because O_DIRECT doesn't work everywhere.

 Where doesn't it work these days?  I know it doesn't work on tmpfs.  I know
 it works on ext[234], btrfs, nfs.

 Besides file systems (and probably OSes) that don't support O_DIRECT,
 there's another case: Our defaults don't work on 4k sector disks today.
 You need to explicitly specify the logical_block_size qdev property for
 cache=none to work on them.

 And changing this default isn't trivial as the right value doesn't only
 depend on the host disk, but it's also guest visible. The only way out
 would be bounce buffers, but I'm not sure that doing that silently is a
 good idea...

 Sector size is a device property.

 If the user asks for a 4K sector disk, and the backend can't support
 that, we need to reject the configuration.  Just like we reject
 read-only backends for read/write disks.

 I don't see why we need to reject a guest disk with 4k sectors,
 just because the host disk only has 512 byte sectors. A guest
 sector size that's a larger multiple of host sector size should
 work just fine. It just means any guest sector write will update
 8 host sectors at a time. We only have problems if guest sector
 size is not a multiple of host sector size, in which case bounce
 buffers are the only option (other than rejecting the config
 which is not too nice).

 IIUC, current QEMU behaviour is

             Guest 512    Guest 4k
  Host 512   * OK         OK
  Host 4k    * I/O Err    OK

 '*' marks defaults

 IMHO, QEMU needs to work without I/O errors in all of these
 combinations, even if this means having to use bounce buffers
 in some of them. That said, IMHO the default should be for
 QEMU to avoid bounce buffers, which implies it should either
 choose guest sector size to match host sector size, or it
 should unconditionally use a 4k guest. IMHO we need the former:

             Guest 512    Guest 4k
  Host 512   * OK         OK
  Host 4k    OK           * OK

 I'm not sure if a 4k host should imply a 4k guest by default. This means
 that some guests wouldn't be able to run on a 4k host. On the other
 hand, for those guests that can do 4k, it would be the much better option.

 So I think this decision is the hard thing about it.
 
 I guess it somewhat depends whether we want to strive for
 
  1. Give the user the fastest working config by default
  2. Give the user a working config by default
  3. Give the user the fastest (possibly broken) config by default
 
 IMHO 3 is not a serious option, but I could see 2 as a reasonable
 tradeoff to avoid complexity in choosing QEMU defaults. The user
 would have a working config with 512 sectors, but sub-optimal perf
 on 4k hosts due to bounce buffering. Ideally libvirt or another
 higher-level app would be setting the best block size that a guest
 can support by default, so bounce buffers would rarely be needed.
 So only people using QEMU directly without setting a block size
 would ordinarily suffer the bounce buffer perf hit on a 4k host.

Yes, I'm currently tending towards this plus a warning on stderr if
bounce buffering is used.

Or, coming back to the original subject of this discussion, we can
default to cache=writeback and forget about alignment. If you specify
cache=none, you have to take care to explicitly specify a block size
greater than 512 bytes, too.

Maybe the best is actually to do both: Default to cache=writeback,
completely avoiding bounce buffers. If the user specifies cache=none,
but doesn't change the sector size of the virtual disk, print a warning
and enable bounce buffers.
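
As pseudo-code, the combined policy would look roughly like this (the
helper names are made up, this is just to pin the proposal down):

  if (!cache_none) {
      /* Default cache=writeback: the host page cache hides alignment. */
      use_buffered_io();
  } else if (guest_block_size % host_block_size == 0) {
      /* User asked for cache=none and the sizes line up: aligned O_DIRECT. */
      use_direct_io();
  } else {
      fprintf(stderr, "warning: sector size mismatch, using bounce buffers\n");
      use_direct_io_with_bounce_buffers();
  }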

Kevin



Re: [Qemu-devel] KVM call agenda for October 25

2011-10-26 Thread Anthony Liguori

On 10/25/2011 10:32 AM, Kevin Wolf wrote:

On 25.10.2011 16:06, Anthony Liguori wrote:

On 10/25/2011 08:56 AM, Kevin Wolf wrote:

On 25.10.2011 15:05, Anthony Liguori wrote:

On 10/25/2011 07:35 AM, Kevin Wolf wrote:

On 24.10.2011 13:35, Paolo Bonzini wrote:

On 10/24/2011 01:04 PM, Juan Quintela wrote:


Hi

Please send in any agenda items you are interested in covering.


- What's left to merge for 1.0.


I would still like to change the default cache mode (probably to
cache=writeback). We don't allow guests to toggle WCE yet, which Anthony
would have liked to see before doing the change. Is it a strict requirement?


I don't see a way around it.  If the default mode is cache=writeback, then we're
open to data corruption in any guest where barrier=0.  With guest togglable WCE,
it ends up being a guest configuration issue so we can more or less defer
responsibility.


So do you think that offering a WCE inside the guest would be a real
solution or just a way to have an excuse?


No, it offers a mechanism to fix mistakes at run time versus at start-up time.


This is true (in both directions). But I think it's independent from the
right default.


   It also means that you can make template images that understand that they
don't support barriers and change the WCE setting appropriately.


Isn't that really a job for management tools?


Christoph said that OSes don't usually change this by themselves, it
would need an administrator manually changing the setting. But if we
require that, we can just as well require that the administrator set
cache=writethrough on the qemu command line.


The administrator of the guest != the administrator of the host.


But the administrator of the guest == the owner of the qemu instance,
no? He should be the one to use the management tools and configure his VMs.


You're really talking about a multi-tenancy virtualization management solution. 
 There really aren't a lot of these today.  The most common variant is an IaaS 
platform where the end-user API is mostly just create a VM, destroy a VM. 
There's not a lot of dynamic configurability (just look at EC2's API).



Do you think it's a good idea to change the default mode w/o guest WCE toggle
support?  What's your view about older guests if we change the default mode?
What's your main motivation for wanting to change the default mode?


Because people are constantly complaining about the awful
(cache=writethrough) performance they get before they are told they
should use a different cache option. And they are right. The
out-of-the-box experience with qemu's block performance really sucks.


With qcow2 you mean, right?


No, with any format, including raw. Which isn't surprising at all,
O_SYNC makes writes very expensive.


I'd be much more open to changing the default mode to cache=none FWIW since the
risk of data loss there is much, much lower.


I think people said that they'd rather not have cache=none as default
because O_DIRECT doesn't work everywhere.


Where doesn't it work these days?  I know it doesn't work on tmpfs.  I know it
works on ext[234], btrfs, nfs.


I think tmpfs was named (and failing to start with default settings on
tmpfs would be nasty enough), but iirc Alex had another one.


Alex?  We can detect tmpfs with fsstat and do the right thing.
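
Presumably via fstatfs(2) and the filesystem magic number; a sketch:

  #include <sys/vfs.h>
  #include <linux/magic.h>   /* TMPFS_MAGIC */
  #include <stdbool.h>

  /* Check whether an open file lives on tmpfs, where O_DIRECT fails. */
  static bool fd_is_on_tmpfs(int fd)
  {
      struct statfs sfs;
      return fstatfs(fd, &sfs) == 0 && sfs.f_type == TMPFS_MAGIC;
  }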

Regards,

Anthony Liguori




Kevin






Re: [Qemu-devel] KVM call agenda for October 25

2011-10-25 Thread Kevin Wolf
On 24.10.2011 13:35, Paolo Bonzini wrote:
 On 10/24/2011 01:04 PM, Juan Quintela wrote:

 Hi

 Please send in any agenda items you are interested in covering.
 
 - What's left to merge for 1.0.

I would still like to change the default cache mode (probably to
cache=writeback). We don't allow guests to toggle WCE yet, which Anthony
would have liked to see before doing the change. Is it a strict requirement?

Kevin



Re: [Qemu-devel] KVM call agenda for October 25

2011-10-25 Thread Anthony Liguori

On 10/25/2011 07:35 AM, Kevin Wolf wrote:

On 24.10.2011 13:35, Paolo Bonzini wrote:

On 10/24/2011 01:04 PM, Juan Quintela wrote:


Hi

Please send in any agenda items you are interested in covering.


- What's left to merge for 1.0.


I would still like to change the default cache mode (probably to
cache=writeback). We don't allow guests to toggle WCE yet, which Anthony
would have liked to see before doing the change. Is it a strict requirement?


I don't see a way around it.  If the default mode is cache=writeback, then we're 
open to data corruption in any guest where barrier=0.  With guest togglable WCE, 
it ends up being a guest configuration issue so we can more or less defer 
responsibility.


Do you think it's a good idea to change the default mode w/o guest WCE toggle 
support?  What's your view about older guests if we change the default mode? 
What's your main motivation for wanting to change the default mode?


I'd be much more open to changing the default mode to cache=none FWIW since the 
risk of data loss there is much, much lower.


Regards,

Anthony Liguori



Kevin






Re: [Qemu-devel] KVM call agenda for October 25

2011-10-25 Thread Dor Laor

On 10/25/2011 03:05 PM, Anthony Liguori wrote:

On 10/25/2011 07:35 AM, Kevin Wolf wrote:

On 24.10.2011 13:35, Paolo Bonzini wrote:

On 10/24/2011 01:04 PM, Juan Quintela wrote:


Hi

Please send in any agenda items you are interested in covering.


- What's left to merge for 1.0.


I would still like to change the default cache mode (probably to
cache=writeback). We don't allow guests to toggle WCE yet, which Anthony
would have liked to see before doing the change. Is it a strict
requirement?


I don't see a way around it. If the default mode is cache=writeback,
then we're open to data corruption in any guest where barrier=0. With
guest togglable WCE, it ends up being a guest configuration issue so we
can more or less defer responsibility.

Do you think it's a good idea to change the default mode w/o guest WCE
toggle support? What's your view about older guests if we change the
default mode? What's your main motivation for wanting to change the
default mode?

I'd be much more open to changing the default mode to cache=none FWIW
since the risk of data loss there is much, much lower.


A bit related to this, it would be nice to mark a VM un-migratable if 
cache!=none. Juan reports that such VMs are currently exposed to data 
integrity issues, so we need to fail migration for them automatically.




Regards,

Anthony Liguori



Kevin









Re: [Qemu-devel] KVM call agenda for October 25

2011-10-25 Thread Anthony Liguori

On 10/25/2011 08:18 AM, Dor Laor wrote:

On 10/25/2011 03:05 PM, Anthony Liguori wrote:

On 10/25/2011 07:35 AM, Kevin Wolf wrote:

On 24.10.2011 13:35, Paolo Bonzini wrote:

On 10/24/2011 01:04 PM, Juan Quintela wrote:


Hi

Please send in any agenda items you are interested in covering.


- What's left to merge for 1.0.


I would still like to change the default cache mode (probably to
cache=writeback). We don't allow guests to toggle WCE yet, which Anthony
would have liked to see before doing the change. Is it a strict
requirement?


I don't see a way around it. If the default mode is cache=writeback,
then we're open to data corruption in any guest where barrier=0. With
guest togglable WCE, it ends up being a guest configuration issue so we
can more or less defer responsibility.

Do you think it's a good idea to change the default mode w/o guest WCE
toggle support? What's your view about older guests if we change the
default mode? What's your main motivation for wanting to change the
default mode?

I'd be much more open to changing the default mode to cache=none FWIW
since the risk of data loss there is much, much lower.


A bit related to this, it would be nice to mark a VM un-migratable if
cache!=none. Juan reports that such VMs are currently exposed to data
integrity issues, so we need to fail migration for them automatically.


That's not correct.  cache!=none is perfectly safe *if* you have coherent shared 
storage.


Regards,

Anthony Liguori





Regards,

Anthony Liguori



Kevin












Re: [Qemu-devel] KVM call agenda for October 25

2011-10-25 Thread Andreas Färber
On 25.10.2011 15:18, Dor Laor wrote:
 [...] it would be nice to mark a VM un-migratable [snip]

Speaking of which, I'm working on the missing migration support for AHCI
but fear I won't quite make it for the Nov 1 deadline.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] KVM call agenda for October 25

2011-10-25 Thread Kevin Wolf
On 25.10.2011 15:05, Anthony Liguori wrote:
 On 10/25/2011 07:35 AM, Kevin Wolf wrote:
 On 24.10.2011 13:35, Paolo Bonzini wrote:
 On 10/24/2011 01:04 PM, Juan Quintela wrote:

 Hi

 Please send in any agenda items you are interested in covering.

 - What's left to merge for 1.0.

 I would still like to change the default cache mode (probably to
 cache=writeback). We don't allow guests to toggle WCE yet, which Anthony
 would have liked to see before doing the change. Is it a strict requirement?
 
 I don't see a way around it.  If the default mode is cache=writeback, then
 we're open to data corruption in any guest where barrier=0.  With guest
 togglable WCE, it ends up being a guest configuration issue so we can more
 or less defer responsibility.

So do you think that offering a WCE inside the guest would be a real
solution or just a way to have an excuse?

Christoph said that OSes don't usually change this by themselves, it
would need an administrator manually changing the setting. But if we
require that, we can just as well require that the administrator set
cache=writethrough on the qemu command line.
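
For reference, the manual change inside a Linux guest would be something
like the following, assuming the emulated disk exposes a toggleable write
cache at all:

  hdparm -W 0 /dev/sda    # turn off the emulated disk's write cache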

 Do you think it's a good idea to change the default mode w/o guest WCE toggle 
 support?  What's your view about older guests if we change the default mode? 
 What's your main motivation for wanting to change the default mode?

Because people are constantly complaining about the awful
(cache=writethrough) performance they get before they are told they
should use a different cache option. And they are right. The
out-of-the-box experience with qemu's block performance really sucks.

 I'd be much more open to changing the default mode to cache=none FWIW since
 the risk of data loss there is much, much lower.

I think people said that they'd rather not have cache=none as default
because O_DIRECT doesn't work everywhere.
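
In principle qemu could probe for it and fall back to buffered I/O; a
sketch (not current behaviour, and it has the same "doing it silently"
problem as bounce buffers):

  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <errno.h>

  /* Try O_DIRECT first; tmpfs and friends reject it with EINVAL. */
  static int open_maybe_direct(const char *path, int flags)
  {
      int fd = open(path, flags | O_DIRECT);
      if (fd < 0 && errno == EINVAL) {
          fd = open(path, flags);   /* fall back to buffered I/O */
      }
      return fd;
  }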

Kevin



Re: [Qemu-devel] KVM call agenda for October 25

2011-10-25 Thread Anthony Liguori

On 10/25/2011 08:56 AM, Kevin Wolf wrote:

On 25.10.2011 15:05, Anthony Liguori wrote:

On 10/25/2011 07:35 AM, Kevin Wolf wrote:

On 24.10.2011 13:35, Paolo Bonzini wrote:

On 10/24/2011 01:04 PM, Juan Quintela wrote:


Hi

Please send in any agenda items you are interested in covering.


- What's left to merge for 1.0.


I would still like to change the default cache mode (probably to
cache=writeback). We don't allow guests to toggle WCE yet, which Anthony
would have liked to see before doing the change. Is it a strict requirement?


I don't see a way around it.  If the default mode is cache=writeback, then we're
open to data corruption in any guest where barrier=0.  With guest togglable WCE,
it ends up being a guest configuration issue so we can more or less defer
responsibility.


So do you think that offering a WCE inside the guest would be a real
solution or just a way to have an excuse?


No, it offers a mechanism to fix mistakes at run time versus at start-up time. 
 It also means that you can make template images that understand that they 
don't support barriers and change the WCE setting appropriately.



Christoph said that OSes don't usually change this by themselves, it
would need an administrator manually changing the setting. But if we
require that, we can just as well require that the administrator set
cache=writethrough on the qemu command line.


The administrator of the guest != the administrator of the host.


Do you think it's a good idea to change the default mode w/o guest WCE toggle
support?  What's your view about older guests if we change the default mode?
What's your main motivation for wanting to change the default mode?


Because people are constantly complaining about the awful
(cache=writethrough) performance they get before they are told they
should use a different cache option. And they are right. The
out-of-the-box experience with qemu's block performance really sucks.


With qcow2 you mean, right?


I'd be much more open to changing the default mode to cache=none FWIW since the
risk of data loss there is much, much lower.


I think people said that they'd rather not have cache=none as default
because O_DIRECT doesn't work everywhere.


Where doesn't it work these days?  I know it doesn't work on tmpfs.  I know it 
works on ext[234], btrfs, nfs.


Regards,

Anthony Liguori


Kevin






Re: [Qemu-devel] KVM call agenda for October 25

2011-10-25 Thread Kevin Wolf
On 25.10.2011 16:06, Anthony Liguori wrote:
 On 10/25/2011 08:56 AM, Kevin Wolf wrote:
 On 25.10.2011 15:05, Anthony Liguori wrote:
 On 10/25/2011 07:35 AM, Kevin Wolf wrote:
 On 24.10.2011 13:35, Paolo Bonzini wrote:
 On 10/24/2011 01:04 PM, Juan Quintela wrote:

 Hi

 Please send in any agenda items you are interested in covering.

 - What's left to merge for 1.0.

 I would still like to change the default cache mode (probably to
 cache=writeback). We don't allow guests to toggle WCE yet, which Anthony
 would have liked to see before doing the change. Is it a strict
 requirement?

 I don't see a way around it.  If the default mode is cache=writeback, then
 we're open to data corruption in any guest where barrier=0.  With guest
 togglable WCE, it ends up being a guest configuration issue so we can more
 or less defer responsibility.

 So do you think that offering a WCE inside the guest would be a real
 solution or just a way to have an excuse?
 
 No, it offers a mechanism to fix mistakes at run time versus at start-up time. 

This is true (in both directions). But I think it's independent from the
right default.

   It also means that you can make template images that understand that they 
 don't support barriers and change the WCE setting appropriately.

Isn't that really a job for management tools?

 Christoph said that OSes don't usually change this by themselves, it
 would need an administrator manually changing the setting. But if we
 require that, we can just as well require that the administrator set
 cache=writethrough on the qemu command line.
 
 The administrator of the guest != the administrator of the host.

But the administrator of the guest == the owner of the qemu instance,
no? He should be the one to use the management tools and configure his VMs.

 Do you think it's a good idea to change the default mode w/o guest WCE
 toggle support?  What's your view about older guests if we change the
 default mode?  What's your main motivation for wanting to change the
 default mode?

 Because people are constantly complaining about the awful
 (cache=writethrough) performance they get before they are told they
 should use a different cache option. And they are right. The
 out-of-the-box experience with qemu's block performance really sucks.
 
 With qcow2 you mean, right?

No, with any format, including raw. Which isn't surprising at all,
O_SYNC makes writes very expensive.

 I'd be much more open to changing the default mode to cache=none FWIW since
 the risk of data loss there is much, much lower.

 I think people said that they'd rather not have cache=none as default
 because O_DIRECT doesn't work everywhere.
 
 Where doesn't it work these days?  I know it doesn't work on tmpfs.  I know
 it works on ext[234], btrfs, nfs.

I think tmpfs was named (and failing to start with default settings on
tmpfs would be nasty enough), but iirc Alex had another one.

Kevin



Re: [Qemu-devel] KVM call agenda for October 25

2011-10-25 Thread Alexander Graf




On 25.10.2011, at 17:32, Kevin Wolf kw...@redhat.com wrote:

 On 25.10.2011 16:06, Anthony Liguori wrote:
 On 10/25/2011 08:56 AM, Kevin Wolf wrote:
 On 25.10.2011 15:05, Anthony Liguori wrote:
 On 10/25/2011 07:35 AM, Kevin Wolf wrote:
 On 24.10.2011 13:35, Paolo Bonzini wrote:
 On 10/24/2011 01:04 PM, Juan Quintela wrote:
 
 Hi
 
 Please send in any agenda items you are interested in covering.
 
 - What's left to merge for 1.0.
 
 I would still like to change the default cache mode (probably to
 cache=writeback). We don't allow guests to toggle WCE yet, which Anthony
 would have liked to see before doing the change. Is it a strict
 requirement?
 
 I don't see a way around it.  If the default mode is cache=writeback, then
 we're open to data corruption in any guest where barrier=0.  With guest
 togglable WCE, it ends up being a guest configuration issue so we can more
 or less defer responsibility.
 
 So do you think that offering a WCE inside the guest would be a real
 solution or just a way to have an excuse?
 
 No, it offers a mechanism to fix mistakes at run time versus at start-up time. 
 
 This is true (in both directions). But I think it's independent from the
 right default.
 
  It also means that you can make template images that understand that they 
 don't support barriers and change the WCE setting appropriately.
 
 Isn't that really a job for management tools?
 
 Christoph said that OSes don't usually change this by themselves, it
 would need an administrator manually changing the setting. But if we
 require that, we can just as well require that the administrator set
 cache=writethrough on the qemu command line.
 
 The administrator of the guest != the administrator of the host.
 
 But the administrator of the guest == the owner of the qemu instance,
 no? He should be the one to use the management tools and configure his VMs.
 
 Do you think it's a good idea to change the default mode w/o guest WCE
 toggle support?  What's your view about older guests if we change the
 default mode?  What's your main motivation for wanting to change the
 default mode?
 
 Because people are constantly complaining about the awful
 (cache=writethrough) performance they get before they are told they
 should use a different cache option. And they are right. The
 out-of-the-box experience with qemu's block performance really sucks.
 
 With qcow2 you mean, right?
 
 No, with any format, including raw. Which isn't surprising at all,
 O_SYNC makes writes very expensive.
 
 I'd be much more open to changing the default mode to cache=none FWIW
 since the risk of data loss there is much, much lower.
 
 I think people said that they'd rather not have cache=none as default
 because O_DIRECT doesn't work everywhere.
 
 Where doesn't it work these days?  I know it doesn't work on tmpfs.  I know
 it works on ext[234], btrfs, nfs.
 
 I think tmpfs was named (and failing to start with default settings on
 tmpfs would be nasty enough), but iirc Alex had another one.

Yeah, IIRC NFS also failed on me :)

Alex

 
 Kevin



Re: [Qemu-devel] KVM call agenda for October 25

2011-10-24 Thread Paolo Bonzini

On 10/24/2011 01:04 PM, Juan Quintela wrote:


Hi

Please send in any agenda items you are interested in covering.


- What's left to merge for 1.0.

- What kind of patch after the end of the freeze

Paolo




Re: [Qemu-devel] KVM call agenda for October 25

2011-10-24 Thread Peter Maydell
On 24 October 2011 12:35, Paolo Bonzini pbonz...@redhat.com wrote:
 On 10/24/2011 01:04 PM, Juan Quintela wrote:
 Please send in any agenda items you are interested in covering.

 - What's left to merge for 1.0.

Things on my list, FWIW:
 * current target-arm pullreq
 * PL041 support (needs another patch round to fix a minor bug
   Andrzej spotted)
 * cpu_single_env must be thread-local

I also think that it's somewhat unfortunate that we will now
compile on ARM hosts but always abort on startup (due
to the reliance on a working makecontext()), and I'm not really
sure how to deal with that one.

-- PMM



Re: [Qemu-devel] KVM call agenda for October 25

2011-10-24 Thread Andreas Färber
On 24.10.2011 14:02, Peter Maydell wrote:
 On 24 October 2011 12:35, Paolo Bonzini pbonz...@redhat.com wrote:
 On 10/24/2011 01:04 PM, Juan Quintela wrote:
 Please send in any agenda items you are interested in covering.

 - What's left to merge for 1.0.

 I also think that it's somewhat unfortunate that we will now
 compile on ARM hosts but always abort on startup (due
 to the reliance on a working makecontext()), and I'm not really
 sure how to deal with that one.

FWIW we're also not working / not building on Darwin ppc+Intel, which is
related to a) softfloat integer types, b) GThread initialization, c)
unknown issues. Bisecting did not work well and I am lacking time and
ideas to investigate and fix this. For softfloat there are several
solutions around, in need of a decision.

Nice to merge would be the Cocoa sheet issue, once verified.

Andreas



Re: [Qemu-devel] KVM call agenda for October 25

2011-10-24 Thread Luiz Capitulino
On Mon, 24 Oct 2011 13:02:05 +0100
Peter Maydell peter.mayd...@linaro.org wrote:

 On 24 October 2011 12:35, Paolo Bonzini pbonz...@redhat.com wrote:
  On 10/24/2011 01:04 PM, Juan Quintela wrote:
  Please send in any agenda items you are interested in covering.
 
  - What's left to merge for 1.0.
 
 Things on my list, FWIW:
  * current target-arm pullreq
  * PL041 support (needs another patch round to fix a minor bug
Andrzej spotted)
  * cpu_single_env must be thread-local

I submitted the second round of QAPI conversions today, which converts all
existing QMP query commands to the QAPI (plus some fixes).

I expect that to make 1.0.