On Tue, 09/19 19:00, John Snow wrote:
>
>
> On 09/19/2017 04:18 PM, Eric Blake wrote:
> > We've previously fixed several places where we failed to account
> > for possible errors from bdrv_nb_sectors(). Fix another one by
> > making bdrv_dirty_bitmap_truncate() take the new size from the
> >
On 09/19/2017 04:18 PM, Eric Blake wrote:
> We've previously fixed several places where we failed to account
> for possible errors from bdrv_nb_sectors(). Fix another one by
> making bdrv_dirty_bitmap_truncate() take the new size from the
> caller instead of querying itself; then adjust the
Now that all callers are using byte-based interfaces, there's no
reason for our internal hbitmap to remain with sector-based
granularity. It also simplifies our internal scaling, since we
already know that hbitmap widens requests out to granularity
boundaries.
Signed-off-by: Eric Blake
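The widening behaviour mentioned above can be sketched like this (a minimal illustration with made-up helper names, not QEMU's actual hbitmap code): a request touching any part of a granularity-sized chunk dirties the whole chunk, so passing an exact byte offset is safe.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch only, not QEMU's hbitmap API.  A dirty bitmap
 * with a power-of-two granularity (e.g. 64 KiB) tracks whole chunks:
 * any byte range is widened out to the chunk boundaries it touches. */
static void widen_to_granularity(uint64_t offset, uint64_t bytes,
                                 uint64_t granularity,
                                 uint64_t *first, uint64_t *last)
{
    *first = offset / granularity;               /* chunk of first byte */
    *last = (offset + bytes - 1) / granularity;  /* chunk of last byte  */
}
```

Because of this widening, an unaligned (offset, bytes) pair dirties exactly the same chunks as the sector-rounded values the old code passed.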
Both callers already had bytes available, but were scaling to
sectors. Move the scaling to internal code. In the case of
bdrv_aligned_pwritev(), we are now passing the exact offset
rather than a rounded sector-aligned value, but that's okay
as long as dirty bitmap widens start/bytes to
Now that we have adjusted the majority of the calls this function
makes to be byte-based, it is easier to read the code if it makes
passes over the image using bytes rather than sectors.
Signed-off-by: Eric Blake
Reviewed-by: John Snow
Reviewed-by: Kevin
Some of the callers were already scaling bytes to sectors; others
can be easily converted to pass byte offsets, all in our shift
towards a consistent byte interface everywhere. Making the change
will also make it easier to rewrite the hold-out callers to use bytes
rather than sectors for their
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Change the qcow2 bitmap
helper function sectors_covered_by_bitmap_cluster(), renaming it
to bytes_covered_by_bitmap_cluster() in the process.
Signed-off-by: Eric Blake
Half the callers were already scaling bytes to sectors; the other
half can eventually be simplified to use byte iteration. Both
callers were already using the result as a bool, so make that
explicit. Making the change also makes it easier for a future
dirty-bitmap patch to offload scaling over
Thanks to recent cleanups, most callers were scaling a return value
of sectors into bytes (the exception, in qcow2-bitmap, will be
converted to byte-based iteration later). Update the interface to
do the scaling internally instead.
In qcow2-bitmap, the code was specifically checking for an error
Right now, the dirty-bitmap code exposes the fact that we use
a scale of sector granularity in the underlying hbitmap to anything
that wants to serialize a dirty bitmap. It's nicer to uniformly
expose bytes as our dirty-bitmap interface, matching the previous
change to bitmap size. The only
We're already reporting bytes for bdrv_dirty_bitmap_granularity();
mixing bytes and sectors in our return values is a recipe for
confusion. A later cleanup will convert dirty bitmap internals
to be entirely byte-based, but in the meantime, we should report
the bitmap size in bytes.
The only
Now that we have adjusted the majority of the calls this function
makes to be byte-based, it is easier to read the code if it makes
passes over the image using bytes rather than sectors.
iotests 165 was rather weak - on a default 64k-cluster image, where
bitmap granularity also defaults to 64k
Now that we have adjusted the majority of the calls this function
makes to be byte-based, it is easier to read the code if it makes
passes over the image using bytes rather than sectors.
Signed-off-by: Eric Blake
Reviewed-by: John Snow
Reviewed-by: Vladimir
Thanks to recent cleanups, all callers were scaling a return value
of sectors into bytes; do the scaling internally instead.
Signed-off-by: Eric Blake
Reviewed-by: John Snow
Reviewed-by: Kevin Wolf
Reviewed-by: Fam Zheng
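The internal scaling these patches move around boils down to shifting by QEMU's 512-byte sector size; a minimal sketch (the constants mirror QEMU's convention, the helper names are illustrative):

```c
#include <assert.h>
#include <stdint.h>

#define BDRV_SECTOR_BITS 9                        /* 512-byte sectors */
#define BDRV_SECTOR_SIZE (1LL << BDRV_SECTOR_BITS)

/* Illustrative helpers: a byte-based wrapper can scale a sector-based
 * value internally, so callers never have to deal with sectors. */
static int64_t sectors_to_bytes(int64_t sectors)
{
    return sectors << BDRV_SECTOR_BITS;
}

static int64_t bytes_to_sectors_roundup(int64_t bytes)
{
    return (bytes + BDRV_SECTOR_SIZE - 1) >> BDRV_SECTOR_BITS;
}
```

Moving this shift inside the interface removes one chance per caller to get the direction or the rounding wrong.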
This is new code, but it is easier to read if it makes passes over
the image using bytes rather than sectors (and will get easier in
the future when bdrv_get_block_status is converted to byte-based).
Signed-off-by: Eric Blake
Reviewed-by: John Snow
All callers to bdrv_dirty_iter_new() passed 0 for their initial
starting point; drop that parameter.
Most callers to bdrv_set_dirty_iter() were scaling a byte offset to
a sector number; the exception qcow2-bitmap will be converted later
to use byte rather than sector iteration. Move the scaling
We are still using an internal hbitmap that tracks a size in sectors,
with the granularity scaled down accordingly, because it lets us
use a shortcut for our iterators which are currently sector-based.
But there's no reason we can't track the dirty bitmap size in bytes,
since it is (mostly) an
All callers of bdrv_img_create() pass in a size, or -1 to read the
size from the backing file. We then set that size as the QemuOpt
default, which means we will reuse that default rather than the
final parameter to qemu_opt_get_size() several lines later. But
it is rather confusing to read
There are patches floating around to add NBD_CMD_BLOCK_STATUS,
but NBD wants to report status on byte granularity (even if the
reporting will probably be naturally aligned to sectors or even
much higher levels). I've therefore started the task of
converting our block status code to report at a
We've previously fixed several places where we failed to account
for possible errors from bdrv_nb_sectors(). Fix another one by
making bdrv_dirty_bitmap_truncate() take the new size from the
caller instead of querying itself; then adjust the sole caller
bdrv_truncate() to pass the size just
When subdividing a bitmap serialization, the code in hbitmap.c
enforces that start/count parameters are aligned (except that
count can end early at end-of-bitmap). We exposed this required
alignment through bdrv_dirty_bitmap_serialization_align(), but
forgot to actually check that we comply with
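The alignment rule being enforced can be sketched as follows (illustrative function, not the actual hbitmap.c code): start must be aligned, and count must either be aligned or run exactly to the end of the bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative sketch of the serialization alignment rule described
 * above; names and signature are made up for this example. */
static bool serialization_chunk_ok(uint64_t start, uint64_t count,
                                   uint64_t align, uint64_t bitmap_size)
{
    if (start % align) {
        return false;                      /* start must be aligned */
    }
    /* count may be short only when it reaches end-of-bitmap */
    return (count % align == 0) || (start + count == bitmap_size);
}
```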
We had several functions that no one is currently using, and which
use sector-based interfaces. I'm trying to convert towards byte-based
interfaces, so it's easier to just drop the unused functions:
bdrv_dirty_bitmap_get_meta
bdrv_dirty_bitmap_get_meta_locked
bdrv_dirty_bitmap_reset_meta
The only client of hbitmap_serialization_granularity() is dirty-bitmap's
bdrv_dirty_bitmap_serialization_align(). Keeping the two names consistent
is worthwhile, and the shorter name is more representative of what the
function returns (the required alignment to be used for start/count of
other
On 09/19/2017 07:44 AM, Kevin Wolf wrote:
> Am 18.09.2017 um 20:58 hat Eric Blake geschrieben:
>> Now that we have adjusted the majority of the calls this function
>> makes to be byte-based, it is easier to read the code if it makes
>> passes over the image using bytes rather than sectors.
>>
>>
On 19.09.2017 21:01, John Snow wrote:
>
>
> On 09/19/2017 04:06 AM, Paolo Bonzini wrote:
>> On 13/09/2017 21:08, John Snow wrote:
>>>
>>>
>>> On 09/13/2017 06:21 AM, Thomas Huth wrote:
Remove the unnecessary home-grown redefinition of the assert() macro here,
and remove the unusable
* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> ping for 1-3
> Can we merge them?
I see all of them have R-b's; so let's try to put them in the next
migration merge.
Quintela: Sound good?
Dave
> 22.08.2017 02:34, John Snow wrote:
> >
> > On 07/11/2017 09:38 AM, Vladimir
On 19.09.2017 20:54, John Snow wrote:
>
>
> On 09/19/2017 04:14 AM, Thomas Huth wrote:
>> On 19.09.2017 10:06, Paolo Bonzini wrote:
>>> On 13/09/2017 21:08, John Snow wrote:
>> [...]
Farewell, bitrot code.
Reviewed-by: John Snow
Out of curiosity, I
On 09/19/2017 04:06 AM, Paolo Bonzini wrote:
> On 13/09/2017 21:08, John Snow wrote:
>>
>>
>> On 09/13/2017 06:21 AM, Thomas Huth wrote:
>>> Remove the unnecessary home-grown redefinition of the assert() macro here,
>>> and remove the unusable debug code at the end of the checkpoint() function.
On 09/19/2017 04:14 AM, Thomas Huth wrote:
> On 19.09.2017 10:06, Paolo Bonzini wrote:
>> On 13/09/2017 21:08, John Snow wrote:
> [...]
>>> Farewell, bitrot code.
>>>
>>> Reviewed-by: John Snow
>>>
>>> Out of curiosity, I wonder ...
>>>
>>> jhuston@probe (foobar) ~/s/qemu> git
On Tue, Sep 19, 2017 at 11:07:53AM -0500, Eric Blake wrote:
> On 09/19/2017 10:50 AM, Stefan Hajnoczi wrote:
> > Clear tg->any_timer_armed[] when throttling timers are destroy during
>
> s/destroy/destroyed/
Hello,
First post here, so maybe I should introduce myself :
- I'm a sysadmin for decades and currently managing 4 oVirt clusters,
made out of tens of hypervisors, all are CentOS 7.2+ based.
- I'm very happy with this solution we choose especially because it is
based on qemu-kvm (open source,
On 09/19/2017 10:50 AM, Stefan Hajnoczi wrote:
> Clear tg->any_timer_armed[] when throttling timers are destroy during
s/destroy/destroyed/
> AioContext attach/detach. Failure to do so causes throttling to hang
> because we believe the timer is already scheduled!
>
> The following was broken
On 09/19/2017 10:46 AM, Vladimir Sementsov-Ogievskiy wrote:
> Hi Fam!
>
> I have a question about your image locking series:
>
> Can you please explain, why OFD locking is enabled by default and posix
> locking is not? What is wrong with posix locking, what are the problems?
POSIX locking
On Tue, Sep 19, 2017 at 06:46:19PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Hi Fam!
>
> I have a question about your image locking series:
>
> Can you please explain, why OFD locking is enabled by default and posix
> locking is not? What is wrong with posix locking, what are the problems?
Clear tg->any_timer_armed[] when throttling timers are destroyed during
AioContext attach/detach. Failure to do so causes throttling to hang
because we believe the timer is already scheduled!
The following was broken at least since QEMU 2.10.0 with -drive
iops=100:
$ dd if=/dev/zero of=/dev/vdb
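A heavily simplified sketch of the bug's shape (hypothetical types and names, not QEMU's real throttling code): if the destroy path forgets to clear the armed flag, no new timer is ever scheduled and throttled I/O stalls forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Heavily simplified, hypothetical sketch of the idea in the patch. */
typedef struct ThrottleGroup {
    bool any_timer_armed[2];   /* one flag per direction (read/write) */
} ThrottleGroup;

static void timer_destroy_during_detach(ThrottleGroup *tg, int dir)
{
    /* ... timer object torn down here ... */
    tg->any_timer_armed[dir] = false;  /* the fix: forget the armed
                                        * state along with the timer,
                                        * or the group hangs believing
                                        * a timer is still scheduled */
}

static bool try_arm_timer(ThrottleGroup *tg, int dir)
{
    if (tg->any_timer_armed[dir]) {
        return false;          /* an already-armed timer will fire */
    }
    tg->any_timer_armed[dir] = true;
    return true;
}
```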
Hi Fam!
I have a question about your image locking series:
Can you please explain why OFD locking is enabled by default and POSIX
locking is not? What is wrong with POSIX locking, what are the problems?
--
Best regards,
Vladimir
Hi everyone,
over the past few weeks I have been testing the effects of reducing
the size of the entries in the qcow2 L2 cache. This was briefly
mentioned by Denis in the same thread where we discussed subcluster
allocation back in April, but I'll describe here the problem and the
proposal in
On 09/19/2017 06:07 PM, Alberto Garcia wrote:
> Hi everyone,
>
> over the past few weeks I have been testing the effects of reducing
> the size of the entries in the qcow2 L2 cache. This was briefly
> mentioned by Denis in the same thread where we discussed subcluster
> allocation back in April,
On Mon, 09/18 13:57, Eric Blake wrote:
> There are patches floating around to add NBD_CMD_BLOCK_STATUS,
> but NBD wants to report status on byte granularity (even if the
> reporting will probably be naturally aligned to sectors or even
> much higher levels). I've therefore started the task of
>
On 19/09/2017 15:36, Peter Lieven wrote:
> Hi,
>
> I just noticed that CPU throttling and Block Migration don't work
> together very well.
> During block migration the throttling heuristic detects that we
> obviously make no progress
> in ram transfer. But the reason is the running block
* Peter Lieven (p...@kamp.de) wrote:
> Am 19.09.2017 um 16:38 schrieb Dr. David Alan Gilbert:
> > * Peter Lieven (p...@kamp.de) wrote:
> > > Hi,
> > >
> > > I just noticed that CPU throttling and Block Migration don't work
> > > together very well.
> > > During block migration the throttling
Am 19.09.2017 um 16:38 schrieb Dr. David Alan Gilbert:
* Peter Lieven (p...@kamp.de) wrote:
Hi,
I just noticed that CPU throttling and Block Migration don't work
together very well. During block migration the throttling heuristic
detects that we obviously make no progress in ram transfer.
* Peter Lieven (p...@kamp.de) wrote:
> Hi,
>
> I just noticed that CPU throttling and Block Migration don't work together
> very well.
> During block migration the throttling heuristic detects that we obviously
> make no progress
> in ram transfer. But the reason is the running block migration
On 19/09/2017 15:26, Daniel P. Berrange wrote:
> On Tue, Sep 19, 2017 at 03:23:09PM +0200, Paolo Bonzini wrote:
>> On 19/09/2017 15:12, Daniel P. Berrange wrote:
>>> On Tue, Sep 19, 2017 at 02:57:00PM +0200, Paolo Bonzini wrote:
On 19/09/2017 14:53, Daniel P. Berrange wrote:
>> +/*
On 19/09/2017 15:33, Daniel P. Berrange wrote:
> On Tue, Sep 19, 2017 at 12:24:32PM +0200, Paolo Bonzini wrote:
>> Introduce a privileged helper to run persistent reservation commands.
>> This lets virtual machines send persistent reservations without using
>> CAP_SYS_RAWIO or out-of-tree patches.
On Tue, Sep 19, 2017 at 12:24:32PM +0200, Paolo Bonzini wrote:
> Introduce a privileged helper to run persistent reservation commands.
> This lets virtual machines send persistent reservations without using
> CAP_SYS_RAWIO or out-of-tree patches. The helper uses Unix permissions
> and SCM_RIGHTS
On 09/19/2017 05:00 AM, Vladimir Sementsov-Ogievskiy wrote:
> 19.09.2017 01:36, Eric Blake wrote:
>> On 09/18/2017 08:59 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> Refactor nbd client to not yield from nbd_read_reply_entry. It's
>>> possible now as all reading is done in nbd_read_reply_entry and
Hi,
I just noticed that CPU throttling and Block Migration don't work
together very well. During block migration the throttling heuristic
detects that we obviously make no progress in ram transfer. But the
reason is the running block migration and not a too high dirty pages
rate. The result
On 19/09/2017 15:12, Daniel P. Berrange wrote:
> On Tue, Sep 19, 2017 at 02:57:00PM +0200, Paolo Bonzini wrote:
>> On 19/09/2017 14:53, Daniel P. Berrange wrote:
+/* Try to reconnect while sending the CDB. */
+for (attempts = 0; attempts < PR_MAX_RECONNECT_ATTEMPTS; attempts++)
On 19/09/2017 14:53, Daniel P. Berrange wrote:
>> +/* Try to reconnect while sending the CDB. */
>> +for (attempts = 0; attempts < PR_MAX_RECONNECT_ATTEMPTS; attempts++) {
>
> I'm curious why you need to loop here. The helper daemon should be running
> already, as you're not spawning it
On Tue, Sep 19, 2017 at 12:24:34PM +0200, Paolo Bonzini wrote:
> This adds a concrete subclass of pr-manager that talks to qemu-pr-helper.
>
> Signed-off-by: Paolo Bonzini
> ---
> v1->v2: fixed string property double-free
> fixed/cleaned up error handling
>
On Tue, Sep 19, 2017 at 02:57:00PM +0200, Paolo Bonzini wrote:
> On 19/09/2017 14:53, Daniel P. Berrange wrote:
> >> +/* Try to reconnect while sending the CDB. */
> >> +for (attempts = 0; attempts < PR_MAX_RECONNECT_ATTEMPTS; attempts++) {
> >
> > I'm curious why you need to loop here.
On Tue, Sep 19, 2017 at 03:23:09PM +0200, Paolo Bonzini wrote:
> On 19/09/2017 15:12, Daniel P. Berrange wrote:
> > On Tue, Sep 19, 2017 at 02:57:00PM +0200, Paolo Bonzini wrote:
> >> On 19/09/2017 14:53, Daniel P. Berrange wrote:
> +/* Try to reconnect while sending the CDB. */
> +
On 19/09/2017 13:03, Vladimir Sementsov-Ogievskiy wrote:
>>>
>> I disagree that it is easier to extend it in the future. If some
>> commands in the future need a different "how do we read it" (e.g. for
>> structured reply), nbd_read_reply_entry may not have all the information
>> it needs
On Tue, Sep 19, 2017 at 12:24:34PM +0200, Paolo Bonzini wrote:
> This adds a concrete subclass of pr-manager that talks to qemu-pr-helper.
>
> Signed-off-by: Paolo Bonzini
> ---
> v1->v2: fixed string property double-free
> fixed/cleaned up error handling
>
Am 18.09.2017 um 20:57 hat Eric Blake geschrieben:
> There are patches floating around to add NBD_CMD_BLOCK_STATUS,
> but NBD wants to report status on byte granularity (even if the
> reporting will probably be naturally aligned to sectors or even
> much higher levels). I've therefore started the
Am 18.09.2017 um 20:58 hat Eric Blake geschrieben:
> Now that we have adjusted the majority of the calls this function
> makes to be byte-based, it is easier to read the code if it makes
> passes over the image using bytes rather than sectors.
>
> Signed-off-by: Eric Blake
>
19.09.2017 13:03, Paolo Bonzini wrote:
On 19/09/2017 11:43, Vladimir Sementsov-Ogievskiy wrote:
I'm trying to look forward to structured reads, where we will have to
deal with more than one server message in reply to a single client
request. For read, we just piece together portions of the
19.09.2017 13:01, Paolo Bonzini wrote:
On 19/09/2017 11:25, Vladimir Sementsov-Ogievskiy wrote:
18.09.2017 18:43, Paolo Bonzini wrote:
On 18/09/2017 15:59, Vladimir Sementsov-Ogievskiy wrote:
Read the whole reply in one place - in nbd_read_reply_entry.
Signed-off-by: Vladimir
Hi Stefan,
On 09/19/2017 06:48 AM, Stefan Hajnoczi wrote:
Cc: Keith Busch
Signed-off-by: Stefan Hajnoczi
---
hw/block/nvme.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 9aa32692a3..513ec7065d 100644
Introduce a privileged helper to run persistent reservation commands.
This lets virtual machines send persistent reservations without using
CAP_SYS_RAWIO or out-of-tree patches. The helper uses Unix permissions
and SCM_RIGHTS to restrict access to processes that can access its socket
and prove
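The SCM_RIGHTS mechanism referred to here - passing an open file descriptor over a Unix socket - looks roughly like this standalone sketch (not code from qemu-pr-helper itself):

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Standalone sketch of SCM_RIGHTS fd-passing, the kernel mechanism a
 * privileged helper can use to receive an already-open device fd from
 * its client.  Not code from qemu-pr-helper itself. */
static int send_fd(int sock, int fd)
{
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    char buf[CMSG_SPACE(sizeof(int))];
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = buf, .msg_controllen = sizeof(buf),
    };
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;               /* fd rides as ancillary data */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    char buf[CMSG_SPACE(sizeof(int))];
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = buf, .msg_controllen = sizeof(buf),
    };
    int fd = -1;
    if (recvmsg(sock, &msg, 0) == 1) {
        struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
        if (cmsg && cmsg->cmsg_type == SCM_RIGHTS) {
            memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
        }
    }
    return fd;                                  /* new fd in the receiver */
}
```

The receiver ends up with its own descriptor for the same open file description, which is what lets the helper verify the client really holds the device open.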
It is a common requirement for virtual machine to send persistent
reservations, but this currently requires either running QEMU with
CAP_SYS_RAWIO, or using out-of-tree patches that let an unprivileged
QEMU bypass Linux's filter on SG_IO commands.
As an alternative mechanism, the next patches
This adds a concrete subclass of pr-manager that talks to qemu-pr-helper.
Signed-off-by: Paolo Bonzini
---
v1->v2: fixed string property double-free
fixed/cleaned up error handling
handle buffer underrun
scsi/Makefile.objs | 2 +-
Proper support of persistent reservation for multipath devices requires
communication with the multipath daemon, so that the reservation is
registered and applied when a path comes up. The device mapper
utilities provide a library to do so; this patch makes qemu-pr-helper.c
detect multipath
SCSI persistent Reservations allow restricting access to block devices
to specific initiators in a shared storage setup. When implementing
clustering of virtual machines, it is a common requirement for virtual
machines to send persistent reservation SCSI commands. However,
the operating system
On 19/09/2017 11:43, Vladimir Sementsov-Ogievskiy wrote:
>>
>> I'm trying to look forward to structured reads, where we will have to
>> deal with more than one server message in reply to a single client
>> request. For read, we just piece together portions of the qiov until
>> the server has sent
On 19/09/2017 11:25, Vladimir Sementsov-Ogievskiy wrote:
> 18.09.2017 18:43, Paolo Bonzini wrote:
>> On 18/09/2017 15:59, Vladimir Sementsov-Ogievskiy wrote:
>>> Read the whole reply in one place - in nbd_read_reply_entry.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy
On Tue, Sep 19, 2017 at 10:44:16AM +0100, Stefan Hajnoczi wrote:
> On Mon, Sep 18, 2017 at 06:26:51PM +0200, Max Reitz wrote:
> > On 2017-09-18 12:06, Stefan Hajnoczi wrote:
> > > On Sat, Sep 16, 2017 at 03:58:01PM +0200, Max Reitz wrote:
> > >> On 2017-09-14 17:57, Stefan Hajnoczi wrote:
> > >>>
Cc: Keith Busch
Signed-off-by: Stefan Hajnoczi
---
hw/block/nvme.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 9aa32692a3..513ec7065d 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1057,6 +1057,7
19.09.2017 01:27, Eric Blake wrote:
On 09/18/2017 08:59 AM, Vladimir Sementsov-Ogievskiy wrote:
Do not continue any operation if s->quit is set in parallel.
Signed-off-by: Vladimir Sementsov-Ogievskiy
---
block/nbd-client.c | 7 +++
1 file changed, 3
On Mon, Sep 18, 2017 at 06:26:51PM +0200, Max Reitz wrote:
> On 2017-09-18 12:06, Stefan Hajnoczi wrote:
> > On Sat, Sep 16, 2017 at 03:58:01PM +0200, Max Reitz wrote:
> >> On 2017-09-14 17:57, Stefan Hajnoczi wrote:
> >>> On Wed, Sep 13, 2017 at 08:19:07PM +0200, Max Reitz wrote:
> This
18.09.2017 18:43, Paolo Bonzini wrote:
On 18/09/2017 15:59, Vladimir Sementsov-Ogievskiy wrote:
Read the whole reply in one place - in nbd_read_reply_entry.
Signed-off-by: Vladimir Sementsov-Ogievskiy
---
block/nbd-client.h | 1 +
block/nbd-client.c | 42
On Mon, 09/18 13:58, Eric Blake wrote:
> We've previously fixed several places where we failed to account
> for possible errors from bdrv_nb_sectors(). Fix another one by
> making bdrv_dirty_bitmap_truncate() take the new size from the
> caller instead of querying itself; then adjust the sole
On Mon, Sep 18, 2017 at 04:46:49PM -0500, Eric Blake wrote:
> If 'bs' is a complex expression, we were only casting the front half
> rather than the full expression. Luckily, none of the callers were
> passing bad arguments, but it's better to be robust up front.
>
> Signed-off-by: Eric Blake
On 19.09.2017 10:06, Paolo Bonzini wrote:
> On 13/09/2017 21:08, John Snow wrote:
[...]
>> Farewell, bitrot code.
>>
>> Reviewed-by: John Snow
>>
>> Out of curiosity, I wonder ...
>>
>> jhuston@probe (foobar) ~/s/qemu> git grep '#if 0' | wc -l
>> 320
>
> $ git grep -c '#if 0' |
On Mon, 09/18 18:53, Max Reitz wrote:
> >> +
> >> +if sync_source_and_target:
> >> +# If source and target should be in sync after the mirror,
> >> +# we have to flush before completion
> >
> > Not sure I understand this requirements, does it apply to libvirt and
On 13/09/2017 21:08, John Snow wrote:
>
>
> On 09/13/2017 06:21 AM, Thomas Huth wrote:
>> Remove the unnecessary home-grown redefinition of the assert() macro here,
>> and remove the unusable debug code at the end of the checkpoint() function.
>> The code there uses assert() with side-effects
On Mon, 09/18 19:53, Max Reitz wrote:
> On 2017-09-18 08:46, Fam Zheng wrote:
> > On Wed, 09/13 20:19, Max Reitz wrote:
> >> Add a new parameter -B to qemu-io's write command. When used, qemu-io
> >> will not wait for the result of the operation and instead execute it in
> >> the background.
> >
Am 18.09.2017 um 22:25 hat Manos Pitsidianakis geschrieben:
> RestartData is the opaque data of the throttle_group_restart_queue_entry
> coroutine. By being stack allocated, it isn't available anymore if
> aio_co_enter schedules the coroutine with a bottom half and runs after
>