On Mon, 05/11 15:22, John Snow wrote:
>
>
> On 05/06/2015 12:52 AM, Fam Zheng wrote:
> > Unsetting dirty globally with discard is not very correct. The discard may
> > zero out sectors (depending on can_write_zeroes_with_unmap), so we should
> > replicate this change to the destination side to make
Test zero write in byte range 512~1024 for 4k alignment.
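
For illustration only, the geometry of such a request can be sketched as below. The helper names are hypothetical and are not taken from the patch; the sketch only shows how a byte range of 512~1024 relates to a 4k-aligned block.

```c
#include <stdint.h>

/* Hypothetical helpers, not QEMU code. align must be a power of two. */

/* Round x down to the nearest multiple of align. */
static uint64_t align_down(uint64_t x, uint64_t align)
{
    return x & ~(align - 1);
}

/* Round x up to the nearest multiple of align. */
static uint64_t align_up(uint64_t x, uint64_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/* Bytes between the start of the aligned block and the request start. */
static uint64_t head_padding(uint64_t offset, uint64_t align)
{
    return offset - align_down(offset, align);
}

/* Bytes between the request end and the end of the aligned block. */
static uint64_t tail_padding(uint64_t offset, uint64_t bytes, uint64_t align)
{
    uint64_t end = offset + bytes;
    return align_up(end, align) - end;
}
```

With offset 512, length 512 and alignment 4096, the request has 512 bytes of head padding and 3072 bytes of tail padding, so it lies entirely inside one aligned block.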
Signed-off-by: Fam Zheng
Reviewed-by: Stefan Hajnoczi
---
tests/qemu-iotests/033 | 13 +
tests/qemu-iotests/033.out | 30 ++
2 files changed, 43 insertions(+)
diff --git a/tests/qemu-iotests/03
This reverts commit fc3959e4669a1c2149b91ccb05101cfc7ae1fc05.
The core write code already handles the case, so remove this
duplication.
Because commit 61007b316 moved the touched code from block.c to
block/io.c, the change is manually reverted.
Signed-off-by: Fam Zheng
Reviewed-by: Stefan Hajno
An unaligned zero write causes a NULL dereference in bdrv_co_do_pwritev. That
path is reachable from bdrv_co_write_zeroes and bdrv_aio_write_zeroes.
You can easily trigger it through the former with qemu-io, as in the test case
added by 61815d6e0aa. For bdrv_aio_write_zeroes, in common cases there's alway
For zero write, callers pass in NULL qiov (qemu-io "write -z" or
scsi-disk "write same").
Commit fc3959e466 fixed bdrv_co_write_zeroes which is the common case
for this bug, but it still exists in bdrv_aio_write_zeroes. A simpler
fix would be in bdrv_co_do_pwritev which is the NULL dereference poi
I have used the following program to test
#define _GNU_SOURCE
#include
#include
#include
#include
#include
#include
int main(int argc, char *argv[])
{
int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
void *buf;
int i = 0, align = atoi(argv[2]);
do {
buf =
The patch introduces a new concept: minimal memory alignment for bounce
buffers. The original so-called "optimal" value is actually the minimal required
value for alignment. It should be used to validate that the IOVec
is properly aligned and that a bounce buffer is not required.
Though, from the performance poi
The following sequence
int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
for (i = 0; i < 10; i++)
write(fd, buf, 4096);
performs 5% better if buf is aligned to 4096 bytes.
The difference is quite reliable.
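
As a rough sketch of how such an aligned buffer might be obtained (the helper names are hypothetical, and posix_memalign() is used here as one possible allocator, not necessarily the one the test program used):

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate len bytes whose address is a multiple of
 * align. posix_memalign() requires align to be a power of two and a
 * multiple of sizeof(void *). Returns NULL on failure.
 */
static void *alloc_aligned(size_t align, size_t len)
{
    void *buf = NULL;

    if (posix_memalign(&buf, align, len) != 0) {
        return NULL;
    }
    return buf;
}

/* Return nonzero if the address of p is a multiple of align. */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}
```

A buffer from alloc_aligned(4096, 4096) can be passed to write() on the O_DIRECT descriptor and released with free().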
On the other hand we do not want at the moment to enforce
On Mon, 05/11 15:43, Stefan Hajnoczi wrote:
> On Tue, May 05, 2015 at 10:51:14AM +0800, Fam Zheng wrote:
>
> This function is complex. I had to draw a diagram to remember the
> relationships between the variables. It would be nice to split it if
> that can be done in a way that makes the code ni
Use a transaction to request an incremental backup across two drives.
Coerce one of the jobs to fail, and then re-run the transaction.
Verify that no bitmap data was lost due to the partial transaction
failure.
Signed-off-by: John Snow
Reviewed-by: Max Reitz
---
tests/qemu-iotests/124 | 12
This patch actually implements the transactional callback system
for the drive_backup action.
(1) We manually pick up a reference to the bitmap if present to allow
its cleanup to be delayed until after all drive_backup jobs launched
by the transaction have fully completed.
(2) We create a
We'd like to be able to specify the callback given to backup_start
manually in the case of transactions, so split apart qmp_drive_backup
into an implementation and a wrapper.
Switch drive_backup_prepare to use the new wrapper, but don't overload
the callback and closure yet.
Signed-off-by: John S
From: Kashyap Chamarthy
Although the canonical source of reference for QMP commands is
qapi-schema.json, for consistency's sake, update qmp-commands.hx to
state the list of supported transactionable operations, namely:
drive-backup
blockdev-backup
blockdev-snapshot-internal-sync
This adds two qmp commands to transactions.
block-dirty-bitmap-add allows you to create a bitmap simultaneously
alongside a new full backup to accomplish a clean synchronization
point.
block-dirty-bitmap-clear allows you to reset a bitmap back to as-if
it were new, which can also be used alongsid
Allow bitmap successors to carry reference counts.
We can in a later patch use this ability to clean up the dirty bitmap
according to both the individual job's success and the success of all
jobs in the transaction group.
The code for cleaning up a bitmap is also moved from backup_run to
backup_c
These structures are misnomers, somewhat.
(1) BlockTransactionState is not state for a transaction,
but is rather state for a single transaction action.
Rename it "BlkActionState" to be more accurate.
(2) The BdrvActionOps describes operations for the BlkActionState,
above. This name
Now that the structure formerly known as BlkTransactionState has been
renamed to something sensible (BlkActionState), re-introduce an actual
BlkTransactionState that actually manages state for the entire Transaction.
In the process, convert the old QSIMPLEQ list of actions into a QTAILQ,
to let us
The goal here is to add a new method to transactions that allows
developers to specify a callback that will get invoked only once
all jobs spawned by a transaction are completed, allowing developers
the chance to perform actions conditionally pending complete success,
partial failure, or complete f
Patch 1 adds basic support for add and clear transactions.
Patch 2 tests this basic support.
Patches 3-4 refactor transactions a little bit, to add clarity.
Patch 5 adds the framework for error scenarios where only
some jobs that were launched by a transaction complete successfully,
and we
If we want to get at the job after the life of the job,
we'll need a refcount for this object.
This may occur for example if we wish to inspect the actions
taken by a particular job after a transactional group of jobs
runs, and further actions are required.
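
A minimal reference-counting sketch, assuming a simplified stand-in for the job object (the Job type and the job_* functions below are hypothetical and are not the names used in the patch):

```c
#include <stdlib.h>

/* Hypothetical stand-in for a job object that may outlive its own run. */
typedef struct Job {
    int refcnt; /* number of holders that still need the object */
    int ret;    /* result recorded when the job finishes */
} Job;

/* Create a job holding one reference, owned by the caller. */
static Job *job_new(void)
{
    Job *job = calloc(1, sizeof(*job));

    if (job) {
        job->refcnt = 1;
    }
    return job;
}

/* Take an extra reference, e.g. on behalf of a transaction. */
static void job_ref(Job *job)
{
    job->refcnt++;
}

/* Drop a reference. Returns 1 if the object was freed, 0 otherwise. */
static int job_unref(Job *job)
{
    if (--job->refcnt == 0) {
        free(job);
        return 1;
    }
    return 0;
}
```

In this sketch a transaction would call job_ref() before the job starts, so that the result stays readable after the job itself has dropped its own reference.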
Signed-off-by: John Snow
Reviewed-by:
Test simple usage cases for using transactions to create
and synchronize incremental backups.
Signed-off-by: John Snow
Reviewed-by: Max Reitz
Reviewed-by: Stefan Hajnoczi
---
tests/qemu-iotests/124 | 54 ++
tests/qemu-iotests/124.out | 4 ++--
2
On 05/06/2015 12:52 AM, Fam Zheng wrote:
> Only poll the specific type of event we are interested in, to avoid
> stealing events that should be consumed by someone else.
>
> Suggested-by: John Snow
> Signed-off-by: Fam Zheng
> ---
> tests/qemu-iotests/iotests.py | 9 ++---
> 1 file change
On 05/06/2015 12:52 AM, Fam Zheng wrote:
> This checks that the discard on mirror source that effectively zeroes
> data is also reflected by the data of target.
>
> Signed-off-by: Fam Zheng
> ---
> tests/qemu-iotests/131 | 59
> ++
> tests/qemu-
On 05/06/2015 12:52 AM, Fam Zheng wrote:
> Signed-off-by: Fam Zheng
> ---
> tests/qemu-iotests/041| 66
> ++-
> tests/qemu-iotests/iotests.py | 28 ++
> 2 files changed, 43 insertions(+), 51 deletions(-)
>
> diff --git a/tests/qe
On 05/06/2015 12:52 AM, Fam Zheng wrote:
> Unsetting dirty globally with discard is not very correct. The discard may
> zero out sectors (depending on can_write_zeroes_with_unmap), so we should
> replicate this change to the destination side to make sure that the guest
> sees the same
> data.
>
> Cal
On 05/06/2015 12:52 AM, Fam Zheng wrote:
> Using this function would always be wrong because a dirty bitmap must
> have a specific owner that consumes the dirty bits and calls
> bdrv_reset_dirty_bitmap().
>
Good point.
> Remove the unused function to avoid future misuse.
>
> Reviewed-by: Eric
On 05/08/2015 02:10 PM, Eric Blake wrote:
> On 05/08/2015 11:47 AM, Dimitris Aragiorgis wrote:
>> Building the QEMU tools fails if we #define DEBUG_BLOCK inside
>> block/raw-posix.c. Here instead of adding qemu-log.o in block-obj-y
>> so that DEBUG_BLOCK_PRINT can be used, we substitute the latter
On 08.05.2015 19:21, Kevin Wolf wrote:
Signed-off-by: Kevin Wolf
---
qemu-io-cmds.c | 71 ++
1 file changed, 71 insertions(+)
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 1afcfc0..ef8f3fd 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-
On 11/05/15 19:07, Denis V. Lunev wrote:
On 11/05/15 18:08, Stefan Hajnoczi wrote:
On Mon, May 04, 2015 at 04:42:22PM +0300, Denis V. Lunev wrote:
The difference is quite reliable and the same 5%.
qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
for image in qcow2 format is 1% faster.
I looked a li
On 08.05.2015 19:21, Kevin Wolf wrote:
Signed-off-by: Kevin Wolf
---
block.c | 42 +++---
block/commit.c| 4 ++--
include/block/block.h | 4 +++-
3 files changed, 44 insertions(+), 6 deletions(-)
diff --git a/block.c b/block.c
ind
On 08.05.2015 19:21, Kevin Wolf wrote:
For bs->file, using references to existing BDSes has been possible for a
while already. This patch enables the same for bs->backing_hd.
Signed-off-by: Kevin Wolf
---
block.c | 42 --
block/mirror.c
On 11/05/15 18:08, Stefan Hajnoczi wrote:
On Mon, May 04, 2015 at 04:42:22PM +0300, Denis V. Lunev wrote:
The difference is quite reliable and the same 5%.
qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
for image in qcow2 format is 1% faster.
I looked a little at the qemu-io invocation but am not
On 08.05.2015 19:21, Kevin Wolf wrote:
When reopening an image, the block layer already takes care to reopen
bs->file as well with recalculated inherited flags. The same must happen
for any other child (most notably missing before this patch: backing
files).
If bs->file (or any other child) didn
On 11/05/15 18:32, Eric Blake wrote:
On 05/11/2015 08:54 AM, Stefan Hajnoczi wrote:
On Mon, May 04, 2015 at 04:42:24PM +0300, Denis V. Lunev wrote:
@@ -726,7 +727,8 @@ static void raw_refresh_limits(BlockDriverState *bs, Error
**errp)
raw_probe_alignment(bs, s->fd, errp);
bs->b
On 08.05.2015 19:21, Kevin Wolf wrote:
Currently, the block layer assumes that any block node can have only one
parent, and if it has a parent, that it inherits some options/flags from
this parent.
This is not true any more: With references used in block device
creation, a single node can be use
On 08.05.2015 19:21, Kevin Wolf wrote:
This allows iterating over all children of a given BDS, not only
including bs->file and bs->backing_hd, but also driver-specific
ones like VMDK extents or Quorum children.
Signed-off-by: Kevin Wolf
---
block.c | 27 +
On 05/11/2015 08:54 AM, Stefan Hajnoczi wrote:
> On Mon, May 04, 2015 at 04:42:24PM +0300, Denis V. Lunev wrote:
>> @@ -726,7 +727,8 @@ static void raw_refresh_limits(BlockDriverState *bs,
>> Error **errp)
>>
>> raw_probe_alignment(bs, s->fd, errp);
>> bs->bl.min_mem_alignment = s->buf
On 05/11/2015 08:40 AM, Kevin Wolf wrote:
>>> +char indexstr[slen], prefix[slen];
>>
>> And more dependence on a working C99 compiler, thanks to variable length
>> array (VLA).
>>
>>> +size_t snprintf_ret;
>>> +
>>> +snprintf_ret = snprintf(indexstr, slen, "%s%u", subqdict,
On 08.05.2015 19:21, Kevin Wolf wrote:
Instead of letting every caller of bdrv_open() determine the right flags
for its child node manually and pass them to the function, pass the
parent node and the role of the newly opened child (like backing file,
protocol layer, etc.).
Signed-off-by: Kevin W
On 05/11/2015 06:55 AM, Alberto Garcia wrote:
> QEMU has options to configure the size of the L2 and refcount caches
> for the qcow2 format. However, choosing the right sizes for a
> particular disk image is not a straightforward operation since the
> ratio between the cache size and the allocated
On Mon, May 04, 2015 at 04:42:22PM +0300, Denis V. Lunev wrote:
> The difference is quite reliable and the same 5%.
> qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
> for image in qcow2 format is 1% faster.
I looked a little at the qemu-io invocation but am not clear why there
would be a measurable pe
On 08.05.2015 19:21, Kevin Wolf wrote:
Signed-off-by: Kevin Wolf
---
blockdev.c| 24
include/block/block.h | 8
2 files changed, 20 insertions(+), 12 deletions(-)
Reviewed-by: Max Reitz
On 11.05.2015 16:51, Kevin Wolf wrote:
On 11.05.2015 at 16:40, Max Reitz wrote:
On 08.05.2015 19:21, Kevin Wolf wrote:
Signed-off-by: Kevin Wolf
---
blockdev.c| 24
include/block/block.h | 8
2 files changed, 20 insertions(+), 12 dele
On 08.05.2015 19:21, Kevin Wolf wrote:
Instead of manually parsing options and then deleting them from the
options QDict, just use QemuOpts like most other places that deal with
block device options.
More options will be added there and then QemuOpts is a lot more
manageable than open-coding ever
On Mon, May 04, 2015 at 04:42:24PM +0300, Denis V. Lunev wrote:
> @@ -726,7 +727,8 @@ static void raw_refresh_limits(BlockDriverState *bs,
> Error **errp)
>
> raw_probe_alignment(bs, s->fd, errp);
> bs->bl.min_mem_alignment = s->buf_align;
> -bs->bl.opt_mem_alignment = s->buf_align
On 11.05.2015 at 16:40, Max Reitz wrote:
> On 08.05.2015 19:21, Kevin Wolf wrote:
> >Signed-off-by: Kevin Wolf
> >---
> > blockdev.c| 24
> > include/block/block.h | 8
> > 2 files changed, 20 insertions(+), 12 deletions(-)
>
> Any reason f
On 08.05.2015 19:21, Kevin Wolf wrote:
Besides standardising on a single interface for opening child nodes,
this patch allows the user to specify options to individual extent
nodes. Overriding file names isn't possible with this yet, so it's of
limited usefulness, but still a step forward.
Signe
On Tue, May 05, 2015 at 10:51:13AM +0800, Fam Zheng wrote:
> This reverts commit fc3959e4669a1c2149b91ccb05101cfc7ae1fc05.
>
> The core write code already handles the case, so remove this
> duplication.
>
> Because commit 61007b316 moved the touched code from block.c to
> block/io.c, the change i
On Tue, May 05, 2015 at 10:51:15AM +0800, Fam Zheng wrote:
> Test zero write in byte range 512~1024 for 4k alignment.
>
> Signed-off-by: Fam Zheng
> ---
> tests/qemu-iotests/033 | 13 +
> tests/qemu-iotests/033.out | 30 ++
> 2 files changed, 43 insert
On Tue, May 05, 2015 at 10:51:14AM +0800, Fam Zheng wrote:
This function is complex. I had to draw a diagram to remember the
relationships between the variables. It would be nice to split it if
that can be done in a way that makes the code nicer.
> @@ -1236,13 +1238,39 @@ static int coroutine_f
On 08.05.2015 at 22:06, Eric Blake wrote:
> On 05/08/2015 11:21 AM, Kevin Wolf wrote:
> > Signed-off-by: Kevin Wolf
> > ---
>
> Might want to include mention of what it will be used for in the commit
> body.
You're right. This is the new commit message:
This counts the entries in a
On 08.05.2015 19:21, Kevin Wolf wrote:
Signed-off-by: Kevin Wolf
---
blockdev.c| 24
include/block/block.h | 8
2 files changed, 20 insertions(+), 12 deletions(-)
Any reason for not making it part of the BLOCK_OPT_* macros in block_int.h?
Max
On 08.05.2015 19:21, Kevin Wolf wrote:
Besides standardising on a single interface for opening child nodes,
this simplifies the .bdrv_open() implementation of the quorum block
driver by using block layer functionality for handling BlockdevRefs.
Signed-off-by: Kevin Wolf
---
block/quorum.c | 5
On 08.05.2015 19:21, Kevin Wolf wrote:
In the block layer functions that determine options for a child block
device, it's a common pattern to either copy options from the parent's
options or to set a default string if the option isn't explicitly set
yet for the child. Provide convenience function
On 11.05.2015 15:30, Alberto Garcia wrote:
On Mon 11 May 2015 03:23:25 PM CEST, Max Reitz wrote:
+ disk_size = l2_cache_size * cluster_size / 8
+ disk_size = refcount_cache_size * cluster_size / 2
Only with the default of refcount_bits=16. In the general case, it's
refcount_cache_size * c
On Wed, Apr 22, 2015 at 08:04:45PM -0400, John Snow wrote:
> Test simple usage cases for using transactions to create
> and synchronize incremental backups.
>
> Signed-off-by: John Snow
> ---
> tests/qemu-iotests/124 | 54
> ++
> tests/qemu-iotest
On 08.05.2015 at 23:30, Eric Blake wrote:
> On 05/08/2015 11:21 AM, Kevin Wolf wrote:
> > In the block layer functions that determine options for a child block
> > device, it's a common pattern to either copy options from the parent's
> > options or to set a default string if the option is
On 08.05.2015 19:21, Kevin Wolf wrote:
Signed-off-by: Kevin Wolf
---
include/qapi/qmp/qdict.h | 1 +
qobject/qdict.c | 68 +---
2 files changed, 65 insertions(+), 4 deletions(-)
diff --git a/include/qapi/qmp/qdict.h b/include/qapi/qmp/qd
On Mon 11 May 2015 03:23:25 PM CEST, Max Reitz wrote:
>> + disk_size = l2_cache_size * cluster_size / 8
>> + disk_size = refcount_cache_size * cluster_size / 2
>
> Only with the default of refcount_bits=16. In the general case, it's
> refcount_cache_size * cluster_size * 8 / refcount_bits.
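
The two formulas quoted above can be written out as a small sketch. The function names are hypothetical; all sizes are in bytes, and the refcount formula is the general one given in the review comment.

```c
#include <stdint.h>

/*
 * Hypothetical helpers implementing the formulas quoted above.
 * All sizes are in bytes; refcount_bits is the refcount entry width.
 */

/* Disk space covered by an L2 cache of the given size. */
static uint64_t l2_cache_coverage(uint64_t l2_cache_size,
                                  uint64_t cluster_size)
{
    return l2_cache_size * cluster_size / 8;
}

/* Disk space covered by a refcount cache of the given size. */
static uint64_t refcount_cache_coverage(uint64_t refcount_cache_size,
                                        uint64_t cluster_size,
                                        unsigned refcount_bits)
{
    return refcount_cache_size * cluster_size * 8 / refcount_bits;
}
```

With refcount_bits=16 the second function reduces to refcount_cache_size * cluster_size / 2, which matches the simpler formula in the patch.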
I
On 11.05.2015 14:55, Alberto Garcia wrote:
QEMU has options to configure the size of the L2 and refcount caches
for the qcow2 format. However, choosing the right sizes for a
particular disk image is not a straightforward operation since the
ratio between the cache size and the allocated disk spac
On 11.05.2015 14:54, Alberto Garcia wrote:
This function never receives an invalid table pointer, so we can make
it void and remove all the error checking code.
Signed-off-by: Alberto Garcia
Reviewed-by: Stefan Hajnoczi
---
block/qcow2-cache.c| 7 +--
block/qcow2-cluster.c | 50 ++
This function never receives an invalid table pointer, so we can make
it void and remove all the error checking code.
Signed-off-by: Alberto Garcia
Reviewed-by: Stefan Hajnoczi
---
block/qcow2-cache.c| 7 +--
block/qcow2-cluster.c | 50 ++---
On Fri, May 08, 2015 at 08:29:09AM -0600, Eric Blake wrote:
> On 05/08/2015 07:14 AM, Stefan Hajnoczi wrote:
>
> > No it doesn't. Actions have to appear atomic to the qmp_transaction
> > caller. Both approaches achieve that so they are both correct in
> > isolation.
> >
> > The ambiguity is whe
Fix pointer declaration to make it consistent with the rest of the
code.
Signed-off-by: Alberto Garcia
Reviewed-by: Stefan Hajnoczi
Reviewed-by: Max Reitz
---
block/qcow2-cache.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
New version of the qcow2 cache patches:
v3:
- Removed a dead comment in patch #3
- New document explaining how to configure the cache sizes
v2: https://lists.nongnu.org/archive/html/qemu-devel/2015-05/msg00833.html
- Don't do pointer arithmetic on void *
- Rename table_addr() to qcow2_cache_get_t
The current algorithm to evict entries from the cache always gives
preference to those in the lowest positions. As the size of the cache
increases, the chances of the later elements being removed decrease
exponentially.
In a scenario with random I/O and lots of cache misses, entries in
position
QEMU has options to configure the size of the L2 and refcount caches
for the qcow2 format. However, choosing the right sizes for a
particular disk image is not a straightforward operation since the
ratio between the cache size and the allocated disk space is not
obvious and depends on the size of t
The qcow2 L2/refcount cache contains one separate table for each cache
entry. Doing one allocation per table adds unnecessary overhead and it
also requires us to store the address of each table separately.
Since the size of the cache is constant during its lifetime, it's
better to have an array th
The current cache algorithm always traverses the array starting from
the beginning, so the average number of comparisons needed to perform
a lookup is proportional to the size of the array.
By using a hash of the offset as the starting point, lookups are
faster and independent of the array size.
A cache miss means that the whole array was traversed and the entry
we were looking for was not found, so there's no need to traverse it
again in order to select an entry to replace.
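
The idea can be sketched roughly as follows. The types, names, and the hash (cluster index modulo array size) are assumptions made for illustration and may differ from what the patch actually does.

```c
#include <stdint.h>

/* Hypothetical cache entry; offset 0 marks an unused slot in this sketch. */
typedef struct CacheEntry {
    uint64_t offset;
} CacheEntry;

/*
 * Look up offset, starting at a position derived from a hash of the
 * offset and wrapping around the array once. Returns the index of the
 * matching entry, or -1 after a full traversal without a match.
 */
static int cache_lookup(const CacheEntry *entries, int size,
                        uint64_t offset, uint64_t cluster_size)
{
    /* Assumed hash: the cluster index modulo the array size. */
    int start = (int)((offset / cluster_size) % (uint64_t)size);
    int i = start;

    do {
        if (entries[i].offset == offset) {
            return i;
        }
        if (++i == size) {
            i = 0; /* wrap around to the beginning of the array */
        }
    } while (i != start);

    return -1; /* miss: every entry has been visited exactly once */
}
```

A return value of -1 already implies a complete traversal, so a caller could pick the entry to replace during that same pass instead of scanning again.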
Signed-off-by: Alberto Garcia
Reviewed-by: Stefan Hajnoczi
Reviewed-by: Max Reitz
---
block/qcow2-cache.c | 45
Since all tables are now stored together, it is possible to obtain
the position of a particular table directly from its address, so the
operation becomes O(1).
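
A sketch of that O(1) computation, assuming the tables sit back to back in one allocation (the function names and parameters here are hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers. table_array points at one contiguous allocation
 * that holds all tables, each table_size bytes long.
 */

/* Address of the table at the given index. */
static void *table_addr(void *table_array, size_t table_size, int index)
{
    /* Cast to uint8_t * to avoid pointer arithmetic on void *. */
    return (uint8_t *)table_array + (size_t)index * table_size;
}

/* Index of a table, computed directly from its address. */
static int table_index(const void *table_array, size_t table_size,
                       const void *table)
{
    ptrdiff_t delta = (const uint8_t *)table - (const uint8_t *)table_array;

    return (int)(delta / (ptrdiff_t)table_size);
}
```

No search over the entries is needed: the index follows from a subtraction and a division.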
Signed-off-by: Alberto Garcia
Reviewed-by: Stefan Hajnoczi
Reviewed-by: Max Reitz
---
block/qcow2-cache.c | 32 +++
On Fri, Apr 24, 2015 at 05:00:18PM +0200, Max Reitz wrote:
> On 18.03.2015 21:56, Max Reitz wrote:
> >This series adds support to qemu for changing the refcount_bits option
> >of an existing qcow2 file through the qemu-img amend command.
> >
> >Originally (up until v7), this series was called
> >"q
On Wed, Mar 18, 2015 at 04:56:28PM -0400, Max Reitz wrote:
> Add tests for conversion between different refcount widths.
>
> Signed-off-by: Max Reitz
> Reviewed-by: Eric Blake
> ---
> tests/qemu-iotests/112 | 109
> +
> tests/qemu-iotests/112.out
On 11/05/2015 12:16, Kevin Wolf wrote:
> On 08.05.2015 at 19:47, Dimitris Aragiorgis wrote:
>> > Building the QEMU tools fails if we #define DEBUG_BLOCK inside
>> > block/raw-posix.c. Here instead of adding qemu-log.o in block-obj-y
>> > so that DEBUG_BLOCK_PRINT can be used, we substitu
On 08.05.2015 at 19:47, Dimitris Aragiorgis wrote:
> Get rid of several #ifdef DEBUG_FLOPPY and substitute them with
> DPRINTF.
>
> Signed-off-by: Dimitris Aragiorgis
Hm, this removes the option of selectively enabling debug messages. It's
probably not a big problem in this case, though.
On 08.05.2015 at 19:47, Dimitris Aragiorgis wrote:
> Building the QEMU tools fails if we #define DEBUG_BLOCK inside
> block/raw-posix.c. Here instead of adding qemu-log.o in block-obj-y
> so that DEBUG_BLOCK_PRINT can be used, we substitute the latter with
> a simple DPRINTF().
>
> Signed
On 08.05.2015 at 19:47, Dimitris Aragiorgis wrote:
> During migration, QEMU uses fsync()/fdatasync() on the open file
> descriptor for read-write block devices to flush data just before
> stopping the VM.
>
> However, fsync() on a scsi-generic device returns -EINVAL which
> causes the mig
On 08.05.2015 at 19:47, Dimitris Aragiorgis wrote:
> Instead of checking bs->sg use bdrv_is_sg() consistently throughout
> the code.
>
> Signed-off-by: Dimitris Aragiorgis
> Reviewed-by: Paolo Bonzini
> ---
> block.c |6 +++---
> block/iscsi.c |2 +-
> block/raw-p
On 11/05/2015 10:02, Fam Zheng wrote:
>
> /*
>* ...
>*
>* 'pnum' is set to the number of sectors (including and immediately
> following
>* the specified sector) that are known to be in the same
>* allocated/unallocated state.
>*
>* '
On Wed, 05/06 12:21, Paolo Bonzini wrote:
>
>
> On 06/05/2015 11:50, Fam Zheng wrote:
> > # src can_write_zeroes_with_unmap target
> > can_write_zeroes_with_unmap
> >
> > 1 true