On Fri 24 Apr 2015 03:04:06 PM CEST, Kevin Wolf kw...@redhat.com wrote:
I think it would be nice to have a way to free unused cache
entries after a while.
Do you think mmap plus a periodic timer would work?
I'm hesitant about changes like this because they make QEMU more
complex,
Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com
Signed-off-by: Gonglei arei.gong...@huawei.com
---
blockdev.c| 8
include/block/block.h | 1 +
qemu-options.hx | 4
3 files changed, 13 insertions(+)
diff
On Thu, May 07, 2015 at 02:43:43PM +0100, Stefan Hajnoczi wrote:
On Wed, May 06, 2015 at 07:23:32PM +0800, Fam Zheng wrote:
Reported by Paolo.
Unlike the iohandler in main loop, iothreads currently process the event
notifier used as virtio-blk ioeventfd in all nested aio_poll. This is
Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com
Signed-off-by: Gonglei arei.gong...@huawei.com
Cc: Jeff Cody jc...@redhat.com
---
block/backup.c | 13 +
blockjob.c | 10 ++
Usage:
-drive file=xxx,id=Y, \
-drive
file=,id=X,backing_reference.drive_id=Y,backing_reference.hidden-disk.*
It will create such backing chain:
{virtio-blk dev 'Y'}
{virtio-blk dev 'X'}
|
Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com
Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com
Signed-off-by: Gonglei arei.gong...@huawei.com
---
docs/block-replication.txt | 179 +
1 file
If the child is not ready, read/write/getlength/flush will
return -errno. This is not a critical error and can be ignored:
1. read/write:
Just do not report the error event.
2. getlength:
Just ignore it. If getlength returns -errno for all children and
every error is ignored, return -EIO.
3. flush:
Just
Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com
Signed-off-by: Gonglei arei.gong...@huawei.com
---
block/quorum.c | 78 ++
1 file changed, 78 insertions(+)
diff --git
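The getlength rule described in this patch can be sketched as below. This is a minimal illustration, not the code from block/quorum.c: the function name `aggregate_getlength` and the idea of passing the per-child results as an array are assumptions made for the example, and it simply returns the first usable length without comparing children against each other.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: each element of results[] is what one child's
 * getlength returned, either a length >= 0 or a negative errno.
 * Children that failed are skipped. If every child failed, the caller
 * gets -EIO.
 */
static int64_t aggregate_getlength(const int64_t *results, size_t n)
{
    assert(results != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (results[i] >= 0) {
            /* First child that is ready and reports a length. */
            return results[i];
        }
        /* Negative errno: child not ready, ignore it and keep going. */
    }

    /* No child produced a usable length. */
    return -EIO;
}
```

With results of { -ENOTCONN, 4096, -ENOTCONN } the sketch returns 4096; with only negative values it returns -EIO.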
Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com
Signed-off-by: Gonglei arei.gong...@huawei.com
---
block.c | 9 +
1 file changed, 9 insertions(+)
diff --git a/block.c b/block.c
index 35e1a95..15d21da 100644
--- a/block.c
+++
When opening BDS, we need to create backup jobs for
image-fleecing.
Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com
Signed-off-by: Gonglei arei.gong...@huawei.com
Cc: Jeff Cody jc...@redhat.com
---
block/Makefile.objs | 2 +-
1 file
On Tue, May 05, 2015 at 04:23:56PM +0100, Dr. David Alan Gilbert wrote:
* Stefan Hajnoczi (stefa...@redhat.com) wrote:
On Fri, Apr 24, 2015 at 11:36:35AM +0200, Paolo Bonzini wrote:
On 24/04/2015 11:38, Wen Congyang wrote:
That can be done with drive-mirror. But I think
Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com
Signed-off-by: Gonglei arei.gong...@huawei.com
---
block.c | 12
1 file changed, 12 insertions(+)
diff --git a/block.c b/block.c
index 54d5b29..a442b5f 100644
--- a/block.c
Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com
Signed-off-by: Gonglei arei.gong...@huawei.com
---
block/Makefile.objs | 1 +
block/replication.c | 512
2 files changed, 513
Am 06.05.2015 um 13:29 hat Kevin Wolf geschrieben:
Before a freed cluster can be reused, pending discards for this cluster
must be processed.
The original assumption was that this was not a problem because discards
are only cached during discard/write zeroes operations, which are
* Stefan Hajnoczi (stefa...@redhat.com) wrote:
On Tue, May 05, 2015 at 04:23:56PM +0100, Dr. David Alan Gilbert wrote:
* Stefan Hajnoczi (stefa...@redhat.com) wrote:
On Fri, Apr 24, 2015 at 11:36:35AM +0200, Paolo Bonzini wrote:
On 24/04/2015 11:38, Wen Congyang wrote:
Am 08.05.2015 um 11:00 hat Alberto Garcia geschrieben:
On Fri 24 Apr 2015 03:04:06 PM CEST, Kevin Wolf kw...@redhat.com wrote:
I think it would be nice to have a way to free unused cache
entries after a while.
Do you think mmap plus a periodic timer would work?
I'm hesitant
Am 06.05.2015 um 20:01 hat Max Reitz geschrieben:
On 06.05.2015 14:23, Fam Zheng wrote:
This fixes the bug introduced by commit c6ac36e (vmdk: Optimize cluster
allocation).
Sometimes, write_len could be larger than cluster size, because it
contains both data and marker. We must advance
On 08/05/2015 12:34, Kevin Wolf wrote:
Am 08.05.2015 um 12:16 hat Paolo Bonzini geschrieben:
On 08/05/2015 12:08, Kevin Wolf wrote:
If so, the commands seem to be hopelessly underspecified, especially
with respect to error conditions. And where it says something about
errors, it doesn't
On 08.05.2015 15:14, Stefan Hajnoczi wrote:
On Thu, May 07, 2015 at 01:22:26PM -0400, John Snow wrote:
On 05/07/2015 10:54 AM, Stefan Hajnoczi wrote:
On Wed, Apr 22, 2015 at 08:04:44PM -0400, John Snow wrote:
+static void block_dirty_bitmap_clear_prepare(BlkTransactionState
*common, +
On 06.05.2015 15:39, Alberto Garcia wrote:
The qcow2 L2/refcount cache contains one separate table for each cache
entry. Doing one allocation per table adds unnecessary overhead and it
also requires us to store the address of each table separately.
Since the size of the cache is constant during
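The single-allocation idea in this patch description can be sketched as follows. The struct and function names (`TableCache`, `table_cache_addr`, `table_cache_index`) are hypothetical and only illustrate the technique; the real qcow2 cache code may differ in layout and naming.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical cache layout: every table lives in one contiguous block. */
typedef struct TableCache {
    size_t num_tables;  /* fixed for the lifetime of the cache */
    size_t table_size;  /* bytes per table */
    uint8_t *tables;    /* one allocation holding all tables */
} TableCache;

static int table_cache_init(TableCache *c, size_t num_tables, size_t table_size)
{
    assert(c != NULL && num_tables > 0 && table_size > 0);
    c->tables = calloc(num_tables, table_size);
    if (c->tables == NULL) {
        return -1;
    }
    c->num_tables = num_tables;
    c->table_size = table_size;
    return 0;
}

/* The address of table i is computed, so no per-entry pointer is stored. */
static void *table_cache_addr(const TableCache *c, size_t i)
{
    assert(i < c->num_tables);
    return c->tables + i * c->table_size;
}

/* Inverse mapping: recover the entry index from a table pointer. */
static size_t table_cache_index(const TableCache *c, const void *table)
{
    ptrdiff_t off = (const uint8_t *)table - c->tables;
    assert(off >= 0 && (size_t)off % c->table_size == 0);
    size_t i = (size_t)off / c->table_size;
    assert(i < c->num_tables);
    return i;
}

static void table_cache_destroy(TableCache *c)
{
    free(c->tables);
    c->tables = NULL;
}
```

Because the index can be derived from the pointer, a caller holding only a table pointer can still find the matching cache entry.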
On 06.05.2015 15:39, Alberto Garcia wrote:
A cache miss means that the whole array was traversed and the entry
we were looking for was not found, so there's no need to traverse it
again in order to select an entry to replace.
Signed-off-by: Alberto Garcia be...@igalia.com
---
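The point about not traversing the array twice can be illustrated with a lookup that remembers a replacement candidate while it scans. This is a sketch under assumptions: the `CacheEntry` layout, the `last_used` counter, and the choice of the least recently used slot as the victim are illustrative and not taken from the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache entry; offset 0 marks an empty slot. */
typedef struct CacheEntry {
    uint64_t offset;     /* key of the cached table */
    uint64_t last_used;  /* larger means used more recently */
} CacheEntry;

/*
 * Single pass over the array. On a hit, return the index and leave
 * *victim untouched. On a miss, return -1 and store in *victim the slot
 * with the smallest last_used value, found during the same traversal.
 */
static int cache_find(const CacheEntry *entries, size_t n,
                      uint64_t offset, size_t *victim)
{
    assert(entries != NULL && n > 0 && offset != 0 && victim != NULL);

    size_t best = 0;
    for (size_t i = 0; i < n; i++) {
        if (entries[i].offset == offset) {
            return (int)i;
        }
        if (entries[i].last_used < entries[best].last_used) {
            best = i;
        }
    }

    *victim = best;
    return -1;
}
```

A miss therefore costs exactly one traversal, and the caller can load the missing table straight into the slot named by `*victim`.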
On 06.05.2015 15:39, Alberto Garcia wrote:
The current algorithm to evict entries from the cache always gives
preference to those in the lowest positions. As the size of the cache
increases, the chances of the later elements being removed decrease
exponentially.
In a scenario with random I/O
On 06.05.2015 15:39, Alberto Garcia wrote:
The current cache algorithm traverses the array always starting from
the beginning, so the average number of comparisons needed to perform
a lookup is proportional to the size of the array.
By using a hash of the offset as the starting point, lookups
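A lookup that starts at a hashed position rather than at index 0 might look like the sketch below. The hash used here (offset divided by the table size, modulo the number of slots) is an assumption chosen for simplicity, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hash: map an offset to the slot where the scan begins. */
static size_t lookup_start(uint64_t offset, uint64_t table_size, size_t n)
{
    assert(table_size > 0 && n > 0);
    return (size_t)((offset / table_size) % n);
}

/*
 * Scan all n slots, beginning at the hashed position and wrapping
 * around. *comparisons reports how many slots were inspected.
 * Returns the slot index on a hit, -1 on a miss.
 */
static int hashed_lookup(const uint64_t *offsets, size_t n,
                         uint64_t table_size, uint64_t offset,
                         size_t *comparisons)
{
    assert(offsets != NULL && comparisons != NULL);

    size_t i = lookup_start(offset, table_size, n);
    *comparisons = 0;
    for (size_t step = 0; step < n; step++) {
        (*comparisons)++;
        if (offsets[i] == offset) {
            return (int)i;
        }
        i = (i + 1) % n;
    }
    return -1;
}
```

When an entry sits in the slot its hash points to, the lookup finishes after a single comparison; a miss still inspects every slot once.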
If the child was defined in the same context (-drive argument or
blockdev-add QMP command) as its parent, a reopen of the parent should
work the same and allow changing options of the child.
Signed-off-by: Kevin Wolf kw...@redhat.com
---
block.c | 12 ++--
1 file changed, 10
This is doing a more complete test on setting cache modes both while
opening an image (i.e. in a -drive command line) and in reopen
situations. It checks that reopen can specify options for child nodes
and that cache modes are correctly inherited from parent nodes where
they are not specified.
Currently, the block layer assumes that any block node can have only one
parent, and if it has a parent, that it inherits some options/flags from
this parent.
This is not true any more: With references used in block device
creation, a single node can be used by multiple parents, or it can be
This adds the cache mode options to the QDict, so that they can be
specified for child nodes (e.g. backing.cache.direct=off).
The cache modes are not removed from the flags at this point; instead,
options and flags are kept in sync. If the user specifies both flags and
options, the options take
Signed-off-by: Kevin Wolf kw...@redhat.com
---
block.c | 42 +++---
block/commit.c| 4 ++--
include/block/block.h | 4 +++-
3 files changed, 44 insertions(+), 6 deletions(-)
diff --git a/block.c b/block.c
index 95dc51e..561cefd 100644
First of all, sorry for the lengthy series that attacks more things than
could fit in the subject line. However, for the most part the changes
are hard to separate: Either it's just infrastructure without a user, or
it's a user, but the infrastructure changes wouldn't be complete.
The only
Signed-off-by: Kevin Wolf kw...@redhat.com
---
include/qapi/qmp/qdict.h | 1 +
qobject/qdict.c | 68 +---
2 files changed, 65 insertions(+), 4 deletions(-)
diff --git a/include/qapi/qmp/qdict.h b/include/qapi/qmp/qdict.h
index
In the block layer functions that determine options for a child block
device, it's a common pattern to either copy options from the parent's
options or to set a default string if the option isn't explicitly set
yet for the child. Provide convenience functions so that it becomes a
one-liner for
On Fri 08 May 2015 05:46:57 PM CEST, Max Reitz wrote:
Let's assume the cache is full. Now this hash algorithm (direct mapped
cache) basically becomes futile, because the LRU algorithm (fully
associative cache) takes over
That's right, although in that scenario I guess there's no good
On 06.05.2015 15:39, Alberto Garcia wrote:
This function never receives an invalid table pointer, so we can make
it void and remove all the error checking code.
Signed-off-by: Alberto Garcia be...@igalia.com
---
block/qcow2-cache.c| 7 +--
block/qcow2-cluster.c | 50
The code already special-cased node-name, which is currently the only
option passed in the QDict that isn't driver-specific. Generalise the
code to take all general block layer options into consideration.
Signed-off-by: Kevin Wolf kw...@redhat.com
---
block.c | 26 ++
1
Signed-off-by: Kevin Wolf kw...@redhat.com
---
block/qcow2.c | 12 ++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/block/qcow2.c b/block/qcow2.c
index abe22f3..84d6e0f 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -546,8 +546,8 @@ static int
For updating the cache sizes or disabling lazy refcounts there is a bit
more to do than just changing the variables, but otherwise we're all set
for changing options during bdrv_reopen().
Just implement the missing pieces and hook the functions up in
bdrv_reopen().
Signed-off-by: Kevin Wolf
Before we can allow updating options at runtime with bdrv_reopen(), we
need to split the function into prepare/commit/abort parts.
Signed-off-by: Kevin Wolf kw...@redhat.com
---
block/qcow2.c | 101 ++
1 file changed, 67 insertions(+), 34
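The prepare/commit/abort split mentioned here follows a general transactional pattern, sketched below. All names (`LiveState`, `StagedState`, `update_prepare`, and so on) are hypothetical; this is not the qcow2 code, only an illustration of why the three phases are kept apart.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical state that is in use while the device is running. */
typedef struct LiveState {
    size_t cache_entries;
    void *cache;
} LiveState;

/* Hypothetical staging area filled by prepare. */
typedef struct StagedState {
    size_t cache_entries;
    void *cache;
} StagedState;

/* Prepare: validate and allocate. May fail, never touches live state. */
static int update_prepare(StagedState *st, size_t new_entries)
{
    assert(st != NULL);
    st->cache = NULL;
    st->cache_entries = 0;
    if (new_entries == 0) {
        return -EINVAL;
    }
    st->cache = calloc(new_entries, sizeof(long));
    if (st->cache == NULL) {
        return -ENOMEM;
    }
    st->cache_entries = new_entries;
    return 0;
}

/* Commit: swap in the staged state. Must not fail. */
static void update_commit(LiveState *live, StagedState *st)
{
    free(live->cache);
    live->cache = st->cache;
    live->cache_entries = st->cache_entries;
    st->cache = NULL;
}

/* Abort: drop the staged state and leave live state as it was. */
static void update_abort(StagedState *st)
{
    free(st->cache);
    st->cache = NULL;
}
```

Everything that can fail happens in prepare, so a failed update leaves the running configuration untouched and a successful one is applied in a step that cannot fail.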
Options are not actually inherited from the parent node yet, but this
commit lays the groundwork for doing so.
Signed-off-by: Kevin Wolf kw...@redhat.com
---
block.c | 51 ++-
include/block/block_int.h | 3 ++-
2 files changed, 30
On Fri 08 May 2015 05:51:30 PM CEST, Max Reitz wrote:
-int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table)
+void qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table)
{
int i = qcow2_cache_get_table_idx(bs, c, *table);
-if (c->entries[i].offset
On 05/08/2015 09:17 AM, Max Reitz wrote:
On 08.05.2015 15:14, Stefan Hajnoczi wrote:
On Thu, May 07, 2015 at 01:22:26PM -0400, John Snow wrote:
On 05/07/2015 10:54 AM, Stefan Hajnoczi wrote:
On Wed, Apr 22, 2015 at 08:04:44PM -0400, John Snow wrote:
+static void
In order to decide whether a blkdebug: filename can be produced or a
json: one is necessary, blkdebug checked whether bs->options had more
options than just config, x-image or image (the latter including
nested options). That doesn't work well when generic block layer options
are present.
This
Specifying the cache mode for a driver without a medium is not a useful
thing to do: As long as there is no medium, the cache mode doesn't make
a difference, and once the 'change' command is used to insert a medium,
it ignores the old cache mode and makes the new medium use
cache=writethrough.
Signed-off-by: Kevin Wolf kw...@redhat.com
---
qemu-io-cmds.c | 71 ++
1 file changed, 71 insertions(+)
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 1afcfc0..ef8f3fd 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -1978,6 +1978,76
Instead of passing a separate drv argument to bdrv_open_common(), just
make sure that a driver option is set in the QDict. This also means
that a driver entry is consistently present in bs->options now.
This is another step towards keeping all options in the QDict (which is
the representation of
Some drivers have nested options (e.g. blkdebug rule arrays), which
don't belong to a child node and shouldn't be removed. Don't remove all
options with '.' in their name, but check for the complete prefixes of
actually existing child nodes.
Signed-off-by: Kevin Wolf kw...@redhat.com
---
block.c
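The difference between "contains a dot" and "starts with the complete prefix of an existing child" can be sketched as below. The function name and the example keys are assumptions for illustration; the actual check in block.c may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical check: an option key belongs to a child node only if it
 * starts with that child's full name followed by '.'. A key that merely
 * contains a dot (for instance a driver's own nested option) does not
 * match.
 */
static bool option_belongs_to_child(const char *key,
                                    const char *const *children, size_t n)
{
    assert(key != NULL);

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(children[i]);
        if (strncmp(key, children[i], len) == 0 && key[len] == '.') {
            return true;
        }
    }
    return false;
}
```

With children named "file" and "backing", a key such as "backing.cache.direct" matches, while a nested driver option like "inject-error.0.event" (an illustrative key) is left alone.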
Until now, an SG device was identified only by checking if its path
started with /dev/sg. Then, hdev_open() set bs->sg accordingly.
This is very fragile, e.g. it fails with symlinks or relative paths.
We should rely on the actual properties of the device instead of the
specified file path.
Test
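One way to probe the device itself instead of its path is sketched below, assuming a Linux host. It checks that the file is a character device and that it answers the `SG_GET_VERSION_NUM` ioctl from `<scsi/sg.h>`. The helper names are hypothetical, and the exact set of checks used by the patch is not shown in this excerpt.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <scsi/sg.h>

/* Hypothetical probe: true if fd refers to a SCSI generic device. */
static bool fd_is_sg(int fd)
{
    struct stat st;
    int version = 0;

    /* SG devices are character devices. */
    if (fstat(fd, &st) < 0 || !S_ISCHR(st.st_mode)) {
        return false;
    }
    /* Only the sg driver is expected to answer this ioctl. */
    return ioctl(fd, SG_GET_VERSION_NUM, &version) >= 0;
}

/* Convenience wrapper that works for symlinks and relative paths too. */
static bool path_is_sg(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd < 0) {
        return false;
    }
    bool sg = fd_is_sg(fd);
    close(fd);
    return sg;
}
```

Because the decision is based on what the opened file descriptor reports, the result no longer depends on how the user spelled the path.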
Instead of checking bs->sg use bdrv_is_sg() consistently throughout
the code.
Signed-off-by: Dimitris Aragiorgis dim...@arrikto.com
Reviewed-by: Paolo Bonzini pbonz...@redhat.com
---
block.c |6 +++---
block/iscsi.c |2 +-
block/raw-posix.c |4 ++--
3 files changed, 6
Get rid of several #ifdef DEBUG_FLOPPY and substitute them with
DPRINTF.
Signed-off-by: Dimitris Aragiorgis dim...@arrikto.com
---
block/raw-posix.c | 20 +---
1 file changed, 5 insertions(+), 15 deletions(-)
diff --git a/block/raw-posix.c b/block/raw-posix.c
index
During migration, QEMU uses fsync()/fdatasync() on the open file
descriptor for read-write block devices to flush data just before
stopping the VM.
However, fsync() on a scsi-generic device returns -EINVAL which
causes the migration to fail. This patch skips flushing data in case
of an SG device,
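The behaviour described here, skipping the flush for SG devices, can be sketched as follows. The `Dev` struct and `dev_flush` function are hypothetical and stand in for whatever state the real driver keeps.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical device handle for this sketch. */
typedef struct Dev {
    int fd;
    bool is_sg;  /* true for a scsi-generic device */
} Dev;

/*
 * Flush a device before stopping the VM. For SG devices the flush is
 * skipped and success is reported, because fsync() would fail on them.
 * Returns 0 on success or a negative errno.
 */
static int dev_flush(const Dev *d)
{
    assert(d != NULL);

    if (d->is_sg) {
        return 0;
    }
    if (fsync(d->fd) < 0) {
        return -errno;
    }
    return 0;
}
```

The early return keeps an expected failure on SG devices from being reported as a migration error, while real flush errors on other devices are still propagated.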
Building the QEMU tools fails if we #define DEBUG_BLOCK inside
block/raw-posix.c. Here instead of adding qemu-log.o in block-obj-y
so that DEBUG_BLOCK_PRINT can be used, we substitute the latter with
a simple DPRINTF().
Signed-off-by: Dimitris Aragiorgis dim...@arrikto.com
---
block/raw-posix.c
On 05/08/2015 11:21 AM, Kevin Wolf wrote:
Signed-off-by: Kevin Wolf kw...@redhat.com
---
Might want to include mention of what it will be used for in the commit
body.
include/qapi/qmp/qdict.h | 1 +
qobject/qdict.c | 68
+---
2 files
On 05/08/2015 11:21 AM, Kevin Wolf wrote:
Besides standardising on a single interface for opening child nodes,
this simplifies the .bdrv_open() implementation of the quorum block
driver by using block layer functionality for handling BlockdevRefs.
Signed-off-by: Kevin Wolf kw...@redhat.com
On 08.05.2015 12:08, Kevin Wolf wrote:
Am 07.05.2015 um 16:50 hat Paolo Bonzini geschrieben:
On 07/05/2015 16:34, Kevin Wolf wrote:
Am 07.05.2015 um 16:16 hat Paolo Bonzini geschrieben:
On 07/05/2015 16:07, Kevin Wolf wrote:
This is not right for two reasons: The first is that this is
On 07.05.2015 06:04, Zhe Qiu wrote:
From: phoeagon phoea...@gmail.com
In reference to
b0ad5a455d7e5352d4c86ba945112011dbeadfb8~078a458e077d6b0db262c4b05fee51d01de2d1d2,
metadata writes to qcow2/cow/qcow/vpc/vmdk are all synced prior to succeeding
writes.
Only when write is successful that
On Thu, May 07, 2015 at 01:22:26PM -0400, John Snow wrote:
On 05/07/2015 10:54 AM, Stefan Hajnoczi wrote:
On Wed, Apr 22, 2015 at 08:04:44PM -0400, John Snow wrote:
+static void block_dirty_bitmap_clear_prepare(BlkTransactionState
*common, +
On 05/08/2015 11:21 AM, Kevin Wolf wrote:
Signed-off-by: Kevin Wolf kw...@redhat.com
---
blockdev.c| 24
include/block/block.h | 8
2 files changed, 20 insertions(+), 12 deletions(-)
Reviewed-by: Eric Blake ebl...@redhat.com
--
Eric Blake
On 05/08/2015 11:21 AM, Kevin Wolf wrote:
Instead of manually parsing options and then deleting them from the
options QDict, just use QemuOpts like most other places that deal with
block device options.
More options will be added there and then QemuOpts is a lot more
manageable than
On 08/05/2015 13:38, Dimitris Aragiorgis wrote:
Instead of checking bs->sg use bdrv_is_sg() consistently throughout
the code.
Signed-off-by: Dimitris Aragiorgis dim...@arrikto.com
---
block.c |6 +++---
block/iscsi.c |2 +-
block/raw-posix.c |4 ++--
3 files
On 08/05/2015 13:38, Dimitris Aragiorgis wrote:
During migration, QEMU uses fsync()/fdatasync() on the open file
descriptor for read-write block devices to flush data just before
stopping the VM.
However, fsync() on a scsi-generic device returns -EINVAL which
causes the migration to fail.
On Fri 08 May 2015 11:47:00 AM CEST, Kevin Wolf wrote:
There's also the problem that with a single chunk of memory for all
cache tables it's not so easy to free individual entries.
That's one of the reasons why I suggested to fix the problem by using
mmap() for the individual entries rather
In terms of correctness, lacking a sync here does not introduce any data
corruption that I can think of. But this reduces the volatile window during
which the metadata changes are NOT guaranteed on disk. Without a barrier,
in case of power loss you may end up with the bitmap changes on disk and
not the
Thanks. Dbench does not logically allocate new disk space all the time,
because it's an FS-level benchmark that creates files and deletes them.
Therefore it also depends on the guest FS: say, a btrfs guest FS allocates
about 1.8x the space that ext4 does, due to its COW nature. It does cause
the FS
BTW, how do you usually measure the time to install a Linux distro within?
Most distros do NOT have unattended installation ISOs available. (True
I can bake my own ISOs for this...) But do you have any ISOs made ready for
this purpose?
On Sat, May 9, 2015 at 11:54 AM phoeagon
Building the QEMU tools fails if we #define DEBUG_BLOCK inside
block/raw-posix.c. This happens because qemu-log.o is missing from
block-obj-y, which causes the link to fail. Fix this.
Signed-off-by: Dimitris Aragiorgis dim...@arrikto.com
---
Makefile.objs |2 +-
1 file changed, 1
Hi all,
These four patches make slight changes to the way QEMU handles SCSI
generic devices to fix a number of small problems.
I am sending them against the master branch, since I don't know if they
can be considered bugfixes.
Thanks,
dimara
Dimitris Aragiorgis (4):
Fix migration in case of
Until now, an SG device was identified only by checking if its path
started with /dev/sg. Then, hdev_open() set bs->sg accordingly.
This is very fragile, e.g. it fails with symlinks or relative paths.
We should rely on the actual properties of the device instead of the
specified file path.
Test
During migration, QEMU uses fsync()/fdatasync() on the open file
descriptor for read-write block devices to flush data just before
stopping the VM.
However, fsync() on a scsi-generic device returns -EINVAL which
causes the migration to fail. This patch skips flushing data in case
of an SG device,
Am 08.05.2015 um 13:50 hat phoeagon geschrieben:
In terms of correctness, lacking a sync here does not introduce any data corruption
that I can think of. But this reduces the volatile window during which the metadata
changes are NOT guaranteed on disk. Without a barrier, in case of power loss
you may
On 05/08/2015 11:21 AM, Kevin Wolf wrote:
Currently, the block layer assumes that any block node can have only one
parent, and if it has a parent, that it inherits some options/flags from
this parent.
This is not true any more: With references used in block device
creation, a single node
On 05/08/2015 11:47 AM, Dimitris Aragiorgis wrote:
Building the QEMU tools fails if we #define DEBUG_BLOCK inside
block/raw-posix.c. Here instead of adding qemu-log.o in block-obj-y
so that DEBUG_BLOCK_PRINT can be used, we substitute the latter with
a simple DPRINTF().
Signed-off-by: