[Qemu-devel] [PATCH 1/7] block: Convert BlockDriverState.in_use to refcount
Use numeric value to replace in_use flag in BDS, this will make lifecycle management with ref count possible. This patch only replaces existing uses of bdrv_set_in_use, so no logic change here. Signed-off-by: Fam Zheng f...@redhat.com --- block-migration.c | 4 ++-- block.c | 23 +++ blockjob.c | 6 +++--- hw/block/dataplane/virtio-blk.c | 4 ++-- include/block/block.h | 3 ++- include/block/block_int.h | 2 +- 6 files changed, 25 insertions(+), 17 deletions(-) diff --git a/block-migration.c b/block-migration.c index 2fd7699..2efb6c0 100644 --- a/block-migration.c +++ b/block-migration.c @@ -321,7 +321,7 @@ static void init_blk_migration_it(void *opaque, BlockDriverState *bs) bmds-shared_base = block_mig_state.shared_base; alloc_aio_bitmap(bmds); drive_get_ref(drive_get_by_blockdev(bs)); -bdrv_set_in_use(bs, 1); +bdrv_get_ref(bs); block_mig_state.total_sector_sum += sectors; @@ -557,7 +557,7 @@ static void blk_mig_cleanup(void) blk_mig_lock(); while ((bmds = QSIMPLEQ_FIRST(block_mig_state.bmds_list)) != NULL) { QSIMPLEQ_REMOVE_HEAD(block_mig_state.bmds_list, entry); -bdrv_set_in_use(bmds-bs, 0); +bdrv_get_ref(bmds-bs); drive_put_ref(drive_get_by_blockdev(bmds-bs)); g_free(bmds-aio_bitmap); g_free(bmds); diff --git a/block.c b/block.c index 6c493ad..84c3181 100644 --- a/block.c +++ b/block.c @@ -1503,8 +1503,10 @@ static void bdrv_move_feature_fields(BlockDriverState *bs_dest, /* dirty bitmap */ bs_dest-dirty_bitmap = bs_src-dirty_bitmap; +/* ref count */ +bs_dest-refcount = bs_src-refcount; + /* job */ -bs_dest-in_use = bs_src-in_use; bs_dest-job= bs_src-job; /* keep the same entry in bdrv_states */ @@ -1534,7 +1536,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old) assert(bs_new-dirty_bitmap == NULL); assert(bs_new-job == NULL); assert(bs_new-dev == NULL); -assert(bs_new-in_use == 0); +assert(bs_new-refcount == 0); assert(bs_new-io_limits_enabled == false); assert(bs_new-block_timer == NULL); @@ -1553,7 +1555,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old) /* Check a few fields that should remain attached to the device */ assert(bs_new-dev == NULL); assert(bs_new-job == NULL); -assert(bs_new-in_use == 0); +assert(bs_new-refcount == 0); assert(bs_new-io_limits_enabled == false); assert(bs_new-block_timer == NULL); @@ -1590,7 +1592,7 @@ void bdrv_delete(BlockDriverState *bs) { assert(!bs-dev); assert(!bs-job); -assert(!bs-in_use); +assert(!bs-refcount); /* remove from list, if necessary */ bdrv_make_anon(bs); @@ -4360,15 +4362,20 @@ int64_t bdrv_get_dirty_count(BlockDriverState *bs) } } -void bdrv_set_in_use(BlockDriverState *bs, int in_use) +void bdrv_put_ref(BlockDriverState *bs) +{ +assert(bs-refcount 0); +bs-refcount--; +} + +void bdrv_get_ref(BlockDriverState *bs) { -assert(bs-in_use != in_use); -bs-in_use = in_use; +bs-refcount++; } int bdrv_in_use(BlockDriverState *bs) { -return bs-in_use; +return bs-refcount 0; } void bdrv_iostatus_enable(BlockDriverState *bs) diff --git a/blockjob.c b/blockjob.c index ca80df1..a841a66 100644 --- a/blockjob.c +++ b/blockjob.c @@ -45,7 +45,7 @@ void *block_job_create(const BlockJobType *job_type, BlockDriverState *bs, error_set(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs)); return NULL; } -bdrv_set_in_use(bs, 1); +bdrv_get_ref(bs); job = g_malloc0(job_type-instance_size); job-job_type = job_type; @@ -63,7 +63,7 @@ void *block_job_create(const BlockJobType *job_type, BlockDriverState *bs, if (error_is_set(local_err)) { bs-job = NULL; g_free(job); -bdrv_set_in_use(bs, 0); +bdrv_put_ref(bs); error_propagate(errp, local_err); return NULL; } @@ -79,7 +79,7 @@ void block_job_completed(BlockJob *job, int ret) job-cb(job-opaque, ret); bs-job = NULL; g_free(job); -bdrv_set_in_use(bs, 0); +bdrv_put_ref(bs); } void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 0356665..9893a11 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -431,7 +431,7 @@ bool virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk, s-blk = blk; /* Prevent block operations that conflict with data plane thread */ -bdrv_set_in_use(blk-conf.bs, 1); +bdrv_get_ref(blk-conf.bs); error_setg(s-migration_blocker, x-data-plane does not support migration); @@
[Qemu-devel] [PATCH 0/7] Point-in-time snapshot exporting over NBD
This series adds for point-in-time snapshot NBD exporting based on drive-backup. The ideas is described below and patches followed (the missing part is item 3, which work is in progress by Ian Main who will have another patch on it). As the work does not overlap, these series should be quite reviewable by itself. Background == The goal of image fleecing is to provide a interface to inspect a point-in-time snapshot of guest image data, not being interfered with guest overwrites after it's created. With drive-backup we already have the point-in-time snapshot image (the target image), we only need three modifications to realize this: 1. Give backup target an id, so we can add it to NBD server. 2. Assign source device as backing of target, so reading the unallocated will be passed to source. As there's copy-on-write mechanism with drive-backup job, all the modified data after snapshot created is copied to target, the unallocated data is guaranteed to be unchanged, so reading from the source is correct. Note that this requires target format supports backing file. 3. Adding sync mode 'none' to drive-backup, so the block job only copy changed data from source, which has minimal IO overhead. Usage = With above three, we can simply export a point-in-time snapshot with two QMP commands: drive-backup device=virtio0 format=qcow2 target=point-in-time.qcow2 target-id=pit0 sync=none (sync=none is not implemented for now but we can simulate with speed=1) nbd-server-add device=pit0 writable=no Lifecycle = Reference count for BlockDriverState is implemented to manage their lifecycles. Device attachment, block job, backing hd, bs-file NBD and others are current referred points of a BDS, they each call bdrv_get_ref when starting to use a BDS and bdrv_put_ref when releasing, e.g. on hot plug/unplug, nbd_server_add, drive-backup, etc. I.e., when a drive-backup target bs is being exported through NBD and the backup job finishes (or cancelled), the device is not deleted, the client can continue to access the NBD target until nbd_server_stop. It is automatically deleted when the last reference is released. The target image file is not removed automatically, since we can't assume that it's temporary. It's up to the user to remove it manually, or we need to add a command or option to mark the target bs temporary. Fam Zheng (7): block: Convert BlockDriverState.in_use to refcount block: use refcount to manage BlockDriverState lifecycle nbd: use BDS refcount block: simplify bdrv_drop_intermediate block: rename bdrv_in_use to bdrv_is_shared block: add target-id option to drive-backup QMP command block: assign backing relationship in drive-backup block-migration.c | 5 +- block.c | 123 +++- block/backup.c | 16 +- block/blkdebug.c| 1 + block/blkverify.c | 2 + block/mirror.c | 4 +- block/snapshot.c| 3 +- block/stream.c | 2 +- block/vvfat.c | 4 +- blockdev-nbd.c | 9 +-- blockdev.c | 19 --- blockjob.c | 8 +-- hw/block/dataplane/virtio-blk.c | 4 +- hw/block/xen_disk.c | 7 +-- include/block/block.h | 5 +- include/block/block_int.h | 18 +- nbd.c | 5 ++ qapi-schema.json| 7 ++- qmp-commands.hx | 3 +- 19 files changed, 126 insertions(+), 119 deletions(-) -- 1.8.3.1
[Qemu-devel] [PATCH 2/7] block: use refcount to manage BlockDriverState lifecycle
Device attach, NBD server and block job can share a BlockDriverState, this patch makes them to use refcount to grab/release a bs so they don't interfere each other with BDS lifecycle. Local BDS allocation/releasing don't need to use refcount to manage it, just use bdrv_new() and bdrv_delete(). If BDS is possibly shared with other code (e.g. NBD and guest device), use bdrv_new to allocate, then bdrv_get_ref to grab it, when job is done, call bdrv_put_ref() to release it. Don't call bdrv_delete(), since it's deleted automatically by the last bdrv_put_ref(). Signed-off-by: Fam Zheng f...@redhat.com --- block-migration.c | 1 - block.c | 31 --- block/backup.c| 3 ++- block/blkdebug.c | 1 + block/blkverify.c | 2 ++ block/mirror.c| 4 ++-- block/snapshot.c | 3 ++- block/stream.c| 2 +- block/vvfat.c | 4 +++- blockdev.c| 5 +++-- hw/block/xen_disk.c | 7 +-- include/block/block_int.h | 16 12 files changed, 53 insertions(+), 26 deletions(-) diff --git a/block-migration.c b/block-migration.c index 2efb6c0..2154b3a 100644 --- a/block-migration.c +++ b/block-migration.c @@ -558,7 +558,6 @@ static void blk_mig_cleanup(void) while ((bmds = QSIMPLEQ_FIRST(block_mig_state.bmds_list)) != NULL) { QSIMPLEQ_REMOVE_HEAD(block_mig_state.bmds_list, entry); bdrv_get_ref(bmds-bs); -drive_put_ref(drive_get_by_blockdev(bmds-bs)); g_free(bmds-aio_bitmap); g_free(bmds); } diff --git a/block.c b/block.c index 84c3181..d1ce570 100644 --- a/block.c +++ b/block.c @@ -740,7 +740,6 @@ static int bdrv_open_common(BlockDriverState *bs, BlockDriverState *file, ret = -EINVAL; goto free_and_fail; } -assert(file != NULL); bs-file = file; ret = drv-bdrv_open(bs, options, open_flags); } @@ -748,7 +747,6 @@ static int bdrv_open_common(BlockDriverState *bs, BlockDriverState *file, if (ret 0) { goto free_and_fail; } - ret = refresh_total_sectors(bs, bs-total_sectors); if (ret 0) { goto free_and_fail; @@ -925,6 +923,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options) bs-open_flags |= BDRV_O_NO_BACKING; return ret; } +bdrv_get_ref(bs-backing_hd); return 0; } @@ -1068,9 +1067,11 @@ int bdrv_open(BlockDriverState *bs, const char *filename, QDict *options, if (bs-file != file) { bdrv_delete(file); -file = NULL; } +file = NULL; +bdrv_get_ref(bs-file); + /* If there is a backing file, use it */ if ((flags BDRV_O_NO_BACKING) == 0) { QDict *backing_options; @@ -1367,7 +1368,7 @@ void bdrv_close(BlockDriverState *bs) if (bs-drv) { if (bs-backing_hd) { -bdrv_delete(bs-backing_hd); +bdrv_put_ref(bs-backing_hd); bs-backing_hd = NULL; } bs-drv-bdrv_close(bs); @@ -1391,7 +1392,7 @@ void bdrv_close(BlockDriverState *bs) bs-options = NULL; if (bs-file != NULL) { -bdrv_delete(bs-file); +bdrv_put_ref(bs-file); bs-file = NULL; } } @@ -1536,7 +1537,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old) assert(bs_new-dirty_bitmap == NULL); assert(bs_new-job == NULL); assert(bs_new-dev == NULL); -assert(bs_new-refcount == 0); +assert(bs_new-refcount = 1); assert(bs_new-io_limits_enabled == false); assert(bs_new-block_timer == NULL); @@ -1555,7 +1556,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old) /* Check a few fields that should remain attached to the device */ assert(bs_new-dev == NULL); assert(bs_new-job == NULL); -assert(bs_new-refcount == 0); +assert(bs_new-refcount = 1); assert(bs_new-io_limits_enabled == false); assert(bs_new-block_timer == NULL); @@ -1609,6 +1610,7 @@ int bdrv_attach_dev(BlockDriverState *bs, void *dev) return -EBUSY; } bs-dev = dev; +bdrv_get_ref(bs); bdrv_iostatus_reset(bs); return 0; } @@ -1626,6 +1628,7 @@ void bdrv_detach_dev(BlockDriverState *bs, void *dev) { assert(bs-dev == dev); bs-dev = NULL; +bdrv_put_ref(bs); bs-dev_ops = NULL; bs-dev_opaque = NULL; bs-buffer_alignment = 512; @@ -2102,13 +2105,16 @@ int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top, if (ret) { goto exit; } +if (new_top_bs-backing_hd) { +bdrv_put_ref(new_top_bs-backing_hd); +} new_top_bs-backing_hd = base_bs; - +bdrv_get_ref(base_bs); QSIMPLEQ_FOREACH_SAFE(intermediate_state, states_to_delete, entry, next) { /* so that bdrv_close() does not recursively close the chain */ intermediate_state-bs-backing_hd =
[Qemu-devel] [PATCH 3/7] nbd: use BDS refcount
Previously, nbd call drive_get_ref on the drive of bs. A BDS doesn't always have associated dinfo, it's more proper to use bdrv_get_ref(). Signed-off-by: Fam Zheng f...@redhat.com --- blockdev-nbd.c | 9 + nbd.c | 5 + 2 files changed, 6 insertions(+), 8 deletions(-) diff --git a/blockdev-nbd.c b/blockdev-nbd.c index 95f10c8..d8bcd6f 100644 --- a/blockdev-nbd.c +++ b/blockdev-nbd.c @@ -69,12 +69,6 @@ static void nbd_close_notifier(Notifier *n, void *data) g_free(cn); } -static void nbd_server_put_ref(NBDExport *exp) -{ -BlockDriverState *bs = nbd_export_get_blockdev(exp); -drive_put_ref(drive_get_by_blockdev(bs)); -} - void qmp_nbd_server_add(const char *device, bool has_writable, bool writable, Error **errp) { @@ -106,10 +100,9 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable, } exp = nbd_export_new(bs, 0, -1, writable ? 0 : NBD_FLAG_READ_ONLY, - nbd_server_put_ref); + NULL); nbd_export_set_name(exp, device); -drive_get_ref(drive_get_by_blockdev(bs)); n = g_malloc0(sizeof(NBDCloseNotifier)); n-n.notify = nbd_close_notifier; diff --git a/nbd.c b/nbd.c index 2606403..f28b9fb 100644 --- a/nbd.c +++ b/nbd.c @@ -881,6 +881,7 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, exp-nbdflags = nbdflags; exp-size = size == -1 ? bdrv_getlength(bs) : size; exp-close = close; +bdrv_get_ref(bs); return exp; } @@ -927,6 +928,10 @@ void nbd_export_close(NBDExport *exp) } nbd_export_set_name(exp, NULL); nbd_export_put(exp); +if (exp-bs) { +bdrv_put_ref(exp-bs); +exp-bs = NULL; +} } void nbd_export_get(NBDExport *exp) -- 1.8.3.1
[Qemu-devel] [PATCH 5/7] block: rename bdrv_in_use to bdrv_is_shared
The patch only does a rename: bdrv_in_use is obsecure literally (any BDS is certain to be used somewhere). Rename it to bdrv_is_shared since we have reference count now and the user number of the BDS is reflected there. Signed-off-by: Fam Zheng f...@redhat.com --- block.c | 16 ++-- blockdev.c| 10 +- blockjob.c| 2 +- include/block/block.h | 2 +- 4 files changed, 17 insertions(+), 13 deletions(-) diff --git a/block.c b/block.c index ae5de17..85dc76f 100644 --- a/block.c +++ b/block.c @@ -1775,7 +1775,7 @@ int bdrv_commit(BlockDriverState *bs) return -ENOTSUP; } -if (bdrv_in_use(bs) || bdrv_in_use(bs-backing_hd)) { +if (bdrv_is_shared(bs) || bdrv_is_shared(bs-backing_hd)) { return -EBUSY; } @@ -2621,14 +2621,18 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset) { BlockDriver *drv = bs-drv; int ret; -if (!drv) +if (!drv) { return -ENOMEDIUM; -if (!drv-bdrv_truncate) +} +if (!drv-bdrv_truncate) { return -ENOTSUP; -if (bs-read_only) +} +if (bs-read_only) { return -EACCES; -if (bdrv_in_use(bs)) +} +if (bdrv_is_shared(bs)) { return -EBUSY; +} ret = drv-bdrv_truncate(bs, offset); if (ret == 0) { ret = refresh_total_sectors(bs, offset BDRV_SECTOR_BITS); @@ -4333,7 +4337,7 @@ void bdrv_get_ref(BlockDriverState *bs) bs-refcount++; } -int bdrv_in_use(BlockDriverState *bs) +int bdrv_is_shared(BlockDriverState *bs) { return bs-refcount 1; } diff --git a/blockdev.c b/blockdev.c index 2c2ea59..d02d99a 100644 --- a/blockdev.c +++ b/blockdev.c @@ -858,7 +858,7 @@ static void external_snapshot_prepare(BlkTransactionState *common, return; } -if (bdrv_in_use(state-old_bs)) { +if (bdrv_is_shared(state-old_bs)) { error_set(errp, QERR_DEVICE_IN_USE, device); return; } @@ -1064,7 +1064,7 @@ exit: static void eject_device(BlockDriverState *bs, int force, Error **errp) { -if (bdrv_in_use(bs)) { +if (bdrv_is_shared(bs)) { error_set(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs)); return; } @@ -1226,7 +1226,7 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data) qerror_report(QERR_DEVICE_NOT_FOUND, id); return -1; } -if (bdrv_in_use(bs)) { +if (bdrv_is_shared(bs)) { qerror_report(QERR_DEVICE_IN_USE, id); return -1; } @@ -1471,7 +1471,7 @@ void qmp_drive_backup(const char *device, const char *target, } } -if (bdrv_in_use(bs)) { +if (bdrv_is_shared(bs)) { error_set(errp, QERR_DEVICE_IN_USE, device); return; } @@ -1588,7 +1588,7 @@ void qmp_drive_mirror(const char *device, const char *target, } } -if (bdrv_in_use(bs)) { +if (bdrv_is_shared(bs)) { error_set(errp, QERR_DEVICE_IN_USE, device); return; } diff --git a/blockjob.c b/blockjob.c index a841a66..3e9b9a8 100644 --- a/blockjob.c +++ b/blockjob.c @@ -41,7 +41,7 @@ void *block_job_create(const BlockJobType *job_type, BlockDriverState *bs, { BlockJob *job; -if (bs-job || bdrv_in_use(bs)) { +if (bs-job || bdrv_is_shared(bs)) { error_set(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs)); return NULL; } diff --git a/include/block/block.h b/include/block/block.h index 77f0f0d..6b33f5a 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -355,7 +355,7 @@ void bdrv_disable_copy_on_read(BlockDriverState *bs); void bdrv_put_ref(BlockDriverState *bs); void bdrv_get_ref(BlockDriverState *bs); -int bdrv_in_use(BlockDriverState *bs); +int bdrv_is_shared(BlockDriverState *bs); #ifdef CONFIG_LINUX_AIO int raw_get_aio_fd(BlockDriverState *bs); -- 1.8.3.1
[Qemu-devel] [PATCH 6/7] block: add target-id option to drive-backup QMP command
Add target-id (optional) to drive-backup command, to make the target bs a named drive so that we can operate on it (e.g. export with NBD). Signed-off-by: Fam Zheng f...@redhat.com --- blockdev.c | 4 +++- qapi-schema.json | 7 +-- qmp-commands.hx | 3 ++- 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/blockdev.c b/blockdev.c index d02d99a..a297eaf 100644 --- a/blockdev.c +++ b/blockdev.c @@ -936,6 +936,7 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp) backup = common-action-drive_backup; qmp_drive_backup(backup-device, backup-target, + backup-has_target_id, backup-target_id, backup-has_format, backup-format, backup-has_mode, backup-mode, backup-has_speed, backup-speed, @@ -1421,6 +1422,7 @@ void qmp_block_commit(const char *device, } void qmp_drive_backup(const char *device, const char *target, + bool has_target_id, const char *target_id, bool has_format, const char *format, bool has_mode, enum NewImageMode mode, bool has_speed, int64_t speed, @@ -1495,7 +1497,7 @@ void qmp_drive_backup(const char *device, const char *target, return; } -target_bs = bdrv_new(); +target_bs = bdrv_new(has_target_id ? target_id : ); ret = bdrv_open(target_bs, target, NULL, flags, drv); if (ret 0) { bdrv_delete(target_bs); diff --git a/qapi-schema.json b/qapi-schema.json index 5c32528..2f2a87f 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -1654,7 +1654,8 @@ # Since: 1.6 ## { 'type': 'DriveBackup', - 'data': { 'device': 'str', 'target': 'str', '*format': 'str', + 'data': { 'device': 'str', 'target': 'str', +'*target-id': 'str', '*format': 'str', '*mode': 'NewImageMode', '*speed': 'int', '*on-source-error': 'BlockdevOnError', '*on-target-error': 'BlockdevOnError' } } @@ -1807,6 +1808,7 @@ # is a device, the existing file/device will be used as the new # destination. If it does not exist, a new file will be created. # +# @target-id: #optional the drive id of the target. # @format: #optional the format of the new destination, default is to # probe if @mode is 'existing', else the format of the source # @@ -1833,7 +1835,8 @@ # Since 1.6 ## { 'command': 'drive-backup', - 'data': { 'device': 'str', 'target': 'str', '*format': 'str', + 'data': { 'device': 'str', 'target': 'str', +'*target-id': 'str', '*format': 'str', '*mode': 'NewImageMode', '*speed': 'int', '*on-source-error': 'BlockdevOnError', '*on-target-error': 'BlockdevOnError' } } diff --git a/qmp-commands.hx b/qmp-commands.hx index 362f0e1..3ed03de 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -913,7 +913,7 @@ EQMP { .name = drive-backup, -.args_type = device:B,target:s,speed:i?,mode:s?,format:s?, +.args_type = device:B,target:s,target-id:s?,speed:i?,mode:s?,format:s?, on-source-error:s?,on-target-error:s?, .mhandler.cmd_new = qmp_marshal_input_drive_backup, }, @@ -936,6 +936,7 @@ Arguments: device, the existing file/device will be used as the new destination. If it does not exist, a new file will be created. (json-string) +- target-id: the drive id of the target image. - format: the format of the new destination, default is to probe if 'mode' is 'existing', else the format of the source (json-string, optional) -- 1.8.3.1
[Qemu-devel] [PATCH 4/7] block: simplify bdrv_drop_intermediate
Signed-off-by: Fam Zheng f...@redhat.com --- block.c | 71 ++--- 1 file changed, 11 insertions(+), 60 deletions(-) diff --git a/block.c b/block.c index d1ce570..ae5de17 100644 --- a/block.c +++ b/block.c @@ -2015,12 +2015,6 @@ BlockDriverState *bdrv_find_overlay(BlockDriverState *active, return overlay; } -typedef struct BlkIntermediateStates { -BlockDriverState *bs; -QSIMPLEQ_ENTRY(BlkIntermediateStates) entry; -} BlkIntermediateStates; - - /* * Drops images above 'base' up to and including 'top', and sets the image * above 'top' to have base as its backing file. @@ -2050,15 +2044,9 @@ typedef struct BlkIntermediateStates { int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top, BlockDriverState *base) { -BlockDriverState *intermediate; -BlockDriverState *base_bs = NULL; BlockDriverState *new_top_bs = NULL; -BlkIntermediateStates *intermediate_state, *next; int ret = -EIO; -QSIMPLEQ_HEAD(states_to_delete, BlkIntermediateStates) states_to_delete; -QSIMPLEQ_INIT(states_to_delete); - if (!top-drv || !base-drv) { goto exit; } @@ -2070,58 +2058,21 @@ int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top, goto exit; } -/* special case of new_top_bs-backing_hd already pointing to base - nothing - * to do, no intermediate images */ -if (new_top_bs-backing_hd == base) { -ret = 0; -goto exit; -} - -intermediate = top; - -/* now we will go down through the list, and add each BDS we find - * into our deletion queue, until we hit the 'base' - */ -while (intermediate) { -intermediate_state = g_malloc0(sizeof(BlkIntermediateStates)); -intermediate_state-bs = intermediate; -QSIMPLEQ_INSERT_TAIL(states_to_delete, intermediate_state, entry); - -if (intermediate-backing_hd == base) { -base_bs = intermediate-backing_hd; -break; +while (new_top_bs-backing_hd new_top_bs-backing_hd != base) { +BlockDriverState *backing = new_top_bs-backing_hd; +if (backing == NULL) { +goto exit; } -intermediate = intermediate-backing_hd; -} -if (base_bs == NULL) { -/* something went wrong, we did not end at the base. safely - * unravel everything, and exit with error */ -goto exit; +new_top_bs-backing_hd = backing-backing_hd; +/* break backing_hd chain before releasing bs, so we don't free all the + * way up the backing chain */ +backing-backing_hd = NULL; +bdrv_put_ref(backing); } -/* success - we can delete the intermediate states, and link top-base */ -ret = bdrv_change_backing_file(new_top_bs, base_bs-filename, - base_bs-drv ? base_bs-drv-format_name : ); -if (ret) { -goto exit; -} -if (new_top_bs-backing_hd) { -bdrv_put_ref(new_top_bs-backing_hd); -} -new_top_bs-backing_hd = base_bs; -bdrv_get_ref(base_bs); - -QSIMPLEQ_FOREACH_SAFE(intermediate_state, states_to_delete, entry, next) { -/* so that bdrv_close() does not recursively close the chain */ -intermediate_state-bs-backing_hd = NULL; -bdrv_put_ref(intermediate_state-bs); -} -ret = 0; - +ret = bdrv_change_backing_file(new_top_bs, base-filename, + base-drv ? base-drv-format_name : ); exit: -QSIMPLEQ_FOREACH_SAFE(intermediate_state, states_to_delete, entry, next) { -g_free(intermediate_state); -} return ret; } -- 1.8.3.1
[Qemu-devel] [PATCH 7/7] block: assign backing relationship in drive-backup
Assign source image as the backing hd of target bs, so reading target bs gets the point-in-time copy of data from source image. Signed-off-by: Fam Zheng f...@redhat.com --- block/backup.c | 13 + 1 file changed, 13 insertions(+) diff --git a/block/backup.c b/block/backup.c index 4e9f927..2dd0540 100644 --- a/block/backup.c +++ b/block/backup.c @@ -294,6 +294,11 @@ static void coroutine_fn backup_run(void *opaque) hbitmap_free(job-bitmap); bdrv_iostatus_disable(target); + +bdrv_put_ref(target-backing_hd); +target-backing_hd = NULL; +target-backing_file[0] = '\0'; +target-backing_format[0] = '\0'; bdrv_put_ref(target); block_job_completed(job-common, ret); @@ -332,7 +337,15 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target, return; } +target-backing_hd = bs; +pstrcpy(target-backing_file, sizeof(target-backing_file), +bs-filename); +pstrcpy(target-backing_format, sizeof(target-backing_format), +bs-drv-format_name); bdrv_get_ref(target); +/* Get another ref to source for backing_hd relationship */ +bdrv_get_ref(bs); + job-on_source_error = on_source_error; job-on_target_error = on_target_error; job-target = target; -- 1.8.3.1
Re: [Qemu-devel] [PATCH v3 2/2] net: introduce command to query rx-filter information
On Wed, Jun 26, 2013 at 11:54:20AM +0200, Markus Armbruster wrote: Amos Kong ak...@redhat.com writes: We want to implement mac programming over macvtap through Libvirt, related rx-filter configuration contains main mac, some of rx-mode and mac-table. The previous patch adds QMP event to notify management of rx-filter change. This patch adds a monitor command to query rx-filter information. A flag is used to avoid events flooding, if user don't query rx-filter after receives one event, new events won't be sent to qmp monitor. (qemu) info rx-filter vnet0 vnet0: \ promiscuous: on \ multicast: normal \ unicast: normal \ broadcast-allowed: off \ multicast-overflow: off \ unicast-overflow: off \ main-mac: 52:54:00:12:34:56 \ unicast-table: \ multicast-table: 01:00:5e:00:00:01 33:33:00:00:00:01 33:33:ff:12:34:56 Signed-off-by: Amos Kong ak...@redhat.com Thanks for your comments, some comments are out-of-data, I removed them in this reply. +static RxFilterInfo *virtio_net_query_rxfilter(NetClientState *nc) +{ +VirtIONet *n = qemu_get_nic_opaque(nc); +RxFilterInfo *info; +strList *str_list = NULL; +strList *entry; +int i; + +info = g_malloc0(sizeof(*info)); +info-name = g_strdup(nc-name); +info-promiscuous = n-promisc; + +if (n-nouni) { +info-unicast = RX_STATE_NO; +} else if (n-alluni) { +info-unicast = RX_STATE_ALL; +} else { +info-unicast = RX_STATE_NORMAL; +} + +if (n-nomulti) { +info-multicast = RX_STATE_NO; +} else if (n-allmulti) { +info-multicast = RX_STATE_ALL; +} else { +info-multicast = RX_STATE_NORMAL; +} Makes me wonder whether replacing VirtIONet members noFOO and allFOO by an enum RxState would make things clearer there. Outside the scope of this patch. Good suggestion, added to my todolist. + +info-broadcast_allowed = n-nobcast; +info-multicast_overflow = n-mac_table.multi_overflow; +info-unicast_overflow = n-mac_table.uni_overflow; +info-main_mac = g_strdup_printf(%.2x:%.2x:%.2x:%.2x:%.2x:%.2x, + n-mac[0], n-mac[1], n-mac[2], + n-mac[3], n-mac[4], n-mac[5]); + Please initialize str_list here rather than at its declaration, because that'll make the loop's workings more locally obvious, and because... +for (i = 0; i n-mac_table.first_multi; i++) { +entry = g_malloc0(sizeof(*entry)); +entry-value = g_strdup_printf(%.2x:%.2x:%.2x:%.2x:%.2x:%.2x, + n-mac_table.macs[i * ETH_ALEN], + n-mac_table.macs[i * ETH_ALEN + 1], + n-mac_table.macs[i * ETH_ALEN + 2], + n-mac_table.macs[i * ETH_ALEN + 3], + n-mac_table.macs[i * ETH_ALEN + 4], + n-mac_table.macs[i * ETH_ALEN + 5]); +entry-next = str_list; +str_list = entry; +} +info-unicast_table = str_list; + ... it's how this loop works. Actually, the two loops are duplicates. Consider factoring out a helper function. +return info; +} + static void virtio_net_reset(VirtIODevice *vdev) { VirtIONet *n = VIRTIO_NET(vdev); @@ -442,6 +528,8 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd, return VIRTIO_NET_ERR; } +rxfilter_notify(n-netclient_name); + return VIRTIO_NET_OK; } @@ -495,6 +583,8 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd, s = iov_to_buf(iov, iov_cnt, 0, n-mac, sizeof(n-mac)); assert(s == sizeof(n-mac)); qemu_format_nic_info_str(qemu_get_queue(n-nic), n-mac); +rxfilter_notify(n-netclient_name); + return VIRTIO_NET_OK; } @@ -559,6 +649,8 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd, n-mac_table.multi_overflow = 1; } +rxfilter_notify(n-netclient_name); + return VIRTIO_NET_OK; } The error returns don't trigger the event. We can fail after clearing n-mac_table. Why is that okay? We should notify in error path if n-mac_table is changed. @@ -1312,6 +1404,7 @@ static NetClientInfo net_virtio_info = { .receive = virtio_net_receive, .cleanup = virtio_net_cleanup, .link_status_changed = virtio_net_set_link_status, +.query_rx_filter = virtio_net_query_rxfilter, }; static bool virtio_net_guest_notifier_pending(VirtIODevice *vdev, int idx) diff --git a/include/net/net.h b/include/net/net.h index 43d85a1..0af5ba3 100644 --- a/include/net/net.h +++ b/include/net/net.h @@ -49,6 +49,7 @@ typedef ssize_t
Re: [Qemu-devel] [PATCH v4 08/10] qemu-ga: call Windows VSS requester in fsfreeze command handler
On 07/01/13 19:59, Tomoki Sekiyama wrote: On 7/1/13 9:29 , Laszlo Ersek ler...@redhat.com wrote: +error: +qmp_guest_fsfreeze_thaw(NULL); Passing NULL here as errp concerns me slightly. I've been assuming that errp is never NULL due to the outermost QMP layer always wanting to receive errors and to serialize them. Specifically, a NULL errp would turn all error_set*(errp, ...) calls into no-ops under qmp_guest_fsfreeze_thaw(), and all error_is_set(errp) questions would answer with false. That said, nothing seems to be affected right now. Maybe a dummy local variable would be more future-proof... OTOH it would be stylistically questionable :) OK, then it should be: Error *local_err = NULL; ... error: qmp_guest_fsfreeze_thaw(local_err); if (error_is_set(local_err)) { error_free(local_err); } I think so, yes. You could also log it before freeing it I guess: g_debug(cleanup thaw: %s, error_get_pretty(local_err)); or some such, but it's probably not important. Thanks, Laszlo
Re: [Qemu-devel] [PATCH v2 06/11] memory: add ref/unref calls
Il 01/07/2013 20:33, Jan Kiszka ha scritto: On 2013-06-28 18:58, Paolo Bonzini wrote: Add ref/unref calls at the following places: - places where memory regions are stashed by a listener and used outside the BQL (including in Xen or KVM). - memory_region_find callsites You missed some recently added ones. Check hw/acpi/piix4.c and hw/isa/lpc_ich9.c (both will require some refactorings for this). Here are the required changes: diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c index 3b95c69..0310053 100644 --- a/hw/acpi/piix4.c +++ b/hw/acpi/piix4.c @@ -380,6 +380,16 @@ static void piix4_pm_powerdown_req(Notifier *n, void *opaque) acpi_pm1_evt_power_down(s-ar); } +static bool memory_region_present(MemoryRegion *io, hwaddr port) +{ +MemoryRegion *mr = memory_region_find(io, port, 1); +if (!mr) { +return false; +} +memory_region_unref(mr); +return true; +} + static void piix4_pm_machine_ready(Notifier *n, void *opaque) { PIIX4PMState *s = container_of(n, PIIX4PMState, machine_ready); @@ -388,10 +398,10 @@ static void piix4_pm_machine_ready(Notifier *n, void *opaque) pci_conf = s-dev.config; pci_conf[0x5f] = 0x10 | -(memory_region_find(io_as, 0x378, 1).mr ? 0x80 : 0); +(memory_region_present(io_as, 0x378, 1) ? 0x80 : 0); pci_conf[0x63] = 0x60; -pci_conf[0x67] = (memory_region_find(io_as, 0x3f8, 1).mr ? 0x08 : 0) | -(memory_region_find(io_as, 0x2f8, 1).mr ? 0x90 : 0); +pci_conf[0x67] = (memory_region_present(io_as, 0x3f8, 1) ? 0x08 : 0) | +(memory_region_present(io_as, 0x2f8, 1) ? 0x90 : 0); } static int piix4_pm_initfn(PCIDevice *dev) diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c index 82f8ea6..205ac26 100644 --- a/hw/isa/lpc_ich9.c +++ b/hw/isa/lpc_ich9.c @@ -478,24 +478,33 @@ static void ich9_lpc_machine_ready(Notifier *n, void *opaque) { ICH9LPCState *s = container_of(n, ICH9LPCState, machine_ready); MemoryRegion *io_as = pci_address_space_io(s-d); +MemoryRegion *mr; uint8_t *pci_conf; pci_conf = s-d.config; -if (memory_region_find(io_as, 0x3f8, 1).mr) { +mr = memory_region_find(io_as, 0x3f8, 1).mr; +if (mr) { /* com1 */ pci_conf[0x82] |= 0x01; +memory_region_unref(mr); } -if (memory_region_find(io_as, 0x2f8, 1).mr) { +mr = memory_region_find(io_as, 0x2f8, 1).mr; +if (mr) { /* com2 */ pci_conf[0x82] |= 0x02; +memory_region_unref(mr); } -if (memory_region_find(io_as, 0x378, 1).mr) { +mr = memory_region_find(io_as, 0x378, 1).mr; +if (mr) { /* lpt */ pci_conf[0x82] |= 0x04; +memory_region_unref(mr); } -if (memory_region_find(io_as, 0x3f0, 1).mr) { +mr = memory_region_find(io_as, 0x3f0, 1).mr; +if (mr) { /* floppy */ pci_conf[0x82] |= 0x08; +memory_region_unref(mr); } }
Re: [Qemu-devel] [PATCH 21/24] hw/p*: pass owner to memory_region_init_io
Il 02/07/2013 01:31, Andreas Färber ha scritto: Signed-off-by: Paolo Bonzini pbonz...@redhat.com Needs an update after ppc-next merge, cf. attached. Thanks! Paolo
Re: [Qemu-devel] [PATCH v5] Add timestamp to error_report()
On 07/01/13 20:54, Seiji Aguchi wrote: [Issue] When we offer a customer support service and a problem happens in a customer's system, we try to understand the problem by comparing what the customer reports with message logs of the customer's system. In this case, we often need to know when the problem happens. But, currently, there is no timestamp in qemu's error messages. Therefore, we may not be able to understand the problem based on error messages. [Solution] Add a timestamp to qemu's error message logged by error_report() with g_time_val_to_iso8601(). Signed-off-by: Seiji Aguchi seiji.agu...@hds.com --- Changelog v4 - v5 - Use sizeof() to define TIMESTAMP_LEN. - Fix descriptions of msg option. - Rename TIME_H to QEMU_TIME_H. (avoiding double inclusion of qemu/time.h) - Change argument of qemu_get_timestamp_str to char * and size_t. - Confirmed msg option is displayed by query-command-line-options. Reviewed-by: Laszlo Ersek ler...@redhat.com Since you plan to post followup patches, please at that time include a one-liner modification: the STEXI-ETEXI section in qemu-options.hx still says (right before ETEXI) (disabled by default). I think it should be fixed in a followup, not by posting v6. Thanks! Laszlo
Re: [Qemu-devel] [PATCH v2 1/2] ps2: add support of auto-repeat
On Wed, Jun 26, 2013 at 01:56:33PM +0200, Markus Armbruster wrote: Luiz Capitulino lcapitul...@redhat.com writes: On Fri, 14 Jun 2013 13:46:41 +0800 Amos Kong ak...@redhat.com wrote: On Fri, May 31, 2013 at 08:31:17PM +0800, Amos Kong wrote: On Thu, May 30, 2013 at 11:48:46AM -0500, Anthony Liguori wrote: Amos Kong ak...@redhat.com writes: diff --git a/hw/input/ps2.c b/hw/input/ps2.c index 3412079..8adbb4a 100644 --- a/hw/input/ps2.c +++ b/hw/input/ps2.c @@ -94,6 +94,10 @@ typedef struct { int translate; int scancode_set; /* 1=XT, 2=AT, 3=PS/2 */ int ledstate; +int repeat_period; /* typematic period, ms */ +int repeat_delay; /* typematic delay, ms */ +int repeat_key; /* keycode to repeat */ +QEMUTimer *repeat_timer; This state needs to be migrated, no? I suspect it can/should be done via a subsection too. It sounds only reasonable for 'sendkey' command. We want to repeat one key for 100 times, the key should be continaully repeated in the dest vm until it reaches to 100 times. For implement this, we should also migrate key_timer in ui/input.c, then it will send a release event to ps2 queue when the key_timer is expired. The bottom patch migrates repeat_timer repeat_key, where should we save key_timer for migration? Luiz, any suggestion about migrate the key_timer in ui/input.c? I don't have any. Maybe Markus or Juan can help (CC'ed). We need to migrate it, then sendkey can continually work in dest vm until the timer is expired. I read the thread, but I'm not sure I have enough context for a sensible answer. Let me try to piece it together. On a real PC keyboard, key down generates that key's make code, key up generates the key's break code. If the key is typematic, the make code repeats while the key is down. First repeat is after a programmable delay, subsequent repeats happen at a programmable rate. Which keys are typematic is programmable in scan code set 3, but we don't implement the commands to do so. Oh well, the world is full of crappy clone keyboards, this is just one more. What problem are you trying to solve? Your cover letter mentions the sendkey command. Takes an array of keys and an optional hold-time, defaulting to 100ms. Aside: array of keys + a single hold time is a rotten design. Outside the scope of this patch. 100ms is well below the smallest typematic delay, so by default no repeat happens. But if you specify a sufficiently large hold-time, it arguably should. Is that the problem you're trying to fix? The events qemu gets from host userspace is auto-repeated (using host's repeat rate), it's ok to just pass the events to guest. What my patch is doing: 1) process the events from host userspace to the raw events from keyboard hardware, then implement the auto-repeat function in qemu, then it can be configured by guest os(delay/rate). 2) for the sendkey. I had planed to just sent repeated events from sendkey code by calling interface in ps2 code. The discussion wishes to implement real auto-repeat for ps2 emulated device. Actually it's not a urgent/necessary request. Host provided auto-repeat works, and we didn't have real application of holding key by sendkey command now. Why is this problem worth fixing? Does your patch affect anything but the sendkey command? I'm asking because I'm not at all sure injecting emulated repeat between the user's keyboard and the guest OS is a good idea. I'd expect the user's keyboard to repeat according to the user's wishes already, and I'm concerned a second repeat could mess up things. -- Amos.
[Qemu-devel] [Bug 1196773] [NEW] pci_add auto nic failed on qemu monitor
Public bug reported: hello! execute qemu-system-x86_64 -M pc -m 256 -hda pc/img.qcow2 -nographic -net nic,vlan=0,macaddr=00:e0:fc:00:00:01 -net tap,vlan=0,ifname=tap0,script=no -serial tcp::,server,nowait -boot c and then (qemu) pci_add auto nic vlan=1,macaddr=00:e0:fc:40:20:02 Property 'e1000.netdev' can't take value 'hub0port0', it's in use the qemu version is 1.4.1 thx! ** Affects: qemu Importance: Undecided Status: New -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1196773 Title: pci_add auto nic failed on qemu monitor Status in QEMU: New Bug description: hello! execute qemu-system-x86_64 -M pc -m 256 -hda pc/img.qcow2 -nographic -net nic,vlan=0,macaddr=00:e0:fc:00:00:01 -net tap,vlan=0,ifname=tap0,script=no -serial tcp::,server,nowait -boot c and then (qemu) pci_add auto nic vlan=1,macaddr=00:e0:fc:40:20:02 Property 'e1000.netdev' can't take value 'hub0port0', it's in use the qemu version is 1.4.1 thx! To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1196773/+subscriptions
Re: [Qemu-devel] [PATCH v2 07/11] exec: check MRU in qemu_ram_addr_from_host
On 2013-07-01 22:48, Paolo Bonzini wrote: Il 01/07/2013 20:33, Jan Kiszka ha scritto: This function is not used outside the iothread mutex, so it can use ram_list.mru_block. Add a comment to qemu_ram_addr_from_host to document this requirement? Right now there is hardly any documentation of what does _not_ require the iothread mutex... basically everything except qemu_safe_ram_ptr and qemu_ram_ptr_length requires it. I don't disagree regarding the current state. But that doesn't imply it has to be preserved. Jan -- Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH v2 10/11] exec: reorganize address_space_map
On 2013-07-01 22:44, Paolo Bonzini wrote: Il 01/07/2013 20:34, Jan Kiszka ha scritto: The transformation looks correct, but I'm currently not understanding in which cases we still need the loop. address_space_translate should return as much of the target mr as the region can contiguously provide, no? You could have separate sections if, for example, two consecutive ranges map to contiguous sections but one is read-only and one is read-write. OK, makes sense. You can add my Reviewed-by: Jan Kiszka jan.kis...@siemens.com Jan -- Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH v2 06/11] memory: add ref/unref calls
On 2013-07-02 08:42, Paolo Bonzini wrote: Il 01/07/2013 20:33, Jan Kiszka ha scritto: On 2013-06-28 18:58, Paolo Bonzini wrote: Add ref/unref calls at the following places: - places where memory regions are stashed by a listener and used outside the BQL (including in Xen or KVM). - memory_region_find callsites You missed some recently added ones. Check hw/acpi/piix4.c and hw/isa/lpc_ich9.c (both will require some refactorings for this). Here are the required changes: diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c index 3b95c69..0310053 100644 --- a/hw/acpi/piix4.c +++ b/hw/acpi/piix4.c @@ -380,6 +380,16 @@ static void piix4_pm_powerdown_req(Notifier *n, void *opaque) acpi_pm1_evt_power_down(s-ar); } +static bool memory_region_present(MemoryRegion *io, hwaddr port) +{ +MemoryRegion *mr = memory_region_find(io, port, 1); +if (!mr) { +return false; +} +memory_region_unref(mr); +return true; +} + static void piix4_pm_machine_ready(Notifier *n, void *opaque) { PIIX4PMState *s = container_of(n, PIIX4PMState, machine_ready); @@ -388,10 +398,10 @@ static void piix4_pm_machine_ready(Notifier *n, void *opaque) pci_conf = s-dev.config; pci_conf[0x5f] = 0x10 | -(memory_region_find(io_as, 0x378, 1).mr ? 0x80 : 0); +(memory_region_present(io_as, 0x378, 1) ? 0x80 : 0); pci_conf[0x63] = 0x60; -pci_conf[0x67] = (memory_region_find(io_as, 0x3f8, 1).mr ? 0x08 : 0) | -(memory_region_find(io_as, 0x2f8, 1).mr ? 0x90 : 0); +pci_conf[0x67] = (memory_region_present(io_as, 0x3f8, 1) ? 0x08 : 0) | +(memory_region_present(io_as, 0x2f8, 1) ? 0x90 : 0); } static int piix4_pm_initfn(PCIDevice *dev) diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c index 82f8ea6..205ac26 100644 --- a/hw/isa/lpc_ich9.c +++ b/hw/isa/lpc_ich9.c @@ -478,24 +478,33 @@ static void ich9_lpc_machine_ready(Notifier *n, void *opaque) { ICH9LPCState *s = container_of(n, ICH9LPCState, machine_ready); MemoryRegion *io_as = pci_address_space_io(s-d); +MemoryRegion *mr; uint8_t *pci_conf; pci_conf = s-d.config; -if (memory_region_find(io_as, 0x3f8, 1).mr) { +mr = memory_region_find(io_as, 0x3f8, 1).mr; +if (mr) { /* com1 */ pci_conf[0x82] |= 0x01; +memory_region_unref(mr); } -if (memory_region_find(io_as, 0x2f8, 1).mr) { +mr = memory_region_find(io_as, 0x2f8, 1).mr; +if (mr) { /* com2 */ pci_conf[0x82] |= 0x02; +memory_region_unref(mr); } -if (memory_region_find(io_as, 0x378, 1).mr) { +mr = memory_region_find(io_as, 0x378, 1).mr; +if (mr) { /* lpt */ pci_conf[0x82] |= 0x04; +memory_region_unref(mr); } -if (memory_region_find(io_as, 0x3f0, 1).mr) { +mr = memory_region_find(io_as, 0x3f0, 1).mr; +if (mr) { /* floppy */ pci_conf[0x82] |= 0x08; +memory_region_unref(mr); } } Hmm, two solutions for one problem - can we improve this in the next round? Jan -- Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate Competence Center Embedded Linux
[Qemu-devel] [PATCH v3] curl: refuse to open URL from HTTP server without range support
CURL driver requests partial data from server on guest IO req. For HTTP and HTTPS, it uses Range: *** in requests, and this will not work if server not accepting range. This patch does this check when open. * Removed curl_size_cb, which is not used: On one hand it's registered to libcurl as CURLOPT_WRITEFUNCTION, instead of CURLOPT_HEADERFUNCTION, which will get called with *data*, not *header*. On the other hand the s-len is assigned unconditionally later. In this gone function, the sscanf for Content-Length: %zd, on (void *)ptr, which is not guaranteed to be zero-terminated, is potentially a security bug. So this patch fixes it as a side-effect. The bug is reported as: https://bugs.launchpad.net/qemu/+bug/1188943 (Note the bug is marked private so you might not be able to see it) * Introduced curl_header_cb, which is used to parse header and mark the server as accepting range if Accept-Ranges: bytes line is seen from response header. If protocol is HTTP or HTTPS, but server response has no not this support, refuse to open this URL. Note that python builtin module SimpleHTTPServer is an example of not supporting range, if you need to test this driver, get a better server or use internet URLs. Signed-off-by: Fam Zheng f...@redhat.com --- v3: Fix copypaste error. Improve error text as suggested by Rich. --- block/curl.c | 24 ++-- 1 file changed, 18 insertions(+), 6 deletions(-) diff --git a/block/curl.c b/block/curl.c index b8935fd..5f75fd9 100644 --- a/block/curl.c +++ b/block/curl.c @@ -81,6 +81,7 @@ typedef struct BDRVCURLState { CURLState states[CURL_NUM_STATES]; char *url; size_t readahead_size; +bool accept_range; } BDRVCURLState; static void curl_clean_state(CURLState *s); @@ -110,14 +111,15 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action, return 0; } -static size_t curl_size_cb(void *ptr, size_t size, size_t nmemb, void *opaque) +static size_t curl_header_cb(void *ptr, size_t size, size_t nmemb, void *opaque) { -CURLState *s = ((CURLState*)opaque); +BDRVCURLState *s = opaque; size_t realsize = size * nmemb; -size_t fsize; +const char *accept_line = Accept-Ranges: bytes; -if(sscanf(ptr, Content-Length: %zd, fsize) == 1) { -s-s-len = fsize; +if (realsize = strlen(accept_line) + strncmp((char *)ptr, accept_line, strlen(accept_line)) == 0) { +s-accept_range = true; } return realsize; @@ -441,8 +443,11 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags) // Get file size +s-accept_range = false; curl_easy_setopt(state-curl, CURLOPT_NOBODY, 1); -curl_easy_setopt(state-curl, CURLOPT_WRITEFUNCTION, (void *)curl_size_cb); +curl_easy_setopt(state-curl, CURLOPT_HEADERFUNCTION, + curl_header_cb); +curl_easy_setopt(state-curl, CURLOPT_HEADERDATA, s); if (curl_easy_perform(state-curl)) goto out; curl_easy_getinfo(state-curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD, d); @@ -452,6 +457,13 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags) s-len = (size_t)d; else if(!s-len) goto out; +if ((!strncasecmp(s-url, http://;, strlen(http://;)) +|| !strncasecmp(s-url, https://;, strlen(https://;))) + !s-accept_range) { +pstrcpy(state-errmsg, CURL_ERROR_SIZE, +Server does not support 'range' (byte ranges).); +goto out; +} DPRINTF(CURL: Size = %zd\n, s-len); curl_clean_state(state); -- 1.8.3.1
Re: [Qemu-devel] [RFC] Policy for supported hosts/platforms
On Mon, Jul 01, 2013 at 08:10:34AM -0400, Ed Maste wrote: On 1 July 2013 07:47, Stefan Hajnoczi stefa...@gmail.com wrote: Yes, Christian Berendt admins the buildmaster. How much time has passed since you emailed? June 7; perhaps the email was lost. I would like to start with a FreeBSD 9.x x86_64 instance, potentially adding i386 and FreeBSD 10-CURRENT to the mix later. Okay, I put Christian in the To: field. Let's wait two days for a response. I think the solution is to add another buildmaster administrator. I believe Christian offered that in the past. Stefan
Re: [Qemu-devel] [RFC] Policy for supported hosts/platforms
Hello Stefan. On 07/02/2013 09:31 AM, Stefan Hajnoczi wrote: Okay, I put Christian in the To: field. Let's wait two days for a response. I dropped Andreas a few lines to get further information about the missing packages. I think the solution is to add another buildmaster administrator. I believe Christian offered that in the past. Sure. I think it would be the best to have an additional admin directly involved in the project. Also we planned to update the buildbot to a newer systems and to also upgrade the used system. Christian. -- Christian Berendt Cloud Computing Solution Architect Tel.: +49-171-5542175 Mail: bere...@b1-systems.de B1 Systems GmbH Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537
Re: [Qemu-devel] [PATCHv2 08/11] block-migration: efficiently encode zero blocks
On Mon, Jul 01, 2013 at 06:09:55PM +0200, Peter Lieven wrote: Am 01.07.2013 um 16:13 schrieb Stefan Hajnoczi stefa...@redhat.com: On Thu, Jun 27, 2013 at 03:11:32PM +0200, Peter Lieven wrote: /* device name */ len = strlen(blk-bmds-bs-device_name); qemu_put_byte(f, len); qemu_put_buffer(f, (uint8_t *)blk-bmds-bs-device_name, len); +/* if a block is zero we need to flush here since the network + * bandwidth is now a lot higher than the storage device bandwidth. + * thus if we queue zero blocks we slow down the migration */ +if (flags BLK_MIG_FLAG_ZERO_BLOCK) { +qemu_fflush(f); +return; +} Not sure I understand this. Is the problem that the other side may require an slow writev() to fill zeros? So you want to tell the destination about the zeroes ASAP. Sorry, missed this question. Yes. If a lot of zero blocks is queued it delays migration because the target is idle while the source is queuing zero blocks. Thanks. I think this makes sense. Stefan
Re: [Qemu-devel] [PATCHv2 08/11] block-migration: efficiently encode zero blocks
On Mon, Jul 01, 2013 at 05:55:15PM +0200, Peter Lieven wrote: Am 01.07.2013 um 16:13 schrieb Stefan Hajnoczi stefa...@redhat.com: On Thu, Jun 27, 2013 at 03:11:32PM +0200, Peter Lieven wrote: This patch breaks cross-version blog migration. We need to control whether or not to use the new BLK_MIG_FLAG_ZERO_BLOCK flag. you are right the upgrade way works, but downgrade not. what is the proposed way to fix this? The safest way is to add a block migration flag that toggles the zero block. By default it would be off. Maybe you would also add a MigrationCapability in qapi-schema.json to indicated that zero blocks are supported in block migration. That would allow management tools to ask QEMU on both sides if they support zero blocks. Use migrate-set-capabilities to enable the zero block feature. CCed Juan Quintela for advice on migration. Stefan
Re: [Qemu-devel] [PATCHv2 11/11] iscsi: assert that sectors are aligned to LUN blocksize
On Mon, Jul 01, 2013 at 05:59:02PM +0200, Peter Lieven wrote: Am 01.07.2013 um 16:35 schrieb Stefan Hajnoczi stefa...@redhat.com: On Thu, Jun 27, 2013 at 03:11:35PM +0200, Peter Lieven wrote: if the blocksize of an iSCSI LUN is bigger than the BDRV_SECTOR_SIZE it is possible that sector_num or nb_sectors are not correctly alligned. to avoid corruption we fail requests which are misaligned. Signed-off-by: Peter Lieven p...@kamp.de --- block/iscsi.c | 27 +++ 1 file changed, 27 insertions(+) diff --git a/block/iscsi.c b/block/iscsi.c index 0567b46..bff2e1f 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -298,6 +298,13 @@ static int64_t sector_lun2qemu(int64_t sector, IscsiLun *iscsilun) return sector * iscsilun-block_size / BDRV_SECTOR_SIZE; } +static int64_t is_request_lun_aligned(int64_t sector_num, int nb_sectors, + IscsiLun *iscsilun) +{ +return ((sector_num * BDRV_SECTOR_SIZE) % iscsilun-block_size || +(nb_sectors * BDRV_SECTOR_SIZE) % iscsilun-block_size) ? 0 : 1; +} When QEMU does geometry probing it reads 2KB. So if the LUN has 4KB block size then QEMU will fail to open it? This would also affect image formats on top of iSCSI LUNs. opening a 4K LUN does not fail with my target. So writing unaligned sectors will result in corruption. we should at least fail all those unaligned operations until we have a fix or workaround in place in place. When QEMU tries to probe the master boot record or when an image format performs an unaligned read (say 512 bytes instead of 4 KB), won't we fail the read request? Therefore opening will fail. I agree that it's better to emit an error than to corrupt the disk. I'm just trying to understand what needs to be done on top of this to make 4 KB LUNs work. AFAICT we have no way to passing I/O topology information up from the block driver. not at the moment. Paolo had a patch series back in Dec 2011, but it never went upstream. I asked him off list and he told me that 4K drives where not important enough and the Redhat bug for this was closed. Now with 4K iSCSI targets these old work could become important again. It would be nice to find out if there was anything wrong with them or what has to be done to get them integrated. Okay
Re: [Qemu-devel] [PATCH 2/2] qemu-thread: add TLS wrappers
On Mon, Jul 01, 2013 at 02:52:08PM -0400, Ed Maste wrote: On 1 July 2013 05:35, Stefan Hajnoczi stefa...@redhat.com wrote: From: Paolo Bonzini pbonz...@redhat.com Fast TLS is not available on some platforms, but it is always nice to use it. This wrapper implementation falls back to pthread_get/setspecific on POSIX systems that lack __thread, but uses the dynamic linker's TLS support on Linux and Windows. The most recent version of this patch posted by Paolo that I see has: +#if defined(__linux__) || defined(__FreeBSD__) +#define DECLARE_TLS(type, x) \ while this one has only __linux__. Do you mind picking up that change, if this is likely to make it in via your work? Since additional changes are necessary before these patches can be merged, I hope Paolo can pick up my tls_ rename and send the next revision including his improvements. Stefan
Re: [Qemu-devel] [PATCH v2 07/11] exec: check MRU in qemu_ram_addr_from_host
Il 02/07/2013 09:09, Jan Kiszka ha scritto: Right now there is hardly any documentation of what does _not_ require the iothread mutex... basically everything except qemu_safe_ram_ptr and qemu_ram_ptr_length requires it. I don't disagree regarding the current state. But that doesn't imply it has to be preserved. As soon as there are exec.c APIs that are BQL-safe, I'll document the state of all of them. In the meanwhile, I'll add a note that memory_region_find can be called outside the BQL. Paolo
Re: [Qemu-devel] [PATCH 2/2] qemu-thread: add TLS wrappers
Il 01/07/2013 22:30, Richard Henderson ha scritto: On 07/01/2013 12:25 PM, Peter Maydell wrote: Does any OS have a __thread which compiles but is broken, or can we just have a configure test for this? That would let MacOSX+clang use __thread. I suspect that this will work. Some targets may succeed in using gcc's emutls path, which while slower than TLS is pretty much exactly the pthread get/setcontext fallback that's been proposed elsewhere on this list. We do not want to hit emutls on Windows, but that can be done simply by reordering the three implementation. Paolo
Re: [Qemu-devel] [PATCH v2 06/11] memory: add ref/unref calls
Il 02/07/2013 09:11, Jan Kiszka ha scritto: On 2013-07-02 08:42, Paolo Bonzini wrote: Il 01/07/2013 20:33, Jan Kiszka ha scritto: On 2013-06-28 18:58, Paolo Bonzini wrote: Add ref/unref calls at the following places: - places where memory regions are stashed by a listener and used outside the BQL (including in Xen or KVM). - memory_region_find callsites You missed some recently added ones. Check hw/acpi/piix4.c and hw/isa/lpc_ich9.c (both will require some refactorings for this). Here are the required changes: diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c index 3b95c69..0310053 100644 --- a/hw/acpi/piix4.c +++ b/hw/acpi/piix4.c @@ -380,6 +380,16 @@ static void piix4_pm_powerdown_req(Notifier *n, void *opaque) acpi_pm1_evt_power_down(s-ar); } +static bool memory_region_present(MemoryRegion *io, hwaddr port) +{ +MemoryRegion *mr = memory_region_find(io, port, 1); +if (!mr) { +return false; +} +memory_region_unref(mr); +return true; +} + static void piix4_pm_machine_ready(Notifier *n, void *opaque) { PIIX4PMState *s = container_of(n, PIIX4PMState, machine_ready); @@ -388,10 +398,10 @@ static void piix4_pm_machine_ready(Notifier *n, void *opaque) pci_conf = s-dev.config; pci_conf[0x5f] = 0x10 | -(memory_region_find(io_as, 0x378, 1).mr ? 0x80 : 0); +(memory_region_present(io_as, 0x378, 1) ? 0x80 : 0); pci_conf[0x63] = 0x60; -pci_conf[0x67] = (memory_region_find(io_as, 0x3f8, 1).mr ? 0x08 : 0) | -(memory_region_find(io_as, 0x2f8, 1).mr ? 0x90 : 0); +pci_conf[0x67] = (memory_region_present(io_as, 0x3f8, 1) ? 0x08 : 0) | +(memory_region_present(io_as, 0x2f8, 1) ? 0x90 : 0); } static int piix4_pm_initfn(PCIDevice *dev) diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c index 82f8ea6..205ac26 100644 --- a/hw/isa/lpc_ich9.c +++ b/hw/isa/lpc_ich9.c @@ -478,24 +478,33 @@ static void ich9_lpc_machine_ready(Notifier *n, void *opaque) { ICH9LPCState *s = container_of(n, ICH9LPCState, machine_ready); MemoryRegion *io_as = pci_address_space_io(s-d); +MemoryRegion *mr; uint8_t *pci_conf; pci_conf = s-d.config; -if (memory_region_find(io_as, 0x3f8, 1).mr) { +mr = memory_region_find(io_as, 0x3f8, 1).mr; +if (mr) { /* com1 */ pci_conf[0x82] |= 0x01; +memory_region_unref(mr); } -if (memory_region_find(io_as, 0x2f8, 1).mr) { +mr = memory_region_find(io_as, 0x2f8, 1).mr; +if (mr) { /* com2 */ pci_conf[0x82] |= 0x02; +memory_region_unref(mr); } -if (memory_region_find(io_as, 0x378, 1).mr) { +mr = memory_region_find(io_as, 0x378, 1).mr; +if (mr) { /* lpt */ pci_conf[0x82] |= 0x04; +memory_region_unref(mr); } -if (memory_region_find(io_as, 0x3f0, 1).mr) { +mr = memory_region_find(io_as, 0x3f0, 1).mr; +if (mr) { /* floppy */ pci_conf[0x82] |= 0x08; +memory_region_unref(mr); } } Hmm, two solutions for one problem - can we improve this in the next round? Sure, I can adapt the hw/acpi/piix4.c to use ifs in the same style as hw/isa/lpc_ich9.c (I find the code easier to read). Paolo
Re: [Qemu-devel] [PATCHv2 11/11] iscsi: assert that sectors are aligned to LUN blocksize
Am 02.07.2013 um 09:44 schrieb Stefan Hajnoczi stefa...@gmail.com: On Mon, Jul 01, 2013 at 05:59:02PM +0200, Peter Lieven wrote: Am 01.07.2013 um 16:35 schrieb Stefan Hajnoczi stefa...@redhat.com: On Thu, Jun 27, 2013 at 03:11:35PM +0200, Peter Lieven wrote: if the blocksize of an iSCSI LUN is bigger than the BDRV_SECTOR_SIZE it is possible that sector_num or nb_sectors are not correctly alligned. to avoid corruption we fail requests which are misaligned. Signed-off-by: Peter Lieven p...@kamp.de --- block/iscsi.c | 27 +++ 1 file changed, 27 insertions(+) diff --git a/block/iscsi.c b/block/iscsi.c index 0567b46..bff2e1f 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -298,6 +298,13 @@ static int64_t sector_lun2qemu(int64_t sector, IscsiLun *iscsilun) return sector * iscsilun-block_size / BDRV_SECTOR_SIZE; } +static int64_t is_request_lun_aligned(int64_t sector_num, int nb_sectors, + IscsiLun *iscsilun) +{ +return ((sector_num * BDRV_SECTOR_SIZE) % iscsilun-block_size || +(nb_sectors * BDRV_SECTOR_SIZE) % iscsilun-block_size) ? 0 : 1; +} When QEMU does geometry probing it reads 2KB. So if the LUN has 4KB block size then QEMU will fail to open it? This would also affect image formats on top of iSCSI LUNs. opening a 4K LUN does not fail with my target. So writing unaligned sectors will result in corruption. we should at least fail all those unaligned operations until we have a fix or workaround in place in place. When QEMU tries to probe the master boot record or when an image format performs an unaligned read (say 512 bytes instead of 4 KB), won't we fail the read request? Therefore opening will fail. I can speak for iSCSI. Currently it fails when opening a boot device. I think there is no MBR probing done for non-boot devices. Ronnie added some magic to iscsi_aio_readv to have an unaligned number for nb_sectors, but this seems not to work anymore (I will add a patch to remove it). But as soon as nb_sectors is aligned the sector_num will be rounded down and its not exactly doing what it should. Writing will succeed but with the wrong offset. UNMAP will always work. I agree that it's better to emit an error than to corrupt the disk. I'm just trying to understand what needs to be done on top of this to make 4 KB LUNs work. Reading is quite easy, you just have to have a wrapper that is reading an aligned portion of sectors around the original request and then extracting what was requested. Writing is very complicated. Requests have to be serialized, Read-Modify-Write for unaligned write requests. Paolo had all this prepared already. I wonder if it would be enough to have the block size of the host/iSCSI device propagated to the guest drivers and read 4K (or the protocol block size bytes) when probing the MBR. Windows 8, Windows 2012 and recent LInux support 4kN. So they should handle all what its needed internally. What Paolo tried to integrate is sth like 512e support for 4kN drives in qemu. What I have read in fact most of the advanced format drives do 512e already. So the only need left is for iSCSI, but I would be totally fine to say if you want to attach a 4kN device to qemu run an operating system that supports it. Peter
Re: [Qemu-devel] [PATCH v2 06/11] memory: add ref/unref calls
On 2013-07-02 10:18, Paolo Bonzini wrote: Il 02/07/2013 09:11, Jan Kiszka ha scritto: On 2013-07-02 08:42, Paolo Bonzini wrote: Il 01/07/2013 20:33, Jan Kiszka ha scritto: On 2013-06-28 18:58, Paolo Bonzini wrote: Add ref/unref calls at the following places: - places where memory regions are stashed by a listener and used outside the BQL (including in Xen or KVM). - memory_region_find callsites You missed some recently added ones. Check hw/acpi/piix4.c and hw/isa/lpc_ich9.c (both will require some refactorings for this). Here are the required changes: diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c index 3b95c69..0310053 100644 --- a/hw/acpi/piix4.c +++ b/hw/acpi/piix4.c @@ -380,6 +380,16 @@ static void piix4_pm_powerdown_req(Notifier *n, void *opaque) acpi_pm1_evt_power_down(s-ar); } +static bool memory_region_present(MemoryRegion *io, hwaddr port) +{ +MemoryRegion *mr = memory_region_find(io, port, 1); +if (!mr) { +return false; +} +memory_region_unref(mr); +return true; +} + static void piix4_pm_machine_ready(Notifier *n, void *opaque) { PIIX4PMState *s = container_of(n, PIIX4PMState, machine_ready); @@ -388,10 +398,10 @@ static void piix4_pm_machine_ready(Notifier *n, void *opaque) pci_conf = s-dev.config; pci_conf[0x5f] = 0x10 | -(memory_region_find(io_as, 0x378, 1).mr ? 0x80 : 0); +(memory_region_present(io_as, 0x378, 1) ? 0x80 : 0); pci_conf[0x63] = 0x60; -pci_conf[0x67] = (memory_region_find(io_as, 0x3f8, 1).mr ? 0x08 : 0) | -(memory_region_find(io_as, 0x2f8, 1).mr ? 0x90 : 0); +pci_conf[0x67] = (memory_region_present(io_as, 0x3f8, 1) ? 0x08 : 0) | +(memory_region_present(io_as, 0x2f8, 1) ? 0x90 : 0); } static int piix4_pm_initfn(PCIDevice *dev) diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c index 82f8ea6..205ac26 100644 --- a/hw/isa/lpc_ich9.c +++ b/hw/isa/lpc_ich9.c @@ -478,24 +478,33 @@ static void ich9_lpc_machine_ready(Notifier *n, void *opaque) { ICH9LPCState *s = container_of(n, ICH9LPCState, machine_ready); MemoryRegion *io_as = pci_address_space_io(s-d); +MemoryRegion *mr; uint8_t *pci_conf; pci_conf = s-d.config; -if (memory_region_find(io_as, 0x3f8, 1).mr) { +mr = memory_region_find(io_as, 0x3f8, 1).mr; +if (mr) { /* com1 */ pci_conf[0x82] |= 0x01; +memory_region_unref(mr); } -if (memory_region_find(io_as, 0x2f8, 1).mr) { +mr = memory_region_find(io_as, 0x2f8, 1).mr; +if (mr) { /* com2 */ pci_conf[0x82] |= 0x02; +memory_region_unref(mr); } -if (memory_region_find(io_as, 0x378, 1).mr) { +mr = memory_region_find(io_as, 0x378, 1).mr; +if (mr) { /* lpt */ pci_conf[0x82] |= 0x04; +memory_region_unref(mr); } -if (memory_region_find(io_as, 0x3f0, 1).mr) { +mr = memory_region_find(io_as, 0x3f0, 1).mr; +if (mr) { /* floppy */ pci_conf[0x82] |= 0x08; +memory_region_unref(mr); } } Hmm, two solutions for one problem - can we improve this in the next round? Sure, I can adapt the hw/acpi/piix4.c to use ifs in the same style as hw/isa/lpc_ich9.c (I find the code easier to read). I was more referring to memory_region_present vs. open-coding. Jan -- Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate Competence Center Embedded Linux
[Qemu-devel] [PATCH] Citrix PV Bus device
This patch introduces a new PCI device which will act as the binding point for Citrix branded PV drivers for Xen. The intention is that Citrix Windows PV drivers will be available on Windows Update and thus using the existing Xen platform PCI device as an anchor point is not desirable as that device has been ubiquitous in HVM guests for a long time and thus existing HVM guests running Windows would start automatically downloading drivers from Windows Update when this may not be desired by either the host or guest admin. This device therefore acts as an opt-in for those wishing to deploy Citrix PV drivers. Signed-off-by: Paul Durrant paul.durr...@citrix.com --- hw/i386/Makefile.objs|1 + hw/i386/citrix_pv_bus.c | 122 ++ include/hw/pci/pci_ids.h |3 ++ 3 files changed, 126 insertions(+) create mode 100644 hw/i386/citrix_pv_bus.c diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs index 205d22e..8e28831 100644 --- a/hw/i386/Makefile.objs +++ b/hw/i386/Makefile.objs @@ -4,3 +4,4 @@ obj-y += pc.o pc_piix.o pc_q35.o obj-$(CONFIG_XEN) += xen_domainbuild.o xen_machine_pv.o obj-y += kvmvapic.o +obj-y += citrix_pv_bus.o diff --git a/hw/i386/citrix_pv_bus.c b/hw/i386/citrix_pv_bus.c new file mode 100644 index 000..e1e0508 --- /dev/null +++ b/hw/i386/citrix_pv_bus.c @@ -0,0 +1,122 @@ +/* Copyright (c) Citrix Systems Inc. + * All rights reserved. + * + * Redistribution and use in source and binary forms, + * with or without modification, are permitted provided + * that the following conditions are met: + * + * * Redistributions of source code must retain the above + * copyright notice, this list of conditions and the + * following disclaimer. + * * Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the + * following disclaimer in the documentation and/or other + * materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS AS IS AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, + * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR + * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, + * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include hw/hw.h +#include hw/pci/pci.h + +typedef struct _CitrixPVBusDevice { +PCIDevice dev; +uint8_t revision; +uint32_tpages; +MemoryRegionmmio; +} CitrixPVBusDevice; + +static uint64_t citrix_pv_bus_mmio_read(void *opaque, hwaddr addr, +unsigned size) +{ +fprintf(stderr, WARNING: read from address 0x TARGET_FMT_plx + in Citrix PV Bus MMIO BAR\n, addr); + +return ~(uint64_t)0; +} + +static void citrix_pv_bus_mmio_write(void *opaque, hwaddr addr, + uint64_t val, unsigned size) +{ +fprintf(stderr, WARNING: write to address 0x TARGET_FMT_plx + in Citrix PV Bus MMIO BAR\n, addr); +} + +static const MemoryRegionOps citrix_pv_bus_mmio_ops = { +.read = citrix_pv_bus_mmio_read, +.write = citrix_pv_bus_mmio_write, +.endianness = DEVICE_NATIVE_ENDIAN, +}; + +static int citrix_pv_bus_init(PCIDevice *pci_dev) +{ +CitrixPVBusDevice *d = DO_UPCAST(CitrixPVBusDevice, dev, pci_dev); +uint8_t *pci_conf; +uint64_t size; + +pci_conf = pci_dev-config; + +pci_set_word(pci_conf + PCI_COMMAND, PCI_COMMAND_MEMORY); +pci_set_byte(pci_conf + PCI_REVISION_ID, d-revision); + +pci_config_set_prog_interface(pci_conf, 0); + +pci_conf[PCI_INTERRUPT_PIN] = 1; + +size = d-pages * TARGET_PAGE_SIZE; +memory_region_init_io(d-mmio, citrix_pv_bus_mmio_ops, d, + citrix-bus-mmio, size); + +pci_register_bar(pci_dev, 1, PCI_BASE_ADDRESS_MEM_PREFETCH, + d-mmio); + +return 0; +} + +static Property citrix_pv_bus_props[] = { +DEFINE_PROP_UINT8(revision, CitrixPVBusDevice, revision, 0x01), +DEFINE_PROP_UINT32(pages, CitrixPVBusDevice, pages, 128), +DEFINE_PROP_END_OF_LIST() +}; + +static void citrix_pv_bus_class_init(ObjectClass *klass, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(klass); +PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); + +k-init = citrix_pv_bus_init; +k-vendor_id = PCI_VENDOR_ID_CITRIX; +k-device_id = PCI_DEVICE_ID_CITRIX_PV_BUS; +k-class_id = PCI_CLASS_SYSTEM_OTHER; +
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
On 02.07.13 at 10:39, Paul Durrant paul.durr...@citrix.com wrote: --- a/include/hw/pci/pci_ids.h +++ b/include/hw/pci/pci_ids.h @@ -151,4 +151,7 @@ #define PCI_VENDOR_ID_TEWS 0x1498 #define PCI_DEVICE_ID_TEWS_TPCI200 0x30C8 +#define PCI_VENDOR_ID_CITRIX 0x5853 Is that really the right way to do things, considering that the same value is elsewhere used for PCI_VENDOR_ID_XEN? Jan +#define PCI_DEVICE_ID_CITRIX_PV_BUS 0x0002 + #endif
Re: [Qemu-devel] [RFC] qemu-img: add option -d in convert
On Thu, Jun 27, 2013 at 08:30:04PM +0800, Wenchao Xia wrote: 于 2013-6-27 17:01, Stefan Hajnoczi 写道: On Tue, Jun 25, 2013 at 07:14:19PM +0800, Wenchao Xia wrote: 于 2013-6-25 17:13, Stefan Hajnoczi 写道: On Thu, Jun 20, 2013 at 04:59:17PM +0800, Wenchao Xia wrote: BTW we already have qemu-io -c map which prints out allocation information for an image file. If that command is extended to support -s then you can get your info easily. Do you mean use it like: 1 call qemu-io file -c map -s sn0 2 call qemu-io file -c map -s sn1 3 for (offset = 0; offset len; offset +=512) { if (allocated on sn0) { call qemu-io file read -s sn0, into buf0 } if (allocated on sn1) { call qemu-io file read -s sn1, into buf1 } if strcmp(buf0, buf1) { write down the delta } } ? I think it is workable and flex, the bottle neck is the string parsing of qemu-io. For delta data info retrieving purpose, qemu-img approach would faster. It's also possible to put Step 3 into qemu-img so that this feature is fast and easy to use with a single command. Stefan
Re: [Qemu-devel] [PATCH] full introspection support for QMP
On Thu, Jun 20, 2013 at 11:20:21PM -0400, Luiz Capitulino wrote: On Wed, 19 Jun 2013 20:24:37 +0800 Amos Kong ak...@redhat.com wrote: Introduces new monitor command to query QMP schema information, the return data is a nested dict/list, it contains the useful metadata. Thanks your comments. Thanks for the good work, Amos! When testing this though I actually get qemu-ga's schema, not qmp's. Did you test this with qemu-ga build enabled? Currently qmp-schema.h will be generated two times. QMP's schema will cover qga's schema. I will add an option '-i' / '--instrospec' for qapi-types.py to set the name of generated file. This bug shows that we need to handle qemu-ga properly, which means having query-guest-agent-schema for qemu-ga. I will generate two schema tables for qmp qga qmp-schema.h qga-schema.h It's also a good idea to start the commit log with some json examples btw. OK, I will give some example in commitlog, and add a doc to describe dynamical DataObject. More comments below. we can add events definations to qapi-schema.json, then it can also be queried. Signed-off-by: Amos Kong ak...@redhat.com --- Makefile | 4 +- qapi-schema.json | 68 +++ qmp-commands.hx | 39 +++ qmp.c| 170 +++ scripts/qapi-commands.py | 2 +- scripts/qapi-types.py| 34 +- scripts/qapi-visit.py| 2 +- scripts/qapi.py | 7 +- 8 files changed, 320 insertions(+), 6 deletions(-) diff --git a/Makefile b/Makefile index 3cfa7d0..42713ef 100644 --- a/Makefile +++ b/Makefile @@ -38,7 +38,7 @@ endif endif GENERATED_HEADERS = config-host.h qemu-options.def -GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h +GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h qmp-schema.h GENERATED_SOURCES += qmp-marshal.c qapi-types.c qapi-visit.c GENERATED_HEADERS += trace/generated-events.h @@ -213,7 +213,7 @@ qga/qapi-generated/qga-qmp-commands.h qga/qapi-generated/qga-qmp-marshal.c :\ $(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py) $(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py $(gen-out-type) -o qga/qapi-generated -p qga- $, GEN $@) -qapi-types.c qapi-types.h :\ +qapi-types.c qapi-types.h qmp-schema.h:\ $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py) $(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py $(gen-out-type) -o . -b $, GEN $@) qapi-visit.c qapi-visit.h :\ diff --git a/qapi-schema.json b/qapi-schema.json index 6cc07c2..43abe57 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -3608,3 +3608,71 @@ '*cpuid-input-ecx': 'int', 'cpuid-register': 'X86CPURegister32', 'features': 'int' } } + +## +# @DataObject +# +# Details of a data object, it can be nested dictionary/list +# +# @name: #optional the string key of dictionary Data object name if it has one? +# +# @type: the string value of dictionary or list data type name? +# +# @data: #optional a list of @DataObject, dictionary's value is nested +#dictionary/list #optional DataObject list, can be a dictionary or list type? +# +# Since: 1.6 +## +{ 'type': 'DataObject', + 'data': { '*name': 'str', '*type': 'str', '*data': ['DataObject'] } } + +## +# @SchemaMetatype As we're doing CamelCase, this should be SchemaMetaType. Or maybe just SchemaType? +# +# Possible meta types of a schema entry +# +# @Command: QMP monitor command to control guest QMP command is good enough. +# +# @Type: defined new data type +# +# @Enumeration: enumeration data type +# +# @Union: union data type +# +# @Event: QMP event to notify QMP clients I'm not sure we should have events listed here as they are not supported yet. Will remove it first. +# +# Since: 1.6 +## +{ 'enum': 'SchemaMetatype', + 'data': ['Command', 'Type', 'Enumeration', 'Union', 'Event'] } + +## +# @SchemaData Sorry for the bikeshed, but SchemaEntry maybe? +# +# Details of schema items +# +# @type: dict's value, list's value Entry's type in string format. +# +# @name: dict's key Entry name. +# +# @data: #optional list of @DataObject, arguments data of executing +#QMP command #optional list of DataObject. This can have different meaning depending on the 'type' value. For example, for a QMP command, this member contains an argument listing. For an enumeration, it contains the enum's values and so on +# +# @returns: #optional list of DataObject, return data after executing +# QMP command I don't parse what's after the coma. #optional list of DataObject. This is the return date of executing a QMP
Re: [Qemu-devel] [PATCH] block: add the optional file entry to query-block
On Fri, Jun 28, 2013 at 10:32:30AM -0400, Federico Simoncelli wrote: This patch adds the optional file entry to the query-block output. The value is a json-object representing the information about the underlying file or device (when present). Signed-off-by: Federico Simoncelli fsimo...@redhat.com --- block/qapi.c |9 - qapi-schema.json |4 +++- qmp-commands.hx|8 tests/qemu-iotests/043.out | 15 +++ 4 files changed, 34 insertions(+), 2 deletions(-) diff --git a/block/qapi.c b/block/qapi.c index a4bc411..03cd222 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -119,7 +119,7 @@ void bdrv_query_image_info(BlockDriverState *bs, info-filename= g_strdup(bs-filename); info-format = g_strdup(bdrv_get_format_name(bs)); -info-virtual_size= total_sectors * 512; +info-virtual_size= bdrv_getlength(bs); info-actual_size = bdrv_get_allocated_file_size(bs); info-has_actual_size = info-actual_size = 0; if (bdrv_is_encrypted(bs)) { Now total_sectors is unused. Why make this change?
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
On Tue, 2013-07-02 at 09:39 +0100, Paul Durrant wrote: This patch introduces a new PCI device which will act as the binding point for Citrix branded PV drivers for Xen. The intention is that Citrix Windows PV drivers will be available on Windows Update and thus using the existing Xen platform PCI device as an anchor point is not desirable as that device has been ubiquitous in HVM guests for a long time and thus existing HVM guests running Windows would start automatically downloading drivers from Windows Update when this may not be desired by either the host or guest admin. What about 9aae0902d5bc7e449b7c8e4e778abcd0053...@lonpex01cl01.citrite.net ?
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
-Original Message- From: Jan Beulich [mailto:jbeul...@suse.com] Sent: 02 July 2013 09:46 To: Paul Durrant Cc: xen-de...@lists.xen.org; qemu-devel@nongnu.org Subject: Re: [Xen-devel] [PATCH] Citrix PV Bus device On 02.07.13 at 10:39, Paul Durrant paul.durr...@citrix.com wrote: --- a/include/hw/pci/pci_ids.h +++ b/include/hw/pci/pci_ids.h @@ -151,4 +151,7 @@ #define PCI_VENDOR_ID_TEWS 0x1498 #define PCI_DEVICE_ID_TEWS_TPCI200 0x30C8 +#define PCI_VENDOR_ID_CITRIX 0x5853 Is that really the right way to do things, considering that the same value is elsewhere used for PCI_VENDOR_ID_XEN? Jan +#define PCI_DEVICE_ID_CITRIX_PV_BUS 0x0002 + #endif I opted to do this for clarity. The fact that the Xen platform device uses Citrix's vendor ID is historical; I wanted to be clear that this device was distinct. Paul
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
On Tue, 2013-07-02 at 08:57 +, Paul Durrant wrote: -Original Message- From: Jan Beulich [mailto:jbeul...@suse.com] Sent: 02 July 2013 09:46 To: Paul Durrant Cc: xen-de...@lists.xen.org; qemu-devel@nongnu.org Subject: Re: [Xen-devel] [PATCH] Citrix PV Bus device On 02.07.13 at 10:39, Paul Durrant paul.durr...@citrix.com wrote: --- a/include/hw/pci/pci_ids.h +++ b/include/hw/pci/pci_ids.h @@ -151,4 +151,7 @@ #define PCI_VENDOR_ID_TEWS 0x1498 #define PCI_DEVICE_ID_TEWS_TPCI200 0x30C8 +#define PCI_VENDOR_ID_CITRIX 0x5853 Is that really the right way to do things, considering that the same value is elsewhere used for PCI_VENDOR_ID_XEN? Jan +#define PCI_DEVICE_ID_CITRIX_PV_BUS 0x0002 + #endif I opted to do this for clarity. The fact that the Xen platform device uses Citrix's vendor ID is historical; AFAIR this was XenSource's vendor ID (it is XS in ASCII) which was donated to the Xen community. I'll clarify this internally though. I wanted to be clear that this device was distinct. I think giving two names to the same ID serves only to obfuscate. Ian.
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
-Original Message- From: Ian Campbell Sent: 02 July 2013 10:02 To: Paul Durrant Cc: Jan Beulich; qemu-devel@nongnu.org; xen-de...@lists.xen.org Subject: Re: [Xen-devel] [PATCH] Citrix PV Bus device On Tue, 2013-07-02 at 08:57 +, Paul Durrant wrote: -Original Message- From: Jan Beulich [mailto:jbeul...@suse.com] Sent: 02 July 2013 09:46 To: Paul Durrant Cc: xen-de...@lists.xen.org; qemu-devel@nongnu.org Subject: Re: [Xen-devel] [PATCH] Citrix PV Bus device On 02.07.13 at 10:39, Paul Durrant paul.durr...@citrix.com wrote: --- a/include/hw/pci/pci_ids.h +++ b/include/hw/pci/pci_ids.h @@ -151,4 +151,7 @@ #define PCI_VENDOR_ID_TEWS 0x1498 #define PCI_DEVICE_ID_TEWS_TPCI200 0x30C8 +#define PCI_VENDOR_ID_CITRIX 0x5853 Is that really the right way to do things, considering that the same value is elsewhere used for PCI_VENDOR_ID_XEN? Jan +#define PCI_DEVICE_ID_CITRIX_PV_BUS 0x0002 + #endif I opted to do this for clarity. The fact that the Xen platform device uses Citrix's vendor ID is historical; AFAIR this was XenSource's vendor ID (it is XS in ASCII) which was donated to the Xen community. I'll clarify this internally though. The PCI SIG has it registered to Citrix. I wanted to be clear that this device was distinct. I think giving two names to the same ID serves only to obfuscate. Ok. I don't mind dropping the definition if that's generally preferred. Paul
Re: [Qemu-devel] [PATCH v3 2/2] net: introduce command to query rx-filter information
Amos Kong ak...@redhat.com writes: On Wed, Jun 26, 2013 at 11:54:20AM +0200, Markus Armbruster wrote: Amos Kong ak...@redhat.com writes: We want to implement mac programming over macvtap through Libvirt, related rx-filter configuration contains main mac, some of rx-mode and mac-table. The previous patch adds QMP event to notify management of rx-filter change. This patch adds a monitor command to query rx-filter information. A flag is used to avoid events flooding, if user don't query rx-filter after receives one event, new events won't be sent to qmp monitor. (qemu) info rx-filter vnet0 vnet0: \ promiscuous: on \ multicast: normal \ unicast: normal \ broadcast-allowed: off \ multicast-overflow: off \ unicast-overflow: off \ main-mac: 52:54:00:12:34:56 \ unicast-table: \ multicast-table: 01:00:5e:00:00:01 33:33:00:00:00:01 33:33:ff:12:34:56 Signed-off-by: Amos Kong ak...@redhat.com Thanks for your comments, some comments are out-of-data, I removed them in this reply. Okay :) +static RxFilterInfo *virtio_net_query_rxfilter(NetClientState *nc) +{ +VirtIONet *n = qemu_get_nic_opaque(nc); +RxFilterInfo *info; +strList *str_list = NULL; +strList *entry; +int i; + +info = g_malloc0(sizeof(*info)); +info-name = g_strdup(nc-name); +info-promiscuous = n-promisc; + +if (n-nouni) { +info-unicast = RX_STATE_NO; +} else if (n-alluni) { +info-unicast = RX_STATE_ALL; +} else { +info-unicast = RX_STATE_NORMAL; +} + +if (n-nomulti) { +info-multicast = RX_STATE_NO; +} else if (n-allmulti) { +info-multicast = RX_STATE_ALL; +} else { +info-multicast = RX_STATE_NORMAL; +} Makes me wonder whether replacing VirtIONet members noFOO and allFOO by an enum RxState would make things clearer there. Outside the scope of this patch. Good suggestion, added to my todolist. + +info-broadcast_allowed = n-nobcast; +info-multicast_overflow = n-mac_table.multi_overflow; +info-unicast_overflow = n-mac_table.uni_overflow; +info-main_mac = g_strdup_printf(%.2x:%.2x:%.2x:%.2x:%.2x:%.2x, + n-mac[0], n-mac[1], n-mac[2], + n-mac[3], n-mac[4], n-mac[5]); + Please initialize str_list here rather than at its declaration, because that'll make the loop's workings more locally obvious, and because... +for (i = 0; i n-mac_table.first_multi; i++) { +entry = g_malloc0(sizeof(*entry)); +entry-value = g_strdup_printf(%.2x:%.2x:%.2x:%.2x:%.2x:%.2x, + n-mac_table.macs[i * ETH_ALEN], + n-mac_table.macs[i * ETH_ALEN + 1], + n-mac_table.macs[i * ETH_ALEN + 2], + n-mac_table.macs[i * ETH_ALEN + 3], + n-mac_table.macs[i * ETH_ALEN + 4], + n-mac_table.macs[i * ETH_ALEN + 5]); +entry-next = str_list; +str_list = entry; +} +info-unicast_table = str_list; + ... it's how this loop works. Actually, the two loops are duplicates. Consider factoring out a helper function. +return info; +} + static void virtio_net_reset(VirtIODevice *vdev) { VirtIONet *n = VIRTIO_NET(vdev); @@ -442,6 +528,8 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd, return VIRTIO_NET_ERR; } +rxfilter_notify(n-netclient_name); + return VIRTIO_NET_OK; } @@ -495,6 +583,8 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd, s = iov_to_buf(iov, iov_cnt, 0, n-mac, sizeof(n-mac)); assert(s == sizeof(n-mac)); qemu_format_nic_info_str(qemu_get_queue(n-nic), n-mac); +rxfilter_notify(n-netclient_name); + return VIRTIO_NET_OK; } @@ -559,6 +649,8 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd, n-mac_table.multi_overflow = 1; } +rxfilter_notify(n-netclient_name); + return VIRTIO_NET_OK; } The error returns don't trigger the event. We can fail after clearing n-mac_table. Why is that okay? We should notify in error path if n-mac_table is changed. @@ -1312,6 +1404,7 @@ static NetClientInfo net_virtio_info = { .receive = virtio_net_receive, .cleanup = virtio_net_cleanup, .link_status_changed = virtio_net_set_link_status, +.query_rx_filter = virtio_net_query_rxfilter, }; static bool virtio_net_guest_notifier_pending(VirtIODevice *vdev, int idx) diff --git a/include/net/net.h b/include/net/net.h index 43d85a1..0af5ba3 100644 --- a/include/net/net.h +++ b/include/net/net.h @@ -49,6 +49,7
[Qemu-devel] [PATCH] target-openrisc: Add typename for CPU models.
Make target-openrisc running OK by add typename in openrisc_cpu_class_by_name(). Signed-off-by: Dongxue Zhang elta@gmail.com --- target-openrisc/cpu.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/target-openrisc/cpu.c b/target-openrisc/cpu.c index fd90d37..d38c28b 100644 --- a/target-openrisc/cpu.c +++ b/target-openrisc/cpu.c @@ -96,12 +96,14 @@ static void openrisc_cpu_initfn(Object *obj) static ObjectClass *openrisc_cpu_class_by_name(const char *cpu_model) { ObjectClass *oc; +char *typename; if (cpu_model == NULL) { return NULL; } -oc = object_class_by_name(cpu_model); +typename = g_strdup_printf(%s- TYPE_OPENRISC_CPU, cpu_model); +oc = object_class_by_name(typename); if (oc != NULL (!object_class_dynamic_cast(oc, TYPE_OPENRISC_CPU) || object_class_is_abstract(oc))) { return NULL; -- 1.8.1.2
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
-Original Message- From: Ian Campbell Sent: 02 July 2013 09:57 To: Paul Durrant Cc: qemu-devel@nongnu.org; xen-de...@lists.xen.org Subject: Re: [Xen-devel] [PATCH] Citrix PV Bus device On Tue, 2013-07-02 at 09:39 +0100, Paul Durrant wrote: This patch introduces a new PCI device which will act as the binding point for Citrix branded PV drivers for Xen. The intention is that Citrix Windows PV drivers will be available on Windows Update and thus using the existing Xen platform PCI device as an anchor point is not desirable as that device has been ubiquitous in HVM guests for a long time and thus existing HVM guests running Windows would start automatically downloading drivers from Windows Update when this may not be desired by either the host or guest admin. What about 9aae0902d5bc7e449b7c8e4e778abcd0053...@lonpex01cl01.citrite.net ? I had actually coded up a solution based on the existing Xen platform device, by having it synthesize a device ID based on the Xen version to which we could then host the xenbus driver, to allow us to deploy multiple versions of xenbus should compatibility (with things such as the shared info interface) become an issue. The co-installer for this driver could also spot existing PV driver installations and make sure they don't get trashed. This idea was rejected by Citrix product teams though, because we would not be able to prevent any Windows guest without some known PV drivers from downloading our new driver from Windows Update and this was seen as undesirable. Paul
Re: [Qemu-devel] [PATCH v5] Add timestamp to error_report()
On Mon, Jul 01, 2013 at 02:54:07PM -0400, Seiji Aguchi wrote: diff --git a/qemu-options.hx b/qemu-options.hx index ca6fdf6..a6dac1a 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -3102,3 +3102,15 @@ HXCOMM This is the last statement. Insert new options before this line! STEXI @end table ETEXI + +DEF(msg, HAS_ARG, QEMU_OPTION_msg, +-msg [timestamp=on|off]\n + change the format of messages\n + timestamp=on|off enables leading timestamps (default:on)\n, +QEMU_ARCH_ALL) +STEXI +@item -msg timestamp=on|off +@findex -msg +prepend a timestamp to each log message. +(disabled by default) +ETEXI I am confused. If the user specifies -msg then enable_timestamp_msg is on by default. If the user does not specify -msg then enable_timestmap_msg is off. Did I get that right? This means that the default behavior of QEMU does not change but you can add -msg to enable timestamps. I'm happy with this but find the documentation confusing. diff --git a/util/qemu-time.c b/util/qemu-time.c new file mode 100644 index 000..3862788 --- /dev/null +++ b/util/qemu-time.c @@ -0,0 +1,26 @@ +/* + * Time handling + * + * Copyright (C) 2013 Hitachi Data Systems Corp. + * + * Authors: + * Seiji Aguchi seiji.agu...@hds.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ +#include qemu/time.h + +void qemu_get_timestamp_str(char *timestr, size_t len) +{ +GTimeVal tv; +gchar *tmp_str = NULL; + +/* Size of len should be at least TIMESTAMP_LEN to avoid truncation */ +assert(len = TIMESTAMP_LEN); + +g_get_current_time(tv); +tmp_str = g_time_val_to_iso8601(tv); +g_strlcpy((gchar *)timestr, tmp_str, len); +g_free(tmp_str); +} Why strcpy into a fixed-size buffer when glib gives us a heap-allocated string? This patch would be simpler by dropping qemu-time.c/time.h and simply doing: if (enable_timestamp_msg) { GTimeVal tv; gchar *timestamp; g_get_current_time(tv); timestamp = g_time_val_to_iso8601(tv); error_printf(%s , timestamp); g_free(timestamp); }
Re: [Qemu-devel] [PATCH] target-openrisc: Add typename for CPU models.
Hi Dongxue, On Tue, Jul 2, 2013 at 5:11 PM, Dongxue Zhang elta@gmail.com wrote: Make target-openrisc running OK by add typename in openrisc_cpu_class_by_name(). Signed-off-by: Dongxue Zhang elta@gmail.com --- target-openrisc/cpu.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/target-openrisc/cpu.c b/target-openrisc/cpu.c index fd90d37..d38c28b 100644 --- a/target-openrisc/cpu.c +++ b/target-openrisc/cpu.c @@ -96,12 +96,14 @@ static void openrisc_cpu_initfn(Object *obj) static ObjectClass *openrisc_cpu_class_by_name(const char *cpu_model) { ObjectClass *oc; +char *typename; if (cpu_model == NULL) { return NULL; } -oc = object_class_by_name(cpu_model); +typename = g_strdup_printf(%s- TYPE_OPENRISC_CPU, cpu_model); +oc = object_class_by_name(typename); if (oc != NULL (!object_class_dynamic_cast(oc, TYPE_OPENRISC_CPU) || object_class_is_abstract(oc))) { return NULL; Thanks for your fix, it looks and test good to me. -- 1.8.1.2 Regards, Jia
Re: [Qemu-devel] [PATCH v2 06/11] memory: add ref/unref calls
Il 02/07/2013 10:36, Jan Kiszka ha scritto: Hmm, two solutions for one problem - can we improve this in the next round? Sure, I can adapt the hw/acpi/piix4.c to use ifs in the same style as hw/isa/lpc_ich9.c (I find the code easier to read). I was more referring to memory_region_present vs. open-coding. Understood, but once you replace ?: with if, the memory_region_present wrapper loses most of the appeal. So I actually prefer the open-coded one. Paolo
Re: [Qemu-devel] [PATCH v2 06/11] memory: add ref/unref calls
On 2013-07-02 11:28, Paolo Bonzini wrote: Il 02/07/2013 10:36, Jan Kiszka ha scritto: Hmm, two solutions for one problem - can we improve this in the next round? Sure, I can adapt the hw/acpi/piix4.c to use ifs in the same style as hw/isa/lpc_ich9.c (I find the code easier to read). I was more referring to memory_region_present vs. open-coding. Understood, but once you replace ?: with if, the memory_region_present wrapper loses most of the appeal. So I actually prefer the open-coded one. Well, count the memory_region_unref calls. Jan -- Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate Competence Center Embedded Linux
[Qemu-devel] [PATCH] qom: Use atomics for object refcounting
Objects can soon be referenced/dereference outside the BQL. So we need to use atomics in object_ref/unref. Based on patch by Liu Ping Fan. Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- qom/object.c |5 ++--- 1 files changed, 2 insertions(+), 3 deletions(-) diff --git a/qom/object.c b/qom/object.c index 803b94b..a76a30b 100644 --- a/qom/object.c +++ b/qom/object.c @@ -683,16 +683,15 @@ GSList *object_class_get_list(const char *implements_type, void object_ref(Object *obj) { -obj-ref++; + __sync_fetch_and_add(obj-ref, 1); } void object_unref(Object *obj) { g_assert(obj-ref 0); -obj-ref--; /* parent always holds a reference to its children */ -if (obj-ref == 0) { +if (__sync_sub_and_fetch(obj-ref, 1) == 0) { object_finalize(obj); } } -- 1.7.3.4
Re: [Qemu-devel] [PATCHv2 07/11] iscsi: let bdrv_create conditionally zero out the device
Il 01/07/2013 23:36, Peter Lieven ha scritto: Am 01.07.2013 um 22:20 schrieb Paolo Bonzini pbonz...@redhat.com: Il 27/06/2013 15:11, Peter Lieven ha scritto: if the device supports unmapping and unmapped blocks read as zero ensure that the whole device is unmapped and report .has_zero_init = 1 in this case to speed up qemu-img convert. This can still take several minutes. Do you need any special timeout in libiscsi? Not as far as I know. The number of unmapped sectors is bound by iscsilun-max_unmap. This is what the storage will handle in one request. For my storage its the internal page size (15MB). Its very fast. Except for some broken storages I expect that unmapping on a thin-provisioned target means deleting of pointers or something similar not actually writing something to the disks. This it what the SANITIZE command does. Ok, I remember someone reporting timeouts when doing a discard with virtio-scsi. But maybe the problem there was that Linux parallelized the requests excessively after splitting them. Do you have evidence that it is really taking minutes somewhere? I tested it with a fully allocated 1TB volume. In my case it was a question of about 20 seconds. I think this was mainly due to the thousands of sync requests that had to be sent. Perhaps we can have a new discard_zeroes field in bdrv_get_info, and the unmap functionality can be moved up to qemu-img convert? Is there any other storage protocol out there that could benefit from it? Definitely LVM. Perhaps in the future gluster too, though right now it only supports discard on files, not block devices. Paolo
Re: [Qemu-devel] [PATCH v2 06/11] memory: add ref/unref calls
Il 02/07/2013 11:31, Jan Kiszka ha scritto: On 2013-07-02 11:28, Paolo Bonzini wrote: Il 02/07/2013 10:36, Jan Kiszka ha scritto: Hmm, two solutions for one problem - can we improve this in the next round? Sure, I can adapt the hw/acpi/piix4.c to use ifs in the same style as hw/isa/lpc_ich9.c (I find the code easier to read). I was more referring to memory_region_present vs. open-coding. Understood, but once you replace ?: with if, the memory_region_present wrapper loses most of the appeal. So I actually prefer the open-coded one. Well, count the memory_region_unref calls. Fair enough... I'll move memory_region_present to memory.h/c. Paolo
Re: [Qemu-devel] [PATCH] block: fix bdrv_flush() ordering in bdrv_close()
Am 01.07.2013 um 13:29 hat Stefan Hajnoczi geschrieben: Since 80ccf93b we flush the block device during close. The bdrv_drain_all() call should come before bdrv_flush() to ensure guest write requests have completed. Otherwise we may miss pending writes when flushing. However, there is still a slight change that cancelling a blockjob or doing bdrv_flush() might leave an I/O request running, so call bdrv_drain_all() again for safety as the final step. Cc: qemu-sta...@nongnu.org Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/block.c b/block.c index 6c493ad..fca55b1 100644 --- a/block.c +++ b/block.c @@ -1358,11 +1358,12 @@ void bdrv_reopen_abort(BDRVReopenState *reopen_state) void bdrv_close(BlockDriverState *bs) { -bdrv_flush(bs); +bdrv_drain_all(); /* complete guest I/O */ if (bs-job) { block_job_cancel_sync(bs-job); } -bdrv_drain_all(); +bdrv_flush(bs); +bdrv_drain_all(); /* in case blockjob or flush started I/O */ notifier_list_notify(bs-close_notifiers, bs); I think we need the bdrv_drain_all() immediately before calling bdrv_flush(), so that block job writes are flushed as well. You can probably move the first one there, there doesn't seem to be a reason to drain guest requests and block job requests separately. Kevin
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
On Tue, 2013-07-02 at 10:14 +0100, Paul Durrant wrote: -Original Message- From: Ian Campbell Sent: 02 July 2013 09:57 To: Paul Durrant Cc: qemu-devel@nongnu.org; xen-de...@lists.xen.org Subject: Re: [Xen-devel] [PATCH] Citrix PV Bus device On Tue, 2013-07-02 at 09:39 +0100, Paul Durrant wrote: This patch introduces a new PCI device which will act as the binding point for Citrix branded PV drivers for Xen. The intention is that Citrix Windows PV drivers will be available on Windows Update and thus using the existing Xen platform PCI device as an anchor point is not desirable as that device has been ubiquitous in HVM guests for a long time and thus existing HVM guests running Windows would start automatically downloading drivers from Windows Update when this may not be desired by either the host or guest admin. What about 9aae0902d5bc7e449b7c8e4e778abcd0053...@lonpex01cl01.citrite.net ? I had actually coded up a solution based on the existing Xen platform device, by having it synthesize a device ID based on the Xen version to which we could then host the xenbus driver, to allow us to deploy multiple versions of xenbus should compatibility (with things such as the shared info interface) become an issue. The co-installer for this driver could also spot existing PV driver installations and make sure they don't get trashed. I think only this last bit of functionality is critical here, and it allows us to avoid having to carry multiple platform devices in upstream, doesn't it? This idea was rejected by Citrix product teams though, because we would not be able to prevent any Windows guest without some known PV drivers from downloading our new driver from Windows Update and this was seen as undesirable. Well, if your product requirements are at odds with doing the right thing upstream then I think it would be best for you to just carry patches to make XS behave how you want. I hope we can find a suitable compromise though. Ian.
Re: [Qemu-devel] [PATCH 28/30] exec: change iotlb APIs to take AddressSpaceDispatch
On 2013-06-28 20:26, Paolo Bonzini wrote: This makes it possible to start following RCU rules, which require not dereferencing as-dispatch more than once. It is not covering the whole of TCG, since the TLB data structures are not RCU-friendly, but it is enough for exec.c. Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- cputlb.c | 7 --- exec.c| 9 + include/exec/cputlb.h | 9 ++--- 3 files changed, 15 insertions(+), 10 deletions(-) diff --git a/cputlb.c b/cputlb.c index 51381ae..82875b1 100644 --- a/cputlb.c +++ b/cputlb.c @@ -253,6 +253,7 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr, hwaddr paddr, int prot, int mmu_idx, target_ulong size) { +AddressSpaceDispatch *d; MemoryRegionSection *section; unsigned int index; target_ulong address; @@ -267,8 +268,8 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr, } sz = size; -section = address_space_translate_for_iotlb(address_space_memory, paddr, -xlat, sz); +d = address_space_memory.dispatch; +section = address_space_translate_for_iotlb(d, paddr, xlat, sz); assert(sz = TARGET_PAGE_SIZE); #if defined(DEBUG_TLB) @@ -288,7 +289,7 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr, } code_address = address; -iotlb = memory_region_section_get_iotlb(env, section, vaddr, paddr, xlat, +iotlb = memory_region_section_get_iotlb(d, env, section, vaddr, paddr, xlat, prot, address); index = (vaddr TARGET_PAGE_BITS) (CPU_TLB_SIZE - 1); diff --git a/exec.c b/exec.c index 528c4d7..3e1a576 100644 --- a/exec.c +++ b/exec.c @@ -306,11 +306,11 @@ MemoryRegion *address_space_translate(AddressSpace *as, hwaddr addr, } MemoryRegionSection * -address_space_translate_for_iotlb(AddressSpace *as, hwaddr addr, hwaddr *xlat, +address_space_translate_for_iotlb(AddressSpaceDispatch *d, hwaddr addr, hwaddr *xlat, hwaddr *plen) { MemoryRegionSection *section; -section = address_space_translate_internal(as-dispatch, addr, xlat, plen, false); +section = address_space_translate_internal(d, addr, xlat, plen, false); assert(!section-mr-iommu_ops); return section; @@ -726,7 +726,8 @@ static int cpu_physical_memory_set_dirty_tracking(int enable) return ret; } -hwaddr memory_region_section_get_iotlb(CPUArchState *env, +hwaddr memory_region_section_get_iotlb(AddressSpaceDispatch *d, + CPUArchState *env, MemoryRegionSection *section, target_ulong vaddr, hwaddr paddr, hwaddr xlat, @@ -746,7 +747,7 @@ hwaddr memory_region_section_get_iotlb(CPUArchState *env, iotlb |= PHYS_SECTION_ROM; } } else { -iotlb = section - address_space_memory.dispatch-sections; +iotlb = section - d-sections; iotlb += xlat; } diff --git a/include/exec/cputlb.h b/include/exec/cputlb.h index e21cb60..968b6a4 100644 --- a/include/exec/cputlb.h +++ b/include/exec/cputlb.h @@ -31,12 +31,15 @@ void tlb_set_dirty(CPUArchState *env, target_ulong vaddr); extern int tlb_flush_count; /* exec.c */ +typedef struct AddressSpaceDispatch AddressSpaceDispatch; + void tb_flush_jmp_cache(CPUArchState *env, target_ulong addr); MemoryRegionSection * -address_space_translate_for_iotlb(AddressSpace *as, hwaddr addr, hwaddr *xlat, - hwaddr *plen); -hwaddr memory_region_section_get_iotlb(CPUArchState *env, +address_space_translate_for_iotlb(AddressSpaceDispatch *d, hwaddr addr, + hwaddr *xlat, hwaddr *plen); +hwaddr memory_region_section_get_iotlb(AddressSpaceDispatch *d, + CPUArchState *env, MemoryRegionSection *section, target_ulong vaddr, hwaddr paddr, hwaddr xlat, Not sure if this version was also affected, but the current one in your git requires this to build: diff --git a/include/exec/cputlb.h b/include/exec/cputlb.h index 968b6a4..bd7ff2f 100644 --- a/include/exec/cputlb.h +++ b/include/exec/cputlb.h @@ -19,6 +19,8 @@ #ifndef CPUTLB_H #define CPUTLB_H +#include exec/memory-internal.h + #if !defined(CONFIG_USER_ONLY) /* cputlb.c */ void tlb_protect_code(ram_addr_t ram_addr); @@ -31,8 +33,6 @@ void tlb_set_dirty(CPUArchState *env, target_ulong vaddr); extern int tlb_flush_count; /* exec.c */ -typedef struct AddressSpaceDispatch AddressSpaceDispatch; - void tb_flush_jmp_cache(CPUArchState *env, target_ulong addr);
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
At 10:56 +0100 on 02 Jul (1372762607), Ian Campbell wrote: On Tue, 2013-07-02 at 10:14 +0100, Paul Durrant wrote: I had actually coded up a solution based on the existing Xen platform device, by having it synthesize a device ID based on the Xen version to which we could then host the xenbus driver, to allow us to deploy multiple versions of xenbus should compatibility (with things such as the shared info interface) become an issue. The co-installer for this driver could also spot existing PV driver installations and make sure they don't get trashed. I think only this last bit of functionality is critical here, and it allows us to avoid having to carry multiple platform devices in upstream, doesn't it? This idea was rejected by Citrix product teams though, because we would not be able to prevent any Windows guest without some known PV drivers from downloading our new driver from Windows Update and this was seen as undesirable. Well, if your product requirements are at odds with doing the right thing upstream then I think it would be best for you to just carry patches to make XS behave how you want. I think it's a reasonable aim to have the WU drivers not spontaneously appear on VMs (on all Xen hosts, remember) where the admin has chosen not to install PV drivers. Generally, the more I think about this the more I'm convinced that _not_ installing the drivers on any existing systems without explicit permission is the most important thing. I hope we can find a suitable compromise though. Well, the WU drivers could refuse to install except as upgrade to themselves (i.e. fail if there's any unknown driver bound to the xen platform device, and also fail if there's _no_ driver bound). Then the guest admin can choose to install the drivers by hand and get automatic updates after that. XS, XC and anyone else who chooses could carry a separate patch that changes the default to 'install if there are no drivers', signalling over xenstore, or ACPI, or a Windows domain policy, or whatever. Tim.
Re: [Qemu-devel] [PATCH 3/7] nbd: use BDS refcount
Il 02/07/2013 07:59, Fam Zheng ha scritto: Previously, nbd call drive_get_ref on the drive of bs. A BDS doesn't always have associated dinfo, it's more proper to use bdrv_get_ref(). This has the important side effect of marking the exported disk as in_use (to use the terms before the series). Right now you can serve a disk and, at the same time, stream it or mirror it or create a live snapshot of it. Do we really want to block anything for a device being served? Perhaps truncation, but maybe not even that. The NBD server is meant to be as unobtrusive as possible (in some sense NBD accesses are the same as guest accesses). Paolo Signed-off-by: Fam Zheng f...@redhat.com --- blockdev-nbd.c | 9 + nbd.c | 5 + 2 files changed, 6 insertions(+), 8 deletions(-) diff --git a/blockdev-nbd.c b/blockdev-nbd.c index 95f10c8..d8bcd6f 100644 --- a/blockdev-nbd.c +++ b/blockdev-nbd.c @@ -69,12 +69,6 @@ static void nbd_close_notifier(Notifier *n, void *data) g_free(cn); } -static void nbd_server_put_ref(NBDExport *exp) -{ -BlockDriverState *bs = nbd_export_get_blockdev(exp); -drive_put_ref(drive_get_by_blockdev(bs)); -} - void qmp_nbd_server_add(const char *device, bool has_writable, bool writable, Error **errp) { @@ -106,10 +100,9 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable, } exp = nbd_export_new(bs, 0, -1, writable ? 0 : NBD_FLAG_READ_ONLY, - nbd_server_put_ref); + NULL); nbd_export_set_name(exp, device); -drive_get_ref(drive_get_by_blockdev(bs)); n = g_malloc0(sizeof(NBDCloseNotifier)); n-n.notify = nbd_close_notifier; diff --git a/nbd.c b/nbd.c index 2606403..f28b9fb 100644 --- a/nbd.c +++ b/nbd.c @@ -881,6 +881,7 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, exp-nbdflags = nbdflags; exp-size = size == -1 ? bdrv_getlength(bs) : size; exp-close = close; +bdrv_get_ref(bs); return exp; } @@ -927,6 +928,10 @@ void nbd_export_close(NBDExport *exp) } nbd_export_set_name(exp, NULL); nbd_export_put(exp); +if (exp-bs) { +bdrv_put_ref(exp-bs); +exp-bs = NULL; +} } void nbd_export_get(NBDExport *exp)
Re: [Qemu-devel] [PATCH] target-openrisc: Add typename for CPU models.
Hi Jia, Am 02.07.2013 11:29, schrieb Jia Liu: On Tue, Jul 2, 2013 at 5:11 PM, Dongxue Zhang elta@gmail.com mailto:elta@gmail.com wrote: Make target-openrisc running OK by add typename in openrisc_cpu_class_by_name(). Signed-off-by: Dongxue Zhang elta@gmail.com mailto:elta@gmail.com --- target-openrisc/cpu.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/target-openrisc/cpu.c b/target-openrisc/cpu.c index fd90d37..d38c28b 100644 --- a/target-openrisc/cpu.c +++ b/target-openrisc/cpu.c @@ -96,12 +96,14 @@ static void openrisc_cpu_initfn(Object *obj) static ObjectClass *openrisc_cpu_class_by_name(const char *cpu_model) { ObjectClass *oc; +char *typename; if (cpu_model == NULL) { return NULL; } -oc = object_class_by_name(cpu_model); +typename = g_strdup_printf(%s- TYPE_OPENRISC_CPU, cpu_model); +oc = object_class_by_name(typename); if (oc != NULL (!object_class_dynamic_cast(oc, TYPE_OPENRISC_CPU) || object_class_is_abstract(oc))) { return NULL; Thanks for your fix, it looks and test good to me. Sorry for the breakage. Do you want to add a Reviewed-by/Tested-by/Acked-by? I'd queue it for you then. If you could upload a Linux test image somewhere that may help avoid breakages in the future. Also we reported that there was no maintainer for target-openrisc/ in MAINTAINERS file, do you want to put yourself there so that you are CC'ed on patches? Here's a pointer to the latest refactoring that partially affects or32: http://lists.gnu.org/archive/html/qemu-devel/2013-06/msg05354.html Regards, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH 1/7] block: Convert BlockDriverState.in_use to refcount
Il 02/07/2013 07:59, Fam Zheng ha scritto: Use numeric value to replace in_use flag in BDS, this will make lifecycle management with ref count possible. This patch only replaces existing uses of bdrv_set_in_use, so no logic change here. This still does not entirely explain the rules for who sets in_use (or gets a reference) and who checks in_use. Why should offline commit worry about the disk being shared, for example? The reason of course is that a background job might modify bs-backing_hd while commit is running (or might expect bs-backing_hd to not change). However, this is in no way related to a reference count. So I think your series is doing two things right (setting the in_use flag for BDSes in the -file and -backing_hd chains; turning the in_use flag into a counter) and one wrong (tying bdrv_in_use checks to the lifecycle). I wonder thus if we need two counters: the in use counter and the reference count for the lifecycle. One further nit is inline. Signed-off-by: Fam Zheng f...@redhat.com --- block-migration.c | 4 ++-- block.c | 23 +++ blockjob.c | 6 +++--- hw/block/dataplane/virtio-blk.c | 4 ++-- include/block/block.h | 3 ++- include/block/block_int.h | 2 +- 6 files changed, 25 insertions(+), 17 deletions(-) diff --git a/block-migration.c b/block-migration.c index 2fd7699..2efb6c0 100644 --- a/block-migration.c +++ b/block-migration.c @@ -321,7 +321,7 @@ static void init_blk_migration_it(void *opaque, BlockDriverState *bs) bmds-shared_base = block_mig_state.shared_base; alloc_aio_bitmap(bmds); drive_get_ref(drive_get_by_blockdev(bs)); -bdrv_set_in_use(bs, 1); +bdrv_get_ref(bs); block_mig_state.total_sector_sum += sectors; @@ -557,7 +557,7 @@ static void blk_mig_cleanup(void) blk_mig_lock(); while ((bmds = QSIMPLEQ_FIRST(block_mig_state.bmds_list)) != NULL) { QSIMPLEQ_REMOVE_HEAD(block_mig_state.bmds_list, entry); -bdrv_set_in_use(bmds-bs, 0); +bdrv_get_ref(bmds-bs); drive_put_ref(drive_get_by_blockdev(bmds-bs)); g_free(bmds-aio_bitmap); g_free(bmds); diff --git a/block.c b/block.c index 6c493ad..84c3181 100644 --- a/block.c +++ b/block.c @@ -1503,8 +1503,10 @@ static void bdrv_move_feature_fields(BlockDriverState *bs_dest, /* dirty bitmap */ bs_dest-dirty_bitmap = bs_src-dirty_bitmap; +/* ref count */ +bs_dest-refcount = bs_src-refcount; + /* job */ -bs_dest-in_use = bs_src-in_use; bs_dest-job= bs_src-job; /* keep the same entry in bdrv_states */ @@ -1534,7 +1536,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old) assert(bs_new-dirty_bitmap == NULL); assert(bs_new-job == NULL); assert(bs_new-dev == NULL); -assert(bs_new-in_use == 0); +assert(bs_new-refcount == 0); assert(bs_new-io_limits_enabled == false); assert(bs_new-block_timer == NULL); @@ -1553,7 +1555,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old) /* Check a few fields that should remain attached to the device */ assert(bs_new-dev == NULL); assert(bs_new-job == NULL); -assert(bs_new-in_use == 0); +assert(bs_new-refcount == 0); assert(bs_new-io_limits_enabled == false); assert(bs_new-block_timer == NULL); @@ -1590,7 +1592,7 @@ void bdrv_delete(BlockDriverState *bs) { assert(!bs-dev); assert(!bs-job); -assert(!bs-in_use); +assert(!bs-refcount); /* remove from list, if necessary */ bdrv_make_anon(bs); @@ -4360,15 +4362,20 @@ int64_t bdrv_get_dirty_count(BlockDriverState *bs) } } -void bdrv_set_in_use(BlockDriverState *bs, int in_use) +void bdrv_put_ref(BlockDriverState *bs) +{ +assert(bs-refcount 0); +bs-refcount--; +} + +void bdrv_get_ref(BlockDriverState *bs) { -assert(bs-in_use != in_use); -bs-in_use = in_use; +bs-refcount++; } The convention that we use in QEMU is bdrv_ref/bdrv_unref. Paolo int bdrv_in_use(BlockDriverState *bs) { -return bs-in_use; +return bs-refcount 0; } void bdrv_iostatus_enable(BlockDriverState *bs) diff --git a/blockjob.c b/blockjob.c index ca80df1..a841a66 100644 --- a/blockjob.c +++ b/blockjob.c @@ -45,7 +45,7 @@ void *block_job_create(const BlockJobType *job_type, BlockDriverState *bs, error_set(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs)); return NULL; } -bdrv_set_in_use(bs, 1); +bdrv_get_ref(bs); job = g_malloc0(job_type-instance_size); job-job_type = job_type; @@ -63,7 +63,7 @@ void *block_job_create(const BlockJobType *job_type, BlockDriverState *bs, if (error_is_set(local_err)) { bs-job = NULL;
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
On Tue, 2013-07-02 at 11:15 +0100, Tim Deegan wrote: At 10:56 +0100 on 02 Jul (1372762607), Ian Campbell wrote: On Tue, 2013-07-02 at 10:14 +0100, Paul Durrant wrote: I had actually coded up a solution based on the existing Xen platform device, by having it synthesize a device ID based on the Xen version to which we could then host the xenbus driver, to allow us to deploy multiple versions of xenbus should compatibility (with things such as the shared info interface) become an issue. The co-installer for this driver could also spot existing PV driver installations and make sure they don't get trashed. I think only this last bit of functionality is critical here, and it allows us to avoid having to carry multiple platform devices in upstream, doesn't it? This idea was rejected by Citrix product teams though, because we would not be able to prevent any Windows guest without some known PV drivers from downloading our new driver from Windows Update and this was seen as undesirable. Well, if your product requirements are at odds with doing the right thing upstream then I think it would be best for you to just carry patches to make XS behave how you want. I think it's a reasonable aim to have the WU drivers not spontaneously appear on VMs (on all Xen hosts, remember) where the admin has chosen not to install PV drivers. Generally, the more I think about this the more I'm convinced that _not_ installing the drivers on any existing systems without explicit permission is the most important thing. Will WU install a completely fresh driver for a new (or indeed old) bit of hardware on native entirely without prompting? I'd have expected the old Windows has found a driver for your device dance. I hope we can find a suitable compromise though. Well, the WU drivers could refuse to install except as upgrade to themselves (i.e. fail if there's any unknown driver bound to the xen platform device, and also fail if there's _no_ driver bound). Then the guest admin can choose to install the drivers by hand and get automatic updates after that. That sounds reasonable. However I thought part of the point of getting things into WU was then that they could be inbox (either figuratively or literally) such that they would be installed by the Windows installer. Perhaps that's a separate thing though. XS, XC and anyone else who chooses could carry a separate patch that changes the default to 'install if there are no drivers', signalling over xenstore, or ACPI, or a Windows domain policy, or whatever. Right.
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
-Original Message- From: Ian Campbell Sent: 02 July 2013 11:24 To: Tim (Xen.org) Cc: Paul Durrant; qemu-devel@nongnu.org; xen-de...@lists.xen.org Subject: Re: [Xen-devel] [PATCH] Citrix PV Bus device On Tue, 2013-07-02 at 11:15 +0100, Tim Deegan wrote: At 10:56 +0100 on 02 Jul (1372762607), Ian Campbell wrote: On Tue, 2013-07-02 at 10:14 +0100, Paul Durrant wrote: I had actually coded up a solution based on the existing Xen platform device, by having it synthesize a device ID based on the Xen version to which we could then host the xenbus driver, to allow us to deploy multiple versions of xenbus should compatibility (with things such as the shared info interface) become an issue. The co-installer for this driver could also spot existing PV driver installations and make sure they don't get trashed. I think only this last bit of functionality is critical here, and it allows us to avoid having to carry multiple platform devices in upstream, doesn't it? This idea was rejected by Citrix product teams though, because we would not be able to prevent any Windows guest without some known PV drivers from downloading our new driver from Windows Update and this was seen as undesirable. Well, if your product requirements are at odds with doing the right thing upstream then I think it would be best for you to just carry patches to make XS behave how you want. I think it's a reasonable aim to have the WU drivers not spontaneously appear on VMs (on all Xen hosts, remember) where the admin has chosen not to install PV drivers. Generally, the more I think about this the more I'm convinced that _not_ installing the drivers on any existing systems without explicit permission is the most important thing. Will WU install a completely fresh driver for a new (or indeed old) bit of hardware on native entirely without prompting? I'd have expected the old Windows has found a driver for your device dance. I hope we can find a suitable compromise though. Well, the WU drivers could refuse to install except as upgrade to themselves (i.e. fail if there's any unknown driver bound to the xen platform device, and also fail if there's _no_ driver bound). Then the guest admin can choose to install the drivers by hand and get automatic updates after that. That sounds reasonable. However I thought part of the point of getting things into WU was then that they could be inbox (either figuratively or literally) such that they would be installed by the Windows installer. Perhaps that's a separate thing though. No, that is the eventual aim so I don't think the 'upgrade only' options is really future-proof. XS, XC and anyone else who chooses could carry a separate patch that changes the default to 'install if there are no drivers', signalling over xenstore, or ACPI, or a Windows domain policy, or whatever. Right. Surely having a new device for the purposes of hosting Citrix PV drivers is a cleaner option for opting in? Note that I'm not proposing the new device displaces the existing platform device in any way. Paul
Re: [Qemu-devel] [PATCHv2 07/11] iscsi: let bdrv_create conditionally zero out the device
Am 02.07.2013 um 11:22 schrieb Paolo Bonzini pbonz...@redhat.com: Il 01/07/2013 23:36, Peter Lieven ha scritto: Am 01.07.2013 um 22:20 schrieb Paolo Bonzini pbonz...@redhat.com: Il 27/06/2013 15:11, Peter Lieven ha scritto: if the device supports unmapping and unmapped blocks read as zero ensure that the whole device is unmapped and report .has_zero_init = 1 in this case to speed up qemu-img convert. This can still take several minutes. Do you need any special timeout in libiscsi? Not as far as I know. The number of unmapped sectors is bound by iscsilun-max_unmap. This is what the storage will handle in one request. For my storage its the internal page size (15MB). Its very fast. Except for some broken storages I expect that unmapping on a thin-provisioned target means deleting of pointers or something similar not actually writing something to the disks. This it what the SANITIZE command does. Ok, I remember someone reporting timeouts when doing a discard with virtio-scsi. But maybe the problem there was that Linux parallelized the requests excessively after splitting them. Do you have evidence that it is really taking minutes somewhere? I tested it with a fully allocated 1TB volume. In my case it was a question of about 20 seconds. I think this was mainly due to the thousands of sync requests that had to be sent. Perhaps we can have a new discard_zeroes field in bdrv_get_info, and the unmap functionality can be moved up to qemu-img convert? Is there any other storage protocol out there that could benefit from it? Definitely LVM. Perhaps in the future gluster too, though right now it only supports discard on files, not block devices. Is discards on LVM sth that is already implemented in qemu? Would you mind if we postpone the general approach to a later point. It seems that this is much more complex than the iSCSI approach. My intention was to fix the iSCSI thin-provisioning problem here. Peter
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
-Original Message- From: Ian Campbell Sent: 02 July 2013 11:24 To: Tim (Xen.org) Cc: Paul Durrant; qemu-devel@nongnu.org; xen-de...@lists.xen.org Subject: Re: [Xen-devel] [PATCH] Citrix PV Bus device On Tue, 2013-07-02 at 11:15 +0100, Tim Deegan wrote: At 10:56 +0100 on 02 Jul (1372762607), Ian Campbell wrote: On Tue, 2013-07-02 at 10:14 +0100, Paul Durrant wrote: I had actually coded up a solution based on the existing Xen platform device, by having it synthesize a device ID based on the Xen version to which we could then host the xenbus driver, to allow us to deploy multiple versions of xenbus should compatibility (with things such as the shared info interface) become an issue. The co-installer for this driver could also spot existing PV driver installations and make sure they don't get trashed. I think only this last bit of functionality is critical here, and it allows us to avoid having to carry multiple platform devices in upstream, doesn't it? This idea was rejected by Citrix product teams though, because we would not be able to prevent any Windows guest without some known PV drivers from downloading our new driver from Windows Update and this was seen as undesirable. Well, if your product requirements are at odds with doing the right thing upstream then I think it would be best for you to just carry patches to make XS behave how you want. I think it's a reasonable aim to have the WU drivers not spontaneously appear on VMs (on all Xen hosts, remember) where the admin has chosen not to install PV drivers. Generally, the more I think about this the more I'm convinced that _not_ installing the drivers on any existing systems without explicit permission is the most important thing. Will WU install a completely fresh driver for a new (or indeed old) bit of hardware on native entirely without prompting? I believe it can be configured to do so. I'd have expected the old Windows has found a driver for your device dance. I hope we can find a suitable compromise though. I still think the extra device is the cleanest solution, and I have gone through all the alternatives I can think of. Paul
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
On Tue, 2013-07-02 at 11:31 +0100, Paul Durrant wrote: XS, XC and anyone else who chooses could carry a separate patch that changes the default to 'install if there are no drivers', signalling over xenstore, or ACPI, or a Windows domain policy, or whatever. Right. Surely having a new device for the purposes of hosting Citrix PV drivers is a cleaner option for opting in? Note that I'm not proposing the new device displaces the existing platform device in any way. A dislike for having to coordinate guest internal and guest external configuration changes in this way has been expressed by several people. However if you only intend to support your drivers on XenServer (and it is starting to seem like your constraints may end up forcing this option upon you) then you can equally well carry that new device in your patches too and manage it however your PMs require. Ian.
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
At 10:31 + on 02 Jul (1372761105), Paul Durrant wrote: Well, the WU drivers could refuse to install except as upgrade to themselves (i.e. fail if there's any unknown driver bound to the xen platform device, and also fail if there's _no_ driver bound). Then the guest admin can choose to install the drivers by hand and get automatic updates after that. That sounds reasonable. However I thought part of the point of getting things into WU was then that they could be inbox (either figuratively or literally) such that they would be installed by the Windows installer. Perhaps that's a separate thing though. No, that is the eventual aim so I don't think the 'upgrade only' options is really future-proof. Well, you can have them install by default on platforms that want it, or on new enough Xen versions. Or, even better, on new enough _windows_ versions. XS, XC and anyone else who chooses could carry a separate patch that changes the default to 'install if there are no drivers', signalling over xenstore, or ACPI, or a Windows domain policy, or whatever. Right. Surely having a new device for the purposes of hosting Citrix PV drivers is a cleaner option for opting in? Only if it's OK that the _host_ admin has to be involved (which was the original objection). Upgrade-only but hooked to the existing ID lets a guest admin install the drivers manually without plumbing it through $CLOUDPROVIDER's toolstack, and without having it appear suddenly on existing VMs in the dead of night. If you could have it bind to either device then I guess a second PCI ID is another way of signalling that policy, but we should probably use one of the many other channels we have from the tools to the guest. Tim.
Re: [Qemu-devel] [PATCHv2 11/11] iscsi: assert that sectors are aligned to LUN blocksize
Am 02.07.2013 um 12:44 schrieb Paolo Bonzini pbonz...@redhat.com: Il 02/07/2013 10:28, Peter Lieven ha scritto: I agree that it's better to emit an error than to corrupt the disk. I'm just trying to understand what needs to be done on top of this to make 4 KB LUNs work. Reading is quite easy, you just have to have a wrapper that is reading an aligned portion of sectors around the original request and then extracting what was requested. Writing is very complicated. Requests have to be serialized, Read-Modify-Write for unaligned write requests. Paolo had all this prepared already. I wonder if it would be enough to have the block size of the host/iSCSI device propagated to the guest drivers and read 4K (or the protocol block size bytes) when probing the MBR. No, propagating the size of the device is only correct if we're doing SCSI passthrough (-device scsi-block) and in that case we're doing it already. For emulation, the choices are: 1) manually set the logical_block_size of the disk to 4096 (which means the disk cannot be used as a boot disk, at least with BIOS; I don't know about UEFI); 2) do 512e/4KN as in my patches. You would still need to expose a 4K physical_sector_size to be precise, but it is just an optimization. This seems the more reasonable for me. Can you check what needs to be done to get your last version of your old series ready for upstream? Is the exposing of a sector size 512 byte complicated or is this something that is already possible with IDE, Virtio-BLK, Virtio-SCSI etc.? Peter
Re: [Qemu-devel] [PATCHv2 08/11] block-migration: efficiently encode zero blocks
Il 02/07/2013 09:40, Stefan Hajnoczi ha scritto: On Mon, Jul 01, 2013 at 05:55:15PM +0200, Peter Lieven wrote: Am 01.07.2013 um 16:13 schrieb Stefan Hajnoczi stefa...@redhat.com: On Thu, Jun 27, 2013 at 03:11:32PM +0200, Peter Lieven wrote: This patch breaks cross-version blog migration. We need to control whether or not to use the new BLK_MIG_FLAG_ZERO_BLOCK flag. you are right the upgrade way works, but downgrade not. what is the proposed way to fix this? The safest way is to add a block migration flag that toggles the zero block. By default it would be off. Maybe you would also add a MigrationCapability in qapi-schema.json to indicated that zero blocks are supported in block migration. That would allow management tools to ask QEMU on both sides if they support zero blocks. Use migrate-set-capabilities to enable the zero block feature. Yes, a capability makes sense. Paolo
Re: [Qemu-devel] [PATCH] Citrix PV Bus device
-Original Message- From: Andreas Färber [mailto:afaer...@suse.de] Sent: 02 July 2013 11:46 To: Paul Durrant Cc: qemu-devel@nongnu.org; xen-de...@lists.xen.org; Paolo Bonzini; Michael S. Tsirkin Subject: Re: [Qemu-devel] [PATCH] Citrix PV Bus device Am 02.07.2013 10:39, schrieb Paul Durrant: This patch introduces a new PCI device which will act as the binding point for Citrix branded PV drivers for Xen. The intention is that Citrix Windows PV drivers will be available on Windows Update and thus using the existing Xen platform PCI device as an anchor point is not desirable as that device has been ubiquitous in HVM guests for a long time and thus existing HVM guests running Windows would start automatically downloading drivers from Windows Update when this may not be desired by either the host or guest admin. This device therefore acts as an opt-in for those wishing to deploy Citrix PV drivers. Signed-off-by: Paul Durrant paul.durr...@citrix.com --- hw/i386/Makefile.objs|1 + hw/i386/citrix_pv_bus.c | 122 ++ include/hw/pci/pci_ids.h |3 ++ 3 files changed, 126 insertions(+) create mode 100644 hw/i386/citrix_pv_bus.c diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs index 205d22e..8e28831 100644 --- a/hw/i386/Makefile.objs +++ b/hw/i386/Makefile.objs @@ -4,3 +4,4 @@ obj-y += pc.o pc_piix.o pc_q35.o obj-$(CONFIG_XEN) += xen_domainbuild.o xen_machine_pv.o obj-y += kvmvapic.o +obj-y += citrix_pv_bus.o So the reason to place the device here is TARGET_PAGE_SIZE... We really need a way to access that value from common code, somewhere down my TODO list. :/ diff --git a/hw/i386/citrix_pv_bus.c b/hw/i386/citrix_pv_bus.c new file mode 100644 index 000..e1e0508 --- /dev/null +++ b/hw/i386/citrix_pv_bus.c @@ -0,0 +1,122 @@ +/* Copyright (c) Citrix Systems Inc. + * All rights reserved. + * + * Redistribution and use in source and binary forms, + * with or without modification, are permitted provided + * that the following conditions are met: + * + * * Redistributions of source code must retain the above + * copyright notice, this list of conditions and the + * following disclaimer. + * * Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the + * following disclaimer in the documentation and/or other + * materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS AS IS AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, + * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR + * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, + * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include hw/hw.h +#include hw/pci/pci.h + +typedef struct _CitrixPVBusDevice { Identifiers starting with underscore are supposedly reserved by POSIX/C99, you can just use the same name as for the typedef. Andreas, Thanks for the review. I'll modify that name as you suggest. +PCIDevice dev; /* private */ PCIDevice parent_obj; /* public */ please. More although still incomplete guidelines at: http://wiki.qemu.org/QOMConventions Thanks for the pointer. +uint8_t revision; +uint32_tpages; +MemoryRegionmmio; +} CitrixPVBusDevice; + +static uint64_t citrix_pv_bus_mmio_read(void *opaque, hwaddr addr, +unsigned size) +{ +fprintf(stderr, WARNING: read from address 0x TARGET_FMT_plx + in Citrix PV Bus MMIO BAR\n, addr); Possibly use the qemu_log() macros or a tracepoint? I'm just looking at tracepoints now. That does seem like a cleaner approach. + +return ~(uint64_t)0; +} + +static void citrix_pv_bus_mmio_write(void *opaque, hwaddr addr, + uint64_t val, unsigned size) +{ +fprintf(stderr, WARNING: write to address 0x TARGET_FMT_plx + in Citrix PV Bus MMIO BAR\n, addr); +} + +static const MemoryRegionOps citrix_pv_bus_mmio_ops = { +.read = citrix_pv_bus_mmio_read, +.write = citrix_pv_bus_mmio_write, +.endianness = DEVICE_NATIVE_ENDIAN, +}; + +static int citrix_pv_bus_init(PCIDevice *pci_dev) +{ +
Re: [Qemu-devel] [PATCHv2 11/11] iscsi: assert that sectors are aligned to LUN blocksize
Il 02/07/2013 12:49, Peter Lieven ha scritto: Am 02.07.2013 um 12:44 schrieb Paolo Bonzini pbonz...@redhat.com: Il 02/07/2013 10:28, Peter Lieven ha scritto: I agree that it's better to emit an error than to corrupt the disk. I'm just trying to understand what needs to be done on top of this to make 4 KB LUNs work. Reading is quite easy, you just have to have a wrapper that is reading an aligned portion of sectors around the original request and then extracting what was requested. Writing is very complicated. Requests have to be serialized, Read-Modify-Write for unaligned write requests. Paolo had all this prepared already. I wonder if it would be enough to have the block size of the host/iSCSI device propagated to the guest drivers and read 4K (or the protocol block size bytes) when probing the MBR. No, propagating the size of the device is only correct if we're doing SCSI passthrough (-device scsi-block) and in that case we're doing it already. For emulation, the choices are: 1) manually set the logical_block_size of the disk to 4096 (which means the disk cannot be used as a boot disk, at least with BIOS; I don't know about UEFI); 2) do 512e/4KN as in my patches. You would still need to expose a 4K physical_sector_size to be precise, but it is just an optimization. This seems the more reasonable for me. Can you check what needs to be done to get your last version of your old series ready for upstream? Is the exposing of a sector size 512 byte complicated or is this something that is already possible with IDE, Virtio-BLK, Virtio-SCSI etc.? It is possible with virtio-blk and SCSI with the logical_block_size property of virtio-blk-*, scsi-hd and scsi-disk devices. IIRC libvirt supports it, too. IDE only supports 512e/4KN (in general, not just in QEMU). Paolo Peter
Re: [Qemu-devel] [PATCH] Citrix PV Bus device
Il 02/07/2013 12:45, Andreas Färber ha scritto: Am 02.07.2013 10:39, schrieb Paul Durrant: This patch introduces a new PCI device which will act as the binding point for Citrix branded PV drivers for Xen. The intention is that Citrix Windows PV drivers will be available on Windows Update and thus using the existing Xen platform PCI device as an anchor point is not desirable as that device has been ubiquitous in HVM guests for a long time and thus existing HVM guests running Windows would start automatically downloading drivers from Windows Update when this may not be desired by either the host or guest admin. This device therefore acts as an opt-in for those wishing to deploy Citrix PV drivers. Signed-off-by: Paul Durrant paul.durr...@citrix.com --- hw/i386/Makefile.objs|1 + hw/i386/citrix_pv_bus.c | 122 ++ include/hw/pci/pci_ids.h |3 ++ 3 files changed, 126 insertions(+) create mode 100644 hw/i386/citrix_pv_bus.c diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs index 205d22e..8e28831 100644 --- a/hw/i386/Makefile.objs +++ b/hw/i386/Makefile.objs @@ -4,3 +4,4 @@ obj-y += pc.o pc_piix.o pc_q35.o obj-$(CONFIG_XEN) += xen_domainbuild.o xen_machine_pv.o obj-y += kvmvapic.o +obj-y += citrix_pv_bus.o So the reason to place the device here is TARGET_PAGE_SIZE... We really need a way to access that value from common code, somewhere down my TODO list. :/ Why does it need to be in pages rather than bytes? Paolo diff --git a/hw/i386/citrix_pv_bus.c b/hw/i386/citrix_pv_bus.c new file mode 100644 index 000..e1e0508 --- /dev/null +++ b/hw/i386/citrix_pv_bus.c @@ -0,0 +1,122 @@ +/* Copyright (c) Citrix Systems Inc. + * All rights reserved. + * + * Redistribution and use in source and binary forms, + * with or without modification, are permitted provided + * that the following conditions are met: + * + * * Redistributions of source code must retain the above + * copyright notice, this list of conditions and the + * following disclaimer. + * * Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the + * following disclaimer in the documentation and/or other + * materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS AS IS AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, + * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR + * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, + * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include hw/hw.h +#include hw/pci/pci.h + +typedef struct _CitrixPVBusDevice { Identifiers starting with underscore are supposedly reserved by POSIX/C99, you can just use the same name as for the typedef. +PCIDevice dev; /* private */ PCIDevice parent_obj; /* public */ please. More although still incomplete guidelines at: http://wiki.qemu.org/QOMConventions +uint8_t revision; +uint32_tpages; +MemoryRegionmmio; +} CitrixPVBusDevice; + +static uint64_t citrix_pv_bus_mmio_read(void *opaque, hwaddr addr, +unsigned size) +{ +fprintf(stderr, WARNING: read from address 0x TARGET_FMT_plx + in Citrix PV Bus MMIO BAR\n, addr); Possibly use the qemu_log() macros or a tracepoint? + +return ~(uint64_t)0; +} + +static void citrix_pv_bus_mmio_write(void *opaque, hwaddr addr, + uint64_t val, unsigned size) +{ +fprintf(stderr, WARNING: write to address 0x TARGET_FMT_plx + in Citrix PV Bus MMIO BAR\n, addr); +} + +static const MemoryRegionOps citrix_pv_bus_mmio_ops = { +.read = citrix_pv_bus_mmio_read, +.write = citrix_pv_bus_mmio_write, +.endianness = DEVICE_NATIVE_ENDIAN, +}; + +static int citrix_pv_bus_init(PCIDevice *pci_dev) +{ +CitrixPVBusDevice *d = DO_UPCAST(CitrixPVBusDevice, dev, pci_dev); Please don't use DO_UPCAST() with QOM objects. Please introduce a CITRIX_PV_BUS_DEVICE(obj) macro as described above and/or in include/qom/object.h to avoid using the parent field's name in code. +uint8_t *pci_conf; +uint64_t size; + +pci_conf = pci_dev-config; + +pci_set_word(pci_conf + PCI_COMMAND, PCI_COMMAND_MEMORY); +
Re: [Qemu-devel] [PATCH] Citrix PV Bus device
Am 02.07.2013 10:39, schrieb Paul Durrant: This patch introduces a new PCI device which will act as the binding point for Citrix branded PV drivers for Xen. The intention is that Citrix Windows PV drivers will be available on Windows Update and thus using the existing Xen platform PCI device as an anchor point is not desirable as that device has been ubiquitous in HVM guests for a long time and thus existing HVM guests running Windows would start automatically downloading drivers from Windows Update when this may not be desired by either the host or guest admin. This device therefore acts as an opt-in for those wishing to deploy Citrix PV drivers. Signed-off-by: Paul Durrant paul.durr...@citrix.com --- hw/i386/Makefile.objs|1 + hw/i386/citrix_pv_bus.c | 122 ++ include/hw/pci/pci_ids.h |3 ++ 3 files changed, 126 insertions(+) create mode 100644 hw/i386/citrix_pv_bus.c diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs index 205d22e..8e28831 100644 --- a/hw/i386/Makefile.objs +++ b/hw/i386/Makefile.objs @@ -4,3 +4,4 @@ obj-y += pc.o pc_piix.o pc_q35.o obj-$(CONFIG_XEN) += xen_domainbuild.o xen_machine_pv.o obj-y += kvmvapic.o +obj-y += citrix_pv_bus.o So the reason to place the device here is TARGET_PAGE_SIZE... We really need a way to access that value from common code, somewhere down my TODO list. :/ diff --git a/hw/i386/citrix_pv_bus.c b/hw/i386/citrix_pv_bus.c new file mode 100644 index 000..e1e0508 --- /dev/null +++ b/hw/i386/citrix_pv_bus.c @@ -0,0 +1,122 @@ +/* Copyright (c) Citrix Systems Inc. + * All rights reserved. + * + * Redistribution and use in source and binary forms, + * with or without modification, are permitted provided + * that the following conditions are met: + * + * * Redistributions of source code must retain the above + * copyright notice, this list of conditions and the + * following disclaimer. + * * Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the + * following disclaimer in the documentation and/or other + * materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS AS IS AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, + * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR + * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, + * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include hw/hw.h +#include hw/pci/pci.h + +typedef struct _CitrixPVBusDevice { Identifiers starting with underscore are supposedly reserved by POSIX/C99, you can just use the same name as for the typedef. +PCIDevice dev; /* private */ PCIDevice parent_obj; /* public */ please. More although still incomplete guidelines at: http://wiki.qemu.org/QOMConventions +uint8_t revision; +uint32_tpages; +MemoryRegionmmio; +} CitrixPVBusDevice; + +static uint64_t citrix_pv_bus_mmio_read(void *opaque, hwaddr addr, +unsigned size) +{ +fprintf(stderr, WARNING: read from address 0x TARGET_FMT_plx + in Citrix PV Bus MMIO BAR\n, addr); Possibly use the qemu_log() macros or a tracepoint? + +return ~(uint64_t)0; +} + +static void citrix_pv_bus_mmio_write(void *opaque, hwaddr addr, + uint64_t val, unsigned size) +{ +fprintf(stderr, WARNING: write to address 0x TARGET_FMT_plx + in Citrix PV Bus MMIO BAR\n, addr); +} + +static const MemoryRegionOps citrix_pv_bus_mmio_ops = { +.read = citrix_pv_bus_mmio_read, +.write = citrix_pv_bus_mmio_write, +.endianness = DEVICE_NATIVE_ENDIAN, +}; + +static int citrix_pv_bus_init(PCIDevice *pci_dev) +{ +CitrixPVBusDevice *d = DO_UPCAST(CitrixPVBusDevice, dev, pci_dev); Please don't use DO_UPCAST() with QOM objects. Please introduce a CITRIX_PV_BUS_DEVICE(obj) macro as described above and/or in include/qom/object.h to avoid using the parent field's name in code. +uint8_t *pci_conf; +uint64_t size; + +pci_conf = pci_dev-config; + +pci_set_word(pci_conf + PCI_COMMAND, PCI_COMMAND_MEMORY); +pci_set_byte(pci_conf + PCI_REVISION_ID, d-revision); + +pci_config_set_prog_interface(pci_conf, 0); + +
Re: [Qemu-devel] [PATCH] Citrix PV Bus device
-Original Message- From: Paolo Bonzini [mailto:pbonz...@redhat.com] Sent: 02 July 2013 11:54 To: Andreas Färber Cc: Paul Durrant; qemu-devel@nongnu.org; xen-de...@lists.xen.org; Michael S. Tsirkin Subject: Re: [Qemu-devel] [PATCH] Citrix PV Bus device Il 02/07/2013 12:45, Andreas Färber ha scritto: Am 02.07.2013 10:39, schrieb Paul Durrant: This patch introduces a new PCI device which will act as the binding point for Citrix branded PV drivers for Xen. The intention is that Citrix Windows PV drivers will be available on Windows Update and thus using the existing Xen platform PCI device as an anchor point is not desirable as that device has been ubiquitous in HVM guests for a long time and thus existing HVM guests running Windows would start automatically downloading drivers from Windows Update when this may not be desired by either the host or guest admin. This device therefore acts as an opt-in for those wishing to deploy Citrix PV drivers. Signed-off-by: Paul Durrant paul.durr...@citrix.com --- hw/i386/Makefile.objs|1 + hw/i386/citrix_pv_bus.c | 122 ++ include/hw/pci/pci_ids.h |3 ++ 3 files changed, 126 insertions(+) create mode 100644 hw/i386/citrix_pv_bus.c diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs index 205d22e..8e28831 100644 --- a/hw/i386/Makefile.objs +++ b/hw/i386/Makefile.objs @@ -4,3 +4,4 @@ obj-y += pc.o pc_piix.o pc_q35.o obj-$(CONFIG_XEN) += xen_domainbuild.o xen_machine_pv.o obj-y += kvmvapic.o +obj-y += citrix_pv_bus.o So the reason to place the device here is TARGET_PAGE_SIZE... We really need a way to access that value from common code, somewhere down my TODO list. :/ Why does it need to be in pages rather than bytes? It doesn't necessarily need to be in pages; it's just a more convenient quantity than bytes. Since it needs to be a power of 2 I could perhaps use an 'order' parameter instead? Paul
Re: [Qemu-devel] [PATCHv2 07/11] iscsi: let bdrv_create conditionally zero out the device
Il 02/07/2013 12:36, Peter Lieven ha scritto: Perhaps we can have a new discard_zeroes field in bdrv_get_info, and the unmap functionality can be moved up to qemu-img convert? Is there any other storage protocol out there that could benefit from it? Definitely LVM. Perhaps in the future gluster too, though right now it only supports discard on files, not block devices. Is discards on LVM sth that is already implemented in qemu? Yes, it supports BLKDISCARD (see handle_aiocb_discard in block/raw-posix.c). Of course there is no way to query the host discard_zeroes setting yet. But even if it weren't implemented in QEMU, you should aim at making it easier (if it's not too much work, which it isn't), not harder. If you do it in block/iscsi.c, the next person who comes will have to basically undo your work and reimplement+retest it with the right API. Would you mind if we postpone the general approach to a later point. It seems that this is much more complex than the iSCSI approach. It shouldn't be more complex at all, actually. You just need to pass the maximum unmap sectors and lbprz parameters through bdrv_get_info. I'm not asking you to add support for BLKDISCARDZEROES and all that. I'm asking you to do the work at the right level. Paolo
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
On Tue, Jul 02, 2013 at 11:48:46AM +0100, Ian Campbell wrote: On Tue, 2013-07-02 at 11:31 +0100, Paul Durrant wrote: XS, XC and anyone else who chooses could carry a separate patch that changes the default to 'install if there are no drivers', signalling over xenstore, or ACPI, or a Windows domain policy, or whatever. Right. Surely having a new device for the purposes of hosting Citrix PV drivers is a cleaner option for opting in? Note that I'm not proposing the new device displaces the existing platform device in any way. A dislike for having to coordinate guest internal and guest external configuration changes in this way has been expressed by several people. However if you only intend to support your drivers on XenServer (and it is starting to seem like your constraints may end up forcing this option upon you) then you can equally well carry that new device in your patches too and manage it however your PMs require. It's also good to remember the Citrix XenServer Windows PV drivers are now opensource, and (more) easily usable on non-XenServer environments aswell. I hope they'd work on upstream Xen aswell. -- Pasi
Re: [Qemu-devel] [PATCH] Citrix PV Bus device
Il 02/07/2013 12:57, Paul Durrant ha scritto: So the reason to place the device here is TARGET_PAGE_SIZE... We really need a way to access that value from common code, somewhere down my TODO list. :/ Why does it need to be in pages rather than bytes? It doesn't necessarily need to be in pages; it's just a more convenient quantity than bytes. Since it needs to be a power of 2 I could perhaps use an 'order' parameter instead? I would just use bytes, the power-of-2 requirement can be checked in the init function (actually it would just be caught by pci_register_bar). Paolo
Re: [Qemu-devel] [PATCH v3 2/2] net: introduce command to query rx-filter information
On Tue, Jul 02, 2013 at 11:05:56AM +0200, Markus Armbruster wrote: Amos Kong ak...@redhat.com writes: diff --git a/net/net.c b/net/net.c index 43a74e4..7b73a10 100644 --- a/net/net.c +++ b/net/net.c @@ -961,6 +961,53 @@ void print_net_client(Monitor *mon, NetClientState *nc) nc-info_str); } +RxFilterInfoList *qmp_query_rx_filter(bool has_name, const char *name, + Error **errp) +{ +NetClientState *nc; +RxFilterInfoList *filter_list = NULL, *last_entry = NULL; + +QTAILQ_FOREACH(nc, net_clients, next) { +RxFilterInfoList *entry; +RxFilterInfo *info; + +if (nc-info-type != NET_CLIENT_OPTIONS_KIND_NIC) { +continue; +} +if (has_name strcmp(nc-name, name) != 0) { +continue; +} + +if (nc-info-query_rx_filter) { +info = nc-info-query_rx_filter(nc); +entry = g_malloc0(sizeof(*entry)); +entry-value = info; + +if (!filter_list) { +filter_list = entry; +} else { +last_entry-next = entry; +} +last_entry = entry; +} else if (has_name) { +error_setg(errp, net client(%s) doesn't support +rx-filter querying, name); +break; +} + +} + +if (filter_list == NULL !error_is_set(errp)) { +if (has_name) { +error_setg(errp, invalid net client name: %s, name); +} else { +error_setg(errp, no net client supports rx-filter querying); Why is this an error? Return an error note is bettr than a NULL list. Management should not query the rx-filter info if there is no net client supports it. I disagree. If a management application asks a question we can answer, we should give it a straight answer, not an error. ls of an empty directory prints nothing and succeeds. It doesn't concern itself with whether I should or should not ask for the directory's contents, say because nobody but me can access the directory, and I therefore should know perfectly well that it's empty. Reasonable. { return: [ ] } NULL list might be misread to that it supports rx-filter querying, but the rx-filter config has no item. If such a misunderstanding is possible, the command's specification needs fixing. If I read the specification correctly, the returned list has one element per selected NIC that supports rx-filter querying, and no more. Without the optional argument, all NICs are selected. With the optional argument, just the NIC identified by the argument is selected. We don't support filter by net client name in latest version. Whether the rx-filter config has no item shouldn't matter. I don't even understand what that means :) I'm wrong, if one nic doesn't have rx-filter config entry, a NULL dict will also be returned. return: [ { } ] I will return NULL table in this case. +} +} + +return filter_list; +} + +Example: + +- { execute: query-rx-filter, arguments: { name: vnet0 }} +- { return: [ +{ +promiscuous: true, +name: vnet0, +main-mac: 52:54:00:12:34:56, +unicast: normal, +unicast-table: [ +], +multicast: normal, +multicast-overflow: false, +unicast-overflow: false, +multicast-table: [ +01:00:5e:00:00:01, +33:33:00:00:00:01, +33:33:ff:12:34:56 +], +broadcast-allowed: false +} + ] + } + +EQMP This interface is abstract in the sense that it applies to all NICs. At this time, it's implemented only virtio-net implements it. I'm habitually wary of abstractions based on just one concrete instance, which makes me ask: 1. Ignorant question first: could the feature make sense for other NICs, too, or is it specific to virtio-net? We will not. It's ugly to check if nic is virtio-net nic in net/net.c, so I register the query function to NetClientInfo. Traversal the net client list in net/net.c, and execute query of each virtio-net instance in virtio-net.c Implementing the feature as an optional callback is fine. Let me rephrase my question: could this feature be implemented for other NICs? I'm *not* asking you to do that, just whether it would be possible. I'm asking because my review of the QAPI schema depends on the answer. 2. If the former, are you reasonably sure this object will do for other NICs? No. I'm not sure I understand you. Do you mean to say that the
Re: [Qemu-devel] [PATCHv2 07/11] iscsi: let bdrv_create conditionally zero out the device
Am 02.07.2013 um 12:49 schrieb Paolo Bonzini pbonz...@redhat.com: Il 02/07/2013 12:36, Peter Lieven ha scritto: Perhaps we can have a new discard_zeroes field in bdrv_get_info, and the unmap functionality can be moved up to qemu-img convert? Is there any other storage protocol out there that could benefit from it? Definitely LVM. Perhaps in the future gluster too, though right now it only supports discard on files, not block devices. Is discards on LVM sth that is already implemented in qemu? Yes, it supports BLKDISCARD (see handle_aiocb_discard in block/raw-posix.c). Of course there is no way to query the host discard_zeroes setting yet. No way in qemu or no way at all? But even if it weren't implemented in QEMU, you should aim at making it easier (if it's not too much work, which it isn't), not harder. If you do it in block/iscsi.c, the next person who comes will have to basically undo your work and reimplement+retest it with the right API. Would you mind if we postpone the general approach to a later point. It seems that this is much more complex than the iSCSI approach. It shouldn't be more complex at all, actually. You just need to pass the maximum unmap sectors and lbprz parameters through bdrv_get_info. I'm not asking you to add support for BLKDISCARDZEROES and all that. I'm asking you to do the work at the right level. You are right, if its possible in other protocols also it should be done at a higher level. I will at least look into integrating it into bdrv_get_info for iSCSI and implement the unmapping logic in qemu-img for the case that has_zero_init is 0. Peter
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
On Tue, 2013-07-02 at 11:49 +0100, Tim Deegan wrote: At 10:31 + on 02 Jul (1372761105), Paul Durrant wrote: Well, the WU drivers could refuse to install except as upgrade to themselves (i.e. fail if there's any unknown driver bound to the xen platform device, and also fail if there's _no_ driver bound). Then the guest admin can choose to install the drivers by hand and get automatic updates after that. That sounds reasonable. However I thought part of the point of getting things into WU was then that they could be inbox (either figuratively or literally) such that they would be installed by the Windows installer. Perhaps that's a separate thing though. No, that is the eventual aim so I don't think the 'upgrade only' options is really future-proof. Well, you can have them install by default on platforms that want it, or on new enough Xen versions. Or, even better, on new enough _windows_ versions. XS, XC and anyone else who chooses could carry a separate patch that changes the default to 'install if there are no drivers', signalling over xenstore, or ACPI, or a Windows domain policy, or whatever. Right. Surely having a new device for the purposes of hosting Citrix PV drivers is a cleaner option for opting in? Only if it's OK that the _host_ admin has to be involved (which was the original objection). Upgrade-only but hooked to the existing ID lets a guest admin install the drivers manually without plumbing it through $CLOUDPROVIDER's toolstack, and without having it appear suddenly on existing VMs in the dead of night. I think part of the problem here is that it is unclear who the target audience for these drivers are. Paul, are you intending that these drivers be only for XenServer users or are you intending for them to be used by the broader community on a variety of different Xen platforms? Ian.
Re: [Qemu-devel] [PATCHv2 11/11] iscsi: assert that sectors are aligned to LUN blocksize
Il 02/07/2013 10:28, Peter Lieven ha scritto: I agree that it's better to emit an error than to corrupt the disk. I'm just trying to understand what needs to be done on top of this to make 4 KB LUNs work. Reading is quite easy, you just have to have a wrapper that is reading an aligned portion of sectors around the original request and then extracting what was requested. Writing is very complicated. Requests have to be serialized, Read-Modify-Write for unaligned write requests. Paolo had all this prepared already. I wonder if it would be enough to have the block size of the host/iSCSI device propagated to the guest drivers and read 4K (or the protocol block size bytes) when probing the MBR. No, propagating the size of the device is only correct if we're doing SCSI passthrough (-device scsi-block) and in that case we're doing it already. For emulation, the choices are: 1) manually set the logical_block_size of the disk to 4096 (which means the disk cannot be used as a boot disk, at least with BIOS; I don't know about UEFI); 2) do 512e/4KN as in my patches. You would still need to expose a 4K physical_sector_size to be precise, but it is just an optimization. Paolo
Re: [Qemu-devel] [PATCHv2 07/11] iscsi: let bdrv_create conditionally zero out the device
Il 02/07/2013 12:56, Peter Lieven ha scritto: Am 02.07.2013 um 12:49 schrieb Paolo Bonzini pbonz...@redhat.com: Il 02/07/2013 12:36, Peter Lieven ha scritto: Perhaps we can have a new discard_zeroes field in bdrv_get_info, and the unmap functionality can be moved up to qemu-img convert? Is there any other storage protocol out there that could benefit from it? Definitely LVM. Perhaps in the future gluster too, though right now it only supports discard on files, not block devices. Is discards on LVM sth that is already implemented in qemu? Yes, it supports BLKDISCARD (see handle_aiocb_discard in block/raw-posix.c). Of course there is no way to query the host discard_zeroes setting yet. No way in qemu or no way at all? No way in QEMU. Linux has the BLKDISCARDZEROES ioctl, it takes an int* as the third argument and puts 0/1 in it. Paolo
Re: [Qemu-devel] [PATCH] qom: Use atomics for object refcounting
Il 02/07/2013 13:15, Andreas Färber ha scritto: @@ -683,16 +683,15 @@ GSList *object_class_get_list(const char *implements_type, void object_ref(Object *obj) { -obj-ref++; + __sync_fetch_and_add(obj-ref, 1); How widespread are these in GCC/clang? Is there any fallback? I remember seeing some __sync_* warnings on Mac OS X around 4.2... We are using them already in several places (vhost was the first one to introduce them, I think, but now they are also in migration ad in some tests too). There is no fallback (asm could be a fallback, but we chose to require GCC 4.2 or newer). I'll change this to atomic_inc/dec when applying. Otherwise Reviewed-by: Paolo Bonzini pbonz...@redhat.com Paolo
Re: [Qemu-devel] [PATCHv2 07/11] iscsi: let bdrv_create conditionally zero out the device
Am 02.07.2013 um 13:04 schrieb Paolo Bonzini pbonz...@redhat.com: Il 02/07/2013 12:56, Peter Lieven ha scritto: Am 02.07.2013 um 12:49 schrieb Paolo Bonzini pbonz...@redhat.com: Il 02/07/2013 12:36, Peter Lieven ha scritto: Perhaps we can have a new discard_zeroes field in bdrv_get_info, and the unmap functionality can be moved up to qemu-img convert? Is there any other storage protocol out there that could benefit from it? Definitely LVM. Perhaps in the future gluster too, though right now it only supports discard on files, not block devices. Is discards on LVM sth that is already implemented in qemu? Yes, it supports BLKDISCARD (see handle_aiocb_discard in block/raw-posix.c). Of course there is no way to query the host discard_zeroes setting yet. No way in qemu or no way at all? No way in QEMU. Linux has the BLKDISCARDZEROES ioctl, it takes an int* as the third argument and puts 0/1 in it. okay, thanks for the pointer. if you all don't mind I would split up my series and first send a v3 which adds the unproblematic stuff and leave iscsi: let bdrv_create conditionally zero out the device block-migration: efficiently encode zero blocks out. both seem to need more time and cycles to follow the better approaches. Peter
Re: [Qemu-devel] [PATCH] qom: Use atomics for object refcounting
Am 02.07.2013 11:36, schrieb Jan Kiszka: Objects can soon be referenced/dereference outside the BQL. So we need to use atomics in object_ref/unref. Based on patch by Liu Ping Fan. Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- qom/object.c |5 ++--- 1 files changed, 2 insertions(+), 3 deletions(-) diff --git a/qom/object.c b/qom/object.c index 803b94b..a76a30b 100644 --- a/qom/object.c +++ b/qom/object.c @@ -683,16 +683,15 @@ GSList *object_class_get_list(const char *implements_type, void object_ref(Object *obj) { -obj-ref++; + __sync_fetch_and_add(obj-ref, 1); How widespread are these in GCC/clang? Is there any fallback? I remember seeing some __sync_* warnings on Mac OS X around 4.2... Andreas } void object_unref(Object *obj) { g_assert(obj-ref 0); -obj-ref--; /* parent always holds a reference to its children */ -if (obj-ref == 0) { +if (__sync_sub_and_fetch(obj-ref, 1) == 0) { object_finalize(obj); } } -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH] Citrix PV Bus device
On 2 July 2013 11:57, Paul Durrant paul.durr...@citrix.com wrote: -Original Message- From: Paolo Bonzini [mailto:pbonz...@redhat.com] So the reason to place the device here is TARGET_PAGE_SIZE... We really need a way to access that value from common code, somewhere down my TODO list. :/ We probably don't, because it generally doesn't mean what you think it does. It's the smallest possible page size the guest CPU supports, which may not be the same as the actual page size the guest OS is using. Why does it need to be in pages rather than bytes? It doesn't necessarily need to be in pages; it's just a more convenient quantity than bytes. It isn't really more convienient, because the guest would have to tell QEMU what the page size was. (I'm told that virtio is planning to move to a simple just use a byte count approach.) thanks -- PMM
Re: [Qemu-devel] [PATCH] qom: Use atomics for object refcounting
On 2013-07-02 13:28, Paolo Bonzini wrote: Il 02/07/2013 13:15, Andreas Färber ha scritto: @@ -683,16 +683,15 @@ GSList *object_class_get_list(const char *implements_type, void object_ref(Object *obj) { -obj-ref++; + __sync_fetch_and_add(obj-ref, 1); How widespread are these in GCC/clang? Is there any fallback? I remember seeing some __sync_* warnings on Mac OS X around 4.2... We are using them already in several places (vhost was the first one to introduce them, I think, but now they are also in migration ad in some tests too). There is no fallback (asm could be a fallback, but we chose to require GCC 4.2 or newer). I'll change this to atomic_inc/dec when applying. Otherwise But then atomic_dec_and_test or so. Letting the inc/dec return some value leaves room for interpretations (value of before or after the modification?). Jan -- Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH] MAINTAINERS: fix bad F: patterns
Ping! thanks -- PMM On 24 June 2013 11:49, Peter Maydell peter.mayd...@linaro.org wrote: This patch fixes a number of incorrect F: patterns which didn't match any files in the source tree. This was caused by a mix of minor typos (- for _ and the like) and a few entries which hadn't been correctly updated following the rearrangement of hw/. Offending entries were located with the following shell rune: for pattern in $(sed -ne 's/^F: //p' MAINTAINERS); do if ! stat --printf='' $pattern 2/dev/null; then echo bad pattern: $pattern fi done Signed-off-by: Peter Maydell peter.mayd...@linaro.org --- MAINTAINERS | 34 +++--- 1 file changed, 15 insertions(+), 19 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index a015f68..b65d2f7 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -303,11 +303,7 @@ Axis Dev88 M: Edgar E. Iglesias edgar.igles...@gmail.com S: Maintained F: hw/cris/axis_dev88.c - -etraxfs -M: Edgar E. Iglesias edgar.igles...@gmail.com -S: Maintained -F: hw/cris/etraxfs.c +F: hw/*/etraxfs_*.c LM32 Machines - @@ -343,7 +339,7 @@ MicroBlaze Machines petalogix_s3adsp1800 M: Edgar E. Iglesias edgar.igles...@gmail.com S: Maintained -F: hw/microblaze/petalogix_s3adsp1800.c +F: hw/microblaze/petalogix_s3adsp1800_mmu.c petalogix_ml605 M: Peter Crosthwaite peter.crosthwa...@petalogix.com @@ -407,8 +403,8 @@ M: Alexander Graf ag...@suse.de L: qemu-...@nongnu.org S: Maintained F: hw/ppc/mac_newworld.c -F: hw/pci/devices/host-uninorth.c -F: hw/pci/devices/host-dec.[hc] +F: hw/pci-host/uninorth.c +F: hw/pci-bridge/dec.[hc] F: hw/misc/macio/ Old World @@ -416,7 +412,7 @@ M: Alexander Graf ag...@suse.de L: qemu-...@nongnu.org S: Maintained F: hw/ppc/mac_oldworld.c -F: hw/pci/devices/host-grackle.c +F: hw/pci-host/grackle.c F: hw/misc/macio/ PReP @@ -424,7 +420,7 @@ M: Andreas Färber andreas.faer...@web.de L: qemu-...@nongnu.org S: Odd Fixes F: hw/ppc/prep.c -F: hw/pci/devices/host-prep.[hc] +F: hw/pci-host/prep.[hc] F: hw/isa/pc87312.[hc] sPAPR @@ -438,19 +434,19 @@ virtex_ml507 M: Edgar E. Iglesias edgar.igles...@gmail.com L: qemu-...@nongnu.org S: Odd Fixes -F: hw/pci/virtex_ml507.c +F: hw/ppc/virtex_ml507.c SH4 Machines R2D M: Magnus Damm magnus.d...@gmail.com S: Maintained -F: hw/sh/r2d.c +F: hw/sh4/r2d.c Shix M: Magnus Damm magnus.d...@gmail.com S: Orphan -F: hw/sh/shix.c +F: hw/sh4/shix.c SPARC Machines -- @@ -475,7 +471,7 @@ S390 Machines S390 Virtio M: Alexander Graf ag...@suse.de S: Maintained -F: hw/s390/s390-*.c +F: hw/s390x/s390-*.c S390 Virtio-ccw M: Cornelia Huck cornelia.h...@de.ibm.com @@ -544,7 +540,7 @@ M: Alexander Graf ag...@suse.de M: Scott Wood scottw...@freescale.com L: qemu-...@nongnu.org S: Supported -F: hw/ppc/e500_* +F: hw/ppc/e500* SCSI M: Paolo Bonzini pbonz...@redhat.com @@ -572,7 +568,7 @@ F: hw/usb/* VFIO M: Alex Williamson alex.william...@redhat.com S: Supported -F: hw/pci/vfio.c +F: hw/misc/vfio.c vhost M: Michael S. Tsirkin m...@redhat.com @@ -646,7 +642,7 @@ CPU M: Andreas Färber afaer...@suse.de S: Supported F: qom/cpu.c -F: include/qemu/cpu.h +F: include/qom/cpu.h F: target-i386/cpu.c ICC Bus @@ -659,7 +655,7 @@ Device Tree M: Peter Crosthwaite peter.crosthwa...@petalogix.com M: Alexander Graf ag...@suse.de S: Maintained -F: device-tree.[ch] +F: device_tree.[ch] GDB stub M: qemu-devel@nongnu.org @@ -670,7 +666,7 @@ F: gdb-xml/ SPICE M: Gerd Hoffmann kra...@redhat.com S: Supported -F: ui/qemu-spice.h +F: include/ui/qemu-spice.h F: ui/spice-*.c F: audio/spiceaudio.c F: hw/display/qxl* -- 1.7.9.5
Re: [Qemu-devel] [PATCH] qom: Use atomics for object refcounting
Il 02/07/2013 13:44, Jan Kiszka ha scritto: On 2013-07-02 13:28, Paolo Bonzini wrote: Il 02/07/2013 13:15, Andreas Färber ha scritto: @@ -683,16 +683,15 @@ GSList *object_class_get_list(const char *implements_type, void object_ref(Object *obj) { -obj-ref++; + __sync_fetch_and_add(obj-ref, 1); How widespread are these in GCC/clang? Is there any fallback? I remember seeing some __sync_* warnings on Mac OS X around 4.2... We are using them already in several places (vhost was the first one to introduce them, I think, but now they are also in migration ad in some tests too). There is no fallback (asm could be a fallback, but we chose to require GCC 4.2 or newer). I'll change this to atomic_inc/dec when applying. Otherwise But then atomic_dec_and_test or so. Letting the inc/dec return some value leaves room for interpretations (value of before or after the modification?). In qemu, I made all atomic_* functions return the old value. This is consistent with atomic_cmpxchg and atomic_xchg (where returning the new value makes no sense). Paolo
Re: [Qemu-devel] [PATCH] MAINTAINERS: fix bad F: patterns
On 07/02/2013 01:29 PM, Peter Maydell wrote: Ping! Who are you pinging here? Alex thanks -- PMM On 24 June 2013 11:49, Peter Maydellpeter.mayd...@linaro.org wrote: This patch fixes a number of incorrect F: patterns which didn't match any files in the source tree. This was caused by a mix of minor typos (- for _ and the like) and a few entries which hadn't been correctly updated following the rearrangement of hw/. Offending entries were located with the following shell rune: for pattern in $(sed -ne 's/^F: //p' MAINTAINERS); do if ! stat --printf='' $pattern 2/dev/null; then echo bad pattern: $pattern fi done Signed-off-by: Peter Maydellpeter.mayd...@linaro.org --- MAINTAINERS | 34 +++--- 1 file changed, 15 insertions(+), 19 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index a015f68..b65d2f7 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -303,11 +303,7 @@ Axis Dev88 M: Edgar E. Iglesiasedgar.igles...@gmail.com S: Maintained F: hw/cris/axis_dev88.c - -etraxfs -M: Edgar E. Iglesiasedgar.igles...@gmail.com -S: Maintained -F: hw/cris/etraxfs.c +F: hw/*/etraxfs_*.c LM32 Machines - @@ -343,7 +339,7 @@ MicroBlaze Machines petalogix_s3adsp1800 M: Edgar E. Iglesiasedgar.igles...@gmail.com S: Maintained -F: hw/microblaze/petalogix_s3adsp1800.c +F: hw/microblaze/petalogix_s3adsp1800_mmu.c petalogix_ml605 M: Peter Crosthwaitepeter.crosthwa...@petalogix.com @@ -407,8 +403,8 @@ M: Alexander Grafag...@suse.de L: qemu-...@nongnu.org S: Maintained F: hw/ppc/mac_newworld.c -F: hw/pci/devices/host-uninorth.c -F: hw/pci/devices/host-dec.[hc] +F: hw/pci-host/uninorth.c +F: hw/pci-bridge/dec.[hc] F: hw/misc/macio/ Old World @@ -416,7 +412,7 @@ M: Alexander Grafag...@suse.de L: qemu-...@nongnu.org S: Maintained F: hw/ppc/mac_oldworld.c -F: hw/pci/devices/host-grackle.c +F: hw/pci-host/grackle.c F: hw/misc/macio/ PReP @@ -424,7 +420,7 @@ M: Andreas Färberandreas.faer...@web.de L: qemu-...@nongnu.org S: Odd Fixes F: hw/ppc/prep.c -F: hw/pci/devices/host-prep.[hc] +F: hw/pci-host/prep.[hc] F: hw/isa/pc87312.[hc] sPAPR @@ -438,19 +434,19 @@ virtex_ml507 M: Edgar E. Iglesiasedgar.igles...@gmail.com L: qemu-...@nongnu.org S: Odd Fixes -F: hw/pci/virtex_ml507.c +F: hw/ppc/virtex_ml507.c SH4 Machines R2D M: Magnus Dammmagnus.d...@gmail.com S: Maintained -F: hw/sh/r2d.c +F: hw/sh4/r2d.c Shix M: Magnus Dammmagnus.d...@gmail.com S: Orphan -F: hw/sh/shix.c +F: hw/sh4/shix.c SPARC Machines -- @@ -475,7 +471,7 @@ S390 Machines S390 Virtio M: Alexander Grafag...@suse.de S: Maintained -F: hw/s390/s390-*.c +F: hw/s390x/s390-*.c S390 Virtio-ccw M: Cornelia Huckcornelia.h...@de.ibm.com @@ -544,7 +540,7 @@ M: Alexander Grafag...@suse.de M: Scott Woodscottw...@freescale.com L: qemu-...@nongnu.org S: Supported -F: hw/ppc/e500_* +F: hw/ppc/e500* SCSI M: Paolo Bonzinipbonz...@redhat.com @@ -572,7 +568,7 @@ F: hw/usb/* VFIO M: Alex Williamsonalex.william...@redhat.com S: Supported -F: hw/pci/vfio.c +F: hw/misc/vfio.c vhost M: Michael S. Tsirkinm...@redhat.com @@ -646,7 +642,7 @@ CPU M: Andreas Färberafaer...@suse.de S: Supported F: qom/cpu.c -F: include/qemu/cpu.h +F: include/qom/cpu.h F: target-i386/cpu.c ICC Bus @@ -659,7 +655,7 @@ Device Tree M: Peter Crosthwaitepeter.crosthwa...@petalogix.com M: Alexander Grafag...@suse.de S: Maintained -F: device-tree.[ch] +F: device_tree.[ch] GDB stub M: qemu-devel@nongnu.org @@ -670,7 +666,7 @@ F: gdb-xml/ SPICE M: Gerd Hoffmannkra...@redhat.com S: Supported -F: ui/qemu-spice.h +F: include/ui/qemu-spice.h F: ui/spice-*.c F: audio/spiceaudio.c F: hw/display/qxl* -- 1.7.9.5
Re: [Qemu-devel] [PATCH] qom: Use atomics for object refcounting
On 2013-07-02 13:49, Paolo Bonzini wrote: Il 02/07/2013 13:44, Jan Kiszka ha scritto: On 2013-07-02 13:28, Paolo Bonzini wrote: Il 02/07/2013 13:15, Andreas Färber ha scritto: @@ -683,16 +683,15 @@ GSList *object_class_get_list(const char *implements_type, void object_ref(Object *obj) { -obj-ref++; + __sync_fetch_and_add(obj-ref, 1); How widespread are these in GCC/clang? Is there any fallback? I remember seeing some __sync_* warnings on Mac OS X around 4.2... We are using them already in several places (vhost was the first one to introduce them, I think, but now they are also in migration ad in some tests too). There is no fallback (asm could be a fallback, but we chose to require GCC 4.2 or newer). I'll change this to atomic_inc/dec when applying. Otherwise But then atomic_dec_and_test or so. Letting the inc/dec return some value leaves room for interpretations (value of before or after the modification?). In qemu, I made all atomic_* functions return the old value. This is consistent with atomic_cmpxchg and atomic_xchg (where returning the new value makes no sense). Please avoid this ambiguity by naming the functions properly. That xchg returns old values is known, that dec and inc do, is surely not. Jan -- Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH] MAINTAINERS: fix bad F: patterns
On 2 July 2013 12:51, Alexander Graf ag...@suse.de wrote: On 07/02/2013 01:29 PM, Peter Maydell wrote: Ping! Who are you pinging here? Our nonexistent mechanism for dealing with patches to files which don't have an obvious maintainer :-) (I didn't realise I hadn't cc'd Anthony on the patch to start with.) -- PMM
Re: [Qemu-devel] [PATCH] target-openrisc: Add typename for CPU models.
Hi Andreas, On Tue, Jul 2, 2013 at 6:18 PM, Andreas Färber afaer...@suse.de wrote: Hi Jia, Am 02.07.2013 11:29, schrieb Jia Liu: On Tue, Jul 2, 2013 at 5:11 PM, Dongxue Zhang elta@gmail.com mailto:elta@gmail.com wrote: Make target-openrisc running OK by add typename in openrisc_cpu_class_by_name(). Signed-off-by: Dongxue Zhang elta@gmail.com mailto:elta@gmail.com --- target-openrisc/cpu.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/target-openrisc/cpu.c b/target-openrisc/cpu.c index fd90d37..d38c28b 100644 --- a/target-openrisc/cpu.c +++ b/target-openrisc/cpu.c @@ -96,12 +96,14 @@ static void openrisc_cpu_initfn(Object *obj) static ObjectClass *openrisc_cpu_class_by_name(const char *cpu_model) { ObjectClass *oc; +char *typename; if (cpu_model == NULL) { return NULL; } -oc = object_class_by_name(cpu_model); +typename = g_strdup_printf(%s- TYPE_OPENRISC_CPU, cpu_model); +oc = object_class_by_name(typename); if (oc != NULL (!object_class_dynamic_cast(oc, TYPE_OPENRISC_CPU) || object_class_is_abstract(oc))) { return NULL; Thanks for your fix, it looks and test good to me. Sorry for the breakage. Do you want to add a Reviewed-by/Tested-by/Acked-by? I'd queue it for you then. Thank you very much! May you please tell me how can I make a Reviewed-by/Tested-by/Acked-by? I don't know too much about it. If you could upload a Linux test image somewhere that may help avoid breakages in the future. I find some Linux test images at http://qemu-project.org/Testing . How can I upload one upon to there? Also we reported that there was no maintainer for target-openrisc/ in MAINTAINERS file, do you want to put yourself there so that you are CC'ed on patches? Thank you, I'll submit a patch to add myself into MAINTAINERS file and review target-openrisc. Here's a pointer to the latest refactoring that partially affects or32: http://lists.gnu.org/archive/html/qemu-devel/2013-06/msg05354.html Thank you for patching target-openrisc, I'll test it. Regards, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg Regards, Jia
Re: [Qemu-devel] [PATCH v4 06/10] qemu-ga: Add Windows VSS provider to quiesce applications on fsfreeze
On 01/07/2013 21:57, Tomoki Sekiyama wrote: So why don't you query for the interface you supports rather than checking the OS version? Hmm, I don't know what I should query for. Microsoft's document above only provides a table of compatibility between VSS SDK version and OS version. Of course I can install the component and test whether it works or not, but I want to know that before installing something. I'll check it myself and try to suggest a patch. Sorry for missing the face that it is done during installation and not on a freeze request. Gal.
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
-Original Message- From: Ian Campbell Sent: 02 July 2013 11:57 To: Tim (Xen.org) Cc: Paul Durrant; qemu-devel@nongnu.org; xen-de...@lists.xen.org Subject: Re: [Xen-devel] [PATCH] Citrix PV Bus device On Tue, 2013-07-02 at 11:49 +0100, Tim Deegan wrote: At 10:31 + on 02 Jul (1372761105), Paul Durrant wrote: Well, the WU drivers could refuse to install except as upgrade to themselves (i.e. fail if there's any unknown driver bound to the xen platform device, and also fail if there's _no_ driver bound). Then the guest admin can choose to install the drivers by hand and get automatic updates after that. That sounds reasonable. However I thought part of the point of getting things into WU was then that they could be inbox (either figuratively or literally) such that they would be installed by the Windows installer. Perhaps that's a separate thing though. No, that is the eventual aim so I don't think the 'upgrade only' options is really future-proof. Well, you can have them install by default on platforms that want it, or on new enough Xen versions. Or, even better, on new enough _windows_ versions. XS, XC and anyone else who chooses could carry a separate patch that changes the default to 'install if there are no drivers', signalling over xenstore, or ACPI, or a Windows domain policy, or whatever. Right. Surely having a new device for the purposes of hosting Citrix PV drivers is a cleaner option for opting in? Only if it's OK that the _host_ admin has to be involved (which was the original objection). Upgrade-only but hooked to the existing ID lets a guest admin install the drivers manually without plumbing it through $CLOUDPROVIDER's toolstack, and without having it appear suddenly on existing VMs in the dead of night. I think part of the problem here is that it is unclear who the target audience for these drivers are. Paul, are you intending that these drivers be only for XenServer users or are you intending for them to be used by the broader community on a variety of different Xen platforms? My intention is that the drivers are widely available to all who want them, but the key word there is 'want'. No-one should get a surprise when we publish to Windows Update so having the drivers bind to a new device which can then be added to the VM config seems like the cleanest solution. I realise that this involves the VM provider having to do something to enable a VM to get drivers from Windows Update rather that the guest admin, but I think that's actually the right way to do it. Adding this device into a VM is essentially enabling new functionality in the same way that adding a new network device or storage device would be. Paul
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
-Original Message- From: Tim Deegan [mailto:t...@xen.org] Sent: 02 July 2013 11:50 To: Paul Durrant Cc: Ian Campbell; qemu-devel@nongnu.org; xen-de...@lists.xen.org Subject: Re: [Xen-devel] [PATCH] Citrix PV Bus device At 10:31 + on 02 Jul (1372761105), Paul Durrant wrote: Well, the WU drivers could refuse to install except as upgrade to themselves (i.e. fail if there's any unknown driver bound to the xen platform device, and also fail if there's _no_ driver bound). Then the guest admin can choose to install the drivers by hand and get automatic updates after that. That sounds reasonable. However I thought part of the point of getting things into WU was then that they could be inbox (either figuratively or literally) such that they would be installed by the Windows installer. Perhaps that's a separate thing though. No, that is the eventual aim so I don't think the 'upgrade only' options is really future-proof. Well, you can have them install by default on platforms that want it, or on new enough Xen versions. Or, even better, on new enough _windows_ versions. XS, XC and anyone else who chooses could carry a separate patch that changes the default to 'install if there are no drivers', signalling over xenstore, or ACPI, or a Windows domain policy, or whatever. Right. Surely having a new device for the purposes of hosting Citrix PV drivers is a cleaner option for opting in? Only if it's OK that the _host_ admin has to be involved (which was the original objection). Upgrade-only but hooked to the existing ID lets a guest admin install the drivers manually without plumbing it through $CLOUDPROVIDER's toolstack, and without having it appear suddenly on existing VMs in the dead of night. But for new VMs we *want* the drivers to download and install automatically; we just don't want them to do that for existing VMs. Paul
Re: [Qemu-devel] [PATCH v2] spapr: Fix compiler warnings for some versions of gcc
On 06/29/2013 03:47 PM, Stefan Weil wrote: i686-w64-mingw32-gcc (GCC) 4.6.3 from Debian wheezy reports these warnings: hw/ppc/spapr_hcall.c:188:1: warning: control reaches end of non-void function [-Wreturn-type] hw/ppc/spapr_pci.c:454:1: warning: control reaches end of non-void function [-Wreturn-type] Both warnings are fixed by using g_assert_not_reached instead of assert. A second line with assert(0) in spapr_pci.c which did not raise a compiler warning was modified, too, because g_assert_not_reached documents the purpose of that statement and is not removed in release builds. Signed-off-by: Stefan Weils...@weilnetz.de Thanks, applied to ppc-next. Alex
Re: [Qemu-devel] [Xen-devel] [PATCH] Citrix PV Bus device
On Tue, 2013-07-02 at 13:35 +0100, Paul Durrant wrote: -Original Message- From: Ian Campbell Sent: 02 July 2013 11:57 To: Tim (Xen.org) Cc: Paul Durrant; qemu-devel@nongnu.org; xen-de...@lists.xen.org Subject: Re: [Xen-devel] [PATCH] Citrix PV Bus device On Tue, 2013-07-02 at 11:49 +0100, Tim Deegan wrote: At 10:31 + on 02 Jul (1372761105), Paul Durrant wrote: Well, the WU drivers could refuse to install except as upgrade to themselves (i.e. fail if there's any unknown driver bound to the xen platform device, and also fail if there's _no_ driver bound). Then the guest admin can choose to install the drivers by hand and get automatic updates after that. That sounds reasonable. However I thought part of the point of getting things into WU was then that they could be inbox (either figuratively or literally) such that they would be installed by the Windows installer. Perhaps that's a separate thing though. No, that is the eventual aim so I don't think the 'upgrade only' options is really future-proof. Well, you can have them install by default on platforms that want it, or on new enough Xen versions. Or, even better, on new enough _windows_ versions. XS, XC and anyone else who chooses could carry a separate patch that changes the default to 'install if there are no drivers', signalling over xenstore, or ACPI, or a Windows domain policy, or whatever. Right. Surely having a new device for the purposes of hosting Citrix PV drivers is a cleaner option for opting in? Only if it's OK that the _host_ admin has to be involved (which was the original objection). Upgrade-only but hooked to the existing ID lets a guest admin install the drivers manually without plumbing it through $CLOUDPROVIDER's toolstack, and without having it appear suddenly on existing VMs in the dead of night. I think part of the problem here is that it is unclear who the target audience for these drivers are. Paul, are you intending that these drivers be only for XenServer users or are you intending for them to be used by the broader community on a variety of different Xen platforms? My intention is that the drivers are widely available to all who want them, but the key word there is 'want'. No-one should get a surprise when we publish to Windows Update so having the drivers bind to a new device which can then be added to the VM config seems like the cleanest solution. Actually, it occurs to me now (and I have a feeling Tim was trying to say this and I didn't get it): It's not really appropriate for any individual vendor to upload a driver to Windows Update which binds to the default platform device ID anyway. There are several other sets of Xen PV drivers out there (including the GPL PV drivers and various companies commercial/proprietary drivers) and having one particular set of drivers in WU puts all of them at a disadvantage. Before doing that we would need consensus behind the particular set of drivers, which I don't think any of them currently have. All of which means that you do need your own device ID, for which you can upload support to WU. However I don't think it therefore follows that you need that device ID to be supported by upstream. Does this work: We assign a device ID to your drivers (and we would do the same for any vendor). You can then upload drivers supporting that ID to WU and arrange within your product to create the appropriate device. At the same time you produce a setup.exe installer which will install those same drivers but also sets up (or includes) the binding to the existing device IDs. So anyone running on XenServer gets the driver direct from WU and you can define your own policies around what that means and what the upgrade path and ties to the platform mean etc. People who want to use your drivers on non-XenServer platforms can choose to do so by manually installing from within the guest, with no need to modify their guest configuration. Does that make sense? Ian.
Re: [Qemu-devel] [PATCH 1/3] spapr: Use named enum for function remove_hpte
On 06/24/2013 07:48 PM, Stefan Weil wrote: The function returned a target_ulong which was made from unnamed enum values. The target_ulong was then assigned to an int variable which was used in a switch statement. Using a named enum in both cases makes reviews easier. Signed-off-by: Stefan Weils...@weilnetz.de Thanks, applied to ppc-next. Alex
Re: [Qemu-devel] [PATCH 3/3] spapr: Fix compiler warning for some versions of gcc (spapr_io_read)
On 06/24/2013 07:48 PM, Stefan Weil wrote: i686-w64-mingw32-gcc (GCC) 4.6.3 from Debian wheezy reports this warning: hw/ppc/spapr_pci.c:454:1: warning: control reaches end of non-void function [-Wreturn-type] Adding a default case to the switch statement satisfies the compiler. This modification requires moving the assert statement. Signed-off-by: Stefan Weils...@weilnetz.de --- hw/ppc/spapr_pci.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 04e8362..c04086c 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -448,9 +448,10 @@ static uint64_t spapr_io_read(void *opaque, hwaddr addr, case 2: return cpu_inw(addr); case 4: +default: +assert(size == 4); return cpu_inl(addr); Please fix this one up with g_assert_not_reached() as well. Alex
[Qemu-devel] [PATCH] int128: optimize
static inline Int128 int128_neg(Int128 a) { -a.lo = ~a.lo; -a.hi = ~a.hi; -return int128_add(a, int128_one()); +uint64_t lo = -a.lo; +return (Int128) { lo, ~a.hi + !lo }; } This leaves int128_one unused. (Also the temporary lo seems a bit pointless, since you could just as well write -a.lo and !a.lo.) static inline bool int128_ge(Int128 a, Int128 b) { -return int128_nonneg(int128_sub(a, b)); +return a.hi b.hi || (a.hi == b.hi a.lo = b.lo); } This is a bug fix. The old version gives the wrong answer when a and b are both large and have opposite signs. Jay.
Re: [Qemu-devel] [PATCH] qom: Use atomics for object refcounting
Il 02/07/2013 13:52, Jan Kiszka ha scritto: But then atomic_dec_and_test or so. Letting the inc/dec return some value leaves room for interpretations (value of before or after the modification?). In qemu, I made all atomic_* functions return the old value. This is consistent with atomic_cmpxchg and atomic_xchg (where returning the new value makes no sense). Please avoid this ambiguity by naming the functions properly. That xchg returns old values is known, that dec and inc do, is surely not. IMO the ambiguity is resolved simply by looking at the docs or existing code, but I can rename them to atomic_fetch_{add,sub,and,or,inc,dec} and add void versions without fetch. Paolo
Re: [Qemu-devel] Significant slowdown after qemu-kvm-1.2.0
On Mon, Jul 1, 2013 at 7:48 PM, Andreas Färber afaer...@suse.de wrote: Hi, Am 02.07.2013 00:24, schrieb jinho hwang: Thank you for your reply. I tried qemu-1.5.0 with --enable-kvm, but there is still a big difference between qemu-kvm-1.2.0 and qemu-1.5.0. They both use the same command line to create VM. Am I missing something in the command-line? sudo qemu-system-x86_64 -name vm -m 1g -hda /home/jinho/virt/vms/vm1.img -device e1000,netdev=net0,mac=00:0c:29:f0:bc:33 -netdev tap,id=net0 -vnc :1 Yes, you are still missing -enable-kvm on the QEMU command line, or alternatively the more modern -machine accel=kvm. Regards, Andreas P.S. Please avoid top-posting and HTML on the mailing list. On Mon, Jul 1, 2013 at 5:54 PM, Paolo Bonzini pbonz...@redhat.com mailto:pbonz...@redhat.com wrote: Il 01/07/2013 23:51, jinho hwang ha scritto: Hi Guys, I am new to kvm development (used to work on Xen). I used qemu-kvm-1.2.0 first, and then due to pci memory size (x64 bit support), I switched from 1.2.0 to qemu-kvm-1.5.0 (same for 1.3.0~). There is no qemu-kvm 1.5.0. You are using qemu 1.5.0. After that, my VM slows down significantly including booting and execution. Also, sometimes kernel messages CPU stuck for x seconds show in dmesg. I think CPU scheduling has been changed or should be set up properly, but I have no clue for now. Can anyone help me to figure out this? I really need some fast VMs as I used to run with qemu-kvm-1.2.0. I am currently running ubuntu-12.10_64 in VM, and my configure was ./configure --prefix=/usr --target-list=x86_64-linux-user,x86_64-softmmu sudo qemu-system-x86_64 -name vm -m 1g -hda /home/jinho/virt/vms/vm1.img -device e1000,netdev=net0,mac=00:0c:29:f0:bc:33 -netdev tap,id=net0 -vnc 135.207.127.181:1 http://135.207.127.181:1 http://135.207.127.181:1 You are not using kvm. Add --enable-kvm to the command line. Paolo -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg That works perfect! I only included --enable-kvm to configure, not qemu-system-x86_64. Now it works very fast. I just registered e-mail list, and found the qemu development is very active. Very good to be part of it now. Thanks! - Jinho
Re: [Qemu-devel] [PATCH v3] e600 core for MPC86xx processors
On 06/24/2013 11:15 PM, Julio Guerra wrote: MPC86xx processors are based on the e600 core, which is not the case in qemu where it is based on the 7400 processor. This patch creates the e600 core and instantiates the MPC86xx processors based on it. Therefore, adding the high BATs, the SPRG 4..7 registers, which are e600-specific [1], and a HW MMU model (as 7400). This allows to define the MPC8610 processor too. Tested with a kernel using the HW TLB misses. [1] http://cache.freescale.com/files/32bit/doc/ref_manual/E600CORERM.pdf Signed-off-by: Julio Guerragu...@julio.in Thanks, applied to ppc-next. Alex
Re: [Qemu-devel] [PATCH 3/3] target-arm: explicitly decode SEVL instruction
On 13 June 2013 14:15, Andreas Färber afaer...@suse.de wrote: Am 07.06.2013 14:06, schrieb Mans Rullgard: case 2: /* wfe */ case 4: /* sev */ +case 5: /* sevl */ /* TODO: Implement SEV and WFE. May help SMP performance. */ default: /* nop */ break; So does SEVL need to be implemented? Then the TODO should be updated. Otherwise moving it a line down may be clearer. I updated the comment to mention SEVL as part of taking this patch into target-arm.next. thanks -- PMM