Re: [Qemu-devel] [PATCH 14/17] qmp: add x-debug-block-dirty-bitmap-sha256

2017-02-15 Thread Vladimir Sementsov-Ogievskiy

16.02.2017 03:35, John Snow wrote:


On 02/13/2017 04:54 AM, Vladimir Sementsov-Ogievskiy wrote:

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 

This is simply the same as the version in the other two series, right?


Yes, although the context differs a bit. I've just noticed that in the
migration series I'm adding bdrv_next_dirty_bitmap, while in the persistent
series it is bdrv_dirty_bitmap_next. Anyway, one series will have to be
rebased after the other is applied.




Reviewed-by: John Snow 


---
  block/dirty-bitmap.c |  5 +
  blockdev.c   | 29 +
  include/block/dirty-bitmap.h |  2 ++
  include/qemu/hbitmap.h   |  8 
  qapi/block-core.json | 27 +++
  tests/Makefile.include   |  2 +-
  util/hbitmap.c   | 11 +++
  7 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 32aa6eb..5bec99b 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -558,3 +558,8 @@ BdrvDirtyBitmap *bdrv_next_dirty_bitmap(BlockDriverState 
*bs,
  
  return QLIST_NEXT(bitmap, list);

  }
+
+char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp)
+{
+return hbitmap_sha256(bitmap->bitmap, errp);
+}
diff --git a/blockdev.c b/blockdev.c
index db82ac9..4d06885 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2790,6 +2790,35 @@ void qmp_block_dirty_bitmap_clear(const char *node, 
const char *name,
  aio_context_release(aio_context);
  }
  
+BlockDirtyBitmapSha256 *qmp_x_debug_block_dirty_bitmap_sha256(const char *node,

+  const char *name,
+  Error **errp)
+{
+AioContext *aio_context;
+BdrvDirtyBitmap *bitmap;
+BlockDriverState *bs;
+BlockDirtyBitmapSha256 *ret = NULL;
+char *sha256;
+
+bitmap = block_dirty_bitmap_lookup(node, name, &bs, &aio_context, errp);
+if (!bitmap || !bs) {
+return NULL;
+}
+
+sha256 = bdrv_dirty_bitmap_sha256(bitmap, errp);
+if (sha256 == NULL) {
+goto out;
+}
+
+ret = g_new(BlockDirtyBitmapSha256, 1);
+ret->sha256 = sha256;
+
+out:
+aio_context_release(aio_context);
+
+return ret;
+}
+
  void hmp_drive_del(Monitor *mon, const QDict *qdict)
  {
  const char *id = qdict_get_str(qdict, "id");
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 20b3ec7..ded872a 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -78,4 +78,6 @@ void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap 
*bitmap);
  BdrvDirtyBitmap *bdrv_next_dirty_bitmap(BlockDriverState *bs,
  BdrvDirtyBitmap *bitmap);
  
+char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp);

+
  #endif
diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 9239fe5..f353e56 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -238,6 +238,14 @@ void hbitmap_deserialize_zeroes(HBitmap *hb, uint64_t 
start, uint64_t count,
  void hbitmap_deserialize_finish(HBitmap *hb);
  
  /**

+ * hbitmap_sha256:
+ * @bitmap: HBitmap to operate on.
+ *
+ * Returns SHA256 hash of the last level.
+ */
+char *hbitmap_sha256(const HBitmap *bitmap, Error **errp);
+
+/**
   * hbitmap_free:
   * @hb: HBitmap to operate on.
   *
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 932f5bb..8646054 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1632,6 +1632,33 @@
'data': 'BlockDirtyBitmap' }
  
  ##

+# @BlockDirtyBitmapSha256:
+#
+# SHA256 hash of dirty bitmap data
+#
+# @sha256: ASCII representation of SHA256 bitmap hash
+#
+# Since: 2.9
+##
+  { 'struct': 'BlockDirtyBitmapSha256',
+'data': {'sha256': 'str'} }
+
+##
+# @x-debug-block-dirty-bitmap-sha256:
+#
+# Get bitmap SHA256
+#
+# Returns: BlockDirtyBitmapSha256 on success
+#  If @node is not a valid block device, DeviceNotFound
+#  If @name is not found or if hashing has failed, GenericError with an
+#  explanation
+#
+# Since: 2.9
+##
+  { 'command': 'x-debug-block-dirty-bitmap-sha256',
+'data': 'BlockDirtyBitmap', 'returns': 'BlockDirtyBitmapSha256' }
+
+##
  # @blockdev-mirror:
  #
  # Start mirroring a block device's writes to a new destination.
diff --git a/tests/Makefile.include b/tests/Makefile.include
index 634394a..7a71b4d 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -526,7 +526,7 @@ tests/test-blockjob$(EXESUF): tests/test-blockjob.o 
$(test-block-obj-y) $(test-u
  tests/test-blockjob-txn$(EXESUF): tests/test-blockjob-txn.o 
$(test-block-obj-y) $(test-util-obj-y)
  tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(test-block-obj-y)
  tests/test-iov$(EXESUF): tests/test-iov.o $(test-util-obj-y)
-tests/test-hbitmap$(EXESUF): 

[Qemu-devel] [PATCH v7 7/8] tests: Add unit tests for the VM Generation ID feature

2017-02-15 Thread ben
From: Ben Warren 

The following tests are implemented:
* test that a GUID passed in by command line is propagated to the guest.
  Read the GUID both from guest memory and from the monitor
* test that the "auto" argument to the GUID generates a valid GUID, as
  seen by the guest.

  This patch is loosely based on a previous patch from:
  Gal Hammer   and Igor Mammedov 
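
The GUID that the tests read from guest memory is stored in little-endian
GUID layout, while the textual form lists the bytes in big-endian order. The
following is a minimal sketch of that conversion, assuming that only the
first three fields (4, 2 and 2 bytes) are byte-swapped and the trailing 8
bytes are left alone. The type Uuid16 and both helper names are made up for
this illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 16 raw UUID bytes, in the order they appear in the textual form. */
typedef struct {
    uint8_t data[16];
} Uuid16;

/* Reverse the bytes of data[start .. start + len - 1] in place. */
static void reverse_bytes(uint8_t *data, size_t start, size_t len)
{
    for (size_t i = 0; i < len / 2; i++) {
        uint8_t tmp = data[start + i];
        data[start + i] = data[start + len - 1 - i];
        data[start + len - 1 - i] = tmp;
    }
}

/* Hypothetical helper: swap the 4-, 2- and 2-byte fields; keep the rest. */
static Uuid16 uuid_to_guest_layout(Uuid16 in)
{
    reverse_bytes(in.data, 0, 4);
    reverse_bytes(in.data, 4, 2);
    reverse_bytes(in.data, 6, 2);
    return in;
}
```

Applying the helper twice yields the original bytes again, so the same
routine would serve a test that compares the guest-memory copy with the
value reported by the monitor.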

Signed-off-by: Ben Warren 
---
 tests/Makefile.include |   2 +
 tests/vmgenid-test.c   | 174 +
 2 files changed, 176 insertions(+)
 create mode 100644 tests/vmgenid-test.c

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 143507e..8d36341 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -241,6 +241,7 @@ check-qtest-i386-y += tests/usb-hcd-xhci-test$(EXESUF)
 gcov-files-i386-y += hw/usb/hcd-xhci.c
 check-qtest-i386-y += tests/pc-cpu-test$(EXESUF)
 check-qtest-i386-y += tests/q35-test$(EXESUF)
+check-qtest-i386-y += tests/vmgenid-test$(EXESUF)
 gcov-files-i386-y += hw/pci-host/q35.c
 check-qtest-i386-$(CONFIG_VHOST_NET_TEST_i386) += 
tests/vhost-user-test$(EXESUF)
 ifeq ($(CONFIG_VHOST_NET_TEST_i386),)
@@ -726,6 +727,7 @@ tests/ivshmem-test$(EXESUF): tests/ivshmem-test.o 
contrib/ivshmem-server/ivshmem
 tests/vhost-user-bridge$(EXESUF): tests/vhost-user-bridge.o 
contrib/libvhost-user/libvhost-user.o $(test-util-obj-y)
 tests/test-uuid$(EXESUF): tests/test-uuid.o $(test-util-obj-y)
 tests/test-arm-mptimer$(EXESUF): tests/test-arm-mptimer.o
+tests/vmgenid-test$(EXESUF): tests/vmgenid-test.o tests/acpi-utils.o
 
 tests/migration/stress$(EXESUF): tests/migration/stress.o
$(call quiet-command, $(LINKPROG) -static -O3 $(PTHREAD_LIB) -o $@ $< 
,"LINK","$(TARGET_DIR)$@")
diff --git a/tests/vmgenid-test.c b/tests/vmgenid-test.c
new file mode 100644
index 000..1741455
--- /dev/null
+++ b/tests/vmgenid-test.c
@@ -0,0 +1,174 @@
+/*
+ * QTest testcase for VM Generation ID
+ *
+ * Copyright (c) 2016 Red Hat, Inc.
+ * Copyright (c) 2017 Skyport Systems
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include 
+#include 
+#include 
+#include "qemu/osdep.h"
+#include "qemu/bitmap.h"
+#include "qemu/uuid.h"
+#include "hw/acpi/acpi-defs.h"
+#include "acpi-utils.h"
+#include "libqtest.h"
+
+#define VGID_GUID "324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87"
+#define VMGENID_GUID_OFFSET  40   /* allow space for
+   * OVMF SDT Header Probe Suppressor
+   */
+
+typedef struct {
+AcpiTableHeader header;
+gchar name_op;
+gchar vgia[4];
+gchar val_op;
+uint32_t vgia_val;
+} QEMU_PACKED VgidTable;
+
+static uint32_t acpi_find_vgia(void)
+{
+uint32_t off;
+AcpiRsdpDescriptor rsdp_table;
+uint32_t rsdt;
+AcpiRsdtDescriptorRev1 rsdt_table;
+int tables_nr;
+uint32_t *tables;
+AcpiTableHeader ssdt_table;
+VgidTable vgid_table;
+int i;
+
+off = acpi_find_rsdp_address();
+g_assert_cmphex(off, <, 0x10);
+
+acpi_parse_rsdp_table(off, &rsdp_table);
+
+rsdt = rsdp_table.rsdt_physical_address;
+/* read the header */
+ACPI_READ_TABLE_HEADER(&rsdt_table, rsdt);
+ACPI_ASSERT_CMP(rsdt_table.signature, "RSDT");
+
+/* compute the table entries in rsdt */
+tables_nr = (rsdt_table.length - sizeof(AcpiRsdtDescriptorRev1)) /
+sizeof(uint32_t);
+g_assert_cmpint(tables_nr, >, 0);
+
+/* get the addresses of the tables pointed by rsdt */
+tables = g_new0(uint32_t, tables_nr);
+ACPI_READ_ARRAY_PTR(tables, tables_nr, rsdt);
+
+for (i = 0; i < tables_nr; i++) {
+ACPI_READ_TABLE_HEADER(&ssdt_table, tables[i]);
+if (!strncmp((char *)ssdt_table.oem_table_id, "VMGENID", 7)) {
+/* the first entry in the table should be VGIA
+ * That's all we need
+ */
+ACPI_READ_FIELD(vgid_table.name_op, tables[i]);
+g_assert(vgid_table.name_op == 0x08);  /* name */
+ACPI_READ_ARRAY(vgid_table.vgia, tables[i]);
+g_assert(memcmp(vgid_table.vgia, "VGIA", 4) == 0);
+ACPI_READ_FIELD(vgid_table.val_op, tables[i]);
+g_assert(vgid_table.val_op == 0x0C);  /* dword */
+ACPI_READ_FIELD(vgid_table.vgia_val, tables[i]);
+/* The GUID is written at a fixed offset into the fw_cfg file
+ * in order to implement the "OVMF SDT Header probe suppressor"
+ * see docs/specs/vmgenid.txt for more details
+ */
+return vgid_table.vgia_val + VMGENID_GUID_OFFSET;
+}
+}
+return 0;
+}
+
+static void read_guid_from_memory(QemuUUID *guid)
+{
+uint32_t vmgenid_addr;
+int i;
+
+vmgenid_addr = acpi_find_vgia();
+g_assert(vmgenid_addr);
+
+/* Read the GUID directly from 

[Qemu-devel] [PATCH v7 4/8] ACPI: Add Virtual Machine Generation ID support

2017-02-15 Thread ben
From: Ben Warren 

This implements the VM Generation ID feature by passing a 128-bit
GUID to the guest via a fw_cfg blob.
Any time the GUID changes, an ACPI notify event is sent to the guest.

The user interface is a simple device with one parameter:
 - guid (string, must be "auto" or in UUID format
   ----)
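
As an illustration of the accepted parameter values, here is a minimal
sketch of a check for the guid string. It assumes the usual 8-4-4-4-12
hexadecimal UUID text form; guid_param_is_valid() is a hypothetical helper
written for this note, not the parsing code the device actually uses.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true for "auto" or an 8-4-4-4-12 hex UUID string. */
static bool guid_param_is_valid(const char *s)
{
    if (strcmp(s, "auto") == 0) {
        return true;
    }
    if (strlen(s) != 36) {
        return false;
    }
    for (size_t i = 0; i < 36; i++) {
        /* Dashes are expected only at these four positions. */
        bool dash_pos = (i == 8 || i == 13 || i == 18 || i == 23);
        if (dash_pos ? s[i] != '-' : !isxdigit((unsigned char)s[i])) {
            return false;
        }
    }
    return true;
}
```

Under these assumptions both "auto" and a string such as
"324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87" pass, while anything shorter or with
misplaced dashes is rejected.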

Signed-off-by: Ben Warren 
---
 default-configs/i386-softmmu.mak |   1 +
 default-configs/x86_64-softmmu.mak   |   1 +
 hw/acpi/Makefile.objs|   1 +
 hw/acpi/vmgenid.c| 239 +++
 hw/i386/acpi-build.c |  16 +++
 include/hw/acpi/acpi_dev_interface.h |   1 +
 include/hw/acpi/vmgenid.h|  35 +
 7 files changed, 294 insertions(+)
 create mode 100644 hw/acpi/vmgenid.c
 create mode 100644 include/hw/acpi/vmgenid.h

diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 48b07a4..029e952 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -59,3 +59,4 @@ CONFIG_I82801B11=y
 CONFIG_SMBIOS=y
 CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
 CONFIG_PXB=y
+CONFIG_ACPI_VMGENID=y
diff --git a/default-configs/x86_64-softmmu.mak 
b/default-configs/x86_64-softmmu.mak
index fd96345..d1d7432 100644
--- a/default-configs/x86_64-softmmu.mak
+++ b/default-configs/x86_64-softmmu.mak
@@ -59,3 +59,4 @@ CONFIG_I82801B11=y
 CONFIG_SMBIOS=y
 CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
 CONFIG_PXB=y
+CONFIG_ACPI_VMGENID=y
diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
index 6acf798..11c35bc 100644
--- a/hw/acpi/Makefile.objs
+++ b/hw/acpi/Makefile.objs
@@ -5,6 +5,7 @@ common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu_hotplug.o
 common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o
 common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
 common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
+common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
 common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
 
 common-obj-y += acpi_interface.o
diff --git a/hw/acpi/vmgenid.c b/hw/acpi/vmgenid.c
new file mode 100644
index 000..8fba7e0
--- /dev/null
+++ b/hw/acpi/vmgenid.c
@@ -0,0 +1,239 @@
+/*
+ *  Virtual Machine Generation ID Device
+ *
+ *  Copyright (C) 2017 Skyport Systems.
+ *
+ *  Author: Ben Warren 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qmp-commands.h"
+#include "hw/acpi/acpi.h"
+#include "hw/acpi/aml-build.h"
+#include "hw/acpi/vmgenid.h"
+#include "hw/nvram/fw_cfg.h"
+#include "sysemu/sysemu.h"
+
+void vmgenid_build_acpi(VmGenIdState *vms, GArray *table_data, GArray *guid,
+BIOSLinker *linker)
+{
+Aml *ssdt, *dev, *scope, *method, *addr, *if_ctx;
+uint32_t vgia_offset;
+QemuUUID guid_le;
+
+/* Fill in the GUID values.  These need to be converted to little-endian
+ * first, since that's what the guest expects
+ */
+g_array_set_size(guid, VMGENID_FW_CFG_SIZE - ARRAY_SIZE(guid_le.data));
+guid_le = vms->guid;
+qemu_uuid_bswap(&guid_le);
+/* The GUID is written at a fixed offset into the fw_cfg file
+ * in order to implement the "OVMF SDT Header probe suppressor"
+ * see docs/specs/vmgenid.txt for more details
+ */
+g_array_insert_vals(guid, VMGENID_GUID_OFFSET, guid_le.data,
+ARRAY_SIZE(guid_le.data));
+
+/* Put this in a separate SSDT table */
+ssdt = init_aml_allocator();
+
+/* Reserve space for header */
+acpi_data_push(ssdt->buf, sizeof(AcpiTableHeader));
+
+/* Storage for the GUID address */
+vgia_offset = table_data->len +
+build_append_named_dword(ssdt->buf, "VGIA");
+scope = aml_scope("\\_SB");
+dev = aml_device("VGEN");
+aml_append(dev, aml_name_decl("_HID", aml_string("QEMUVGID")));
+aml_append(dev, aml_name_decl("_CID", aml_string("VM_Gen_Counter")));
+aml_append(dev, aml_name_decl("_DDN", aml_string("VM_Gen_Counter")));
+
+/* Simple status method to check that address is linked and non-zero */
+method = aml_method("_STA", 0, AML_NOTSERIALIZED);
+addr = aml_local(0);
+aml_append(method, aml_store(aml_int(0xf), addr));
+if_ctx = aml_if(aml_equal(aml_name("VGIA"), aml_int(0)));
+aml_append(if_ctx, aml_store(aml_int(0), addr));
+aml_append(method, if_ctx);
+aml_append(method, aml_return(addr));
+aml_append(dev, method);
+
+/* the ADDR method returns two 32-bit words representing the lower and
+ * upper halves of the physical address of the fw_cfg blob
+ * (holding the GUID)
+ */
+method = aml_method("ADDR", 0, AML_NOTSERIALIZED);
+
+addr = aml_local(0);
+aml_append(method, aml_store(aml_package(2), addr));
+
+aml_append(method, aml_store(aml_add(aml_name("VGIA"),
+ 

[Qemu-devel] [PATCH v7 3/8] ACPI: Add vmgenid blob storage to the build tables

2017-02-15 Thread ben
From: Ben Warren 

This allows them to be centrally initialized and destroyed

The "AcpiBuildTables.vmgenid" array will be used to construct the
"etc/vmgenid_guid" fw_cfg blob.

Its contents will be linked into fw_cfg after being built on the
pc_machine_done() -> acpi_setup() -> acpi_build() call path, and dropped
without use on the subsequent, guest triggered, acpi_build_update() ->
acpi_build() call path.

Signed-off-by: Ben Warren 
Reviewed-by: Laszlo Ersek 
Reviewed-by: Igor Mammedov 
---
 hw/acpi/aml-build.c | 2 ++
 include/hw/acpi/aml-build.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index b2a1e40..c6f2032 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1559,6 +1559,7 @@ void acpi_build_tables_init(AcpiBuildTables *tables)
 tables->rsdp = g_array_new(false, true /* clear */, 1);
 tables->table_data = g_array_new(false, true /* clear */, 1);
 tables->tcpalog = g_array_new(false, true /* clear */, 1);
+tables->vmgenid = g_array_new(false, true /* clear */, 1);
 tables->linker = bios_linker_loader_init();
 }
 
@@ -1568,6 +1569,7 @@ void acpi_build_tables_cleanup(AcpiBuildTables *tables, 
bool mfre)
 g_array_free(tables->rsdp, true);
 g_array_free(tables->table_data, true);
 g_array_free(tables->tcpalog, mfre);
+g_array_free(tables->vmgenid, mfre);
 }
 
 /* Build rsdt table */
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 559326c..00c21f1 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -210,6 +210,7 @@ struct AcpiBuildTables {
 GArray *table_data;
 GArray *rsdp;
 GArray *tcpalog;
+GArray *vmgenid;
 BIOSLinker *linker;
 } AcpiBuildTables;
 
-- 
2.7.4




Re: [Qemu-devel] [PATCH v6 5/7] qmp/hmp: add query-vm-generation-id and 'info vm-generation-id' commands

2017-02-15 Thread Ben Warren

> On Feb 15, 2017, at 7:36 AM, Laszlo Ersek  wrote:
> 
> Two questions:
> 
> On 02/15/17 07:15, b...@skyportsystems.com  
> wrote:
>> From: Igor Mammedov 
>> 
>> Add commands to query Virtual Machine Generation ID counter.
>> 
>> QMP command example:
>>{ "execute": "query-vm-generation-id" }
>> 
>> HMP command example:
>>info vm-generation-id
>> 
>> Signed-off-by: Igor Mammedov 
>> Reviewed-by: Eric Blake 
>> Signed-off-by: Ben Warren 
>> ---
>> hmp-commands-info.hx | 13 +
>> hmp.c|  9 +
>> hmp.h|  1 +
>> hw/acpi/vmgenid.c| 16 
>> qapi-schema.json | 20 
>> stubs/Makefile.objs  |  1 +
>> stubs/vmgenid.c  |  8 
>> 7 files changed, 68 insertions(+)
>> create mode 100644 stubs/vmgenid.c
>> 
>> diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
>> index b0f35e6..f3df793 100644
>> --- a/hmp-commands-info.hx
>> +++ b/hmp-commands-info.hx
>> @@ -802,6 +802,19 @@ Show information about hotpluggable CPUs
>> ETEXI
>> 
>> STEXI
>> +@item info vm-generation-id
> 
> (1) Don't we need some kind of @findex here, for consistency with the
> rest of the file?
> 
>> +Show Virtual Machine Generation ID
>> +ETEXI
>> +
>> +{
>> +.name   = "vm-generation-id",
>> +.args_type  = "",
>> +.params = "",
>> +.help   = "Show Virtual Machine Generation ID",
>> +.cmd = hmp_info_vm_generation_id,
>> +},
>> +
>> +STEXI
>> @end table
>> ETEXI
>> 
>> diff --git a/hmp.c b/hmp.c
>> index 2bc4f06..535613d 100644
>> --- a/hmp.c
>> +++ b/hmp.c
>> @@ -2565,3 +2565,12 @@ void hmp_hotpluggable_cpus(Monitor *mon, const QDict 
>> *qdict)
>> 
>> qapi_free_HotpluggableCPUList(saved);
>> }
>> +
>> +void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict)
>> +{
>> +GuidInfo *info = qmp_query_vm_generation_id(NULL);
>> +if (info) {
>> +monitor_printf(mon, "%s\n", info->guid);
>> +}
>> +qapi_free_GuidInfo(info);
>> +}
>> diff --git a/hmp.h b/hmp.h
>> index 05daf7c..799fd37 100644
>> --- a/hmp.h
>> +++ b/hmp.h
>> @@ -137,5 +137,6 @@ void hmp_rocker_of_dpa_flows(Monitor *mon, const QDict 
>> *qdict);
>> void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict *qdict);
>> void hmp_info_dump(Monitor *mon, const QDict *qdict);
>> void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict);
>> +void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict);
>> 
>> #endif
>> diff --git a/hw/acpi/vmgenid.c b/hw/acpi/vmgenid.c
>> index b1b7b32..c159c76 100644
>> --- a/hw/acpi/vmgenid.c
>> +++ b/hw/acpi/vmgenid.c
>> @@ -235,3 +235,19 @@ static void vmgenid_register_types(void)
>> }
>> 
>> type_init(vmgenid_register_types)
>> +
>> +GuidInfo *qmp_query_vm_generation_id(Error **errp)
>> +{
>> +GuidInfo *info;
>> +VmGenIdState *vms;
>> +Object *obj = find_vmgenid_dev();
>> +
>> +if (!obj) {
>> +return NULL;
>> +}
>> +vms = VMGENID(obj);
>> +
>> +info = g_malloc0(sizeof(*info));
>> +info->guid = qemu_uuid_unparse_strdup(&vms->guid);
>> +return info;
>> +}
>> diff --git a/qapi-schema.json b/qapi-schema.json
>> index 61151f3..5e2a47f 100644
>> --- a/qapi-schema.json
>> +++ b/qapi-schema.json
>> @@ -6051,3 +6051,23 @@
>> #
>> ##
>> { 'command': 'query-hotpluggable-cpus', 'returns': ['HotpluggableCPU'] }
>> +
>> +##
>> +# @GuidInfo:
>> +#
>> +# GUID information.
>> +#
>> +# @guid: the globally unique identifier
>> +#
>> +# Since: 2.9
>> +##
>> +{ 'struct': 'GuidInfo', 'data': {'guid': 'str'} }
>> +
>> +##
>> +# @query-vm-generation-id:
>> +#
>> +# Show Virtual Machine Generation ID
>> +#
>> +# Since 2.9
>> +##
>> +{ 'command': 'query-vm-generation-id', 'returns': 'GuidInfo' }
>> diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
>> index a187295..0bffca6 100644
>> --- a/stubs/Makefile.objs
>> +++ b/stubs/Makefile.objs
>> @@ -35,3 +35,4 @@ stub-obj-y += qmp_pc_dimm_device_list.o
>> stub-obj-y += target-monitor-defs.o
>> stub-obj-y += target-get-monitor-def.o
>> stub-obj-y += pc_madt_cpu_entry.o
>> +stub-obj-y += vmgenid.o
>> diff --git a/stubs/vmgenid.c b/stubs/vmgenid.c
>> new file mode 100644
>> index 000..8c448ac
>> --- /dev/null
>> +++ b/stubs/vmgenid.c
>> @@ -0,0 +1,8 @@
>> +#include "qemu/osdep.h"
>> +#include "qmp-commands.h"
>> +
>> +GuidInfo *qmp_query_vm_generation_id(Error **errp)
>> +{
>> +error_setg(errp, "this command is not currently supported");
>> +return NULL;
>> +}
>> 
> 
> (2) Don't we usually employ QERR_UNSUPPORTED for the format string in
> such cases?
> 
> With or without updates:
> 
> Reviewed-by: Laszlo Ersek
> 
Both items changed.  Thanks!
> Thanks
> Laszlo





Re: [Qemu-devel] [PATCH v6 4/7] ACPI: Add Virtual Machine Generation ID support

2017-02-15 Thread Ben Warren

> On Feb 15, 2017, at 7:24 AM, Laszlo Ersek  wrote:
> 
> On 02/15/17 13:19, Igor Mammedov wrote:
>> On Tue, 14 Feb 2017 22:15:46 -0800
>> b...@skyportsystems.com wrote:
>> 
>>> From: Ben Warren 
>>> 
>>> This implements the VM Generation ID feature by passing a 128-bit
>>> GUID to the guest via a fw_cfg blob.
>>> Any time the GUID changes, an ACPI notify event is sent to the guest
>>> 
>>> The user interface is a simple device with one parameter:
>>> - guid (string, must be "auto" or in UUID format
>>>   ----)
>>> 
>>> Signed-off-by: Ben Warren 
>>> ---
>>> default-configs/i386-softmmu.mak |   1 +
>>> default-configs/x86_64-softmmu.mak   |   1 +
>>> hw/acpi/Makefile.objs|   1 +
>>> hw/acpi/vmgenid.c| 237 
>>> +++
>>> hw/i386/acpi-build.c |  16 +++
>>> include/hw/acpi/acpi_dev_interface.h |   1 +
>>> include/hw/acpi/vmgenid.h|  35 ++
>>> 7 files changed, 292 insertions(+)
>>> create mode 100644 hw/acpi/vmgenid.c
>>> create mode 100644 include/hw/acpi/vmgenid.h
>>> 
>>> diff --git a/default-configs/i386-softmmu.mak 
>>> b/default-configs/i386-softmmu.mak
>>> index 48b07a4..029e952 100644
>>> --- a/default-configs/i386-softmmu.mak
>>> +++ b/default-configs/i386-softmmu.mak
>>> @@ -59,3 +59,4 @@ CONFIG_I82801B11=y
>>> CONFIG_SMBIOS=y
>>> CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
>>> CONFIG_PXB=y
>>> +CONFIG_ACPI_VMGENID=y
>>> diff --git a/default-configs/x86_64-softmmu.mak 
>>> b/default-configs/x86_64-softmmu.mak
>>> index fd96345..d1d7432 100644
>>> --- a/default-configs/x86_64-softmmu.mak
>>> +++ b/default-configs/x86_64-softmmu.mak
>>> @@ -59,3 +59,4 @@ CONFIG_I82801B11=y
>>> CONFIG_SMBIOS=y
>>> CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
>>> CONFIG_PXB=y
>>> +CONFIG_ACPI_VMGENID=y
>>> diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
>>> index 6acf798..11c35bc 100644
>>> --- a/hw/acpi/Makefile.objs
>>> +++ b/hw/acpi/Makefile.objs
>>> @@ -5,6 +5,7 @@ common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu_hotplug.o
>>> common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o
>>> common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
>>> common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
>>> +common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
>>> common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
>>> 
>>> common-obj-y += acpi_interface.o
>>> diff --git a/hw/acpi/vmgenid.c b/hw/acpi/vmgenid.c
>>> new file mode 100644
>>> index 000..b1b7b32
>>> --- /dev/null
>>> +++ b/hw/acpi/vmgenid.c
>>> @@ -0,0 +1,237 @@
>>> +/*
>>> + *  Virtual Machine Generation ID Device
>>> + *
>>> + *  Copyright (C) 2017 Skyport Systems.
>>> + *
>>> + *  Author: Ben Warren 
>>> + *
>>> + * This work is licensed under the terms of the GNU GPL, version 2 or 
>>> later.
>>> + * See the COPYING file in the top-level directory.
>>> + *
>>> + */
>>> +
>>> +#include "qemu/osdep.h"
>>> +#include "qmp-commands.h"
>>> +#include "hw/acpi/acpi.h"
>>> +#include "hw/acpi/aml-build.h"
>>> +#include "hw/acpi/vmgenid.h"
>>> +#include "hw/nvram/fw_cfg.h"
>>> +#include "sysemu/sysemu.h"
>>> +
>>> +void vmgenid_build_acpi(VmGenIdState *vms, GArray *table_data, GArray 
>>> *guid,
>>> +BIOSLinker *linker)
>>> +{
>>> +Aml *ssdt, *dev, *scope, *method, *addr, *if_ctx;
>>> +uint32_t vgia_offset;
>>> +QemuUUID guid_le;
>>> +
>>> +/* Fill in the GUID values.  These need to be converted to 
>>> little-endian
>>> + * first, since that's what the guest expects
>>> + */
>>> +g_array_set_size(guid, VMGENID_FW_CFG_SIZE);
>>> +memcpy(&guid_le.data, &vms->guid.data, sizeof(vms->guid.data));
>>> +qemu_uuid_bswap(&guid_le);
>>> +/* The GUID is written at a fixed offset into the fw_cfg file
>>> + * in order to implement the "OVMF SDT Header probe suppressor"
>>> + * see docs/specs/vmgenid.txt for more details
>>> + */
>>> +g_array_insert_vals(guid, VMGENID_GUID_OFFSET, guid_le.data,
>>> +ARRAY_SIZE(guid_le.data));
> 
> Ben:
> 
> (1) The logic is sane here, but the initial sizing of the array is not
> correct. The initial size should be
> 
>  (VMGENID_FW_CFG_SIZE - ARRAY_SIZE(guid_le.data))
> 
> The reason for this is that g_array_insert_vals() really inserts (it
> doesn't overwrite) data, therefore it grows the array. From the GLib
> source code [glib/garray.c]:
> 
> --
> GArray*
> g_array_insert_vals (GArray*farray,
> guint  index_,
> gconstpointer  data,
> guint  len)
> {
>  GRealArray *array = (GRealArray*) farray;
> 
>  g_return_val_if_fail (array, NULL);
> 
>  g_array_maybe_expand (array, len);
> 
>  memmove (g_array_elt_pos (array, len + index_),
>   g_array_elt_pos (array, index_),
>   g_array_elt_len (array, array->len - index_));
> 
>  memcpy 
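
The sizing argument above can be checked with a small stand-alone model. This
is only a sketch: ByteArray and its helpers are toy stand-ins written with
the C library, not GLib code, and the blob size of 4096 bytes used in the
usage note is an assumed value for VMGENID_FW_CFG_SIZE. The offset (40) and
the GUID length (16 bytes, i.e. 128 bits) come from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUID_OFFSET 40   /* VMGENID_GUID_OFFSET in the patch */
#define GUID_LEN    16   /* 128-bit GUID */

/* Toy byte array; a stand-in for a GArray with element size 1. */
typedef struct {
    unsigned char *data;
    size_t len;
} ByteArray;

/* Resize to len bytes (len > 0), zero-filling any newly added tail. */
static void byte_array_set_size(ByteArray *a, size_t len)
{
    unsigned char *p = realloc(a->data, len);
    if (p == NULL) {
        abort();
    }
    if (len > a->len) {
        memset(p + a->len, 0, len - a->len);
    }
    a->data = p;
    a->len = len;
}

/* Insert n bytes at index: the tail is shifted up, so len grows by n. */
static void byte_array_insert_vals(ByteArray *a, size_t index,
                                   const unsigned char *vals, size_t n)
{
    size_t old_len = a->len;

    byte_array_set_size(a, old_len + n);
    memmove(a->data + index + n, a->data + index, old_len - index);
    memcpy(a->data + index, vals, n);
}

/* Final blob length when the array is pre-sized to initial_size. */
static size_t blob_len_after_insert(size_t initial_size)
{
    ByteArray a = { NULL, 0 };
    unsigned char guid[GUID_LEN] = { 0 };
    size_t len;

    byte_array_set_size(&a, initial_size);
    byte_array_insert_vals(&a, GUID_OFFSET, guid, GUID_LEN);
    len = a.len;
    free(a.data);
    return len;
}
```

With the assumed blob size, blob_len_after_insert(4096) gives 4112, which is
16 bytes too large, whereas blob_len_after_insert(4096 - 16) gives exactly
4096, matching the initial size suggested in the review.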

Re: [Qemu-devel] [PATCH v6 3/7] ACPI: Add vmgenid blob storage to the build tables

2017-02-15 Thread Ben Warren

> On Feb 15, 2017, at 6:30 AM, Laszlo Ersek  wrote:
> 
> On 02/15/17 07:15, b...@skyportsystems.com wrote:
>> From: Ben Warren 
>> 
>> This allows them to be centrally initialized and destroyed
>> 
>> The "AcpiBuildTables.vmgenid" array will be used to construct the
>> "etc/vmgenid" fw_cfg blob.
> 
> Trivial wart: the blob is now called "etc/vmgenid_guid".
> 
> If you send a v7, feel free to fix it up. Not critical.
> 
Fixed in v7
> My R-b stands.
> 
> Thanks!
> Laszlo
> 
>> Its contents will be linked into fw_cfg after being built on the
>> pc_machine_done() -> acpi_setup() -> acpi_build() call path, and dropped
>> without use on the subsequent, guest triggered, acpi_build_update() ->
>> acpi_build() call path.
>> 
>> Signed-off-by: Ben Warren 
>> Reviewed-by: Laszlo Ersek 
>> ---
>> hw/acpi/aml-build.c | 2 ++
>> include/hw/acpi/aml-build.h | 1 +
>> 2 files changed, 3 insertions(+)
>> 
>> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
>> index b2a1e40..c6f2032 100644
>> --- a/hw/acpi/aml-build.c
>> +++ b/hw/acpi/aml-build.c
>> @@ -1559,6 +1559,7 @@ void acpi_build_tables_init(AcpiBuildTables *tables)
>> tables->rsdp = g_array_new(false, true /* clear */, 1);
>> tables->table_data = g_array_new(false, true /* clear */, 1);
>> tables->tcpalog = g_array_new(false, true /* clear */, 1);
>> +tables->vmgenid = g_array_new(false, true /* clear */, 1);
>> tables->linker = bios_linker_loader_init();
>> }
>> 
>> @@ -1568,6 +1569,7 @@ void acpi_build_tables_cleanup(AcpiBuildTables 
>> *tables, bool mfre)
>> g_array_free(tables->rsdp, true);
>> g_array_free(tables->table_data, true);
>> g_array_free(tables->tcpalog, mfre);
>> +g_array_free(tables->vmgenid, mfre);
>> }
>> 
>> /* Build rsdt table */
>> diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
>> index 559326c..00c21f1 100644
>> --- a/include/hw/acpi/aml-build.h
>> +++ b/include/hw/acpi/aml-build.h
>> @@ -210,6 +210,7 @@ struct AcpiBuildTables {
>> GArray *table_data;
>> GArray *rsdp;
>> GArray *tcpalog;
>> +GArray *vmgenid;
>> BIOSLinker *linker;
>> } AcpiBuildTables;
>> 
>> 
> 





[Qemu-devel] [PATCH v2 3/4] char: remove the right fd been watched in qemu_chr_fe_set_handlers()

2017-02-15 Thread zhanghailiang
qemu_chr_fe_set_handlers() can add or remove the watched fd in 'context',
which can be either the default main context or another, explicitly given
context. The original logic is not correct, though: the right fd is never
removed, because we call g_main_context_find_source_by_id(NULL, tag), which
always looks for the GSource in the default context.

Fix it by passing the right context to g_main_context_find_source_by_id().
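
The lookup problem can be reproduced with a toy model. The sketch below is
not GLib code: Context, context_attach() and context_has_source() are
hypothetical stand-ins that only mimic the convention that a NULL context
means "the default context".

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SOURCES 8

/* Toy event-loop context holding the ids of its attached sources. */
typedef struct {
    unsigned ids[MAX_SOURCES];
    size_t n;
} Context;

static Context default_context;

/* Attach a source id to ctx; NULL selects the default context. */
static void context_attach(Context *ctx, unsigned id)
{
    if (ctx == NULL) {
        ctx = &default_context;
    }
    assert(ctx->n < MAX_SOURCES);
    ctx->ids[ctx->n++] = id;
}

/* Return 1 if id is attached to ctx (NULL means default), else 0. */
static int context_has_source(const Context *ctx, unsigned id)
{
    if (ctx == NULL) {
        ctx = &default_context;
    }
    for (size_t i = 0; i < ctx->n; i++) {
        if (ctx->ids[i] == id) {
            return 1;
        }
    }
    return 0;
}
```

A source attached to a worker context is invisible to a lookup that passes
NULL, which is why the removal path has to receive the same context that was
used when the watch was added.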

Cc: Paolo Bonzini 
Cc: Marc-André Lureau 
Signed-off-by: zhanghailiang 
---
 chardev/char-io.c | 13 +
 chardev/char-io.h |  2 ++
 chardev/char.c|  2 +-
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/chardev/char-io.c b/chardev/char-io.c
index 7dfc3f2..a69cc61 100644
--- a/chardev/char-io.c
+++ b/chardev/char-io.c
@@ -127,14 +127,14 @@ guint io_add_watch_poll(Chardev *chr,
 return tag;
 }
 
-static void io_remove_watch_poll(guint tag)
+static void io_remove_watch_poll(guint tag, GMainContext *context)
 {
 GSource *source;
 IOWatchPoll *iwp;
 
 g_return_if_fail(tag > 0);
 
-source = g_main_context_find_source_by_id(NULL, tag);
+source = g_main_context_find_source_by_id(context, tag);
 g_return_if_fail(source != NULL);
 
 iwp = io_watch_poll_from_source(source);
@@ -146,14 +146,19 @@ static void io_remove_watch_poll(guint tag)
 g_source_destroy(&iwp->parent);
 }
 
-void remove_fd_in_watch(Chardev *chr)
+void qemu_remove_fd_in_watch(Chardev *chr, GMainContext *context)
 {
 if (chr->fd_in_tag) {
-io_remove_watch_poll(chr->fd_in_tag);
+io_remove_watch_poll(chr->fd_in_tag, context);
 chr->fd_in_tag = 0;
 }
 }
 
+void remove_fd_in_watch(Chardev *chr)
+{
+qemu_remove_fd_in_watch(chr, NULL);
+}
+
 int io_channel_send_full(QIOChannel *ioc,
  const void *buf, size_t len,
  int *fds, size_t nfds)
diff --git a/chardev/char-io.h b/chardev/char-io.h
index d7ae5f1..117c888 100644
--- a/chardev/char-io.h
+++ b/chardev/char-io.h
@@ -38,6 +38,8 @@ guint io_add_watch_poll(Chardev *chr,
 
 void remove_fd_in_watch(Chardev *chr);
 
+void qemu_remove_fd_in_watch(Chardev *chr, GMainContext *context);
+
 int io_channel_send(QIOChannel *ioc, const void *buf, size_t len);
 
 int io_channel_send_full(QIOChannel *ioc, const void *buf, size_t len,
diff --git a/chardev/char.c b/chardev/char.c
index abd525f..5563375 100644
--- a/chardev/char.c
+++ b/chardev/char.c
@@ -560,7 +560,7 @@ void qemu_chr_fe_set_handlers(CharBackend *b,
 cc = CHARDEV_GET_CLASS(s);
 if (!opaque && !fd_can_read && !fd_read && !fd_event) {
 fe_open = 0;
-remove_fd_in_watch(s);
+qemu_remove_fd_in_watch(s, context);
 } else {
 fe_open = 1;
 }
-- 
1.8.3.1





[Qemu-devel] [PATCH v2 2/4] colo-compare: kick compare thread to exit after some cleanup in finalization

2017-02-15 Thread zhanghailiang
We should call g_main_loop_quit() to notify the colo compare thread to
exit, or it will run in g_main_loop_run() forever.

Besides, the finalizing process can't happen in the context of the colo
thread, so it is reasonable to remove the
'if (qemu_thread_is_self(&s->thread))' branch.

Before the compare thread exits, some cleanup work needs to be done: all
unhandled packets need to be released and connection_track_table needs to
be freed, or there will be a memory leak.

Signed-off-by: zhanghailiang 
---
 net/colo-compare.c | 39 +--
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index fdde788..37ce75c 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -83,6 +83,8 @@ typedef struct CompareState {
 GHashTable *connection_track_table;
 /* compare thread, a thread for each NIC */
 QemuThread thread;
+
+GMainLoop *compare_loop;
 } CompareState;
 
 typedef struct CompareClass {
@@ -496,7 +498,6 @@ static gboolean check_old_packet_regular(void *opaque)
 static void *colo_compare_thread(void *opaque)
 {
 GMainContext *worker_context;
-GMainLoop *compare_loop;
 CompareState *s = opaque;
 GSource *timeout_source;
 
@@ -507,7 +508,7 @@ static void *colo_compare_thread(void *opaque)
 qemu_chr_fe_set_handlers(&s->chr_sec_in, compare_chr_can_read,
  compare_sec_chr_in, NULL, s, worker_context, 
true);
 
-compare_loop = g_main_loop_new(worker_context, FALSE);
+s->compare_loop = g_main_loop_new(worker_context, FALSE);
 
 /* To kick any packets that the secondary doesn't match */
 timeout_source = g_timeout_source_new(REGULAR_PACKET_CHECK_MS);
@@ -515,10 +516,10 @@ static void *colo_compare_thread(void *opaque)
   (GSourceFunc)check_old_packet_regular, s, NULL);
 g_source_attach(timeout_source, worker_context);
 
-g_main_loop_run(compare_loop);
+g_main_loop_run(s->compare_loop);
 
 g_source_unref(timeout_source);
-g_main_loop_unref(compare_loop);
+g_main_loop_unref(s->compare_loop);
 g_main_context_unref(worker_context);
 return NULL;
 }
@@ -675,6 +676,23 @@ static void colo_compare_complete(UserCreatable *uc, Error 
**errp)
 return;
 }
 
+static void colo_flush_packets(void *opaque, void *user_data)
+{
+CompareState *s = user_data;
+Connection *conn = opaque;
+Packet *pkt = NULL;
+
+while (!g_queue_is_empty(&conn->primary_list)) {
+pkt = g_queue_pop_head(&conn->primary_list);
+compare_chr_send(&s->chr_out, pkt->data, pkt->size);
+packet_destroy(pkt, NULL);
+}
+while (!g_queue_is_empty(&conn->secondary_list)) {
+pkt = g_queue_pop_head(&conn->secondary_list);
+packet_destroy(pkt, NULL);
+}
+}
+
 static void colo_compare_class_init(ObjectClass *oc, void *data)
 {
 UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
@@ -703,14 +721,15 @@ static void colo_compare_finalize(Object *obj)
 qemu_chr_fe_deinit(&s->chr_sec_in);
 qemu_chr_fe_deinit(&s->chr_out);
 
-g_queue_free(&s->conn_list);
+g_main_loop_quit(s->compare_loop);
+qemu_thread_join(&s->thread);
 
-if (qemu_thread_is_self(&s->thread)) {
-/* compare connection */
-g_queue_foreach(&s->conn_list, colo_compare_connection, s);
-qemu_thread_join(&s->thread);
-}
+/* Release all unhandled packets after compare thread exited */
+g_queue_foreach(&s->conn_list, colo_flush_packets, s);
+
+g_queue_free(&s->conn_list);
 
+g_hash_table_destroy(s->connection_track_table);
 g_free(s->pri_indev);
 g_free(s->sec_indev);
 g_free(s->outdev);
-- 
1.8.3.1
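
A minimal, GLib-free sketch of the drain step that colo_flush_packets()
performs in the patch above: pending primary packets are forwarded and
then freed, pending secondary packets are only freed. Every Sketch*/sketch_*
name is a hypothetical stand-in for QEMU's Connection, Packet and GQueue,
and "forwarding" is reduced to summing packet sizes, so this illustrates
the ordering only and is not the QEMU implementation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for QEMU's Packet, GQueue and Connection. */
typedef struct SketchPacket {
    struct SketchPacket *next;
    int size;
} SketchPacket;

typedef struct {
    SketchPacket *head;
    SketchPacket *tail;
} SketchQueue;

typedef struct {
    SketchQueue primary;
    SketchQueue secondary;
} SketchConn;

/* Append a packet of the given size; returns 0 on success, -1 on OOM. */
int sketch_queue_push(SketchQueue *q, int size)
{
    SketchPacket *pkt = malloc(sizeof(*pkt));

    if (!pkt) {
        return -1;
    }
    pkt->next = NULL;
    pkt->size = size;
    if (q->tail) {
        q->tail->next = pkt;
    } else {
        q->head = pkt;
    }
    q->tail = pkt;
    return 0;
}

/* Detach and return the head packet, or NULL if the queue is empty. */
static SketchPacket *sketch_queue_pop(SketchQueue *q)
{
    SketchPacket *pkt = q->head;

    if (pkt) {
        q->head = pkt->next;
        if (!q->head) {
            q->tail = NULL;
        }
    }
    return pkt;
}

/*
 * Drain both queues of one connection. Primary packets are "sent"
 * (their sizes are summed and returned) before being freed; secondary
 * packets are only freed and counted in *dropped.
 */
int sketch_flush_conn(SketchConn *conn, int *dropped)
{
    SketchPacket *pkt;
    int forwarded = 0;

    while ((pkt = sketch_queue_pop(&conn->primary)) != NULL) {
        forwarded += pkt->size;
        free(pkt);
    }
    while ((pkt = sketch_queue_pop(&conn->secondary)) != NULL) {
        (*dropped)++;
        free(pkt);
    }
    return forwarded;
}
```

In the patch the equivalent callback runs only after qemu_thread_join(),
so nothing else touches the queues while they are drained.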





[Qemu-devel] [PATCH v2 4/4] colo-compare: Fix removing fds been watched incorrectly in finalization

2017-02-15 Thread zhanghailiang
We hit the error below when trying to delete the compare object
with a QMP command:
chardev/char-io.c:91: io_watch_poll_finalize: Assertion `iwp->src == ((void *)0)' failed.

This happens because the right watched fd is not removed when
qemu_chr_fe_set_handlers() is called.

Fix it by passing the worker_context parameter to qemu_chr_fe_set_handlers().

Signed-off-by: zhanghailiang 
---
 net/colo-compare.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 37ce75c..a6fc2ff 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -84,6 +84,7 @@ typedef struct CompareState {
 /* compare thread, a thread for each NIC */
 QemuThread thread;
 
+GMainContext *worker_context;
 GMainLoop *compare_loop;
 } CompareState;
 
@@ -497,30 +498,29 @@ static gboolean check_old_packet_regular(void *opaque)
 
 static void *colo_compare_thread(void *opaque)
 {
-GMainContext *worker_context;
 CompareState *s = opaque;
 GSource *timeout_source;
 
-worker_context = g_main_context_new();
+s->worker_context = g_main_context_new();
 
 qemu_chr_fe_set_handlers(&s->chr_pri_in, compare_chr_can_read,
- compare_pri_chr_in, NULL, s, worker_context, 
true);
+  compare_pri_chr_in, NULL, s, s->worker_context, 
true);
 qemu_chr_fe_set_handlers(&s->chr_sec_in, compare_chr_can_read,
- compare_sec_chr_in, NULL, s, worker_context, 
true);
+  compare_sec_chr_in, NULL, s, s->worker_context, 
true);
 
-s->compare_loop = g_main_loop_new(worker_context, FALSE);
+s->compare_loop = g_main_loop_new(s->worker_context, FALSE);
 
 /* To kick any packets that the secondary doesn't match */
 timeout_source = g_timeout_source_new(REGULAR_PACKET_CHECK_MS);
 g_source_set_callback(timeout_source,
   (GSourceFunc)check_old_packet_regular, s, NULL);
-g_source_attach(timeout_source, worker_context);
+g_source_attach(timeout_source, s->worker_context);
 
 g_main_loop_run(s->compare_loop);
 
 g_source_unref(timeout_source);
 g_main_loop_unref(s->compare_loop);
-g_main_context_unref(worker_context);
+g_main_context_unref(s->worker_context);
 return NULL;
 }
 
@@ -717,8 +717,10 @@ static void colo_compare_finalize(Object *obj)
 {
 CompareState *s = COLO_COMPARE(obj);
 
-qemu_chr_fe_deinit(&s->chr_pri_in);
-qemu_chr_fe_deinit(&s->chr_sec_in);
+qemu_chr_fe_set_handlers(&s->chr_pri_in, NULL, NULL, NULL, NULL,
+ s->worker_context, true);
+qemu_chr_fe_set_handlers(&s->chr_sec_in, NULL, NULL, NULL, NULL,
+ s->worker_context, true);
 qemu_chr_fe_deinit(&s->chr_out);
 
 g_main_loop_quit(s->compare_loop);
-- 
1.8.3.1





[Qemu-devel] [PATCH v2 0/4] colo-compare: fix some bugs

2017-02-15 Thread zhanghailiang
This series has two parts: code optimization and bug fixes.
Patch 1 moves the timer processing into the colo compare thread as
a new coroutine.
Patches 2-4 fix some bugs in colo compare.

v2:
 - Squash patch 3 of last version into patch 2. (ZhangChen's suggestion)

zhanghailiang (4):
  colo-compare: use g_timeout_source_new() to process the stale packets
  colo-compare: kick compare thread to exit after some cleanup in
finalization
  char: remove the right fd been watched in qemu_chr_fe_set_handlers()
  colo-compare: Fix removing fds been watched incorrectly in
finalization

 chardev/char-io.c  |  13 --
 chardev/char-io.h  |   2 +
 chardev/char.c |   2 +-
 net/colo-compare.c | 115 +++--
 4 files changed, 71 insertions(+), 61 deletions(-)

-- 
1.8.3.1





Re: [Qemu-devel] [PULL 08/41] intel_iommu: support device iotlb descriptor

2017-02-15 Thread Jason Wang



On 2017年02月16日 13:43, Jason Wang wrote:



On 2017年02月16日 13:36, Liu, Yi L wrote:

-Original Message-
From: Qemu-devel 
[mailto:qemu-devel-bounces+yi.l.liu=intel@nongnu.org]

On Behalf Of Michael S. Tsirkin
Sent: Tuesday, January 10, 2017 1:40 PM
To: qemu-devel@nongnu.org
Cc: Peter Maydell ; Eduardo Habkost
; Jason Wang ; Peter Xu
; Paolo Bonzini ; Richard
Henderson 
Subject: [Qemu-devel] [PULL 08/41] intel_iommu: support device iotlb
descriptor

From: Jason Wang 

This patch enables device IOTLB support for intel iommu. The major work is to
implement QI device IOTLB descriptor processing and notify the device through
iommu notifier.


Hi Jason/Michael,

Recently Peter Xu's patch also touched intel-iommu emulation. His patch shadows
second-level page table by capturing iotlb flush from guest. It would result in
page table updating in host. Does this patch also use the same map/umap API
provided by VFIO?


Yes, it depends on the iommu notifier too.

If it is, then I think it would also update page table in host. It looks to be
a duplicate update. Pls refer to the following snapshot captured from section
6.5.2.5 of vtd spec.

"Since translation requests from a device may be serviced by hardware from the
IOTLB, software must always request IOTLB invalidation (iotlb_inv_dsc) before
requesting corresponding Device-TLB (dev_tlb_inv_dsc) invalidation."

Maybe for device-iotlb, we need a separate API which just pass down the
invalidate info without updating page table. Any thoughts?


cc Alex.

If we want ATS to be visible for guest (but I'm not sure if VFIO support
this), we probably need another notifier or a new flag.


Thanks 


Or need a dedicated address_space if ATS were enabled for the device.



Re: [Qemu-devel] [PULL 08/41] intel_iommu: support device iotlb descriptor

2017-02-15 Thread Jason Wang



On 2017年02月16日 13:36, Liu, Yi L wrote:

-Original Message-
From: Qemu-devel [mailto:qemu-devel-bounces+yi.l.liu=intel@nongnu.org]
On Behalf Of Michael S. Tsirkin
Sent: Tuesday, January 10, 2017 1:40 PM
To: qemu-devel@nongnu.org
Cc: Peter Maydell ; Eduardo Habkost
; Jason Wang ; Peter Xu
; Paolo Bonzini ; Richard
Henderson 
Subject: [Qemu-devel] [PULL 08/41] intel_iommu: support device iotlb
descriptor

From: Jason Wang 

This patch enables device IOTLB support for intel iommu. The major work is to
implement QI device IOTLB descriptor processing and notify the device through
iommu notifier.


Hi Jason/Michael,

Recently Peter Xu's patch also touched intel-iommu emulation. His patch shadows
second-level page table by capturing iotlb flush from guest. It would result in
page table updating in host. Does this patch also use the same map/umap API
provided by VFIO?


Yes, it depends on the iommu notifier too.


If it is, then I think it would also update page table in host. It looks to be
a duplicate update. Pls refer to the following snapshot captured from section
6.5.2.5 of vtd spec.

"Since translation requests from a device may be serviced by hardware from the
IOTLB, software must always request IOTLB invalidation (iotlb_inv_dsc) before
requesting corresponding Device-TLB (dev_tlb_inv_dsc) invalidation."

Maybe for device-iotlb, we need a separate API which just pass down the
invalidate info without updating page table. Any thoughts?


cc Alex.

If we want ATS to be visible for guest (but I'm not sure if VFIO support 
this), we probably need another notifier or a new flag.


Thanks



Thanks,
Yi L

Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Peter Xu 
---
  hw/i386/intel_iommu_internal.h | 13 ++-
  include/hw/i386/x86-iommu.h|  1 +
  hw/i386/intel_iommu.c  | 83
+++---
  hw/i386/x86-iommu.c| 17 +
  4 files changed, 107 insertions(+), 7 deletions(-)

diff --git a/hw/i386/intel_iommu_internal.h
b/hw/i386/intel_iommu_internal.h index 11abfa2..356f188 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -183,6 +183,7 @@
  /* (offset >> 4) << 8 */
  #define VTD_ECAP_IRO(DMAR_IOTLB_REG_OFFSET << 4)
  #define VTD_ECAP_QI (1ULL << 1)
+#define VTD_ECAP_DT (1ULL << 2)
  /* Interrupt Remapping support */
  #define VTD_ECAP_IR (1ULL << 3)
  #define VTD_ECAP_EIM(1ULL << 4)
@@ -326,6 +327,7 @@ typedef union VTDInvDesc VTDInvDesc;
  #define VTD_INV_DESC_TYPE   0xf
  #define VTD_INV_DESC_CC 0x1 /* Context-cache Invalidate Desc 
*/
  #define VTD_INV_DESC_IOTLB  0x2
+#define VTD_INV_DESC_DEVICE 0x3
  #define VTD_INV_DESC_IEC0x4 /* Interrupt Entry Cache
 Invalidate Descriptor */
  #define VTD_INV_DESC_WAIT   0x5 /* Invalidation Wait Descriptor */
@@ -361,6 +363,13 @@ typedef union VTDInvDesc VTDInvDesc;
  #define VTD_INV_DESC_IOTLB_RSVD_LO  0xff00ULL
  #define VTD_INV_DESC_IOTLB_RSVD_HI  0xf80ULL

+/* Mask for Device IOTLB Invalidate Descriptor */
+#define VTD_INV_DESC_DEVICE_IOTLB_ADDR(val) ((val) & 0xf000ULL)
+#define VTD_INV_DESC_DEVICE_IOTLB_SIZE(val) ((val) & 0x1)
+#define VTD_INV_DESC_DEVICE_IOTLB_SID(val) (((val) >> 32) & 0xULL)
+#define VTD_INV_DESC_DEVICE_IOTLB_RSVD_HI 0xffeULL
+#define VTD_INV_DESC_DEVICE_IOTLB_RSVD_LO 0xffe0fff8
+
  /* Information about page-selective IOTLB invalidate */
  struct VTDIOTLBPageInvInfo {
  uint16_t domain_id;
@@ -399,8 +408,8 @@ typedef struct VTDRootEntry VTDRootEntry;
  #define VTD_CONTEXT_ENTRY_FPD   (1ULL << 1) /* Fault Processing Disable
*/
  #define VTD_CONTEXT_ENTRY_TT(3ULL << 2) /* Translation Type */
  #define VTD_CONTEXT_TT_MULTI_LEVEL  0
-#define VTD_CONTEXT_TT_DEV_IOTLB1
-#define VTD_CONTEXT_TT_PASS_THROUGH 2
+#define VTD_CONTEXT_TT_DEV_IOTLB(1ULL << 2)
+#define VTD_CONTEXT_TT_PASS_THROUGH (2ULL << 2)
  /* Second Level Page Translation Pointer*/
  #define VTD_CONTEXT_ENTRY_SLPTPTR   (~0xfffULL)
  #define VTD_CONTEXT_ENTRY_RSVD_LO   (0xff0ULL | ~VTD_HAW_MASK)
diff --git a/include/hw/i386/x86-iommu.h b/include/hw/i386/x86-iommu.h
index 0c89d98..361c07c 100644
--- a/include/hw/i386/x86-iommu.h
+++ b/include/hw/i386/x86-iommu.h
@@ -73,6 +73,7 @@ typedef struct IEC_Notifier IEC_Notifier;  struct
X86IOMMUState {
  SysBusDevice busdev;
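
A small sketch of what the VTD_CONTEXT_TT_* change in the quoted hunk
appears to buy: with the constants pre-shifted to the position of the
VTD_CONTEXT_ENTRY_TT mask (3ULL << 2), a masked context-entry word can be
compared against them directly. The macro values are taken from the diff
above; the helper name sketch_context_tt() and the sample entry words are
assumptions made for illustration.

```c
#include <stdint.h>

/* Values as they appear in the quoted diff. */
#define VTD_CONTEXT_ENTRY_TT        (3ULL << 2) /* Translation Type mask */
#define VTD_CONTEXT_TT_MULTI_LEVEL  0
#define VTD_CONTEXT_TT_DEV_IOTLB    (1ULL << 2)
#define VTD_CONTEXT_TT_PASS_THROUGH (2ULL << 2)

/*
 * Hypothetical helper: isolate the Translation Type field of a
 * context-entry word without shifting it down to bit 0.
 */
uint64_t sketch_context_tt(uint64_t entry_word)
{
    return entry_word & VTD_CONTEXT_ENTRY_TT;
}
```

With the old definitions (1 and 2), the masked value 0x4 or 0x8 would
never compare equal to the constant unless the caller shifted it first.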
   

Re: [Qemu-devel] [PATCH 3/5] colo-compare: release all unhandled packets in finalize function

2017-02-15 Thread Hailiang Zhang


On 2017/2/16 10:27, Zhang Chen wrote:



On 02/15/2017 04:34 PM, zhanghailiang wrote:

We should release all unhandled packets before finalize colo compare.
Besides, we need to free connection_track_table, or there will be
a memory leak bug.

Signed-off-by: zhanghailiang
---
   net/colo-compare.c | 20 
   1 file changed, 20 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index a16e2d5..809bad3 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -676,6 +676,23 @@ static void colo_compare_complete(UserCreatable *uc, Error 
**errp)
   return;
   }



In my patch "colo-compare and filter-rewriter work with colo-frame", this
function is named 'colo_flush_connection'. I think 'flush' is a better
name than 'release'.



OK, I will fix it in the next version, thanks.


Thanks
Zhang Chen



+static void colo_release_packets(void *opaque, void *user_data)
+{
+CompareState *s = user_data;
+Connection *conn = opaque;
+Packet *pkt = NULL;
+
+while (!g_queue_is_empty(&conn->primary_list)) {
+pkt = g_queue_pop_head(&conn->primary_list);
+compare_chr_send(&s->chr_out, pkt->data, pkt->size);
+packet_destroy(pkt, NULL);
+}
+while (!g_queue_is_empty(&conn->secondary_list)) {
+pkt = g_queue_pop_head(&conn->secondary_list);
+packet_destroy(pkt, NULL);
+}
+}
+
   static void colo_compare_class_init(ObjectClass *oc, void *data)
   {
   UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
@@ -707,9 +724,12 @@ static void colo_compare_finalize(Object *obj)
   g_main_loop_quit(s->compare_loop);
   qemu_thread_join(&s->thread);

+/* Release all unhandled packets after compare thread exited */
+g_queue_foreach(&s->conn_list, colo_release_packets, s);

   g_queue_free(&s->conn_list);

+g_hash_table_destroy(s->connection_track_table);
   g_free(s->pri_indev);
   g_free(s->sec_indev);
   g_free(s->outdev);







Re: [Qemu-devel] [PATCH 5/6] target-ppc: support for 32-bit carry and overflow

2017-02-15 Thread Nikunj A Dadhania
Richard Henderson  writes:

> On 02/14/2017 02:05 PM, Nikunj A Dadhania wrote:
>> Yes, you are right. I had a discussion with Paul Mackerras yesterday, he
>> explained to me in detail about the bits. I am working on the revised
>> implementation. Will detail it in the commit message.
>
> As you're working on this, consider changing the definition of cpu_ov such 
> that 
> the MSB is OV and bit 31 is OV32.
>
> E.g.
>
>
>   static inline void gen_op_arith_compute_ov(DisasContext *ctx, TCGv arg0,
>  TCGv arg1, TCGv arg2, int sub)
>   {
>   TCGv t0 = tcg_temp_new();
>
>   tcg_gen_xor_tl(cpu_ov, arg0, arg2);
>   tcg_gen_xor_tl(t0, arg1, arg2);
>   if (sub) {
>   tcg_gen_and_tl(cpu_ov, cpu_ov, t0);
>   } else {
>   tcg_gen_andc_tl(cpu_ov, cpu_ov, t0);
>   }
>   tcg_temp_free(t0);
>   if (NARROW_MODE(ctx)) {
>   tcg_gen_ext32s_tl(cpu_ov, cpu_ov);
>   }
> -tcg_gen_shri_tl(cpu_ov, cpu_ov, TARGET_LONG_BITS - 1);
>   tcg_gen_or_tl(cpu_so, cpu_so, cpu_ov);
>   }
>
>
> is all that is required for arithmetic to compute OV and OV32 into those two 
> bits.

How about the below?

@@ -809,10 +809,11 @@ static inline void gen_op_arith_compute_ov(DisasContext 
*ctx, TCGv arg0,
 tcg_gen_andc_tl(cpu_ov, cpu_ov, t0);
 }
 tcg_temp_free(t0);
+tcg_gen_extract_tl(cpu_ov32, cpu_ov, 31, 1);
+tcg_gen_extract_tl(cpu_ov, cpu_ov, 63, 1);
 if (NARROW_MODE(ctx)) {
-tcg_gen_ext32s_tl(cpu_ov, cpu_ov);
+tcg_gen_mov_tl(cpu_ov, cpu_ov32);
 }
-tcg_gen_shri_tl(cpu_ov, cpu_ov, TARGET_LONG_BITS - 1);
 tcg_gen_or_tl(cpu_so, cpu_so, cpu_ov);
 }

Regards
Nikunj
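
For readers following the bit arithmetic, here is a plain-C sketch of the
addition case discussed above, assuming a 64-bit target. The expression
(res ^ b) & ~(a ^ b) mirrors the xor/andc sequence in the quoted TCG code;
bit 63 of it is OV and bit 31 is OV32. The type SketchOvFlags, the function
sketch_add_overflow() and the narrow_mode parameter are hypothetical names,
not QEMU API.

```c
#include <stdint.h>

typedef struct {
    unsigned ov;   /* 64-bit signed overflow */
    unsigned ov32; /* 32-bit signed overflow of the low words */
} SketchOvFlags;

/*
 * Overflow flags for res = a + b. A signed overflow happened at a given
 * width when both operands have the same sign bit and the result's sign
 * bit differs from it.
 */
SketchOvFlags sketch_add_overflow(uint64_t a, uint64_t b, int narrow_mode)
{
    uint64_t res = a + b;
    uint64_t v = (res ^ b) & ~(a ^ b);
    SketchOvFlags flags;

    flags.ov32 = (unsigned)((v >> 31) & 1);
    flags.ov = (unsigned)((v >> 63) & 1);
    if (narrow_mode) {
        /* In 32-bit mode OV simply mirrors OV32, as in the proposal. */
        flags.ov = flags.ov32;
    }
    return flags;
}
```

For example, 0x7fffffff + 1 sets only OV32 in 64-bit mode, while
0x7fffffffffffffff + 1 sets only OV.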




Re: [Qemu-devel] [PATCH V7 2/2] Add a new qmp command to do checkpoint, query xen replication status

2017-02-15 Thread Jason Wang



On 2017年02月16日 11:25, Zhang Chen wrote:

Ping...

No news for a long time.

Who can pick up this patch?



I believe you'd better cc the migration maintainers (cced). Have you tried
scripts/get_maintainer.pl?


Thanks



Thanks

Zhang Chen


On 02/14/2017 04:28 AM, Stefano Stabellini wrote:

On Wed, 8 Feb 2017, Eric Blake wrote:

On 02/07/2017 11:24 PM, Zhang Chen wrote:

We can call this qmp command to do checkpoint outside of qemu.
Xen colo will need this function.

Signed-off-by: Zhang Chen 
Signed-off-by: Wen Congyang 
---
  migration/colo.c | 17 
  qapi-schema.json | 60 


  2 files changed, 77 insertions(+)


Reviewed-by: Eric Blake 

Given that the series is all acked, are you going to take care of the
pull request?


.








Re: [Qemu-devel] [RFC] virtio-pci: Allow PCIe virtio devices on root bus

2017-02-15 Thread David Gibson
On Thu, Feb 16, 2017 at 01:48:42PM +1100, David Gibson wrote:
> On Wed, Feb 15, 2017 at 04:59:33PM +0200, Marcel Apfelbaum wrote:
> > On 02/15/2017 03:45 AM, David Gibson wrote:
> > > On Tue, Feb 14, 2017 at 02:53:08PM +0200, Marcel Apfelbaum wrote:
> > > > On 02/14/2017 06:15 AM, David Gibson wrote:
> > > > > On Mon, Feb 13, 2017 at 12:14:23PM +0200, Marcel Apfelbaum wrote:
> > > > > > On 02/13/2017 06:33 AM, David Gibson wrote:
> > > > > > > On Sun, Feb 12, 2017 at 09:05:46PM +0200, Marcel Apfelbaum wrote:
> > > > > > > > On 02/10/2017 02:37 AM, David Gibson wrote:
> > > > > > > > > On Thu, Feb 09, 2017 at 10:04:47AM +0100, Laszlo Ersek wrote:
> > > > > > > > > > On 02/09/17 05:16, David Gibson wrote:
> > > > > > > > > > > On Wed, Feb 08, 2017 at 11:40:50AM +0100, Laszlo Ersek 
> > > > > > > > > > > wrote:
> > > > > > > > > > > > On 02/08/17 07:16, David Gibson wrote:
[snip]
> > > >   Which means that you can use it to
> > > > > drive PCIe devices just fine.  "Bus level" PCIe extensions like AER
> > > > > and PCIe standard hotplug won't work, but PAPR has its own mechanisms
> > > > > for those (common between PCI and PCIe).
> > > > > 
> > > > > I did float the idea of having the pseries PCI bus remain plain PCI
> > > > > but with a special flag to allow PCIe devices to be attached to it
> > > > > anyway.  It wasn't greeted with much enthusiasm..
> > > > > 
> > > > 
> > > > Can you point me to the discussion please? It seems similar to what I 
> > > > proposed above.
> > > 
> > > Sorry, I was misleading.  I think I just raised that idea with Andrea
> > > and a few other people internally, not on one of the lists at large.
> > > 
> > > > As you properly described, it is much closer to PCI than PCIe; even the
> > > > only characteristic
> > > > that makes it "a little" PCIe, the Extended Configuration Space support,
> > > > is done with an alternative interface.
> > > > 
> > > > I agree the PAPR bus is not PCIe.
> > > 
> > > Ok, so if we take that direction, the question becomes how do we let
> > > PCIe devices plug into this mostly-not-PCIe bus.  Maybe introduce a
> > > "pci_bus_accepts_express()" function that will replace many, but not
> > > all current uses of "pci_bus_is_express()"?
> > > 
> > 
> > Sounds good and I think Eduardo is already working on exactly this
> > idea, however he is on PTO now. It is better to synchronize with him.
> 
> Ah, right.  Do you know when he'll be back?  This is semi-urgent for
> Power.
> 
> 
> > > Such a helper could maybe simplify the logic in virtio-pci (and XHCI?)
> > > by returning false on an x86 root bus.
> > > 
> > 
> > The rule would be more complicated. We don't want to completely remove the
> > possibility to have PCIe devices as part of Root Complex (it seems
> > like I am contradicting myself, but no).
> > This is why we have guidelines and not hard-coded policies.
> > Also, the QEMU way is to be more permissive. We provide guidelines and sane
> > defaults, but we let the user choose.
> > 
> > Getting back to our problem, the rule would be:
> > hybrid devices should be PCI or PCIe for a bus?
> > PAPR bus should return 'PCIe' for hybrid devices.
> > X86 bus should return 'PCIe' if not root.
> 
> Ok.

Wait, actually.. we have two possible directions to go, both of which
have been mentioned in the thread, but I don't think we've settled on
one:

1) Have pseries create a PCIe bus (as my first cut draft does).

That should allow pure PCIe devices to appear either under a port or
(more usually for PAPR) as "integrated endpoints".  In addition we'd
need as suggested above a "pcie_hybrid_type()" function that would
tell hybrid devices to also appear as PCIe rather than PCI.

2) Have pseries create a vanilla PCI bus (or a special PAPR PCI
   variant)

Appearing as vanilla PCI would in a number of ways more closely match
the way PCI buses are handled on PAPR.  However, we still need to
connect PCIe devices to it.  So we'd need some 'bus_accepts_pcie()'
hook and use that (in place of pci_bus_is_express()) to determine both
whether we can attach pure PCIe devices and that hybrid devices should
appear as PCIe rather than plain PCI.


Based on the immediately preceding discussion, I was leaning towards
(2).  Is that your feeling as well?
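
To make option (2) concrete, a rough sketch under loud assumptions: the
bus stays plain PCI but carries an extra "accepts PCIe" property that
replaces pci_bus_is_express() for the two decisions named above. The types
SketchBus and SketchDev and both functions are hypothetical; nothing here
exists in QEMU as written.

```c
#include <stdbool.h>

/* Hypothetical bus description. */
typedef struct {
    bool is_express;   /* what pci_bus_is_express() would report */
    bool accepts_pcie; /* result of the proposed bus_accepts_pcie() hook */
} SketchBus;

/* Hypothetical device description. */
typedef struct {
    bool pure_pcie; /* device cannot operate as plain PCI */
} SketchDev;

/* A pure PCIe device may only be plugged where the bus accepts PCIe. */
bool sketch_can_attach(const SketchBus *bus, const SketchDev *dev)
{
    return !dev->pure_pcie || bus->accepts_pcie;
}

/* A hybrid device presents itself as PCIe exactly when the bus accepts it. */
bool sketch_hybrid_appears_as_pcie(const SketchBus *bus)
{
    return bus->accepts_pcie;
}
```

Under this sketch a PAPR bus would be { is_express = false, accepts_pcie =
true }, while a legacy PCI bus would have both fields false.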

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH V7 2/2] Add a new qmp command to do checkpoint, query xen replication status

2017-02-15 Thread Zhang Chen

Ping...

No news for a long time.

Who can pick up this patch?


Thanks

Zhang Chen


On 02/14/2017 04:28 AM, Stefano Stabellini wrote:

On Wed, 8 Feb 2017, Eric Blake wrote:

On 02/07/2017 11:24 PM, Zhang Chen wrote:

We can call this qmp command to do checkpoint outside of qemu.
Xen colo will need this function.

Signed-off-by: Zhang Chen 
Signed-off-by: Wen Congyang 
---
  migration/colo.c | 17 
  qapi-schema.json | 60 
  2 files changed, 77 insertions(+)


Reviewed-by: Eric Blake 

Given that the series is all acked, are you going to take care of the
pull request?


.



--
Thanks
Zhang Chen






Re: [Qemu-devel] [PATCH v7 2/2] block/vxhs.c: Add qemu-iotests for new block device type "vxhs"

2017-02-15 Thread ashish mittal
Sorry, pressed the "send" button instead of "expand text" on the
previous email ...

On Mon, Feb 13, 2017 at 6:43 AM, Stefan Hajnoczi  wrote:
> On Tue, Feb 07, 2017 at 08:18:14PM -0800, Ashish Mittal wrote:
>> diff --git a/tests/qemu-iotests/common.config 
>> b/tests/qemu-iotests/common.config
>> index f6384fb..c7a80c0 100644
>> --- a/tests/qemu-iotests/common.config
>> +++ b/tests/qemu-iotests/common.config
>> @@ -105,6 +105,10 @@ if [ -z "$QEMU_NBD_PROG" ]; then
>>  export QEMU_NBD_PROG="`set_prog_path qemu-nbd`"
>>  fi
>>
>> +if [ -z "$QEMU_VXHS_PROG" ]; then
>> +export QEMU_VXHS_PROG="`set_prog_path qnio_server /usr/local/bin`"
>
> Did you test this with /usr/local/bin/qnio_server?
>
> I think it will evaluate to QEMU_VXHS_PROG=/usr/local/bin when qnio_server
> isn't found in PATH.  You probably wanted /usr/local/bin/qnio_server instead.
>
> I suggest dropping the second argument completely and letting the user set 
> PATH
> themselves.  No existing set_prog_path caller uses the second argument.
>

You're right. Will drop the second argument.

> # $1 = prog to look for, $2* = default pathnames if not found in $PATH
> set_prog_path()
> {
> p=`command -v $1 2> /dev/null`
> if [ -n "$p" -a -x "$p" ]; then
> echo $p
> return 0
> fi
> p=$1
>
> shift
> for f; do
> if [ -x $f ]; then
> echo $f
> return 0
> fi
> done
>
> echo ""
> return 1
> }
>
>> diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
>> index 3213765..06a3164 100644
>> --- a/tests/qemu-iotests/common.rc
>> +++ b/tests/qemu-iotests/common.rc
>> @@ -89,6 +89,9 @@ else
>>  TEST_IMG=$TEST_DIR/t.$IMGFMT
>>  elif [ "$IMGPROTO" = "archipelago" ]; then
>>  TEST_IMG="archipelago:at.$IMGFMT"
>> +elif [ "$IMGPROTO" = "vxhs" ]; then
>> +TEST_IMG_FILE=$TEST_DIR/t.$IMGFMT
>> +TEST_IMG="vxhs://127.0.0.1:/t.$IMGFMT"
>>  else
>>  TEST_IMG=$IMGPROTO:$TEST_DIR/t.$IMGFMT
>>  fi
>> @@ -175,6 +178,12 @@ _make_test_img()
>>  eval "$QEMU_NBD -v -t -b 127.0.0.1 -p 10810 -f $IMGFMT  
>> $TEST_IMG_FILE &"
>>  sleep 1 # FIXME: qemu-nbd needs to be listening before we continue
>>  fi
>> +
>> +# Start QNIO server on image directory for vxhs protocol
>> +if [ $IMGPROTO = "vxhs" ]; then
>> +eval "$QEMU_VXHS -d  $TEST_DIR &"
>> +sleep 1 # Wait for server to come up.
>
> This is a pre-existing problem and you don't need to fix it now:
>
> We should replace sleep 1 with a function that probes the TCP port until the
> connection can be established or a timeout is reached.  The netcat (nc) 
> utility
> is often used for this.
>
> sleep 1 is not reliable and may fail on a heavily loaded machine like the
> Travis-CI build machines that are used.

Will skip this one for now.

Thanks,
Ashish
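
The "probe the TCP port until it accepts a connection or a timeout
expires" idea from the review above can be sketched as follows. Stefan
suggested netcat for this; the C function below is only an illustration of
the same retry loop using POSIX sockets. The name wait_for_tcp_port(), the
50 ms retry interval and the IPv4-only address handling are assumptions,
and it is not part of qemu-iotests.

```c
#define _POSIX_C_SOURCE 200809L

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock in milliseconds. */
static int64_t sketch_now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Try to connect to ipv4:port until it succeeds or timeout_ms elapses.
 * Returns 1 once a connection was established, 0 otherwise.
 */
int wait_for_tcp_port(const char *ipv4, uint16_t port, int timeout_ms)
{
    struct sockaddr_in addr;
    int64_t deadline;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4, &addr.sin_addr) != 1) {
        return 0; /* not a valid dotted-quad address */
    }

    deadline = sketch_now_ms() + timeout_ms;
    do {
        struct timespec pause = { 0, 50 * 1000000L }; /* 50 ms */
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        int connected;

        if (fd < 0) {
            return 0;
        }
        connected = connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0;
        close(fd);
        if (connected) {
            return 1;
        }
        nanosleep(&pause, NULL);
    } while (sketch_now_ms() < deadline);

    return 0;
}
```

A test harness would call this right after starting the server in the
background and fail the test if it returns 0, instead of relying on a fixed
sleep.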



Re: [Qemu-devel] [PATCH] pci/pcie: don't assume cap id 0 is reserved

2017-02-15 Thread Peter Xu
On Wed, Feb 15, 2017 at 07:52:35PM -0700, Alex Williamson wrote:
> On Thu, 16 Feb 2017 10:35:28 +0800
> Peter Xu  wrote:
> 
> > On Wed, Feb 15, 2017 at 10:49:47PM +0200, Michael S. Tsirkin wrote:
> > > VFIO actually wants to create a capability with ID == 0.
> > > This is done to make guest drivers skip the given capability.
> > > pcie_add_capability then trips up on this capability
> > > when looking for end of capability list.
> > > 
> > > To support this use-case, it's easy enough to switch to
> > > e.g. 0x for these comparisons - we can be sure
> > > it will never match a 16-bit capability ID.
> > > 
> > > Signed-off-by: Michael S. Tsirkin   
> > 
> > Reviewed-by: Peter Xu 
> > 
> > Two nits:
> > 
> > (1) maybe we can s/0x/0x/ in the whole patch since ecap_id
> > is 16 bits
> 
> The former is used because it's beyond the address space of a valid
> capability.  Using 0x just makes the situation different, not
> better.

But isn't pcie_find_capability_list() defining cap_id parameter as
uint16_t? In that case, 0x will be the same as 0x since
we'll just take the lower 16 bits?

> 
> > 
> > (2) maybe we can add one more sentence in the comment below showing
> > where the 0x thing comes from (it comes from PCIe spec 7.9.2)
> 
> The capability in hardware is 16bits, thus a value that exceeds 16 bits
> can never match a valid ID.  It has nothing to do with 7.9.2.  Thanks,
> 
> Alex
> 
> > > ---
> > >  hw/pci/pcie.c | 11 +++
> > >  1 file changed, 7 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> > > index cbd4bb4..f4dd177 100644
> > > --- a/hw/pci/pcie.c
> > > +++ b/hw/pci/pcie.c
> > > @@ -610,7 +610,8 @@ bool pcie_cap_is_arifwd_enabled(const PCIDevice *dev)
> > >   * uint16_t ext_cap_size
> > >   */
> > >  
> > > -static uint16_t pcie_find_capability_list(PCIDevice *dev, uint16_t 
> > > cap_id,
> > > +/* Passing a cap_id value > 0x will return 0 and put end of list in 
> > > prev */
> > > +static uint16_t pcie_find_capability_list(PCIDevice *dev, uint32_t 
> > > cap_id,
> > >uint16_t *prev_p)
> > >  {
> > >  uint16_t prev = 0;
> > > @@ -679,9 +680,11 @@ void pcie_add_capability(PCIDevice *dev,
> > >  } else {
> > >  uint16_t prev;
> > >  
> > > -/* 0 is reserved cap id. use internally to find the last 
> > > capability
> > > -   in the linked list */
> > > -next = pcie_find_capability_list(dev, 0, );
> > > +/*
> > > + * 0x is not a valid cap id (it's a 16 bit field). use
> > > + * internally to find the last capability in the linked list.
> > > + */
> > > +next = pcie_find_capability_list(dev, 0x, );
> > >  
> > >  assert(prev >= PCI_CONFIG_SPACE_SIZE);
> > >  assert(next == 0);
> > > -- 
> > > MST  
> > 
> > -- peterx
> 

-- peterx
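
The truncation question can be checked in isolation. The sketch below
walks a list of 16-bit capability IDs and looks for a wanted ID passed
either as uint32_t or as uint16_t. The constants in this archive copy are
elided ("0x"), so SKETCH_NO_SUCH_CAP = 0x10000 is purely an illustrative
value that does not fit in 16 bits, and both function names are
hypothetical. Note that the quoted patch already widens cap_id to uint32_t.

```c
#include <stdint.h>

/* Illustrative sentinel only: any value above 0xFFFF would do. */
#define SKETCH_NO_SUCH_CAP 0x10000u

/*
 * Return the index of the first entry whose ID equals 'wanted', or -1.
 * *last receives the index of the last entry visited before the match
 * (or the final entry when nothing matches), -1 if there is none.
 */
int sketch_find_cap(const uint16_t *ids, int count, uint32_t wanted,
                    int *last)
{
    int i;

    *last = -1;
    for (i = 0; i < count; i++) {
        if (ids[i] == wanted) {
            return i;
        }
        *last = i;
    }
    return -1;
}

/* Same lookup, but with the 16-bit parameter type used before the patch. */
int sketch_find_cap_narrow(const uint16_t *ids, int count, uint16_t wanted,
                           int *last)
{
    return sketch_find_cap(ids, count, wanted, last);
}
```

With a 32-bit parameter the sentinel never matches and the walk ends at
the last entry. Narrowed to 16 bits the same value becomes 0 and would
match a capability whose ID is 0, which is exactly the VFIO case.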



Re: [Qemu-devel] [PATCH] hw/ppc/spapr: Check for valid page size when hot plugging memory

2017-02-15 Thread David Gibson
On Wed, Feb 15, 2017 at 10:21:44AM +0100, Thomas Huth wrote:
> On POWER, the valid page sizes that the guest can use are bound
> to the CPU and not to the memory region. QEMU already has some
> fancy logic to find out the right maximum memory size to tell
> it to the guest during boot (see getrampagesize() in the file
> target/ppc/kvm.c for more information).
> However, once we're booted and the guest is using huge pages
> already, it is currently still possible to hot-plug memory regions
> that do not support huge pages - which of course does not work
> on POWER, since the guest thinks that it is possible to use huge
> pages everywhere. The KVM_RUN ioctl will then abort with -EFAULT,
> QEMU spills out a not very helpful error message together with
> a register dump and the user is annoyed that the VM unexpectedly
> died.
> To avoid this situation, we should check the page size of hot-plugged
> DIMMs to see whether it is possible to use it in the current VM.
> If it does not fit, we can print out a better error message and
> refuse to add it, so that the VM does not die unexpectedly and the
> user has a second chance to plug a DIMM with a matching memory
> backend instead.
> 
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1419466
> Signed-off-by: Thomas Huth 

Using the global is a bit yucky, but I can't see an easy way to remove
it, and it's not like there aren't already some ugly globals in the
KVM code.  In the meantime this fixes a real bug, so I've merged this
to ppc-for-2.9.

Thanks.
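
As a reading aid, the check described in the commit message boils down to
a single comparison. The quoted patch is cut off in this archive copy
before the comparison itself, so the direction used below is an assumption
inferred from the description: the guest may already be using pages as
large as the size recorded at boot, so a backend with smaller pages is
refused. The function name is hypothetical.

```c
#include <stdbool.h>

/*
 * Hypothetical check: a hot-plugged memory backend is acceptable only if
 * its page size is at least the maximum page size the guest was told
 * about at boot (max_cpu_page_size in the quoted patch).
 */
bool sketch_backend_page_size_ok(long backend_page_size,
                                 long max_cpu_page_size)
{
    return backend_page_size >= max_cpu_page_size;
}
```

For example, a 64 KiB backend would be rejected for a guest booted with
16 MiB huge pages, while a 16 MiB backend would be accepted.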

> ---
>  hw/ppc/spapr.c   |  8 
>  target/ppc/kvm.c | 32 
>  target/ppc/kvm_ppc.h |  7 +++
>  3 files changed, 43 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index e465d7a..1a90aae 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2357,6 +2357,7 @@ static void spapr_memory_plug(HotplugHandler 
> *hotplug_dev, DeviceState *dev,
>  uint64_t align = memory_region_get_alignment(mr);
>  uint64_t size = memory_region_size(mr);
>  uint64_t addr;
> +char *mem_dev;
>  
>  if (size % SPAPR_MEMORY_BLOCK_SIZE) {
>  error_setg(_err, "Hotplugged memory size must be a multiple of 
> "
> @@ -2364,6 +2365,13 @@ static void spapr_memory_plug(HotplugHandler 
> *hotplug_dev, DeviceState *dev,
>  goto out;
>  }
>  
> +mem_dev = object_property_get_str(OBJECT(dimm), PC_DIMM_MEMDEV_PROP, 
> NULL);
> +if (mem_dev && !kvmppc_is_mem_backend_page_size_ok(mem_dev)) {
> +error_setg(_err, "Memory backend has bad page size. "
> +   "Use 'memory-backend-file' with correct mem-path.");
> +goto out;
> +}
> +
>  pc_dimm_memory_plug(dev, >hotplug_memory, mr, align, _err);
>  if (local_err) {
>  goto out;
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 663d2e7..584546b 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -438,12 +438,13 @@ static bool kvm_valid_page_size(uint32_t flags, long 
> rampgsize, uint32_t shift)
>  return (1ul << shift) <= rampgsize;
>  }
>  
> +static long max_cpu_page_size;
> +
>  static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>  {
>  static struct kvm_ppc_smmu_info smmu_info;
>  static bool has_smmu_info;
>  CPUPPCState *env = >env;
> -long rampagesize;
>  int iq, ik, jq, jk;
>  bool has_64k_pages = false;
>  
> @@ -458,7 +459,9 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>  has_smmu_info = true;
>  }
>  
> -rampagesize = getrampagesize();
> +if (!max_cpu_page_size) {
> +max_cpu_page_size = getrampagesize();
> +}
>  
>  /* Convert to QEMU form */
>  memset(>sps, 0, sizeof(env->sps));
> @@ -478,14 +481,14 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>  struct ppc_one_seg_page_size *qsps = >sps.sps[iq];
>  struct kvm_ppc_one_seg_page_size *ksps = _info.sps[ik];
>  
> -if (!kvm_valid_page_size(smmu_info.flags, rampagesize,
> +if (!kvm_valid_page_size(smmu_info.flags, max_cpu_page_size,
>   ksps->page_shift)) {
>  continue;
>  }
>  qsps->page_shift = ksps->page_shift;
>  qsps->slb_enc = ksps->slb_enc;
>  for (jk = jq = 0; jk < KVM_PPC_PAGE_SIZES_MAX_SZ; jk++) {
> -if (!kvm_valid_page_size(smmu_info.flags, rampagesize,
> +if (!kvm_valid_page_size(smmu_info.flags, max_cpu_page_size,
>   ksps->enc[jk].page_shift)) {
>  continue;
>  }
> @@ -510,12 +513,33 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>  env->mmu_model &= ~POWERPC_MMU_64K;
>  }
>  }
> +
> +bool kvmppc_is_mem_backend_page_size_ok(char *obj_path)
> +{
> +Object *mem_obj = object_resolve_path(obj_path, NULL);
> +char *mempath = object_property_get_str(mem_obj, "mem-path", NULL);
> +long pagesize;
> +
> +if (mempath) {

Re: [Qemu-devel] [RFC] virtio-pci: Allow PCIe virtio devices on root bus

2017-02-15 Thread David Gibson
On Wed, Feb 15, 2017 at 04:59:33PM +0200, Marcel Apfelbaum wrote:
> On 02/15/2017 03:45 AM, David Gibson wrote:
> > On Tue, Feb 14, 2017 at 02:53:08PM +0200, Marcel Apfelbaum wrote:
> > > On 02/14/2017 06:15 AM, David Gibson wrote:
> > > > On Mon, Feb 13, 2017 at 12:14:23PM +0200, Marcel Apfelbaum wrote:
> > > > > On 02/13/2017 06:33 AM, David Gibson wrote:
> > > > > > On Sun, Feb 12, 2017 at 09:05:46PM +0200, Marcel Apfelbaum wrote:
> > > > > > > On 02/10/2017 02:37 AM, David Gibson wrote:
> > > > > > > > On Thu, Feb 09, 2017 at 10:04:47AM +0100, Laszlo Ersek wrote:
> > > > > > > > > On 02/09/17 05:16, David Gibson wrote:
> > > > > > > > > > On Wed, Feb 08, 2017 at 11:40:50AM +0100, Laszlo Ersek 
> > > > > > > > > > wrote:
> > > > > > > > > > > On 02/08/17 07:16, David Gibson wrote:
> > > > > > > > > > > > Marcel,
> > > > > > > > > > > > 
> > > > > > > > > > > > Your original patch adding PCIe support to virtio-pci.c 
> > > > > > > > > > > > has the
> > > > > > > > > > > > limitation noted below that PCIe won't be enabled if 
> > > > > > > > > > > > the device is on
> > > > > > > > > > > > the root bus (rather than under a root or downstream 
> > > > > > > > > > > > port).  As
> > > > > > > > > > > > reasoned below, I think removing the check is correct, 
> > > > > > > > > > > > even for x86
> > > > > > > > > > > > (though it would rarely be useful there).  But I could 
> > > > > > > > > > > > well have
> > > > > > > > > > > > missed something.  Let me know if so...
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > Virtio devices can appear as either vanilla PCI or 
> > > > > > > > > > > > PCI-Express devices
> > > > > > > > > > > > depending on the bus they're connected to.  At the 
> > > > > > > > > > > > moment it will only
> > > > > > > > > > > > appear as vanilla PCI if connected to the root bus of a 
> > > > > > > > > > > > PCIe host bridge.
> > > > > > > > > > > > 
> > > > > > > > > > > > Presumably this is to reflect the fact that PCIe 
> > > > > > > > > > > > devices usually need to
> > > > > > > > > > > > be connected to a root (or further downstream) port 
> > > > > > > > > > > > rather than directly
> > > > > > > > > > > > on the root bus.  However, due to the odd requirements 
> > > > > > > > > > > > of the PAPR spec on the 'pseries'
> > > > > > > > > > > > machine type, it's normal for PCIe devices to appear on 
> > > > > > > > > > > > the root bus
> > > > > > > > > > > > without root ports.
> > > > > > > > > > > > 
> > > > > > > > > > > > Further, even on x86, there's no inherent reason we 
> > > > > > > > > > > > couldn't present a
> > > > > > > > > > > > virtio device as an "integrated device" (typically used 
> > > > > > > > > > > > for things built
> > > > > > > > > > > > into the PCI chipset), and those devices *do* typically 
> > > > > > > > > > > > appear on the root
> > > > > > > > > > > > bus.
> > > > > > > > > > > 
> > > > > > > > > > > > I'm not personally making a counter-argument, just quoting some of
> > > > > > > > > > > > the relevant parts of "docs/pcie.txt" ("PCI EXPRESS GUIDELINES"):
> > > > > > > > > > 
> > > > > > > > > > > So, an earlier discussion more or less concluded that the PCIe
> > > > > > > > > > > guidelines don't really work with PAPR guests.  That comes because
> > > > > > > > > > > PAPR was designed with PowerVM in mind which allows PCI passthrough
> > > > > > > > > > > but doesn't do any emulated PCI devices.  So they wanted to present
> > > > > > > > > > > passed through devices (virtual or physical) to the guest without
> > > > > > > > > > > inserting virtual root ports.
> > > > > > > > > > > 
> > > > > > > > > > > Now, you can argue that this was a silly decision in PAPR, and you
> > > > > > > > > > > could well be right, but there it is.
> > > > > > > > > 
> > > > > > > > > I can totally accept this, but then we should state it as a 
> > > > > > > > > fact near
> > > > > > > > > the top of "docs/pcie.txt".
> > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > > > Place only the following kinds of devices directly on 
> > > > > > > > > > > > the Root Complex:
> > > > > > > > > > > > (1) PCI Devices (e.g. network card, graphics card, 
> > > > > > > > > > > > IDE controller),
> > > > > > > > > > > > not controllers. Place only legacy PCI devices 
> > > > > > > > > > > > on
> > > > > > > > > > > > the Root Complex. These will be considered 
> > > > > > > > > > > > Integrated Endpoints.
> > > > > > > > > > > > Note: Integrated Endpoints are not 
> > > > > > > > > > > > hot-pluggable.
> > > > > > > > > > > > 
> > > > > > > > > > > > Although the PCI Express spec does not forbid 
> > > > > > > > > > > > PCI Express devices as
> > > > > > > > > > > > Integrated Endpoints, existing hardware mostly 
> > > > > > > > > 

Re: [Qemu-devel] [PATCH v7 2/2] block/vxhs.c: Add qemu-iotests for new block device type "vxhs"

2017-02-15 Thread ashish mittal
On Mon, Feb 13, 2017 at 6:43 AM, Stefan Hajnoczi  wrote:
> On Tue, Feb 07, 2017 at 08:18:14PM -0800, Ashish Mittal wrote:
>> diff --git a/tests/qemu-iotests/common.config 
>> b/tests/qemu-iotests/common.config
>> index f6384fb..c7a80c0 100644
>> --- a/tests/qemu-iotests/common.config
>> +++ b/tests/qemu-iotests/common.config
>> @@ -105,6 +105,10 @@ if [ -z "$QEMU_NBD_PROG" ]; then
>>  export QEMU_NBD_PROG="`set_prog_path qemu-nbd`"
>>  fi
>>
>> +if [ -z "$QEMU_VXHS_PROG" ]; then
>> +export QEMU_VXHS_PROG="`set_prog_path qnio_server /usr/local/bin`"
>
> Did you test this with /usr/local/bin/qnio_server?
>
> I think it will evaluate to QEMU_VXHS_PROG=/usr/local/bin when qnio_server
> isn't found in PATH.  You probably wanted /usr/local/bin/qnio_server instead.
>
> I suggest dropping the second argument completely and letting the user set 
> PATH
> themselves.  No existing set_prog_path caller uses the second argument.
>
> # $1 = prog to look for, $2* = default pathnames if not found in $PATH
> set_prog_path()
> {
> p=`command -v $1 2> /dev/null`
> if [ -n "$p" -a -x "$p" ]; then
> echo $p
> return 0
> fi
> p=$1
>
> shift
> for f; do
> if [ -x $f ]; then
> echo $f
> return 0
> fi
> done
>
> echo ""
> return 1
> }
>
>> diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
>> index 3213765..06a3164 100644
>> --- a/tests/qemu-iotests/common.rc
>> +++ b/tests/qemu-iotests/common.rc
>> @@ -89,6 +89,9 @@ else
>>  TEST_IMG=$TEST_DIR/t.$IMGFMT
>>  elif [ "$IMGPROTO" = "archipelago" ]; then
>>  TEST_IMG="archipelago:at.$IMGFMT"
>> +elif [ "$IMGPROTO" = "vxhs" ]; then
>> +TEST_IMG_FILE=$TEST_DIR/t.$IMGFMT
>> +TEST_IMG="vxhs://127.0.0.1:/t.$IMGFMT"
>>  else
>>  TEST_IMG=$IMGPROTO:$TEST_DIR/t.$IMGFMT
>>  fi
>> @@ -175,6 +178,12 @@ _make_test_img()
>>  eval "$QEMU_NBD -v -t -b 127.0.0.1 -p 10810 -f $IMGFMT  
>> $TEST_IMG_FILE &"
>>  sleep 1 # FIXME: qemu-nbd needs to be listening before we continue
>>  fi
>> +
>> +# Start QNIO server on image directory for vxhs protocol
>> +if [ $IMGPROTO = "vxhs" ]; then
>> +eval "$QEMU_VXHS -d  $TEST_DIR &"
>> +sleep 1 # Wait for server to come up.
>
> This is a pre-existing problem and you don't need to fix it now:
>
> We should replace sleep 1 with a function that probes the TCP port until the
> connection can be established or a timeout is reached.  The netcat (nc) 
> utility
> is often used for this.
>
> sleep 1 is not reliable and may fail on a heavily loaded machine like the
> Travis-CI build machines that are used.
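The port-probing idea suggested above could look roughly like the following. This is only a sketch under stated assumptions: wait_for_tcp_port() is a hypothetical helper (nothing with this name exists in qemu-iotests), it is written in C rather than with nc, it handles IPv4 only, and it is meant for loopback, where a refused connect() returns immediately.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of qemu-iotests): probe an IPv4 TCP port
 * until connect() succeeds or roughly timeout_ms milliseconds have passed.
 * Returns true once the server accepts a connection, false otherwise.
 */
static bool wait_for_tcp_port(const char *ipv4, uint16_t port, int timeout_ms)
{
    const struct timespec probe_gap = { 0, 50 * 1000 * 1000 }; /* 50 ms */
    struct sockaddr_in addr;
    int waited_ms = 0;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4, &addr.sin_addr) != 1) {
        return false;               /* not a dotted-quad IPv4 address */
    }

    for (;;) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        int rc;

        if (fd < 0) {
            return false;
        }
        rc = connect(fd, (struct sockaddr *)&addr, sizeof(addr));
        close(fd);
        if (rc == 0) {
            return true;            /* something is listening */
        }
        if (waited_ms >= timeout_ms) {
            return false;           /* gave up: server never came up */
        }
        nanosleep(&probe_gap, NULL);
        waited_ms += 50;
    }
}
```

A test harness would call this once after starting the server and fail the test early if it returns false, instead of sleeping for a fixed second.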



Re: [Qemu-devel] [PATCH v7 1/2] block/vxhs.c: Add support for a new block device type called "vxhs"

2017-02-15 Thread ashish mittal
On Mon, Feb 13, 2017 at 6:57 AM, Stefan Hajnoczi  wrote:
> On Tue, Feb 07, 2017 at 08:18:13PM -0800, Ashish Mittal wrote:
>> +static int vxhs_parse_uri(const char *filename, QDict *options)
>> +{
>> +URI *uri = NULL;
>> +char *hoststr, *portstr;
>> +char *port;
>> +int ret = 0;
>> +
>> +trace_vxhs_parse_uri_filename(filename);
>> +uri = uri_parse(filename);
>> +if (!uri || !uri->server || !uri->path) {
>> +uri_free(uri);
>> +return -EINVAL;
>> +}
>> +
>> +hoststr = g_strdup(VXHS_OPT_SERVER".host");
>> +qdict_put(options, hoststr, qstring_from_str(uri->server));
>> +g_free(hoststr);
>> +
>> +portstr = g_strdup(VXHS_OPT_SERVER".port");
>> +if (uri->port) {
>> +port = g_strdup_printf("%d", uri->port);
>> +qdict_put(options, portstr, qstring_from_str(port));
>> +g_free(port);
>> +}
>> +g_free(portstr);
>> +
>> +if (strstr(uri->path, "vxhs") == NULL) {
>> +qdict_put(options, "vdisk-id", qstring_from_str(uri->path));
>> +}
>> +
>> +trace_vxhs_parse_uri_hostinfo(1, uri->server, uri->port);
>
> What is the purpose of the first argument?
>

It used to be a placeholder for the host index, which is now only 1. I
will remove it.

>> +str = g_strdup_printf(VXHS_OPT_SERVER".");
>> +qdict_extract_subqdict(options, &backing_options, str);
>> +
>> +/* Create opts info from runtime_tcp_opts list */
>> +tcp_opts = qemu_opts_create(&runtime_tcp_opts, NULL, 0, &error_abort);
>> +qemu_opts_absorb_qdict(tcp_opts, backing_options, &local_err);
>> +if (local_err) {
>> +qdict_del(backing_options, str);
>
> What is qdict_del(backing_options, VXHS_OPT_SERVER".") supposed to do?
> The same call is made further down too.
>

Per my understanding, qdict_del() is to free the 'server.' entries
within the subqdict.

qdict_extract_subqdict() allocates a subqdict and populates it with
the entries based on the pattern we pass. In this case 'server.'.

>> +qemu_opts_del(tcp_opts);
>> +ret = -EINVAL;
>> +goto out;
>> +}
>> +
>> +server_host_opt = qemu_opt_get(tcp_opts, VXHS_OPT_HOST);
>> +if (!server_host_opt) {
>> +error_setg(&local_err, QERR_MISSING_PARAMETER,
>> +   VXHS_OPT_SERVER"."VXHS_OPT_HOST);
>> +ret = -EINVAL;
>> +goto out;
>
> Missing qemu_opts_del(tcp_opts).
>

Will fix this!

>> +}
>> +
>> +if (strlen(server_host_opt) > MAXHOSTNAMELEN) {
>> +error_setg(errp, "server.host cannot be more than %d characters",
>> +   MAXHOSTNAMELEN);
>> +ret = -EINVAL;
>> +goto out;
>
> Missing qemu_opts_del(tcp_opts).
>

Will fix this!

>> @@ -5114,6 +5147,7 @@ echo "tcmalloc support  $tcmalloc"
>>  echo "jemalloc support  $jemalloc"
>>  echo "avx2 optimization $avx2_opt"
>>  echo "replication support $replication"
>> +echo "VxHS block device $vxhs"
>>
>>  if test "$sdl_too_old" = "yes"; then
>>  echo "-> Your SDL version is too old - please upgrade to have SDL support"
>> @@ -5729,6 +5763,12 @@ if test "$pthread_setname_np" = "yes" ; then
>>echo "CONFIG_PTHREAD_SETNAME_NP=y" >> $config_host_mak
>>  fi
>>
>> +if test "$vxhs" = "yes" ; then
>> +  echo "CONFIG_VXHS=y" >> $config_host_mak
>> +  echo "VXHS_CFLAGS=$vxhs_cflags" >> $config_host_mak
>
> Please drop this unused variable.

Will fix this!


Thanks,
Ashish



Re: [Qemu-devel] [PATCH] pci/pcie: don't assume cap id 0 is reserved

2017-02-15 Thread Alex Williamson
On Thu, 16 Feb 2017 10:35:28 +0800
Peter Xu  wrote:

> On Wed, Feb 15, 2017 at 10:49:47PM +0200, Michael S. Tsirkin wrote:
> > VFIO actually wants to create a capability with ID == 0.
> > This is done to make guest drivers skip the given capability.
> > pcie_add_capability then trips up on this capability
> > when looking for end of capability list.
> > 
> > To support this use-case, it's easy enough to switch to
> > e.g. 0x for these comparisons - we can be sure
> > it will never match a 16-bit capability ID.
> > 
> > Signed-off-by: Michael S. Tsirkin   
> 
> Reviewed-by: Peter Xu 
> 
> Two nits:
> 
> (1) maybe we can s/0x/0x/ in the whole patch since ecap_id
> is 16 bits

The former is used because it's beyond the address space of a valid
capability.  Using 0x just makes the situation different, not
better.

> 
> (2) maybe we can add one more sentence in the comment below showing
> where the 0x thing comes from (it comes from PCIe spec 7.9.2)

The capability in hardware is 16bits, thus a value that exceeds 16 bits
can never match a valid ID.  It has nothing to do with 7.9.2.  Thanks,

Alex
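To illustrate the point about the ID width, here is a small self-contained sketch of a list walk over a fake config space. It is not the QEMU code: find_ext_cap(), put_ext_cap() and NO_SUCH_CAP_ID are names made up for this example, and the sentinel value is only an illustration of "something wider than 16 bits" (the constant used in the patch is not reproduced here). The header layout assumed is the standard one: ID in bits 15:0, version in bits 19:16, next pointer in bits 31:20.

```c
#include <stdint.h>

#define EXT_CAP_START    0x100u            /* first extended capability */
#define EXT_CAP_ID(h)    ((h) & 0xffffu)
#define EXT_CAP_NEXT(h)  (((h) >> 20) & 0xffcu)
#define NO_SUCH_CAP_ID   0x10000u          /* illustrative: wider than 16 bits */

/* Read a little-endian 32-bit header from the fake config space. */
static uint32_t cfg_read32(const uint8_t *cfg, uint16_t off)
{
    return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/* Test helper: write an extended capability header at 'off'. */
static void put_ext_cap(uint8_t *cfg, uint16_t off, uint16_t id,
                        uint8_t ver, uint16_t next)
{
    uint32_t h = id | ((uint32_t)(ver & 0xf) << 16) | ((uint32_t)next << 20);

    cfg[off] = h & 0xff;
    cfg[off + 1] = (h >> 8) & 0xff;
    cfg[off + 2] = (h >> 16) & 0xff;
    cfg[off + 3] = (h >> 24) & 0xff;
}

/*
 * Return the offset of the first capability whose ID equals cap_id, or 0
 * if there is none.  *prev_p receives the offset of the entry visited just
 * before the returned position, i.e. the tail of the list on a miss.
 * cap_id is 32 bits wide on purpose: a value above 0xffff can never equal
 * a 16-bit ID field, so it always walks to the end of the chain.
 */
static uint16_t find_ext_cap(const uint8_t *cfg, uint32_t cap_id,
                             uint16_t *prev_p)
{
    uint16_t prev = 0;
    uint16_t next = 0;
    uint32_t header = cfg_read32(cfg, EXT_CAP_START);

    if (header != 0) {                     /* all-zero header: empty list */
        for (next = EXT_CAP_START; next;
             prev = next, next = EXT_CAP_NEXT(header)) {
            header = cfg_read32(cfg, next);
            if (EXT_CAP_ID(header) == cap_id) {
                break;
            }
        }
    }
    if (prev_p) {
        *prev_p = prev;
    }
    return next;
}
```

With a head entry whose ID has been set to 0, searching for ID 0 stops at offset 0x100 instead of reaching the tail, whereas searching for NO_SUCH_CAP_ID returns 0 and leaves the tail offset in prev.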

> > ---
> >  hw/pci/pcie.c | 11 +++
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> > 
> > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> > index cbd4bb4..f4dd177 100644
> > --- a/hw/pci/pcie.c
> > +++ b/hw/pci/pcie.c
> > @@ -610,7 +610,8 @@ bool pcie_cap_is_arifwd_enabled(const PCIDevice *dev)
> >   * uint16_t ext_cap_size
> >   */
> >  
> > -static uint16_t pcie_find_capability_list(PCIDevice *dev, uint16_t cap_id,
> > +/* Passing a cap_id value > 0x will return 0 and put end of list in 
> > prev */
> > +static uint16_t pcie_find_capability_list(PCIDevice *dev, uint32_t cap_id,
> >uint16_t *prev_p)
> >  {
> >  uint16_t prev = 0;
> > @@ -679,9 +680,11 @@ void pcie_add_capability(PCIDevice *dev,
> >  } else {
> >  uint16_t prev;
> >  
> > -/* 0 is reserved cap id. use internally to find the last capability
> > -   in the linked list */
> > -next = pcie_find_capability_list(dev, 0, );
> > +/*
> > + * 0x is not a valid cap id (it's a 16 bit field). use
> > + * internally to find the last capability in the linked list.
> > + */
> > +next = pcie_find_capability_list(dev, 0x, );
> >  
> >  assert(prev >= PCI_CONFIG_SPACE_SIZE);
> >  assert(next == 0);
> > -- 
> > MST  
> 
> -- peterx




Re: [Qemu-devel] iommu emulation

2017-02-15 Thread Alex Williamson
On Thu, 16 Feb 2017 10:28:39 +0800
Peter Xu  wrote:

> On Wed, Feb 15, 2017 at 11:15:52AM -0700, Alex Williamson wrote:
> 
> [...]
> 
> > > Alex, do you like something like below to fix above issue that Jintack
> > > has encountered?
> > > 
> > > > (note: this code is not meant to compile, it only tries to show what I mean...)
> > > 
> > > --8<---
> > > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> > > index 332f41d..4dca631 100644
> > > --- a/hw/vfio/pci.c
> > > +++ b/hw/vfio/pci.c
> > > @@ -1877,25 +1877,6 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> > >   */
> > >  config = g_memdup(pdev->config, vdev->config_size);
> > > 
> > > -/*
> > > - * Extended capabilities are chained with each pointing to the next, 
> > > so we
> > > - * can drop anything other than the head of the chain simply by 
> > > modifying
> > > - * the previous next pointer.  For the head of the chain, we can 
> > > modify the
> > > - * capability ID to something that cannot match a valid capability.  
> > > ID
> > > - * 0 is reserved for this since absence of capabilities is indicated 
> > > by
> > > - * 0 for the ID, version, AND next pointer.  However, 
> > > pcie_add_capability()
> > > - * uses ID 0 as reserved for list management and will incorrectly 
> > > match and
> > > - * assert if we attempt to pre-load the head of the chain with this 
> > > ID.
> > > - * Use ID 0x temporarily since it is also seems to be reserved in
> > > - * part for identifying absence of capabilities in a root complex 
> > > register
> > > - * block.  If the ID still exists after adding capabilities, switch 
> > > back to
> > > - * zero.  We'll mark this entire first dword as emulated for this 
> > > purpose.
> > > - */
> > > -pci_set_long(pdev->config + PCI_CONFIG_SPACE_SIZE,
> > > - PCI_EXT_CAP(0x, 0, 0));
> > > -pci_set_long(pdev->wmask + PCI_CONFIG_SPACE_SIZE, 0);
> > > -pci_set_long(vdev->emulated_config_bits + PCI_CONFIG_SPACE_SIZE, ~0);
> > > -
> > >  for (next = PCI_CONFIG_SPACE_SIZE; next;
> > >   next = PCI_EXT_CAP_NEXT(pci_get_long(config + next))) {
> > >  header = pci_get_long(config + next);
> > > @@ -1917,6 +1898,8 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> > >  switch (cap_id) {
> > >  case PCI_EXT_CAP_ID_SRIOV: /* Read-only VF BARs confuse OVMF */
> > >  case PCI_EXT_CAP_ID_ARI: /* XXX Needs next function 
> > > virtualization */
> > > +/* keep this ecap header (4 bytes), but mask cap_id to 
> > > 0x */
> > > +...
> > >  trace_vfio_add_ext_cap_dropped(vdev->vbasedev.name, cap_id, 
> > > next);
> > >  break;
> > >  default:
> > > @@ -1925,11 +1908,6 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> > > 
> > >  }
> > > 
> > > -/* Cleanup chain head ID if necessary */
> > > -if (pci_get_word(pdev->config + PCI_CONFIG_SPACE_SIZE) == 0x) {
> > > -pci_set_word(pdev->config + PCI_CONFIG_SPACE_SIZE, 0);
> > > -}
> > > -
> > >  g_free(config);
> > >  return;
> > >  }  
> > > ->8-
> > > 
> > > Since after all we need the assumption that 0x is reserved for
> > > cap_id. Then, we can just remove the "first 0x then 0x0" hack,
> > > which is imho error-prone and hacky.  
> > 
> > This doesn't fix the bug, which is that pcie_add_capability() uses a
> > > valid capability ID for its own internal tracking.  It's only doing
> > > this to find the end of the capability chain, which we could do in a
> > > spec compliant way by looking for a zero next pointer.  Fix that and
> > then vfio doesn't need to do this set to 0x then back to zero
> > nonsense at all.  Capability ID zero is valid.  Thanks,  
> 
> Yeah, I see Michael's fix for the capability list stuff. However, imho
> these are two different issues? Put differently, even with that patch,
> we would still need this hack (first 0x0, then 0x), right? It looks
> like that patch doesn't solve the problem if the first pcie ecap
> is masked at 0x100.

I thought the problem was that QEMU in the host exposes a device with a
capability ID of 0 to the L1 guest.  QEMU in the L1 guest balks at a
capability ID of 0 because that's how it finds the end of the chain.
Therefore if we make QEMU not use capability ID 0 for internal
purposes, things work.  vfio using 0x and swapping back to 0x0
becomes unnecessary, but doesn't hurt anything.  Thanks,

Alex
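For reference, finding the end of the chain by following next pointers until one is zero, without looking at the IDs at all, can be sketched as below. This is an illustration only, not a proposed patch: last_ext_cap() and the helpers are made-up names operating on a plain byte array, the header layout (next pointer in bits 31:20) is the standard one, and the step limit is an extra guard added for this example against a malformed, looping chain.

```c
#include <stdint.h>

#define EXT_CAP_START    0x100u     /* first extended capability */
#define CFG_SPACE_SIZE   0x1000u    /* PCIe extended config space size */
#define EXT_CAP_NEXT(h)  (((h) >> 20) & 0xffcu)

/* Read a little-endian 32-bit header from the fake config space. */
static uint32_t cfg_read32(const uint8_t *cfg, uint16_t off)
{
    return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/* Test helper: write an extended capability header at 'off'. */
static void put_ext_cap(uint8_t *cfg, uint16_t off, uint16_t id,
                        uint8_t ver, uint16_t next)
{
    uint32_t h = id | ((uint32_t)(ver & 0xf) << 16) | ((uint32_t)next << 20);

    cfg[off] = h & 0xff;
    cfg[off + 1] = (h >> 8) & 0xff;
    cfg[off + 2] = (h >> 16) & 0xff;
    cfg[off + 3] = (h >> 24) & 0xff;
}

/*
 * Return the offset of the last extended capability, or 0 if the list is
 * empty.  An empty list is an all-zero header at 0x100 (ID, version and
 * next pointer all zero).  The capability ID is never examined, so an
 * entry with ID 0 is walked over like any other.
 */
static uint16_t last_ext_cap(const uint8_t *cfg)
{
    uint16_t off = EXT_CAP_START;
    uint32_t header = cfg_read32(cfg, off);
    unsigned steps = (CFG_SPACE_SIZE - EXT_CAP_START) / 4;

    if (header == 0) {
        return 0;
    }
    while (EXT_CAP_NEXT(header) != 0 && steps-- > 0) {
        off = EXT_CAP_NEXT(header);
        header = cfg_read32(cfg, off);
    }
    return off;
}
```

A caller adding a new capability would then patch the next pointer of the entry at the returned offset, and no capability ID has to be reserved for list management.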



Re: [Qemu-devel] [PATCH 3/5] colo-compare: release all unhandled packets in finalize function

2017-02-15 Thread Jason Wang



On 2017年02月16日 10:43, Hailiang Zhang wrote:

On 2017/2/16 10:34, Jason Wang wrote:



On 2017年02月15日 16:34, zhanghailiang wrote:

We should release all unhandled packets before finalizing colo compare.
Besides, we need to free connection_track_table, or there will be
a memory leak bug.

Signed-off-by: zhanghailiang 
---
   net/colo-compare.c | 20 
   1 file changed, 20 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index a16e2d5..809bad3 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -676,6 +676,23 @@ static void colo_compare_complete(UserCreatable 
*uc, Error **errp)

   return;
   }

+static void colo_release_packets(void *opaque, void *user_data)
+{
+CompareState *s = user_data;
+Connection *conn = opaque;
+Packet *pkt = NULL;
+
+while (!g_queue_is_empty(&conn->primary_list)) {
+pkt = g_queue_pop_head(&conn->primary_list);
+compare_chr_send(&s->chr_out, pkt->data, pkt->size);


Any reason to send packets here?



Yes. Consider the use case where we shut down COLO for
the VM to turn it back into a normal VM without FT.
We need to remove all the filter objects. In this case,
IMHO, it is necessary to release the unhandled packets.


Thanks.


Right, I see. All other patches look good; let's squash this into 2.

Thanks




Thanks


+packet_destroy(pkt, NULL);
+}
+while (!g_queue_is_empty(&conn->secondary_list)) {
+pkt = g_queue_pop_head(&conn->secondary_list);
+packet_destroy(pkt, NULL);
+}
+}
+
   static void colo_compare_class_init(ObjectClass *oc, void *data)
   {
   UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
@@ -707,9 +724,12 @@ static void colo_compare_finalize(Object *obj)
   g_main_loop_quit(s->compare_loop);
   qemu_thread_join(&s->thread);

+/* Release all unhandled packets after compare thread exited */
+g_queue_foreach(&s->conn_list, colo_release_packets, s);

   g_queue_free(&s->conn_list);

+g_hash_table_destroy(s->connection_track_table);
   g_free(s->pri_indev);
   g_free(s->sec_indev);
   g_free(s->outdev);











Re: [Qemu-devel] [PATCH 2/5] colo-compare: kick compare thread to exit while finalize

2017-02-15 Thread Hailiang Zhang

On 2017/2/16 10:25, Zhang Chen wrote:



On 02/15/2017 04:34 PM, zhanghailiang wrote:

We should call g_main_loop_quit() to notify the colo compare thread to
exit, or it will run in g_main_loop_run() forever.

Besides, the finalizing process can't happen in the context of the colo
thread, so it is reasonable to remove the
'if (qemu_thread_is_self(&s->thread))' branch.

Signed-off-by: zhanghailiang 
---
   net/colo-compare.c | 19 +--
   1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index fdde788..a16e2d5 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -83,6 +83,8 @@ typedef struct CompareState {
   GHashTable *connection_track_table;
   /* compare thread, a thread for each NIC */
   QemuThread thread;
+
+GMainLoop *compare_loop;
   } CompareState;

   typedef struct CompareClass {
@@ -496,7 +498,6 @@ static gboolean check_old_packet_regular(void *opaque)
   static void *colo_compare_thread(void *opaque)
   {
   GMainContext *worker_context;
-GMainLoop *compare_loop;
   CompareState *s = opaque;
   GSource *timeout_source;

@@ -507,7 +508,7 @@ static void *colo_compare_thread(void *opaque)
   qemu_chr_fe_set_handlers(&s->chr_sec_in, compare_chr_can_read,
compare_sec_chr_in, NULL, s, worker_context, true);

-compare_loop = g_main_loop_new(worker_context, FALSE);
+s->compare_loop = g_main_loop_new(worker_context, FALSE);

   /* To kick any packets that the secondary doesn't match */
   timeout_source = g_timeout_source_new(REGULAR_PACKET_CHECK_MS);
@@ -515,10 +516,10 @@ static void *colo_compare_thread(void *opaque)
 (GSourceFunc)check_old_packet_regular, s, NULL);
   g_source_attach(timeout_source, worker_context);

-g_main_loop_run(compare_loop);
+g_main_loop_run(s->compare_loop);

   g_source_unref(timeout_source);
-g_main_loop_unref(compare_loop);
+g_main_loop_unref(s->compare_loop);
   g_main_context_unref(worker_context);
   return NULL;
   }
@@ -703,13 +704,11 @@ static void colo_compare_finalize(Object *obj)
   qemu_chr_fe_deinit(&s->chr_sec_in);
   qemu_chr_fe_deinit(&s->chr_out);

-g_queue_free(&s->conn_list);
+g_main_loop_quit(s->compare_loop);
+qemu_thread_join(&s->thread);

-if (qemu_thread_is_self(&s->thread)) {
-/* compare connection */
-g_queue_foreach(&s->conn_list, colo_compare_connection, s);
-qemu_thread_join(&s->thread);
-}


Before freeing 's->conn_list', you should flush all queued primary packets
and release all queued secondary packets here, so combining this patch
with patch 3/5 into one patch is a better choice.



Makes sense, I will fix it in the next version, thanks.


Thanks
Zhang Chen


+
+g_queue_free(&s->conn_list);

   g_free(s->pri_indev);
   g_free(s->sec_indev);







Re: [Qemu-devel] [PATCH 3/5] colo-compare: release all unhandled packets in finalize function

2017-02-15 Thread Hailiang Zhang

On 2017/2/16 10:34, Jason Wang wrote:



On 2017年02月15日 16:34, zhanghailiang wrote:

We should release all unhandled packets before finalizing colo compare.
Besides, we need to free connection_track_table, or there will be
a memory leak bug.

Signed-off-by: zhanghailiang 
---
   net/colo-compare.c | 20 
   1 file changed, 20 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index a16e2d5..809bad3 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -676,6 +676,23 @@ static void colo_compare_complete(UserCreatable *uc, Error 
**errp)
   return;
   }

+static void colo_release_packets(void *opaque, void *user_data)
+{
+CompareState *s = user_data;
+Connection *conn = opaque;
+Packet *pkt = NULL;
+
+while (!g_queue_is_empty(&conn->primary_list)) {
+pkt = g_queue_pop_head(&conn->primary_list);
+compare_chr_send(&s->chr_out, pkt->data, pkt->size);


Any reason to send packets here?



Yes. Consider the use case where we shut down COLO for
the VM to turn it back into a normal VM without FT.
We need to remove all the filter objects. In this case,
IMHO, it is necessary to release the unhandled packets.


Thanks.


Thanks


+packet_destroy(pkt, NULL);
+}
+while (!g_queue_is_empty(&conn->secondary_list)) {
+pkt = g_queue_pop_head(&conn->secondary_list);
+packet_destroy(pkt, NULL);
+}
+}
+
   static void colo_compare_class_init(ObjectClass *oc, void *data)
   {
   UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
@@ -707,9 +724,12 @@ static void colo_compare_finalize(Object *obj)
   g_main_loop_quit(s->compare_loop);
   qemu_thread_join(&s->thread);

+/* Release all unhandled packets after compare thread exited */
+g_queue_foreach(&s->conn_list, colo_release_packets, s);

   g_queue_free(&s->conn_list);

+g_hash_table_destroy(s->connection_track_table);
   g_free(s->pri_indev);
   g_free(s->sec_indev);
   g_free(s->outdev);









Re: [Qemu-devel] [PATCH] target-ppc: Add quad precision muladd instructions

2017-02-15 Thread Bharata B Rao
On Thu, Feb 16, 2017 at 09:13:31AM +1100, Richard Henderson wrote:
> On 02/15/2017 05:37 PM, Bharata B Rao wrote:
> > + *
> > + * TODO: When float128_muladd() becomes available, switch this
> > + * implementation to use that instead of separate float128_mul()
> > + * followed by float128_add().
> 
> Let's just do that, rather than add something that can't pass tests.
> 
> You should be able to copy float64_muladd and, for the most part, s/128/256/
> and s/64/128/.  Other of the magic numbers, like the implicit bit and the
> exponent bias, you get from float128_mul.

I started like that but got lost somewhere down that path...

It needs at least the following new functions to be implemented:

propagateFloat128MulAddNaN
shortShift256Left
shift256RightJamming
add256
sub256

It all looked doable, but the magic numbers used around the code that
does the eventual multiplication looked difficult to understand and I couldn't
deduce them from float128_mul. For some reason float128_mul implements
multiplication via multiplication and addition (mul128To256 & add128). There
is no equivalent to this in float64_muladd.

Let me make another attempt at this.

Regards,
Bharata.
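As a rough illustration of what two of the helpers listed above would have to do, here is a sketch of a 256-bit add and a 256-bit "right shift with jamming" (any bits shifted out are ORed into the least significant bit as a sticky bit). This is not softfloat code: the u256 type, the limb order (w[0] most significant) and the function names add256() and shift256_right_jamming() are assumptions made for this example, and softfloat itself passes separate 64-bit words rather than a struct.

```c
#include <stdint.h>

/* Assumed layout for this sketch: four 64-bit limbs, w[0] most significant. */
typedef struct {
    uint64_t w[4];
} u256;

/* z = a + b modulo 2^256, carrying from the least significant limb upwards. */
static u256 add256(u256 a, u256 b)
{
    u256 z;
    unsigned carry = 0;

    for (int i = 3; i >= 0; i--) {
        uint64_t s = a.w[i] + b.w[i];
        unsigned c1 = s < a.w[i];          /* overflow of a + b */

        z.w[i] = s + carry;
        carry = c1 | (z.w[i] < s);         /* overflow of adding the carry */
    }
    return z;
}

/*
 * Shift right by 'count' bits.  If any non-zero bit is shifted out, the
 * least significant bit of the result is set ("jammed"), so that later
 * rounding can still tell that the value was inexact.
 */
static u256 shift256_right_jamming(u256 a, unsigned count)
{
    u256 z = { { 0, 0, 0, 0 } };
    uint64_t lost = 0;
    unsigned limbs = count / 64;
    unsigned bits = count % 64;

    if (count == 0) {
        return a;
    }
    if (count >= 256) {
        z.w[3] = (a.w[0] | a.w[1] | a.w[2] | a.w[3]) != 0;
        return z;
    }
    for (unsigned i = 0; i < limbs; i++) {
        lost |= a.w[3 - i];                /* whole limbs shifted out */
    }
    if (bits) {
        lost |= a.w[3 - limbs] << (64 - bits);   /* partial limb shifted out */
    }
    for (int i = 3; i >= (int)limbs; i--) {
        int src = i - (int)limbs;
        uint64_t v = a.w[src] >> bits;

        if (bits && src > 0) {
            v |= a.w[src - 1] << (64 - bits);    /* bits from the limb above */
        }
        z.w[i] = v;
    }
    z.w[3] |= (lost != 0);
    return z;
}
```

For example, shifting the value 5 right by one bit gives 3 rather than 2, because the dropped low bit is kept as the sticky bit.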




Re: [Qemu-devel] [PATCH 2/5] colo-compare: kick compare thread to exit while finalize

2017-02-15 Thread Jason Wang



On 2017年02月16日 10:25, Zhang Chen wrote:

@@ -703,13 +704,11 @@ static void colo_compare_finalize(Object *obj)
  qemu_chr_fe_deinit(&s->chr_sec_in);
  qemu_chr_fe_deinit(&s->chr_out);

-g_queue_free(&s->conn_list);
+g_main_loop_quit(s->compare_loop);
+qemu_thread_join(&s->thread);

-if (qemu_thread_is_self(&s->thread)) {
-/* compare connection */
-g_queue_foreach(&s->conn_list, colo_compare_connection, s);
-qemu_thread_join(&s->thread);
-}
-}


Before freeing 's->conn_list', you should flush all queued primary packets
and release all queued secondary packets here, so combining this patch
with patch 3/5 into one patch is a better choice.

Thanks
Zhang Chen 


Yes, agree.

Thanks



Re: [Qemu-devel] [PATCH] pci/pcie: don't assume cap id 0 is reserved

2017-02-15 Thread Peter Xu
On Wed, Feb 15, 2017 at 10:49:47PM +0200, Michael S. Tsirkin wrote:
> VFIO actually wants to create a capability with ID == 0.
> This is done to make guest drivers skip the given capability.
> pcie_add_capability then trips up on this capability
> when looking for end of capability list.
> 
> To support this use-case, it's easy enough to switch to
> e.g. 0x for these comparisons - we can be sure
> it will never match a 16-bit capability ID.
> 
> Signed-off-by: Michael S. Tsirkin 

Reviewed-by: Peter Xu 

Two nits:

(1) maybe we can s/0x/0x/ in the whole patch since ecap_id
is 16 bits

(2) maybe we can add one more sentence in the comment below showing
where the 0x thing comes from (it comes from PCIe spec 7.9.2)

Thanks,

> ---
>  hw/pci/pcie.c | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> index cbd4bb4..f4dd177 100644
> --- a/hw/pci/pcie.c
> +++ b/hw/pci/pcie.c
> @@ -610,7 +610,8 @@ bool pcie_cap_is_arifwd_enabled(const PCIDevice *dev)
>   * uint16_t ext_cap_size
>   */
>  
> -static uint16_t pcie_find_capability_list(PCIDevice *dev, uint16_t cap_id,
> +/* Passing a cap_id value > 0x will return 0 and put end of list in prev 
> */
> +static uint16_t pcie_find_capability_list(PCIDevice *dev, uint32_t cap_id,
>uint16_t *prev_p)
>  {
>  uint16_t prev = 0;
> @@ -679,9 +680,11 @@ void pcie_add_capability(PCIDevice *dev,
>  } else {
>  uint16_t prev;
>  
> -/* 0 is reserved cap id. use internally to find the last capability
> -   in the linked list */
> -next = pcie_find_capability_list(dev, 0, );
> +/*
> + * 0x is not a valid cap id (it's a 16 bit field). use
> + * internally to find the last capability in the linked list.
> + */
> +next = pcie_find_capability_list(dev, 0x, );
>  
>  assert(prev >= PCI_CONFIG_SPACE_SIZE);
>  assert(next == 0);
> -- 
> MST

-- peterx



Re: [Qemu-devel] [PATCH 3/5] colo-compare: release all unhandled packets in finalize function

2017-02-15 Thread Jason Wang



On 2017年02月15日 16:34, zhanghailiang wrote:

We should release all unhandled packets before finalizing colo compare.
Besides, we need to free connection_track_table, or there will be
a memory leak bug.

Signed-off-by: zhanghailiang 
---
  net/colo-compare.c | 20 
  1 file changed, 20 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index a16e2d5..809bad3 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -676,6 +676,23 @@ static void colo_compare_complete(UserCreatable *uc, Error 
**errp)
  return;
  }
  
+static void colo_release_packets(void *opaque, void *user_data)

+{
+CompareState *s = user_data;
+Connection *conn = opaque;
+Packet *pkt = NULL;
+
+while (!g_queue_is_empty(&conn->primary_list)) {
+pkt = g_queue_pop_head(&conn->primary_list);
+compare_chr_send(&s->chr_out, pkt->data, pkt->size);


Any reason to send packets here?

Thanks


+packet_destroy(pkt, NULL);
+}
+while (!g_queue_is_empty(&conn->secondary_list)) {
+pkt = g_queue_pop_head(&conn->secondary_list);
+packet_destroy(pkt, NULL);
+}
+}
+
  static void colo_compare_class_init(ObjectClass *oc, void *data)
  {
  UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
@@ -707,9 +724,12 @@ static void colo_compare_finalize(Object *obj)
  g_main_loop_quit(s->compare_loop);
  qemu_thread_join(&s->thread);

+/* Release all unhandled packets after compare thread exited */
+g_queue_foreach(&s->conn_list, colo_release_packets, s);

  g_queue_free(&s->conn_list);
  
+g_hash_table_destroy(s->connection_track_table);

  g_free(s->pri_indev);
  g_free(s->sec_indev);
  g_free(s->outdev);





Re: [Qemu-devel] [PATCH] pcie: simplify pcie_add_capability()

2017-02-15 Thread Peter Xu
On Thu, Feb 16, 2017 at 10:18:00AM +0800, Cao jin wrote:
> Hi peter
> 
> On 02/14/2017 03:51 PM, Peter Xu wrote:
> > When we add PCIe extended capabilities, we should be following the rule
> > that we add the head extended cap (at offset 0x100) first, then the rest
> > of them. Meanwhile, we are always adding new capability bits at the end
> > of the list. Here the "next" looks meaningless in all cases since it
> > should always be zero (along with the "header").
> > 
> > Simplify the function a bit, and it looks more readable now.
> > 
> 
> See if this suggestion could be incorporated into your patch:)
> http://lists.nongnu.org/archive/html/qemu-devel/2017-01/msg01418.html

Sure. But imho that's really trivial and as long as the assertions are
working correctly (no matter in which order) I can live with both. :)

Anyway, thanks for the pointer!

-- peterx



Re: [Qemu-devel] iommu emulation

2017-02-15 Thread Peter Xu
On Wed, Feb 15, 2017 at 11:15:52AM -0700, Alex Williamson wrote:

[...]

> > Alex, do you like something like below to fix above issue that Jintack
> > has encountered?
> > 
> > (note: this code is not meant to compile, it only tries to show what I mean...)
> > 
> > --8<---
> > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> > index 332f41d..4dca631 100644
> > --- a/hw/vfio/pci.c
> > +++ b/hw/vfio/pci.c
> > @@ -1877,25 +1877,6 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> >   */
> >  config = g_memdup(pdev->config, vdev->config_size);
> > 
> > -/*
> > - * Extended capabilities are chained with each pointing to the next, so we
> > - * can drop anything other than the head of the chain simply by modifying
> > - * the previous next pointer.  For the head of the chain, we can modify the
> > - * capability ID to something that cannot match a valid capability.  ID
> > - * 0 is reserved for this since absence of capabilities is indicated by
> > - * 0 for the ID, version, AND next pointer.  However, pcie_add_capability()
> > - * uses ID 0 as reserved for list management and will incorrectly match and
> > - * assert if we attempt to pre-load the head of the chain with this ID.
> > - * Use ID 0xFFFF temporarily since it is also seems to be reserved in
> > - * part for identifying absence of capabilities in a root complex register
> > - * block.  If the ID still exists after adding capabilities, switch back to
> > - * zero.  We'll mark this entire first dword as emulated for this purpose.
> > - */
> > -pci_set_long(pdev->config + PCI_CONFIG_SPACE_SIZE,
> > - PCI_EXT_CAP(0xFFFF, 0, 0));
> > -pci_set_long(pdev->wmask + PCI_CONFIG_SPACE_SIZE, 0);
> > -pci_set_long(vdev->emulated_config_bits + PCI_CONFIG_SPACE_SIZE, ~0);
> > -
> >  for (next = PCI_CONFIG_SPACE_SIZE; next;
> >   next = PCI_EXT_CAP_NEXT(pci_get_long(config + next))) {
> >  header = pci_get_long(config + next);
> > @@ -1917,6 +1898,8 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> >  switch (cap_id) {
> >  case PCI_EXT_CAP_ID_SRIOV: /* Read-only VF BARs confuse OVMF */
> >  case PCI_EXT_CAP_ID_ARI: /* XXX Needs next function virtualization */
> > +/* keep this ecap header (4 bytes), but mask cap_id to 0xffff */
> > +...
> >  trace_vfio_add_ext_cap_dropped(vdev->vbasedev.name, cap_id, next);
> >  break;
> >  default:
> > @@ -1925,11 +1908,6 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> > 
> >  }
> > 
> > -/* Cleanup chain head ID if necessary */
> > -if (pci_get_word(pdev->config + PCI_CONFIG_SPACE_SIZE) == 0xFFFF) {
> > -pci_set_word(pdev->config + PCI_CONFIG_SPACE_SIZE, 0);
> > -}
> > -
> >  g_free(config);
> >  return;
> >  }
> > --->8---
> > 
> > After all, we still need the assumption that 0xffff is reserved for
> > cap_id. With that, we can simply remove the "first 0xffff, then 0x0"
> > hack, which is imho error-prone.
> 
> This doesn't fix the bug, which is that pcie_add_capability() uses a
> valid capability ID for its own internal tracking.  It's only doing
> this to find the end of the capability chain, which we could do in a
> spec-compliant way by looking for a zero next pointer.  Fix that and
> then vfio doesn't need to do this set to 0xffff then back to zero
> nonsense at all.  Capability ID zero is valid.  Thanks,

Yeah, I see Michael's fix for the capability list stuff. However, imho
these are two different issues? Or rather, even with that patch, we
would still need this hack (first 0x0, then 0xffff), right? It
looks like that patch doesn't solve the problem if the first pcie ecap
(at 0x100) is masked.

Please correct me if I missed anything. Thanks,
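
For reference, the tail search Alex describes above (stop at the capability whose next pointer is zero) can be sketched without any reserved-ID trick. This is a standalone illustration, not QEMU code: the helper names (cfg_get_long, ext_cap_header, ext_cap_find_tail) and the flat byte-array model of config space are assumptions made for the example. Only the header layout (ID in bits 15:0, version in 19:16, next offset in 31:20) is taken from the PCIe extended capability format.

```c
#include <stdint.h>

#define CFG_SIZE      4096   /* PCIe extended config space size */
#define EXT_CAP_START 0x100  /* offset of the first extended capability */
#define EXT_CAP_MAX   ((CFG_SIZE - EXT_CAP_START) / 4)

/* Read a little-endian dword from the (hypothetical) config space array. */
static uint32_t cfg_get_long(const uint8_t *cfg, uint16_t off)
{
    return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/* Write a little-endian dword to the config space array. */
static void cfg_set_long(uint8_t *cfg, uint16_t off, uint32_t val)
{
    cfg[off] = val & 0xff;
    cfg[off + 1] = (val >> 8) & 0xff;
    cfg[off + 2] = (val >> 16) & 0xff;
    cfg[off + 3] = (val >> 24) & 0xff;
}

/* Build an extended capability header: ID, version, next offset. */
static uint32_t ext_cap_header(uint16_t id, uint8_t ver, uint16_t next)
{
    return (uint32_t)id | ((uint32_t)(ver & 0xf) << 16) |
           ((uint32_t)(next & 0xffc) << 20);
}

/* Extract the next-capability offset from a header. */
static uint16_t ext_cap_next(uint32_t header)
{
    return (header >> 20) & 0xffc;
}

/*
 * Return the offset of the last capability in the chain, i.e. the one
 * whose next pointer is zero.  The capability ID is never inspected, so
 * a head with ID 0 is handled like any other entry.  Returns 0 if the
 * chain is malformed (pointer below 0x100 or a loop).
 */
static uint16_t ext_cap_find_tail(const uint8_t *cfg)
{
    uint16_t off = EXT_CAP_START;

    for (int steps = 0; steps < EXT_CAP_MAX; steps++) {
        uint16_t next = ext_cap_next(cfg_get_long(cfg, off));

        if (next == 0) {
            return off;
        }
        if (next < EXT_CAP_START) {
            return 0;
        }
        off = next;
    }
    return 0;
}
```

With a search like this, a new capability would be linked by writing its offset into the next field of the tail, regardless of what ID the head carries.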

-- peterx



Re: [Qemu-devel] [PATCH 3/5] colo-compare: release all unhandled packets in finalize function

2017-02-15 Thread Zhang Chen



On 02/15/2017 04:34 PM, zhanghailiang wrote:

We should release all unhandled packets before finalizing colo-compare.
Besides, we need to free connection_track_table, or there will be
a memory leak.

Signed-off-by: zhanghailiang
---
  net/colo-compare.c | 20 
  1 file changed, 20 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index a16e2d5..809bad3 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -676,6 +676,23 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
  return;
  }
  


In my patch "colo-compare and filter-rewriter work with colo-frame" this
function is named 'colo_flush_connection'. I think 'flush' is a better name
than 'release'.


Thanks
Zhang Chen



+static void colo_release_packets(void *opaque, void *user_data)
+{
+CompareState *s = user_data;
+Connection *conn = opaque;
+Packet *pkt = NULL;
+
+while (!g_queue_is_empty(&conn->primary_list)) {
+pkt = g_queue_pop_head(&conn->primary_list);
+compare_chr_send(&s->chr_out, pkt->data, pkt->size);
+packet_destroy(pkt, NULL);
+}
+while (!g_queue_is_empty(&conn->secondary_list)) {
+pkt = g_queue_pop_head(&conn->secondary_list);
+packet_destroy(pkt, NULL);
+}
+}
+
  static void colo_compare_class_init(ObjectClass *oc, void *data)
  {
  UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
@@ -707,9 +724,12 @@ static void colo_compare_finalize(Object *obj)
  g_main_loop_quit(s->compare_loop);
  qemu_thread_join(&s->thread);
  
+/* Release all unhandled packets after compare thread exited */
+g_queue_foreach(&s->conn_list, colo_release_packets, s);
  
  g_queue_free(&s->conn_list);
  
+g_hash_table_destroy(s->connection_track_table);

  g_free(s->pri_indev);
  g_free(s->sec_indev);
  g_free(s->outdev);


--
Thanks
Zhang Chen






Re: [Qemu-devel] [PATCH 2/5] colo-compare: kick compare thread to exit while finalize

2017-02-15 Thread Zhang Chen



On 02/15/2017 04:34 PM, zhanghailiang wrote:

We should call g_main_loop_quit() to notify the colo compare thread to
exit, or it will run in g_main_loop_run() forever.

Besides, the finalizing process can't happen in the context of the colo
thread, so it is reasonable to remove the
'if (qemu_thread_is_self(&s->thread))' branch.

Signed-off-by: zhanghailiang 
---
  net/colo-compare.c | 19 +--
  1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index fdde788..a16e2d5 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -83,6 +83,8 @@ typedef struct CompareState {
  GHashTable *connection_track_table;
  /* compare thread, a thread for each NIC */
  QemuThread thread;
+
+GMainLoop *compare_loop;
  } CompareState;
  
  typedef struct CompareClass {

@@ -496,7 +498,6 @@ static gboolean check_old_packet_regular(void *opaque)
  static void *colo_compare_thread(void *opaque)
  {
  GMainContext *worker_context;
-GMainLoop *compare_loop;
  CompareState *s = opaque;
  GSource *timeout_source;
  
@@ -507,7 +508,7 @@ static void *colo_compare_thread(void *opaque)

  qemu_chr_fe_set_handlers(&s->chr_sec_in, compare_chr_can_read,
   compare_sec_chr_in, NULL, s, worker_context, true);
  
-compare_loop = g_main_loop_new(worker_context, FALSE);

+s->compare_loop = g_main_loop_new(worker_context, FALSE);
  
  /* To kick any packets that the secondary doesn't match */

  timeout_source = g_timeout_source_new(REGULAR_PACKET_CHECK_MS);
@@ -515,10 +516,10 @@ static void *colo_compare_thread(void *opaque)
(GSourceFunc)check_old_packet_regular, s, NULL);
  g_source_attach(timeout_source, worker_context);
  
-g_main_loop_run(compare_loop);

+g_main_loop_run(s->compare_loop);
  
  g_source_unref(timeout_source);

-g_main_loop_unref(compare_loop);
+g_main_loop_unref(s->compare_loop);
  g_main_context_unref(worker_context);
  return NULL;
  }
@@ -703,13 +704,11 @@ static void colo_compare_finalize(Object *obj)
  qemu_chr_fe_deinit(&s->chr_sec_in);
  qemu_chr_fe_deinit(&s->chr_out);
  
-g_queue_free(&s->conn_list);

+g_main_loop_quit(s->compare_loop);
+qemu_thread_join(&s->thread);
  
-if (qemu_thread_is_self(&s->thread)) {

-/* compare connection */
-g_queue_foreach(&s->conn_list, colo_compare_connection, s);
-qemu_thread_join(&s->thread);
-}


Before freeing 's->conn_list', you should flush all queued primary packets
and release all queued secondary packets here, so combining this patch
with patch 3/5 into one patch is a better choice.

Thanks
Zhang Chen


+
+g_queue_free(&s->conn_list);
  
  g_free(s->pri_indev);

  g_free(s->sec_indev);


--
Thanks
Zhang Chen






Re: [Qemu-devel] [PATCH] pcie: simplify pcie_add_capability()

2017-02-15 Thread Peter Xu
On Wed, Feb 15, 2017 at 04:25:05PM +0200, Marcel Apfelbaum wrote:
> On 02/14/2017 09:51 AM, Peter Xu wrote:
> >When we add PCIe extended capabilities, we should be following the rule
> >that we add the head extended cap (at offset 0x100) first, then the rest
> >of them. Meanwhile, we are always adding new capability bits at the end
> >of the list. Here the "next" looks meaningless in all cases since it
> >should always be zero (along with the "header").
> >
> >Simplify the function a bit, and it looks more readable now.
> >
> >Signed-off-by: Peter Xu 
> >---
> > hw/pci/pcie.c | 15 ---
> > 1 file changed, 4 insertions(+), 11 deletions(-)
> >
> >diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> >index cbd4bb4..e0e6f6a 100644
> >--- a/hw/pci/pcie.c
> >+++ b/hw/pci/pcie.c
> >@@ -664,30 +664,23 @@ void pcie_add_capability(PCIDevice *dev,
> >  uint16_t cap_id, uint8_t cap_ver,
> >  uint16_t offset, uint16_t size)
> > {
> >-uint32_t header;
> >-uint16_t next;
> >-
> > assert(offset >= PCI_CONFIG_SPACE_SIZE);
> > assert(offset < offset + size);
> > assert(offset + size <= PCIE_CONFIG_SPACE_SIZE);
> > assert(size >= 8);
> > assert(pci_is_express(dev));
> >
> >-if (offset == PCI_CONFIG_SPACE_SIZE) {
> >-header = pci_get_long(dev->config + offset);
> >-next = PCI_EXT_CAP_NEXT(header);
> >-} else {
> >+if (offset != PCI_CONFIG_SPACE_SIZE) {
> > uint16_t prev;
> >
> > /* 0 is reserved cap id. use internally to find the last capability
> >in the linked list */
> >-next = pcie_find_capability_list(dev, 0, &prev);
> >-
> >+assert(pcie_find_capability_list(dev, 0, &prev) == 0);
> 
> Hi Peter,
> 
> It is not recommended to use assert with an expression with side-effects.

Exactly. Thanks Marcel, I'll repost.
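
To spell out Marcel's point: when a build defines NDEBUG, the whole assert() expression is dropped, so a call that fills an out-parameter inside it never runs. A minimal sketch of the two shapes follows; find_list_tail() is a hypothetical stand-in for pcie_find_capability_list(), and the values it returns are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for pcie_find_capability_list(): returns the
 * offset of the matching capability (0 if none) and, as a side effect,
 * stores the offset of the last capability visited in *prev_p.
 */
static uint16_t find_list_tail(uint16_t *prev_p)
{
    *prev_p = 0x100;
    return 0;
}

/* Fragile: with -DNDEBUG the call disappears and prev stays 0. */
static uint16_t tail_via_assert(void)
{
    uint16_t prev = 0;

    assert(find_list_tail(&prev) == 0);
    return prev;
}

/* Robust: the call always runs; only the check is compiled out. */
static uint16_t tail_via_temporary(void)
{
    uint16_t prev = 0;
    uint16_t next = find_list_tail(&prev);

    assert(next == 0);
    (void)next; /* avoid an unused-variable warning under NDEBUG */
    return prev;
}
```

The second form keeps the lookup outside the assertion, which is the usual way to address this kind of review comment.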

-- peterx



Re: [Qemu-devel] [PATCH] pcie: simplify pcie_add_capability()

2017-02-15 Thread Cao jin
Hi peter

On 02/14/2017 03:51 PM, Peter Xu wrote:
> When we add PCIe extended capabilities, we should be following the rule
> that we add the head extended cap (at offset 0x100) first, then the rest
> of them. Meanwhile, we are always adding new capability bits at the end
> of the list. Here the "next" looks meaningless in all cases since it
> should always be zero (along with the "header").
> 
> Simplify the function a bit, and it looks more readable now.
> 

See if this suggestion could be incorporated into your patch:)
http://lists.nongnu.org/archive/html/qemu-devel/2017-01/msg01418.html

-- 
Sincerely,
Cao jin

> Signed-off-by: Peter Xu 
> ---
>  hw/pci/pcie.c | 15 ---
>  1 file changed, 4 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> index cbd4bb4..e0e6f6a 100644
> --- a/hw/pci/pcie.c
> +++ b/hw/pci/pcie.c
> @@ -664,30 +664,23 @@ void pcie_add_capability(PCIDevice *dev,
>   uint16_t cap_id, uint8_t cap_ver,
>   uint16_t offset, uint16_t size)
>  {
> -uint32_t header;
> -uint16_t next;
> -
>  assert(offset >= PCI_CONFIG_SPACE_SIZE);
>  assert(offset < offset + size);
>  assert(offset + size <= PCIE_CONFIG_SPACE_SIZE);
>  assert(size >= 8);
>  assert(pci_is_express(dev));
>  
> -if (offset == PCI_CONFIG_SPACE_SIZE) {
> -header = pci_get_long(dev->config + offset);
> -next = PCI_EXT_CAP_NEXT(header);
> -} else {
> +if (offset != PCI_CONFIG_SPACE_SIZE) {
>  uint16_t prev;
>  
>  /* 0 is reserved cap id. use internally to find the last capability
> in the linked list */
> -next = pcie_find_capability_list(dev, 0, &prev);
> -
> +assert(pcie_find_capability_list(dev, 0, &prev) == 0);
>  assert(prev >= PCI_CONFIG_SPACE_SIZE);
> -assert(next == 0);
>  pcie_ext_cap_set_next(dev, prev, offset);
>  }
> -pci_set_long(dev->config + offset, PCI_EXT_CAP(cap_id, cap_ver, next));
> +
> +pci_set_long(dev->config + offset, PCI_EXT_CAP(cap_id, cap_ver, 0));
>  
>  /* Make capability read-only by default */
>  memset(dev->wmask + offset, 0, size);
> 







Re: [Qemu-devel] [PATCH v3 15/16] target-m68k: add more FPU instructions

2017-02-15 Thread Richard Henderson

On 02/07/2017 11:59 AM, Laurent Vivier wrote:

+static long double floatx80_to_ldouble(floatx80 val)
+{
+if (floatx80_is_infinity(val)) {
+if (floatx80_is_neg(val)) {
+return -__builtin_infl();
+}
+return __builtin_infl();
+}
+if (floatx80_is_any_nan(val)) {
+char low[20];
+sprintf(low, "0x%016"PRIx64, val.low);
+
+return nanl(low);
+}
+
+return *(long double *)&val;
+}


This doesn't work except for x86 host.

You ought to extract the mantissa, convert the 64-bit value to long double, and
use ldexpl to scale the result by the exponent.


Similarly, when converting the other way, use frexpl and ldexpl.
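
A rough, host-independent sketch of that suggestion is below. It is not the code from the patch: the fx80 struct is a hypothetical stand-in for softfloat's floatx80 (64-bit mantissa with explicit integer bit in low, sign and 15-bit biased exponent in high), and denormals are assumed to follow the x87 convention. On hosts where long double is narrower than 64 mantissa bits the conversion will round.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical stand-in for floatx80. */
typedef struct {
    uint64_t low;   /* mantissa, explicit integer bit in bit 63 */
    uint16_t high;  /* sign (bit 15) and biased exponent (bits 14:0) */
} fx80;

#define FX80_BIAS 16383

static long double fx80_to_ldouble(fx80 v)
{
    int negative = v.high >> 15;
    int exp = v.high & 0x7fff;
    long double r;

    if (exp == 0x7fff) {
        /* All-ones exponent: infinity if the fraction is zero, else NaN. */
        r = (v.low << 1) == 0 ? (long double)INFINITY : (long double)NAN;
    } else {
        if (exp == 0) {
            exp = 1; /* assumption: x87-style denormal exponent */
        }
        /* value = mantissa * 2^(exp - bias - 63) */
        r = ldexpl((long double)v.low, exp - FX80_BIAS - 63);
    }
    return negative ? -r : r;
}

static fx80 ldouble_to_fx80(long double x)
{
    fx80 v = { 0, (uint16_t)(signbit(x) ? 0x8000 : 0) };

    if (isnan(x)) {
        v.high |= 0x7fff;
        v.low = UINT64_C(0xc000000000000000); /* arbitrary quiet NaN */
    } else if (isinf(x)) {
        v.high |= 0x7fff;
        v.low = UINT64_C(0x8000000000000000);
    } else if (x != 0) {
        int e;
        long double m = frexpl(fabsl(x), &e); /* m in [0.5, 1) */
        int biased = e - 1 + FX80_BIAS;

        if (biased > 0) {
            v.low = (uint64_t)ldexpl(m, 64);  /* 2^63 <= low < 2^64 */
            v.high |= (uint16_t)biased;
        } else {
            /* Too small for a normal: store as a denormal (exponent 0). */
            v.low = (uint64_t)ldexpl(m, 63 + biased);
        }
    }
    return v;
}
```

Because only ldexpl and frexpl touch the host representation, nothing here depends on long double having the x87 layout.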


r~



Re: [Qemu-devel] [PATCH v3 14/16] target-m68k: add explicit single and double precision operations

2017-02-15 Thread Richard Henderson

On 02/07/2017 11:59 AM, Laurent Vivier wrote:

+case 0: /* fmove */
+break;
+case 0x40: /* fsmove */
+gen_helper_redf32_FP0(cpu_env);
+gen_helper_extf32_FP0(cpu_env);
+break;
+case 0x44: /* fdmove */
+gen_helper_redf64_FP0(cpu_env);
+gen_helper_extf64_FP0(cpu_env);
 break;


This is going to produce double-rounding errors.  Better to properly set the 
rounding precision first and convert once.



r~



Re: [Qemu-devel] [PATCH v3 13/16] target-m68k: add fsglmul and fsgldiv

2017-02-15 Thread Richard Henderson

On 02/07/2017 11:59 AM, Laurent Vivier wrote:

fsglmul and fsgldiv truncate data to single precision before computing
results.

Signed-off-by: Laurent Vivier 
---
 target/m68k/fpu_helper.c | 22 ++
 target/m68k/helper.h |  2 ++
 target/m68k/translate.c  |  8 
 3 files changed, 32 insertions(+)

diff --git a/target/m68k/fpu_helper.c b/target/m68k/fpu_helper.c
index 42f5b5c..8a3eed3 100644
--- a/target/m68k/fpu_helper.c
+++ b/target/m68k/fpu_helper.c
@@ -351,6 +351,17 @@ void HELPER(mul_FP0_FP1)(CPUM68KState *env)
 floatx80_to_FP0(env, res);
 }

+void HELPER(sglmul_FP0_FP1)(CPUM68KState *env)
+{
+float64 a, b, res;
+
+a = floatx80_to_float64(FP0_to_floatx80(env), &env->fp_status);
+b = floatx80_to_float64(FP1_to_floatx80(env), &env->fp_status);


s/float64/float32/g

Kinda sorta, probably close enough.  The manual says the resulting exponent may 
be out of range.  Which means this will produce +Inf in cases HW won't.



r~



Re: [Qemu-devel] [PATCH v3 12/16] target-m68k: add fscale, fgetman, fgetexp and fmod

2017-02-15 Thread Richard Henderson

On 02/07/2017 11:59 AM, Laurent Vivier wrote:

Signed-off-by: Laurent Vivier 
---
 target/m68k/cpu.h|  1 +
 target/m68k/fpu_helper.c | 56 
 target/m68k/helper.h |  4 
 target/m68k/translate.c  | 14 
 4 files changed, 75 insertions(+)

diff --git a/target/m68k/cpu.h b/target/m68k/cpu.h
index 7985dc3..3042ab7 100644
--- a/target/m68k/cpu.h
+++ b/target/m68k/cpu.h
@@ -253,6 +253,7 @@ typedef enum {
 /* Quotient */

 #define FPSR_QT_MASK  0x00ff0000
+#define FPSR_QT_SHIFT 16

 /* Floating-Point Control Register */
 /* Rounding mode */
diff --git a/target/m68k/fpu_helper.c b/target/m68k/fpu_helper.c
index d8145e0..42f5b5c 100644
--- a/target/m68k/fpu_helper.c
+++ b/target/m68k/fpu_helper.c
@@ -458,3 +458,59 @@ void HELPER(const_FP0)(CPUM68KState *env, uint32_t offset)
 env->fp0l = fpu_rom[offset].low;
 env->fp0h = fpu_rom[offset].high;
 }
+
+void HELPER(getexp_FP0)(CPUM68KState *env)
+{
+int32_t exp;
+floatx80 res;
+
+res = FP0_to_floatx80(env);
+if (floatx80_is_zero_or_denormal(res) || floatx80_is_any_nan(res) ||
+floatx80_is_infinity(res)) {
+return;
+}
+exp = (env->fp0h & 0x7fff) - 0x3fff;
+
+res = int32_to_floatx80(exp, &env->fp_status);
+
+floatx80_to_FP0(env, res);


Failure to raise OPERR for infinities?


+void HELPER(getman_FP0)(CPUM68KState *env)
+{
+floatx80 res;
+res = int64_to_floatx80(env->fp0l, &env->fp_status);
+floatx80_to_FP0(env, res);
+}


This seems completely wrong.  (1) NaN gets returned, (2) Inf raises OPERR, (3) 
Normal values return something in the range [1.0 ... 2.0).  Which means you 
should just force the exponent rather than convert the low part.
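
As an illustration of "force the exponent", here is a minimal sketch covering the three cases listed above. It is not the helper from the patch: fx80 is a hypothetical stand-in for floatx80, the boolean return value is an invented way of telling the caller to raise OPERR, and zero and denormal inputs are simply left untouched (a denormal would need normalizing first).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for floatx80. */
typedef struct {
    uint64_t low;   /* mantissa, explicit integer bit in bit 63 */
    uint16_t high;  /* sign (bit 15) and biased exponent (bits 14:0) */
} fx80;

/*
 * Replace *v by its mantissa, a value in [1.0, 2.0) with the same sign.
 * Returns true when the caller should raise OPERR (infinite input).
 */
static bool fx80_getman(fx80 *v)
{
    uint16_t exp = v->high & 0x7fff;

    if (exp == 0x7fff) {
        if ((v->low << 1) != 0) {
            return false;   /* NaN: returned unchanged */
        }
        return true;        /* infinity: operand error */
    }
    if (exp == 0) {
        return false;       /* zero or denormal: not handled in this sketch */
    }
    /* Keep sign and mantissa bits, force the exponent to the bias (2^0). */
    v->high = (v->high & 0x8000) | 0x3fff;
    return false;
}
```

For example, -12.0 (-1.5 * 2^3) comes back as -1.5 with the mantissa bits unchanged.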



+
+void HELPER(scale_FP0_FP1)(CPUM68KState *env)
+{
+int32_t scale;
+int32_t exp;
+
+scale = floatx80_to_int32(FP0_to_floatx80(env), &env->fp_status);
+
+exp = (env->fp1h & 0x7fff) + scale;
+
+env->fp0h = (env->fp1h & 0x8000) | (exp & 0x7fff);
+env->fp0l = env->fp1l;
+}


Missing handling for NaN, Inf, 0, denormal.


r~



Re: [Qemu-devel] [Help] Windows2012 as Guest 64+cores on KVM Halts

2017-02-15 Thread Gonglei (Arei)
Hi,

> 
> On Sat, 2017-02-11 at 10:39 -0500, Paolo Bonzini wrote:
> > >
> > >
> > > >
> > > > On 10/02/2017 10:31, Gonglei (Arei) wrote:
> > > > >
> > > > > But We tested the same cases on Xen platform and VMware, and
> > > > > the guest booted successfully.
> > > >
> > > > Were these two also tested with enlightenments enabled?  TCG
> > > > surely isn't.
> > >
> > > About TCG, I just removed 'accel=kvm,' and 'hv_relaxed' from the
> > > QEMU command line below; I thought Hyper-V was still enabled then.
> > > Sorry about that.
> > >
> > > But for Xen, we set 'viridian=1', which we take to mean that Hyper-V
> > > is enabled.
> > >
> > > For VMware we also enabled the Hyper-V enlightenments.
> If I'm not mistaken, even Hyper-V server doesn't allow specifying more
> than 64 vCPUs for Generation 1 VMs.

Normally yes, but I found an explanation in Microsoft's documentation:

Maximum Supported Virtual Processors

On Windows operating systems versions through Windows Server 2008 R2, 
reporting the HV#1 hypervisor interface limits the Windows virtual machine 
to a maximum of 64 VPs, regardless of what is reported via CPUID.40000005.EAX.
Starting with Windows Server 2012 and Windows 8, if CPUID.40000005.EAX 
contains a value of -1, Windows assumes that the hypervisor imposes no specific
limit to the number of VPs. In this case, Windows Server 2012 guest VMs may
use more than 64 VPs, up to the maximum supported number of processors 
applicable to the specific Windows version being used.

Link: 
https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs

"Requirements for Implementing the Microsoft Hypervisor Interface"
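
The rule quoted above can be written down as a small decision function. This is only a sketch of my reading of that text: the function and parameter names are invented, and the last branch (treating any other EAX value as the limit) is an assumption, since the quoted passage only defines the -1 case.

```c
#include <stdbool.h>
#include <stdint.h>

#define HV_LEGACY_VP_LIMIT 64u
#define HV_NO_VP_LIMIT     UINT32_MAX   /* EAX == -1 */

/*
 * Number of virtual processors a Windows guest would use, given the
 * value the hypervisor reports in the implementation-limits leaf (EAX)
 * and the maximum the Windows version itself supports.
 */
static uint32_t guest_vp_limit(bool ws2012_or_later, uint32_t limits_eax,
                               uint32_t os_max_cpus)
{
    uint32_t hv_limit;

    if (!ws2012_or_later) {
        /* Up to WS2008 R2: always 64, whatever EAX says. */
        hv_limit = HV_LEGACY_VP_LIMIT;
    } else if (limits_eax == HV_NO_VP_LIMIT) {
        /* WS2012 and later: no hypervisor-imposed limit. */
        return os_max_cpus;
    } else {
        /* Assumption: any other value is taken as the limit. */
        hv_limit = limits_eax;
    }
    return hv_limit < os_max_cpus ? hv_limit : os_max_cpus;
}
```

Under this reading, reporting 0x40 keeps a WS2012 guest at 64 VPs, while reporting -1 lets it go up to its own maximum.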

And the patch below works for me: I can run WS2012 with up to 255 vCPUs
with Hyper-V enlightenments enabled.

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 27fd050..efe3cbc 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -772,7 +772,7 @@ int kvm_arch_init_vcpu(CPUState *cs)

 c = &cpuid_data.entries[cpuid_i++];
 c->function = HYPERV_CPUID_IMPLEMENT_LIMITS;
-c->eax = 0x40;
+c->eax = -1;
 c->ebx = 0x40;

 kvm_base = KVM_CPUID_SIGNATURE_NEXT;

> In any case, if you are only interested in hv_relaxed, you can drop it
> off for WS2012 as long as you have cpu hypervisor flag
> (CPUID.1:ECX [bit 31]=1) turned on.
> 
hv_relaxed is just an example of enabling Hyper-V enlightenments.

Thanks,
-Gonglei


Re: [Qemu-devel] [virtio-dev] Re: [virtio-dev] [PATCH v16 1/2] virtio-crypto: Add virtio crypto device specification

2017-02-15 Thread Gonglei (Arei)
Hi Halil,

> 
> On 02/09/2017 03:29 AM, Gonglei (Arei) wrote:
> [..]
> > Oh, so much work needs to be done.
> >
> > Halil, would you mind working together with me to perfect the spec?
> > And feel free to add your signed-off-by tag. :)
> >
> > TBH as a non-native English speaker, it's more difficult writing a
> > spec than coding. :(
> >
> > Look forward to your reply.
> >
> 
> 
> First, sorry for the long delay -- was busy and then ill. Thank you

I hope you feel better now.

> very much for your offer. I would prefer continuing as a reviewer,
> but I would very much appreciate if you could add me to the
> 'Acknowledgments' appendix as a part of your patch ;). 

No problem, I can do that.

> Unfortunately
> I do not have the time now to allocate significantly more time for
> this. I'm also having difficulty thinking of another way of working
> efficiently together on this than what we already do. I can try to
> provide more suggestions in terms of formulation, but it's still
> just review. Thank you very much!
> 

OK, thank you, I understand. Like you, I am busy fixing bugs in internal
projects and with other high-priority work, so recently I have had little
time for the spec. :( But I'll get back to it once I have time again.

Thanks,
-Gonglei




Re: [Qemu-devel] [PATCH v3 07/16] target-m68k: manage FPU exceptions

2017-02-15 Thread Richard Henderson

On 02/07/2017 11:59 AM, Laurent Vivier wrote:

Signed-off-by: Laurent Vivier 
---
 target/m68k/cpu.h|  28 +
 target/m68k/fpu_helper.c | 107 ++-
 target/m68k/helper.h |   1 +
 target/m68k/translate.c  |  27 
 4 files changed, 162 insertions(+), 1 deletion(-)

diff --git a/target/m68k/cpu.h b/target/m68k/cpu.h
index 6b3cb26..7985dc3 100644
--- a/target/m68k/cpu.h
+++ b/target/m68k/cpu.h
@@ -57,6 +57,15 @@
 #define EXCP_TRAP15         47   /* User trap #15.  */
 #define EXCP_UNSUPPORTED    61
 #define EXCP_ICE            13
+#define EXCP_FP_BSUN        48 /* Branch Set on Unordered */
+#define EXCP_FP_INEX        49 /* Inexact result */
+#define EXCP_FP_DZ          50 /* Divide by Zero */
+#define EXCP_FP_UNFL        51 /* Underflow */
+#define EXCP_FP_OPERR       52 /* Operand Error */
+#define EXCP_FP_OVFL        53 /* Overflow */
+#define EXCP_FP_SNAN        54 /* Signaling Not-A-Number */
+#define EXCP_FP_UNIMP       55 /* Unimplemented Data type */
+

 #define EXCP_RTE            0x100
 #define EXCP_HALT_INSN      0x101
@@ -222,6 +231,25 @@ typedef enum {
 #define FPSR_CC_Z 0x04000000 /* Zero */
 #define FPSR_CC_N 0x08000000 /* Negative */

+/* Exception Status */
+#define FPSR_ES_MASK  0xff00
+#define FPSR_ES_BSUN  0x8000 /* Branch Set on Unordered */
+#define FPSR_ES_SNAN  0x4000 /* Signaling Not-A-Number */
+#define FPSR_ES_OPERR 0x2000 /* Operand Error */
+#define FPSR_ES_OVFL  0x1000 /* Overflow */
+#define FPSR_ES_UNFL  0x0800 /* Underflow */
+#define FPSR_ES_DZ    0x0400 /* Divide by Zero */
+#define FPSR_ES_INEX2 0x0200 /* Inexact operation */
+#define FPSR_ES_INEX  0x0100 /* Inexact decimal input */
+
+/* Accrued Exception */
+#define FPSR_AE_MASK  0x00ff
+#define FPSR_AE_IOP   0x0080 /* Invalid Operation */
+#define FPSR_AE_OVFL  0x0040 /* Overflow */
+#define FPSR_AE_UNFL  0x0020 /* Underflow */
+#define FPSR_AE_DZ    0x0010 /* Divide by Zero */
+#define FPSR_AE_INEX  0x0008 /* Inexact */
+
 /* Quotient */

 #define FPSR_QT_MASK  0x00ff0000
diff --git a/target/m68k/fpu_helper.c b/target/m68k/fpu_helper.c
index 9d39118..1e68c41 100644
--- a/target/m68k/fpu_helper.c
+++ b/target/m68k/fpu_helper.c
@@ -177,6 +177,70 @@ static void restore_rounding_mode(CPUM68KState *env)
 }
 }

+static void set_fpsr_exception(CPUM68KState *env)
+{
+uint32_t fpsr = 0;
+int flags;
+
+flags = get_float_exception_flags(&env->fp_status);
+if (flags == 0) {
+return;
+}
+set_float_exception_flags(0, &env->fp_status);
+
+if (flags & float_flag_invalid) {
+fpsr |= FPSR_AE_IOP;
+}
+if (flags & float_flag_divbyzero) {
+fpsr |= FPSR_AE_DZ;
+}
+if (flags & float_flag_overflow) {
+fpsr |= FPSR_AE_OVFL;
+}
+if (flags & float_flag_underflow) {
+fpsr |= FPSR_AE_UNFL;
+}
+if (flags & float_flag_inexact) {
+fpsr |= FPSR_AE_INEX;
+}
+
+env->fpsr = (env->fpsr & ~FPSR_AE_MASK) | fpsr;
+}
+
+static void fpu_exception(CPUM68KState *env, uint32_t exception)
+{
+CPUState *cs = CPU(m68k_env_get_cpu(env));
+
+env->fpsr = (env->fpsr & ~FPSR_ES_MASK) | exception;
+if (env->fpcr & exception) {


What are you trying to do here?  This test is obviously true if exception != 0.


+switch (exception) {
+case FPSR_ES_BSUN:
+cs->exception_index = EXCP_FP_BSUN;
+break;
+case FPSR_ES_SNAN:
+cs->exception_index = EXCP_FP_SNAN;
+break;
+case FPSR_ES_OPERR:
+cs->exception_index = EXCP_FP_OPERR;
+break;
+case FPSR_ES_OVFL:
+cs->exception_index = EXCP_FP_OVFL;
+break;
+case FPSR_ES_UNFL:
+cs->exception_index = EXCP_FP_UNFL;
+break;
+case FPSR_ES_DZ:
+cs->exception_index = EXCP_FP_DZ;
+break;
+case FPSR_ES_INEX:
+case FPSR_ES_INEX2:
+cs->exception_index = EXCP_FP_INEX;
+break;
+}
+cpu_loop_exit_restore(cs, GETPC());


GETPC must be invoked from the outer-most handler.  You need to pass this in 
from the callers.



+}
+}
+
 void cpu_m68k_set_fpcr(CPUM68KState *env, uint32_t val)
 {
 env->fpcr = val & 0xffff;
@@ -292,10 +356,16 @@ void HELPER(cmp_FP0_FP1)(CPUM68KState *env)
 {
 floatx80 fp0 = FP0_to_floatx80(env);
 floatx80 fp1 = FP1_to_floatx80(env);
-int float_compare;
+int flags, float_compare;

 float_compare = floatx80_compare(fp1, fp0, &env->fp_status);
 env->fpsr = (env->fpsr & ~FPSR_CC_MASK) | float_comp_to_cc(float_compare);
+
+flags = get_float_exception_flags(&env->fp_status);
+if (flags & float_flag_invalid) {
+fpu_exception(env, FPSR_ES_OPERR);
+   }
+   set_fpsr_exception(env);
 }

 void HELPER(tst_FP0)(CPUM68KState *env)
@@ -315,4 +385,39 @@ void 

[Qemu-devel] [PATCH v5 8/8] hw/mips: MIPS Boston board support

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

Introduce support for emulating the MIPS Boston development board. The
Boston board is built around an FPGA & 3 PCIe controllers, one of which
is connected to an Intel EG20T Platform Controller Hub. It is used
during the development & debug of new CPUs and the software intended to
run on them, and is essentially the successor to the older MIPS Malta
board.

This patch does not implement the EG20T, instead connecting an already
supported ICH-9 AHCI controller. Whilst this isn't accurate it's enough
for typical stock Boston software (eg. Linux kernels) to work with hard
disks given that both the ICH-9 & EG20T implement the AHCI
specification.

Boston boards typically boot kernels in the FIT image format, and this
patch will treat kernels provided to QEMU as such. When loading a kernel
directly, the board code will generate minimal firmware much as the
Malta board code does. This firmware will set up the CM, CPC & GIC
register base addresses then set argument registers & jump to the kernel
entry point. Alternatively, bootloader code may be loaded using the bios
argument in which case no firmware will be generated & execution will
proceed from the start of the boot code at the default MIPS boot
exception vector (offset 0x1fc00000 into (c)kseg1).

Currently real Boston boards are always used with FPGA bitfiles that
include a Global Interrupt Controller (GIC), so the interrupt
configuration is only defined for such cases. Therefore the board will
only allow use of CPUs which implement the CPS components, including the
GIC, and will otherwise exit with a message.

Signed-off-by: Paul Burton 
Reviewed-by: Yongbok Kim 
[yongbok@imgtec.com:
  isolated boston machine support for mips64el.
  updated for recent Chardev changes.
  ignore missing bios/kernel for qtest.]
Signed-off-by: Yongbok Kim 
---
 configure|   2 +-
 default-configs/mips64el-softmmu.mak |   2 +
 hw/mips/Makefile.objs|   1 +
 hw/mips/boston.c | 576 +++
 4 files changed, 580 insertions(+), 1 deletion(-)
 create mode 100644 hw/mips/boston.c

diff --git a/configure b/configure
index 4b68861..8e8f18d 100755
--- a/configure
+++ b/configure
@@ -3378,7 +3378,7 @@ fi
 fdt_required=no
 for target in $target_list; do
   case $target in
-aarch64*-softmmu|arm*-softmmu|ppc*-softmmu|microblaze*-softmmu)
+aarch64*-softmmu|arm*-softmmu|ppc*-softmmu|microblaze*-softmmu|mips64el-softmmu)
   fdt_required=yes
 ;;
   esac
diff --git a/default-configs/mips64el-softmmu.mak b/default-configs/mips64el-softmmu.mak
index 485e218..cc5f3b3 100644
--- a/default-configs/mips64el-softmmu.mak
+++ b/default-configs/mips64el-softmmu.mak
@@ -10,3 +10,5 @@ CONFIG_JAZZ=y
 CONFIG_G364FB=y
 CONFIG_JAZZ_LED=y
 CONFIG_VT82C686=y
+CONFIG_MIPS_BOSTON=y
+CONFIG_PCI_XILINX=y
diff --git a/hw/mips/Makefile.objs b/hw/mips/Makefile.objs
index 9352a1c..48cd2ef 100644
--- a/hw/mips/Makefile.objs
+++ b/hw/mips/Makefile.objs
@@ -4,3 +4,4 @@ obj-$(CONFIG_JAZZ) += mips_jazz.o
 obj-$(CONFIG_FULONG) += mips_fulong2e.o
 obj-y += gt64xxx_pci.o
 obj-$(CONFIG_MIPS_CPS) += cps.o
+obj-$(CONFIG_MIPS_BOSTON) += boston.o
diff --git a/hw/mips/boston.c b/hw/mips/boston.c
new file mode 100644
index 0000000..560c8b4
--- /dev/null
+++ b/hw/mips/boston.c
@@ -0,0 +1,576 @@
+/*
+ * MIPS Boston development board emulation.
+ *
+ * Copyright (c) 2016 Imagination Technologies
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+
+#include "exec/address-spaces.h"
+#include "hw/boards.h"
+#include "hw/char/serial.h"
+#include "hw/hw.h"
+#include "hw/ide/pci.h"
+#include "hw/ide/ahci.h"
+#include "hw/loader.h"
+#include "hw/loader-fit.h"
+#include "hw/mips/cps.h"
+#include "hw/mips/cpudevs.h"
+#include "hw/pci-host/xilinx-pcie.h"
+#include "qapi/error.h"
+#include "qemu/cutils.h"
+#include "qemu/error-report.h"
+#include "qemu/log.h"
+#include "sysemu/char.h"
+#include "sysemu/device_tree.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/qtest.h"
+
+#include <libfdt.h>
+
+#define TYPE_MIPS_BOSTON "mips-boston"
+#define BOSTON(obj) OBJECT_CHECK(BostonState, (obj), TYPE_MIPS_BOSTON)
+
+typedef struct {
+SysBusDevice parent_obj;
+
+

Re: [Qemu-devel] [PATCH 14/17] qmp: add x-debug-block-dirty-bitmap-sha256

2017-02-15 Thread John Snow


On 02/13/2017 04:54 AM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Max Reitz 

This is simply the same as the version in the other two series, right?

Reviewed-by: John Snow 

> ---
>  block/dirty-bitmap.c |  5 +
>  blockdev.c   | 29 +
>  include/block/dirty-bitmap.h |  2 ++
>  include/qemu/hbitmap.h   |  8 
>  qapi/block-core.json | 27 +++
>  tests/Makefile.include   |  2 +-
>  util/hbitmap.c   | 11 +++
>  7 files changed, 83 insertions(+), 1 deletion(-)
> 
> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
> index 32aa6eb..5bec99b 100644
> --- a/block/dirty-bitmap.c
> +++ b/block/dirty-bitmap.c
> @@ -558,3 +558,8 @@ BdrvDirtyBitmap *bdrv_next_dirty_bitmap(BlockDriverState 
> *bs,
>  
>  return QLIST_NEXT(bitmap, list);
>  }
> +
> +char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp)
> +{
> +return hbitmap_sha256(bitmap->bitmap, errp);
> +}
> diff --git a/blockdev.c b/blockdev.c
> index db82ac9..4d06885 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -2790,6 +2790,35 @@ void qmp_block_dirty_bitmap_clear(const char *node, 
> const char *name,
>  aio_context_release(aio_context);
>  }
>  
> +BlockDirtyBitmapSha256 *qmp_x_debug_block_dirty_bitmap_sha256(const char *node,
> +  const char *name,
> +  Error **errp)
> +{
> +AioContext *aio_context;
> +BdrvDirtyBitmap *bitmap;
> +BlockDriverState *bs;
> +BlockDirtyBitmapSha256 *ret = NULL;
> +char *sha256;
> +
> +bitmap = block_dirty_bitmap_lookup(node, name, &bs, &aio_context, errp);
> +if (!bitmap || !bs) {
> +return NULL;
> +}
> +
> +sha256 = bdrv_dirty_bitmap_sha256(bitmap, errp);
> +if (sha256 == NULL) {
> +goto out;
> +}
> +
> +ret = g_new(BlockDirtyBitmapSha256, 1);
> +ret->sha256 = sha256;
> +
> +out:
> +aio_context_release(aio_context);
> +
> +return ret;
> +}
> +
>  void hmp_drive_del(Monitor *mon, const QDict *qdict)
>  {
>  const char *id = qdict_get_str(qdict, "id");
> diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
> index 20b3ec7..ded872a 100644
> --- a/include/block/dirty-bitmap.h
> +++ b/include/block/dirty-bitmap.h
> @@ -78,4 +78,6 @@ void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap 
> *bitmap);
>  BdrvDirtyBitmap *bdrv_next_dirty_bitmap(BlockDriverState *bs,
>  BdrvDirtyBitmap *bitmap);
>  
> +char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp);
> +
>  #endif
> diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
> index 9239fe5..f353e56 100644
> --- a/include/qemu/hbitmap.h
> +++ b/include/qemu/hbitmap.h
> @@ -238,6 +238,14 @@ void hbitmap_deserialize_zeroes(HBitmap *hb, uint64_t 
> start, uint64_t count,
>  void hbitmap_deserialize_finish(HBitmap *hb);
>  
>  /**
> + * hbitmap_sha256:
> + * @bitmap: HBitmap to operate on.
> + *
> + * Returns SHA256 hash of the last level.
> + */
> +char *hbitmap_sha256(const HBitmap *bitmap, Error **errp);
> +
> +/**
>   * hbitmap_free:
>   * @hb: HBitmap to operate on.
>   *
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 932f5bb..8646054 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1632,6 +1632,33 @@
>'data': 'BlockDirtyBitmap' }
>  
>  ##
> +# @BlockDirtyBitmapSha256:
> +#
> +# SHA256 hash of dirty bitmap data
> +#
> +# @sha256: ASCII representation of SHA256 bitmap hash
> +#
> +# Since: 2.9
> +##
> +  { 'struct': 'BlockDirtyBitmapSha256',
> +'data': {'sha256': 'str'} }
> +
> +##
> +# @x-debug-block-dirty-bitmap-sha256:
> +#
> +# Get bitmap SHA256
> +#
> +# Returns: BlockDirtyBitmapSha256 on success
> +#  If @node is not a valid block device, DeviceNotFound
> +#  If @name is not found or if hashing has failed, GenericError with an
> +#  explanation
> +#
> +# Since: 2.9
> +##
> +  { 'command': 'x-debug-block-dirty-bitmap-sha256',
> +'data': 'BlockDirtyBitmap', 'returns': 'BlockDirtyBitmapSha256' }
> +
> +##
>  # @blockdev-mirror:
>  #
>  # Start mirroring a block device's writes to a new destination.
> diff --git a/tests/Makefile.include b/tests/Makefile.include
> index 634394a..7a71b4d 100644
> --- a/tests/Makefile.include
> +++ b/tests/Makefile.include
> @@ -526,7 +526,7 @@ tests/test-blockjob$(EXESUF): tests/test-blockjob.o 
> $(test-block-obj-y) $(test-u
>  tests/test-blockjob-txn$(EXESUF): tests/test-blockjob-txn.o 
> $(test-block-obj-y) $(test-util-obj-y)
>  tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(test-block-obj-y)
>  tests/test-iov$(EXESUF): tests/test-iov.o $(test-util-obj-y)
> -tests/test-hbitmap$(EXESUF): 

[Qemu-devel] [PATCH v5 5/8] dtc: Update requirement to v1.4.2

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

In order to obtain fdt_first_subnode & fdt_next_subnode symbols from
libfdt for use by a later patch, bump the requirement for dtc to v1.4.2
& the submodule to that same version.

Signed-off-by: Paul Burton 
Reviewed-by: Yongbok Kim 
Signed-off-by: Yongbok Kim 
---
 configure | 6 +++---
 dtc   | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index 1c9655e..4b68861 100755
--- a/configure
+++ b/configure
@@ -3396,11 +3396,11 @@ fi
 if test "$fdt" != "no" ; then
   fdt_libs="-lfdt"
   # explicitly check for libfdt_env.h as it is missing in some stable installs
-  # and test for required functions to make sure we are on a version >= 1.4.0
+  # and test for required functions to make sure we are on a version >= 1.4.2
   cat > $TMPC << EOF
 #include 
 #include 
-int main(void) { fdt_get_property_by_offset(0, 0, 0); return 0; }
+int main(void) { fdt_first_subnode(0, 0); return 0; }
 EOF
   if compile_prog "" "$fdt_libs" ; then
 # system DTC is good - use it
@@ -3418,7 +3418,7 @@ EOF
 fdt_libs="-L\$(BUILD_DIR)/dtc/libfdt $fdt_libs"
   elif test "$fdt" = "yes" ; then
 # have neither and want - prompt for system/submodule install
-error_exit "DTC (libfdt) version >= 1.4.0 not present. Your options:" \
+error_exit "DTC (libfdt) version >= 1.4.2 not present. Your options:" \
 "  (1) Preferred: Install the DTC (libfdt) devel package" \
 "  (2) Fetch the DTC submodule, using:" \
 "  git submodule update --init dtc"
diff --git a/dtc b/dtc
index 65cc4d2..ec02b34 16
--- a/dtc
+++ b/dtc
@@ -1 +1 @@
-Subproject commit 65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf
+Subproject commit ec02b34c05be04f249ffaaca4b666f5246877dea
-- 
2.7.4




[Qemu-devel] [PATCH v5 7/8] hw: xilinx-pcie: Add support for Xilinx AXI PCIe Controller

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

Add support for emulating the Xilinx AXI Root Port Bridge for PCI
Express as described by Xilinx' PG055 document. This is a PCIe
controller that can be used with certain series of Xilinx FPGAs, and is
used on the MIPS Boston board which will make use of this code.

Signed-off-by: Paul Burton 
[yongbok@imgtec.com:
  removed returning on !level,
  updated IRQ connection with GPIO logic,
  moved xilinx_pcie_init() to boston.c
  replaced stw_le_p() with pci_set_word()
  and other cosmetic changes]
Signed-off-by: Yongbok Kim 
---
 hw/pci-host/Makefile.objs |   1 +
 hw/pci-host/xilinx-pcie.c | 328 ++
 include/hw/pci-host/xilinx-pcie.h |  68 
 3 files changed, 397 insertions(+)
 create mode 100644 hw/pci-host/xilinx-pcie.c
 create mode 100644 include/hw/pci-host/xilinx-pcie.h

diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
index 45f1f0e..9c7909c 100644
--- a/hw/pci-host/Makefile.objs
+++ b/hw/pci-host/Makefile.objs
@@ -16,3 +16,4 @@ common-obj-$(CONFIG_FULONG) += bonito.o
 common-obj-$(CONFIG_PCI_PIIX) += piix.o
 common-obj-$(CONFIG_PCI_Q35) += q35.o
 common-obj-$(CONFIG_PCI_GENERIC) += gpex.o
+common-obj-$(CONFIG_PCI_XILINX) += xilinx-pcie.o
diff --git a/hw/pci-host/xilinx-pcie.c b/hw/pci-host/xilinx-pcie.c
new file mode 100644
index 000..8b71e2d
--- /dev/null
+++ b/hw/pci-host/xilinx-pcie.c
@@ -0,0 +1,328 @@
+/*
+ * Xilinx PCIe host controller emulation.
+ *
+ * Copyright (c) 2016 Imagination Technologies
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "hw/pci/pci_bridge.h"
+#include "hw/pci-host/xilinx-pcie.h"
+
+enum root_cfg_reg {
+/* Interrupt Decode Register */
+ROOTCFG_INTDEC  = 0x138,
+
+/* Interrupt Mask Register */
+ROOTCFG_INTMASK = 0x13c,
+/* INTx Interrupt Received */
+#define ROOTCFG_INTMASK_INTX(1 << 16)
+/* MSI Interrupt Received */
+#define ROOTCFG_INTMASK_MSI (1 << 17)
+
+/* PHY Status/Control Register */
+ROOTCFG_PSCR= 0x144,
+/* Link Up */
+#define ROOTCFG_PSCR_LINK_UP(1 << 11)
+
+/* Root Port Status/Control Register */
+ROOTCFG_RPSCR   = 0x148,
+/* Bridge Enable */
+#define ROOTCFG_RPSCR_BRIDGEEN  (1 << 0)
+/* Interrupt FIFO Not Empty */
+#define ROOTCFG_RPSCR_INTNEMPTY (1 << 18)
+/* Interrupt FIFO Overflow */
+#define ROOTCFG_RPSCR_INTOVF(1 << 19)
+
+/* Root Port Interrupt FIFO Read Register 1 */
+ROOTCFG_RPIFR1  = 0x158,
+#define ROOTCFG_RPIFR1_INT_LANE_SHIFT   27
+#define ROOTCFG_RPIFR1_INT_ASSERT_SHIFT 29
+#define ROOTCFG_RPIFR1_INT_VALID_SHIFT  31
+/* Root Port Interrupt FIFO Read Register 2 */
+ROOTCFG_RPIFR2  = 0x15c,
+};
+
+static void xilinx_pcie_update_intr(XilinxPCIEHost *s,
+uint32_t set, uint32_t clear)
+{
+int level;
+
+s->intr |= set;
+s->intr &= ~clear;
+
+if (s->intr_fifo_r != s->intr_fifo_w) {
+s->intr |= ROOTCFG_INTMASK_INTX;
+}
+
+level = !!(s->intr & s->intr_mask);
+qemu_set_irq(s->irq, level);
+}
+
+static void xilinx_pcie_queue_intr(XilinxPCIEHost *s,
+   uint32_t fifo_reg1, uint32_t fifo_reg2)
+{
+XilinxPCIEInt *intr;
+unsigned int new_w;
+
+new_w = (s->intr_fifo_w + 1) % ARRAY_SIZE(s->intr_fifo);
+if (new_w == s->intr_fifo_r) {
+s->rpscr |= ROOTCFG_RPSCR_INTOVF;
+return;
+}
+
+intr = &s->intr_fifo[s->intr_fifo_w];
+s->intr_fifo_w = new_w;
+
+intr->fifo_reg1 = fifo_reg1;
+intr->fifo_reg2 = fifo_reg2;
+
+xilinx_pcie_update_intr(s, ROOTCFG_INTMASK_INTX, 0);
+}
+
+static void xilinx_pcie_set_irq(void *opaque, int irq_num, int level)
+{
+XilinxPCIEHost *s = XILINX_PCIE_HOST(opaque);
+
+xilinx_pcie_queue_intr(s,
+   (irq_num << ROOTCFG_RPIFR1_INT_LANE_SHIFT) |
+   (level << ROOTCFG_RPIFR1_INT_ASSERT_SHIFT) |
+   (1 << ROOTCFG_RPIFR1_INT_VALID_SHIFT),
+   0);
+}
+
+static void xilinx_pcie_host_realize(DeviceState *dev, Error **errp)
+{
+PCIHostState *pci = PCI_HOST_BRIDGE(dev);
+XilinxPCIEHost *s = XILINX_PCIE_HOST(dev);
+SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+
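
[Editor's note: the patch text is cut off above. The interrupt FIFO that xilinx_pcie_queue_intr() fills is a ring buffer that keeps one slot free so that "read index == write index" can only mean "empty". The following is a minimal standalone sketch of that scheme, not the QEMU code itself. The struct names, the depth of 16 and the pop/count helpers are assumptions made for illustration; only the push logic mirrors what the patch shows.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_SLOTS 16 /* hypothetical depth, not taken from the patch */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct intr_entry {
    uint32_t reg1;
    uint32_t reg2;
};

struct intr_fifo {
    struct intr_entry slots[FIFO_SLOTS];
    unsigned int r;   /* next slot to read */
    unsigned int w;   /* next slot to write */
    bool overflow;    /* set when an entry had to be dropped */
};

/* Queue one entry; returns false and flags overflow when the ring is full. */
static bool fifo_push(struct intr_fifo *f, uint32_t reg1, uint32_t reg2)
{
    unsigned int new_w = (f->w + 1) % ARRAY_SIZE(f->slots);

    if (new_w == f->r) {
        /* Advancing w would make the ring look empty, so drop the entry. */
        f->overflow = true;
        return false;
    }
    f->slots[f->w].reg1 = reg1;
    f->slots[f->w].reg2 = reg2;
    f->w = new_w;
    return true;
}

/* Dequeue the oldest entry; returns false when the ring is empty. */
static bool fifo_pop(struct intr_fifo *f, struct intr_entry *out)
{
    assert(out != NULL);
    if (f->r == f->w) {
        return false;
    }
    *out = f->slots[f->r];
    f->r = (f->r + 1) % ARRAY_SIZE(f->slots);
    return true;
}

/* Number of queued entries; at most FIFO_SLOTS - 1. */
static unsigned int fifo_count(const struct intr_fifo *f)
{
    return (f->w + ARRAY_SIZE(f->slots) - f->r) % ARRAY_SIZE(f->slots);
}
```

[With this layout a ring of N slots holds at most N - 1 entries. That cost buys the ability to tell "full" from "empty" without a separate counter.]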

[Qemu-devel] [PATCH v5 4/8] target-mips: Provide function to test if a CPU supports an ISA

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

Provide a new cpu_supports_isa function which allows callers to
determine whether a CPU supports one of the ISA_ flags, by testing
whether the associated struct mips_def_t sets the ISA flags in its
insn_flags field.

An example use of this is to allow boards which generate bootloader code
to determine the properties of the CPU that will be used, for example
whether the CPU is 64 bit or which architecture revision it implements.

Signed-off-by: Paul Burton 
Reviewed-by: Leon Alrae 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Yongbok Kim 
---
 target/mips/cpu.h   |  1 +
 target/mips/translate.c | 10 ++
 2 files changed, 11 insertions(+)

diff --git a/target/mips/cpu.h b/target/mips/cpu.h
index e1c78f5..4a4747a 100644
--- a/target/mips/cpu.h
+++ b/target/mips/cpu.h
@@ -815,6 +815,7 @@ int cpu_mips_signal_handler(int host_signum, void *pinfo, 
void *puc);
 
 #define cpu_init(cpu_model) CPU(cpu_mips_init(cpu_model))
 bool cpu_supports_cps_smp(const char *cpu_model);
+bool cpu_supports_isa(const char *cpu_model, unsigned int isa);
 void cpu_set_exception_base(int vp_index, target_ulong address);
 
 /* TODO QOM'ify CPU reset and remove */
diff --git a/target/mips/translate.c b/target/mips/translate.c
index 7f8ecf4..8b4a072 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -20233,6 +20233,16 @@ bool cpu_supports_cps_smp(const char *cpu_model)
 return (def->CP0_Config3 & (1 << CP0C3_CMGCR)) != 0;
 }
 
+bool cpu_supports_isa(const char *cpu_model, unsigned int isa)
+{
+const mips_def_t *def = cpu_mips_find_by_name(cpu_model);
+if (!def) {
+return false;
+}
+
+return (def->insn_flags & isa) != 0;
+}
+
 void cpu_set_exception_base(int vp_index, target_ulong address)
 {
 MIPSCPU *vp = MIPS_CPU(qemu_get_cpu(vp_index));
-- 
2.7.4
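
[Editor's note: a minimal standalone sketch of the lookup-then-mask pattern that cpu_supports_isa() uses. The CPU table, the flag values and every demo_* name are hypothetical. The real function goes through cpu_mips_find_by_name() and the ISA_ flags in QEMU.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values, for illustration only. */
#define DEMO_ISA_32BIT (1u << 0)
#define DEMO_ISA_64BIT (1u << 1)
#define DEMO_ISA_REV6  (1u << 2)

struct demo_cpu_def {
    const char *name;
    unsigned int insn_flags;
};

/* Hypothetical CPU models. */
static const struct demo_cpu_def demo_cpus[] = {
    { "demo32",   DEMO_ISA_32BIT },
    { "demo64r6", DEMO_ISA_32BIT | DEMO_ISA_64BIT | DEMO_ISA_REV6 },
};

static const struct demo_cpu_def *demo_find_by_name(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(demo_cpus) / sizeof(demo_cpus[0]); i++) {
        if (strcmp(demo_cpus[i].name, name) == 0) {
            return &demo_cpus[i];
        }
    }
    return NULL;
}

/* True if the named model sets at least one of the bits in @isa. */
static bool demo_supports_isa(const char *cpu_model, unsigned int isa)
{
    const struct demo_cpu_def *def = demo_find_by_name(cpu_model);

    if (!def) {
        return false; /* unknown model: report "not supported" */
    }
    return (def->insn_flags & isa) != 0;
}
```

[Note that a test of the form "(flags & isa) != 0" succeeds when any requested bit is set. A caller that needs all of several flags would have to compare against the mask itself or call the function once per flag.]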




[Qemu-devel] [PATCH v5 2/8] hw/mips_gictimer: provide API for retrieving frequency

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

Provide a new function mips_gictimer_get_freq() which returns the
frequency at which a GIC timer will count. This will be useful for
boards which perform setup based upon this frequency.

Signed-off-by: Paul Burton 
Reviewed-by: Leon Alrae 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Yongbok Kim 
---
 hw/timer/mips_gictimer.c | 5 +
 include/hw/timer/mips_gictimer.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/hw/timer/mips_gictimer.c b/hw/timer/mips_gictimer.c
index 3698889..f5c5806 100644
--- a/hw/timer/mips_gictimer.c
+++ b/hw/timer/mips_gictimer.c
@@ -14,6 +14,11 @@
 
 #define TIMER_PERIOD 10 /* 10 ns period for 100 Mhz frequency */
 
+uint32_t mips_gictimer_get_freq(MIPSGICTimerState *gic)
+{
+return NANOSECONDS_PER_SECOND / TIMER_PERIOD;
+}
+
 static void gic_vptimer_update(MIPSGICTimerState *gictimer,
uint32_t vp_index, uint64_t now)
 {
diff --git a/include/hw/timer/mips_gictimer.h b/include/hw/timer/mips_gictimer.h
index c8bc5d2..c7ca6c8 100644
--- a/include/hw/timer/mips_gictimer.h
+++ b/include/hw/timer/mips_gictimer.h
@@ -31,6 +31,7 @@ struct MIPSGICTimerState {
 MIPSGICTimerCB *cb;
 };
 
+uint32_t mips_gictimer_get_freq(MIPSGICTimerState *gic);
 uint32_t mips_gictimer_get_sh_count(MIPSGICTimerState *gic);
 void mips_gictimer_store_sh_count(MIPSGICTimerState *gic, uint64_t count);
 uint32_t mips_gictimer_get_vp_compare(MIPSGICTimerState *gictimer,
-- 
2.7.4
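
[Editor's note: the frequency returned by mips_gictimer_get_freq() is just the reciprocal of the timer period. A small standalone check of that arithmetic follows. The DEMO_* constants and the function name are assumptions, with the 10 ns period taken from the TIMER_PERIOD comment in the patch context.]

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NANOSECONDS_PER_SECOND 1000000000LL
#define DEMO_TIMER_PERIOD_NS 10 /* 10 ns per tick, as in the patch context */

/* Convert a tick period in nanoseconds into a frequency in Hz. */
static uint32_t demo_timer_freq_hz(int64_t period_ns)
{
    assert(period_ns > 0);
    return (uint32_t)(DEMO_NANOSECONDS_PER_SECOND / period_ns);
}
```

[For a 10 ns period this gives 1000000000 / 10 = 100000000 Hz, i.e. 100 MHz, which fits comfortably in a uint32_t.]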




[Qemu-devel] [PATCH v5 6/8] loader: Support Flattened Image Trees (FIT images)

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

Introduce support for loading Flattened Image Trees, as used by modern
U-Boot. FIT images are essentially flattened device tree files which
contain binary images such as kernels, FDTs or ramdisks along with one
or more configuration nodes describing boot configurations.

The MIPS Boston board typically boots kernels in the form of FIT images,
and will make use of this code.

Signed-off-by: Paul Burton 
[yongbok@imgtec.com: fixed potential memory leaks]
Signed-off-by: Yongbok Kim 
---
 hw/core/Makefile.objs   |   1 +
 hw/core/loader-fit.c| 325 
 hw/core/loader.c|   7 +-
 include/hw/loader-fit.h |  41 ++
 include/hw/loader.h |   6 +
 5 files changed, 374 insertions(+), 6 deletions(-)
 create mode 100644 hw/core/loader-fit.c
 create mode 100644 include/hw/loader-fit.h

diff --git a/hw/core/Makefile.objs b/hw/core/Makefile.objs
index 7f8c9dc..ff59512 100644
--- a/hw/core/Makefile.objs
+++ b/hw/core/Makefile.objs
@@ -13,6 +13,7 @@ common-obj-$(CONFIG_PTIMER) += ptimer.o
 common-obj-$(CONFIG_SOFTMMU) += sysbus.o
 common-obj-$(CONFIG_SOFTMMU) += machine.o
 common-obj-$(CONFIG_SOFTMMU) += loader.o
+common-obj-$(CONFIG_SOFTMMU) += loader-fit.o
 common-obj-$(CONFIG_SOFTMMU) += qdev-properties-system.o
 common-obj-$(CONFIG_SOFTMMU) += register.o
 common-obj-$(CONFIG_SOFTMMU) += or-irq.o
diff --git a/hw/core/loader-fit.c b/hw/core/loader-fit.c
new file mode 100644
index 000..4ddd35e
--- /dev/null
+++ b/hw/core/loader-fit.c
@@ -0,0 +1,325 @@
+/*
+ * Flattened Image Tree loader.
+ *
+ * Copyright (c) 2016 Imagination Technologies
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "exec/address-spaces.h"
+#include "exec/memory.h"
+#include "hw/loader.h"
+#include "hw/loader-fit.h"
+#include "qemu/cutils.h"
+#include "qemu/error-report.h"
+#include "sysemu/device_tree.h"
+#include "sysemu/sysemu.h"
+
+#include 
+#include 
+
+#define FIT_LOADER_MAX_PATH (128)
+
+static const void *fit_load_image_alloc(const void *itb, const char *name,
+int *poff, size_t *psz)
+{
+const void *data;
+const char *comp;
+void *uncomp_data;
+char path[FIT_LOADER_MAX_PATH];
+int off, sz;
+ssize_t uncomp_len;
+
+snprintf(path, sizeof(path), "/images/%s", name);
+
+off = fdt_path_offset(itb, path);
+if (off < 0) {
+return NULL;
+}
+if (poff) {
+*poff = off;
+}
+
+data = fdt_getprop(itb, off, "data", &sz);
+if (!data) {
+return NULL;
+}
+
+comp = fdt_getprop(itb, off, "compression", NULL);
+if (!comp || !strcmp(comp, "none")) {
+if (psz) {
+*psz = sz;
+}
+uncomp_data = g_malloc(sz);
+memmove(uncomp_data, data, sz);
+return uncomp_data;
+}
+
+if (!strcmp(comp, "gzip")) {
+uncomp_len = UBOOT_MAX_GUNZIP_BYTES;
+uncomp_data = g_malloc(uncomp_len);
+
+uncomp_len = gunzip(uncomp_data, uncomp_len, (void *) data, sz);
+if (uncomp_len < 0) {
+error_printf("unable to decompress %s image\n", name);
+g_free(uncomp_data);
+return NULL;
+}
+
+data = g_realloc(uncomp_data, uncomp_len);
+if (psz) {
+*psz = uncomp_len;
+}
+return data;
+}
+
+error_printf("unknown compression '%s'\n", comp);
+return NULL;
+}
+
+static int fit_image_addr(const void *itb, int img, const char *name,
+  hwaddr *addr)
+{
+const void *prop;
+int len;
+
+prop = fdt_getprop(itb, img, name, &len);
+if (!prop) {
+return -ENOENT;
+}
+
+switch (len) {
+case 4:
+*addr = fdt32_to_cpu(*(fdt32_t *)prop);
+return 0;
+case 8:
+*addr = fdt64_to_cpu(*(fdt64_t *)prop);
+return 0;
+default:
+error_printf("invalid %s address length %d\n", name, len);
+return -EINVAL;
+}
+}
+
+static int fit_load_kernel(const struct fit_loader *ldr, const void *itb,
+   int cfg, void *opaque, hwaddr *pend)
+{
+const char *name;
+const void *data;
+const void *load_data;
+hwaddr load_addr, entry_addr;
+int img_off, err;
+
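
[Editor's note: the patch text is cut off above. fit_image_addr() accepts a load or entry address stored as either one or two big-endian 32-bit cells and picks the decoder from the property length. The sketch below shows the same dispatch without libfdt. The demo_* names are hypothetical, and decoding byte by byte is a substitution made here to avoid alignment and host-endianness concerns. The patch itself uses fdt32_to_cpu() and fdt64_to_cpu().]

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble an n-byte big-endian integer, most significant byte first. */
static uint64_t demo_be_load(const uint8_t *p, size_t n)
{
    uint64_t v = 0;

    for (size_t i = 0; i < n; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/*
 * Decode an address property of @len bytes.
 * Returns 0 on success, -ENOENT if the property is absent and
 * -EINVAL if the length is neither 4 nor 8.
 */
static int demo_prop_to_addr(const void *prop, int len, uint64_t *addr)
{
    assert(addr != NULL);
    if (!prop) {
        return -ENOENT;
    }
    switch (len) {
    case 4:
    case 8:
        *addr = demo_be_load(prop, (size_t)len);
        return 0;
    default:
        return -EINVAL;
    }
}
```

[A 4-byte property is zero-extended here. If a board needs a 32-bit address sign-extended, the caller would have to do that itself.]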

[Qemu-devel] [PATCH v5 0/8] MIPS Boston board support

2017-02-15 Thread Yongbok Kim
This series introduces support for the MIPS Boston development board. It begins
by introducing support for moving MIPS Coherence Manager GCRs which Boston
software typically does to avoid conflicting with its flash memory region. An
API is then added to retrieve the emulated MIPS GIC timer frequency, which is
used to report system clock frequency to software via "platform registers"
which the Boston board provides. An issue with the MIPS GIC that current Boston
Linux kernels encounter is fixed, and an API introduced to allow the board to
determine whether the MIPS CPS hardware is supported.

The last 3 patches are more extensive, providing support for the FIT image
format used with Boston, the Xilinx PCIe controller which Boston boards include
3 of, and finally the Boston board support itself.

This can be tested with either U-Boot or Linux if desired. U-Boot support is
available in the following patchset:

  https://www.mail-archive.com/u-boot@lists.denx.de/msg221003.html

Linux kernel support can be found as part of the generic kernel patchset:

  https://www.linux-mips.org/archives/linux-mips/2016-08/msg00456.html

Hopefully this will be merged for v4.9, but it can also be found in a
downstream kernel from Imagination Technologies in the "eng" branch of:

  git://git.linux-mips.org/pub/scm/linux-mti.git

Linux may be built with:

  $ make 64r6el_defconfig
  $ make

The arch/mips/boot/vmlinux.gz.itb image may then be provided to QEMU's -kernel
argument, for example:

  $ qemu-system-mips64el -M boston -kernel vmlinux.gz.itb -serial stdio

v5:
  loader-fit
quick fix for the redefinition issue reported from Patchew.

v4:
Yongbok Kim:
  boston
ignore missing bios/kernel for qtest.

v3:
Yongbok Kim:
  loader-fit
fixed potential memory leaks.
  xilinx-pcie
added descriptions for macros. (Alistair)
removed returning on !level. (Alistair)
updated IRQ connection with GPIO logic (Alistair)
moved xilinx_pcie_init() to boston.c (Alistair)
replaced stw_le_p() with pci_set_word()
  boston
isolated boston machine support for mips64el.
updated for recent Chardev changes.
  and other cosmetic changes.

v1, v2:
Paul Burton (8):
  hw/mips_cmgcr: allow GCR base to be moved
  hw/mips_gictimer: provide API for retrieving frequency
  hw/mips_gic: Update pin state on mask changes
  target-mips: Provide function to test if a CPU supports an ISA
  dtc: Update requirement to v1.4.2
  loader: Support Flattened Image Trees (FIT images)
  hw: xilinx-pcie: Add support for Xilinx AXI PCIe Controller
  hw/mips: MIPS Boston board support

 configure|   8 +-
 default-configs/mips64el-softmmu.mak |   2 +
 dtc  |   2 +-
 hw/core/Makefile.objs|   1 +
 hw/core/loader-fit.c | 325 
 hw/core/loader.c |   7 +-
 hw/intc/mips_gic.c   |  56 ++--
 hw/mips/Makefile.objs|   1 +
 hw/mips/boston.c | 576 +++
 hw/misc/mips_cmgcr.c |  17 ++
 hw/pci-host/Makefile.objs|   1 +
 hw/pci-host/xilinx-pcie.c| 328 
 hw/timer/mips_gictimer.c |   5 +
 include/hw/loader-fit.h  |  41 +++
 include/hw/loader.h  |   6 +
 include/hw/misc/mips_cmgcr.h |   3 +
 include/hw/pci-host/xilinx-pcie.h|  68 +
 include/hw/timer/mips_gictimer.h |   1 +
 target/mips/cpu.h|   1 +
 target/mips/translate.c  |  10 +
 20 files changed, 1423 insertions(+), 36 deletions(-)
 create mode 100644 hw/core/loader-fit.c
 create mode 100644 hw/mips/boston.c
 create mode 100644 hw/pci-host/xilinx-pcie.c
 create mode 100644 include/hw/loader-fit.h
 create mode 100644 include/hw/pci-host/xilinx-pcie.h

-- 
2.7.4




[Qemu-devel] [PATCH v5 1/8] hw/mips_cmgcr: allow GCR base to be moved

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

Support moving the GCR base address & updating the CPU's CP0 CMGCRBase
register appropriately. This is required if a platform needs to move its
GCRs away from other memory, as the MIPS Boston development board does
to avoid its flash memory.

Signed-off-by: Paul Burton 
Reviewed-by: Leon Alrae 
Signed-off-by: Yongbok Kim 
---
 hw/misc/mips_cmgcr.c | 17 +
 include/hw/misc/mips_cmgcr.h |  3 +++
 2 files changed, 20 insertions(+)

diff --git a/hw/misc/mips_cmgcr.c b/hw/misc/mips_cmgcr.c
index b3ba166..a1edb53 100644
--- a/hw/misc/mips_cmgcr.c
+++ b/hw/misc/mips_cmgcr.c
@@ -29,6 +29,20 @@ static inline bool is_gic_connected(MIPSGCRState *s)
 return s->gic_mr != NULL;
 }
 
+static inline void update_gcr_base(MIPSGCRState *gcr, uint64_t val)
+{
+CPUState *cpu;
+MIPSCPU *mips_cpu;
+
+gcr->gcr_base = val & GCR_BASE_GCRBASE_MSK;
+memory_region_set_address(&gcr->iomem, gcr->gcr_base);
+
+CPU_FOREACH(cpu) {
+mips_cpu = MIPS_CPU(cpu);
+mips_cpu->env.CP0_CMGCRBase = gcr->gcr_base >> 4;
+}
+}
+
 static inline void update_cpc_base(MIPSGCRState *gcr, uint64_t val)
 {
 if (is_cpc_connected(gcr)) {
@@ -117,6 +131,9 @@ static void gcr_write(void *opaque, hwaddr addr, uint64_t 
data, unsigned size)
 MIPSGCRVPState *other_vps = &gcr->vps[current_vps->other];
 
 switch (addr) {
+case GCR_BASE_OFS:
+update_gcr_base(gcr, data);
+break;
 case GCR_GIC_BASE_OFS:
 update_gic_base(gcr, data);
 break;
diff --git a/include/hw/misc/mips_cmgcr.h b/include/hw/misc/mips_cmgcr.h
index a209d91..c9dfcb4 100644
--- a/include/hw/misc/mips_cmgcr.h
+++ b/include/hw/misc/mips_cmgcr.h
@@ -41,6 +41,9 @@
 #define GCR_L2_CONFIG_BYPASS_SHF20
 #define GCR_L2_CONFIG_BYPASS_MSK((0x1ULL) << GCR_L2_CONFIG_BYPASS_SHF)
 
+/* GCR_BASE register fields */
+#define GCR_BASE_GCRBASE_MSK 0x8000ULL
+
 /* GCR_GIC_BASE register fields */
 #define GCR_GIC_BASE_GICEN_MSK   1
 #define GCR_GIC_BASE_GICBASE_MSK 0xFFFEULL
-- 
2.7.4




Re: [Qemu-devel] [PATCH v15 25/25] qcow2-bitmap: improve check_constraints_on_bitmap

2017-02-15 Thread John Snow


On 02/15/2017 05:10 AM, Vladimir Sementsov-Ogievskiy wrote:
> Add detailed error messages.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

Reviewed-by: John Snow 

> ---
>  block/qcow2-bitmap.c | 48 ++--
>  1 file changed, 34 insertions(+), 14 deletions(-)
> 
> diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
> index 9177c56..e25c872 100644
> --- a/block/qcow2-bitmap.c
> +++ b/block/qcow2-bitmap.c
> @@ -160,28 +160,49 @@ static int check_table_entry(uint64_t entry, int 
> cluster_size)
>  
>  static int check_constraints_on_bitmap(BlockDriverState *bs,
> const char *name,
> -   uint32_t granularity)
> +   uint32_t granularity,
> +   Error **errp)
>  {
>  BDRVQcow2State *s = bs->opaque;
>  int granularity_bits = ctz32(granularity);
>  int64_t len = bdrv_getlength(bs);
> -bool fail;
>  
>  assert(granularity > 0);
>  assert((granularity & (granularity - 1)) == 0);
>  
>  if (len < 0) {
> +error_setg_errno(errp, -len, "Failed to get size of '%s'",
> + bdrv_get_device_or_node_name(bs));
>  return len;
>  }
>  
> -fail = (granularity_bits > BME_MAX_GRANULARITY_BITS) ||
> -   (granularity_bits < BME_MIN_GRANULARITY_BITS) ||
> -   (len > (uint64_t)BME_MAX_PHYS_SIZE << granularity_bits) ||
> -   (len > (uint64_t)BME_MAX_TABLE_SIZE * s->cluster_size <<
> -  granularity_bits) ||
> -   (strlen(name) > BME_MAX_NAME_SIZE);
> +if (granularity_bits > BME_MAX_GRANULARITY_BITS) {
> +error_setg(errp, "Granularity exceeds maximum (%u bytes)",
> +   1 << BME_MAX_GRANULARITY_BITS);
> +return -EINVAL;
> +}
> +if (granularity_bits < BME_MIN_GRANULARITY_BITS) {
> +error_setg(errp, "Granularity is under minimum (%u bytes)",
> +   1 << BME_MIN_GRANULARITY_BITS);
> +return -EINVAL;
> +}
>  
> -return fail ? -EINVAL : 0;
> +if ((len > (uint64_t)BME_MAX_PHYS_SIZE << granularity_bits) ||
> +(len > (uint64_t)BME_MAX_TABLE_SIZE * s->cluster_size <<
> +   granularity_bits))
> +{
> +error_setg(errp, "Too much space will be occupied by the bitmap. "
> +   "Use larger granularity");
> +return -EINVAL;
> +}
> +
> +if (strlen(name) > BME_MAX_NAME_SIZE) {
> +error_setg(errp, "Name length exceeds maximum (%u characters)",
> +   BME_MAX_NAME_SIZE);
> +return -EINVAL;
> +}
> +
> +return 0;
>  }
>  
>  static void clear_bitmap_table(BlockDriverState *bs, uint64_t *bitmap_table,
> @@ -1142,9 +1163,9 @@ void 
> qcow2_store_persistent_dirty_bitmaps(BlockDriverState *bs, Error **errp)
>  continue;
>  }
>  
> -if (check_constraints_on_bitmap(bs, name, granularity) < 0) {
> -error_setg(errp, "Bitmap '%s' doesn't satisfy the constraints",
> -   name);
> +if (check_constraints_on_bitmap(bs, name, granularity, errp) < 0) {
> +error_prepend(errp, "Bitmap '%s' doesn't satisfy the 
> constraints: ",
> +  name);
>  goto fail;
>  }
>  
> @@ -1233,8 +1254,7 @@ bool qcow2_can_store_new_dirty_bitmap(BlockDriverState 
> *bs,
>  bool found;
>  Qcow2BitmapList *bm_list;
>  
> -if (check_constraints_on_bitmap(bs, name, granularity) != 0) {
> -error_setg(errp, "The constraints are not satisfied");
> +if (check_constraints_on_bitmap(bs, name, granularity, errp) != 0) {
>  goto fail;
>  }
>  
> 

-- 
—js
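
[Editor's note: the granularity part of check_constraints_on_bitmap() reduces to "is it a power of two, and is log2 of it inside the allowed range". A standalone sketch under stated assumptions follows. The DEMO_* limits are made-up values, not BME_MIN/MAX_GRANULARITY_BITS, the demo_* names are hypothetical, and unlike the patch, which asserts the power-of-two property, this sketch reports it as an error.]

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits, for illustration only. */
#define DEMO_MIN_GRANULARITY_BITS 9
#define DEMO_MAX_GRANULARITY_BITS 20

/* Count trailing zero bits of a non-zero value. */
static int demo_ctz32(uint32_t v)
{
    int n = 0;

    assert(v != 0);
    while ((v & 1u) == 0) {
        v >>= 1;
        n++;
    }
    return n;
}

/*
 * Returns 0 if @granularity is acceptable, otherwise -EINVAL with
 * *reason pointing at a static explanation.
 */
static int demo_check_granularity(uint32_t granularity, const char **reason)
{
    int bits;

    assert(reason != NULL);
    if (granularity == 0 || (granularity & (granularity - 1)) != 0) {
        *reason = "granularity is not a power of two";
        return -EINVAL;
    }
    bits = demo_ctz32(granularity);
    if (bits > DEMO_MAX_GRANULARITY_BITS) {
        *reason = "granularity exceeds maximum";
        return -EINVAL;
    }
    if (bits < DEMO_MIN_GRANULARITY_BITS) {
        *reason = "granularity is under minimum";
        return -EINVAL;
    }
    *reason = NULL;
    return 0;
}
```

[Returning a distinct reason per failed check is the same idea as the patch: the caller can prepend its own context instead of printing one generic "constraints not satisfied" message.]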



Re: [Qemu-devel] RFC: How to make seccomp reliable and useful ?

2017-02-15 Thread Eduardo Otubo
On Wed, Feb 15, 2017 at 06:27:32PM +, Daniel P. Berrange wrote:
> The current impl of seccomp in QEMU is intentionally allowing a huge range
> of system calls to be executed. The goal was that running '-sandbox on'
> should never break any feature of QEMU, so naturally any syscall that can be
> executed on any codepath QEMU takes must be allowed.
> 
> This is good for usability because users don't need to understand the
> technical details of the sandbox technology, they merely say "on" and it
> "just works".
> Conversely though, this is bad for security because QEMU has to allow a huge
> range of system calls to be used due to its broad functionality.
> 
> During initial discussions for seccomp back in 2012 it was suggested that
> there might be alternate policies developed for QEMU which deny some features,
> but improve security overall. To the best of my knowledge, this has never been
> discussed again since then.
> 
> 
> In addition, since initially merging, there has been a steady stream of
> patches to whitelist further syscalls that were missing. Some of these were
> missing due to newly added functionality in QEMU since the original seccomp
> impl, while others have been missing since day 1. It is reasonable to expect
> that there are still many syscalls missing in the whitelist. In just a couple
> of minutes of
> comparing the whitelist vs global syscall list it was possible to identify two
> further missing syscalls. The '-netdev bridge,br=virbr0' network backend fails
> because setuid is blocked, preventing execution of the qemu-bridge-helper
> program. If built against glibc < 2.9, or running on kernel < 2.6.27 it will
> fail to call eventfd() because we only permit eventfd2() syscall, not the
> older eventfd() syscall used on older Linux. Some ifup scripts used with the
> -netdev arg may also break due to lack of chmod, flock, getxattr permissions.
> This risk of missing syscalls is why -sandbox defaults to off, and we've never
> considered defaulting it to on.
> 
> 
> The fundamental problem is that building a whitelist of syscalls used by QEMU
> emulators is an intractable problem. QEMU on my system links to 183 different
> shared libraries and there is no way in the world that anyone can figure out
> which code paths QEMU triggers in these libraries and thus identify which
> syscalls will be genuinely needed.
> 
> Thus a whitelist based approach for QEMU is doomed to always be missing some
> syscalls, resulting in unnecessary aborts of QEMU when it tickles some edge
> case. If you are lucky the abort() happens at startup so you see it quickly
> and can address it. If you are unlucky the abort() happens after your VM has
> been running for days/weeks/months and you lose data.
> 
> IOW, seccomp integration as it currently exists today in QEMU offers minimal
> security benefits, while at the same time causing spurious crashes which may
> cause user data loss from aborting a running VM, discouraging users from using
> even the minimal protection it offers.
> 
> I think we need to rework our seccomp support so that we can have a high
> enough level of confidence in it, that it could be enabled by default. At the
> same time we need to make it do something more tangibly useful from a security
> POV.
> 
> 
> First we need to admit that whitelisting is a failed approach, and switch to
> using blacklisting. Unless we do this, we'll never have high enough confidence
> to enable it by default - something that's never turned on might as well not
> exist at all.
> 
> 
> There is a reasonably easily identifiable set of syscalls that QEMU should
> never be permitted to use, no matter what configuration it is in, what helpers
> it spawns, or what libraries it links to, e.g. reboot, swapon, swapoff, syslog,
> mount, umount, kexec_*, etc. Any syscall that affects global system state,
> rather than process local state, should be forbidden.
> 
> There are some syscalls that are simply hardcoded to return ENOSYS which can
> be trivially blacklisted. afs_syscall, break, fattach, ftime, etc (see the
> man page 'unimplemented(2)').
> 
> There are some syscalls which are considered obsolete - they were previously
> useful, but no modern code would call them, as they have been superseded.
> For example, readdir was replaced by getdents. We could blacklist these by
> default but provide a way to allow use of obsolete syscalls if running on older
> systems, e.g. '-sandbox on,obsolete=allow'. They might be obsolete enough that
> we decide to just block them permanently with no opt in - we would need to
> analyse when their replacements appeared in widespread use.
> 
> There might be a few more syscalls which we can determine are never valid to
> use in QEMU or any library or helper program it might run. I expect this list
> to be very small though, given the impossibility of auditing code paths
> through millions of lines of code QEMU links to.
> 
> Everything else should be allowed.
> 
> At this point we 

Re: [Qemu-devel] iommu emulation

2017-02-15 Thread Jintack Lim
On Wed, Feb 15, 2017 at 5:50 PM, Alex Williamson  wrote:

> On Wed, 15 Feb 2017 17:05:35 -0500
> Jintack Lim  wrote:
>
> > On Tue, Feb 14, 2017 at 9:52 PM, Peter Xu  wrote:
> >
> > > On Tue, Feb 14, 2017 at 07:50:39AM -0500, Jintack Lim wrote:
> > >
> > > [...]
> > >
> > > > > > >> > I misunderstood what you said?
> > > > > > >
> > > > > > > > I failed to understand why a vIOMMU could help boost
> > > > > > > > performance. :(
> > > > > > > Could you provide your command line here so that I can try to
> > > > > > > reproduce?
> > > > > >
> > > > > > Sure. This is the command line to launch L1 VM
> > > > > >
> > > > > > qemu-system-x86_64 -M q35,accel=kvm,kernel-irqchip=split \
> > > > > > -m 12G -device intel-iommu,intremap=on,eim=off,caching-mode=on \
> > > > > > > -drive file=/mydata/guest0.img,format=raw --nographic -cpu host \
> > > > > > -smp 4,sockets=4,cores=1,threads=1 \
> > > > > > -device vfio-pci,host=08:00.0,id=net0
> > > > > >
> > > > > > And this is for L2 VM.
> > > > > >
> > > > > > ./qemu-system-x86_64 -M q35,accel=kvm \
> > > > > > -m 8G \
> > > > > > -drive file=/vm/l2guest.img,format=raw --nographic -cpu host \
> > > > > > -device vfio-pci,host=00:03.0,id=net0
> > > > >
> > > > > ... here looks like these are command lines for L1/L2 guest, rather
> > > > > than L1 guest with/without vIOMMU?
> > > > >
> > > >
> > > > > That's right. I thought you were asking about command lines for the
> > > > > L1/L2 guest :(.
> > > > > I think I caused the confusion, and as I said above, I didn't mean to
> > > > > talk about the performance of the L1 guest with/without vIOMMU.
> > > > > We can move on!
> > >
> > > I see. Sure! :-)
> > >
> > > [...]
> > >
> > > > >
> > > > > > Then, I *think* the above assertion you encountered would fail only
> > > > > > if prev == 0 here, but I'm still not quite sure why that was
> > > > > > happening. Btw, could you paste me your "lspci -vvv -s 00:03.0"
> > > > > > result in your L1 guest?
> > > > >
> > > >
> > > > Sure. This is from my L1 guest.
> > >
> > > Hmm... I think I found the problem...
> > >
> > > >
> > > > root@guest0:~# lspci -vvv -s 00:03.0
> > > > 00:03.0 Network controller: Mellanox Technologies MT27500 Family
> > > > [ConnectX-3]
> > > > Subsystem: Mellanox Technologies Device 0050
> > > > Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> > > > Stepping- SERR+ FastB2B- DisINTx+
> > > > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>  > > > SERR-  > > > Latency: 0, Cache Line Size: 64 bytes
> > > > Interrupt: pin A routed to IRQ 23
> > > > Region 0: Memory at fe90 (64-bit, non-prefetchable) [size=1M]
> > > > Region 2: Memory at fe00 (64-bit, prefetchable) [size=8M]
> > > > Expansion ROM at fea0 [disabled] [size=1M]
> > > > Capabilities: [40] Power Management version 3
> > > > Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
> PME(D0-,D1-,D2-,D3hot-,D3cold-
> > > )
> > > > Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> > > > Capabilities: [48] Vital Product Data
> > > > Product Name: CX354A - ConnectX-3 QSFP
> > > > Read-only fields:
> > > > [PN] Part number: MCX354A-FCBT
> > > > [EC] Engineering changes: A4
> > > > [SN] Serial number: MT1346X00791
> > > > [V0] Vendor specific: PCIe Gen3 x8
> > > > [RV] Reserved: checksum good, 0 byte(s) reserved
> > > > Read/write fields:
> > > > [V1] Vendor specific: N/A
> > > > [YA] Asset tag: N/A
> > > > [RW] Read-write area: 105 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 252 byte(s) free
> > > > End
> > > > Capabilities: [9c] MSI-X: Enable+ Count=128 Masked-
> > > > Vector table: BAR=0 offset=0007c000
> > > > PBA: BAR=0 offset=0007d000
> > > > Capabilities: [60] Express (v2) Root Complex Integrated Endpoint, MSI 00
> > > > DevCap: MaxPayload 256 bytes, PhantFunc 0
> > > > ExtTag- RBE+
> > > > DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
> > > > RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> > > > MaxPayload 256 bytes, MaxReadReq 4096 bytes
> > > > DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
> > > > DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
> > > > DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF Disabled
> > > > Capabilities: [100 v0] #00
> > >
> > > Here we 

Re: [Qemu-devel] [PATCH v15 16/25] qmp: add persistent flag to block-dirty-bitmap-add

2017-02-15 Thread John Snow


On 02/15/2017 05:10 AM, Vladimir Sementsov-Ogievskiy wrote:
> Add optional 'persistent' flag to qmp command block-dirty-bitmap-add.
> Default is false.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Signed-off-by: Denis V. Lunev 
> Reviewed-by: Max Reitz 

Reviewed-by: John Snow 

> ---
>  blockdev.c   | 18 +-
>  qapi/block-core.json |  8 +++-
>  2 files changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 245e1e1..40605fa 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1967,6 +1967,7 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
>  /* AIO context taken and released within qmp_block_dirty_bitmap_add */
>  qmp_block_dirty_bitmap_add(action->node, action->name,
> action->has_granularity, action->granularity,
> +   action->has_persistent, action->persistent,
> &local_err);
>  
>  if (!local_err) {
> @@ -2696,10 +2697,12 @@ out:
>  
>  void qmp_block_dirty_bitmap_add(const char *node, const char *name,
>  bool has_granularity, uint32_t granularity,
> +bool has_persistent, bool persistent,
>  Error **errp)
>  {
>  AioContext *aio_context;
>  BlockDriverState *bs;
> +BdrvDirtyBitmap *bitmap;
>  
>  if (!name || name[0] == '\0') {
>  error_setg(errp, "Bitmap name cannot be empty");
> @@ -2725,7 +2728,20 @@ void qmp_block_dirty_bitmap_add(const char *node, const char *name,
>  granularity = bdrv_get_default_bitmap_granularity(bs);
>  }
>  
> -bdrv_create_dirty_bitmap(bs, granularity, name, errp);
> +if (!has_persistent) {
> +persistent = false;
> +}
> +
> +if (persistent &&
> +!bdrv_can_store_new_dirty_bitmap(bs, name, granularity, errp))
> +{
> +goto out;
> +}
> +
> +bitmap = bdrv_create_dirty_bitmap(bs, granularity, name, errp);
> +if (bitmap != NULL) {
> +bdrv_dirty_bitmap_set_persistance(bitmap, persistent);
> +}
>  
>   out:
>  aio_context_release(aio_context);
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 932f5bb..535df20 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1559,10 +1559,16 @@
>  # @granularity: #optional the bitmap granularity, default is 64k for
>  #   block-dirty-bitmap-add
>  #
> +# @persistent: #optional the bitmap is persistent, i.e. it will be saved to the
> +#  corresponding block device image file on its close. For now only
> +#  Qcow2 disks support persistent bitmaps. Default is false.
> +#  (Since 2.9)
> +#
>  # Since: 2.4
>  ##
>  { 'struct': 'BlockDirtyBitmapAdd',
> -  'data': { 'node': 'str', 'name': 'str', '*granularity': 'uint32' } }
> +  'data': { 'node': 'str', 'name': 'str', '*granularity': 'uint32',
> +'*persistent': 'bool' } }
>  
>  ##
>  # @block-dirty-bitmap-add:
> 
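The optional-argument handling in the patch above follows the usual pattern for an optional QMP member: a has_<name> flag travels next to the value, and the value is only meaningful when the flag is set. A minimal sketch of that pattern, not the QEMU code itself; BitmapAddArgs and effective_persistent() are hypothetical names used only for illustration:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the (has_persistent, persistent) argument pair. */
typedef struct BitmapAddArgs {
    bool has_persistent;   /* true if the caller supplied 'persistent' */
    bool persistent;       /* only meaningful when has_persistent is true */
} BitmapAddArgs;

/* Apply the documented default (false) when the member was omitted. */
static bool effective_persistent(const BitmapAddArgs *args)
{
    return args->has_persistent ? args->persistent : false;
}
```

Resolving the default once, before any check such as "can this bitmap be stored?", keeps the later code from reading a value the caller never set.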




Re: [Qemu-devel] [PATCH v15 15/25] qcow2: add .bdrv_can_store_new_dirty_bitmap

2017-02-15 Thread John Snow


On 02/15/2017 05:10 AM, Vladimir Sementsov-Ogievskiy wrote:
> Implement the .bdrv_can_store_new_dirty_bitmap interface.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

Thanks,

Reviewed-by: John Snow 

> ---
>  block/qcow2-bitmap.c | 52 
> 
>  block/qcow2.c|  1 +
>  block/qcow2.h|  4 
>  3 files changed, 57 insertions(+)
> 
> diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
> index b177a95..1ee89e4 100644
> --- a/block/qcow2-bitmap.c
> +++ b/block/qcow2-bitmap.c
> @@ -1182,3 +1182,55 @@ fail:
>  
>  bitmap_list_free(bm_list);
>  }
> +
> +bool qcow2_can_store_new_dirty_bitmap(BlockDriverState *bs,
> +  const char *name,
> +  uint32_t granularity,
> +  Error **errp)
> +{
> +BDRVQcow2State *s = bs->opaque;
> +bool found;
> +Qcow2BitmapList *bm_list;
> +
> +if (check_constraints_on_bitmap(bs, name, granularity) != 0) {
> +error_setg(errp, "The constraints are not satisfied");
> +goto fail;
> +}
> +
> +if (s->nb_bitmaps == 0) {
> +return true;
> +}
> +
> +if (s->nb_bitmaps >= QCOW2_MAX_BITMAPS) {
> +error_setg(errp,
> +   "Maximum number of persistent bitmaps is already reached");
> +goto fail;
> +}
> +
> +if (s->bitmap_directory_size + calc_dir_entry_size(strlen(name), 0) >
> +QCOW2_MAX_BITMAP_DIRECTORY_SIZE)
> +{
> +error_setg(errp, "No enough space in the bitmap directory");
> +goto fail;
> +}
> +
> +bm_list = bitmap_list_load(bs, s->bitmap_directory_offset,
> +   s->bitmap_directory_size, errp);
> +if (bm_list == NULL) {
> +goto fail;
> +}
> +
> +found = find_bitmap_by_name(bm_list, name);
> +bitmap_list_free(bm_list);
> +if (found) {
> +error_setg(errp, "Bitmap with the same name is already stored");
> +goto fail;
> +}
> +
> +return true;
> +
> +fail:
> +error_prepend(errp, "Can't make bitmap '%s' persistent in '%s': ",
> +  name, bdrv_get_device_or_node_name(bs));
> +return false;
> +}
> diff --git a/block/qcow2.c b/block/qcow2.c
> index d0e41bf..6e1fe53 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -3541,6 +3541,7 @@ BlockDriver bdrv_qcow2 = {
>  
>  .bdrv_load_autoloading_dirty_bitmaps = qcow2_load_autoloading_dirty_bitmaps,
>  .bdrv_store_persistent_dirty_bitmaps = qcow2_store_persistent_dirty_bitmaps,
> +.bdrv_can_store_new_dirty_bitmap = qcow2_can_store_new_dirty_bitmap,
>  };
>  
>  static void bdrv_qcow2_init(void)
> diff --git a/block/qcow2.h b/block/qcow2.h
> index d9a7643..749710d 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -616,5 +616,9 @@ void qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table);
>  /* qcow2-bitmap.c functions */
>  void qcow2_load_autoloading_dirty_bitmaps(BlockDriverState *bs, Error **errp);
>  void qcow2_store_persistent_dirty_bitmaps(BlockDriverState *bs, Error **errp);
> +bool qcow2_can_store_new_dirty_bitmap(BlockDriverState *bs,
> +  const char *name,
> +  uint32_t granularity,
> +  Error **errp);
>  
>  #endif
> 



Re: [Qemu-devel] [PATCH v6 6/7] tests: Move reusable ACPI macros into a new header file

2017-02-15 Thread Ben Warren

> On Feb 15, 2017, at 2:56 PM, Eric Blake  wrote:
> 
> On 02/15/2017 03:58 PM, Ben Warren wrote:
> 
>>> 
>>> ---
>>> tests/acpi-utils.h   | 75 
>>> 
>>> tests/bios-tables-test.c | 72 +-
>>> 2 files changed, 76 insertions(+), 71 deletions(-)
> 
> 
>>> No copyright blurb? Also, does MAINTAINERS need an update to cover the
>>> new file?
>>> 
>> Sure, I didn’t realize the header files all have copyright headers.  As for 
>> MAINTAINERS, do you mean I should add a device entry for vmgenid?
> 
> In this patch, you're just refactoring to a new tests/acpi-utils.h, so
> I'd normally suggest adding it to the blurb that owns
> tests/bios-tables-test.c - but as a pre-existing problem, that also has
> no listed maintainer.  We're trying to ensure that all newly added files
> have something listed in MAINTAINERS, even if it is in a misc section
> that only emails the list, although it's harder to say what maintainer
> to use for existing files that you are merely touching, and failure to
> list a maintainer is not (yet) a hard failure (although there have been
> patches proposed to scripts/checkpatch.pl to tighten the rules).
> 
> A new section for vmgenid might not be a bad idea, especially if it
> covers more files than just the one addition I noticed in this patch.
> 
Thank you for clarifying.  I’ll take care of it.
> -- 
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
> 
—Ben





Re: [Qemu-devel] [RFC QEMU PATCH 1/8] nvdimm: do not initialize label_data if label_size is zero

2017-02-15 Thread Konrad Rzeszutek Wilk
On Mon, Oct 10, 2016 at 08:34:16AM +0800, Haozhong Zhang wrote:
> When memory-backend-xen is used, the label_data pointer cannot be obtained
> via memory_region_get_ram_ptr(). We will use other functions to get

Could you explain why it cannot be retrieved that way?

> label_data once we introduce NVDIMM label support to Xen.

Is there a particular patch in this series that does that?
You may want to enumerate which one it is.

> 
> Signed-off-by: Haozhong Zhang 
> ---
> Cc: Xiao Guangrong 
> Cc: "Michael S. Tsirkin" 
> Cc: Igor Mammedov 
> ---
>  hw/mem/nvdimm.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/mem/nvdimm.c b/hw/mem/nvdimm.c
> index 7895805..d25993b 100644
> --- a/hw/mem/nvdimm.c
> +++ b/hw/mem/nvdimm.c
> @@ -87,7 +87,9 @@ static void nvdimm_realize(PCDIMMDevice *dimm, Error **errp)
>  align = memory_region_get_alignment(mr);
>  
>  pmem_size = size - nvdimm->label_size;
> -nvdimm->label_data = memory_region_get_ram_ptr(mr) + pmem_size;
> +if (nvdimm->label_size) {
> +nvdimm->label_data = memory_region_get_ram_ptr(mr) + pmem_size;
> +}
>  pmem_size = QEMU_ALIGN_DOWN(pmem_size, align);
>  
>  if (size <= nvdimm->label_size || !pmem_size) {
> -- 
> 2.10.1
> 



Re: [Qemu-devel] [PATCH] target-ppc: Add quad precision muladd instructions

2017-02-15 Thread Richard Henderson

On 02/15/2017 05:37 PM, Bharata B Rao wrote:

+ *
+ * TODO: When float128_muladd() becomes available, switch this
+ * implementation to use that instead of separate float128_mul()
+ * followed by float128_add().


Let's just do that, rather than add something that can't pass tests.

You should be able to copy float64_muladd and, for the most part, apply
s/128/256/ and s/64/128/.  The other magic numbers, like the implicit bit and
the exponent bias, you can get from float128_mul.



r~
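To illustrate why a separate multiply followed by an add is not equivalent to a muladd: the separate form rounds the product before the addition, while a muladd keeps the product in a wider intermediate and rounds only at the end. The sketch below is not softfloat code. It uses float with a double intermediate as a small-scale stand-in for float128 with a 256-bit intermediate, and the function names are made up for the example. The product of two floats is exact in a double, but the final result can still be rounded twice in rare cases, so this is an illustration rather than a correctly rounded fused operation.

```c
/* Multiply, round to float, then add: two roundings. */
static float muladd_separate(float a, float b, float c)
{
    /* volatile forces the product to be rounded to float precision
     * and stops the compiler from contracting this into a fused op. */
    volatile float product = a * b;
    return product + c;
}

/* Keep the product in a wider type so it is not rounded before the add. */
static float muladd_wide(float a, float b, float c)
{
    double product = (double)a * (double)b;   /* exact: 24 + 24 bits fit in 53 */
    return (float)(product + (double)c);
}
```

With a = 1 + 2^-13, b = 1 - 2^-13 and c = -1, the exact product is 1 - 2^-26, which rounds to 1.0 in float. The separate form therefore returns 0, while the wide form returns -2^-26.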



Re: [Qemu-devel] [PATCH v6 6/7] tests: Move reusable ACPI macros into a new header file

2017-02-15 Thread Ben Warren

> On Feb 15, 2017, at 1:35 PM, Eric Blake  wrote:
> 
> On 02/15/2017 12:15 AM, b...@skyportsystems.com wrote:
>> From: Ben Warren 
>> 
>> Also usable by upcoming VM Generation ID tests
>> 
>> Signed-off-by: Ben Warren 
>> ---
>> tests/acpi-utils.h   | 75 
>> 
>> tests/bios-tables-test.c | 72 +-
>> 2 files changed, 76 insertions(+), 71 deletions(-)
>> create mode 100644 tests/acpi-utils.h
>> 
>> diff --git a/tests/acpi-utils.h b/tests/acpi-utils.h
>> new file mode 100644
>> index 000..d5e5eff
>> --- /dev/null
>> +++ b/tests/acpi-utils.h
>> @@ -0,0 +1,75 @@
>> +#ifndef TEST_ACPI_UTILS_H
>> +#define TEST_ACPI_UTILS_H
> 
> No copyright blurb? Also, does MAINTAINERS need an update to cover the
> new file?
> 
Sure, I didn’t realize the header files all have copyright headers.  As for 
MAINTAINERS, do you mean I should add a device entry for vmgenid?

thanks,
Ben
> -- 
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org 




Re: [Qemu-devel] [PATCH 1/2] ide: core: add cleanup function

2017-02-15 Thread John Snow


On 02/15/2017 04:26 AM, Li Qiang wrote:
> Hello,
> 
> 2017-02-15 7:30 GMT+08:00 John Snow:
> 
> 
> 
> On 02/09/2017 08:22 PM, Li Qiang wrote:
> > Hello John,
> >
> > 2017-02-10 3:42 GMT+08:00 John Snow:
> >
> >
> >
> > On 02/09/2017 02:04 AM, Li Qiang wrote:
> > > As the PCI AHCI device can be hotplugged and unplugged, the AHCI unrealize
> > > function should free all the resources allocated in the
> > > realize function. This patch adds two cleanup functions.
> > >
> > > Signed-off-by: Li Qiang
> > > ---
> > >  hw/ide/core.c | 21 +
> > >  include/hw/ide/internal.h |  2 ++
> > >  2 files changed, 23 insertions(+)
> > >
> > > diff --git a/hw/ide/core.c b/hw/ide/core.c
> > > index 43709e5..8fe5896 100644
> > > --- a/hw/ide/core.c
> > > +++ b/hw/ide/core.c
> > > @@ -2586,6 +2586,13 @@ void ide_register_restart_cb(IDEBus *bus)
> > >  }
> > >  }
> > >
> > > +void ide_unregister_restart_cb(IDEBus *bus)
> > > +{
> > > +if (bus->dma->ops->restart_dma) {
> > > +qemu_del_vm_change_state_handler(bus->vmstate);
> > > +}
> > > +}
> > > +
> >
> > Doesn't this conflict with qdev.c's idebus_unrealize call?
> >
> >
> > As far as I can see, there is no conflict. Let's get a confirmation.
> >
> > Hello Paolo,
> >
> > Does this patch conflict with or affect qdev?
> >
> > Thanks.
> >
> 
> They're both deleting the same VMstate handler, so as far as I can see
> this is a redundant call.
> 
> 
> IIUC, the idebus_unrealize in qdev.c is for the qdev model,
> but the added function is for the QOM model.
> For example, if you use 'device_add ahci,id=ahci'/'device_del ahci' in
> QMP,
> 
> QEMU will call 'pci_ich9_ahci_realize'/'pci_ich9_uninit'.
> 
> 
>  

I'm sorry, I still don't understand. Do you have some reproducer or case
where I can verify that this leaks?

It doesn't look as if you can hot-add or hot-remove an AHCI device right
now anyway, have you tested this?

Further, if the two calls AREN'T in conflict, I'd rather find some
cleanup mechanism that handles all the unrealize/uninit cases together
instead of having separate cleanup pathways.

--js



Re: [Qemu-devel] [PATCH v3 0/5] SLIRP VMStatification

2017-02-15 Thread Dr. David Alan Gilbert
Ewww, it looks like I have a mingw disagreement I need to fix;



> QEMU_CFLAGS   -I/usr/x86_64-w64-mingw32/sys-root/mingw/include/pixman-1  
> -I$(SRC_PATH)/dtc/libfdt -Werror -mms-bitfields 
> -I/usr/x86_64-w64-mingw32/sys-root/mingw/include/glib-2.0 
> -I/usr/x86_64-w64-mingw32/sys-root/mingw/lib/glib-2.0/include 
> -I/usr/x86_64-w64-mingw32/sys-root/mingw/include  -m64 -mcx16 -mthreads 
> -D__USE_MINGW_ANSI_STDIO=1 -DWIN32_LEAN_AND_MEAN -DWINVER=0x501 -D_GNU_SOURCE 
> -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes 
> -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes 
> -fno-strict-aliasing -fno-common -fwrapv  -Wendif-labels 
> -Wno-shift-negative-value -Wno-missing-include-dirs -Wempty-body 
> -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self 
> -Wignored-qualifiers -Wold-style-declaration -Wold-style-definition 
> -Wtype-limits -fstack-protector-strong 
> -I/usr/x86_64-w64-mingw32/sys-root/mingw/include 
> -I/usr/x86_64-w64-mingw32/sys-root/mingw/include/p11-kit-1 
> -I/usr/x86_64-w64-mingw32/sys-root/mingw/include  
> -I/usr/x86_64-w64-mingw32/sys-root/mingw/include   
> -I/usr/x86_64-w64-mingw32/sys-root/mingw/include/libpng16 



error: invalid operands to binary - (have 'uint16_t * {aka short unsigned int *}' and 'short int *')
> /tmp/qemu-test/src/slirp/slirp.c:1280:9: note: in expansion of macro 
> 'VMSTATE_UINT16'
>  VMSTATE_UINT16(ss.ss_family, union slirp_sockaddr),
>  ^



  error: invalid operands to binary - (have 'uint32_t * {aka unsigned int *}' and 'u_long * {aka long unsigned int *}')
> /tmp/qemu-test/src/slirp/slirp.c:1281:9: note: in expansion of macro 
> 'VMSTATE_UINT32_TEST'
>  VMSTATE_UINT32_TEST(sin.sin_addr.s_addr, union slirp_sockaddr,
>  ^

> /tmp/qemu-test/src/slirp/slirp.c:1309:9: note: in expansion of macro 
> 'VMSTATE_UINT32_TEST'
>  VMSTATE_UINT32_TEST(so_faddr.s_addr, struct socket,
>  ^


> /tmp/qemu-test/src/slirp/slirp.c:1311:9: note: in expansion of macro 
> 'VMSTATE_UINT32_TEST'
>  VMSTATE_UINT32_TEST(so_laddr.s_addr, struct socket,
>  ^


My mingw headers have:

  struct sockaddr_storage {
short ss_family;
char __ss_pad1[_SS_PAD1SIZE];

__MINGW_EXTENSION __int64 __ss_align;
char __ss_pad2[_SS_PAD2SIZE];
  };

so the ss_family problem is a signedness problem.

and:

typedef struct in_addr {
  union {
struct { u_char  s_b1, s_b2, s_b3, s_b4; } S_un_b;
struct { u_short s_w1, s_w2; } S_un_w;
u_long S_addr;
  } S_un;
} IN_ADDR, *PIN_ADDR, *LPIN_ADDR;

Hmm, as far as I can tell its longs are still 32-bit, so I'll need
to dig to figure that out.

I'll go and figure it out.


Dave
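The "invalid operands to binary -" errors above are what a pointer-subtraction type check produces: subtracting two pointers only compiles when they point to compatible types, so a field whose declared type differs from the expected one (short vs uint16_t, or u_long vs uint32_t, even where the widths match) is rejected at build time. A minimal sketch of that idea, modelled on the style of check the VMSTATE macros perform; checked_offset() and struct demo_sockaddr are names invented for this example, and the sketch relies on the GNU __typeof__ extension:

```c
#include <stddef.h>
#include <stdint.h>

/* Evaluates to 0, but only compiles if t1 and t2 are compatible types.
 * Arithmetic on null pointers is formally undefined; the expression is
 * used here purely for its compile-time type check. */
#define type_check(t1, t2) ((t1 *)0 - (t2 *)0)

/* Offset of 'field' in 'state', checked against the expected 'type'. */
#define checked_offset(state, field, type) \
    (offsetof(state, field) + \
     type_check(type, __typeof__(((state *)0)->field)))

/* Hypothetical structure standing in for the socket address union. */
struct demo_sockaddr {
    uint16_t family;   /* declaring this as 'short' would break the build */
    uint32_t addr;
};

static size_t demo_family_offset(void)
{
    return checked_offset(struct demo_sockaddr, family, uint16_t);
}

static size_t demo_addr_offset(void)
{
    return checked_offset(struct demo_sockaddr, addr, uint32_t);
}
```

Changing the family member to short reproduces the first error in the log, because uint16_t * and short * are not compatible pointer types.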

> /tmp/qemu-test/src/rules.mak:69: recipe for target 'slirp/slirp.o' failed
> make: *** [slirp/slirp.o] Error 1
> make: *** Waiting for unfinished jobs
> make[1]: *** [docker-run] Error 2
> make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-t49wmu6l/src'
> make: *** [docker-run-test-mingw@fedora] Error 2
> === OUTPUT END ===
> 
> Test command exited with code: 2
> 
> 
> ---
> Email generated automatically by Patchew [http://patchew.org/].
> Please send your feedback to patchew-de...@freelists.org
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v4 2/3] report guest crash information in GUEST_PANICKED event

2017-02-15 Thread Eric Blake
On 02/14/2017 12:25 AM, Denis V. Lunev wrote:
> From: Anton Nefedov 
> 
> It's not very convenient to use the crash-information property interface,
> so provide a CPU class callback to get the guest crash information, and pass
> that information in the event.
> 
> Signed-off-by: Anton Nefedov 
> Signed-off-by: Denis V. Lunev 
> ---

> +++ b/qapi/event.json
> @@ -488,7 +488,9 @@
>  #
>  # @action: action that has been taken, currently always "pause"
>  #
> -# Since: 1.5
> +# @info: optional information about a panic
> +#
> +# Since: 1.5 (@info since 2.9)

This is more usually written:

# @info: #optional information about a panic (since 2.9)
#
# Since: 1.5


> +
> +if (info) {
> +qapi_free_GuestPanicInformation(info);

qapi_free_*() is safe to call on a NULL pointer, so this is a useless 'if'.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org
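The point about the redundant 'if' can be shown with any free-style function that accepts NULL, in the same way free() does. A minimal sketch under that assumption; PanicInfo and panic_info_free() are hypothetical stand-ins, not the generated QAPI code:

```c
#include <stdlib.h>

/* Hypothetical structure standing in for the guest panic information. */
typedef struct PanicInfo {
    char *message;
} PanicInfo;

/* Free a PanicInfo; like free(), this is a no-op for a NULL pointer. */
static void panic_info_free(PanicInfo *info)
{
    if (!info) {
        return;
    }
    free(info->message);
    free(info);
}

/* Caller side: no NULL check is needed before freeing. */
static void report_and_release(PanicInfo *info)
{
    /* ... emit the event using info, which may be NULL ... */
    panic_info_free(info);
}
```

Putting the NULL check inside the free function means every caller can drop its own check, which is the simplification suggested above.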





Re: [Qemu-devel] [PATCH v6 1/7] linker-loader: Add new 'write pointer' command

2017-02-15 Thread Ben Warren

> On Feb 15, 2017, at 11:14 AM, Ben Warren  wrote:
> 
>> 
>> On Feb 15, 2017, at 10:24 AM, Igor Mammedov wrote:
>> 
>> On Wed, 15 Feb 2017 20:04:40 +0200
>> "Michael S. Tsirkin" wrote:
>> 
>>> On Wed, Feb 15, 2017 at 06:43:09PM +0100, Igor Mammedov wrote:
 On Wed, 15 Feb 2017 18:39:06 +0200
 "Michael S. Tsirkin" > wrote:
 
> On Wed, Feb 15, 2017 at 04:56:02PM +0100, Igor Mammedov wrote:  
>> On Wed, 15 Feb 2017 17:30:00 +0200
>> "Michael S. Tsirkin" wrote:
>> 
>>> On Wed, Feb 15, 2017 at 04:22:25PM +0100, Igor Mammedov wrote:
 On Wed, 15 Feb 2017 15:13:20 +0100
 Laszlo Ersek > wrote:
 
> Commenting under Igor's reply for simplicity
> 
> On 02/15/17 11:57, Igor Mammedov wrote:  
>> On Tue, 14 Feb 2017 22:15:43 -0800
>> b...@skyportsystems.com wrote:
>> 
>>> From: Ben Warren
>>> 
>>> This is similar to the existing 'add pointer' functionality, but 
>>> instead
>>> of instructing the guest (BIOS or UEFI) to patch memory, it 
>>> instructs
>>> the guest to write the pointer back to QEMU via a writeable fw_cfg 
>>> file.
>>> 
>>> Signed-off-by: Ben Warren
>>> ---
>>> hw/acpi/bios-linker-loader.c | 58 
>>> ++--
>>> include/hw/acpi/bios-linker-loader.h |  6 
>>> 2 files changed, 61 insertions(+), 3 deletions(-)
>>> 
>>> diff --git a/hw/acpi/bios-linker-loader.c 
>>> b/hw/acpi/bios-linker-loader.c
>>> index d963ebe..5030cf1 100644
>>> --- a/hw/acpi/bios-linker-loader.c
>>> +++ b/hw/acpi/bios-linker-loader.c
>>> @@ -78,6 +78,19 @@ struct BiosLinkerLoaderEntry {
>>> uint32_t length;
>>> } cksum;
>>> 
>>> +/*
>>> + * COMMAND_WRITE_POINTER - write the fw_cfg file 
>>> (originating from
>>> + * @dest_file) at @wr_pointer.offset, by adding a pointer 
>>> to the table
>>> + * originating from @src_file. 1,2,4 or 8 byte unsigned
>>> + * addition is used depending on @wr_pointer.size.
>>> + */
> 
> The words "adding" and "addition" are causing confusion here.
> 
> In all of the previous discussion, *addition* was out of scope from
> WRITE_POINTER. Again, the firmware is specifically not required to
> *read* any part of the fw_cfg blob identified by "dest_file".
> 
> WRITE_POINTER instructs the firmware to return the allocation address 
> of
> the downloaded "src_file" to QEMU. Any necessary runtime subscripting
> within "src_file" is to be handled by QEMU code dynamically.
> 
> For example, consider that "src_file" has *several* fields that QEMU
> wants to massage; in that case, indexing within QEMU code with field
> offsets is simply unavoidable.  
 what I don't like here is that this indexing would be rather fragile
 and has to be done in different parts of QEMU /device, AML/.
 
 I'd prefer this helper function to have the same @src_offset
 behavior as ADD_POINTER where patched address could point to
 any part of src_file i.e. not just beginning.  
>>> 
>>> 
>>> 
>>>/*
>>> * COMMAND_ADD_POINTER - patch the table (originating from
>>> * @dest_file) at @pointer.offset, by adding a pointer to the 
>>> table
>>> * originating from @src_file. 1,2,4 or 8 byte unsigned
>>> * addition is used depending on @pointer.size.
>>> */
>>> 
>>> so the way ADD works is
>>> read at offset
>>> add table address
>>> write result at offset
>>> 
>>> in other words it is always beginning of table that is added.
>> more exactly it's, read at 
>>  src_offset = *(dst_blob_ptr+dst_offset)
>>  *(dst_blob+dst_offset) = src_blob_ptr + src_offset
>> 
>>> Would the following be acceptable?
>>> 
>>> 
>>> * COMMAND_WRITE_POINTER - update the fw_cfg file (originating 
>>> from
>>> * @dest_file) at @wr_pointer.offset, by writing a pointer to 
>>> the table
>>> * originating from @src_file. 1,2,4 or 8 byte unsigned value
>>> * is written depending on 

Re: [Qemu-devel] [PATCH v6 1/7] linker-loader: Add new 'write pointer' command

2017-02-15 Thread Igor Mammedov
On Wed, 15 Feb 2017 10:14:55 -0800
Ben Warren  wrote:

> > On Feb 15, 2017, at 10:06 AM, Michael S. Tsirkin  wrote:
> > 
> > On Wed, Feb 15, 2017 at 09:54:08AM -0800, Ben Warren wrote:  
> >> 
> >>On Feb 15, 2017, at 9:43 AM, Igor Mammedov  wrote:
> >> 
> >>On Wed, 15 Feb 2017 18:39:06 +0200
> >>"Michael S. Tsirkin"  wrote:
> >> 
> >> 
> >>On Wed, Feb 15, 2017 at 04:56:02PM +0100, Igor Mammedov wrote:
> >> 
> >>On Wed, 15 Feb 2017 17:30:00 +0200
> >>"Michael S. Tsirkin"  wrote:
> >> 
> >> 
> >>On Wed, Feb 15, 2017 at 04:22:25PM +0100, Igor Mammedov 
> >> wrote:
> >> 
> >> 
> >>On Wed, 15 Feb 2017 15:13:20 +0100
> >>Laszlo Ersek  wrote:
> >> 
> >> 
> >>Commenting under Igor's reply for simplicity
> >> 
> >>On 02/15/17 11:57, Igor Mammedov wrote:
> >> 
> >>On Tue, 14 Feb 2017 22:15:43 -0800
> >>b...@skyportsystems.com wrote:
> >> 
> >> 
> >>From: Ben Warren 
> >> 
> >>This is similar to the existing 'add 
> >> pointer'
> >>functionality, but instead
> >>of instructing the guest (BIOS or UEFI) to
> >>patch memory, it instructs
> >>the guest to write the pointer back to QEMU 
> >> via
> >>a writeable fw_cfg file.
> >> 
> >>Signed-off-by: Ben Warren <  
> >>b...@skyportsystems.com>  
> >>---
> >>hw/acpi/bios-linker-loader.c | 58
> >>++--
> >>include/hw/acpi/bios-linker-loader.h |  6 
> >> 
> >>2 files changed, 61 insertions(+), 3 
> >> deletions
> >>(-)
> >> 
> >>diff --git a/hw/acpi/bios-linker-loader.c 
> >> b/hw/
> >>acpi/bios-linker-loader.c
> >>index d963ebe..5030cf1 100644
> >>--- a/hw/acpi/bios-linker-loader.c
> >>+++ b/hw/acpi/bios-linker-loader.c
> >>@@ -78,6 +78,19 @@ struct 
> >> BiosLinkerLoaderEntry
> >>{
> >>uint32_t length;
> >>} cksum;
> >> 
> >>+/*
> >>+ * COMMAND_WRITE_POINTER - write 
> >> the
> >>fw_cfg file (originating from
> >>+ * @dest_file) at 
> >> @wr_pointer.offset,
> >>by adding a pointer to the table
> >>+ * originating from @src_file. 
> >> 1,2,4
> >>or 8 byte unsigned
> >>+ * addition is used depending on
> >>@wr_pointer.size.
> >>+ */  
> >> 
> >> 
> >>The words "adding" and "addition" are causing 
> >> confusion
> >>here.
> >> 
> >>In all of the previous discussion, *addition* was 
> >> out
> >>of scope from
> >>WRITE_POINTER. Again, the firmware is specifically 
> >> not
> >>required to
> >>*read* any part of the fw_cfg blob identified by
> >>"dest_file".
> >> 
> >>WRITE_POINTER instructs the firmware to return the
> >>allocation address of
> >>the downloaded "src_file" to QEMU. Any necessary
> >>runtime subscripting
> >>within "src_file" is to be handled by QEMU code
> >>dynamically.
> >> 
> >>For example, consider that "src_file" has *several*
> >>fields that QEMU
> >>wants to massage; in that case, indexing within QEMU
> >>code with field
> >>offsets is simply unavoidable.
> >> 
> >>what I don't like here is that this indexing would be
> >>rather fragile
> >>and has to be done in different parts of QEMU /device, 
> >> AML
> >>   

Re: [Qemu-devel] [PATCH v3 0/5] SLIRP VMStatification

2017-02-15 Thread no-reply
Hi,

Your series seems to have some coding style problems. See output below for
more information:

Type: series
Subject: [Qemu-devel] [PATCH v3 0/5] SLIRP VMStatification
Message-id: 20170215181922.11624-1-dgilb...@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/20170215181922.11624-1-dgilb...@redhat.com -> 
patchew/20170215181922.11624-1-dgilb...@redhat.com
Switched to a new branch 'test'
158a5ce slirp: VMStatify remaining except for loop
c9a77f6 slirp: VMStatify socket level
a3add04 slirp: Common lhost/fhost union
0c4b4ae slirp: VMStatify sbuf
b9e6cbb slirp: VMState conversion; tcpcb

=== OUTPUT BEGIN ===
Checking PATCH 1/5: slirp: VMState conversion; tcpcb...
ERROR: code indent should never use tabs
#211: FILE: slirp/tcp_var.h:51:
+^Iuint8_t t_force;^I^I/* 1 if forcing out a byte */$

ERROR: code indent should never use tabs
#221: FILE: slirp/tcp_var.h:112:
+^Iuint8_t^It_oobflags;^I^I/* have some */$

ERROR: code indent should never use tabs
#222: FILE: slirp/tcp_var.h:113:
+^Iuint8_t^It_iobc;^I^I^I/* input character */$

total: 3 errors, 0 warnings, 195 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 2/5: slirp: VMStatify sbuf...
ERROR: code indent should never use tabs
#26: FILE: slirp/sbuf.h:15:
+^Iuint32_t sb_cc;^I^I/* actual chars in buffer */$

ERROR: code indent should never use tabs
#27: FILE: slirp/sbuf.h:16:
+^Iuint32_t sb_datalen;^I/* Length of data  */$

total: 2 errors, 0 warnings, 155 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 3/5: slirp: Common lhost/fhost union...
Checking PATCH 4/5: slirp: VMStatify socket level...
ERROR: if this code is redundant consider removing it
#86: FILE: slirp/slirp.c:1286:
+#if 0

total: 1 errors, 0 warnings, 206 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 5/5: slirp: VMStatify remaining except for loop...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

[Qemu-devel] [PATCH v3 3/5] slirp: Common lhost/fhost union

2017-02-15 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

The socket structure has a pair of unions for lhost and fhost
addresses; the unions are identical so split them out into
a separate union declaration.

Signed-off-by: Dr. David Alan Gilbert 
---
 slirp/socket.h | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/slirp/socket.h b/slirp/socket.h
index 8feed2a..c1be77e 100644
--- a/slirp/socket.h
+++ b/slirp/socket.h
@@ -15,6 +15,12 @@
  * Our socket structure
  */
 
+union slirp_sockaddr {
+struct sockaddr_storage ss;
+struct sockaddr_in sin;
+struct sockaddr_in6 sin6;
+};
+
 struct socket {
   struct socket *so_next,*so_prev;  /* For a linked list of sockets */
 
@@ -31,22 +37,14 @@ struct socket {
   struct tcpiphdr *so_ti; /* Pointer to the original ti within
* so_mconn, for non-blocking connections */
   int so_urgc;
-  union {   /* foreign host */
-  struct sockaddr_storage ss;
-  struct sockaddr_in sin;
-  struct sockaddr_in6 sin6;
-  } fhost;
+  union slirp_sockaddr fhost;  /* Foreign host */
 #define so_faddr fhost.sin.sin_addr
 #define so_fport fhost.sin.sin_port
 #define so_faddr6 fhost.sin6.sin6_addr
 #define so_fport6 fhost.sin6.sin6_port
 #define so_ffamily fhost.ss.ss_family
 
-  union {   /* local host */
-  struct sockaddr_storage ss;
-  struct sockaddr_in sin;
-  struct sockaddr_in6 sin6;
-  } lhost;
+  union slirp_sockaddr lhost;  /* Local host */
 #define so_laddr lhost.sin.sin_addr
 #define so_lport lhost.sin.sin_port
 #define so_laddr6 lhost.sin6.sin6_addr
-- 
2.9.3




[Qemu-devel] [PATCH v3 1/5] slirp: VMState conversion; tcpcb

2017-02-15 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

Convert the migration of the struct tcpcb to use a VMStateDescription,
the rest of it will come later.

Mostly mechanical, except for conversion of some 'char' to uint8_t
to ensure portability.

Signed-off-by: Dr. David Alan Gilbert 
Reviewed-by: Samuel Thibault 
Reviewed-by: Juan Quintela 
---
 slirp/slirp.c   | 149 
 slirp/tcp_var.h |   6 +--
 2 files changed, 57 insertions(+), 98 deletions(-)

diff --git a/slirp/slirp.c b/slirp/slirp.c
index 60539de..276d8cb 100644
--- a/slirp/slirp.c
+++ b/slirp/slirp.c
@@ -1129,53 +1129,62 @@ void slirp_socket_recv(Slirp *slirp, struct in_addr guest_addr, int guest_port,
 tcp_output(sototcpcb(so));
 }
 
-static void slirp_tcp_save(QEMUFile *f, struct tcpcb *tp)
+static int slirp_tcp_post_load(void *opaque, int version)
 {
-int i;
+tcp_template((struct tcpcb *)opaque);
 
-qemu_put_sbe16(f, tp->t_state);
-for (i = 0; i < TCPT_NTIMERS; i++)
-qemu_put_sbe16(f, tp->t_timer[i]);
-qemu_put_sbe16(f, tp->t_rxtshift);
-qemu_put_sbe16(f, tp->t_rxtcur);
-qemu_put_sbe16(f, tp->t_dupacks);
-qemu_put_be16(f, tp->t_maxseg);
-qemu_put_sbyte(f, tp->t_force);
-qemu_put_be16(f, tp->t_flags);
-qemu_put_be32(f, tp->snd_una);
-qemu_put_be32(f, tp->snd_nxt);
-qemu_put_be32(f, tp->snd_up);
-qemu_put_be32(f, tp->snd_wl1);
-qemu_put_be32(f, tp->snd_wl2);
-qemu_put_be32(f, tp->iss);
-qemu_put_be32(f, tp->snd_wnd);
-qemu_put_be32(f, tp->rcv_wnd);
-qemu_put_be32(f, tp->rcv_nxt);
-qemu_put_be32(f, tp->rcv_up);
-qemu_put_be32(f, tp->irs);
-qemu_put_be32(f, tp->rcv_adv);
-qemu_put_be32(f, tp->snd_max);
-qemu_put_be32(f, tp->snd_cwnd);
-qemu_put_be32(f, tp->snd_ssthresh);
-qemu_put_sbe16(f, tp->t_idle);
-qemu_put_sbe16(f, tp->t_rtt);
-qemu_put_be32(f, tp->t_rtseq);
-qemu_put_sbe16(f, tp->t_srtt);
-qemu_put_sbe16(f, tp->t_rttvar);
-qemu_put_be16(f, tp->t_rttmin);
-qemu_put_be32(f, tp->max_sndwnd);
-qemu_put_byte(f, tp->t_oobflags);
-qemu_put_byte(f, tp->t_iobc);
-qemu_put_sbe16(f, tp->t_softerror);
-qemu_put_byte(f, tp->snd_scale);
-qemu_put_byte(f, tp->rcv_scale);
-qemu_put_byte(f, tp->request_r_scale);
-qemu_put_byte(f, tp->requested_s_scale);
-qemu_put_be32(f, tp->ts_recent);
-qemu_put_be32(f, tp->ts_recent_age);
-qemu_put_be32(f, tp->last_ack_sent);
+return 0;
 }
 
+static const VMStateDescription vmstate_slirp_tcp = {
+.name = "slirp-tcp",
+.version_id = 0,
+.post_load = slirp_tcp_post_load,
+.fields = (VMStateField[]) {
+VMSTATE_INT16(t_state, struct tcpcb),
+VMSTATE_INT16_ARRAY(t_timer, struct tcpcb, TCPT_NTIMERS),
+VMSTATE_INT16(t_rxtshift, struct tcpcb),
+VMSTATE_INT16(t_rxtcur, struct tcpcb),
+VMSTATE_INT16(t_dupacks, struct tcpcb),
+VMSTATE_UINT16(t_maxseg, struct tcpcb),
+VMSTATE_UINT8(t_force, struct tcpcb),
+VMSTATE_UINT16(t_flags, struct tcpcb),
+VMSTATE_UINT32(snd_una, struct tcpcb),
+VMSTATE_UINT32(snd_nxt, struct tcpcb),
+VMSTATE_UINT32(snd_up, struct tcpcb),
+VMSTATE_UINT32(snd_wl1, struct tcpcb),
+VMSTATE_UINT32(snd_wl2, struct tcpcb),
+VMSTATE_UINT32(iss, struct tcpcb),
+VMSTATE_UINT32(snd_wnd, struct tcpcb),
+VMSTATE_UINT32(rcv_wnd, struct tcpcb),
+VMSTATE_UINT32(rcv_nxt, struct tcpcb),
+VMSTATE_UINT32(rcv_up, struct tcpcb),
+VMSTATE_UINT32(irs, struct tcpcb),
+VMSTATE_UINT32(rcv_adv, struct tcpcb),
+VMSTATE_UINT32(snd_max, struct tcpcb),
+VMSTATE_UINT32(snd_cwnd, struct tcpcb),
+VMSTATE_UINT32(snd_ssthresh, struct tcpcb),
+VMSTATE_INT16(t_idle, struct tcpcb),
+VMSTATE_INT16(t_rtt, struct tcpcb),
+VMSTATE_UINT32(t_rtseq, struct tcpcb),
+VMSTATE_INT16(t_srtt, struct tcpcb),
+VMSTATE_INT16(t_rttvar, struct tcpcb),
+VMSTATE_UINT16(t_rttmin, struct tcpcb),
+VMSTATE_UINT32(max_sndwnd, struct tcpcb),
+VMSTATE_UINT8(t_oobflags, struct tcpcb),
+VMSTATE_UINT8(t_iobc, struct tcpcb),
+VMSTATE_INT16(t_softerror, struct tcpcb),
+VMSTATE_UINT8(snd_scale, struct tcpcb),
+VMSTATE_UINT8(rcv_scale, struct tcpcb),
+VMSTATE_UINT8(request_r_scale, struct tcpcb),
+VMSTATE_UINT8(requested_s_scale, struct tcpcb),
+VMSTATE_UINT32(ts_recent, struct tcpcb),
+VMSTATE_UINT32(ts_recent_age, struct tcpcb),
+VMSTATE_UINT32(last_ack_sent, struct tcpcb),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static void slirp_sbuf_save(QEMUFile *f, struct sbuf *sbuf)
 {
 uint32_t off;
@@ -1218,7 +1227,7 @@ static void slirp_socket_save(QEMUFile *f, struct socket 
*so)
 qemu_put_be32(f, so->so_state);
 

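Note on the conversion above: the point of the descriptor is that one table
replaces the hand-written put/get sequence. A minimal sketch of that idea,
outside QEMU, follows. Everything in it (struct mini_tcpcb, struct field_desc,
FIELD, save_fields) is a hypothetical stand-in, not the VMState API; it only
assumes that fields are written big-endian in table order, as the removed
qemu_put_be16/be32 calls did.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature of struct tcpcb: just three of its fields. */
struct mini_tcpcb {
    int16_t  t_state;
    uint8_t  t_force;
    uint32_t snd_una;
};

/* One entry per migrated field: where it lives and how wide it is. */
struct field_desc {
    size_t offset;
    size_t size;   /* 1, 2 or 4 bytes */
};

#define FIELD(type, member) \
    { offsetof(type, member), sizeof(((type *)0)->member) }

static const struct field_desc mini_tcpcb_fields[] = {
    FIELD(struct mini_tcpcb, t_state),
    FIELD(struct mini_tcpcb, t_force),
    FIELD(struct mini_tcpcb, snd_una),
};

/* Walk the table and emit every field big-endian; returns bytes written. */
static size_t save_fields(uint8_t *out, size_t out_len, const void *obj,
                          const struct field_desc *fields, size_t nfields)
{
    size_t pos = 0;

    for (size_t i = 0; i < nfields; i++) {
        const uint8_t *src = (const uint8_t *)obj + fields[i].offset;
        size_t size = fields[i].size;
        uint32_t v = 0;

        assert(size == 1 || size == 2 || size == 4);
        assert(pos + size <= out_len);

        /* Read the field in host byte order, whatever its width. */
        if (size == 1) {
            v = *src;
        } else if (size == 2) {
            uint16_t tmp;
            memcpy(&tmp, src, sizeof(tmp));
            v = tmp;
        } else {
            memcpy(&v, src, sizeof(v));
        }

        /* Emit the most significant byte first. */
        for (size_t b = size; b-- > 0; ) {
            out[pos++] = (uint8_t)(v >> (8 * b));
        }
    }
    return pos;
}
```

Because the width comes from sizeof() on the member, a field declared as plain
'char' would still serialize as one byte, but its signedness would differ
between hosts; that is the portability concern the commit message mentions.
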
[Qemu-devel] [PATCH v3 5/5] slirp: VMStatify remaining except for loop

2017-02-15 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

This converts the remaining components, except for the top level
loop, to VMState.

Signed-off-by: Dr. David Alan Gilbert 
---
 slirp/slirp.c | 48 +++-
 1 file changed, 19 insertions(+), 29 deletions(-)

diff --git a/slirp/slirp.c b/slirp/slirp.c
index 6fc7bac..0fcab0b 100644
--- a/slirp/slirp.c
+++ b/slirp/slirp.c
@@ -1332,15 +1332,25 @@ static const VMStateDescription vmstate_slirp_socket = {
 }
 };
 
-static void slirp_bootp_save(QEMUFile *f, Slirp *slirp)
-{
-int i;
+static const VMStateDescription vmstate_slirp_bootp_client = {
+.name = "slirp_bootpclient",
+.fields = (VMStateField[]) {
+VMSTATE_UINT16(allocated, BOOTPClient),
+VMSTATE_BUFFER(macaddr, BOOTPClient),
+VMSTATE_END_OF_LIST()
+}
+};
 
-for (i = 0; i < NB_BOOTP_CLIENTS; i++) {
-qemu_put_be16(f, slirp->bootp_clients[i].allocated);
-qemu_put_buffer(f, slirp->bootp_clients[i].macaddr, 6);
+static const VMStateDescription vmstate_slirp = {
+.name = "slirp",
+.version_id = 4,
+.fields = (VMStateField[]) {
+VMSTATE_UINT16_V(ip_id, Slirp, 2),
+VMSTATE_STRUCT_ARRAY(bootp_clients, Slirp, NB_BOOTP_CLIENTS, 3,
+ vmstate_slirp_bootp_client, BOOTPClient),
+VMSTATE_END_OF_LIST()
 }
-}
+};
 
 static void slirp_state_save(QEMUFile *f, void *opaque)
 {
@@ -1360,22 +1370,10 @@ static void slirp_state_save(QEMUFile *f, void *opaque)
 }
 qemu_put_byte(f, 0);
 
-qemu_put_be16(f, slirp->ip_id);
-
-slirp_bootp_save(f, slirp);
+vmstate_save_state(f, &vmstate_slirp, slirp, NULL);
 }
 
 
-static void slirp_bootp_load(QEMUFile *f, Slirp *slirp)
-{
-int i;
-
-for (i = 0; i < NB_BOOTP_CLIENTS; i++) {
-slirp->bootp_clients[i].allocated = qemu_get_be16(f);
-qemu_get_buffer(f, slirp->bootp_clients[i].macaddr, 6);
-}
-}
-
 static int slirp_state_load(QEMUFile *f, void *opaque, int version_id)
 {
 Slirp *slirp = opaque;
@@ -1410,13 +1408,5 @@ static int slirp_state_load(QEMUFile *f, void *opaque, 
int version_id)
 so->extra = (void *)ex_ptr->ex_exec;
 }
 
-if (version_id >= 2) {
-slirp->ip_id = qemu_get_be16(f);
-}
-
-if (version_id >= 3) {
-slirp_bootp_load(f, slirp);
-}
-
-return 0;
+return vmstate_load_state(f, &vmstate_slirp, slirp, version_id);
 }
-- 
2.9.3
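
Note on the versioned fields above: the patch replaces the explicit
"if (version_id >= 2)" and "if (version_id >= 3)" checks with per-field
version numbers in the description. The sketch below shows that pattern in
isolation. The names (struct mini_slirp, struct versioned_field, load_fields)
are hypothetical and this is not how QEMU's vmstate_load_state() is
implemented; it only assumes big-endian 16-bit fields and that a field is
skipped when the incoming stream is older than the field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the tail of the slirp state. */
struct mini_slirp {
    uint16_t ip_id;       /* carried since stream version 2 */
    uint16_t allocated;   /* carried since stream version 3 */
};

struct versioned_field {
    size_t offset;        /* offset of a uint16_t member */
    int since_version;    /* first stream version carrying the field */
};

static const struct versioned_field mini_slirp_fields[] = {
    { offsetof(struct mini_slirp, ip_id),     2 },
    { offsetof(struct mini_slirp, allocated), 3 },
};

/*
 * Load big-endian 16-bit fields from 'in', skipping every field that is
 * newer than the stream's version_id. Returns the number of bytes consumed.
 */
static size_t load_fields(struct mini_slirp *s, const uint8_t *in,
                          size_t in_len, int version_id)
{
    size_t n = sizeof(mini_slirp_fields) / sizeof(mini_slirp_fields[0]);
    size_t pos = 0;

    for (size_t i = 0; i < n; i++) {
        uint16_t v;

        if (version_id < mini_slirp_fields[i].since_version) {
            continue;   /* field did not exist in this stream version */
        }
        assert(pos + 2 <= in_len);
        v = (uint16_t)((in[pos] << 8) | in[pos + 1]);
        memcpy((uint8_t *)s + mini_slirp_fields[i].offset, &v, sizeof(v));
        pos += 2;
    }
    return pos;
}
```

With a version 2 stream only ip_id is read; with version 3 or later both
fields are read, which matches the behaviour of the removed open-coded checks.
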




Re: [Qemu-devel] [PATCH v6 1/7] linker-loader: Add new 'write pointer' command

2017-02-15 Thread Michael S. Tsirkin
On Wed, Feb 15, 2017 at 06:43:09PM +0100, Igor Mammedov wrote:
> On Wed, 15 Feb 2017 18:39:06 +0200
> "Michael S. Tsirkin"  wrote:
> 
> > On Wed, Feb 15, 2017 at 04:56:02PM +0100, Igor Mammedov wrote:
> > > On Wed, 15 Feb 2017 17:30:00 +0200
> > > "Michael S. Tsirkin"  wrote:
> > >   
> > > > On Wed, Feb 15, 2017 at 04:22:25PM +0100, Igor Mammedov wrote:  
> > > > > On Wed, 15 Feb 2017 15:13:20 +0100
> > > > > Laszlo Ersek  wrote:
> > > > > 
> > > > > > Commenting under Igor's reply for simplicity
> > > > > > 
> > > > > > On 02/15/17 11:57, Igor Mammedov wrote:
> > > > > > > On Tue, 14 Feb 2017 22:15:43 -0800
> > > > > > > b...@skyportsystems.com wrote:
> > > > > > >   
> > > > > > >> From: Ben Warren 
> > > > > > >>
> > > > > > >> This is similar to the existing 'add pointer' functionality, but 
> > > > > > >> instead
> > > > > > >> of instructing the guest (BIOS or UEFI) to patch memory, it 
> > > > > > >> instructs
> > > > > > >> the guest to write the pointer back to QEMU via a writeable 
> > > > > > >> fw_cfg file.
> > > > > > >>
> > > > > > >> Signed-off-by: Ben Warren 
> > > > > > >> ---
> > > > > > >>  hw/acpi/bios-linker-loader.c | 58 
> > > > > > >> ++--
> > > > > > >>  include/hw/acpi/bios-linker-loader.h |  6 
> > > > > > >>  2 files changed, 61 insertions(+), 3 deletions(-)
> > > > > > >>
> > > > > > >> diff --git a/hw/acpi/bios-linker-loader.c 
> > > > > > >> b/hw/acpi/bios-linker-loader.c
> > > > > > >> index d963ebe..5030cf1 100644
> > > > > > >> --- a/hw/acpi/bios-linker-loader.c
> > > > > > >> +++ b/hw/acpi/bios-linker-loader.c
> > > > > > >> @@ -78,6 +78,19 @@ struct BiosLinkerLoaderEntry {
> > > > > > >>  uint32_t length;
> > > > > > >>  } cksum;
> > > > > > >>  
> > > > > > >> +/*
> > > > > > >> + * COMMAND_WRITE_POINTER - write the fw_cfg file 
> > > > > > >> (originating from
> > > > > > >> + * @dest_file) at @wr_pointer.offset, by adding a 
> > > > > > >> pointer to the table
> > > > > > >> + * originating from @src_file. 1,2,4 or 8 byte unsigned
> > > > > > >> + * addition is used depending on @wr_pointer.size.
> > > > > > >> + */  
> > > > > > 
> > > > > > The words "adding" and "addition" are causing confusion here.
> > > > > > 
> > > > > > In all of the previous discussion, *addition* was out of scope from
> > > > > > WRITE_POINTER. Again, the firmware is specifically not required to
> > > > > > *read* any part of the fw_cfg blob identified by "dest_file".
> > > > > > 
> > > > > > WRITE_POINTER instructs the firmware to return the allocation 
> > > > > > address of
> > > > > > the downloaded "src_file" to QEMU. Any necessary runtime 
> > > > > > subscripting
> > > > > > within "src_file" is to be handled by QEMU code dynamically.
> > > > > > 
> > > > > > For example, consider that "src_file" has *several* fields that QEMU
> > > > > > wants to massage; in that case, indexing within QEMU code with field
> > > > > > offsets is simply unavoidable.
> > > > > what I don't like here is that this indexing would be rather fragile
> > > > > and has to be done in different parts of QEMU /device, AML/.
> > > > > 
> > > > > I'd prefer this helper function to have the same @src_offset
> > > > > behavior as ADD_POINTER where patched address could point to
> > > > > any part of src_file i.e. not just beginning.
> > > > 
> > > > 
> > > > 
> > > > /*
> > > >  * COMMAND_ADD_POINTER - patch the table (originating from
> > > >  * @dest_file) at @pointer.offset, by adding a pointer to the 
> > > > table
> > > >  * originating from @src_file. 1,2,4 or 8 byte unsigned
> > > >  * addition is used depending on @pointer.size.
> > > >  */
> > > >  
> > > > so the way ADD works is
> > > > read at offset
> > > > add table address
> > > > write result at offset
> > > > 
> > > > in other words it is always beginning of table that is added.  
> > > more exactly it's, read at 
> > >   src_offset = *(dst_blob_ptr+dst_offset)
> > >   *(dst_blob+dst_offset) = src_blob_ptr + src_offset
> > >   
> > > > Would the following be acceptable?
> > > > 
> > > > 
> > > >  * COMMAND_WRITE_POINTER - update the fw_cfg file (originating 
> > > > from
> > > >  * @dest_file) at @wr_pointer.offset, by writing a pointer to 
> > > > the table
> > > >  * originating from @src_file. 1,2,4 or 8 byte unsigned value
> > > >  * is written depending on @wr_pointer.size.  
> > > it loses the 'adding' part of ADD_POINTER command which handles src_offset,
> > > however implementing adding part looks a bit complicated
> > > as patched blob (dst) is not in guest memory but in QEMU and
> > > on reset *(dst_blob+dst_offset) should be reset to src_offset.
> > > Considering dst file could be device specific memory 

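Note on the two semantics discussed above: the difference between "add" and
"write" can be sketched on a plain byte buffer. The helper names below are
hypothetical and this is not the bios-linker-loader code. For brevity the
sketch assumes a fixed 8-byte pointer stored in host byte order, whereas the
real commands take a size of 1, 2, 4 or 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * ADD_POINTER style: read the value already stored at dst_offset (a
 * pre-seeded offset into the source table), add the table's address,
 * and write the sum back.
 */
static void add_pointer(uint8_t *dst_blob, size_t dst_len,
                        size_t dst_offset, uint64_t src_addr)
{
    uint64_t v;

    assert(dst_offset + sizeof(v) <= dst_len);
    memcpy(&v, dst_blob + dst_offset, sizeof(v));
    v += src_addr;
    memcpy(dst_blob + dst_offset, &v, sizeof(v));
}

/*
 * WRITE_POINTER style as proposed in the reworded comment: nothing is
 * read from the destination, the table address is simply stored.
 */
static void write_pointer(uint8_t *dst_blob, size_t dst_len,
                          size_t dst_offset, uint64_t src_addr)
{
    assert(dst_offset + sizeof(src_addr) <= dst_len);
    memcpy(dst_blob + dst_offset, &src_addr, sizeof(src_addr));
}
```

After add_pointer() the seeded offset is folded into the stored value, so the
destination has to be re-seeded before the command can be run again. That is
the reset concern raised above for a destination that lives in QEMU rather
than in guest memory.
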
Re: [Qemu-devel] [PATCH v2] linux-user: Add sockopts for IPv6 ping and IPv6 traceroute

2017-02-15 Thread Laurent Vivier
Le 13/02/2017 à 23:01, Helge Deller a écrit :
> Add the necessary sockopts for ping and traceroute on IPv6.
> 
> This fixes the following qemu warnings with IPv6:
> Unsupported ancillary data: 0/2
> Unsupported ancillary data: 0/11
> Unsupported ancillary data: 41/25
> Unsupported setsockopt level=0 optname=12 
> Unsupported setsockopt level=41 optname=16
> Unsupported setsockopt level=41 optname=25
> Unsupported setsockopt level=41 optname=50
> Unsupported setsockopt level=41 optname=51
> Unsupported setsockopt level=41 optname=8
> Unsupported setsockopt level=58 optname=1
> 
> Tested on hppa-linux-user and x86_64-linux-user.
> 
> Changes to v1:
> - Added IPV6_PKTINFO sockopt as reported by Philippe Mathieu-Daudé

Please, add the version comment after the "---" marker
(BTW, where is it?)

> 
> Signed-off-by: Helge Deller 
> Tested-by: Philippe Mathieu-Daudé 
> Reviewed-by: Philippe Mathieu-Daudé 

("---" should be here, you should use "git send-email"...)

> 
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index 9be8e95..97c2519 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -57,6 +57,8 @@ int __clone2(int (*fn)(void *), void *child_stack_base,
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include "qemu-common.h"
>  #ifdef CONFIG_TIMERFD
>  #include 
> @@ -1839,6 +1841,78 @@ static inline abi_long host_to_target_cmsg(struct 
> target_msghdr *target_msgh,
>  }
>  break;
>  
> +case SOL_IP:
> +switch (cmsg->cmsg_type) {
> +case IP_TTL:
> +{
> +goto copy_word;

Not sure a "goto" between "SOL_*" cases is a good idea.

> +}
> +case IP_RECVERR:
> +{
> +struct errhdr_t {
> +   struct sock_extended_err ee;
> +   struct sockaddr_in offender;
> +};
> +struct errhdr_t *errh = (struct errhdr_t *)data;
> +struct errhdr_t *target_errh =
> +(struct errhdr_t *)target_data;
> +
> +__put_user(errh->ee.ee_errno, &target_errh->ee.ee_errno);
> +__put_user(errh->ee.ee_origin, &target_errh->ee.ee_origin);
> +__put_user(errh->ee.ee_type,  &target_errh->ee.ee_type);
> +__put_user(errh->ee.ee_code, &target_errh->ee.ee_code);
> +__put_user(errh->ee.ee_pad, &target_errh->ee.ee_pad);
> +__put_user(errh->ee.ee_info, &target_errh->ee.ee_info);
> +__put_user(errh->ee.ee_data, &target_errh->ee.ee_data);
> +host_to_target_sockaddr((unsigned long) 
> &target_errh->offender,
> +(void *) &errh->offender, sizeof(errh->offender));
> + break;
> +}
> +default:
> +goto unimplemented;
> +}
> +break;
> +
> +case SOL_IPV6:
> +switch (cmsg->cmsg_type) {
> +case IPV6_HOPLIMIT:
> +copy_word:
> +{
> +uint32_t *v = (uint32_t *)data;
> +uint32_t *t_int = (uint32_t *)target_data;
> +if (tgt_len != CMSG_LEN(0))
> +goto unimplemented;
> +
> +__put_user(*v, t_int);
> + break;
> +}
> +case IPV6_RECVERR:
> +{
> +struct errhdr6_t {
> +   struct sock_extended_err ee;
> +   struct sockaddr_in6 offender;
> +};
> +struct errhdr6_t *errh = (struct errhdr6_t *)data;
> +struct errhdr6_t *target_errh =
> +(struct errhdr6_t *)target_data;
> +
> +__put_user(errh->ee.ee_errno, &target_errh->ee.ee_errno);
> +__put_user(errh->ee.ee_origin, &target_errh->ee.ee_origin);
> +__put_user(errh->ee.ee_type,  &target_errh->ee.ee_type);
> +__put_user(errh->ee.ee_code, &target_errh->ee.ee_code);
> +__put_user(errh->ee.ee_pad, &target_errh->ee.ee_pad);
> +__put_user(errh->ee.ee_info, &target_errh->ee.ee_info);
> +__put_user(errh->ee.ee_data, &target_errh->ee.ee_data);
> +target_errh->offender = errh->offender;
> +__put_user(errh->offender.sin6_family, 
> &target_errh->offender.sin6_family);
> +__put_user(errh->offender.sin6_scope_id, 
> &target_errh->offender.sin6_scope_id);

Perhaps you can put this in host_to_target_sockaddr()?
[I don't know IPv6]

> + break;
> +}
> +default:
> +goto unimplemented;
> +}
> +break;
> +
>  default:
>  unimplemented:
>  gemu_log("Unsupported ancillary data: %d/%d\n",
> @@ -2766,6 +2840,7 @@ static abi_long do_setsockopt(int sockfd, int level, 
> int optname,
>  case IP_PKTINFO:
>  case IP_MTU_DISCOVER:
>  case IP_RECVERR:
> +case 

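Note on the "goto copy_word" comment above: one possible alternative to
jumping from the SOL_IP case into the SOL_IPV6 case is a small helper that
both IP_TTL and IPV6_HOPLIMIT call. The sketch below is only an illustration
of that shape. copy_cmsg_word() and swap32() are hypothetical names, the
'swap' flag stands in for whatever byte-order conversion __put_user() does
for the target, and the length check against sizeof(uint32_t) is an
assumption rather than the check used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical byte swap, used when host and target endianness differ. */
static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/*
 * Copy one 32-bit control-message payload from host to target layout.
 * Returns 0 on success and -1 when the target payload length is not
 * exactly one 32-bit word, so the caller can fall back to its
 * "unimplemented" path.
 */
static int copy_cmsg_word(void *target_data, size_t tgt_len,
                          const void *data, int swap)
{
    uint32_t v;

    assert(target_data != NULL && data != NULL);
    if (tgt_len != sizeof(v)) {
        return -1;
    }
    memcpy(&v, data, sizeof(v));
    if (swap) {
        v = swap32(v);
    }
    memcpy(target_data, &v, sizeof(v));
    return 0;
}
```

Each case would then call the helper and branch to the existing
"unimplemented" label on a non-zero return, which keeps every goto inside
its own SOL_* case.
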
Re: [Qemu-devel] [PATCH v6 1/7] linker-loader: Add new 'write pointer' command

2017-02-15 Thread Ben Warren

> On Feb 15, 2017, at 9:43 AM, Igor Mammedov  wrote:
> 
> On Wed, 15 Feb 2017 18:39:06 +0200
> "Michael S. Tsirkin"  wrote:
> 
>> On Wed, Feb 15, 2017 at 04:56:02PM +0100, Igor Mammedov wrote:
>>> On Wed, 15 Feb 2017 17:30:00 +0200
>>> "Michael S. Tsirkin"  wrote:
>>> 
>>>> On Wed, Feb 15, 2017 at 04:22:25PM +0100, Igor Mammedov wrote:
>>>>> On Wed, 15 Feb 2017 15:13:20 +0100
>>>>> Laszlo Ersek  wrote:
>>>>>
>>>>>> Commenting under Igor's reply for simplicity
>>>>>>
>>>>>> On 02/15/17 11:57, Igor Mammedov wrote:
>>>>>>> On Tue, 14 Feb 2017 22:15:43 -0800
>>>>>>> b...@skyportsystems.com wrote:
>>>>>>>
>>>>>>>> From: Ben Warren
>>>>>>>>
>>>>>>>> This is similar to the existing 'add pointer' functionality, but
>>>>>>>> instead
>>>>>>>> of instructing the guest (BIOS or UEFI) to patch memory, it instructs
>>>>>>>> the guest to write the pointer back to QEMU via a writeable fw_cfg
>>>>>>>> file.
>>>>>>>>
>>>>>>>> Signed-off-by: Ben Warren
>>>>>>>> ---
>>>>>>>> hw/acpi/bios-linker-loader.c | 58
>>>>>>>> ++--
>>>>>>>> include/hw/acpi/bios-linker-loader.h |  6
>>>>>>>> 2 files changed, 61 insertions(+), 3 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/hw/acpi/bios-linker-loader.c
>>>>>>>> b/hw/acpi/bios-linker-loader.c
>>>>>>>> index d963ebe..5030cf1 100644
>>>>>>>> --- a/hw/acpi/bios-linker-loader.c
>>>>>>>> +++ b/hw/acpi/bios-linker-loader.c
>>>>>>>> @@ -78,6 +78,19 @@ struct BiosLinkerLoaderEntry {
>>>>>>>> uint32_t length;
>>>>>>>> } cksum;
>>>>>>>>
>>>>>>>> +/*
>>>>>>>> + * COMMAND_WRITE_POINTER - write the fw_cfg file (originating
>>>>>>>> from
>>>>>>>> + * @dest_file) at @wr_pointer.offset, by adding a pointer to
>>>>>>>> the table
>>>>>>>> + * originating from @src_file. 1,2,4 or 8 byte unsigned
>>>>>>>> + * addition is used depending on @wr_pointer.size.
>>>>>>>> + */
>>>>>>
>>>>>> The words "adding" and "addition" are causing confusion here.
>>>>>>
>>>>>> In all of the previous discussion, *addition* was out of scope from
>>>>>> WRITE_POINTER. Again, the firmware is specifically not required to
>>>>>> *read* any part of the fw_cfg blob identified by "dest_file".
>>>>>>
>>>>>> WRITE_POINTER instructs the firmware to return the allocation address of
>>>>>> the downloaded "src_file" to QEMU. Any necessary runtime subscripting
>>>>>> within "src_file" is to be handled by QEMU code dynamically.
>>>>>>
>>>>>> For example, consider that "src_file" has *several* fields that QEMU
>>>>>> wants to massage; in that case, indexing within QEMU code with field
>>>>>> offsets is simply unavoidable.
>>>>> what I don't like here is that this indexing would be rather fragile
>>>>> and has to be done in different parts of QEMU /device, AML/.
>>>>>
>>>>> I'd prefer this helper function to have the same @src_offset
>>>>> behavior as ADD_POINTER where patched address could point to
>>>>> any part of src_file i.e. not just beginning.
>>>>
>>>>
>>>>
>>>> /*
>>>>  * COMMAND_ADD_POINTER - patch the table (originating from
>>>>  * @dest_file) at @pointer.offset, by adding a pointer to the table
>>>>  * originating from @src_file. 1,2,4 or 8 byte unsigned
>>>>  * addition is used depending on @pointer.size.
>>>>  */
>>>>
>>>> so the way ADD works is
>>>> read at offset
>>>> add table address
>>>> write result at offset
>>>>
>>>> in other words it is always beginning of table that is added.
>>> more exactly it's, read at 
>>>  src_offset = *(dst_blob_ptr+dst_offset)
>>>  *(dst_blob+dst_offset) = src_blob_ptr + src_offset
>>> 
>>>> Would the following be acceptable?
>>>>
>>>>
>>>>  * COMMAND_WRITE_POINTER - update the fw_cfg file (originating from
>>>>  * @dest_file) at @wr_pointer.offset, by writing a pointer to the
>>>>  table
>>>>  * originating from @src_file. 1,2,4 or 8 byte unsigned value
>>>>  * is written depending on @wr_pointer.size.
>>> it loses the 'adding' part of ADD_POINTER command which handles src_offset,
>>> however implementing adding part looks a bit complicated
>>> as patched blob (dst) is not in guest memory but in QEMU and
>>> on reset *(dst_blob+dst_offset) should be reset to src_offset.
>>> Considering dst file could be device specific memory (field/blob/whatever)
>>> it could be hard to track/notice proper reset behavior.
>>> 
>>> So now I'm not sure if src_offset is worth adding.  
>> 
>> Right. Let's just do this math in QEMU if we have to.
> Math complicates QEMU code though and not only QEMU but AML code as well.
> Considering that we are adding a new command and don't have to keep
> any sort of compatibility we can pass src_offset as part
> of command instead of hiding it inside of dst_file.
> Something like this:
> 
> 

Re: [Qemu-devel] [RFC PATCH 11/41] vvfat: Implement .bdrv_child_perm()

2017-02-15 Thread Kevin Wolf
Am 15.02.2017 um 18:30 hat Max Reitz geschrieben:
> On 13.02.2017 18:22, Kevin Wolf wrote:
> > vvfat is the last remaining driver that can have children, but doesn't
> > implement .bdrv_child_perm() yet. The default handlers aren't suitable
> > here, so let's implement a very simple driver-specific one that protects
> > the internal child from being used by other users as well as our
> > permissions permit.
> > 
> > Signed-off-by: Kevin Wolf 
> > ---
> >  block/vvfat.c | 13 +
> >  1 file changed, 13 insertions(+)
> > 
> > diff --git a/block/vvfat.c b/block/vvfat.c
> > index c6bf67e..7246432 100644
> > --- a/block/vvfat.c
> > +++ b/block/vvfat.c
> > @@ -3052,6 +3052,18 @@ err:
> >  return ret;
> >  }
> >  
> > +static void vvfat_child_perm(BlockDriverState *bs, BdrvChild *c,
> > + const BdrvChildRole *role,
> > + uint64_t perm, uint64_t shared,
> > + uint64_t *nperm, uint64_t *nshared)
> > +{
> > +assert(role == &child_vvfat_qcow);
> > +
> > +/* This is a private node, nobody should try to attach to it */
> > +*nperm = BLK_PERM_WRITE;
> > +*nshared = 0;
> 
> 0 for shared is probably enough to ward every other access off, but
> maybe we should still pro forma request consistent read access...?

Makes sense, yes.

But you missed the real bug I hid there for you:

qemu-system-x86_64: block.c:1530: bdrv_check_update_perm: Assertion
`new_shared_perm & BLK_PERM_WRITE_UNCHANGED' failed.

Kevin

> Max
> 
> > +}
> > +
> >  static void vvfat_close(BlockDriverState *bs)
> >  {
> >  BDRVVVFATState *s = bs->opaque;
> > @@ -3077,6 +3089,7 @@ static BlockDriver bdrv_vvfat = {
> >  .bdrv_file_open = vvfat_open,
> >  .bdrv_refresh_limits= vvfat_refresh_limits,
> >  .bdrv_close = vvfat_close,
> > +.bdrv_child_perm= vvfat_child_perm,
> >  
> >  .bdrv_co_preadv = vvfat_co_preadv,
> >  .bdrv_co_pwritev= vvfat_co_pwritev,
> > 
> 
> 







Re: [Qemu-devel] [RFC PATCH 10/41] block: Request child permissions in format drivers

2017-02-15 Thread Max Reitz
On 13.02.2017 18:22, Kevin Wolf wrote:
> This makes use of the .bdrv_child_perm() implementation for formats that
> we just added. All format drivers expose the permissions they actually
> need now, so that they can be set accordingly and updated when parents
> are attached or detached.
> 
> The only format not included here is raw, which was already converted
> with the other filter drivers.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/bochs.c | 1 +

In theory it might make sense to have another category "read-only format
drivers". In practice those are the drivers we don't care much about,
s...

>  block/cloop.c | 1 +
>  block/crypto.c| 1 +

I'm not sure whether the crypto driver qualifies more as a format driver
than as a filter driver. I guess saying it's a format driver is a bit
more strict and it's better to be strict when in doubt.

Max

>  block/dmg.c   | 1 +
>  block/parallels.c | 1 +
>  block/qcow.c  | 1 +
>  block/qcow2.c | 1 +
>  block/qed.c   | 1 +
>  block/vdi.c   | 1 +
>  block/vhdx.c  | 1 +
>  block/vmdk.c  | 1 +
>  block/vpc.c   | 1 +
>  12 files changed, 12 insertions(+)
> 
> diff --git a/block/bochs.c b/block/bochs.c
> index 7dd2ac4..516da56 100644
> --- a/block/bochs.c
> +++ b/block/bochs.c
> @@ -293,6 +293,7 @@ static BlockDriver bdrv_bochs = {
>  .instance_size   = sizeof(BDRVBochsState),
>  .bdrv_probe  = bochs_probe,
>  .bdrv_open   = bochs_open,
> +.bdrv_child_perm = bdrv_format_default_perms,
>  .bdrv_refresh_limits = bochs_refresh_limits,
>  .bdrv_co_preadv = bochs_co_preadv,
>  .bdrv_close  = bochs_close,
> diff --git a/block/cloop.c b/block/cloop.c
> index 877c9b0..a6c7b9d 100644
> --- a/block/cloop.c
> +++ b/block/cloop.c
> @@ -290,6 +290,7 @@ static BlockDriver bdrv_cloop = {
>  .instance_size  = sizeof(BDRVCloopState),
>  .bdrv_probe = cloop_probe,
>  .bdrv_open  = cloop_open,
> +.bdrv_child_perm = bdrv_format_default_perms,
>  .bdrv_refresh_limits = cloop_refresh_limits,
>  .bdrv_co_preadv = cloop_co_preadv,
>  .bdrv_close = cloop_close,
> diff --git a/block/crypto.c b/block/crypto.c
> index 200fd0b..ca46883 100644
> --- a/block/crypto.c
> +++ b/block/crypto.c
> @@ -628,6 +628,7 @@ BlockDriver bdrv_crypto_luks = {
>  .bdrv_probe = block_crypto_probe_luks,
>  .bdrv_open  = block_crypto_open_luks,
>  .bdrv_close = block_crypto_close,
> +.bdrv_child_perm= bdrv_format_default_perms,
>  .bdrv_create= block_crypto_create_luks,
>  .bdrv_truncate  = block_crypto_truncate,
>  .create_opts= &block_crypto_create_opts_luks,
> diff --git a/block/dmg.c b/block/dmg.c
> index 8e387cd..a7d25fc 100644
> --- a/block/dmg.c
> +++ b/block/dmg.c
> @@ -697,6 +697,7 @@ static BlockDriver bdrv_dmg = {
>  .bdrv_probe = dmg_probe,
>  .bdrv_open  = dmg_open,
>  .bdrv_refresh_limits = dmg_refresh_limits,
> +.bdrv_child_perm = bdrv_format_default_perms,
>  .bdrv_co_preadv = dmg_co_preadv,
>  .bdrv_close = dmg_close,
>  };
> diff --git a/block/parallels.c b/block/parallels.c
> index d3970e1..b79e7df 100644
> --- a/block/parallels.c
> +++ b/block/parallels.c
> @@ -762,6 +762,7 @@ static BlockDriver bdrv_parallels = {
>  .bdrv_probe  = parallels_probe,
>  .bdrv_open   = parallels_open,
>  .bdrv_close  = parallels_close,
> +.bdrv_child_perm  = bdrv_format_default_perms,
>  .bdrv_co_get_block_status = parallels_co_get_block_status,
>  .bdrv_has_zero_init   = bdrv_has_zero_init_1,
>  .bdrv_co_flush_to_os  = parallels_co_flush_to_os,
> diff --git a/block/qcow.c b/block/qcow.c
> index a6dfe1a..3a95d4f 100644
> --- a/block/qcow.c
> +++ b/block/qcow.c
> @@ -1052,6 +1052,7 @@ static BlockDriver bdrv_qcow = {
>  .bdrv_probe  = qcow_probe,
>  .bdrv_open   = qcow_open,
>  .bdrv_close  = qcow_close,
> +.bdrv_child_perm= bdrv_format_default_perms,
>  .bdrv_reopen_prepare= qcow_reopen_prepare,
>  .bdrv_create= qcow_create,
>  .bdrv_has_zero_init = bdrv_has_zero_init_1,
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 4684554..dac3fb8 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -3399,6 +3399,7 @@ BlockDriver bdrv_qcow2 = {
>  .bdrv_reopen_commit   = qcow2_reopen_commit,
>  .bdrv_reopen_abort= qcow2_reopen_abort,
>  .bdrv_join_options= qcow2_join_options,
> +.bdrv_child_perm  = bdrv_format_default_perms,
>  .bdrv_create= qcow2_create,
>  .bdrv_has_zero_init = bdrv_has_zero_init_1,
>  .bdrv_co_get_block_status = qcow2_co_get_block_status,
> diff --git a/block/qed.c b/block/qed.c
> index 1ea5114..eda9402 100644
> --- a/block/qed.c
> +++ b/block/qed.c
> @@ -1678,6 +1678,7 @@ static BlockDriver bdrv_qed = {
> 

Re: [Qemu-devel] [RFC PATCH 11/41] vvfat: Implement .bdrv_child_perm()

2017-02-15 Thread Max Reitz
On 13.02.2017 18:22, Kevin Wolf wrote:
> vvfat is the last remaining driver that can have children, but doesn't
> implement .bdrv_child_perm() yet. The default handlers aren't suitable
> here, so let's implement a very simple driver-specific one that protects
> the internal child from being used by other users as well as our
> permissions permit.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/vvfat.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/block/vvfat.c b/block/vvfat.c
> index c6bf67e..7246432 100644
> --- a/block/vvfat.c
> +++ b/block/vvfat.c
> @@ -3052,6 +3052,18 @@ err:
>  return ret;
>  }
>  
> +static void vvfat_child_perm(BlockDriverState *bs, BdrvChild *c,
> + const BdrvChildRole *role,
> + uint64_t perm, uint64_t shared,
> + uint64_t *nperm, uint64_t *nshared)
> +{
> +assert(role == &child_vvfat_qcow);
> +
> +/* This is a private node, nobody should try to attach to it */
> +*nperm = BLK_PERM_WRITE;
> +*nshared = 0;

0 for shared is probably enough to ward every other access off, but
maybe we should still pro forma request consistent read access...?

Max

> +}
> +
>  static void vvfat_close(BlockDriverState *bs)
>  {
>  BDRVVVFATState *s = bs->opaque;
> @@ -3077,6 +3089,7 @@ static BlockDriver bdrv_vvfat = {
>  .bdrv_file_open = vvfat_open,
>  .bdrv_refresh_limits= vvfat_refresh_limits,
>  .bdrv_close = vvfat_close,
> +.bdrv_child_perm= vvfat_child_perm,
>  
>  .bdrv_co_preadv = vvfat_co_preadv,
>  .bdrv_co_pwritev= vvfat_co_pwritev,
> 






[Qemu-devel] [PATCH v4 3/8] hw/mips_gic: Update pin state on mask changes

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

If the GIC interrupt mask is changed by a write to the smask (set mask)
or rmask (reset mask) registers, we need to re-evaluate the state of the
pins/IRQs fed to the CPU. Without doing so we risk leaving a pin high
despite the interrupt that led to that state being masked, or losing
interrupts if an already pending interrupt is unmasked.

Signed-off-by: Paul Burton 
Reviewed-by: Leon Alrae 
Signed-off-by: Yongbok Kim 
---
 hw/intc/mips_gic.c | 56 ++
 1 file changed, 31 insertions(+), 25 deletions(-)

diff --git a/hw/intc/mips_gic.c b/hw/intc/mips_gic.c
index 6e25773..15e6e40 100644
--- a/hw/intc/mips_gic.c
+++ b/hw/intc/mips_gic.c
@@ -20,31 +20,29 @@
 #include "kvm_mips.h"
 #include "hw/intc/mips_gic.h"
 
-static void mips_gic_set_vp_irq(MIPSGICState *gic, int vp, int pin, int level)
+static void mips_gic_set_vp_irq(MIPSGICState *gic, int vp, int pin)
 {
-int ored_level = level;
+int ored_level = 0;
 int i;
 
 /* ORing pending registers sharing same pin */
-if (!ored_level) {
-for (i = 0; i < gic->num_irq; i++) {
-if ((gic->irq_state[i].map_pin & GIC_MAP_MSK) == pin &&
-gic->irq_state[i].map_vp == vp &&
-gic->irq_state[i].enabled) {
-ored_level |= gic->irq_state[i].pending;
-}
-if (ored_level) {
-/* no need to iterate all interrupts */
-break;
-}
+for (i = 0; i < gic->num_irq; i++) {
+if ((gic->irq_state[i].map_pin & GIC_MAP_MSK) == pin &&
+gic->irq_state[i].map_vp == vp &&
+gic->irq_state[i].enabled) {
+ored_level |= gic->irq_state[i].pending;
 }
-if (((gic->vps[vp].compare_map & GIC_MAP_MSK) == pin) &&
-(gic->vps[vp].mask & GIC_VP_MASK_CMP_MSK)) {
-/* ORing with local pending register (count/compare) */
-ored_level |= (gic->vps[vp].pend & GIC_VP_MASK_CMP_MSK) >>
-  GIC_VP_MASK_CMP_SHF;
+if (ored_level) {
+/* no need to iterate all interrupts */
+break;
 }
 }
+if (((gic->vps[vp].compare_map & GIC_MAP_MSK) == pin) &&
+(gic->vps[vp].mask & GIC_VP_MASK_CMP_MSK)) {
+/* ORing with local pending register (count/compare) */
+ored_level |= (gic->vps[vp].pend & GIC_VP_MASK_CMP_MSK) >>
+  GIC_VP_MASK_CMP_SHF;
+}
 if (kvm_enabled())  {
 kvm_mips_set_ipi_interrupt(mips_env_get_cpu(gic->vps[vp].env),
pin + GIC_CPU_PIN_OFFSET,
@@ -55,21 +53,27 @@ static void mips_gic_set_vp_irq(MIPSGICState *gic, int vp, 
int pin, int level)
 }
 }
 
-static void gic_set_irq(void *opaque, int n_IRQ, int level)
+static void gic_update_pin_for_irq(MIPSGICState *gic, int n_IRQ)
 {
-MIPSGICState *gic = (MIPSGICState *) opaque;
 int vp = gic->irq_state[n_IRQ].map_vp;
 int pin = gic->irq_state[n_IRQ].map_pin & GIC_MAP_MSK;
 
+if (vp < 0 || vp >= gic->num_vps) {
+return;
+}
+mips_gic_set_vp_irq(gic, vp, pin);
+}
+
+static void gic_set_irq(void *opaque, int n_IRQ, int level)
+{
+MIPSGICState *gic = (MIPSGICState *) opaque;
+
 gic->irq_state[n_IRQ].pending = (uint8_t) level;
 if (!gic->irq_state[n_IRQ].enabled) {
 /* GIC interrupt source disabled */
 return;
 }
-if (vp < 0 || vp >= gic->num_vps) {
-return;
-}
-mips_gic_set_vp_irq(gic, vp, pin, level);
+gic_update_pin_for_irq(gic, n_IRQ);
 }
 
 #define OFFSET_CHECK(c) \
@@ -209,7 +213,7 @@ static void gic_timer_store_vp_compare(MIPSGICState *gic, 
uint32_t vp_index,
 gic->vps[vp_index].pend &= ~(1 << GIC_LOCAL_INT_COMPARE);
 if (gic->vps[vp_index].compare_map & GIC_MAP_TO_PIN_MSK) {
 uint32_t pin = (gic->vps[vp_index].compare_map & GIC_MAP_MSK);
-mips_gic_set_vp_irq(gic, vp_index, pin, 0);
+mips_gic_set_vp_irq(gic, vp_index, pin);
 }
 mips_gictimer_store_vp_compare(gic->gic_timer, vp_index, compare);
 }
@@ -286,6 +290,7 @@ static void gic_write(void *opaque, hwaddr addr, uint64_t 
data, unsigned size)
 OFFSET_CHECK((base + size * 8) <= gic->num_irq);
 for (i = 0; i < size * 8; i++) {
 gic->irq_state[base + i].enabled &= !((data >> i) & 1);
+gic_update_pin_for_irq(gic, base + i);
 }
 break;
 case GIC_SH_WEDGE_OFS:
@@ -305,6 +310,7 @@ static void gic_write(void *opaque, hwaddr addr, uint64_t 
data, unsigned size)
 OFFSET_CHECK((base + size * 8) <= gic->num_irq);
 for (i = 0; i < size * 8; i++) {
 gic->irq_state[base + i].enabled |= (data >> i) & 1;
+gic_update_pin_for_irq(gic, base + i);
 }
 break;
 case GIC_SH_MAP0_PIN_OFS 

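Note on the patch above: the key change is that the pin level is no longer
taken from the caller but recomputed from the stored state whenever either
the pending bit or the mask changes. A stripped-down sketch of that pattern
follows. struct mini_gic and the helpers are hypothetical, and the sketch
leaves out VPs, the local count/compare interrupt and the KVM path that the
real mips_gic.c handles.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_IRQ 4
#define NUM_PIN 8

struct irq_state {
    uint8_t enabled;
    uint8_t pending;
    int map_pin;
};

struct mini_gic {
    struct irq_state irq[NUM_IRQ];
    int pin_level[NUM_PIN];
};

/*
 * Recompute a pin's level as the OR of the pending bits of all enabled
 * interrupt sources mapped to that pin.
 */
static void update_pin(struct mini_gic *gic, int pin)
{
    int level = 0;

    assert(pin >= 0 && pin < NUM_PIN);
    for (int i = 0; i < NUM_IRQ; i++) {
        if (gic->irq[i].map_pin == pin && gic->irq[i].enabled) {
            level |= gic->irq[i].pending;
        }
    }
    gic->pin_level[pin] = level;
}

/* Mask register write: change the enable bit, then re-evaluate the pin. */
static void set_enabled(struct mini_gic *gic, int irq, int enabled)
{
    assert(irq >= 0 && irq < NUM_IRQ);
    gic->irq[irq].enabled = (uint8_t)(enabled != 0);
    update_pin(gic, gic->irq[irq].map_pin);
}
```

Unmasking an interrupt that is already pending raises the pin, and masking it
again lowers the pin, which are the two cases the commit message describes.
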
[Qemu-devel] [PATCH v4 5/8] dtc: Update requirement to v1.4.2

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

In order to obtain fdt_first_subnode & fdt_next_subnode symbols from
libfdt for use by a later patch, bump the requirement for dtc to v1.4.2
& the submodule to that same version.

Signed-off-by: Paul Burton 
Reviewed-by: Yongbok Kim 
Signed-off-by: Yongbok Kim 
---
 configure | 6 +++---
 dtc   | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index 1c9655e..4b68861 100755
--- a/configure
+++ b/configure
@@ -3396,11 +3396,11 @@ fi
 if test "$fdt" != "no" ; then
   fdt_libs="-lfdt"
   # explicitly check for libfdt_env.h as it is missing in some stable installs
-  # and test for required functions to make sure we are on a version >= 1.4.0
+  # and test for required functions to make sure we are on a version >= 1.4.2
   cat > $TMPC << EOF
 #include 
 #include 
-int main(void) { fdt_get_property_by_offset(0, 0, 0); return 0; }
+int main(void) { fdt_first_subnode(0, 0); return 0; }
 EOF
   if compile_prog "" "$fdt_libs" ; then
 # system DTC is good - use it
@@ -3418,7 +3418,7 @@ EOF
 fdt_libs="-L\$(BUILD_DIR)/dtc/libfdt $fdt_libs"
   elif test "$fdt" = "yes" ; then
 # have neither and want - prompt for system/submodule install
-error_exit "DTC (libfdt) version >= 1.4.0 not present. Your options:" \
+error_exit "DTC (libfdt) version >= 1.4.2 not present. Your options:" \
 "  (1) Preferred: Install the DTC (libfdt) devel package" \
 "  (2) Fetch the DTC submodule, using:" \
 "  git submodule update --init dtc"
diff --git a/dtc b/dtc
index 65cc4d2..ec02b34 160000
--- a/dtc
+++ b/dtc
@@ -1 +1 @@
-Subproject commit 65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf
+Subproject commit ec02b34c05be04f249ffaaca4b666f5246877dea
-- 
2.7.4




[Qemu-devel] [PATCH v4 8/8] hw/mips: MIPS Boston board support

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

Introduce support for emulating the MIPS Boston development board. The
Boston board is built around an FPGA & 3 PCIe controllers, one of which
is connected to an Intel EG20T Platform Controller Hub. It is used
during the development & debug of new CPUs and the software intended to
run on them, and is essentially the successor to the older MIPS Malta
board.

This patch does not implement the EG20T, instead connecting an already
supported ICH-9 AHCI controller. Whilst this isn't accurate it's enough
for typical stock Boston software (eg. Linux kernels) to work with hard
disks given that both the ICH-9 & EG20T implement the AHCI
specification.

Boston boards typically boot kernels in the FIT image format, and this
patch will treat kernels provided to QEMU as such. When loading a kernel
directly, the board code will generate minimal firmware much as the
Malta board code does. This firmware will set up the CM, CPC & GIC
register base addresses then set argument registers & jump to the kernel
entry point. Alternatively, bootloader code may be loaded using the bios
argument in which case no firmware will be generated & execution will
proceed from the start of the boot code at the default MIPS boot
exception vector (offset 0x1fc00000 into (c)kseg1).

Currently real Boston boards are always used with FPGA bitfiles that
include a Global Interrupt Controller (GIC), so the interrupt
configuration is only defined for such cases. Therefore the board will
only allow use of CPUs which implement the CPS components, including the
GIC, and will otherwise exit with a message.

Signed-off-by: Paul Burton 
Reviewed-by: Yongbok Kim 
[yongbok@imgtec.com:
  isolated boston machine support for mips64el.
  updated for recent Chardev changes.
  ignore missing bios/kernel for qtest.]
Signed-off-by: Yongbok Kim 
---
 configure|   2 +-
 default-configs/mips64el-softmmu.mak |   2 +
 hw/mips/Makefile.objs|   1 +
 hw/mips/boston.c | 576 +++
 4 files changed, 580 insertions(+), 1 deletion(-)
 create mode 100644 hw/mips/boston.c

diff --git a/configure b/configure
index 4b68861..8e8f18d 100755
--- a/configure
+++ b/configure
@@ -3378,7 +3378,7 @@ fi
 fdt_required=no
 for target in $target_list; do
   case $target in
-aarch64*-softmmu|arm*-softmmu|ppc*-softmmu|microblaze*-softmmu)
+aarch64*-softmmu|arm*-softmmu|ppc*-softmmu|microblaze*-softmmu|mips64el-softmmu)
   fdt_required=yes
 ;;
   esac
diff --git a/default-configs/mips64el-softmmu.mak b/default-configs/mips64el-softmmu.mak
index 485e218..cc5f3b3 100644
--- a/default-configs/mips64el-softmmu.mak
+++ b/default-configs/mips64el-softmmu.mak
@@ -10,3 +10,5 @@ CONFIG_JAZZ=y
 CONFIG_G364FB=y
 CONFIG_JAZZ_LED=y
 CONFIG_VT82C686=y
+CONFIG_MIPS_BOSTON=y
+CONFIG_PCI_XILINX=y
diff --git a/hw/mips/Makefile.objs b/hw/mips/Makefile.objs
index 9352a1c..48cd2ef 100644
--- a/hw/mips/Makefile.objs
+++ b/hw/mips/Makefile.objs
@@ -4,3 +4,4 @@ obj-$(CONFIG_JAZZ) += mips_jazz.o
 obj-$(CONFIG_FULONG) += mips_fulong2e.o
 obj-y += gt64xxx_pci.o
 obj-$(CONFIG_MIPS_CPS) += cps.o
+obj-$(CONFIG_MIPS_BOSTON) += boston.o
diff --git a/hw/mips/boston.c b/hw/mips/boston.c
new file mode 100644
index 0000000..560c8b4
--- /dev/null
+++ b/hw/mips/boston.c
@@ -0,0 +1,576 @@
+/*
+ * MIPS Boston development board emulation.
+ *
+ * Copyright (c) 2016 Imagination Technologies
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+
+#include "exec/address-spaces.h"
+#include "hw/boards.h"
+#include "hw/char/serial.h"
+#include "hw/hw.h"
+#include "hw/ide/pci.h"
+#include "hw/ide/ahci.h"
+#include "hw/loader.h"
+#include "hw/loader-fit.h"
+#include "hw/mips/cps.h"
+#include "hw/mips/cpudevs.h"
+#include "hw/pci-host/xilinx-pcie.h"
+#include "qapi/error.h"
+#include "qemu/cutils.h"
+#include "qemu/error-report.h"
+#include "qemu/log.h"
+#include "sysemu/char.h"
+#include "sysemu/device_tree.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/qtest.h"
+
+#include <libfdt.h>
+
+#define TYPE_MIPS_BOSTON "mips-boston"
+#define BOSTON(obj) OBJECT_CHECK(BostonState, (obj), TYPE_MIPS_BOSTON)
+
+typedef struct {
+SysBusDevice parent_obj;
+
+

[Qemu-devel] [PATCH v4 1/8] hw/mips_cmgcr: allow GCR base to be moved

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

Support moving the GCR base address & updating the CPU's CP0 CMGCRBase
register appropriately. This is required if a platform needs to move its
GCRs away from other memory, as the MIPS Boston development board does
to avoid its flash memory.

Signed-off-by: Paul Burton 
Reviewed-by: Leon Alrae 
Signed-off-by: Yongbok Kim 
---
 hw/misc/mips_cmgcr.c | 17 +
 include/hw/misc/mips_cmgcr.h |  3 +++
 2 files changed, 20 insertions(+)

diff --git a/hw/misc/mips_cmgcr.c b/hw/misc/mips_cmgcr.c
index b3ba166..a1edb53 100644
--- a/hw/misc/mips_cmgcr.c
+++ b/hw/misc/mips_cmgcr.c
@@ -29,6 +29,20 @@ static inline bool is_gic_connected(MIPSGCRState *s)
 return s->gic_mr != NULL;
 }
 
+static inline void update_gcr_base(MIPSGCRState *gcr, uint64_t val)
+{
+CPUState *cpu;
+MIPSCPU *mips_cpu;
+
+gcr->gcr_base = val & GCR_BASE_GCRBASE_MSK;
+memory_region_set_address(&gcr->iomem, gcr->gcr_base);
+
+CPU_FOREACH(cpu) {
+mips_cpu = MIPS_CPU(cpu);
+mips_cpu->env.CP0_CMGCRBase = gcr->gcr_base >> 4;
+}
+}
+
 static inline void update_cpc_base(MIPSGCRState *gcr, uint64_t val)
 {
 if (is_cpc_connected(gcr)) {
@@ -117,6 +131,9 @@ static void gcr_write(void *opaque, hwaddr addr, uint64_t data, unsigned size)
 MIPSGCRVPState *other_vps = &gcr->vps[current_vps->other];
 
 switch (addr) {
+case GCR_BASE_OFS:
+update_gcr_base(gcr, data);
+break;
 case GCR_GIC_BASE_OFS:
 update_gic_base(gcr, data);
 break;
diff --git a/include/hw/misc/mips_cmgcr.h b/include/hw/misc/mips_cmgcr.h
index a209d91..c9dfcb4 100644
--- a/include/hw/misc/mips_cmgcr.h
+++ b/include/hw/misc/mips_cmgcr.h
@@ -41,6 +41,9 @@
 #define GCR_L2_CONFIG_BYPASS_SHF 20
 #define GCR_L2_CONFIG_BYPASS_MSK ((0x1ULL) << GCR_L2_CONFIG_BYPASS_SHF)
 
+/* GCR_BASE register fields */
+#define GCR_BASE_GCRBASE_MSK 0x8000ULL
+
 /* GCR_GIC_BASE register fields */
 #define GCR_GIC_BASE_GICEN_MSK   1
 #define GCR_GIC_BASE_GICBASE_MSK 0xFFFEULL
-- 
2.7.4
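
For readers following the arithmetic in update_gcr_base(), here is a minimal standalone sketch of the two steps: mask the written value to obtain the new GCR base, then shift it right by 4 to get the value stored in CP0 CMGCRBase. The ex_ names and EXAMPLE_GCRBASE_MSK are assumptions made for illustration only; the real mask is whatever GCR_BASE_GCRBASE_MSK is defined as in the patch.

```c
#include <stdint.h>

/* Hypothetical mask: keep bit 15 and above, i.e. a 32 KiB aligned base.
 * This stands in for GCR_BASE_GCRBASE_MSK and is NOT the real value. */
#define EXAMPLE_GCRBASE_MSK (~0x7fffULL)

/* Step 1: the value written to the GCR_BASE register is masked so that
 * only the base address field survives. */
static uint64_t ex_gcr_base(uint64_t written)
{
    return written & EXAMPLE_GCRBASE_MSK;
}

/* Step 2: each CPU's CP0 CMGCRBase holds the base shifted right by 4,
 * as in the CPU_FOREACH loop of the patch. */
static uint64_t ex_cmgcrbase(uint64_t gcr_base)
{
    return gcr_base >> 4;
}
```

With these assumptions, a write of 0x1fbf8123 yields a base of 0x1fbf8000 and a CMGCRBase value of 0x01fbf800.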




[Qemu-devel] [PATCH v4 4/8] target-mips: Provide function to test if a CPU supports an ISA

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

Provide a new cpu_supports_isa function which allows callers to
determine whether a CPU supports one of the ISA_ flags, by testing
whether the associated struct mips_def_t sets the ISA flags in its
insn_flags field.

An example use of this is to allow boards which generate bootloader code
to determine the properties of the CPU that will be used, for example
whether the CPU is 64 bit or which architecture revision it implements.

Signed-off-by: Paul Burton 
Reviewed-by: Leon Alrae 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Yongbok Kim 
---
 target/mips/cpu.h   |  1 +
 target/mips/translate.c | 10 ++
 2 files changed, 11 insertions(+)

diff --git a/target/mips/cpu.h b/target/mips/cpu.h
index e1c78f5..4a4747a 100644
--- a/target/mips/cpu.h
+++ b/target/mips/cpu.h
@@ -815,6 +815,7 @@ int cpu_mips_signal_handler(int host_signum, void *pinfo, void *puc);
 
 #define cpu_init(cpu_model) CPU(cpu_mips_init(cpu_model))
 bool cpu_supports_cps_smp(const char *cpu_model);
+bool cpu_supports_isa(const char *cpu_model, unsigned int isa);
 void cpu_set_exception_base(int vp_index, target_ulong address);
 
 /* TODO QOM'ify CPU reset and remove */
diff --git a/target/mips/translate.c b/target/mips/translate.c
index 7f8ecf4..8b4a072 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -20233,6 +20233,16 @@ bool cpu_supports_cps_smp(const char *cpu_model)
 return (def->CP0_Config3 & (1 << CP0C3_CMGCR)) != 0;
 }
 
+bool cpu_supports_isa(const char *cpu_model, unsigned int isa)
+{
+const mips_def_t *def = cpu_mips_find_by_name(cpu_model);
+if (!def) {
+return false;
+}
+
+return (def->insn_flags & isa) != 0;
+}
+
 void cpu_set_exception_base(int vp_index, target_ulong address)
 {
 MIPSCPU *vp = MIPS_CPU(qemu_get_cpu(vp_index));
-- 
2.7.4
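
The lookup-then-test pattern of cpu_supports_isa() can be tried outside QEMU with a small stand-in table. Everything prefixed ex_ or EX_ below is hypothetical (the flag bits, the CPU names and the table are made up for this sketch); only the shape of the check follows the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative ISA flag bits; not the real QEMU ISA_ definitions. */
#define EX_ISA_MIPS32 (1u << 0)
#define EX_ISA_MIPS64 (1u << 1)

/* Stand-in for mips_def_t: just a name and its instruction-set flags. */
struct ex_cpu_def {
    const char *name;
    unsigned int insn_flags;
};

static const struct ex_cpu_def ex_cpu_defs[] = {
    { "ex-32bit", EX_ISA_MIPS32 },
    { "ex-64bit", EX_ISA_MIPS32 | EX_ISA_MIPS64 },
};

/* Stand-in for cpu_mips_find_by_name(): linear search, NULL if unknown. */
static const struct ex_cpu_def *ex_find_by_name(const char *name)
{
    for (size_t i = 0; i < sizeof(ex_cpu_defs) / sizeof(ex_cpu_defs[0]); i++) {
        if (strcmp(ex_cpu_defs[i].name, name) == 0) {
            return &ex_cpu_defs[i];
        }
    }
    return NULL;
}

/* Same logic as the patch: unknown model means "not supported",
 * otherwise test whether any of the requested flag bits is set. */
static bool ex_cpu_supports_isa(const char *name, unsigned int isa)
{
    const struct ex_cpu_def *def = ex_find_by_name(name);

    if (!def) {
        return false;
    }
    return (def->insn_flags & isa) != 0;
}
```

A board that generates bootloader code could call such a helper with the 64-bit flag to decide which instruction encodings to emit.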




Re: [Qemu-devel] [PATCH v6 1/7] linker-loader: Add new 'write pointer' command

2017-02-15 Thread Laszlo Ersek
On 02/15/17 17:39, Michael S. Tsirkin wrote:
> On Wed, Feb 15, 2017 at 04:56:02PM +0100, Igor Mammedov wrote:
>> On Wed, 15 Feb 2017 17:30:00 +0200
>> "Michael S. Tsirkin"  wrote:
>>
>>> On Wed, Feb 15, 2017 at 04:22:25PM +0100, Igor Mammedov wrote:
>>>> On Wed, 15 Feb 2017 15:13:20 +0100
>>>> Laszlo Ersek  wrote:
>>>>
>>>>> Commenting under Igor's reply for simplicity
>>>>>
>>>>> On 02/15/17 11:57, Igor Mammedov wrote:
>>>>>> On Tue, 14 Feb 2017 22:15:43 -0800
>>>>>> b...@skyportsystems.com wrote:
>>>>>>
>>>>>>> From: Ben Warren 
>>>>>>>
>>>>>>> This is similar to the existing 'add pointer' functionality, but instead
>>>>>>> of instructing the guest (BIOS or UEFI) to patch memory, it instructs
>>>>>>> the guest to write the pointer back to QEMU via a writeable fw_cfg file.
>>>>>>>
>>>>>>> Signed-off-by: Ben Warren 
>>>>>>> ---
>>>>>>>  hw/acpi/bios-linker-loader.c | 58 ++--
>>>>>>>  include/hw/acpi/bios-linker-loader.h |  6 
>>>>>>>  2 files changed, 61 insertions(+), 3 deletions(-)
>>>>>>>
>>>>>>> diff --git a/hw/acpi/bios-linker-loader.c b/hw/acpi/bios-linker-loader.c
>>>>>>> index d963ebe..5030cf1 100644
>>>>>>> --- a/hw/acpi/bios-linker-loader.c
>>>>>>> +++ b/hw/acpi/bios-linker-loader.c
>>>>>>> @@ -78,6 +78,19 @@ struct BiosLinkerLoaderEntry {
>>>>>>>  uint32_t length;
>>>>>>>  } cksum;
>>>>>>>  
>>>>>>> +/*
>>>>>>> + * COMMAND_WRITE_POINTER - write the fw_cfg file (originating from
>>>>>>> + * @dest_file) at @wr_pointer.offset, by adding a pointer to the table
>>>>>>> + * originating from @src_file. 1,2,4 or 8 byte unsigned
>>>>>>> + * addition is used depending on @wr_pointer.size.
>>>>>>> + */
>>>>>
>>>>> The words "adding" and "addition" are causing confusion here.
>>>>>
>>>>> In all of the previous discussion, *addition* was out of scope from
>>>>> WRITE_POINTER. Again, the firmware is specifically not required to
>>>>> *read* any part of the fw_cfg blob identified by "dest_file".
>>>>>
>>>>> WRITE_POINTER instructs the firmware to return the allocation address of
>>>>> the downloaded "src_file" to QEMU. Any necessary runtime subscripting
>>>>> within "src_file" is to be handled by QEMU code dynamically.
>>>>>
>>>>> For example, consider that "src_file" has *several* fields that QEMU
>>>>> wants to massage; in that case, indexing within QEMU code with field
>>>>> offsets is simply unavoidable.
>>>> what I don't like here is that this indexing would be rather fragile
>>>> and has to be done in different parts of QEMU /device, AML/.
>>>>
>>>> I'd prefer this helper function to have the same @src_offset
>>>> behavior as ADD_POINTER where patched address could point to
>>>> any part of src_file i.e. not just beginning.
>>>
>>>
>>>
>>> /*
>>>  * COMMAND_ADD_POINTER - patch the table (originating from
>>>  * @dest_file) at @pointer.offset, by adding a pointer to the table
>>>  * originating from @src_file. 1,2,4 or 8 byte unsigned
>>>  * addition is used depending on @pointer.size.
>>>  */
>>>  
>>> so the way ADD works is
>>> read at offset
>>> add table address
>>> write result at offset
>>>
>>> in other words it is always beginning of table that is added.
>> more exactly it's, read at 
>>   src_offset = *(dst_blob_ptr+dst_offset)
>>   *(dst_blob+dst_offset) = src_blob_ptr + src_offset
>>
>>> Would the following be acceptable?
>>>
>>>
>>>  * COMMAND_WRITE_POINTER - update the fw_cfg file (originating from
>>>  * @dest_file) at @wr_pointer.offset, by writing a pointer to the table
>>>  * originating from @src_file. 1,2,4 or 8 byte unsigned value
>>>  * is written depending on @wr_pointer.size.
>> it loses 'adding' part of ADD_POINTER command which handles src_offset,
>> however implementing adding part looks a bit complicated
>> as patched blob (dst) is not in guest memory but in QEMU and
>> on reset *(dst_blob+dst_offset) should be reset to src_offset.
>> Considering dst file could be device specific memory (field/blob/whatever)
>> it could be hard to track/notice proper reset behavior.
>>
>> So now I'm not sure if src_offset is worth adding.
> 
> Right. Let's just do this math in QEMU if we have to.

Deal. :)

Thanks
Laszlo
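
To make the difference discussed in this thread concrete, here is a small sketch of the two operations on a plain byte buffer. It is only an illustration under stated assumptions: the ex_ helpers are hypothetical, fields are taken to be little-endian, and the destination is modelled as ordinary memory rather than a real fw_cfg file. ADD_POINTER reads the field, adds the table address and writes the sum back; WRITE_POINTER, as agreed above, only writes the address and leaves any offset math to QEMU.

```c
#include <stdint.h>

/* Read a little-endian unsigned value of 1, 2, 4 or 8 bytes. */
static uint64_t ex_read_le(const uint8_t *p, unsigned size)
{
    uint64_t v = 0;

    for (unsigned i = 0; i < size; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/* Write a little-endian unsigned value of 1, 2, 4 or 8 bytes. */
static void ex_write_le(uint8_t *p, unsigned size, uint64_t v)
{
    for (unsigned i = 0; i < size; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* ADD_POINTER: the field initially holds an offset into the source table;
 * the guest adds the table's allocation address to it in place. */
static void ex_add_pointer(uint8_t *dst_blob, uint32_t dst_offset,
                           unsigned size, uint64_t src_addr)
{
    uint64_t src_offset = ex_read_le(dst_blob + dst_offset, size);

    ex_write_le(dst_blob + dst_offset, size, src_addr + src_offset);
}

/* WRITE_POINTER: no read and no addition; the allocation address of the
 * source table is written as is. */
static void ex_write_pointer(uint8_t *dst_file, uint32_t dst_offset,
                             unsigned size, uint64_t src_addr)
{
    ex_write_le(dst_file + dst_offset, size, src_addr);
}
```

Because ex_write_pointer() never reads the destination, nothing has to be restored in it on reset, which is the concern Igor raises about carrying src_offset in the field.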




Re: [Qemu-devel] [PATCH v3] backup: allow target without .bdrv_get_info

2017-02-15 Thread Jeff Cody
On Wed, Feb 15, 2017 at 05:58:13PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Currently backup to nbd target is broken, as nbd doesn't have
> .bdrv_get_info realization.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
> 
> v3: fix compilation (I feel like an idiot)
> adjust wording (Fam)
> 
> v2: add WARNING
> 
> ===
> 
> Since commit
> 
> commit 4c9bca7e39a6e07ad02c1dcde3478363344ec60b
> Author: John Snow 
> Date:   Thu Feb 25 15:58:30 2016 -0500
> 
> block/backup: avoid copying less than full target clusters
> 
> backup to nbd target is broken, we have "Couldn't determine the cluster size
> of the target image".
> 
> The proposed NBD protocol extension, NBD_OPT_INFO, should finally solve this
> problem. But until it is implemented, we need to allow backup to an nbd
> target for backward compatibility.
> 
> Furthermore, is it entirely ok to disallow backup if bds lacks .bdrv_get_info?
> Which behavior should be the default: to fail the backup or to use the
> default cluster size?
> 
>  block/backup.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index ea38733..d800a24 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -638,7 +638,16 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>   * backup cluster size is smaller than the target cluster size. Even for
>   * targets with a backing file, try to avoid COW if possible. */
>  ret = bdrv_get_info(target, &bdi);
> -if (ret < 0 && !target->backing) {
> +if (ret == -ENOTSUP) {
> +/* Cluster size is not defined */
> +fprintf(stderr,
> +"WARNING: Target block device doesn't provide information "
> +"about block size and it doesn't have backing file. Default "
> +"block size of %u bytes is used. If actual block size of "
> +"target exceeds this default, backup may be unusable",
> +BACKUP_CLUSTER_SIZE_DEFAULT);

You should use error_report, like in your v2 patch.  Just make sure to
include qemu/error-report.h.

> +job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT;
> +} else if (ret < 0 && !target->backing) {
>  error_setg_errno(errp, -ret,
>  "Couldn't determine the cluster size of the target image, "
>  "which has no backing file");
> -- 
> 1.8.3.1
> 



Re: [Qemu-devel] [RFC PATCH 09/41] block: Default .bdrv_child_perm() for format drivers

2017-02-15 Thread Max Reitz
On 13.02.2017 18:22, Kevin Wolf wrote:
> Almost all format drivers have the same characteristics as far as
> permissions are concerned: They have one or more children for storing
> their own data and, more importantly, metadata (can be written to and
> grow even without external write requests, must be protected against
> other writers and present consistent data) and optionally a backing file
> (this is just data, so like for a filter, it only depends on what the
> parent nodes need).
> 
> This provides a default implementation that can be shared by most of
> our format drivers.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block.c   | 37 +
>  include/block/block_int.h |  8 
>  2 files changed, 45 insertions(+)
> 
> diff --git a/block.c b/block.c
> index 290768d..8e99bb5 100644
> --- a/block.c
> +++ b/block.c
> @@ -1459,6 +1459,43 @@ void bdrv_filter_default_perms(BlockDriverState *bs, BdrvChild *c,
> (c->shared_perm & DEFAULT_PERM_UNCHANGED);
>  }
>  
> +void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c,
> +   const BdrvChildRole *role,
> +   uint64_t perm, uint64_t shared,
> +   uint64_t *nperm, uint64_t *nshared)
> +{
> +bool backing = (role == &child_backing);
> +assert(role == &child_backing || role == &child_file);
> +
> +if (!backing) {
> +/* Apart from the modifications below, the same permissions are
> + * forwarded and left alone as for filters */
> +bdrv_filter_default_perms(bs, c, role, perm, shared, &perm, &shared);
> +
> +/* Format drivers may touch metadata even if the guest doesn't write */
> +if (!bdrv_is_read_only(bs)) {
> +perm |= BLK_PERM_WRITE | BLK_PERM_RESIZE;
> +}
> +
> +/* bs->file always needs to be consistent because of the metadata. We
> + * can never allow other users to resize or write to it. */
> +perm |= BLK_PERM_CONSISTENT_READ;
> +shared &= ~(BLK_PERM_WRITE | BLK_PERM_RESIZE);
> +} else {
> +/* We want consistent read from backing files if the parent needs it.
> + * No other operations are performed on backing files. */
> +perm &= BLK_PERM_CONSISTENT_READ;
> +
> +/* If the parent can deal with changing data, we're okay with a
> + * writable backing file. */

Are we OK with a resizable backing file, too? I'm not sure, actually.
Maybe we should just forbid it and hope nobody asks for it.

Max

> +shared &= BLK_PERM_WRITE;
> +shared |= BLK_PERM_CONSISTENT_READ | BLK_PERM_GRAPH_MOD |
> +  BLK_PERM_WRITE_UNCHANGED;
> +}
> +
> +*nperm = perm;
> +*nshared = shared;
> +}
>  
>  static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs)
>  {
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 2d74f92..46f51a6 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -885,6 +885,14 @@ void bdrv_filter_default_perms(BlockDriverState *bs, BdrvChild *c,
> uint64_t perm, uint64_t shared,
> uint64_t *nperm, uint64_t *nshared);
>  
> +/* Default implementation for BlockDriver.bdrv_child_perm() that can be used by
> + * (non-raw) image formats: Like above for bs->backing, but for bs->file it
> + * requires WRITE | RESIZE for read-write images, always requires
> + * CONSISTENT_READ and doesn't share WRITE. */
> +void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c,
> +   const BdrvChildRole *role,
> +   uint64_t perm, uint64_t shared,
> +   uint64_t *nperm, uint64_t *nshared);
>  
>  const char *bdrv_get_parent_name(const BlockDriverState *bs);
>  void blk_dev_change_media_cb(BlockBackend *blk, bool load);
> 
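
The bit manipulation in the quoted function is easier to check in isolation. The sketch below models only the adjustments shown in the patch, with made-up flag values (the EX_PERM_ constants are assumptions of this example) and without the preceding filter-defaults step for bs->file. One thing it makes visible: because the backing branch does `shared &= WRITE` before OR-ing in the other bits, RESIZE is not shared for a backing file as the code is written.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative permission bits; the values are assumptions of this sketch. */
enum {
    EX_PERM_CONSISTENT_READ = 0x01,
    EX_PERM_WRITE           = 0x02,
    EX_PERM_WRITE_UNCHANGED = 0x04,
    EX_PERM_RESIZE          = 0x08,
    EX_PERM_GRAPH_MOD       = 0x10,
};

/* Backing-file branch: only consistent reads are requested, and writes by
 * others are tolerated only if the parent tolerates changing data. */
static void ex_backing_perms(uint64_t perm, uint64_t shared,
                             uint64_t *nperm, uint64_t *nshared)
{
    perm &= EX_PERM_CONSISTENT_READ;

    shared &= EX_PERM_WRITE;
    shared |= EX_PERM_CONSISTENT_READ | EX_PERM_GRAPH_MOD |
              EX_PERM_WRITE_UNCHANGED;

    *nperm = perm;
    *nshared = shared;
}

/* bs->file branch, applied to whatever the filter defaults produced (that
 * first step is not modelled here): metadata may be written and the file
 * may grow, and nobody else may write to or resize it. */
static void ex_file_perms(bool read_only, uint64_t perm, uint64_t shared,
                          uint64_t *nperm, uint64_t *nshared)
{
    if (!read_only) {
        perm |= EX_PERM_WRITE | EX_PERM_RESIZE;
    }
    perm |= EX_PERM_CONSISTENT_READ;
    shared &= ~(uint64_t)(EX_PERM_WRITE | EX_PERM_RESIZE);

    *nperm = perm;
    *nshared = shared;
}
```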






Re: [Qemu-devel] [PATCH v6 4/7] ACPI: Add Virtual Machine Generation ID support

2017-02-15 Thread Ben Warren

> On Feb 15, 2017, at 8:40 AM, Michael S. Tsirkin  wrote:
> 
> On Wed, Feb 15, 2017 at 05:07:57PM +0100, Igor Mammedov wrote:
>>> Those improvements can be added later, IMO -- but please do work out
>>> with Igor whether he really wants a v7 for those.
>> since it's minor fixes not influencing other patches within the series
>> there is no need to repost the whole series,
>> just this fixed-up patch as a reply to this thread tagged as v7
>> or a patch on top, I'm fine with either way.
> 
> OK, I'll merge v7, send more cleanups/fixes as patches on top pls.
> 
> -- 
> MST
Thanks everybody.  The requested changes are pretty minor so I should be able 
to turn v7 around today.

—Ben



Re: [Qemu-devel] [PATCH 17/18] nbd: BLOCK_STATUS for standard get_block_status function: server part

2017-02-15 Thread Paolo Bonzini


On 09/02/2017 16:38, Eric Blake wrote:
>> +static int blockstatus_to_extent(BlockDriverState *bs, uint64_t offset,
>> +  uint64_t length, NBDExtent *extent)
>> +{
>> +BlockDriverState *file;
>> +uint64_t start_sector = offset >> BDRV_SECTOR_BITS;
>> +uint64_t last_sector = (offset + length - 1) >> BDRV_SECTOR_BITS;
> Converting from bytes to sectors by rounding...
> 
>> +uint64_t begin = start_sector;
>> +uint64_t end = last_sector + 1;
>> +
>> +int nb = MIN(INT_MAX, end - begin);
>> +int64_t ret = bdrv_get_block_status_above(bs, NULL, begin, nb, &nb, &file);
>> +if (ret < 0) {
>> +return ret;
>> +}
>> +
>> +extent->flags =
>> +cpu_to_be32((ret & BDRV_BLOCK_ALLOCATED ? 0 : NBD_STATE_HOLE) |
>> +(ret & BDRV_BLOCK_ZERO  ? NBD_STATE_ZERO : 0));
>> +extent->length = cpu_to_be32((nb << BDRV_SECTOR_BITS) -
>> + (offset - (start_sector << BDRV_SECTOR_BITS)));
> ...then computing the length by undoing the rounding. I really think we
> should consider fixing bdrv_get_block_status_above() to be byte-based,
> but that's a separate series.  Your calculations look correct in the
> meantime, although '(offset & (BDRV_SECTOR_SIZE - 1))' may be a bit
> easier to read than '(offset - (start_sector << BDRV_SECTOR_BITS))'.

Agreed.  And please make it a separate variable, i.e.

uint64_t length;

length = (nb << BDRV_SECTOR_BITS) - (offset & (BDRV_SECTOR_SIZE - 1));
...
extent->length = cpu_to_be32(length);

Paolo
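
As a quick check of the rounding discussed above, here is a standalone sketch of the byte-to-sector conversion and of the length computation in the form suggested in this reply. The EX_ constants and ex_ function names are assumptions for the example; 512-byte sectors are assumed, and length is expected to be non-zero.

```c
#include <stdint.h>

#define EX_SECTOR_BITS 9
#define EX_SECTOR_SIZE (1ULL << EX_SECTOR_BITS)

/* Number of whole sectors needed to cover [offset, offset + length).
 * Rounds the start down and the end up; length must be non-zero. */
static uint64_t ex_covering_sectors(uint64_t offset, uint64_t length)
{
    uint64_t start_sector = offset >> EX_SECTOR_BITS;
    uint64_t last_sector = (offset + length - 1) >> EX_SECTOR_BITS;

    return last_sector + 1 - start_sector;
}

/* Undo the rounding at the start: bytes from offset up to the end of the
 * nb-th sector counted from the sector containing offset. */
static uint64_t ex_extent_length(uint64_t offset, uint64_t nb_sectors)
{
    return (nb_sectors << EX_SECTOR_BITS) - (offset & (EX_SECTOR_SIZE - 1));
}
```

For offset 700 and length 1000 the request touches sectors 1 to 3, so three sectors are queried, and the reported extent runs from byte 700 to byte 2048, i.e. 1348 bytes.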





Re: [Qemu-devel] [PATCH v3] backup: allow target without .bdrv_get_info

2017-02-15 Thread Kevin Wolf
Am 15.02.2017 um 15:58 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Currently backup to nbd target is broken, as nbd doesn't have
> .bdrv_get_info realization.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
> 
> v3: fix compilation (I feel like an idiot)
> adjust wording (Fam)
> 
> v2: add WARNING
> 
> ===
> 
> Since commit
> 
> commit 4c9bca7e39a6e07ad02c1dcde3478363344ec60b
> Author: John Snow 
> Date:   Thu Feb 25 15:58:30 2016 -0500
> 
> block/backup: avoid copying less than full target clusters
> 
> backup to nbd target is broken, we have "Couldn't determine the cluster size
> of the target image".
> 
> The proposed NBD protocol extension, NBD_OPT_INFO, should finally solve this
> problem. But until it is implemented, we need to allow backup to an nbd
> target for backward compatibility.
> 
> Furthermore, is it entirely ok to disallow backup if bds lacks .bdrv_get_info?
> Which behavior should be the default: to fail the backup or to use the
> default cluster size?
> 
>  block/backup.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index ea38733..d800a24 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -638,7 +638,16 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>   * backup cluster size is smaller than the target cluster size. Even for
>   * targets with a backing file, try to avoid COW if possible. */
>  ret = bdrv_get_info(target, &bdi);
> -if (ret < 0 && !target->backing) {
> +if (ret == -ENOTSUP) {
> +/* Cluster size is not defined */
> +fprintf(stderr,

error_report() was better because with HMP, the message would appear in
the monitor (where the user is working) instead of stderr (where they
might easily miss the message).

If this means that you need to include another header file, just do
that.

> +"WARNING: Target block device doesn't provide information "
> +"about block size and it doesn't have backing file. Default "
> +"block size of %u bytes is used. If actual block size of "
> +"target exceeds this default, backup may be unusable",

This error message could use a few more articles. :-)

"WARNING: The target block device doesn't provide information "
"about the block size and it doesn't have a backing file. The default "
"block size of %u bytes is used. If the actual block size of "
"the target exceeds this default, the backup may be unusable",


> +BACKUP_CLUSTER_SIZE_DEFAULT);
> +job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT;
> +} else if (ret < 0 && !target->backing) {
>  error_setg_errno(errp, -ret,
>  "Couldn't determine the cluster size of the target image, "
>  "which has no backing file");

I'm not completely sure whether this is the right fix or whether it
would be better to address this in NBD, e.g. by adding an option where
you specify the block size when opening the NBD driver. (Later, when NBD
can communicate this, it would check if the option matches what the
server says and error out if it doesn't.)

Kevin
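
For reference, the decision the patch under review is making can be written as a small pure function. This is a sketch, not the code in block/backup.c: the ex_ names and the 64 KiB default are assumptions, and the last two branches (fall back to the default when a backing file exists, otherwise take the larger of the default and the reported size) are my reading of the surrounding code rather than something shown in the quoted hunk.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed default for this sketch; stands in for BACKUP_CLUSTER_SIZE_DEFAULT. */
#define EX_CLUSTER_SIZE_DEFAULT (1 << 16)

/* Returns the cluster size to use, or a negative errno if the job should
 * fail. 'ret' is the result of querying the target for its block info;
 * 'info_cluster_size' is only meaningful when ret >= 0. */
static int64_t ex_pick_cluster_size(int ret, bool has_backing,
                                    int64_t info_cluster_size)
{
    if (ret == -ENOTSUP) {
        /* New branch from the patch: the driver cannot report a cluster
         * size at all, so warn and use the default. */
        return EX_CLUSTER_SIZE_DEFAULT;
    }
    if (ret < 0 && !has_backing) {
        /* Existing behaviour: a real error without a backing file is fatal. */
        return ret;
    }
    if (ret < 0) {
        /* Assumption: with a backing file, an error is not fatal. */
        return EX_CLUSTER_SIZE_DEFAULT;
    }
    /* Assumption: never go below the default cluster size. */
    return info_cluster_size > EX_CLUSTER_SIZE_DEFAULT
           ? info_cluster_size : EX_CLUSTER_SIZE_DEFAULT;
}
```

Kevin's alternative (an explicit block-size option on the NBD driver) would move the first branch out of the backup job and into the driver's info callback.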



Re: [Qemu-devel] [PATCH 18/18] nbd: BLOCK_STATUS for standard get_block_status function: client part

2017-02-15 Thread Paolo Bonzini


On 09/02/2017 17:00, Eric Blake wrote:
>> +if (!client->block_status_ok) {
>> +*pnum = nb_sectors;
>> +ret = BDRV_BLOCK_DATA | BDRV_BLOCK_ALLOCATED;
>> +if (bs->drv->protocol_name) {

This condition is always true, I think?

>> +ret |= BDRV_BLOCK_OFFSET_VALID | (sector_num * BDRV_SECTOR_SIZE);
>> +}
>> +return ret;
>> +}
> Looks like a sane fallback when we don't have anything more accurate.

>> +
>> +ret = nbd_client_co_cmd_block_status(bs, sector_num << BDRV_SECTOR_BITS,
>> + nb_sectors << BDRV_SECTOR_BITS,
>> + , _extents);
>> +if (ret < 0) {
>> +return ret;
>> +}
>> +
>> +*pnum = extents[0].length >> BDRV_SECTOR_BITS;
>> +ret = (extents[0].flags & NBD_STATE_HOLE ? 0 : BDRV_BLOCK_ALLOCATED) |
>> +  (extents[0].flags & NBD_STATE_ZERO ? BDRV_BLOCK_ZERO : 0);
>> +
>> +if ((ret & BDRV_BLOCK_ALLOCATED) && !(ret & BDRV_BLOCK_ZERO)) {
>> +ret |= BDRV_BLOCK_DATA;
>> +}

You can always return BDRV_BLOCK_OFFSET_VALID here, too.

Paolo
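
A compact way to see the flag translation, including the suggestion to always report a valid offset, is the sketch below. The EX_ constants are illustrative values chosen for this example, not taken from the patch, and ex_extent_to_status() is a hypothetical helper. It relies on the byte offset being sector aligned, so its low bits are free to carry the status flags.

```c
#include <stdint.h>

/* Illustrative flag values; assumptions of this sketch. */
#define EX_NBD_STATE_HOLE      (1u << 0)
#define EX_NBD_STATE_ZERO      (1u << 1)

#define EX_BLOCK_DATA          0x01
#define EX_BLOCK_ZERO          0x02
#define EX_BLOCK_OFFSET_VALID  0x04
#define EX_BLOCK_ALLOCATED     0x10
#define EX_SECTOR_SIZE         512

/* Translate the flags of one NBD extent into a block-status style result
 * for the extent starting at sector_num. */
static int64_t ex_extent_to_status(uint32_t nbd_flags, int64_t sector_num)
{
    int64_t ret = (nbd_flags & EX_NBD_STATE_HOLE ? 0 : EX_BLOCK_ALLOCATED) |
                  (nbd_flags & EX_NBD_STATE_ZERO ? EX_BLOCK_ZERO : 0);

    /* Allocated and not known to be zero: the extent carries data. */
    if ((ret & EX_BLOCK_ALLOCATED) && !(ret & EX_BLOCK_ZERO)) {
        ret |= EX_BLOCK_DATA;
    }

    /* As suggested in the review: the offset is always known here, so mark
     * it valid and fold the sector-aligned byte offset into the result. */
    ret |= EX_BLOCK_OFFSET_VALID | (sector_num * EX_SECTOR_SIZE);
    return ret;
}
```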





Re: [Qemu-devel] [PATCH 17/18] nbd: BLOCK_STATUS for standard get_block_status function: server part

2017-02-15 Thread Paolo Bonzini


On 09/02/2017 16:38, Eric Blake wrote:
> Umm, why are we sending only one status? If the client requests two ids
> during NBD_OPT_SET_META_CONTEXT, we should be able to provide both
> pieces of information at once.  For a minimal implementation, it works
> for proof of concept, but it is pretty restrictive to tell clients that
> they can only request a single status context.  I'm fine if we add that
> functionality in a later patch, but we'd better have the implementation
> ready for the same release as this patch (I still think 2.9 is a
> reasonable goal).

Agreed on this too.

Paolo





Re: [Qemu-devel] [PATCH 08/18] hbitmap: add next_zero function

2017-02-15 Thread Paolo Bonzini


On 07/02/2017 23:55, Eric Blake wrote:
> On 02/03/2017 09:47 AM, Vladimir Sementsov-Ogievskiy wrote:
>> The function searches for next zero bit.
>> Also add interface for BdrvDirtyBitmap.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> ---
>>  block/dirty-bitmap.c |  5 +
>>  include/block/dirty-bitmap.h |  2 ++
>>  include/qemu/hbitmap.h   |  8 
>>  util/hbitmap.c   | 26 ++
>>  4 files changed, 41 insertions(+)
>>
> 
> It would be nice to enhance the testsuite to cover hbitmap_next_zero().
> 

Agreed.

Paolo
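
As a starting point for such a test, the expected semantics can be pinned down against a trivial reference implementation on a flat bitmap. This is only a sketch: ex_next_zero() is a hypothetical helper, it scans bit by bit, and it does not model HBitmap's multi-level layout or granularity. The return convention (-1 when no zero bit is found) is an assumption as well.

```c
#include <stdint.h>

/* Return the index of the first zero bit at or after 'start', or -1 if all
 * bits in [start, nbits) are set. 'map' is a flat array of 64-bit words,
 * bit i living in word i / 64 at position i % 64. */
static int64_t ex_next_zero(const uint64_t *map, uint64_t nbits,
                            uint64_t start)
{
    for (uint64_t i = start; i < nbits; i++) {
        if (!(map[i / 64] & (1ULL << (i % 64)))) {
            return (int64_t)i;
        }
    }
    return -1;
}
```

A test could set random bits in both this reference and the real bitmap and compare the results for every start position.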





Re: [Qemu-devel] [PATCH 07/18] nbd: Minimal structured read for client

2017-02-15 Thread Paolo Bonzini


On 07/02/2017 21:14, Eric Blake wrote:
> On 02/03/2017 09:47 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Minimal implementation: always send DF flag, to not deal with fragmented
>> replies.
> 
> This works well with your minimal server implementation, but I worry
> that it will cause us to fall over when talking to a fully-compliant
> server that chooses to send EOVERFLOW errors for any request larger than
> 64k when DF is set; it also makes it impossible to benefit from sparse
> reads.  I guess that means we need to start thinking about followup
> patches to flesh out our implementation.  But maybe I can live with this
> patch as is, since the goal of your series was not so much the full
> power of structured reads, but getting to a point where we could use
> structured reply for block status, even if it means your client can only
> communicate with qemu-nbd as server for now, as long as we do get to the
> rest of the patches for a full-blown structured read.

Can you post a diff that expresses this as a comment?  I'll squash the
comment into this commit.

>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> ---
>>  block/nbd-client.c  |  47 +++
>>  block/nbd-client.h  |   2 +
>>  include/block/nbd.h |  15 +++--
>>  nbd/client.c| 170 ++--
>>  qemu-nbd.c  |   2 +-
>>  5 files changed, 203 insertions(+), 33 deletions(-)
> 
> Hmm - no change to the testsuite. Structured reads seems like the sort
> of thing that it would be nice to test with some canned server replies,
> particularly with server behavior that is permitted by the NBD protocol
> but does not happen by default in qemu-nbd.

This would require implementing some kind of mock server support.  That
would be a very good thing but not something we have much infrastructure
for (you could use either a real socket or a mock QIOChannel).

Thanks,

Paolo

>>
>> diff --git a/block/nbd-client.c b/block/nbd-client.c
>> index 3779c6c999..ff96bd1635 100644
>> --- a/block/nbd-client.c
>> +++ b/block/nbd-client.c
>> @@ -180,13 +180,20 @@ static void nbd_co_receive_reply(NBDClientSession *s,
>>  *reply = s->reply;
>>  if (reply->handle != request->handle ||
>>  !s->ioc) {
>> +reply->simple = true;
>>  reply->error = EIO;
> 
> I don't think this is quite right - by setting reply->simple to true,
> you are forcing the caller to treat this as the final packet related to
> this request->handle, even though that might not be the case.
> 
> As it is, I wonder if this code is correct, even before your patch - the
> server is allowed to give responses out-of-order (if we request multiple
> reads without waiting for the first response) - I don't see how setting
> reply->error to EIO if the request->handle indicates that we are
> receiving an out-of-order response to some other packet, but that our
> request is still awaiting traffic.
> 
>>  } else {
>> -if (qiov && reply->error == 0) {
>> -ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov, request->len,
>> -   true);
>> -if (ret != request->len) {
>> -reply->error = EIO;
>> +if (qiov) {
>> +if ((reply->simple ? reply->error == 0 :
>> + reply->type == NBD_REPLY_TYPE_OFFSET_DATA)) {
>> +ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov, request->len,
>> +   true);
> 
> This works only because you used the DF flag.  If we allow fragmenting,
> then you have to be careful to write the reply into the correct offset
> of the iov.
> 
>> +if (ret != request->len) {
>> +reply->error = EIO;
>> +}
>> +} else if (!reply->simple &&
>> +   reply->type == NBD_REPLY_TYPE_OFFSET_HOLE) {
>> +qemu_iovec_memset(qiov, 0, 0, request->len);
>>  }
> 
> Up to here, you didn't do any inspection for NBD_REPLY_FLAG_DONE (so you
> don't know if this is the last packet the server is sending for this
> request->handle), and didn't do any special casing for
> NBD_REPLY_TYPE_NONE or for the various error replies.  I'm not sure if
> this will always do what you want.  In fact, I'm not even sure if
> reply->error is set correctly for all structured packets.
> 
>>  }
>>  
>> @@ -227,6 +234,7 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
>>  .type = NBD_CMD_READ,
>>  .from = offset,
>>  .len = bytes,
>> +.flags = client->structured_reply ? NBD_CMD_FLAG_DF : 0,
>>  };
>>  NBDReply reply;
>>  ssize_t ret;
>> @@ -237,12 +245,30 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
>>  nbd_coroutine_start(client, &request);
>>  ret = nbd_co_send_request(bs, &request, NULL);
>>  if (ret < 0) {
>> -reply.error = -ret;
>> -} else {
>> -
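
To illustrate the points raised in the review about fragmented replies, here is a sketch of what a client would have to do once DF is no longer forced: place each data or hole chunk at its own offset inside the request buffer, and only treat the reply as finished when a chunk carries the done flag. The struct and names below are hypothetical and much simpler than the NBD wire format; error chunks are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum ex_chunk_type {
    EX_CHUNK_NONE,         /* carries no payload, e.g. just "done" */
    EX_CHUNK_OFFSET_DATA,  /* payload bytes for [offset, offset + length) */
    EX_CHUNK_OFFSET_HOLE,  /* [offset, offset + length) reads as zeroes */
};

struct ex_chunk {
    enum ex_chunk_type type;
    bool done;             /* last chunk for this request */
    uint64_t offset;       /* absolute offset of the chunk */
    uint32_t length;
    const uint8_t *data;   /* payload for OFFSET_DATA, otherwise NULL */
};

/* Apply one chunk to the buffer of a read covering
 * [req_offset, req_offset + req_len). Returns 1 when the reply is complete,
 * 0 when more chunks are expected, -1 if the chunk is out of range. */
static int ex_apply_chunk(uint8_t *buf, uint64_t req_offset, uint32_t req_len,
                          const struct ex_chunk *c)
{
    if (c->type != EX_CHUNK_NONE) {
        /* Range check written to avoid overflow in the additions. */
        if (c->offset < req_offset || c->length > req_len ||
            c->offset - req_offset > req_len - c->length) {
            return -1;
        }

        uint8_t *dst = buf + (c->offset - req_offset);

        if (c->type == EX_CHUNK_OFFSET_DATA) {
            memcpy(dst, c->data, c->length);
        } else {
            memset(dst, 0, c->length);
        }
    }
    return c->done ? 1 : 0;
}
```

Chunks may arrive in any order in this model, which is why the destination is computed from the chunk's own offset rather than from a running position.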

Re: [Qemu-devel] [PATCH 06/18] nbd/client: refactor drop_sync

2017-02-15 Thread Paolo Bonzini


On 08/02/2017 08:55, Vladimir Sementsov-Ogievskiy wrote:
> 07.02.2017 02:19, Eric Blake wrote:
>> On 02/03/2017 09:47 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> Return 0 on success to simplify success checking.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>> ---
>>>  nbd/client.c | 35 +++
>>>  1 file changed, 19 insertions(+), 16 deletions(-)
>> I'm not sure that this simplifies anything.  You have a net addition in
>> lines of code, so unless some later patch is improved because of this,
>> I'm inclined to say this is needless churn.
>>
> 
> I just dislike duplicating information like "drop_sync(ioc, 124) !=
> 124". There is no place in the code where a positive return value not
> equal to the size is actually handled. But it is not that important; if
> you are against it, I'll drop this, no problem.

I think I agree with Vladimir.

> One note: there is something I don't understand well. A read can
> actually return a positive value smaller than the requested size, which
> means we should read again. But this is not handled in the code (or
> rather, it is handled only as an error), except in drop_sync. (With
> drop_sync it is a side effect of using a limited buffer size, yes?)
> Is that all OK?

It is handled in nbd_wr_syncv.

Paolo
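
For readers following the short-read question: below is a minimal sketch of the usual retry loop, written against plain POSIX read(). It is not QEMU's nbd_wr_syncv(); the name read_full() and the choice of -EIO for premature EOF are assumptions made for illustration. It returns 0 on success, which is the convention the patch proposes for drop_sync.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of QEMU): read exactly len bytes from fd,
 * retrying after short reads. Returns 0 on success, -EIO if the peer
 * closes the stream early, or -errno on any other failure.
 */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t done = 0;

    assert(buf != NULL || len == 0);

    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);

        if (n < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted before any data: retry */
            }
            return -errno;
        }
        if (n == 0) {
            return -EIO;        /* EOF before len bytes arrived */
        }
        done += (size_t)n;      /* short read: loop for the remainder */
    }
    return 0;
}
```

With a helper of this shape, callers check "read_full(fd, buf, 124) < 0" rather than comparing the return value against the size a second time.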



Re: [Qemu-devel] [Qemu-block] [PATCH] block: Swap request limit definitions

2017-02-15 Thread Max Reitz
On 15.02.2017 17:44, Kevin Wolf wrote:
> Am 15.02.2017 um 14:42 hat Max Reitz geschrieben:
>> On 14.02.2017 10:52, Alberto Garcia wrote:
>>> On Mon 13 Feb 2017 06:13:38 PM CET, Max Reitz wrote:
>>>
>>>>>> -#define BDRV_REQUEST_MAX_SECTORS MIN(SIZE_MAX >> BDRV_SECTOR_BITS, \
>>>>>> -                                     INT_MAX >> BDRV_SECTOR_BITS)
>>>>>> -#define BDRV_REQUEST_MAX_BYTES (BDRV_REQUEST_MAX_SECTORS << BDRV_SECTOR_BITS)
>>>>>> +#define BDRV_REQUEST_MAX_BYTES   MIN(SIZE_MAX, INT_MAX)
>>>>>> +#define BDRV_REQUEST_MAX_SECTORS (BDRV_REQUEST_MAX_BYTES >> BDRV_SECTOR_BITS)
>>>>>
>>>>> I'm just pointing it out because I don't know if this can cause
>>>>> problems, but this patch would make BDRV_REQUEST_MAX_BYTES not a
>>>>> multiple of the sector size (INT_MAX is actually a prime number).
>>>>
>>>> Very good point. I don't think this could be an issue, though. For one
>>>> thing, the use of BDRV_REQUEST_MAX_BYTES is very limited.
>>>
>>> Ok, but then I wonder what's the benefit of increasing
>>> BDRV_REQUEST_MAX_BYTES.
>>
>> The benefit is that the definition looks cleaner.
> 
> Whatever way we want to write it, I think MAX_BYTES = MAX_SECTORS * 512
> should be a given. Everything else is bound to confuse people and
> introduce bugs sooner or later.

Probably only sooner and not later, considering we are switching to byte
granularity overall anyway. And if something confuses people, I'd argue
it's the fact that we still have sector granularity all over the place
and not that your requests can be a bit bigger if you submit them in
bytes than if you submit them in sectors.

Anyway, if MAX_BYTES should be a multiple of the sector size, then I
can't think of a much better way to write this than what we currently
have and this patch is unneeded.

Max
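
To make the arithmetic in this thread concrete, here is a small sketch that evaluates both sets of definitions. The CUR_/NEW_ macro names and the accessor functions are made up for this example; it assumes 512-byte sectors (a shift of 9, as implied by "MAX_SECTORS * 512" above) and a 32-bit int.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define SECTOR_BITS 9   /* assumption: 512-byte sectors */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

static_assert(INT_MAX == 0x7fffffff, "sketch assumes a 32-bit int");

/* Current definitions: the byte limit is derived from the sector limit. */
#define CUR_MAX_SECTORS MIN(SIZE_MAX >> SECTOR_BITS, INT_MAX >> SECTOR_BITS)
#define CUR_MAX_BYTES   (CUR_MAX_SECTORS << SECTOR_BITS)

/* Proposed definitions: the sector limit is derived from the byte limit. */
#define NEW_MAX_BYTES   MIN(SIZE_MAX, INT_MAX)
#define NEW_MAX_SECTORS (NEW_MAX_BYTES >> SECTOR_BITS)

/* Accessors so the values can be inspected at run time. */
static uint64_t cur_max_sectors(void) { return CUR_MAX_SECTORS; }
static uint64_t cur_max_bytes(void)   { return CUR_MAX_BYTES; }
static uint64_t new_max_sectors(void) { return NEW_MAX_SECTORS; }
static uint64_t new_max_bytes(void)   { return NEW_MAX_BYTES; }
```

Under these assumptions both variants allow 4194303 sectors. The current byte limit is 2147483136, a multiple of 512, while the proposed one is 2147483647, which leaves a 511-byte remainder that cannot be expressed in whole sectors.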





Re: [Qemu-devel] [PATCH v6 4/7] ACPI: Add Virtual Machine Generation ID support

2017-02-15 Thread Michael S. Tsirkin
On Wed, Feb 15, 2017 at 05:07:57PM +0100, Igor Mammedov wrote:
> > Those improvements can be added later, IMO -- but please do work out
> > with Igor whether he really wants a v7 for those.
> since these are minor fixes that don't influence other patches in the series,
> there is no need to repost the whole series;
> just this fixed-up patch as a reply to this thread tagged as v7,
> or a patch on top. I'm fine with either way.

OK, I'll merge v7, send more cleanups/fixes as patches on top pls.

-- 
MST



Re: [Qemu-devel] [PATCH v6 1/7] linker-loader: Add new 'write pointer' command

2017-02-15 Thread Michael S. Tsirkin
On Wed, Feb 15, 2017 at 04:56:02PM +0100, Igor Mammedov wrote:
> On Wed, 15 Feb 2017 17:30:00 +0200
> "Michael S. Tsirkin"  wrote:
> 
> > On Wed, Feb 15, 2017 at 04:22:25PM +0100, Igor Mammedov wrote:
> > > On Wed, 15 Feb 2017 15:13:20 +0100
> > > Laszlo Ersek  wrote:
> > >   
> > > > Commenting under Igor's reply for simplicity
> > > > 
> > > > On 02/15/17 11:57, Igor Mammedov wrote:  
> > > > > On Tue, 14 Feb 2017 22:15:43 -0800
> > > > > b...@skyportsystems.com wrote:
> > > > > 
> > > > >> From: Ben Warren 
> > > > >>
> > > > >> This is similar to the existing 'add pointer' functionality, but instead
> > > > >> of instructing the guest (BIOS or UEFI) to patch memory, it instructs
> > > > >> the guest to write the pointer back to QEMU via a writeable fw_cfg file.
> > > > >>
> > > > >> Signed-off-by: Ben Warren 
> > > > >> ---
> > > > >>  hw/acpi/bios-linker-loader.c | 58 ++--
> > > > >>  include/hw/acpi/bios-linker-loader.h |  6 
> > > > >>  2 files changed, 61 insertions(+), 3 deletions(-)
> > > > >>
> > > > >> diff --git a/hw/acpi/bios-linker-loader.c b/hw/acpi/bios-linker-loader.c
> > > > >> index d963ebe..5030cf1 100644
> > > > >> --- a/hw/acpi/bios-linker-loader.c
> > > > >> +++ b/hw/acpi/bios-linker-loader.c
> > > > >> @@ -78,6 +78,19 @@ struct BiosLinkerLoaderEntry {
> > > > >>  uint32_t length;
> > > > >>  } cksum;
> > > > >>  
> > > > >> +/*
> > > > >> + * COMMAND_WRITE_POINTER - write the fw_cfg file (originating from
> > > > >> + * @dest_file) at @wr_pointer.offset, by adding a pointer to the table
> > > > >> + * originating from @src_file. 1,2,4 or 8 byte unsigned
> > > > >> + * addition is used depending on @wr_pointer.size.
> > > > >> + */
> > > > 
> > > > The words "adding" and "addition" are causing confusion here.
> > > > 
> > > > In all of the previous discussion, *addition* was out of scope from
> > > > WRITE_POINTER. Again, the firmware is specifically not required to
> > > > *read* any part of the fw_cfg blob identified by "dest_file".
> > > > 
> > > > WRITE_POINTER instructs the firmware to return the allocation address of
> > > > the downloaded "src_file" to QEMU. Any necessary runtime subscripting
> > > > within "src_file" is to be handled by QEMU code dynamically.
> > > > 
> > > > For example, consider that "src_file" has *several* fields that QEMU
> > > > wants to massage; in that case, indexing within QEMU code with field
> > > > offsets is simply unavoidable.  
> > > what I don't like here is that this indexing would be rather fragile
> > > and has to be done in different parts of QEMU /device, AML/.
> > > 
> > > I'd prefer this helper function to have the same @src_offset
> > > behavior as ADD_POINTER where patched address could point to
> > > any part of src_file i.e. not just beginning.  
> > 
> > 
> > 
> > /*
> >  * COMMAND_ADD_POINTER - patch the table (originating from
> >  * @dest_file) at @pointer.offset, by adding a pointer to the table
> >  * originating from @src_file. 1,2,4 or 8 byte unsigned
> >  * addition is used depending on @pointer.size.
> >  */
> >  
> > so the way ADD works is
> > read at offset
> > add table address
> > write result at offset
> > 
> > in other words it is always beginning of table that is added.
> more exactly, it's:
>   src_offset = *(dst_blob_ptr + dst_offset)
>   *(dst_blob_ptr + dst_offset) = src_blob_ptr + src_offset
> 
> > Would the following be acceptable?
> > 
> > 
> >  * COMMAND_WRITE_POINTER - update the fw_cfg file (originating from
> >  * @dest_file) at @wr_pointer.offset, by writing a pointer to the table
> >  * originating from @src_file. 1,2,4 or 8 byte unsigned value
> >  * is written depending on @wr_pointer.size.
> it loses the 'adding' part of the ADD_POINTER command, which handles src_offset.
> However, implementing the adding part looks a bit complicated,
> as the patched blob (dst) is not in guest memory but in QEMU, and
> on reset *(dst_blob+dst_offset) should be reset to src_offset.
> Considering the dst file could be device-specific memory (field/blob/whatever),
> it could be hard to track/notice proper reset behavior.
> 
> So now I'm not sure if src_offset is worth adding.

Right. Let's just do this math in QEMU if we have to.
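
A rough model of the two semantics discussed above may help; this is an illustrative sketch only, not code from the patch or from any firmware. The function names model_add_pointer() and model_write_pointer() are invented here, the blobs are plain byte buffers, and a little-endian host is assumed for sizes smaller than 8.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * ADD_POINTER model: read the value already stored at dst_offset (the
 * pre-set src_offset), add the address of the source table, store it back.
 */
static void model_add_pointer(uint8_t *dst_blob, size_t dst_offset,
                              uint64_t src_addr, size_t size)
{
    uint64_t val = 0;

    assert(size == 1 || size == 2 || size == 4 || size == 8);
    memcpy(&val, dst_blob + dst_offset, size);  /* little-endian assumed */
    val += src_addr;
    memcpy(dst_blob + dst_offset, &val, size);
}

/*
 * WRITE_POINTER model: store the address of the source table at
 * dst_offset. The previous contents are never read, so no offset is added.
 */
static void model_write_pointer(uint8_t *dst_file, size_t dst_offset,
                                uint64_t src_addr, size_t size)
{
    assert(size == 1 || size == 2 || size == 4 || size == 8);
    memcpy(dst_file + dst_offset, &src_addr, size);  /* little-endian assumed */
}
```

In this model, any indexing into the source file after a WRITE_POINTER is done by the reader of the written value, i.e. the stored address plus a field offset computed on the QEMU side.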

> > 
> > 
> > > 
> > >   
> > > > (1) So, the above looks correct, but please replace "adding" with
> > > > "storing", and "unsigned addition" with "store".
> > > > 
> > > > Side point: the case for ADD_POINTER is different; there we patch
> > > > several individual ACPI objects. The fact that I requested explicit
> > > > addition within the ADDR method, as opposed to pre-setting VGIA to a
> > > > nonzero offset, is an *incidental* limitation 

Re: [Qemu-devel] [PATCH 1/1] qemu-iotests: redirect nbd server stdout to /dev/null

2017-02-15 Thread Kevin Wolf
Am 14.02.2017 um 19:42 hat Eric Blake geschrieben:
> On 02/14/2017 12:15 PM, Jeff Cody wrote:
> > Some iotests (e.g. 174) try to filter the output of _make_test_image by
> > piping the stdout.  Pipe the server stdout to /dev/null, so that filter
> > pipe does not need to wait until process completion.
> > 
> > Signed-off-by: Jeff Cody 
> > ---
> >  tests/qemu-iotests/common.rc | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Reviewed-by: Eric Blake 

Thanks, applied to the block branch.

Kevin




Re: [Qemu-devel] [Qemu-discuss] Estimation of qcow2 image size converted from raw image

2017-02-15 Thread Daniel P. Berrange
On Wed, Feb 15, 2017 at 05:05:04PM +0100, Alberto Garcia wrote:
> On Wed 15 Feb 2017 04:57:12 PM CET, Nir Soffer wrote:
> >>> Let's try this syntax:
> >>>
> >>>   $ qemu-img query-max-size -f raw -O qcow2 input.raw
> >>>   1234678000
> >>>
> >>> As John explained, it is only an estimate.  But it will be a
> >>> conservative maximum.
> >>
> >> This forces you to have an input file. It would be nice to be able to
> >> get the same information by merely giving the desired capacity, e.g.
> >>
> >>   $ qemu-img query-max-size -O qcow2 20G
> >
> > Without a file, this will have to assume that all clusters will be
> > allocated.
> 
> ...and that there are no internal snapshots. I'm not sure if this is
> very useful in general.

As long as the caveat is documented it is fine. Internal snapshots are
often completely ignored by apps since they have many downsides compared
to using external snapshots.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|


