Re: [Qemu-devel] [PATCH 1/2] dataplane: drop copy_in_vring_desc()
On Fri, Jun 26, 2015 at 07:05:07PM +0200, Greg Kurz wrote: On Fri, 26 Jun 2015 18:28:42 +0200 Michael S. Tsirkin m...@redhat.com wrote: On Fri, Jun 26, 2015 at 02:18:38PM +0200, Greg Kurz wrote: On Fri, 26 Jun 2015 12:28:45 +0200 Cornelia Huck cornelia.h...@de.ibm.com wrote: On Fri, 26 Jun 2015 09:32:21 +0200 Greg Kurz gk...@linux.vnet.ibm.com wrote: During early virtio 1.0 devel, there were several proposals about how to deal with the endianness of the vring descriptor fields: - convert the decriptor to host endianness in a single place, and use its fields directly in the code - keep the descriptor untouched and use virtio memory helpers to access its fields with the appropriate endianness It seems like both approaches got merged: commit f5a5628cf0b6 introduces an extra swap that negates the one brought by commit b0e5d90ebc3e. This breaks boot in SLOF (BE client) when host is ppc64le with the following QEMU error: Failed to map descriptor addr 0x18e2517e len 268435456 A solution could be to revert f5a5628cf0b6, but dropping copy_in_vring_desc() is equivalent and result in a smaller patch. I'd prefer the revert, as the resulting code is nicer IMHO. Agreed. It is good to clear the endianness noise out of the real code. :) Can you please send v2 that works the way you want it? Taken from a previous mail by Stefan: Cornelia already sent [PATCH] Revert dataplane: allow virtio-1 devices to revert f5a5628cf0b. I acked it but the patch is going through Michael Tsirkin. I guess you just have to take Cornelia's patch + Stefan's ack, and patch 2/2 in my series + Cornelia's rb, and we are all set. :) FWIW I tested and it works exactly the same. Sounds good. Can you please test the pci branch from my tree, and confirm? But your second patch should apply regardless. Thanks for your feedback. This patch allows SLOF to boot the OS. Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com --- hw/virtio/dataplane/vring.c | 14 ++ 1 file changed, 2 insertions(+), 12 deletions(-)
Re: [Qemu-devel] Allocate PCI MMIO without BAR requests.
Hi David, On 2015-06-26 17:32, David kiarie wrote: Hi all, Some efforts to emulate AMD IOMMU have being going over the past few months. In real hardware AMD IOMMU is implemented as a PCI function. When emulating it in Qemu we want to allocate it MMIO space but real AMD IOMMU manage to reserve memory without making a BAR request, probably through a static address that's written by the device.(This is something similar to what non-PCI bus devices do).Trying to reserve memory via a BAR request results in address conflicts(in Linux) and all other PCI devices reserve platform resources via BAR requests. The AMD IOMMU spec makes it even clearer: 3 Registers The IOMMU is configured and controlled via two sets of registers — one in the PCI configuration space and another set mapped in system address space. [...] 3.1 PCI Resources [...] A PCI Function containing an IOMMU capability block does not include PCI BAR registers. I would like to hear suggestions on how to reserve a memory region for the device without making a BAR request. I see two approaches: - Let the IOMMU sit on two buses, PCI and system, i.e. become a PCI and SysBus device at the same time - I suspect, though, that this cannot be modeled with QOM right now. - Model the MMIO registers via the BAR interface but overwrite the PCI config space so that no BAR becomes visible and make sure that writes to the PCI command register cannot disable this region (which would be the case with normal BARs). Hackish, but it seems feasible. Jan signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v7 3/3] ich9: implement strap SPKR pin logic
On Sat, Jun 27, 2015 at 02:56:33PM -0300, Paulo Alcantara wrote: If the signal is sampled high, this indicates that the system is strapped to the No Reboot mode (ICH9 will disable the TCO Timer system reboot feature). The status of this strap is readable via the NO_REBOOT bit (CC: offset 0x3410:bit 5). The NO_REBOOT bit is set when SPKR pin on ICH9 is sampled high. This bit may be set or cleared by software if the strap is sampled low but may not override the strap when it indicates No Reboot. This patch implements the logic where hardware has ability to set SPKR pin through a property named pin-spkr I would call it noreboot and not pin-spkr since that's what it does in the end. and it's sampled low by default. I think sample high is a safer default. Signed-off-by: Paulo Alcantara pca...@zytor.com --- hw/acpi/tco.c | 3 ++- hw/isa/lpc_ich9.c | 38 ++ include/hw/i386/ich9.h | 11 +++ 3 files changed, 51 insertions(+), 1 deletion(-) diff --git a/hw/acpi/tco.c b/hw/acpi/tco.c index 1794a54..c1f5739 100644 --- a/hw/acpi/tco.c +++ b/hw/acpi/tco.c @@ -64,7 +64,8 @@ static void tco_timer_expired(void *opaque) tr-tco.sts2 |= TCO_BOOT_STS; tr-timeouts_no = 0; -if (!(gcs ICH9_CC_GCS_NO_REBOOT)) { +if ((lpc-pin_strap.spkr ICH9_PS_SPKR_PIN_LOW) +!(gcs ICH9_CC_GCS_NO_REBOOT)) { watchdog_perform_action(); tco_timer_stop(tr); return; diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c index b547002..49d1f30 100644 --- a/hw/isa/lpc_ich9.c +++ b/hw/isa/lpc_ich9.c @@ -575,11 +575,49 @@ static void ich9_lpc_get_sci_int(Object *obj, Visitor *v, visit_type_uint32(v, value, name, errp); } +static void ich9_lpc_get_spkr_pin(Object *obj, Visitor *v, + void *opaque, const char *name, + Error **errp) +{ +ICH9LPCState *lpc = opaque; +uint8_t value = lpc-pin_strap.spkr; + +visit_type_uint8(v, value, name, errp); +} + +static void ich9_lpc_set_spkr_pin(Object *obj, Visitor *v, + void *opaque, const char *name, + Error **errp) +{ +ICH9LPCState *lpc = opaque; +Error *local_err = NULL; +uint8_t value; +uint32_t *gcs; + +visit_type_uint8(v, value, name, local_err); +if (local_err) { +goto out; +} +value = ICH9_PS_SPKR_PIN_MASK; +if (value ICH9_PS_SPKR_PIN_HIGH) { +gcs = (uint32_t *)lpc-chip_config[ICH9_CC_GCS]; +*gcs |= ICH9_CC_GCS_NO_REBOOT; +} +lpc-pin_strap.spkr = value; +out: +error_propagate(errp, local_err); +} + static void ich9_lpc_add_properties(ICH9LPCState *lpc) { static const uint8_t acpi_enable_cmd = ICH9_APM_ACPI_ENABLE; static const uint8_t acpi_disable_cmd = ICH9_APM_ACPI_DISABLE; +lpc-pin_strap.spkr = ICH9_PS_SPKR_PIN_DEFAULT; +object_property_add(OBJECT(lpc), pin-spkr, uint8, +ich9_lpc_get_spkr_pin, +ich9_lpc_set_spkr_pin, +NULL, lpc, NULL); object_property_add(OBJECT(lpc), ACPI_PM_PROP_SCI_INT, uint32, ich9_lpc_get_sci_int, NULL, NULL, NULL, NULL); BTW it's easier to add simple properties in dc-props then you don't need all the error propagate code etc. diff --git a/include/hw/i386/ich9.h b/include/hw/i386/ich9.h index f5681a3..aafc43f 100644 --- a/include/hw/i386/ich9.h +++ b/include/hw/i386/ich9.h @@ -46,6 +46,11 @@ typedef struct ICH9LPCState { ICH9LPCPMRegs pm; uint32_t sci_level; /* track sci level */ +/* 2.24 Pin Straps */ +struct { +uint8_t spkr; +} pin_strap; + /* 10.1 Chipset Configuration registers(Memory Space) which is pointed by RCBA */ uint8_t chip_config[ICH9_CC_SIZE]; @@ -72,6 +77,12 @@ Object *ich9_lpc_find(void); #define Q35_MASK(bit, ms_bit, ls_bit) \ ((uint##bit##_t)(((1ULL ((ms_bit) + 1)) - 1) ~((1ULL ls_bit) - 1))) +/* ICH9: Pin Straps */ +#define ICH9_PS_SPKR_PIN_LOW0x01 +#define ICH9_PS_SPKR_PIN_HIGH 0x02 +#define ICH9_PS_SPKR_PIN_MASK 0x03 +#define ICH9_PS_SPKR_PIN_DEFAULTICH9_PS_SPKR_PIN_LOW + The interface seems a bit inconvenient to me. Why not make it a simple boolean property? /* ICH9: Chipset Configuration Registers */ #define ICH9_CC_ADDR_MASK (ICH9_CC_SIZE - 1) -- 2.1.0
Re: [Qemu-devel] [PATCH] hw/pxb: add chassis_nr property
On 06/18/2015 08:49 PM, Marcel Apfelbaum wrote: Add a chassis_nr property Instead of using PXB bus number as internal bridge's chassis nr. ping I think this should be part of 2.4. Thanks, Marcel Suggested-by: Michael S. Tsirkin m...@redhat.com Signed-off-by: Marcel Apfelbaum mar...@redhat.com --- docs/pci_expander_bridge.txt| 7 +++ hw/pci-bridge/pci_expander_bridge.c | 5 - 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/docs/pci_expander_bridge.txt b/docs/pci_expander_bridge.txt index d7913fb..bc5be05 100644 --- a/docs/pci_expander_bridge.txt +++ b/docs/pci_expander_bridge.txt @@ -23,9 +23,9 @@ A detailed command line would be: -m 2G -object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0 -numa node,nodeid=0,cpus=0,memdev=ram-node0 -object memory-backend-ram,size=1024M,policy=bind,host-nodes=1,id=ram-node1 -numa node,nodeid=1,cpus=1,memdev=ram-node1 --device pxb,id=bridge1,bus=pci.0,numa_node=1,bus_nr=4 -netdev user,id=nd-device e1000,bus=bridge1,addr=0x4,netdev=nd --device pxb,id=bridge2,bus=pci.0,numa_node=0,bus_nr=8,bus=pci.0 -device e1000,bus=bridge2,addr=0x3 --device pxb,id=bridge3,bus=pci.0,bus_nr=40,bus=pci.0 -drive if=none,id=drive0,file=[img] -device virtio-blk-pci,drive=drive0,scsi=off,bus=bridge3,addr=1 +-device pxb,id=bridge1,bus=pci.0,numa_node=1,bus_nr=4,chassis_nr=4 -netdev user,id=nd-device e1000,bus=bridge1,addr=0x4,netdev=nd +-device pxb,id=bridge2,bus=pci.0,numa_node=0,bus_nr=8,bus=pci.0,chassis_nr=8 -device e1000,bus=bridge2,addr=0x3 +-device pxb,id=bridge3,bus=pci.0,bus_nr=40,bus=pci.0,chassis_nr=40 -drive if=none,id=drive0,file=[img] -device virtio-blk-pci,drive=drive0,scsi=off,bus=bridge3,addr=1 Here you have: - 2 NUMA nodes for the guest, 0 and 1. (both mapped to the same NUMA node in host, but you can and should put it in different host NUMA nodes) @@ -55,4 +55,3 @@ The PXB is composed by: - Using the bridge will enable hotplug support - All the devices behind the bridge will use bridge's IO/MEM windows compacting the PCI address space. - diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c index ec2bb45..62756d1 100644 --- a/hw/pci-bridge/pci_expander_bridge.c +++ b/hw/pci-bridge/pci_expander_bridge.c @@ -38,6 +38,7 @@ typedef struct PXBDev { PCIDevice parent_obj; /* public */ +uint8_t chassis_nr; uint8_t bus_nr; uint16_t numa_node; } PXBDev; @@ -175,7 +176,7 @@ static int pxb_dev_initfn(PCIDevice *dev) bds = qdev_create(BUS(bus), pci-bridge); bds-id = dev_name; -qdev_prop_set_uint8(bds, chassis_nr, pxb-bus_nr); +qdev_prop_set_uint8(bds, chassis_nr, pxb-chassis_nr); PCI_HOST_BRIDGE(ds)-bus = bus; @@ -194,6 +195,8 @@ static int pxb_dev_initfn(PCIDevice *dev) } static Property pxb_dev_properties[] = { +/* Note: 0 is not a legal chassis number. */ +DEFINE_PROP_UINT8(chassis_nr, PXBDev, chassis_nr, 0), /* Note: 0 is not a legal a PXB bus number. */ DEFINE_PROP_UINT8(bus_nr, PXBDev, bus_nr, 0), DEFINE_PROP_UINT16(numa_node, PXBDev, numa_node, NUMA_NODE_UNASSIGNED),
Re: [Qemu-devel] [PATCH] Fix incorrect small packet padding in vmxnet3
On Jun 28, 2015, at 18:34 PM, Brian Kress bkr...@egenera.com wrote: When running ESXi under qemu there is an issue with the ESXi guest discarding packets that are too short. The guest discards any packets under the normal minimum length for an ethernet packet (60). This results in odd behaviour where other hosts or VMs on other hosts can communicate with the ESXi guest just fine (since there's a physical NIC somewhere doing padding), but VMs on the host and the host itself cannot because the ARP request packets are too small for the ESXi host to accept. Someone in the past thought this was worth fixing, and added code to the vmxnet3 qemu emulation such that if it is receiving packets smaller than 60 bytes to pad the packet out to 60. Unfortunately this code is wrong (or at least in the wrong place). It does so BEFORE before taking into account the vnet_hdr at the front of the packet added by the tap device. As a result, it might add padding, but it never adds enough. Specifically it adds 10 less (the length of the vnet_hdr) than it needs to. The following (hopefully obviously correct) patch simply swaps the order of processing the vnet header and the padding. With this patch an ESXi guest is able to communicate with the host or other local VMs. Signed-off-by: Brian Kress kre...@moose.net --- a/hw/net/vmxnet3.c 2015-04-27 10:08:24.0 -0400 +++ b/hw/net/vmxnet3.c 2015-06-23 11:38:48.865728713 -0400 @@ -1879,6 +1879,12 @@ return -1; } +if (s-peer_has_vhdr) { +vmxnet_rx_pkt_set_vhdr(s-rx_pkt, (struct virtio_net_hdr *)buf); +buf += sizeof(struct virtio_net_hdr); +size -= sizeof(struct virtio_net_hdr); +} + /* Pad to minimum Ethernet frame length */ if (size sizeof(min_buf)) { memcpy(min_buf, buf, size); @@ -1887,12 +1893,6 @@ size = sizeof(min_buf); } -if (s-peer_has_vhdr) { -vmxnet_rx_pkt_set_vhdr(s-rx_pkt, (struct virtio_net_hdr *)buf); -buf += sizeof(struct virtio_net_hdr); -size -= sizeof(struct virtio_net_hdr); -} - Looks good. ~Dmitry. vmxnet_rx_pkt_set_packet_type(s-rx_pkt, get_eth_packet_type(PKT_GET_ETH_HDR(buf)));
Re: [Qemu-devel] [PATCH v7 3/3] ich9: implement strap SPKR pin logic
On Sun, 28 Jun 2015 10:37:58 +0200 Michael S. Tsirkin m...@redhat.com wrote: On Sat, Jun 27, 2015 at 02:56:33PM -0300, Paulo Alcantara wrote: If the signal is sampled high, this indicates that the system is strapped to the No Reboot mode (ICH9 will disable the TCO Timer system reboot feature). The status of this strap is readable via the NO_REBOOT bit (CC: offset 0x3410:bit 5). The NO_REBOOT bit is set when SPKR pin on ICH9 is sampled high. This bit may be set or cleared by software if the strap is sampled low but may not override the strap when it indicates No Reboot. This patch implements the logic where hardware has ability to set SPKR pin through a property named pin-spkr I would call it noreboot and not pin-spkr since that's what it does in the end. Right. That's also more user intuitive, indeed. and it's sampled low by default. I think sample high is a safer default. OK. I'll default it to high. Signed-off-by: Paulo Alcantara pca...@zytor.com --- hw/acpi/tco.c | 3 ++- hw/isa/lpc_ich9.c | 38 ++ include/hw/i386/ich9.h | 11 +++ 3 files changed, 51 insertions(+), 1 deletion(-) diff --git a/hw/acpi/tco.c b/hw/acpi/tco.c index 1794a54..c1f5739 100644 --- a/hw/acpi/tco.c +++ b/hw/acpi/tco.c @@ -64,7 +64,8 @@ static void tco_timer_expired(void *opaque) tr-tco.sts2 |= TCO_BOOT_STS; tr-timeouts_no = 0; -if (!(gcs ICH9_CC_GCS_NO_REBOOT)) { +if ((lpc-pin_strap.spkr ICH9_PS_SPKR_PIN_LOW) +!(gcs ICH9_CC_GCS_NO_REBOOT)) { watchdog_perform_action(); tco_timer_stop(tr); return; diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c index b547002..49d1f30 100644 --- a/hw/isa/lpc_ich9.c +++ b/hw/isa/lpc_ich9.c @@ -575,11 +575,49 @@ static void ich9_lpc_get_sci_int(Object *obj, Visitor *v, visit_type_uint32(v, value, name, errp); } +static void ich9_lpc_get_spkr_pin(Object *obj, Visitor *v, + void *opaque, const char *name, + Error **errp) +{ +ICH9LPCState *lpc = opaque; +uint8_t value = lpc-pin_strap.spkr; + +visit_type_uint8(v, value, name, errp); +} + +static void ich9_lpc_set_spkr_pin(Object *obj, Visitor *v, + void *opaque, const char *name, + Error **errp) +{ +ICH9LPCState *lpc = opaque; +Error *local_err = NULL; +uint8_t value; +uint32_t *gcs; + +visit_type_uint8(v, value, name, local_err); +if (local_err) { +goto out; +} +value = ICH9_PS_SPKR_PIN_MASK; +if (value ICH9_PS_SPKR_PIN_HIGH) { +gcs = (uint32_t *)lpc-chip_config[ICH9_CC_GCS]; +*gcs |= ICH9_CC_GCS_NO_REBOOT; +} +lpc-pin_strap.spkr = value; +out: +error_propagate(errp, local_err); +} + static void ich9_lpc_add_properties(ICH9LPCState *lpc) { static const uint8_t acpi_enable_cmd = ICH9_APM_ACPI_ENABLE; static const uint8_t acpi_disable_cmd = ICH9_APM_ACPI_DISABLE; +lpc-pin_strap.spkr = ICH9_PS_SPKR_PIN_DEFAULT; +object_property_add(OBJECT(lpc), pin-spkr, uint8, +ich9_lpc_get_spkr_pin, +ich9_lpc_set_spkr_pin, +NULL, lpc, NULL); object_property_add(OBJECT(lpc), ACPI_PM_PROP_SCI_INT, uint32, ich9_lpc_get_sci_int, NULL, NULL, NULL, NULL); BTW it's easier to add simple properties in dc-props then you don't need all the error propagate code etc. Hrm - good to know. I'll take a look at it. Thanks. diff --git a/include/hw/i386/ich9.h b/include/hw/i386/ich9.h index f5681a3..aafc43f 100644 --- a/include/hw/i386/ich9.h +++ b/include/hw/i386/ich9.h @@ -46,6 +46,11 @@ typedef struct ICH9LPCState { ICH9LPCPMRegs pm; uint32_t sci_level; /* track sci level */ +/* 2.24 Pin Straps */ +struct { +uint8_t spkr; +} pin_strap; + /* 10.1 Chipset Configuration registers(Memory Space) which is pointed by RCBA */ uint8_t chip_config[ICH9_CC_SIZE]; @@ -72,6 +77,12 @@ Object *ich9_lpc_find(void); #define Q35_MASK(bit, ms_bit, ls_bit) \ ((uint##bit##_t)(((1ULL ((ms_bit) + 1)) - 1) ~((1ULL ls_bit) - 1))) +/* ICH9: Pin Straps */ +#define ICH9_PS_SPKR_PIN_LOW0x01 +#define ICH9_PS_SPKR_PIN_HIGH 0x02 +#define ICH9_PS_SPKR_PIN_MASK 0x03 +#define ICH9_PS_SPKR_PIN_DEFAULTICH9_PS_SPKR_PIN_LOW + The interface seems a bit inconvenient to me. Why not make it a simple boolean property? No real reason, actually. Since it has no more than 2 states (high and low), a boolean property would be appropriate. I'll make it boolean. Thanks, Paulo -- Paulo Alcantara, C.E.S.A.R
Re: [Qemu-devel] vmxnet3, vnet_hdr, and minimum length padding
On Jun 23, 2015, at 18:49 PM, Brian Kress kre...@moose.net wrote: When running ESXi under qemu there is an issue with the ESXi guest discarding packets that are too short. The guest discards any packets under the normal minimum length for an ethernet packet (60). This results in odd behaviour where other hosts or VMs on other hosts can communicate with the ESXi guest just fine (since there's a physical NIC somewhere doing padding), but VMs on the host and the host itself cannot because the ARP request packets are too small for the ESXi host to accept. Someone in the past thought this was worth fixing, and added code to the vmxnet3 qemu emulation such that if it is receiving packets smaller than 60 bytes to pad the packet out to 60. Unfortunately this code is wrong (or at least in the wrong place). It does so BEFORE before taking into account the vnet_hdr at the front of the packet added by the tap device. As a result, it might add padding, but it never adds enough. Specifically it adds 10 less (the length of the vnet_hdr) than it needs to. The following (hopefully obviously correct) patch simply swaps the order of processing the vnet header and the padding. With this patch an ESXi guest is able to communicate with the host or other local VMs. --- a/qemu-2.3.0/hw/net/vmxnet3.c 2015-04-27 10:08:24.0 -0400 +++ b/qemu-2.3.0/hw/net/vmxnet3.c 2015-06-23 11:38:48.865728713 -0400 @@ -1879,6 +1879,12 @@ return -1; } +if (s-peer_has_vhdr) { +vmxnet_rx_pkt_set_vhdr(s-rx_pkt, (struct virtio_net_hdr *)buf); +buf += sizeof(struct virtio_net_hdr); +size -= sizeof(struct virtio_net_hdr); +} + /* Pad to minimum Ethernet frame length */ if (size sizeof(min_buf)) { memcpy(min_buf, buf, size); @@ -1887,12 +1893,6 @@ size = sizeof(min_buf); } -if (s-peer_has_vhdr) { -vmxnet_rx_pkt_set_vhdr(s-rx_pkt, (struct virtio_net_hdr *)buf); -buf += sizeof(struct virtio_net_hdr); -size -= sizeof(struct virtio_net_hdr); -} - Reviewed-by: Dmitry Fleytman dmi...@daynix.com mailto:dmi...@daynix.com The code is fine, thanks! Please fix the patch according to Paolo comments. Regards, Dmitry. vmxnet_rx_pkt_set_packet_type(s-rx_pkt, get_eth_packet_type(PKT_GET_ETH_HDR(buf)));
Re: [Qemu-devel] [PATCH 1/2] dataplane: drop copy_in_vring_desc()
On Sun, 28 Jun 2015 15:03:20 +0200 Michael S. Tsirkin m...@redhat.com wrote: On Fri, Jun 26, 2015 at 07:05:07PM +0200, Greg Kurz wrote: On Fri, 26 Jun 2015 18:28:42 +0200 Michael S. Tsirkin m...@redhat.com wrote: On Fri, Jun 26, 2015 at 02:18:38PM +0200, Greg Kurz wrote: On Fri, 26 Jun 2015 12:28:45 +0200 Cornelia Huck cornelia.h...@de.ibm.com wrote: On Fri, 26 Jun 2015 09:32:21 +0200 Greg Kurz gk...@linux.vnet.ibm.com wrote: During early virtio 1.0 devel, there were several proposals about how to deal with the endianness of the vring descriptor fields: - convert the decriptor to host endianness in a single place, and use its fields directly in the code - keep the descriptor untouched and use virtio memory helpers to access its fields with the appropriate endianness It seems like both approaches got merged: commit f5a5628cf0b6 introduces an extra swap that negates the one brought by commit b0e5d90ebc3e. This breaks boot in SLOF (BE client) when host is ppc64le with the following QEMU error: Failed to map descriptor addr 0x18e2517e len 268435456 A solution could be to revert f5a5628cf0b6, but dropping copy_in_vring_desc() is equivalent and result in a smaller patch. I'd prefer the revert, as the resulting code is nicer IMHO. Agreed. It is good to clear the endianness noise out of the real code. :) Can you please send v2 that works the way you want it? Taken from a previous mail by Stefan: Cornelia already sent [PATCH] Revert dataplane: allow virtio-1 devices to revert f5a5628cf0b. I acked it but the patch is going through Michael Tsirkin. I guess you just have to take Cornelia's patch + Stefan's ack, and patch 2/2 in my series + Cornelia's rb, and we are all set. :) FWIW I tested and it works exactly the same. Sounds good. Can you please test the pci branch from my tree, and confirm? Sure ! I shall do it tomorrow. Thanks. -- Greg But your second patch should apply regardless. Thanks for your feedback. This patch allows SLOF to boot the OS. Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com --- hw/virtio/dataplane/vring.c | 14 ++ 1 file changed, 2 insertions(+), 12 deletions(-)
Re: [Qemu-devel] [PATCH 1/2] target-arm: Split DISAS_YIELD from DISAS_WFE
On 27 June 2015 at 03:25, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote: On Mon, Jun 15, 2015 at 11:49 AM, Peter Maydell peter.mayd...@linaro.org wrote: Currently we use DISAS_WFE for both WFE and YIELD instructions. This is functionally correct because at the moment both of them are implemented as yield this CPU back to the top level loop so another CPU has a chance to run. However it's rather confusing that YIELD ends up calling HELPER(wfe), and if we ever want to implement real behaviour for WFE and SEV it's likely to trip us up. Split out the yield codepath to use DISAS_YIELD and a new HELPER(yield) function. Signed-off-by: Peter Maydell peter.mayd...@linaro.org --- target-arm/helper.h| 1 + target-arm/op_helper.c | 12 target-arm/translate-a64.c | 6 ++ target-arm/translate.h | 1 + 4 files changed, 20 insertions(+) diff --git a/target-arm/helper.h b/target-arm/helper.h index fc885de..827b33d 100644 --- a/target-arm/helper.h +++ b/target-arm/helper.h @@ -50,6 +50,7 @@ DEF_HELPER_2(exception_internal, void, env, i32) DEF_HELPER_4(exception_with_syndrome, void, env, i32, i32, i32) DEF_HELPER_1(wfi, void, env) DEF_HELPER_1(wfe, void, env) +DEF_HELPER_1(yield, void, env) DEF_HELPER_1(pre_hvc, void, env) DEF_HELPER_2(pre_smc, void, env, i32) diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c index 7fa32c4..5f06ca0 100644 --- a/target-arm/op_helper.c +++ b/target-arm/op_helper.c @@ -334,6 +334,18 @@ void HELPER(wfe)(CPUARMState *env) cpu_loop_exit(cs); } +void HELPER(yield)(CPUARMState *env) +{ +CPUState *cs = CPU(arm_env_get_cpu(env)); + +/* This is a non-trappable hint instruction, so semantically + * different from WFE even though we currently implement it + * identically. Yield control back to the top level loop. + */ Comment referencing out of scope functionality is a trap for developers, anyone patching WFE and not thinking about yield needs to be aware of comment staleness over here. +cs-exception_index = EXCP_YIELD; +cpu_loop_exit(cs); +} + I think the real problem here is the inaccuracy of WFE and not yield, so that is where such an explanatory comment should go. You can also make it more self documenting by maying wfe call yield: HELPER(wfe)(CPUARMState *env) { /* This is a hint instruction semantically different from YIELD even though we currently * implement it identically. Yield control back to the top level loop ... */ HELPER(yield)(env); } Yeah, I agree. I'll respin. -- PMM
[Qemu-devel] [PATCH v8 3/3] ich9: implement strap SPKR pin logic
If the signal is sampled high, this indicates that the system is strapped to the No Reboot mode (ICH9 will disable the TCO Timer system reboot feature). The status of this strap is readable via the NO_REBOOT bit (CC: offset 0x3410:bit 5). The NO_REBOOT bit is set when SPKR pin on ICH9 is sampled high. This bit may be set or cleared by software if the strap is sampled low but may not override the strap when it indicates No Reboot. This patch implements the logic where hardware has ability to set SPKR pin through a property named noreboot and it's sampled high by default. Signed-off-by: Paulo Alcantara pca...@zytor.com --- v7 - v8: * change property name to noreboot * default noreboot property to high * define property in dc-props * update tco tests to support and exercise noreboot property --- hw/acpi/tco.c | 2 +- hw/isa/lpc_ich9.c | 6 ++ include/hw/i386/ich9.h | 5 + tests/tco-test.c | 18 -- 4 files changed, 28 insertions(+), 3 deletions(-) diff --git a/hw/acpi/tco.c b/hw/acpi/tco.c index 1794a54..7a026c2 100644 --- a/hw/acpi/tco.c +++ b/hw/acpi/tco.c @@ -64,7 +64,7 @@ static void tco_timer_expired(void *opaque) tr-tco.sts2 |= TCO_BOOT_STS; tr-timeouts_no = 0; -if (!(gcs ICH9_CC_GCS_NO_REBOOT)) { +if (!lpc-pin_strap.spkr_hi !(gcs ICH9_CC_GCS_NO_REBOOT)) { watchdog_perform_action(); tco_timer_stop(tr); return; diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c index b547002..3b460d4 100644 --- a/hw/isa/lpc_ich9.c +++ b/hw/isa/lpc_ich9.c @@ -688,6 +688,11 @@ static const VMStateDescription vmstate_ich9_lpc = { } }; +static Property ich9_lpc_properties[] = { +DEFINE_PROP_BOOL(noreboot, ICH9LPCState, pin_strap.spkr_hi, true), +DEFINE_PROP_END_OF_LIST(), +}; + static void ich9_lpc_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); @@ -699,6 +704,7 @@ static void ich9_lpc_class_init(ObjectClass *klass, void *data) dc-reset = ich9_lpc_reset; k-init = ich9_lpc_init; dc-vmsd = vmstate_ich9_lpc; +dc-props = ich9_lpc_properties; k-config_write = ich9_lpc_config_write; dc-desc = ICH9 LPC bridge; k-vendor_id = PCI_VENDOR_ID_INTEL; diff --git a/include/hw/i386/ich9.h b/include/hw/i386/ich9.h index f5681a3..63c5cd8 100644 --- a/include/hw/i386/ich9.h +++ b/include/hw/i386/ich9.h @@ -46,6 +46,11 @@ typedef struct ICH9LPCState { ICH9LPCPMRegs pm; uint32_t sci_level; /* track sci level */ +/* 2.24 Pin Straps */ +struct { +bool spkr_hi; +} pin_strap; + /* 10.1 Chipset Configuration registers(Memory Space) which is pointed by RCBA */ uint8_t chip_config[ICH9_CC_SIZE]; diff --git a/tests/tco-test.c b/tests/tco-test.c index 1a2fe3d..6a48188 100644 --- a/tests/tco-test.c +++ b/tests/tco-test.c @@ -42,6 +42,7 @@ enum { typedef struct { const char *args; +bool noreboot; QPCIDevice *dev; void *lpc_base; void *tco_io_base; @@ -53,7 +54,9 @@ static void test_init(TestData *d) QTestState *qs; char *s; -s = g_strdup_printf(-machine q35 %s, !d-args ? : d-args); +s = g_strdup_printf(-machine q35 %s %s, +d-noreboot ? : -global ICH9-LPC.noreboot=false, +!d-args ? : d-args); qs = qtest_start(s); qtest_irq_intercept_in(qs, ioapic); g_free(s); @@ -135,6 +138,7 @@ static void test_tco_defaults(void) TestData d; d.args = NULL; +d.noreboot = true; test_init(d); g_assert_cmpint(qpci_io_readw(d.dev, d.tco_io_base + TCO_RLD), ==, TCO_RLD_DEFAULT); @@ -167,6 +171,7 @@ static void test_tco_timeout(void) int ret; d.args = NULL; +d.noreboot = true; test_init(d); stop_tco(d); @@ -210,6 +215,7 @@ static void test_tco_max_timeout(void) int ret; d.args = NULL; +d.noreboot = true; test_init(d); stop_tco(d); @@ -253,6 +259,7 @@ static void test_tco_second_timeout_pause(void) QDict *ad; td.args = -watchdog-action pause; +td.noreboot = false; test_init(td); stop_tco(td); @@ -277,6 +284,7 @@ static void test_tco_second_timeout_reset(void) QDict *ad; td.args = -watchdog-action reset; +td.noreboot = false; test_init(td); stop_tco(td); @@ -301,6 +309,7 @@ static void test_tco_second_timeout_shutdown(void) QDict *ad; td.args = -watchdog-action shutdown; +td.noreboot = false; test_init(td); stop_tco(td); @@ -325,6 +334,7 @@ static void test_tco_second_timeout_none(void) QDict *ad; td.args = -watchdog-action none; +td.noreboot = false; test_init(td); stop_tco(td); @@ -349,6 +359,7 @@ static void test_tco_ticks_counter(void) uint16_t rld; d.args = NULL; +d.noreboot = true; test_init(d); stop_tco(d); @@ -375,6 +386,7 @@ static void
[Qemu-devel] [PATCH v8 2/3] tests: add testcase for TCO watchdog emulation
This patch adds a testcase that covers the following: 1) TCO default values 2) first and second TCO timeout 3) watch and validate ticks counter through TCO_RLD register 4) maximum supported TCO timeout (0x3ff) 5) watchdog actions (pause/reset/shutdown/none) upon second TCO timeout 6) set and get of TCO control and status bits Signed-off-by: Paulo Alcantara pca...@zytor.com --- v1 - v2: * some cleanup * add test for TCO_LOCK bit v2 - v3: * add tests for TCO control status bits * fix check of SECOND_TO_STS bit (it's set in TCO2_STS reg) v3 - v4: * add more description to commit log * use RCBA_BASE_ADDR macro defintion from hw/i386/ich9-cc.h instead v4 - v5: * use modified macros (now prefixed with ICH9_) from ich9-cc.h * move license to GPLv2+ v5 - v6: * remove include of hw/i386/ich9-cc.h since it's no longer exist v6 - v7: (no changes) --- tests/Makefile | 2 + tests/tco-test.c | 460 +++ 2 files changed, 462 insertions(+) create mode 100644 tests/tco-test.c diff --git a/tests/Makefile b/tests/Makefile index eff5e11..ef1e981 100644 --- a/tests/Makefile +++ b/tests/Makefile @@ -152,6 +152,7 @@ check-qtest-i386-y += tests/i440fx-test$(EXESUF) check-qtest-i386-y += tests/fw_cfg-test$(EXESUF) check-qtest-i386-y += tests/drive_del-test$(EXESUF) check-qtest-i386-y += tests/wdt_ib700-test$(EXESUF) +check-qtest-i386-y += tests/tco-test$(EXESUF) gcov-files-i386-y += hw/watchdog/watchdog.c hw/watchdog/wdt_ib700.c check-qtest-i386-y += $(check-qtest-pci-y) gcov-files-i386-y += $(gcov-files-pci-y) @@ -370,6 +371,7 @@ tests/eepro100-test$(EXESUF): tests/eepro100-test.o tests/vmxnet3-test$(EXESUF): tests/vmxnet3-test.o tests/ne2000-test$(EXESUF): tests/ne2000-test.o tests/wdt_ib700-test$(EXESUF): tests/wdt_ib700-test.o +tests/tco-test$(EXESUF): tests/tco-test.o $(libqos-pc-obj-y) tests/virtio-balloon-test$(EXESUF): tests/virtio-balloon-test.o tests/virtio-blk-test$(EXESUF): tests/virtio-blk-test.o $(libqos-virtio-obj-y) tests/virtio-net-test$(EXESUF): tests/virtio-net-test.o $(libqos-pc-obj-y) diff --git a/tests/tco-test.c b/tests/tco-test.c new file mode 100644 index 000..1a2fe3d --- /dev/null +++ b/tests/tco-test.c @@ -0,0 +1,460 @@ +/* + * QEMU ICH9 TCO emulation tests + * + * Copyright (c) 2015 Paulo Alcantara pca...@zytor.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ +#include glib.h +#include string.h +#include stdio.h +#include stdlib.h + +#include libqtest.h +#include libqos/pci.h +#include libqos/pci-pc.h +#include hw/pci/pci_regs.h +#include hw/i386/ich9.h +#include hw/acpi/ich9.h +#include hw/acpi/tco.h + +#define RCBA_BASE_ADDR0xfed1c000 +#define PM_IO_BASE_ADDR 0xb000 + +enum { +TCO_RLD_DEFAULT = 0x, +TCO_DAT_IN_DEFAULT = 0x00, +TCO_DAT_OUT_DEFAULT = 0x00, +TCO1_STS_DEFAULT= 0x, +TCO2_STS_DEFAULT= 0x, +TCO1_CNT_DEFAULT= 0x, +TCO2_CNT_DEFAULT= 0x0008, +TCO_MESSAGE1_DEFAULT= 0x00, +TCO_MESSAGE2_DEFAULT= 0x00, +TCO_WDCNT_DEFAULT = 0x00, +TCO_TMR_DEFAULT = 0x0004, +SW_IRQ_GEN_DEFAULT = 0x03, +}; + +#define TCO_SECS_TO_TICKS(secs) (((secs) * 10) / 6) +#define TCO_TICKS_TO_SECS(ticks)(((ticks) * 6) / 10) + +typedef struct { +const char *args; +QPCIDevice *dev; +void *lpc_base; +void *tco_io_base; +} TestData; + +static void test_init(TestData *d) +{ +QPCIBus *bus; +QTestState *qs; +char *s; + +s = g_strdup_printf(-machine q35 %s, !d-args ? : d-args); +qs = qtest_start(s); +qtest_irq_intercept_in(qs, ioapic); +g_free(s); + +bus = qpci_init_pc(); +d-dev = qpci_device_find(bus, QPCI_DEVFN(0x1f, 0x00)); +g_assert(d-dev != NULL); + +/* map PCI-to-LPC bridge interface BAR */ +d-lpc_base = qpci_iomap(d-dev, 0, NULL); + +qpci_device_enable(d-dev); + +g_assert(d-lpc_base != NULL); + +/* set ACPI PM I/O space base address */ +qpci_config_writel(d-dev, (uintptr_t)d-lpc_base + ICH9_LPC_PMBASE, + PM_IO_BASE_ADDR | 0x1); +/* enable ACPI I/O */ +qpci_config_writeb(d-dev, (uintptr_t)d-lpc_base + ICH9_LPC_ACPI_CTRL, + 0x80); +/* set Root Complex BAR */ +qpci_config_writel(d-dev, (uintptr_t)d-lpc_base + ICH9_LPC_RCBA, + RCBA_BASE_ADDR | 0x1); + +d-tco_io_base = (void *)((uintptr_t)PM_IO_BASE_ADDR + 0x60); +} + +static void stop_tco(const TestData *d) +{ +uint32_t val; + +val = qpci_io_readw(d-dev, d-tco_io_base + TCO1_CNT); +val |= TCO_TMR_HLT; +qpci_io_writew(d-dev, d-tco_io_base + TCO1_CNT, val); +} + +static void start_tco(const TestData *d) +{ +uint32_t val; + +val = qpci_io_readw(d-dev, d-tco_io_base + TCO1_CNT); +val = ~TCO_TMR_HLT; +qpci_io_writew(d-dev,
[Qemu-devel] [PATCH v8 1/3] ich9: add TCO interface emulation
This interface provides some registers within a 32-byte range and can be acessed through PCI-to-LPC bridge interface (PMBASE + 0x60). It's commonly used as a watchdog timer to detect system lockups through SMIs that are generated -- if TCO_EN bit is set -- on every timeout. If NO_REBOOT bit is not set in GCS (General Control and Status register), the system will be resetted upon second timeout if TCO_RLD register wasn't previously written to prevent timeout. This patch adds support to TCO watchdog logic and few other features like mapping NMIs to SMIs (NMI2SMI_EN bit), system intruder detection, etc. are not implemented yet. Signed-off-by: Paulo Alcantara pca...@zytor.com --- v1 - v2: * add migration support for TCO I/O device state * wake up only when total time expired instead of every 0.6s * some cleanup suggested by Paolo Bonzini v2 - v3: * set SECOND_TO_STS and BOOT_STS bits in TCO2_STS instead * improve handling of TCO_LOCK bit in TCO1_CNT register v3 - v4: * fix some conflicts in hw/acpi/ich9.c after rebasing against master * remove meaningless use_tco field from TCOIORegs structure * add a object property named enable_tco and only enable TCO support on pc-q35-2.4 and later v4 - v5: * remove unused field (use_tco) in TCOIORegs structure * move license to GPLv2+ v5 - v6: * remove io_tco field from ICH9LPCPMRegs structure since it's no longer used * set ICH9_CC_GCS_NO_REBOOT bit by default in ICH9's LPC initialisation v6 - v7: (no changes) --- hw/acpi/Makefile.objs | 2 +- hw/acpi/ich9.c | 55 ++- hw/acpi/tco.c | 264 + hw/i386/pc_q35.c | 4 +- hw/isa/lpc_ich9.c | 16 ++- include/hw/acpi/ich9.h | 6 +- include/hw/acpi/tco.h | 82 +++ include/hw/boards.h| 3 +- include/hw/i386/ich9.h | 11 ++- include/hw/i386/pc.h | 1 + 10 files changed, 436 insertions(+), 8 deletions(-) create mode 100644 hw/acpi/tco.c create mode 100644 include/hw/acpi/tco.h diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs index 29d46d8..3db1f07 100644 --- a/hw/acpi/Makefile.objs +++ b/hw/acpi/Makefile.objs @@ -1,4 +1,4 @@ -common-obj-$(CONFIG_ACPI_X86) += core.o piix4.o ich9.o pcihp.o +common-obj-$(CONFIG_ACPI_X86) += core.o piix4.o ich9.o pcihp.o tco.o common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu_hotplug.o common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o common-obj-$(CONFIG_ACPI) += acpi_interface.o diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c index 8a64ffb..d3d9953 100644 --- a/hw/acpi/ich9.c +++ b/hw/acpi/ich9.c @@ -30,6 +30,7 @@ #include qemu/timer.h #include sysemu/sysemu.h #include hw/acpi/acpi.h +#include hw/acpi/tco.h #include sysemu/kvm.h #include exec/address-spaces.h @@ -92,8 +93,16 @@ static void ich9_smi_writel(void *opaque, hwaddr addr, uint64_t val, unsigned width) { ICH9LPCPMRegs *pm = opaque; +TCOIORegs *tr = pm-tco_regs; +uint64_t tco_en; + switch (addr) { case 0: +tco_en = pm-smi_en ICH9_PMIO_SMI_EN_TCO_EN; +/* once TCO_LOCK bit is set, TCO_EN bit cannot be overwritten */ +if (tr-tco.cnt1 TCO_LOCK) { +val = (val ~ICH9_PMIO_SMI_EN_TCO_EN) | tco_en; +} pm-smi_en = ~pm-smi_en_wmask; pm-smi_en |= (val pm-smi_en_wmask); break; @@ -159,6 +168,25 @@ static const VMStateDescription vmstate_memhp_state = { } }; +static bool vmstate_test_use_tco(void *opaque) +{ +ICH9LPCPMRegs *s = opaque; +return s-enable_tco; +} + +static const VMStateDescription vmstate_tco_io_state = { +.name = ich9_pm/tco, +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.needed = vmstate_test_use_tco, +.fields = (VMStateField[]) { +VMSTATE_STRUCT(tco_regs, ICH9LPCPMRegs, 1, vmstate_tco_io_sts, + TCOIORegs), +VMSTATE_END_OF_LIST() +} +}; + const VMStateDescription vmstate_ich9_pm = { .name = ich9_pm, .version_id = 1, @@ -179,6 +207,10 @@ const VMStateDescription vmstate_ich9_pm = { .subsections = (const VMStateDescription*[]) { vmstate_memhp_state, NULL +}, +.subsections = (const VMStateDescription*[]) { +vmstate_tco_io_state, +NULL } }; @@ -209,7 +241,7 @@ static void pm_powerdown_req(Notifier *n, void *opaque) acpi_pm1_evt_power_down(pm-acpi_regs); } -void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm, +void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm, bool enable_tco, qemu_irq sci_irq) { memory_region_init(pm-io, OBJECT(lpc_pci), ich9-pm, ICH9_PMIO_SIZE); @@ -231,6 +263,11 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm, acpi-smi, 8); memory_region_add_subregion(pm-io, ICH9_PMIO_SMI_EN, pm-io_smi); +pm-enable_tco = enable_tco; +if (pm-enable_tco) { +
Re: [Qemu-devel] QEMU to generate host binary
Let's say qemu is running in System Emulation Mode, when it runs guest's binary, it can log the translated code for host. Is it possible to merge that translated code and other sections of guest's binary to make a binary which can be run directly on host. Thanks On Fri, Jun 26, 2015 at 11:34 PM, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote: On Fri, Jun 26, 2015 at 12:33 PM, Ayaz Akram aaq...@gmail.com wrote: Hello ! Is anyone aware of an effort to produce an executable binary for host using qemu. I mean is it possible that qemu generate a binary for whatever application it is emulating, which can later be run directly on host? I'm not sure what this binary would mean just yet. Are you extracting just the guest + its runtime state to a binary that picks up where the guest left off? Or are you including the machine emulator (i.e. QEMU itself) in this new binary to avoid having to load the guest it while picking up where left off? Regards, Peter Thanks
Re: [Qemu-devel] [PATCH] block.c: fix real cdrom detection
On Jun 25, 2015, at 2:01 PM, Peter Maydell wrote: On 25 June 2015 at 18:56, Programmingkid programmingk...@gmail.com wrote: Nice to hear from you again Laurent. The only way a solution in hdev_open() would work is if it could prevent find_image_format() from executing. Otherwise find_image_format() would just quit QEMU with an error. The question you should be asking is what is Linux doing for raw CDROM devices that is different, such that it works there but doesn't work on OSX?. It would also be helpful to know which is the case that doesn't work. Does QEMU fail in all cases, or only if the cdrom drive is empty, or only if there's a disk in the drive? QEMU fails if the cdrom is specified -cdrom /dev/cdrom, and there is no cd in the drive. QEMU also fails with a real cdrom in the drive. My initial suspicion is that we need OSX support in raw-posix.c for handling the host CDROM specially -- note that Linux and FreeBSD register a bdrv_host_cdrom with an is_inserted function. The is_inserted function wouldn't make a difference. I did follow the execution of QEMU from find_image_format(). When bdrv_co_io_em() is called, it returns -22. This is where things appear to start to go wrong.
Re: [Qemu-devel] [PATCH] block.c: fix real cdrom detection
Hi, On 29/06/2015 01:43, Programmingkid wrote: On Jun 25, 2015, at 2:01 PM, Peter Maydell wrote: On 25 June 2015 at 18:56, Programmingkid programmingk...@gmail.com wrote: Nice to hear from you again Laurent. The only way a solution in hdev_open() would work is if it could prevent find_image_format() from executing. Otherwise find_image_format() would just quit QEMU with an error. The question you should be asking is what is Linux doing for raw CDROM devices that is different, such that it works there but doesn't work on OSX?. It would also be helpful to know which is the case that doesn't work. Does QEMU fail in all cases, or only if the cdrom drive is empty, or only if there's a disk in the drive? QEMU fails if the cdrom is specified -cdrom /dev/cdrom, and there is no cd in the drive. QEMU also fails with a real cdrom in the drive. My initial suspicion is that we need OSX support in raw-posix.c for handling the host CDROM specially -- note that Linux and FreeBSD register a bdrv_host_cdrom with an is_inserted function. The is_inserted function wouldn't make a difference. In fact, if your patch fixes the problem, the is_inserted with no cdrom should too: with your strcmp(/dev/cdrom, filename) == 0 , you force the selection of bdrv_raw (which is what to do). without your patch, if bdrv_is_inserted() was implemented and no cdrom in the drive !bdrv_is_inserted(bs) should also select bdrv_raw. It appears also that bdrv_host_cdrom is not registered in bdrv_file_init(). I think this is the missing part to have a host cdrom support on MacOS X. Laurent
[Qemu-devel] [PATCH] Fix incorrect small packet padding in vmxnet3
When running ESXi under qemu there is an issue with the ESXi guest discarding packets that are too short. The guest discards any packets under the normal minimum length for an ethernet packet (60). This results in odd behaviour where other hosts or VMs on other hosts can communicate with the ESXi guest just fine (since there's a physical NIC somewhere doing padding), but VMs on the host and the host itself cannot because the ARP request packets are too small for the ESXi host to accept. Someone in the past thought this was worth fixing, and added code to the vmxnet3 qemu emulation such that if it is receiving packets smaller than 60 bytes to pad the packet out to 60. Unfortunately this code is wrong (or at least in the wrong place). It does so BEFORE before taking into account the vnet_hdr at the front of the packet added by the tap device. As a result, it might add padding, but it never adds enough. Specifically it adds 10 less (the length of the vnet_hdr) than it needs to. The following (hopefully obviously correct) patch simply swaps the order of processing the vnet header and the padding. With this patch an ESXi guest is able to communicate with the host or other local VMs. Signed-off-by: Brian Kress kre...@moose.net --- a/hw/net/vmxnet3.c 2015-04-27 10:08:24.0 -0400 +++ b/hw/net/vmxnet3.c 2015-06-23 11:38:48.865728713 -0400 @@ -1879,6 +1879,12 @@ return -1; } +if (s-peer_has_vhdr) { +vmxnet_rx_pkt_set_vhdr(s-rx_pkt, (struct virtio_net_hdr *)buf); +buf += sizeof(struct virtio_net_hdr); +size -= sizeof(struct virtio_net_hdr); +} + /* Pad to minimum Ethernet frame length */ if (size sizeof(min_buf)) { memcpy(min_buf, buf, size); @@ -1887,12 +1893,6 @@ size = sizeof(min_buf); } -if (s-peer_has_vhdr) { -vmxnet_rx_pkt_set_vhdr(s-rx_pkt, (struct virtio_net_hdr *)buf); -buf += sizeof(struct virtio_net_hdr); -size -= sizeof(struct virtio_net_hdr); -} - vmxnet_rx_pkt_set_packet_type(s-rx_pkt, get_eth_packet_type(PKT_GET_ETH_HDR(buf)));
Re: [Qemu-devel] [PATCH] block.c: fix real cdrom detection
On Jun 28, 2015, at 8:29 PM, Laurent Vivier wrote: Hi, On 29/06/2015 01:43, Programmingkid wrote: On Jun 25, 2015, at 2:01 PM, Peter Maydell wrote: On 25 June 2015 at 18:56, Programmingkid programmingk...@gmail.com wrote: Nice to hear from you again Laurent. The only way a solution in hdev_open() would work is if it could prevent find_image_format() from executing. Otherwise find_image_format() would just quit QEMU with an error. The question you should be asking is what is Linux doing for raw CDROM devices that is different, such that it works there but doesn't work on OSX?. It would also be helpful to know which is the case that doesn't work. Does QEMU fail in all cases, or only if the cdrom drive is empty, or only if there's a disk in the drive? QEMU fails if the cdrom is specified -cdrom /dev/cdrom, and there is no cd in the drive. QEMU also fails with a real cdrom in the drive. My initial suspicion is that we need OSX support in raw-posix.c for handling the host CDROM specially -- note that Linux and FreeBSD register a bdrv_host_cdrom with an is_inserted function. The is_inserted function wouldn't make a difference. In fact, if your patch fixes the problem, the is_inserted with no cdrom should too: with your strcmp(/dev/cdrom, filename) == 0 , you force the selection of bdrv_raw (which is what to do). without your patch, if bdrv_is_inserted() was implemented and no cdrom in the drive !bdrv_is_inserted(bs) should also select bdrv_raw. The thing is a cdrom would be in the drive, and !bdrv_is_inserted() would return false. It appears also that bdrv_host_cdrom is not registered in bdrv_file_init(). I think this is the missing part to have a host cdrom support on MacOS X. I don't think we need a bdrv_host_cdrom registered. If one tiny change could make the real cdrom drive work, why completely implement this structure?
Re: [Qemu-devel] [Qemu-stable] [PATCH v7 0/8] block: Mirror discarded sectors
On Fri, 06/26 15:36, Alexandre DERUMIER wrote: Hi, There is no problem, the observasion by Andrey was just that qmp command takes a few minutes before returning, because he didn't apply https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg02511.html Is this patch already apply on the block tree ? With nfs as source storage, it's really slow currently (lseek slow + a lot of nfs ops). The patch is waiting for the next PULL in Jeff Cody's tree I think. Fam - Mail original - De: Fam Zheng f...@redhat.com À: pbonzini pbonz...@redhat.com Cc: Kevin Wolf kw...@redhat.com, qemu-bl...@nongnu.org, Jeff Cody jc...@redhat.com, qemu-devel qemu-devel@nongnu.org, qemu-stable qemu-sta...@nongnu.org, stefanha stefa...@redhat.com, js...@redhat.com, wangxiaol...@ucloud.cn Envoyé: Jeudi 25 Juin 2015 12:45:38 Objet: Re: [Qemu-devel] [Qemu-stable] [PATCH v7 0/8] block: Mirror discarded sectors On Thu, 06/25 09:02, Fam Zheng wrote: On Wed, 06/24 19:01, Paolo Bonzini wrote: On 24/06/2015 11:08, Fam Zheng wrote: Stefan, The only controversial patches are the qmp/drive-mirror ones (1-3), while patches 4-8 are still useful on their own: they fix the mentioned crash and improve iotests. Shall we merge the second half (of course none of them depend on 1-3) now that softfreeze is approaching? Stefan, would you consider applying patches 4-8? Actually why not apply all of them? Even if blockdev-mirror is a superior interface in the long run, the current behavior of drive-mirror can cause images to balloon up to the full size, which is bad. Extending drive-mirror is okay IMHO for 2.4. Before we do that, Andrey Korolyov has reported a hang issue with unmap=true, I'll take a look at it today. There is no problem, the observasion by Andrey was just that qmp command takes a few minutes before returning, because he didn't apply https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg02511.html Fam
Re: [Qemu-devel] reset strategy?
On Sat, Jun 27, 2015 at 12:52 PM, Liviu Ionescu i...@livius.net wrote: On 27 Jun 2015, at 21:03, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote: Try this after object creation (see xlnx-zynqmp init fn): qdev_set_parent_bus(DEVICE(s-gic), sysbus_get_default()); ah, sure, I already tried this, if I set the bus, everything works as before. but probably my question was not clear enough. my latest configuration uses no qdev_* calls at all, none of my objects are connected to sysbus, and apparently everything works without problems. the reset strategy that I used was to register a reset callback for the MCU: CortexMState *cm_state = CORTEXM_MCU_STATE(dev); qemu_register_reset(cortexm_mcu_registered_reset_callback, cm_state); static void cortexm_mcu_registered_reset_callback(void *opaque) { DeviceState *dev = opaque; /* This will actually go to derived class reset. */ device_reset(dev); } and in the object reset callback to explicitly reset all contained objects: static void cortexm_mcu_reset_callback(DeviceState *dev) { qemu_log_function_name(); /* Call parent reset(). */ cm_device_parent_reset(dev, TYPE_CORTEXM_MCU); CortexMState *cm_state = CORTEXM_MCU_STATE(dev); #if defined(CONFIG_VERBOSE) if (verbosity_level = VERBOSITY_COMMON) { printf(%s core reset.\n, cm_state-display_model); } #endif /* * Ensure the image is copied into memory before reset * fetches MSP PC */ rom_reset(NULL); /* * With the new image available, MSP PC are correct * and execution will start. */ cpu_reset(CPU(cm_state-cpu)); This should be managed by the bootloader. device_reset(cm_state-nvic); if (cm_state-itm) { device_reset(cm_state-itm); } Are are these two devices really not connected to any bus? } Note we are trying to phase out object_new in favor of embedding the device structs in the SoC container, which would mean use of object_initialise instead of object_new. any special reasons in favour of this? I'm not sure TBH, its well discussed on list though. in my configuration some of the contained objects are created conditionally, and it seemed more natural to dynamically allocate only the needed ones. an extreme example is the STM32 objects, which should allocate space for all possible peripherals, for all STM32 families and sub-families, and object_initialise only the needed ones. but I got your point, where possible I'll use statically allocated objects. Ultimately that's a bug, the system wide reset should not depend on qdev. I did not check the code, but it seems it does not depend on qdev, if there are functions registered with qemu_register_reset(), they are called, regardless if connected to sysbus or not. Yes, but you shouldn't have to register resets from machine level code. It should just happen via: vl.c:4412:qemu_register_reset(qbus_reset_all_fn, sysbus_get_default()); Regards, Peter my conclusion is that, at least for my use case, we can get rid completely of qdev calls and sysbus, as long as each object explicitly resets all objects it creates, which is somehow expected. regards, Liviu
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
On 06/27/2015 03:03 AM, Dr. David Alan Gilbert wrote: * Wen Congyang (we...@cn.fujitsu.com) wrote: On 06/24/2015 10:07 PM, Dr. David Alan Gilbert wrote: * Wen Congyang (ghost...@gmail.com) wrote: At 2015/6/19 18:49, Stefan Hajnoczi Wrote: On Fri, Jun 19, 2015 at 08:54:56AM +0800, Wen Congyang wrote: On 06/19/2015 12:06 AM, Stefan Hajnoczi wrote: On Thu, Jun 18, 2015 at 10:36:39PM +0800, Wen Congyang wrote: At 2015/6/18 20:55, Stefan Hajnoczi Wrote: On Thu, Jun 18, 2015 at 04:49:12PM +0800, Wen Congyang wrote: +void bdrv_connect(BlockDriverState *bs, Error **errp) +{ +BlockDriver *drv = bs-drv; + +if (drv drv-bdrv_connect) { +drv-bdrv_connect(bs, errp); +} else if (bs-file) { +bdrv_connect(bs-file, errp); +} else { +error_setg(errp, this feature or command is not currently supported); +} +} + +void bdrv_disconnect(BlockDriverState *bs) +{ +BlockDriver *drv = bs-drv; + +if (drv drv-bdrv_disconnect) { +drv-bdrv_disconnect(bs); +} else if (bs-file) { +bdrv_disconnect(bs-file); +} +} Please add doc comments describing the semantics of these commands. Where should it be documented? In the header file? block.h doesn't document prototypes in the header file, please document the function definition in block.c. (QEMU is not consistent here, some places do it the other way around.) Why are these operations needed when there is already a bs-drv == NULL case which means the BDS is not ready for read/write? The purpos is that: don't connect to nbd server when opening a nbd client. connect/disconnect to nbd server when we need to do it. IIUC, if bs-drv is NULL, it means that the driver is ejected? Here, connect/disconnect means that connect/disconnect to remote target(The target may be in another host). Connect/disconnect puts something on the QEMU command-line that isn't ready at startup time. How about using monitor commands to add objects when needed instead? That is cleaner because it doesn't introduce a new state (which is only implemented for nbd). The problem is that, nbd client is one child of quorum, and quorum must have more than one child. The nbd server is not ready until colo is running. A monitor command to hot add/remove quorum children solves this problem and could also be used in other scenarios (e.g. user decides to take a quorum child offline). For replication case, we always do checkpoint again and again after migration. If the disk is not synced before migration, we will use disk mirgation or mirror job to sync it. Can you document the way that you use disk migration or mirror, together with your COLO setup? I think it would make it easier to understand this restriction. At the moment I don't understand how you can switch from doing a disk migration into COLO mode - there seems to be a gap between the end of disk migration and the start of COLO. In my test, I use disk migration. After migration and before starting COLO, we will disable disk migration: +/* Disable block migration */ +s-params.blk = 0; +s-params.shared = 0; +qemu_savevm_state_begin(trans, s-params); Ah, I hadn't realised you could do that; so do you just do: migrate_set_parameter colo on migrate -d -b tcp:otherhhost:port How does the secondary know to feed that data straight into the disk without recording all the old data into the hidden-disk ? hidden disk and active disk will be made empty when starting block replication. If the user uses mirror job, we don't cancel the mirror job now. It would be good to get it to work with mirror, that seems preferred these days to the old block migration. In normal migration, is mirror job created and cancelled by libvirt? With block migration or mirror, then that pretty much allows you to replace a failed secondary doesn't it without restarting? The only thing stopping you is the NBD client needs to point to the address of the new secondary. Stefan's suggestion of being able to modify the quorum on the fly would seem to fix that problem. Yes, I am implementing adding/removing quorum child now. (The other thing I can see is that the secondary wouldn't have the conntrack data for open connections; but that's a separate problem). A related topic was that a colleague was asking about starting the qemu up and only then setting the NBD target (he's trying to wire it into OpenStack); I suggested that instead of specifying the drive on the command line, it might work to use a drive_add to add the whole quorum/drive stack before starting the guest; however again something to modify the quorum would be easier. Finally the last good reason I can see for being able to modify the quorum on the fly is that you'll need it for continuous ft to be able to transmute a secondary into a new primary. Yes, I think adding a filter on the top is also needed. Thanks
Re: [Qemu-devel] [PATCH] refresh filename after the node is replaced
On 06/26/2015 11:16 PM, Max Reitz wrote: On 26.06.2015 16:27, Wen Congyang wrote: At 2015/6/26 21:47, Max Reitz Wrote: On 25.06.2015 08:41, Wen Congyang wrote: We can use block job mirror to repair broken quorum files. But the command 'info block' doesn't output correct filename after block job mirror finishes. Which filename? The quorum filename is broken after this patch, too. In In my test, quorum has two children, s-common.bs-drv is quorum, and s-to_replace is one of his child. Without this patch, info block will output obsolete information. With this patch, the quorum's filename is right. I don't know why quorum filename is broken. Oh, yes, you are right, I forgot to execute block-job-complete. However, this patch relies on the Quorum BDS being the mirror source, which is the intentional use case but certainly not the only one: $ x86_64-softmmu/qemu-system-x86_64 -drive if=none,id=quorum,driver=quorum,children.0.file.filename=test1.qcow2,children.1.file.filename=test2.qcow2,children.1.node-name=ch1,vote-threshold=1 -drive if=none,id=drv,file=test2.qcow2 -qmp stdio {QMP: {version: {qemu: {micro: 50, minor: 3, major: 2}, package: }, capabilities: []}} {'execute':'qmp_capabilities'} {return: {}} {'execute':'drive-mirror','arguments':{'device':'drv','target':'tmp.qcow2','format':'qcow2','node-name':'nch1','replaces':'ch1','sync':'full'}} Formatting 'tmp.qcow2', fmt=qcow2 size=67108864 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16 {return: {}} {timestamp: {seconds: 1435331363, microseconds: 217707}, event: BLOCK_JOB_READY, data: {device: drv, len: 0, offset: 0, speed: 0, type: mirror}} {'execute':'block-job-complete','arguments':{'device':'drv'}} {return: {}} {timestamp: {seconds: 1435331373, microseconds: 152945}, event: BLOCK_JOB_COMPLETED, data: {device: drv, len: 0, offset: 0, speed: 0, type: mirror}} {'execute':'query-block'} {return: [{device: quorum, locked: false, removable: true, inserted: {iops_rd: 0, detect_zeroes: off, image: {virtual-size: 67108864, filename: json:{\children\: [{\driver\: \qcow2\, \file\: {\driver\: \file\, \filename\: \test1.qcow2\}}, {\driver\: \qcow2\, \file\: {\driver\: \file\, \filename\: \test2.qcow2\}}], \driver\: \quorum\, \blkverify\: false, \rewrite-corrupted\: false, \vote-threshold\: 1}, format: quorum}, iops_wr: 0, ro: false, backing_file_depth: 0, drv: quorum, iops: 0, bps_wr: 0, write_threshold: 0, encrypted: false, bps: 0, bps_rd: 0, cache: {no-flush: false, direct: false, writeback: true}, file: json:{\children\: [{\driver\: \qcow2\, \file\: {\driver\: \file\, \filename\: \test1.qcow2\}}, {\driver\: \qcow2\, \file\: {\driver\: \file\, \filename\: \test2.qcow2\}}], \driver\: \quorum\, \blkverify\: false, \rewrite-corrupted\: false, \vote-threshold\: 1}, encryption_key_missing: false}, tray_open: false, type: unknown}, {device: drv, locked: false, removable: true, inserted: {iops_rd: 0, detect_zeroes: off, image: {virtual-size: 67108864, filename: test2.qcow2, cluster-size: 65536, format: qcow2, actual-size: 200704, format-specific: {type: qcow2, data: {compat: 1.1, lazy-refcounts: false, refcount-bits: 16, corrupt: false}}, dirty-flag: false}, iops_wr: 0, ro: false, backing_file_depth: 0, drv: qcow2, iops: 0, bps_wr: 0, write_threshold: 0, encrypted: false, bps: 0, bps_rd: 0, cache: {no-flush: false, direct: false, writeback: true}, file: test2.qcow2, encryption_key_missing: false}, tray_open: false, type: unknown}, {io-status: ok, device: ide1-cd0, locked: false, removable: true, tray_open: false, type: unknown}, {device: floppy0, locked: false, removable: true, tray_open: false, type: unknown}, {device: sd0, locked: false, removable: true, tray_open: false, type: unknown}]} This patch makes mirror_exit() refresh the name of the block job's device (in this case drv), which is not necessarily a parent of the node being replaced. Even if it were, imagine a configuration where there are two nested quorums, with the inner one being repaired using the replaces parameter for drive-mirror. In this case, this patch would fix the inner Quorum's file name, but not the outer one's. If both inner and outer quorum are BBs, the outer one's is not fixed. I see two solutions to this issue: Either, we do it right and that means, if there is a change in the BDS graph (e.g. because of bdrv_swap()), we have to call bdrv_refresh_filename() on all of the changed BDS's parents. In order to be able to enumerate a BDS's parents, we need Kevin's series, as said before. IIUC, the BDS can have more than one parent. Or we revive my series block: Drop BDS.filename which drops the BDS.filename field completely and always rebuilds it if it is queried. This would fix the issue as well, however, there is a reason it has been lying around for nine months now, and that reason is that we did not yet fully examine the
[Qemu-devel] [Bug 1465935] Re: kvm_irqchip_commit_routes: Assertion `ret == 0' failed
The problem can be re-produced by the script in the below in link. http://lists.nongnu.org/archive/html/qemu-devel/2014-12/msg03739.html i.e. vda_irq_num=25 vdb_irq_num=27 while [ 1 ] do for irq in {1,2,4,8,10,20,40,80} do echo $irq /proc/irq/$vda_irq_num/smp_affinity echo $irq /proc/irq/$vdb_irq_num/smp_affinity dd if=/dev/vda of=/dev/zero bs=4K count=100 iflag=direct dd if=/dev/vdb of=/dev/zero bs=4K count=100 iflag=direct done done -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1465935 Title: kvm_irqchip_commit_routes: Assertion `ret == 0' failed Status in QEMU: New Bug description: Several my QEMU instances crashed, and in the qemu log, I can see this assertion failure, qemu-system-x86_64: /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:984: kvm_irqchip_commit_routes: Assertion `ret == 0' failed. The QEMU version is 2.0.0, HV OS is ubuntu 12.04, kernel 3.2.0-38. Guest OS is RHEL 6.3. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1465935/+subscriptions
[Qemu-devel] [Bug 1465935] Re: kvm_irqchip_commit_routes: Assertion `ret == 0' failed
http://lists.nongnu.org/archive/html/qemu-devel/2014-12/msg03739.html Seems that this patch hasn't been accpeted yet, and also no comments for it. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1465935 Title: kvm_irqchip_commit_routes: Assertion `ret == 0' failed Status in QEMU: New Bug description: Several my QEMU instances crashed, and in the qemu log, I can see this assertion failure, qemu-system-x86_64: /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:984: kvm_irqchip_commit_routes: Assertion `ret == 0' failed. The QEMU version is 2.0.0, HV OS is ubuntu 12.04, kernel 3.2.0-38. Guest OS is RHEL 6.3. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1465935/+subscriptions
[Qemu-devel] [Bug 1465935] Re: kvm_irqchip_commit_routes: Assertion `ret == 0' failed
From the debug log, we can see that virq is only 1008, but irq route table has been full, i.e. 1024. In kvm_irqchip_get_virq(), it only calls kvm_flush_dynamic_msi_routes() when all virqs(total gsi_count, 1024 too) have been allocated, but irq route table has two kind of entry type, KVM_IRQ_ROUTING_IRQCHIP and KVM_IRQ_ROUTING_MSI. Seems that 16 KVM_IRQ_ROUTING_IRQCHIP entries has been reserved, if max gsi_count is still 1024, then irq route table is possible to be overflow. The fix could be either set gsi_cout=1008 or increase max irq route count to 1040. kvm_irqchip_send_msi, virq=1008, nr=1024 kvm_irqchip_commit_routes, ret=-22 kvm_irqchip_commit_routes, irq_routes nr=1024 -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1465935 Title: kvm_irqchip_commit_routes: Assertion `ret == 0' failed Status in QEMU: New Bug description: Several my QEMU instances crashed, and in the qemu log, I can see this assertion failure, qemu-system-x86_64: /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:984: kvm_irqchip_commit_routes: Assertion `ret == 0' failed. The QEMU version is 2.0.0, HV OS is ubuntu 12.04, kernel 3.2.0-38. Guest OS is RHEL 6.3. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1465935/+subscriptions
Re: [Qemu-devel] [PATCH] block.c: fix real cdrom detection
On Jun 28, 2015, at 8:29 PM, Laurent Vivier wrote: Hi, On 29/06/2015 01:43, Programmingkid wrote: On Jun 25, 2015, at 2:01 PM, Peter Maydell wrote: On 25 June 2015 at 18:56, Programmingkid programmingk...@gmail.com wrote: Nice to hear from you again Laurent. The only way a solution in hdev_open() would work is if it could prevent find_image_format() from executing. Otherwise find_image_format() would just quit QEMU with an error. The question you should be asking is what is Linux doing for raw CDROM devices that is different, such that it works there but doesn't work on OSX?. It would also be helpful to know which is the case that doesn't work. Does QEMU fail in all cases, or only if the cdrom drive is empty, or only if there's a disk in the drive? QEMU fails if the cdrom is specified -cdrom /dev/cdrom, and there is no cd in the drive. QEMU also fails with a real cdrom in the drive. My initial suspicion is that we need OSX support in raw-posix.c for handling the host CDROM specially -- note that Linux and FreeBSD register a bdrv_host_cdrom with an is_inserted function. The is_inserted function wouldn't make a difference. In fact, if your patch fixes the problem, the is_inserted with no cdrom should too: with your strcmp(/dev/cdrom, filename) == 0 , you force the selection of bdrv_raw (which is what to do). without your patch, if bdrv_is_inserted() was implemented and no cdrom in the drive !bdrv_is_inserted(bs) should also select bdrv_raw. It appears also that bdrv_host_cdrom is not registered in bdrv_file_init(). I think this is the missing part to have a host cdrom support on MacOS X. Laurent This patch is what I came up with using your idea. It uses the fact that the bdrv_is_inserted() function is called first from find_image_format(). The bdrv_is_inserted() function now points to cdrom_is_inserted(). This new function will return 0 the first time it is called, then 1 after that. So far it works. --- block/raw-posix.c | 21 + 1 files changed, 21 insertions(+), 0 deletions(-) diff --git a/block/raw-posix.c b/block/raw-posix.c index a967464..2d35580 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -2324,6 +2324,23 @@ static int hdev_create(const char *filename, QemuOpts *opts, return ret; } +#ifdef __APPLE__ + +static int cdrom_is_inserted(BlockDriverState *bs) +{ +static int count = 0; +int returnValue = 1; + +if(count == 0) { +returnValue = 0; // get around find_image_format() issue +} + +printf(count = %d for %s, returning %d\n, count, bs-filename, returnValue); +count++; +return returnValue; +} +#endif + static BlockDriver bdrv_host_device = { .format_name= host_device, .protocol_name= host_device, @@ -2365,6 +2382,10 @@ static BlockDriver bdrv_host_device = { .bdrv_ioctl = hdev_ioctl, .bdrv_aio_ioctl = hdev_aio_ioctl, #endif + +#ifdef __APPLE__ +.bdrv_is_inserted = cdrom_is_inserted, +#endif }; #ifdef __linux__ -- 1.7.5.4
Re: [Qemu-devel] [PATCH for-2.4] e1000: flush packets when link comes up
On Fri, 06/26 17:06, Jason Wang wrote: On 06/25/2015 05:18 PM, Stefan Hajnoczi wrote: e1000_can_receive() checks the link up status register bit. If the bit is clear, packets will be queued and the peer may disable receive to avoid wasting CPU reading packets that cannot be delivered. The queue must be flushed once the link comes back up again. This patch fixes broken e1000 receive with Mac OS X Snow Leopard guests and tap networking. Flushing the queue invokes the async send callback, which re-enables tap fd read. Reported-by: Jonathan Liu net...@gmail.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- hw/net/e1000.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/hw/net/e1000.c b/hw/net/e1000.c index bab8e2a..5c6bcd0 100644 --- a/hw/net/e1000.c +++ b/hw/net/e1000.c @@ -185,6 +185,9 @@ e1000_link_up(E1000State *s) { s-mac_reg[STATUS] |= E1000_STATUS_LU; s-phy_reg[PHY_STATUS] |= MII_SR_LINK_STATUS; + +/* E1000_STATUS_LU is tested by e1000_can_receive() */ +qemu_flush_queued_packets(qemu_get_queue(s-nic)); } static bool This only solves the issue partially and just for e1000. After checking all can_receive() functions, another card with similar behaviour is vmxnet3. Looking at commit a90a7425cf592a3afeff3eaf32f543b83050ee5c (tap: Drop tap_can_send) again. The commit disable tap read poll when qemu_net_queue_send() returns zero. Which is usually the following cases: 1) queue-delivering is 1 2) qemu_can_send_packet() returns zero, which is: 2a) vm_running is false or 2b) can_receive() return zero 3) qemu_net_queue_deliver() returns zero, which is: 3a) nc-receive_disabled is true or 3b) info-receive_raw() or nc-receive-receive() returns zero This means we should enable tap read poll when one of those conditions is not existed. This patch fixes 2b) only for e1000. for 1, I'm not quite sure it's a real problem or how to fix. for 2a, we may probably need a vm state change handler to flush the queue, otherwise network will stall after a stop and cont. for 2b, we can either fixes the card that check link status in its can_receive() or just simply can qemu_flush_queued_packets() in set_link. 3a and 3b looks ok, since this happen when there's no space for rx, after guest refill, qemu will call qemu_flush_queued_packets() to flush and re-enable tap read poll. 2a and 2b does not exits before this commit, since tap_send check qemu_can_send_packet() before. Looks like netmap has the same issue. Ouch! Good catch! I will take a look at the devices today and see if we can fix the problem by adding qemu_flush_queued_packets() calls in the state transition points (vm_running, can_receive). The worst case is reverting the whole series (0bc12c4f7..f4d248bdc3) for 2.4, but dropping can_read is a worthwhile step for optimizing our event loop. Thanks, Fam
[Qemu-devel] [PATCH 3/4] Split ISA and sysbus versions of m48t59 device
The m48t59 device supports both ISA and direct sysbus attached versions of the device in the one .c file. This can be awkward for some embedded machine types which need the sysbus M48T59, but don't want to pull in the ISA bus code and its other dependencies. Therefore, this patch splits out the code for the ISA attached M48T59 into its own C file. It will be built when both CONFIG_M48T59 and CONFIG_ISA_BUS are enabled. Signed-off-by: David Gibson da...@gibson.dropbear.id.au --- hw/timer/Makefile.objs | 3 + hw/timer/m48t59-internal.h | 82 hw/timer/m48t59-isa.c | 180 +++ hw/timer/m48t59.c | 228 - 4 files changed, 283 insertions(+), 210 deletions(-) create mode 100644 hw/timer/m48t59-internal.h create mode 100644 hw/timer/m48t59-isa.c diff --git a/hw/timer/Makefile.objs b/hw/timer/Makefile.objs index 133bd0d..803fde7 100644 --- a/hw/timer/Makefile.objs +++ b/hw/timer/Makefile.objs @@ -6,6 +6,9 @@ common-obj-$(CONFIG_DS1338) += ds1338.o common-obj-$(CONFIG_HPET) += hpet.o common-obj-$(CONFIG_I8254) += i8254_common.o i8254.o common-obj-$(CONFIG_M48T59) += m48t59.o +ifeq ($(CONFIG_ISA_BUS),y) +common-obj-$(CONFIG_M48T59) += m48t59-isa.o +endif common-obj-$(CONFIG_PL031) += pl031.o common-obj-$(CONFIG_PUV3) += puv3_ost.o common-obj-$(CONFIG_TWL92230) += twl92230.o diff --git a/hw/timer/m48t59-internal.h b/hw/timer/m48t59-internal.h new file mode 100644 index 000..c50f134 --- /dev/null +++ b/hw/timer/m48t59-internal.h @@ -0,0 +1,82 @@ +/* + * QEMU M48T59 and M48T08 NVRAM emulation (common header) + * + * Copyright (c) 2003-2005, 2007 Jocelyn Mayer + * Copyright (c) 2013 Hervé Poussineau + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the Software), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ +#ifndef HW_M48T59_H +#define HW_M48T59_H 1 + +//#define DEBUG_NVRAM + +#if defined(DEBUG_NVRAM) +#define NVRAM_PRINTF(fmt, ...) do { printf(fmt , ## __VA_ARGS__); } while (0) +#else +#define NVRAM_PRINTF(fmt, ...) do { } while (0) +#endif + +/* + * The M48T02, M48T08 and M48T59 chips are very similar. The newer '59 has + * alarm and a watchdog timer and related control registers. In the + * PPC platform there is also a nvram lock function. + */ + +typedef struct M48txxInfo { +const char *bus_name; +uint32_t model; /* 2 = m48t02, 8 = m48t08, 59 = m48t59 */ +uint32_t size; +} M48txxInfo; + +typedef struct M48t59State { +/* Hardware parameters */ +qemu_irq IRQ; +MemoryRegion iomem; +uint32_t size; +int32_t base_year; +/* RTC management */ +time_t time_offset; +time_t stop_time; +/* Alarm watchdog */ +struct tm alarm; +QEMUTimer *alrm_timer; +QEMUTimer *wd_timer; +/* NVRAM storage */ +uint8_t *buffer; +/* Model parameters */ +uint32_t model; /* 2 = m48t02, 8 = m48t08, 59 = m48t59 */ +/* NVRAM storage */ +uint16_t addr; +uint8_t lock; +} M48t59State; + +uint32_t m48t59_read(M48t59State *NVRAM, uint32_t addr); +void m48t59_write(M48t59State *NVRAM, uint32_t addr, uint32_t val); +void m48t59_reset_common(M48t59State *NVRAM); +void m48t59_realize_common(M48t59State *s, Error **errp); + +static inline void m48t59_toggle_lock(M48t59State *NVRAM, int lock) +{ +NVRAM-lock ^= 1 lock; +} + +extern const MemoryRegionOps m48t59_io_ops; + +#endif /* HW_M48T59_H */ diff --git a/hw/timer/m48t59-isa.c b/hw/timer/m48t59-isa.c new file mode 100644 index 000..3a521dc --- /dev/null +++ b/hw/timer/m48t59-isa.c @@ -0,0 +1,180 @@ +/* + * QEMU M48T59 and M48T08 NVRAM emulation (ISA bus interface + * + * Copyright (c) 2003-2005, 2007 Jocelyn Mayer + * Copyright (c) 2013 Hervé Poussineau + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the Software), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy,
[Qemu-devel] [PATCH 2/4] Allow ISA bus to be configured out
Currently, the code to handle the legacy ISA bus is always included in qemu. However there are lots of platforms that don't include ISA legacy devies, and quite a few that have never used ISA legacy devices at all. This patch allows the ISA bus code to be disabled in the configuration for platforms where it doesn't make sense. For now, the default configs are adjusted to include ISA on all platforms including PCI (since CONFIG_IDE_CORE which is in pci.mak requires ISA support) and also several others which include ISA devices. We may want to pare this down in future. This patch becomes more useful since b19c1c0 isa: remove isa_mem_base variable. since that removes a dependency on isa-bus.c from vga.c. Signed-off-by: David Gibson da...@gibson.dropbear.id.au --- default-configs/moxie-softmmu.mak | 1 + default-configs/pci.mak | 1 + default-configs/sparc-softmmu.mak | 1 + default-configs/unicore32-softmmu.mak | 1 + hw/isa/Makefile.objs | 2 +- 5 files changed, 5 insertions(+), 1 deletion(-) diff --git a/default-configs/moxie-softmmu.mak b/default-configs/moxie-softmmu.mak index 7e22863..e00d099 100644 --- a/default-configs/moxie-softmmu.mak +++ b/default-configs/moxie-softmmu.mak @@ -1,5 +1,6 @@ # Default configuration for moxie-softmmu +CONFIG_ISA_BUS=y CONFIG_MC146818RTC=y CONFIG_SERIAL=y CONFIG_SERIAL_ISA=y diff --git a/default-configs/pci.mak b/default-configs/pci.mak index 7e10903..9f2b98c 100644 --- a/default-configs/pci.mak +++ b/default-configs/pci.mak @@ -1,4 +1,5 @@ CONFIG_PCI=y +CONFIG_ISA_BUS=y CONFIG_VIRTIO_PCI=y CONFIG_VIRTIO=y CONFIG_USB_UHCI=y diff --git a/default-configs/sparc-softmmu.mak b/default-configs/sparc-softmmu.mak index ab796b3..004b0f4 100644 --- a/default-configs/sparc-softmmu.mak +++ b/default-configs/sparc-softmmu.mak @@ -1,5 +1,6 @@ # Default configuration for sparc-softmmu +CONFIG_ISA_BUS=y CONFIG_ECC=y CONFIG_ESP=y CONFIG_ESCC=y diff --git a/default-configs/unicore32-softmmu.mak b/default-configs/unicore32-softmmu.mak index de38577..5f6c4a8 100644 --- a/default-configs/unicore32-softmmu.mak +++ b/default-configs/unicore32-softmmu.mak @@ -1,4 +1,5 @@ # Default configuration for unicore32-softmmu +CONFIG_ISA_BUS=y CONFIG_PUV3=y CONFIG_PTIMER=y CONFIG_PCKBD=y diff --git a/hw/isa/Makefile.objs b/hw/isa/Makefile.objs index 9164556..fb37c55 100644 --- a/hw/isa/Makefile.objs +++ b/hw/isa/Makefile.objs @@ -1,4 +1,4 @@ -common-obj-y += isa-bus.o +common-obj-$(CONFIG_ISA_BUS) += isa-bus.o common-obj-$(CONFIG_APM) += apm.o common-obj-$(CONFIG_I82378) += i82378.o common-obj-$(CONFIG_PC87312) += pc87312.o -- 2.4.3
[Qemu-devel] [PATCH 4/4] Disable info irq and info pic for target-ppc
The info irq and info pic HMP commands are available on some, but not all targets, and what they do isn't terribly consistent. For SPARC and LM32 they do something platform specific, but for x86, powerpc, and MIPS they print some information from the i8259 (and only the i8259) interrupt controller. It's debatable whether these commands are any use at all, and we should probably make better, qdev aware ways of getting information from a machines PICs. However, those don't exist yet, so on x86 it's at least potentially useful to have these HMP commands. I can't speak for MIPS. For ppc, though, the i8259, if it exists at all, is usually just a secondary controller for legacy ISA. The only case where i8259 is the main system PIC on ppc is for the ancient and little-used PReP platform. So, even without QOM-ish replacement, the info pic and info irq HMP commands have no value on ppc. This patch, therefore, disables these commands for ppc targets. This will allow ppc builds which don't include PReP to not include ISA bus support either. Signed-off-by: David Gibson da...@gibson.dropbear.id.au --- monitor.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/monitor.c b/monitor.c index aeea2b5..8c56bfa 100644 --- a/monitor.c +++ b/monitor.c @@ -2573,7 +2573,7 @@ static mon_cmd_t info_cmds[] = { .help = show the command line history, .mhandler.cmd = hmp_info_history, }, -#if defined(TARGET_I386) || defined(TARGET_PPC) || defined(TARGET_MIPS) || \ +#if defined(TARGET_I386) || defined(TARGET_MIPS) || \ defined(TARGET_LM32) || (defined(TARGET_SPARC) !defined(TARGET_SPARC64)) { .name = irq, -- 2.4.3
[Qemu-devel] [PATCH 0/4] Allow ISA bus support to be configured out
At the moment isa-bus.c is compiled unconditionally for all targets. However, some targets have never used legacy ISA devices. Many more targets have at least some machine types without ISA. These patches allow ISA bus to be disabled in the configuration, thus allowing cut down configurations for targets and machine types that don't have ISA. In addition to creating the CONFIG_ISA_BUS config option itself, this involves breaking some non-obvious dependencies on the ISA code. David Gibson (4): Split serial-isa into its own config option Allow ISA bus to be configured out Split ISA and sysbus versions of m48t59 device Disable info irq and info pic for target-ppc default-configs/alpha-softmmu.mak | 1 + default-configs/arm-softmmu.mak | 1 + default-configs/i386-softmmu.mak | 1 + default-configs/mips-softmmu.mak | 1 + default-configs/mips64-softmmu.mak| 1 + default-configs/mips64el-softmmu.mak | 1 + default-configs/mipsel-softmmu.mak| 1 + default-configs/moxie-softmmu.mak | 2 + default-configs/pci.mak | 1 + default-configs/ppc-softmmu.mak | 1 + default-configs/ppc64-softmmu.mak | 1 + default-configs/ppcemb-softmmu.mak| 1 + default-configs/sh4-softmmu.mak | 1 + default-configs/sh4eb-softmmu.mak | 1 + default-configs/sparc-softmmu.mak | 1 + default-configs/sparc64-softmmu.mak | 1 + default-configs/unicore32-softmmu.mak | 1 + default-configs/x86_64-softmmu.mak| 1 + hw/char/Makefile.objs | 3 +- hw/isa/Makefile.objs | 2 +- hw/timer/Makefile.objs| 3 + hw/timer/m48t59-internal.h| 82 hw/timer/m48t59-isa.c | 180 +++ hw/timer/m48t59.c | 228 +++--- monitor.c | 2 +- 25 files changed, 306 insertions(+), 213 deletions(-) create mode 100644 hw/timer/m48t59-internal.h create mode 100644 hw/timer/m48t59-isa.c -- 2.4.3
[Qemu-devel] [PATCH 1/4] Split serial-isa into its own config option
At present, the core device model code for 8250-like serial ports (serial.c) and the code for serial ports attached to ISA-style legacy IO (serial-isa.c) are both controlled by the CONFIG_SERIAL variable. There are lots and lots of embedded platforms that have 8250-like serial ports but have never had anything resembling ISA legacy IO. Therefore, split serial-isa into its own CONFIG_SERIAL_ISA option so it can be disabled for platforms where it's not appropriate. For now, I enabled CONFIG_SERIAL_ISA in every default-config where CONFIG_SERIAL is enabled, excepting microblaze, xtensa and or32. Those platforms have technically lost functionality, but since they have no other PCI or ISA devices, it's fairly clear they never actually used leagacy IO stuff. Signed-off-by: David Gibson da...@gibson.dropbear.id.au --- default-configs/alpha-softmmu.mak| 1 + default-configs/arm-softmmu.mak | 1 + default-configs/i386-softmmu.mak | 1 + default-configs/mips-softmmu.mak | 1 + default-configs/mips64-softmmu.mak | 1 + default-configs/mips64el-softmmu.mak | 1 + default-configs/mipsel-softmmu.mak | 1 + default-configs/moxie-softmmu.mak| 1 + default-configs/ppc-softmmu.mak | 1 + default-configs/ppc64-softmmu.mak| 1 + default-configs/ppcemb-softmmu.mak | 1 + default-configs/sh4-softmmu.mak | 1 + default-configs/sh4eb-softmmu.mak| 1 + default-configs/sparc64-softmmu.mak | 1 + default-configs/x86_64-softmmu.mak | 1 + hw/char/Makefile.objs| 3 ++- 16 files changed, 17 insertions(+), 1 deletion(-) diff --git a/default-configs/alpha-softmmu.mak b/default-configs/alpha-softmmu.mak index 7f6161e..e0d75e3 100644 --- a/default-configs/alpha-softmmu.mak +++ b/default-configs/alpha-softmmu.mak @@ -3,6 +3,7 @@ include pci.mak include usb.mak CONFIG_SERIAL=y +CONFIG_SERIAL_ISA=y CONFIG_I8254=y CONFIG_PCKBD=y CONFIG_VGA_CIRRUS=y diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak index 74f1db3..abeec21 100644 --- a/default-configs/arm-softmmu.mak +++ b/default-configs/arm-softmmu.mak @@ -7,6 +7,7 @@ CONFIG_ISA_MMIO=y CONFIG_NAND=y CONFIG_ECC=y CONFIG_SERIAL=y +CONFIG_SERIAL_ISA=y CONFIG_PTIMER=y CONFIG_SD=y CONFIG_MAX7310=y diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak index 91d602c..bc2fc72 100644 --- a/default-configs/i386-softmmu.mak +++ b/default-configs/i386-softmmu.mak @@ -9,6 +9,7 @@ CONFIG_VGA_CIRRUS=y CONFIG_VMWARE_VGA=y CONFIG_VMMOUSE=y CONFIG_SERIAL=y +CONFIG_SERIAL_ISA=y CONFIG_PARALLEL=y CONFIG_I8254=y CONFIG_PCSPK=y diff --git a/default-configs/mips-softmmu.mak b/default-configs/mips-softmmu.mak index 44467c3..cccf41b 100644 --- a/default-configs/mips-softmmu.mak +++ b/default-configs/mips-softmmu.mak @@ -9,6 +9,7 @@ CONFIG_VGA_ISA_MM=y CONFIG_VGA_CIRRUS=y CONFIG_VMWARE_VGA=y CONFIG_SERIAL=y +CONFIG_SERIAL_ISA=y CONFIG_PARALLEL=y CONFIG_I8254=y CONFIG_PCSPK=y diff --git a/default-configs/mips64-softmmu.mak b/default-configs/mips64-softmmu.mak index 66ed5f9..6fab828 100644 --- a/default-configs/mips64-softmmu.mak +++ b/default-configs/mips64-softmmu.mak @@ -9,6 +9,7 @@ CONFIG_VGA_ISA_MM=y CONFIG_VGA_CIRRUS=y CONFIG_VMWARE_VGA=y CONFIG_SERIAL=y +CONFIG_SERIAL_ISA=y CONFIG_PARALLEL=y CONFIG_I8254=y CONFIG_PCSPK=y diff --git a/default-configs/mips64el-softmmu.mak b/default-configs/mips64el-softmmu.mak index bfca2b2..2facfc6 100644 --- a/default-configs/mips64el-softmmu.mak +++ b/default-configs/mips64el-softmmu.mak @@ -9,6 +9,7 @@ CONFIG_VGA_ISA_MM=y CONFIG_VGA_CIRRUS=y CONFIG_VMWARE_VGA=y CONFIG_SERIAL=y +CONFIG_SERIAL_ISA=y CONFIG_PARALLEL=y CONFIG_I8254=y CONFIG_PCSPK=y diff --git a/default-configs/mipsel-softmmu.mak b/default-configs/mipsel-softmmu.mak index 0162ef0..8331100 100644 --- a/default-configs/mipsel-softmmu.mak +++ b/default-configs/mipsel-softmmu.mak @@ -9,6 +9,7 @@ CONFIG_VGA_ISA_MM=y CONFIG_VGA_CIRRUS=y CONFIG_VMWARE_VGA=y CONFIG_SERIAL=y +CONFIG_SERIAL_ISA=y CONFIG_PARALLEL=y CONFIG_I8254=y CONFIG_PCSPK=y diff --git a/default-configs/moxie-softmmu.mak b/default-configs/moxie-softmmu.mak index 1a95476..7e22863 100644 --- a/default-configs/moxie-softmmu.mak +++ b/default-configs/moxie-softmmu.mak @@ -2,4 +2,5 @@ CONFIG_MC146818RTC=y CONFIG_SERIAL=y +CONFIG_SERIAL_ISA=y CONFIG_VGA=y diff --git a/default-configs/ppc-softmmu.mak b/default-configs/ppc-softmmu.mak index 4befde3..b5bb29b 100644 --- a/default-configs/ppc-softmmu.mak +++ b/default-configs/ppc-softmmu.mak @@ -46,5 +46,6 @@ CONFIG_PLATFORM_BUS=y CONFIG_ETSEC=y CONFIG_LIBDECNUMBER=y # For PReP +CONFIG_SERIAL_ISA=y CONFIG_MC146818RTC=y CONFIG_ISA_TESTDEV=y diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak index ab62cc7..2b05329 100644 --- a/default-configs/ppc64-softmmu.mak +++ b/default-configs/ppc64-softmmu.mak @@ -50,5 +50,6 @@ CONFIG_LIBDECNUMBER=y CONFIG_XICS=$(CONFIG_PSERIES) CONFIG_XICS_KVM=$(and