[Qemu-devel] Re: [PATCH] kvm: x86: Save/restore error_code
Juan Quintela writes: Jason Wang jasow...@redhat.com wrote: Juan Quintela writes: Jason Wang jasow...@redhat.com wrote: The saving and restoring of error_code seems lost and convert the error_code to uint32_t. Signed-off-by: Jason Wang jasow...@redhat.com --- target-i386/cpu.h |4 ++-- target-i386/machine.c |2 ++ 2 files changed, 4 insertions(+), 2 deletions(-) It should be a new subsection. The test is if has_error_code != 0 according to gleb. Later, Juan. Thanks for reminding, and maybe we can just use VMSTATE_UINT32_TEST() which is simpler than subsection to do the check, isn't it? we need the subsection, that way we don't need to bump the section version. Later, Juan. Have tried the subsection, but there's an issue I find: When we use subsections with the structure who have an nested VMStateDescription at the end of fields, the subsections could not be loaded correctly becuase qemu always try to match the subsection with the nested one. So when we use subsections for vmstate_cpu, the subsection always fail as it try to load the cpu's subsection for vmstate_ymmh_reg which is an nested VMStateDescription on the end. Maybe we need to modify vmstate_subsection_load() to handle this condition. Any thought about this? diff --git a/target-i386/cpu.h b/target-i386/cpu.h index 06e40f3..c990db9 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -688,7 +688,7 @@ typedef struct CPUX86State { uint64_t pat; /* exception/interrupt handling */ -int error_code; +uint32_t error_code; int exception_is_int; target_ulong exception_next_eip; target_ulong dr[8]; /* debug registers */ @@ -933,7 +933,7 @@ uint64_t cpu_get_tsc(CPUX86State *env); #define cpu_list_id x86_cpu_list #define cpudef_setup x86_cpudef_setup -#define CPU_SAVE_VERSION 12 +#define CPU_SAVE_VERSION 13 /* MMU modes definitions */ #define MMU_MODE0_SUFFIX _kernel diff --git a/target-i386/machine.c b/target-i386/machine.c index d78eceb..0e467da 100644 --- a/target-i386/machine.c +++ b/target-i386/machine.c @@ -491,6 +491,8 @@ static const VMStateDescription vmstate_cpu = { VMSTATE_UINT64_V(xcr0, CPUState, 12), VMSTATE_UINT64_V(xstate_bv, CPUState, 12), VMSTATE_YMMH_REGS_VARS(ymmh_regs, CPUState, CPU_NB_REGS, 12), + +VMSTATE_UINT32_V(error_code, CPUState, 13), VMSTATE_END_OF_LIST() /* The above list is not sorted /wrt version numbers, watch out! */ }, -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Qemu-devel] Re: [PATCH V2] qemu, kvm: Enable user space NMI injection for kvm guest
Am 10.12.2010 08:42, Lai Jiangshan wrote: Make use of the new KVM_NMI IOCTL to send NMIs into the KVM guest if the user space raised them. (example: qemu monitor's nmi command) Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com --- diff --git a/configure b/configure index 2917874..f6f9362 100755 --- a/configure +++ b/configure @@ -1646,6 +1646,9 @@ if test $kvm != no ; then #if !defined(KVM_CAP_DESTROY_MEMORY_REGION_WORKS) #error Missing KVM capability KVM_CAP_DESTROY_MEMORY_REGION_WORKS #endif +#if !defined(KVM_CAP_USER_NMI) +#error Missing KVM capability KVM_CAP_USER_NMI +#endif int main(void) { return 0; } EOF if test $kerneldir != ; then That's what I meant. We also have a runtime check for KVM_CAP_DESTROY_MEMORY_REGION_WORKS on kvm init, but IMHO adding the same for KVM_CAP_USER_NMI would be overkill. So... diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 7dfc357..755f8c9 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -1417,6 +1417,13 @@ int kvm_arch_get_registers(CPUState *env) int kvm_arch_pre_run(CPUState *env, struct kvm_run *run) { +/* Inject NMI */ +if (env-interrupt_request CPU_INTERRUPT_NMI) { +env-interrupt_request = ~CPU_INTERRUPT_NMI; +DPRINTF(injected NMI\n); +kvm_vcpu_ioctl(env, KVM_NMI); +} + /* Try to inject an interrupt if the guest can accept it */ if (run-ready_for_interrupt_injection (env-interrupt_request CPU_INTERRUPT_HARD) Acked-by: Jan Kiszka jan.kis...@siemens.com Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
[Qemu-devel] Re: [PATCH] fix qruncom compilation problems
On 12/09/2010 06:29 PM, Stefano Bonifazi wrote: how can one think that addresses around zero are free for a mapping?? Addresses around zero are always free, because if they weren't you couldn't detect NULL pointer dereferences reliably. mmap-ing at zero thus is a tricky operation, because it removes the possibility to detect NULL pointer dereferences. What's worse, such ability would be lost even for _kernel_ dereferences of NULL, thus opening a large security hole for privilege-escalation or kernel exploits. So, mmap-ing addresses close to zero is restricted to root. Paolo
[Qemu-devel] Re: [PATCH 09/13] ahci: add ahci emulation
Am 09.12.2010 17:18, schrieb Alexander Graf: Kevin Wolf wrote: Am 09.12.2010 16:48, schrieb Alexander Graf: +static void ncq_cb(void *opaque, int ret) +{ +NCQTransferState *ncq_tfs = (NCQTransferState *)opaque; +IDEState *ide_state; + +if (ret 0) { +/* XXX error */ +} Missing error handling. Yes, that's what the XXX stands for :). I think Stefan wanted to tell us that he thinks this XXX should be addressed. I don't disagree, by the way. ;-) +static void process_ncq_command(AHCIState *s, int port, uint8_t *cmd_fis, +int slot, QEMUSGList *sg) +{ +NCQFrame *ncq_fis = (NCQFrame*)cmd_fis; +uint8_t tag = ncq_fis-tag 3; +NCQTransferState *ncq_tfs = s-dev[port].ncq_tfs[tag]; + +if (ncq_tfs-used) { +/* error - already in use */ +fprintf(stderr, %s: tag %d already used\n, __FUNCTION__, tag); +return; +} + +ncq_tfs-used = 1; +ncq_tfs-drive = s-dev[port]; +ncq_tfs-drive-cmd_fis = cmd_fis; +ncq_tfs-drive-cmd_fis_len = 0x20; +ncq_tfs-slot = slot; +ncq_tfs-lba = ((uint64_t)ncq_fis-lba5 40) | + ((uint64_t)ncq_fis-lba4 32) | + ((uint64_t)ncq_fis-lba3 24) | + ((uint64_t)ncq_fis-lba2 16) | + ((uint64_t)ncq_fis-lba1 8) | + (uint64_t)ncq_fis-lba0; + +/* Note: We calculate the sector count, but don't currently rely on it. + * The total size of the DMA buffer tells us the transfer size instead. */ +ncq_tfs-sector_count = ((uint16_t)ncq_fis-sector_count_high 8) | +ncq_fis-sector_count_low; + +DPRINTF(port, NCQ transfer LBA from %ld to %ld, drive max %ld\n, +ncq_tfs-lba, ncq_tfs-lba + ncq_tfs-sector_count - 2, +s-dev[port].port.ifs[0].nb_sectors - 1); + +ncq_tfs-sglist = *sg; +ncq_tfs-tag = tag; + +switch(ncq_fis-command) { +case READ_FPDMA_QUEUED: +DPRINTF(port, NCQ reading %d sectors from LBA %ld, tag %d\n, +ncq_tfs-sector_count-1, ncq_tfs-lba, ncq_tfs-tag); +ncq_tfs-is_read = 1; + +/* XXX: The specification is unclear about whether the DMA Setup + * FIS here should have the I bit set, but it suggest that it should + * not. Linux works without this interrupt, so I disabled it. + * If someone knows if it is needed, please tell me, or fix this. */ + +/* ahci_trigger_irq(s,s-dev[port],PORT_IRQ_STAT_DSS); */ +DPRINTF(port, tag %d aio read %ld\n, ncq_tfs-tag, ncq_tfs-lba); +dma_bdrv_read(ncq_tfs-drive-port.ifs[0].bs, ncq_tfs-sglist, + ncq_tfs-lba, ncq_cb, ncq_tfs); +break; +case WRITE_FPDMA_QUEUED: +DPRINTF(port, NCQ writing %d sectors to LBA %ld, tag %d\n, +ncq_tfs-sector_count-1, ncq_tfs-lba, ncq_tfs-tag); +ncq_tfs-is_read = 0; +/* ahci_trigger_irq(s,s-dev[port],PORT_IRQ_STAT_DSS); */ +DPRINTF(port, tag %d aio write %ld\n, ncq_tfs-tag, ncq_tfs-lba); +dma_bdrv_write(ncq_tfs-drive-port.ifs[0].bs, ncq_tfs-sglist, + ncq_tfs-lba, ncq_cb, ncq_tfs); +break; +default: +hw_error(ahci: tried to process non-NCQ command as NCQ\n); Guest triggerable abort. Those happen. The guest can shoot itself in the foot. We have more of these in other places. Just check virtio.c and search for abort() :). They are bugs which should be fixed in virtio rather than being spread to new code. Not sure about that. Would you prefer a broken guest to abort so you can debug it or to have it spew your log files with error messages or to silently ignore errors and never find bugs? If you need it for debugging, maybe have a DPRINTF_ERROR macro that aborts with an error message if DEBUG is defined, and doesn't do anything in normal builds? I'm still not sure if aborting is a good idea there, but I think it would be acceptable. For normal builds, the preferred behaviour is doing the same as real hardware in such cases. And real hardware doesn't kill the machine when the driver is doing stupid things, but rather sets an error bit in some register or something. Kevin
[Qemu-devel] [Bug 595117] Re: qemu-nbd slow and missing writeback cache option
For the record, there's more on that bug at http://thread.gmane.org/gmane.linux.ubuntu.bugs.server/36923 -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/595117 Title: qemu-nbd slow and missing writeback cache option Status in QEMU: Invalid Status in “qemu-kvm” package in Ubuntu: Expired Bug description: Binary package hint: qemu-kvm dpkg -l | grep qemu ii kvm 1:84+dfsg-0ubuntu16+0.12.3+noroms+0ubuntu9dummy transitional pacakge from kvm to qemu- ii qemu 0.12.3+noroms-0ubuntu9 dummy transitional pacakge from qemu to qemu ii qemu-common 0.12.3+noroms-0ubuntu9 qemu common functionality (bios, documentati ii qemu-kvm 0.12.3+noroms-0ubuntu9 Full virtualization on i386 and amd64 hardwa ii qemu-kvm-extras 0.12.3+noroms-0ubuntu9 fast processor emulator binaries for non-x86 ii qemu-launcher1.7.4-1ubuntu2 GTK+ front-end to QEMU computer emulator ii qemuctl 0.2-2 controlling GUI for qemu lucid amd64. qemu-nbd is a lot slower when writing to disk than say nbd-server. It appears it is because by default the disk image it serves is open with O_SYNC. The --nocache option, unintuitively, makes matters a bit better because it causes the image to be open with O_DIRECT instead of O_SYNC. The qemu code allows an image to be open without any of those flags, but unfortunately qemu-nbd doesn't have the option to do that (qemu doesn't allow the image to be open with both O_SYNC and O_DIRECT though). The default of qemu-img (of using O_SYNC) is not very sensible because anyway, the client (the kernel) uses caches (write-back), (and qemu-nbd -d doesn't flush those by the way). So if for instance qemu-nbd is killed, regardless of whether qemu-nbd uses O_SYNC, O_DIRECT or not, the data in the image will not be consistent anyway, unless syncs are done by the client (like fsync on the nbd device or sync mount option), and with qemu-nbd's O_SYNC mode, those syncs will be extremely slow. Attached is a patch that adds a --cache={off,none,writethrough,writeback} option to qemu-nbd. --cache=off is the same as --nocache (that is use O_DIRECT), writethrough is using O_SYNC and is still the default so this patch doesn't change the functionality. writeback is none of those flags, so is the addition of this patch. The patch also does an fsync upon qemu-nbd -d to make sure data is flushed to the image before removing the nbd. Consider this test scenario: dd bs=1M count=100 of=a /dev/null qemu-nbd --cache=x -c /dev/nbd0 a cp /dev/zero /dev/nbd0 time perl -MIO::Handle -e 'STDOUT-sync or die$!' 1 /dev/nbd0 With cache=writethrough (the default), it takes over 10 minutes to write those 100MB worth of zeroes. Running a strace, we see the recvfrom and sentos delayed by each 1kb write(2)s to disk (10 to 30 ms per write). With cache=off, it takes about 30 seconds. With cache=writeback, it takes about 3 seconds, which is similar to the performance you get with nbd-server Note that the cp command runs instantly as the data is buffered by the client (the kernel), and not sent to qemu-nbd until the fsync(2) is called.
Re: [Qemu-devel] [RFC][PATCH v5 00/21] virtagent: host/guest RPC communication agent
On Thu, Dec 9, 2010 at 8:45 PM, Michael Roth mdr...@linux.vnet.ibm.com wrote: On 12/08/2010 04:10 AM, Stefan Hajnoczi wrote: What concrete use-cases are there? * Reboot support on x86. A QMP command can invoke guest-initiated reboot via virtagent. * ? * viewfile The ability to do a quick peek at guest stats via, say, /proc, is a use case that seems to be fairly generally desirable. It's what essentially started all this work, actually. That's also why I think a simple/limited viewfile RPC for relatively small text files is warranted regardless of whatever approach we end up taking for handling large/binary file transfers. * getdmesg Really useful for trouble-shooting things like soft-lockups. * ping Heartbeat monitoring has also been a fairly re-occurring approach to identifying potential problems in our cloud, and it's not even something we're capable of accomplishing in a production environment due to having limited network access to the guests. Being able to do it without relying on custom/network-based daemons would be pretty useful. * exec (planned) Internally and externally I've seen interest in guest-initiated snapshots, but that would tie into exec or some other, higher-level, RPC, which isn't yet well-defined. * copyfile (planned) Nothing solid, but I think it's generally desirable. Quick access to coredumps and such would be useful. Lots of discussion on how to implement this and I think we have some good potential approaches to adding this soon. * writefile (planned) Nothing solid. But guest activation is probably a big potential use case for this one. Managing various guest kernel params is another. Another would be deploying custom scripts to run via exec. Perhaps getfile and putfile as names? Will virtagent be extensible by host administrators or end-users? For example, can I drop in a custom command to collect statistics and invoke it across VMs on my hosts? Do I need to recompile QEMU and/or the virtagent daemon? writefile + exec would probably be the best way to achieve this. I don't think there are any plans to make the supported set of RPCs pluggable/extendable. Thanks for explaining this, I think this is an important aspect of virtagent. Users will want to take advantage of an always-available host-guest management transport. There should be a clear scope defined for virtagent. For example users might include backup, monitoring, and inventory (making sure installed software is up-to-date) software. I imagine just how flexible and easy to use will be controversial because some people may feel users should use this channel into the VM carefully and only in situations where other remote access methods are not suitable (e.g. ssh, etc). If there is a flexible interface from the start then new use-cases can be implemented without modifying QEMU/virtagent source code and updating guests. Stefan
[Qemu-devel] Re: [PATCH v2 2/2] qemu, qmp: convert do_inject_nmi() to QObject, QError
Lai Jiangshan la...@cn.fujitsu.com writes: Convert do_inject_nmi() to QObject, QError, we need to use it(via libvirt). changed from v1 Add document. Add error handling when the cpu index is invalid. Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com --- diff --git a/hmp-commands.hx b/hmp-commands.hx index 23024ba..f86d9fe 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -724,7 +724,8 @@ ETEXI .args_type = cpu_index:i, .params = cpu, .help = inject an NMI on the given CPU, -.mhandler.cmd = do_inject_nmi, +.user_print = monitor_user_noop, +.mhandler.cmd_new = do_inject_nmi, }, #endif STEXI diff --git a/monitor.c b/monitor.c index ec31eac..f375eb3 100644 --- a/monitor.c +++ b/monitor.c @@ -2119,7 +2119,7 @@ static void do_wav_capture(Monitor *mon, const QDict *qdict) #endif #if defined(TARGET_I386) -static void do_inject_nmi(Monitor *mon, const QDict *qdict) +static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject **ret_data) { CPUState *env; int cpu_index = qdict_get_int(qdict, cpu_index); @@ -2127,8 +2127,11 @@ static void do_inject_nmi(Monitor *mon, const QDict *qdict) for (env = first_cpu; env != NULL; env = env-next_cpu) if (env-cpu_index == cpu_index) { cpu_interrupt(env, CPU_INTERRUPT_NMI); -break; +return 0; } + +qerror_report(QERR_INVALID_CPU_INDEX, cpu_index); +return -1; do_cpu_set() reports invalud index like this: qerror_report(QERR_INVALID_PARAMETER_VALUE, index, a CPU number); What about sticking to that? [...]
[Qemu-devel] Re: [PATCH v2 1/2] QError: new QERR_INVALID_CPU_INDEX
On Fri, 10 Dec 2010 14:36:01 +0800 Lai Jiangshan la...@cn.fujitsu.com wrote: Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com As Markus said, we report this as an invalid parameter in do_cpu(), we can do the same for inject-nmi. --- diff --git a/qerror.c b/qerror.c index ac2cdaf..f59fb58 100644 --- a/qerror.c +++ b/qerror.c @@ -117,6 +117,10 @@ static const QErrorStringTable qerror_table[] = { .desc = Invalid block format '%(name)', }, { +.error_fmt = QERR_INVALID_CPU_INDEX, +.desc = Invalid CPU index '%(cpu_index)', +}, +{ .error_fmt = QERR_INVALID_PARAMETER, .desc = Invalid parameter '%(name)', }, diff --git a/qerror.h b/qerror.h index 943a24b..9117dda 100644 --- a/qerror.h +++ b/qerror.h @@ -102,6 +102,9 @@ QError *qobject_to_qerror(const QObject *obj); #define QERR_INVALID_BLOCK_FORMAT \ { 'class': 'InvalidBlockFormat', 'data': { 'name': %s } } +#define QERR_INVALID_CPU_INDEX \ +{ 'class': 'InvalidCPUIndex', 'data': { 'cpu_index': %d } } + #define QERR_INVALID_PARAMETER \ { 'class': 'InvalidParameter', 'data': { 'name': %s } }
[Qemu-devel] Re: [PATCH v2 2/2] qemu, qmp: convert do_inject_nmi() to QObject, QError
On Fri, 10 Dec 2010 14:36:08 +0800 Lai Jiangshan la...@cn.fujitsu.com wrote: +SQMP +inject_nmi +-- + +Inject an NMI on the given CPU (x86 only). + +Arguments: + +- cpu_index: the index of the CPU to be injected NMI (json-int) + +Example: + +- { execute: inject_nmi, arguments: { cpu_index: 0 } } +- { return: {} } + +EQMP + Avi, Anthony, can you please review this? Do we expect some kind of ack from the guest? Do we expect it respond in some way? Also note that the current series defines only one error condition: invalid cpu index. Can this fail in other ways?
Re: [Qemu-devel] [PATCH 0/3] add hotplug opt-out option for devices.
Gerd Hoffmann kra...@redhat.com writes: I understand why you're adding this but this is one of those horrible abuses of qdev that we really need to avoid. There are two valid reasons why hotplug is not possible: 1) Hotplugging is not supported by the *slot*. This is something that needs to be exposes through ACPI. This is not a qdev property, but a property of a PCI slot. Well, yea, right. Sort of. The ACPI thing applies to some of the slots only. But, yes, strictly speaking this is a slot not a device property in the case of PCI. Problem is qemu doesn't really has an idea what a pci slot is ... Needs fixing. Can't see how we can get sane hot plug without a proper representation of PCI slots in QEMU. It's very important that we do this correctly because Windows puts a little icon in the systray that advertises quick-removal of devices in slots that support hotplug. Indeed. 2) The PCI device is soldered to the MB or is otherwise not part of a PCI slot. Again, this is part of the ACPI definition. (3) The qemu emulation can't handle hot-unplug. Since the PIIX3 lives in slot 1, our ACPI tables should not advertise slot 0 or slot 1 as supporting hotplug. They do currently. Should be easily fixable. Hotplug information has no business being part of the core qdev structures. Hotplug is a PCI concept and the information needs to live at the PCI layer to be meaningfully. Wrong. PCI certainly isn't the only bus which supports hotplug. It *does* make sense to handle generic hotplug stuff at qdev level. Could the proper place be qbus instead of qdev? An ideal interface would explicitly allow a user to mark a series of PCI slots as no supporting hotplug. It would be convenient in order to ensure that your virtio-net wasn't accidentally ejected by a click-happy Windows user. Indeed. That one is a bit harder I suspect. Can this be done without generating acpi tables dynamically?
Re: [Qemu-devel] [PATCH 10/11] config: Add header file for device config options
On 10.12.2010, at 13:37, Markus Armbruster wrote: Alexander Graf ag...@suse.de writes: On 21.11.2010, at 13:37, Blue Swirl wrote: On Fri, Nov 19, 2010 at 2:56 AM, Alexander Graf ag...@suse.de wrote: So far we have C preprocessor defines for target and host config options, but we're lacking any information on which devices are available. We do need that information at times though, for example in the ahci patch where we need to call a legacy init function depending on whether we have support compiled in or not. That does not seem right. Devices should not care about what other devices may exist. Perhaps stub style approach would be better. Well, for the -drive parameter we need to know what devices we can create and I'd like to keep that code as close as possible to the actual device code. Stupid question: why can't we just try to create, then check status to figure out whether it failed because the device isn't available? That's what the later version of this patch that in between is ripped out of the patch set did :) Alex
Re: [Qemu-devel] [PATCH 1/1] NBD isn't used by qemu-img, so don't link qemu-img against NBD objects
Jes Sorensen jes.soren...@redhat.com writes: On 11/22/10 16:20, Anthony Liguori wrote: On 11/22/2010 09:10 AM, Jes Sorensen wrote: On 11/22/10 16:08, Anthony Liguori wrote: On 11/22/2010 08:58 AM, Jes Sorensen wrote: Right, the right solution is probably to create a block driver list argument for configure, similar to what we have for the sound drivers. --block-drv-whitelist= Any idea what the difference is between 'whitelist' and 'list' in this context? Everything is built, but only the formats that are in the whitelist are usable. Kinda defeats the purpose IMHO. It would be useful to be able to strip out the formats one doesn't want to get a slimmed down binary. There are two different purposes here: 1. You want to remove support for a format from QEMU and all the tools. That's what you have in mind, I think. 2. You want to distinguish between good for production and not so good for production formats. That's what the whitelist is for. Formats need to be whitelisted to be usable with QEMU proper. Tools normally ignore the whitelist. Lets you attempt offline format conversion for non-whitelisted formats, which is useful. Both are legitimate, in my opinion.
[Qemu-devel] [PATCH 0/5] virtio-serial: Trivial fixes, don't copy buffers to host
Hi, This patch series converts virtio-serial-bus to use the guest buffers instead of copying over guest data to the host, as suggested by Paul. In addition, there are some trivial fixes for the virtio-console and virtio-serial code. Amit Shah (5): virtio-console: Factor out common init between console and generic ports virtio-console: Remove unnecessary braces virtio-serial: Simplify condition for a while loop virtio-serial: Don't copy over guest buffer to host virtio-serial: Error out if guest sends unexpected vq elements hw/virtio-console.c| 34 +++--- hw/virtio-serial-bus.c | 44 +++- 2 files changed, 50 insertions(+), 28 deletions(-) -- 1.7.3.2
[Qemu-devel] [PATCH 1/5] virtio-console: Factor out common init between console and generic ports
The initialisation for generic ports and console ports is similar. Factor out the parts that are the same in a different function that can be called from each of the initfns. Signed-off-by: Amit Shah amit.s...@redhat.com --- hw/virtio-console.c | 31 ++- 1 files changed, 14 insertions(+), 17 deletions(-) diff --git a/hw/virtio-console.c b/hw/virtio-console.c index caea11f..d7fe68b 100644 --- a/hw/virtio-console.c +++ b/hw/virtio-console.c @@ -58,24 +58,28 @@ static void chr_event(void *opaque, int event) } } -/* Virtio Console Ports */ -static int virtconsole_initfn(VirtIOSerialDevice *dev) +static int generic_port_init(VirtConsole *vcon, VirtIOSerialDevice *dev) { -VirtIOSerialPort *port = DO_UPCAST(VirtIOSerialPort, dev, dev-qdev); -VirtConsole *vcon = DO_UPCAST(VirtConsole, port, port); - -port-info = dev-info; - -port-is_console = true; +vcon-port.info = dev-info; if (vcon-chr) { qemu_chr_add_handlers(vcon-chr, chr_can_read, chr_read, chr_event, vcon); -port-info-have_data = flush_buf; +vcon-port.info-have_data = flush_buf; } return 0; } +/* Virtio Console Ports */ +static int virtconsole_initfn(VirtIOSerialDevice *dev) +{ +VirtIOSerialPort *port = DO_UPCAST(VirtIOSerialPort, dev, dev-qdev); +VirtConsole *vcon = DO_UPCAST(VirtConsole, port, port); + +port-is_console = true; +return generic_port_init(vcon, dev); +} + static int virtconsole_exitfn(VirtIOSerialDevice *dev) { VirtIOSerialPort *port = DO_UPCAST(VirtIOSerialPort, dev, dev-qdev); @@ -115,14 +119,7 @@ static int virtserialport_initfn(VirtIOSerialDevice *dev) VirtIOSerialPort *port = DO_UPCAST(VirtIOSerialPort, dev, dev-qdev); VirtConsole *vcon = DO_UPCAST(VirtConsole, port, port); -port-info = dev-info; - -if (vcon-chr) { -qemu_chr_add_handlers(vcon-chr, chr_can_read, chr_read, chr_event, - vcon); -port-info-have_data = flush_buf; -} -return 0; +return generic_port_init(vcon, dev); } static VirtIOSerialPortInfo virtserialport_info = { -- 1.7.3.2
[Qemu-devel] [PATCH 2/5] virtio-console: Remove unnecessary braces
Remove unnecessary braces around a case statement. Signed-off-by: Amit Shah amit.s...@redhat.com --- hw/virtio-console.c |3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/hw/virtio-console.c b/hw/virtio-console.c index d7fe68b..d0b9354 100644 --- a/hw/virtio-console.c +++ b/hw/virtio-console.c @@ -48,10 +48,9 @@ static void chr_event(void *opaque, int event) VirtConsole *vcon = opaque; switch (event) { -case CHR_EVENT_OPENED: { +case CHR_EVENT_OPENED: virtio_serial_open(vcon-port); break; -} case CHR_EVENT_CLOSED: virtio_serial_close(vcon-port); break; -- 1.7.3.2
[Qemu-devel] [PATCH 4/5] virtio-serial: Don't copy over guest buffer to host
When the guest writes something to a host, we copied over the entire buffer first into the host and then processed it. Do away with that, it could result in a malicious guest causing a DoS on the host. Reported-by: Paul Brook p...@codesourcery.com Signed-off-by: Amit Shah amit.s...@redhat.com --- hw/virtio-serial-bus.c | 20 1 files changed, 12 insertions(+), 8 deletions(-) diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c index ecf0056..3bbd915 100644 --- a/hw/virtio-serial-bus.c +++ b/hw/virtio-serial-bus.c @@ -125,17 +125,21 @@ static void do_flush_queued_data(VirtIOSerialPort *port, VirtQueue *vq, return; } while (virtqueue_pop(vq, elem)) { -uint8_t *buf; -size_t ret, buf_size; +unsigned int i; -if (!discard) { -buf_size = iov_size(elem.out_sg, elem.out_num); -buf = qemu_malloc(buf_size); -ret = iov_to_buf(elem.out_sg, elem.out_num, buf, 0, buf_size); +if (discard) { +goto next; +} +for (i = 0; i elem.out_num; i++) { +size_t buf_size; -port-info-have_data(port, buf, ret); -qemu_free(buf); +buf_size = elem.out_sg[i].iov_len; + +port-info-have_data(port, + elem.out_sg[i].iov_base, + buf_size); } +next: virtqueue_push(vq, elem, 0); } virtio_notify(vdev, vq); -- 1.7.3.2
[Qemu-devel] [PATCH 3/5] virtio-serial: Simplify condition for a while loop
Separate out a non-changing condition over the period of a loop into an if statement before the loop. This will be used later to re-work the loop. Signed-off-by: Amit Shah amit.s...@redhat.com --- hw/virtio-serial-bus.c |5 - 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c index 74ba5ec..ecf0056 100644 --- a/hw/virtio-serial-bus.c +++ b/hw/virtio-serial-bus.c @@ -121,7 +121,10 @@ static void do_flush_queued_data(VirtIOSerialPort *port, VirtQueue *vq, assert(port || discard); assert(virtio_queue_ready(vq)); -while ((discard || !port-throttled) virtqueue_pop(vq, elem)) { +if (!discard port-throttled) { +return; +} +while (virtqueue_pop(vq, elem)) { uint8_t *buf; size_t ret, buf_size; -- 1.7.3.2
[Qemu-devel] [PATCH 5/5] virtio-serial: Error out if guest sends unexpected vq elements
Check if the guest really sent any items in the out_vq before using them. Similarly, check if there is a buffer to send data in before writing. Signed-off-by: Amit Shah amit.s...@redhat.com --- hw/virtio-serial-bus.c | 19 +++ 1 files changed, 19 insertions(+), 0 deletions(-) diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c index 3bbd915..3a3032f 100644 --- a/hw/virtio-serial-bus.c +++ b/hw/virtio-serial-bus.c @@ -102,6 +102,11 @@ static size_t write_to_port(VirtIOSerialPort *port, break; } +if (elem.in_num 1) { +error_report(No buffer to send data in?); +abort(); +} + len = iov_from_buf(elem.in_sg, elem.in_num, buf + offset, size - offset); offset += len; @@ -127,6 +132,11 @@ static void do_flush_queued_data(VirtIOSerialPort *port, VirtQueue *vq, while (virtqueue_pop(vq, elem)) { unsigned int i; +if (elem.out_num 1) { +error_report(No data sent by guest?); +abort(); +} + if (discard) { goto next; } @@ -169,6 +179,11 @@ static size_t send_control_msg(VirtIOSerialPort *port, void *buf, size_t len) return 0; } +if (elem.in_num 1) { +error_report(No buffer to send control data in?); +abort(); +} + cpkt = (struct virtio_console_control *)buf; stl_p(cpkt-id, port-id); memcpy(elem.in_sg[0].iov_base, buf, len); @@ -386,6 +401,10 @@ static void control_out(VirtIODevice *vdev, VirtQueue *vq) while (virtqueue_pop(vq, elem)) { size_t cur_len, copied; +if (elem.out_num 1) { +error_report(No data sent in control packet); +abort(); +} cur_len = iov_size(elem.out_sg, elem.out_num); /* * Allocate a new buf only if we didn't have one previously or -- 1.7.3.2
Re: [Qemu-devel] [PATCH 4/5] ide: add TRIM support
On Thu, Dec 02, 2010 at 03:07:49PM +0100, Kevin Wolf wrote: This looks wrong. Wouldn't werror=stop cause the request to be retried as a write when the VM is resumed? Indeed. But having a copypaste error gives just about right reason to mention that after read and write this is the third almost unchanged copy of this code. Eventually we'll want to refactor this. I've added a patch to refactor the DMA code to the next iteration of the patch series. While we're at it, do you know why in the eot: case we set BM_STATUS_INT, but don't actually call ide_set_irq? From what I understand, those two should always be coupled, but I might be wrong. No idea, sorry.
Re: [Qemu-devel] [PATCH 0/3] add hotplug opt-out option for devices.
Hi, Wrong. PCI certainly isn't the only bus which supports hotplug. It *does* make sense to handle generic hotplug stuff at qdev level. Could the proper place be qbus instead of qdev? No. But PCI is the only bus where some devices are hot-pluggable and some are not. On all other busses it is either all or no devices. So moving to pci is an option (patches for that are on the list). cheers, Gerd
[Qemu-devel] [PATCH] ide: Register vm change state handler once only
We register the vm change state handler in a PCI BAR map() function. This function can be called multiple times throughout the lifetime of a PCI IDE device. This results in duplicate vm change state handlers being register, none of which are ever unregistered. Instead, register the vm change state handler in the device's init function once and for all. piix tested, cmd646 and via not tested. Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- hw/ide/cmd646.c | 16 +--- hw/ide/piix.c | 32 +++- hw/ide/via.c| 32 +++- 3 files changed, 55 insertions(+), 25 deletions(-) diff --git a/hw/ide/cmd646.c b/hw/ide/cmd646.c index dfe6091..23fc6e1 100644 --- a/hw/ide/cmd646.c +++ b/hw/ide/cmd646.c @@ -167,9 +167,6 @@ static void bmdma_map(PCIDevice *pci_dev, int region_num, for(i = 0;i 2; i++) { BMDMAState *bm = d-bmdma[i]; -d-bus[i].bmdma = bm; -bm-bus = d-bus+i; -qemu_add_vm_change_state_handler(ide_dma_restart_cb, bm); if (i == 0) { register_ioport_write(addr, 4, 1, bmdma_writeb_0, d); @@ -228,6 +225,7 @@ static int pci_cmd646_ide_initfn(PCIDevice *dev) PCIIDEState *d = DO_UPCAST(PCIIDEState, dev, dev); uint8_t *pci_conf = d-dev.config; qemu_irq *irq; +int i; pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_CMD); pci_config_set_device_id(pci_conf, PCI_DEVICE_ID_CMD_646); @@ -253,10 +251,14 @@ static int pci_cmd646_ide_initfn(PCIDevice *dev) pci_conf[PCI_INTERRUPT_PIN] = 0x01; // interrupt on pin 1 irq = qemu_allocate_irqs(cmd646_set_irq, d, 2); -ide_bus_new(d-bus[0], d-dev.qdev); -ide_bus_new(d-bus[1], d-dev.qdev); -ide_init2(d-bus[0], irq[0]); -ide_init2(d-bus[1], irq[1]); +for (i = 0; i 2; i++) { +ide_bus_new(d-bus[i], d-dev.qdev); +ide_init2(d-bus[i], irq[i]); + +d-bus[i].bmdma = d-bmdma[i]; +bm-bus = d-bus[i]; +qemu_add_vm_change_state_handler(ide_dma_restart_cb, d-bmdma[i]); +} vmstate_register(dev-qdev, 0, vmstate_ide_pci, d); qemu_register_reset(cmd646_reset, d); diff --git a/hw/ide/piix.c b/hw/ide/piix.c index e02b89a..f2752e7 100644 --- a/hw/ide/piix.c +++ b/hw/ide/piix.c @@ -76,9 +76,6 @@ static void bmdma_map(PCIDevice *pci_dev, int region_num, for(i = 0;i 2; i++) { BMDMAState *bm = d-bmdma[i]; -d-bus[i].bmdma = bm; -bm-bus = d-bus+i; -qemu_add_vm_change_state_handler(ide_dma_restart_cb, bm); register_ioport_write(addr, 1, 1, bmdma_cmd_writeb, bm); @@ -112,6 +109,28 @@ static void piix3_reset(void *opaque) pci_conf[0x20] = 0x01; /* BMIBA: 20-23h */ } +static void pci_piix_init_ports(PCIIDEState *d) { +int i; +struct { +int iobase; +int iobase2; +int isairq; +} port_info[] = { +{0x1f0, 0x3f6, 14}, +{0x170, 0x376, 15}, +}; + +for (i = 0; i 2; i++) { +ide_bus_new(d-bus[i], d-dev.qdev); +ide_init_ioport(d-bus[i], port_info[i].iobase, port_info[i].iobase2); +ide_init2(d-bus[i], isa_reserve_irq(port_info[i].isairq)); + +d-bus[i].bmdma = d-bmdma[i]; +d-bmdma[i].bus = d-bus[i]; +qemu_add_vm_change_state_handler(ide_dma_restart_cb, d-bmdma[i]); +} +} + static int pci_piix_ide_initfn(PCIIDEState *d) { uint8_t *pci_conf = d-dev.config; @@ -125,13 +144,8 @@ static int pci_piix_ide_initfn(PCIIDEState *d) vmstate_register(d-dev.qdev, 0, vmstate_ide_pci, d); -ide_bus_new(d-bus[0], d-dev.qdev); -ide_bus_new(d-bus[1], d-dev.qdev); -ide_init_ioport(d-bus[0], 0x1f0, 0x3f6); -ide_init_ioport(d-bus[1], 0x170, 0x376); +pci_piix_init_ports(d); -ide_init2(d-bus[0], isa_reserve_irq(14)); -ide_init2(d-bus[1], isa_reserve_irq(15)); return 0; } diff --git a/hw/ide/via.c b/hw/ide/via.c index 66be0c4..839c571 100644 --- a/hw/ide/via.c +++ b/hw/ide/via.c @@ -78,9 +78,6 @@ static void bmdma_map(PCIDevice *pci_dev, int region_num, for(i = 0;i 2; i++) { BMDMAState *bm = d-bmdma[i]; -d-bus[i].bmdma = bm; -bm-bus = d-bus+i; -qemu_add_vm_change_state_handler(ide_dma_restart_cb, bm); register_ioport_write(addr, 1, 1, bmdma_cmd_writeb, bm); @@ -135,6 +132,28 @@ static void via_reset(void *opaque) pci_set_long(pci_conf + 0xc0, 0x00020001); } +static void vt82c686b_init_ports(PCIIDEState *d) { +int i; +struct { +int iobase; +int iobase2; +int isairq; +} port_info[] = { +{0x1f0, 0x3f6, 14}, +{0x170, 0x376, 15}, +}; + +for (i = 0; i 2; i++) { +ide_bus_new(d-bus[i], d-dev.qdev); +ide_init2(d-bus[i], isa_reserve_irq(port_info[i].isairq)); +ide_init_ioport(d-bus[i], port_info[i].iobase, port_info[i].iobase2); + +d-bus[i].bmdma = d-bmdma[i]; +d-bmdma[i].bus = d-bus[i]; +
[Qemu-devel] ]PATCH 0/7] add TRIM/UNMAP support, v3
This patchset adds support for the ATA TRIM and SCSI WRITE SAME with unmap commands, which allow reclaiming free space from a backing image. The user facing implementation is pretty complete, but not really efficient because the underlying bdrv_discard implementation doesn't use the aio implementation yet. The reason for that is that the SCSI layer doesn't really allow any asynchronous commands except for READ/WRITE by design, and implementing the ATA TRIM command with it's multiple ranges is rather painful, and combined with the SCSI limitation I didn't bother yet. The only backend support so far is the XFS hole punching ioctl, but others can be added easily when they become available. A virtio implementation for a discard command would also be pretty easy, but until we actually support a better backend then a plain sparse file it's not worth using for production enviroments anyway, but more for playing with the thin provisioning infrastructure, or observing guest behaviour when TRIM / unmap is supported. If the support is enabled and the backend doesn't support hole punching the TRIM / WRITE SAME commands become no-ops so that migration from hosts supporting or not supporting it works. Version 3: - refactor IDE dma support code - proper brace obsfucation - fix compile without xfs headers - use bool instead of int for a one-byte flag Version 2: - replace tabs with spaces - return -ENOMEDIUM from bdrv_discard if there's no driver assigned - actually list the TP EVPD page as supported when querying for supported EVPD pages
[Qemu-devel] Re: [PATCH 5/5] virtio-serial: Error out if guest sends unexpected vq elements
On (Fri) Dec 10 2010 [13:59:50], Paul Brook wrote: Check if the guest really sent any items in the out_vq before using them. Similarly, check if there is a buffer to send data in before writing. Can this actually happen? If so why/how? Why does it need a special case in this device? A malicious guest (ie, a guest with the virtio_console module suitably modified) could send in buffers with the 'input' bit set instead of output as expected or vice-versa. If this is guest triggerable then calling abort() is wrong. It's either a guest bug or a malicious guest. What action is recommended? Amit
[Qemu-devel] [PATCH 3/7] make dma_bdrv_io available to drivers
Make dma_bdrv_io available for drivers, and pass an explicit I/O function instead of hardcoding bdrv_aio_readv/bdrv_aio_writev. This is required to implement non-READ/WRITE dma commands in the ide driver, e.g. the upcoming TRIM support. Signed-off-by: Christoph Hellwig h...@lst.de Index: qemu/dma-helpers.c === --- qemu.orig/dma-helpers.c 2010-11-23 14:13:42.178253390 +0100 +++ qemu/dma-helpers.c 2010-11-23 14:14:45.186253110 +0100 @@ -47,6 +47,7 @@ typedef struct { target_phys_addr_t sg_cur_byte; QEMUIOVector iov; QEMUBH *bh; +DMAIOFunc *io_func; } DMAAIOCB; static void dma_bdrv_cb(void *opaque, int ret); @@ -116,13 +117,8 @@ static void dma_bdrv_cb(void *opaque, in return; } -if (dbs-is_write) { -dbs-acb = bdrv_aio_writev(dbs-bs, dbs-sector_num, dbs-iov, - dbs-iov.size / 512, dma_bdrv_cb, dbs); -} else { -dbs-acb = bdrv_aio_readv(dbs-bs, dbs-sector_num, dbs-iov, - dbs-iov.size / 512, dma_bdrv_cb, dbs); -} +dbs-acb = dbs-io_func(dbs-bs, dbs-sector_num, dbs-iov, +dbs-iov.size / 512, dma_bdrv_cb, dbs); if (!dbs-acb) { dma_bdrv_unmap(dbs); qemu_iovec_destroy(dbs-iov); @@ -144,12 +140,12 @@ static AIOPool dma_aio_pool = { .cancel = dma_aio_cancel, }; -static BlockDriverAIOCB *dma_bdrv_io( +BlockDriverAIOCB *dma_bdrv_io( BlockDriverState *bs, QEMUSGList *sg, uint64_t sector_num, -BlockDriverCompletionFunc *cb, void *opaque, -int is_write) +DMAIOFunc *io_func, BlockDriverCompletionFunc *cb, +void *opaque, int is_write) { -DMAAIOCB *dbs = qemu_aio_get(dma_aio_pool, bs, cb, opaque); +DMAAIOCB *dbs = qemu_aio_get(dma_aio_pool, bs, cb, opaque); dbs-acb = NULL; dbs-bs = bs; @@ -158,6 +154,7 @@ static BlockDriverAIOCB *dma_bdrv_io( dbs-sg_cur_index = 0; dbs-sg_cur_byte = 0; dbs-is_write = is_write; +dbs-io_func = io_func; dbs-bh = NULL; qemu_iovec_init(dbs-iov, sg-nsg); dma_bdrv_cb(dbs, 0); @@ -173,12 +170,12 @@ BlockDriverAIOCB *dma_bdrv_read(BlockDri QEMUSGList *sg, uint64_t sector, void (*cb)(void *opaque, int ret), void *opaque) { -return dma_bdrv_io(bs, sg, sector, cb, opaque, 0); +return dma_bdrv_io(bs, sg, sector, bdrv_aio_readv, cb, opaque, 0); } BlockDriverAIOCB *dma_bdrv_write(BlockDriverState *bs, QEMUSGList *sg, uint64_t sector, void (*cb)(void *opaque, int ret), void *opaque) { -return dma_bdrv_io(bs, sg, sector, cb, opaque, 1); +return dma_bdrv_io(bs, sg, sector, bdrv_aio_writev, cb, opaque, 1); } Index: qemu/dma.h === --- qemu.orig/dma.h 2010-11-23 14:13:42.185253809 +0100 +++ qemu/dma.h 2010-11-23 14:14:45.186253110 +0100 @@ -32,6 +32,14 @@ void qemu_sglist_add(QEMUSGList *qsg, ta target_phys_addr_t len); void qemu_sglist_destroy(QEMUSGList *qsg); +typedef BlockDriverAIOCB *DMAIOFunc(BlockDriverState *bs, int64_t sector_num, + QEMUIOVector *iov, int nb_sectors, + BlockDriverCompletionFunc *cb, void *opaque); + +BlockDriverAIOCB *dma_bdrv_io(BlockDriverState *bs, + QEMUSGList *sg, uint64_t sector_num, + DMAIOFunc *io_func, BlockDriverCompletionFunc *cb, + void *opaque, int is_write); BlockDriverAIOCB *dma_bdrv_read(BlockDriverState *bs, QEMUSGList *sg, uint64_t sector, BlockDriverCompletionFunc *cb, void *opaque);
[Qemu-devel] [PATCH 2/7] scsi-disk: support WRITE SAME (16) with unmap bit
Support discards via the WRITE SAME command with the unmap bit set, and tell the initiator about the support for it via the block limit and the new thin provisioning EVPD pages. Also fix the comment which incorrectly describedthe block limits EVPD page. Signed-off-by: Christoph Hellwig h...@lst.de Index: qemu/hw/scsi-disk.c === --- qemu.orig/hw/scsi-disk.c2010-12-07 09:35:44.203009851 +0100 +++ qemu/hw/scsi-disk.c 2010-12-07 09:42:07.984255141 +0100 @@ -424,7 +424,8 @@ static int scsi_disk_emulate_inquiry(SCS outbuf[buflen++] = 0x80; // unit serial number outbuf[buflen++] = 0x83; // device identification if (bdrv_get_type_hint(s-bs) != BDRV_TYPE_CDROM) { -outbuf[buflen++] = 0xb0; // block device characteristics +outbuf[buflen++] = 0xb0; // block limits +outbuf[buflen++] = 0xb2; // thin provisioning } outbuf[pages] = buflen - pages - 1; // number of pages break; @@ -466,8 +467,10 @@ static int scsi_disk_emulate_inquiry(SCS buflen += id_len; break; } -case 0xb0: /* block device characteristics */ +case 0xb0: /* block limits */ { +unsigned int unmap_sectors = +s-qdev.conf.discard_granularity / s-qdev.blocksize; unsigned int min_io_size = s-qdev.conf.min_io_size / s-qdev.blocksize; unsigned int opt_io_size = @@ -492,6 +495,21 @@ static int scsi_disk_emulate_inquiry(SCS outbuf[13] = (opt_io_size 16) 0xff; outbuf[14] = (opt_io_size 8) 0xff; outbuf[15] = opt_io_size 0xff; + +/* optimal unmap granularity */ +outbuf[28] = (unmap_sectors 24) 0xff; +outbuf[29] = (unmap_sectors 16) 0xff; +outbuf[30] = (unmap_sectors 8) 0xff; +outbuf[31] = unmap_sectors 0xff; +break; +} +case 0xb2: /* thin provisioning */ +{ +outbuf[3] = buflen = 8; +outbuf[4] = 0; +outbuf[5] = 0x40; /* write same with unmap supported */ +outbuf[6] = 0; +outbuf[7] = 0; break; } default: @@ -959,6 +977,11 @@ static int scsi_disk_emulate_command(SCS outbuf[11] = 0; outbuf[12] = 0; outbuf[13] = get_physical_block_exp(s-qdev.conf); + +/* set TPE bit if the format supports discard */ +if (s-qdev.conf.discard_granularity) +outbuf[14] = 0x80; + /* Protection, exponent and lowest lba field left blank. */ buflen = req-cmd.xfer; break; @@ -1123,6 +1146,34 @@ static int32_t scsi_send_command(SCSIDev goto illegal_lba; } break; +case WRITE_SAME_16: +len = r-req.cmd.xfer / d-blocksize; + +DPRINTF(WRITE SAME(16) (sector % PRId64 , count %d)\n, +r-req.cmd.lba, len); + +if (r-req.cmd.lba s-max_lba) { +goto illegal_lba; +} + +/* + * We only support WRITE SAME with the unmap bit set for now. + */ +if (!(buf[1] 0x8)) { +goto fail; +} + +rc = bdrv_discard(s-bs, r-req.cmd.lba * s-cluster_size, + len * s-cluster_size); +if (rc 0) { +/* XXX: better error code ?*/ +goto fail; +} + +scsi_req_set_status(r, GOOD, NO_SENSE); +scsi_req_complete(r-req); +scsi_remove_request(r); +return 0; default: DPRINTF(Unknown SCSI command (%2.2x)\n, buf[0]); fail: Index: qemu/hw/scsi-defs.h === --- qemu.orig/hw/scsi-defs.h2010-12-07 09:35:44.219005032 +0100 +++ qemu/hw/scsi-defs.h 2010-12-07 09:38:35.726003986 +0100 @@ -84,6 +84,7 @@ #define MODE_SENSE_10 0x5a #define PERSISTENT_RESERVE_IN 0x5e #define PERSISTENT_RESERVE_OUT 0x5f +#define WRITE_SAME_16 0x93 #define MAINTENANCE_IN0xa3 #define MAINTENANCE_OUT 0xa4 #define MOVE_MEDIUM 0xa5
[Qemu-devel] [PATCH 7/7] raw-posix: add discard support
Add support to discard blocks in a raw image residing on an XFS filesystem by calling the XFS_IOC_UNRESVSP64 ioctl to punch holes. Support for other hole punching mechanisms can be added when they become available. Signed-off-by: Christoph Hellwig h...@lst.de Index: qemu/block/raw-posix.c === --- qemu.orig/block/raw-posix.c 2010-12-07 09:35:40.806004264 +0100 +++ qemu/block/raw-posix.c 2010-12-07 09:44:30.093290897 +0100 @@ -69,6 +69,10 @@ #include sys/diskslice.h #endif +#ifdef CONFIG_XFS +#include xfs/xfs.h +#endif + //#define DEBUG_FLOPPY //#define DEBUG_BLOCK @@ -120,6 +124,9 @@ typedef struct BDRVRawState { #endif uint8_t *aligned_buf; unsigned aligned_buf_size; +#ifdef CONFIG_XFS +bool is_xfs : 1; +#endif } BDRVRawState; static int fd_open(BlockDriverState *bs); @@ -196,6 +203,12 @@ static int raw_open_common(BlockDriverSt #endif } +#ifdef CONFIG_XFS +if (platform_test_xfs_fd(s-fd)) { +s-is_xfs = 1; +} +#endif + return 0; out_free_buf: @@ -740,6 +753,36 @@ static int raw_flush(BlockDriverState *b return qemu_fdatasync(s-fd); } +#ifdef CONFIG_XFS +static int xfs_discard(BDRVRawState *s, int64_t sector_num, int nb_sectors) +{ +struct xfs_flock64 fl; + +memset(fl, 0, sizeof(fl)); +fl.l_whence = SEEK_SET; +fl.l_start = sector_num 9; +fl.l_len = (int64_t)nb_sectors 9; + +if (xfsctl(NULL, s-fd, XFS_IOC_UNRESVSP64, fl) 0) { +printf(cannot punch hole (%s)\n, strerror(errno)); +return -errno; +} + +return 0; +} +#endif + +static int raw_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors) +{ +#ifdef CONFIG_XFS +BDRVRawState *s = bs-opaque; + +if (s-is_xfs) +return xfs_discard(s, sector_num, nb_sectors); +#endif + +return 0; +} static QEMUOptionParameter raw_create_options[] = { { @@ -761,6 +804,7 @@ static BlockDriver bdrv_file = { .bdrv_close = raw_close, .bdrv_create = raw_create, .bdrv_flush = raw_flush, +.bdrv_discard = raw_discard, .bdrv_aio_readv = raw_aio_readv, .bdrv_aio_writev = raw_aio_writev, Index: qemu/configure === --- qemu.orig/configure 2010-12-07 09:35:40.815261632 +0100 +++ qemu/configure 2010-12-07 09:44:52.764255834 +0100 @@ -288,6 +288,7 @@ xen= linux_aio= attr= vhost_net= +xfs= gprof=no debug_tcg=no @@ -1393,6 +1394,27 @@ EOF fi ## +# xfsctl() probe, used for raw-posix +if test $xfs != no ; then + cat $TMPC EOF +#include xfs/xfs.h +int main(void) +{ +xfsctl(NULL, 0, 0, NULL); +return 0; +} +EOF + if compile_prog ; then +xfs=yes + else +if test $xfs = yes ; then + feature_not_found xfs +fi +xfs=no + fi +fi + +## # vde libraries probe if test $vde != no ; then vde_libs=-lvdeplug @@ -2354,6 +2376,7 @@ echo vhost-net support $vhost_net echo Trace backend $trace_backend echo Trace output file $trace_file-pid echo spice support $spice +echo xfsctl support$xfs if test $sdl_too_old = yes; then echo - Your SDL version is too old - please upgrade to have SDL support @@ -2499,6 +2522,9 @@ fi if test $uuid = yes ; then echo CONFIG_UUID=y $config_host_mak fi +if test $xfs = yes ; then + echo CONFIG_XFS=y $config_host_mak +fi qemu_version=`head $source_path/VERSION` echo VERSION=$qemu_version $config_host_mak echo PKGVERSION=$pkgversion $config_host_mak
[Qemu-devel] [PATCH 1/7] block: add discard support
Add a new bdrv_discard method to free blocks in a mapping image, and a new drive property to set the granularity for these discard. If no discard granularity support is set discard support is disabled. Signed-off-by: Christoph Hellwig h...@lst.de Index: qemu/block.c === --- qemu.orig/block.c 2010-12-07 09:34:17.563010201 +0100 +++ qemu/block.c2010-12-07 09:37:44.734255486 +0100 @@ -1499,6 +1499,17 @@ int bdrv_has_zero_init(BlockDriverState return 1; } +int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors) +{ +if (!bs-drv) { +return -ENOMEDIUM; +} +if (!bs-drv-bdrv_discard) { +return 0; +} +return bs-drv-bdrv_discard(bs, sector_num, nb_sectors); +} + /* * Returns true iff the specified sector is present in the disk image. Drivers * not implementing the functionality are assumed to not support backing files, Index: qemu/block.h === --- qemu.orig/block.h 2010-12-07 09:34:17.571010061 +0100 +++ qemu/block.h2010-12-07 09:34:29.689022144 +0100 @@ -146,6 +146,7 @@ int bdrv_flush(BlockDriverState *bs); void bdrv_flush_all(void); void bdrv_close_all(void); +int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors); int bdrv_has_zero_init(BlockDriverState *bs); int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors, int *pnum); Index: qemu/block_int.h === --- qemu.orig/block_int.h 2010-12-07 09:34:17.588010480 +0100 +++ qemu/block_int.h2010-12-07 09:34:29.693010131 +0100 @@ -73,6 +73,8 @@ struct BlockDriver { BlockDriverCompletionFunc *cb, void *opaque); BlockDriverAIOCB *(*bdrv_aio_flush)(BlockDriverState *bs, BlockDriverCompletionFunc *cb, void *opaque); +int (*bdrv_discard)(BlockDriverState *bs, int64_t sector_num, +int nb_sectors); int (*bdrv_aio_multiwrite)(BlockDriverState *bs, BlockRequest *reqs, int num_reqs); @@ -227,6 +229,7 @@ typedef struct BlockConf { uint16_t logical_block_size; uint16_t min_io_size; uint32_t opt_io_size; +uint32_t discard_granularity; } BlockConf; static inline unsigned int get_physical_block_exp(BlockConf *conf) @@ -249,6 +252,8 @@ static inline unsigned int get_physical_ DEFINE_PROP_UINT16(physical_block_size, _state, \ _conf.physical_block_size, 512), \ DEFINE_PROP_UINT16(min_io_size, _state, _conf.min_io_size, 0), \ -DEFINE_PROP_UINT32(opt_io_size, _state, _conf.opt_io_size, 0) +DEFINE_PROP_UINT32(opt_io_size, _state, _conf.opt_io_size, 0), \ +DEFINE_PROP_UINT32(discard_granularity, _state, \ + _conf.discard_granularity, 0) #endif /* BLOCK_INT_H */ Index: qemu/block/raw.c === --- qemu.orig/block/raw.c 2010-12-07 09:34:17.596025496 +0100 +++ qemu/block/raw.c2010-12-07 09:34:29.695297321 +0100 @@ -65,6 +65,11 @@ static int raw_probe(const uint8_t *buf, return 1; /* everything can be opened as raw image */ } +static int raw_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors) +{ +return bdrv_discard(bs-file, sector_num, nb_sectors); +} + static int raw_is_inserted(BlockDriverState *bs) { return bdrv_is_inserted(bs-file); @@ -130,6 +135,7 @@ static BlockDriver bdrv_raw = { .bdrv_aio_readv = raw_aio_readv, .bdrv_aio_writev= raw_aio_writev, .bdrv_aio_flush = raw_aio_flush, +.bdrv_discard = raw_discard, .bdrv_is_inserted = raw_is_inserted, .bdrv_eject = raw_eject,
[Qemu-devel] Re: [PATCH] blockdev: check dinfo ptr before using
Am 08.12.2010 17:05, schrieb Ryan Harper: If a user decides to punish a guest by revoking its block device via drive_del, and subsequently also attempts to remove the pci device backing it, and the device is using blockdev_auto_del() then we get a segfault when we attempt to access dinfo-auto_del.[1] The fix is to check if drive_get_by_blockdev() actually returns a valid dinfo pointer or not. 1. (qemu) pci_add auto storage file=images/test01.raw,if=virtio,id=block1,snapshot=on (qemu) drive_del block1 (qemu) pci_del 5 *segfault* Signed-off-by: Ryan Harper ry...@us.ibm.com Thanks, applied to the block branch. Kevin
[Qemu-devel] [PATCH 5/7] ide: also reset io_buffer_index for writes
Currenly the code only resets the io_buffer_index field for reads, but the code seems to expect this for all types of I/O. I guess we simply don't hit large enough transfers that would require this often enough. Signed-off-by: Christoph Hellwig h...@lst.de Index: qemu/hw/ide/core.c === --- qemu.orig/hw/ide/core.c 2010-12-10 11:35:23.164005731 +0100 +++ qemu/hw/ide/core.c 2010-12-10 11:35:30.471253949 +0100 @@ -605,8 +605,7 @@ static void ide_dma_cb(void *opaque, int /* launch next transfer */ n = s-nsector; -if (s-is_read) -s-io_buffer_index = 0; +s-io_buffer_index = 0; s-io_buffer_size = n * 512; if (dma_buf_prepare(bm, s-is_read) == 0) goto eot;
[Qemu-devel] [PATCH 4/7] ide: factor dma handling helpers
The DMA I/O path is duplicated between read and write commands, and support for the TRIM command would add another copy. Factor the code into common helpers using the s-is_read flag added for the macio ATA controller, and the newly added dma_bdrv_io function. Signed-off-by: Christoph Hellwig h...@lst.de Index: qemu/hw/ide/core.c === --- qemu.orig/hw/ide/core.c 2010-12-10 11:17:57.328254648 +0100 +++ qemu/hw/ide/core.c 2010-12-10 11:35:23.164005731 +0100 @@ -61,7 +61,8 @@ static inline int media_is_cd(IDEState * return (media_present(s) s-nb_sectors = CD_MAX_SECTORS); } -static void ide_dma_start(IDEState *s, BlockDriverCompletionFunc *dma_cb); +static void ide_dma_start(IDEState *s, BlockDriverCompletionFunc *dma_cb, + DMAIOFunc *io_func); static void ide_dma_restart(IDEState *s, int is_read); static void ide_atapi_cmd_read_dma_cb(void *opaque, int ret); static int ide_handle_rw_error(IDEState *s, int error, int op); @@ -568,7 +569,7 @@ static int dma_buf_rw(BMDMAState *bm, in return 1; } -static void ide_read_dma_cb(void *opaque, int ret) +static void ide_dma_cb(void *opaque, int ret) { BMDMAState *bm = opaque; IDEState *s = bmdma_active_if(bm); @@ -576,9 +577,12 @@ static void ide_read_dma_cb(void *opaque int64_t sector_num; if (ret 0) { -if (ide_handle_rw_error(s, -ret, -BM_STATUS_DMA_RETRY | BM_STATUS_RETRY_READ)) -{ +int op = BM_STATUS_DMA_RETRY; + +if (s-is_read) +op |= BM_STATUS_RETRY_READ; + +if (ide_handle_rw_error(s, -ret, op)) { return; } } @@ -586,7 +590,7 @@ static void ide_read_dma_cb(void *opaque n = s-io_buffer_size 9; sector_num = ide_get_sector(s); if (n 0) { -dma_buf_commit(s, 1); +dma_buf_commit(s, s-is_read); sector_num += n; ide_set_sector(s, sector_num); s-nsector -= n; @@ -596,32 +600,39 @@ static void ide_read_dma_cb(void *opaque if (s-nsector == 0) { s-status = READY_STAT | SEEK_STAT; ide_set_irq(s-bus); -eot: -bm-status |= BM_STATUS_INT; -ide_dma_set_inactive(bm); -return; +goto eot; } /* launch next transfer */ n = s-nsector; -s-io_buffer_index = 0; +if (s-is_read) +s-io_buffer_index = 0; s-io_buffer_size = n * 512; -if (dma_buf_prepare(bm, 1) == 0) +if (dma_buf_prepare(bm, s-is_read) == 0) goto eot; + #ifdef DEBUG_AIO -printf(aio_read: sector_num=% PRId64 n=%d\n, sector_num, n); +printf(ide_dma_cb: read: %d sector_num=% PRId64 n=%d\n, + s-is_read, sector_num, n); #endif -bm-aiocb = dma_bdrv_read(s-bs, s-sg, sector_num, ide_read_dma_cb, bm); -ide_dma_submit_check(s, ide_read_dma_cb, bm); + +bm-aiocb = dma_bdrv_io(s-bs, s-sg, sector_num, bm-io_func, +ide_dma_cb, bm, !s-is_read); +ide_dma_submit_check(s, ide_dma_cb, bm); +return; + +eot: +bm-status |= BM_STATUS_INT; +ide_dma_set_inactive(bm); } -static void ide_sector_read_dma(IDEState *s) +static void ide_sector_dma(IDEState *s, DMAIOFunc *io_func, bool is_read) { s-status = READY_STAT | SEEK_STAT | DRQ_STAT | BUSY_STAT; s-io_buffer_index = 0; s-io_buffer_size = 0; -s-is_read = 1; -ide_dma_start(s, ide_read_dma_cb); +s-is_read = is_read; +ide_dma_start(s, ide_dma_cb, io_func); } static void ide_sector_write_timer_cb(void *opaque) @@ -714,58 +725,6 @@ void ide_dma_restart_cb(void *opaque, in } } -static void ide_write_dma_cb(void *opaque, int ret) -{ -BMDMAState *bm = opaque; -IDEState *s = bmdma_active_if(bm); -int n; -int64_t sector_num; - -if (ret 0) { -if (ide_handle_rw_error(s, -ret, BM_STATUS_DMA_RETRY)) -return; -} - -n = s-io_buffer_size 9; -sector_num = ide_get_sector(s); -if (n 0) { -dma_buf_commit(s, 0); -sector_num += n; -ide_set_sector(s, sector_num); -s-nsector -= n; -} - -/* end of transfer ? */ -if (s-nsector == 0) { -s-status = READY_STAT | SEEK_STAT; -ide_set_irq(s-bus); -eot: -bm-status |= BM_STATUS_INT; -ide_dma_set_inactive(bm); -return; -} - -n = s-nsector; -s-io_buffer_size = n * 512; -/* launch next transfer */ -if (dma_buf_prepare(bm, 0) == 0) -goto eot; -#ifdef DEBUG_AIO -printf(aio_write: sector_num=% PRId64 n=%d\n, sector_num, n); -#endif -bm-aiocb = dma_bdrv_write(s-bs, s-sg, sector_num, ide_write_dma_cb, bm); -ide_dma_submit_check(s, ide_write_dma_cb, bm); -} - -static void ide_sector_write_dma(IDEState *s) -{ -s-status = READY_STAT | SEEK_STAT | DRQ_STAT | BUSY_STAT; -s-io_buffer_index = 0; -s-io_buffer_size = 0; -s-is_read = 0; -ide_dma_start(s, ide_write_dma_cb); -} - void
[Qemu-devel] [PATCH 6/7] ide: add TRIM support
Add support for the data set management command, and the TRIM sub function in it. Signed-off-by: Christoph Hellwig h...@lst.de Index: qemu/hw/ide/core.c === --- qemu.orig/hw/ide/core.c 2010-12-10 11:35:30.471253949 +0100 +++ qemu/hw/ide/core.c 2010-12-10 11:35:33.430253739 +0100 @@ -146,6 +146,9 @@ static void ide_identify(IDEState *s) put_le16(p + 66, 120); put_le16(p + 67, 120); put_le16(p + 68, 120); +if (dev dev-conf.discard_granularity) { +put_le16(p + 69, (1 14)); /* determinate TRIM behavior */ +} put_le16(p + 80, 0xf0); /* ata3 - ata6 supported */ put_le16(p + 81, 0x16); /* conforms to ata5 */ /* 14=NOP supported, 5=WCACHE supported, 0=SMART supported */ @@ -172,6 +175,9 @@ static void ide_identify(IDEState *s) dev = s-unit ? s-bus-slave : s-bus-master; if (dev dev-conf.physical_block_size) put_le16(p + 106, 0x6000 | get_physical_block_exp(dev-conf)); +if (dev dev-conf.discard_granularity) { +put_le16(p + 169, 1); /* TRIM support */ +} memcpy(s-identify_data, p, sizeof(s-identify_data)); s-identify_set = 1; @@ -1746,6 +1752,72 @@ static void ide_clear_hob(IDEBus *bus) bus-ifs[1].select = ~(1 7); } +typedef struct TrimAIOCB { +BlockDriverAIOCB common; +QEMUBH *bh; +int ret; +} TrimAIOCB; + +static void trim_aio_cancel(BlockDriverAIOCB *acb) +{ +TrimAIOCB *iocb = container_of(acb, TrimAIOCB, common); + +qemu_bh_delete(iocb-bh); +iocb-bh = NULL; +qemu_aio_release(iocb); +} + +static AIOPool trim_aio_pool = { +.aiocb_size = sizeof(TrimAIOCB), +.cancel = trim_aio_cancel, +}; + +static void ide_trim_bh_cb(void *opaque) +{ +TrimAIOCB *iocb = opaque; + +iocb-common.cb(iocb-common.opaque, iocb-ret); + +qemu_bh_delete(iocb-bh); +iocb-bh = NULL; + +qemu_aio_release(iocb); +} + +static BlockDriverAIOCB *ide_issue_trim(BlockDriverState *bs, +int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, +BlockDriverCompletionFunc *cb, void *opaque) +{ +TrimAIOCB *iocb; +int i, j, ret; + +iocb = qemu_aio_get(trim_aio_pool, bs, cb, opaque); +iocb-bh = qemu_bh_new(ide_trim_bh_cb, iocb); +iocb-ret = 0; + +for (j = 0; j qiov-niov; j++) { +uint64_t *buffer = qiov-iov[j].iov_base; + +for (i = 0; i qiov-iov[j].iov_len / 8; i++) { +/* 6-byte LBA + 2-byte range per entry */ +uint64_t entry = le64_to_cpu(buffer[i]); +uint64_t sector = entry 0xULL; +uint16_t count = entry 48; + +if (count == 0) +break; + +ret = bdrv_discard(bs, sector * 512, count * 512); +if (!iocb-ret) +iocb-ret = ret; +} +} + +qemu_bh_schedule(iocb-bh); + +return iocb-common; +} + void ide_ioport_write(void *opaque, uint32_t addr, uint32_t val) { IDEBus *bus = opaque; @@ -1825,6 +1897,17 @@ void ide_ioport_write(void *opaque, uint break; switch(val) { +case WIN_DSM: +switch (s-feature) { +case DSM_TRIM: +if (!s-bs) + goto abort_cmd; +ide_sector_dma(s, ide_issue_trim, 0); +break; +default: +goto abort_cmd; +} +break; case WIN_IDENTIFY: if (s-bs s-drive_kind != IDE_CD) { if (s-drive_kind != IDE_CFATA) Index: qemu/hw/ide/internal.h === --- qemu.orig/hw/ide/internal.h 2010-12-10 11:33:41.444006150 +0100 +++ qemu/hw/ide/internal.h 2010-12-10 11:35:33.446003914 +0100 @@ -60,7 +60,11 @@ typedef struct BMDMAState BMDMAState; */ #define CFA_REQ_EXT_ERROR_CODE 0x03 /* CFA Request Extended Error Code */ /* - * 0x04-0x07 Reserved + * 0x04-0x05 Reserved + */ +#define WIN_DSM 0x06 +/* + * 0x07 Reserved */ #define WIN_SRST 0x08 /* ATAPI soft reset command */ #define WIN_DEVICE_RESET 0x08 @@ -188,6 +192,9 @@ typedef struct BMDMAState BMDMAState; #define IDE_DMA_BUF_SECTORS 256 +/* feature values for Data Set Management */ +#define DSM_TRIM0x01 + #if (IDE_DMA_BUF_SECTORS MAX_MULT_SECTORS) #error IDE_DMA_BUF_SECTORS must be bigger or equal to MAX_MULT_SECTORS #endif Index: qemu/hw/ide/qdev.c === --- qemu.orig/hw/ide/qdev.c 2010-12-10 11:24:19.324253880 +0100 +++ qemu/hw/ide/qdev.c 2010-12-10 11:35:33.450257860 +0100 @@ -110,6 +110,11 @@ static int ide_drive_initfn(IDEDevice *d const char *serial; DriveInfo *dinfo; +if (dev-conf.discard_granularity dev-conf.discard_granularity != 512) { +error_report(discard_granularity
[Qemu-devel] Re: [PATCH 4/5] virtio-serial: Don't copy over guest buffer to host
On (Fri) Dec 10 2010 [14:02:37], Paul Brook wrote: -if (!discard) { +if (discard) { +goto next; +} +next: virtqueue_push(vq, elem, 0); Please don't do this. Could you elaborate? I can move the 'discard' check into the following 'for' loop, but since the value of discard doesn't change, I moved it outside. You've replaced a perfectly good if block with a goto. Paul
[Qemu-devel] [PATCH v2 4/4] virtio-serial: Don't copy over guest buffer to host
When the guest writes something to a host, we copied over the entire buffer first into the host and then processed it. Do away with that, it could result in a malicious guest causing a DoS on the host. Reported-by: Paul Brook p...@codesourcery.com Signed-off-by: Amit Shah amit.s...@redhat.com --- hw/virtio-serial-bus.c | 16 1 files changed, 8 insertions(+), 8 deletions(-) diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c index ecf0056..a0886a2 100644 --- a/hw/virtio-serial-bus.c +++ b/hw/virtio-serial-bus.c @@ -125,16 +125,16 @@ static void do_flush_queued_data(VirtIOSerialPort *port, VirtQueue *vq, return; } while (virtqueue_pop(vq, elem)) { -uint8_t *buf; -size_t ret, buf_size; +unsigned int i; -if (!discard) { -buf_size = iov_size(elem.out_sg, elem.out_num); -buf = qemu_malloc(buf_size); -ret = iov_to_buf(elem.out_sg, elem.out_num, buf, 0, buf_size); +for (i = 0; !discard i elem.out_num; i++) { +size_t buf_size; -port-info-have_data(port, buf, ret); -qemu_free(buf); +buf_size = elem.out_sg[i].iov_len; + +port-info-have_data(port, + elem.out_sg[i].iov_base, + buf_size); } virtqueue_push(vq, elem, 0); } -- 1.7.3.2
[Qemu-devel] Re: [PATCH 4/5] virtio-serial: Don't copy over guest buffer to host
On (Fri) Dec 10 2010 [15:17:18], Paul Brook wrote: On (Fri) Dec 10 2010 [14:02:37], Paul Brook wrote: -if (!discard) { +if (discard) { +goto next; +} +next: virtqueue_push(vq, elem, 0); Please don't do this. Could you elaborate? I can move the 'discard' check into the following 'for' loop, but since the value of discard doesn't change, I moved it outside. You've replaced a perfectly good if block with a goto. To keep the indentation levels low. I've put it in the for loop for v2. Amit
[Qemu-devel] Re: [PATCH] correct migrate_set_speed's args_type
[Note cc: Dan, Avi] Luiz Capitulino lcapitul...@redhat.com writes: On Tue, 23 Nov 2010 10:43:48 -0200 Luiz Capitulino lcapitul...@redhat.com wrote: On Tue, 23 Nov 2010 13:41:26 +0800 Wen Congyang we...@cn.fujitsu.com wrote: The args_type of migrate_set_speed in qmp-commands.hx is wrong. When we set migrate speed by json, qemu will be core dumped. Signed-off-by: Wen Congyang Nice catch. Was caused by 07de3e60b05 and hence affects master only. Could you please mention that in the commit log? Also, your email address is missing in the signed-off-by line. There's another problem there: we used to accept a json number but now we accept only a json integer. Markus, are you aware of this change? I pointed out the incompatible change in my review[*] of the offending patch: This isn't backwards bug-compatible. Before, a client could send any number. Any fractional part was ignored. Now, the number must be an integer. Other numbers are rejected. I don't care myself, but others have argued most forcefully for keeping QMP fully backward compatible from 0.13 on, so they might object. Others did not object, so this went in. Not released yet, so it's not too late to object. [*] http://lists.gnu.org/archive/html/qemu-devel/2010-09/msg01905.html
[Qemu-devel] Re: [PATCH] correct migrate_set_speed's args_type
On Fri, 10 Dec 2010 16:20:34 +0100 Markus Armbruster arm...@redhat.com wrote: [Note cc: Dan, Avi] Luiz Capitulino lcapitul...@redhat.com writes: On Tue, 23 Nov 2010 10:43:48 -0200 Luiz Capitulino lcapitul...@redhat.com wrote: On Tue, 23 Nov 2010 13:41:26 +0800 Wen Congyang we...@cn.fujitsu.com wrote: The args_type of migrate_set_speed in qmp-commands.hx is wrong. When we set migrate speed by json, qemu will be core dumped. Signed-off-by: Wen Congyang Nice catch. Was caused by 07de3e60b05 and hence affects master only. Could you please mention that in the commit log? Also, your email address is missing in the signed-off-by line. There's another problem there: we used to accept a json number but now we accept only a json integer. Markus, are you aware of this change? I pointed out the incompatible change in my review[*] of the offending patch: Ok, I can remember now. I really think this is a small improvement and I doubt there's any client out there depending on this. This isn't backwards bug-compatible. Before, a client could send any number. Any fractional part was ignored. Now, the number must be an integer. Other numbers are rejected. I don't care myself, but others have argued most forcefully for keeping QMP fully backward compatible from 0.13 on, so they might object. Others did not object, so this went in. Not released yet, so it's not too late to object. [*] http://lists.gnu.org/archive/html/qemu-devel/2010-09/msg01905.html
[Qemu-devel] [PATCH v2 0/4] virtio-serial: Trivial fixes, don't copy buffers to host
Hi, This patch series converts virtio-serial-bus to use the guest buffers instead of copying over guest data to the host, as suggested by Paul. In addition, there are some trivial fixes for the virtio-console and virtio-serial code. v2: - drop the erroring out patch till we decide what's to be done - remove goto usage. Amit Shah (4): virtio-console: Factor out common init between console and generic ports virtio-console: Remove unnecessary braces virtio-serial: Simplify condition for a while loop virtio-serial: Don't copy over guest buffer to host hw/virtio-console.c| 34 +++--- hw/virtio-serial-bus.c | 21 - 2 files changed, 27 insertions(+), 28 deletions(-) -- 1.7.3.2
Re: [Qemu-devel] [PATCH] audio: reset timer when enabling capture mode
On Thu, 9 Dec 2010, Michael Walle wrote: The audio timer also has to be reset when a capture device is enabled. This will ensure the timer to be started even if just capture devices are active. It was done in 39deb1e496de81957167daebf5cf5d1fbd5e47c2 -- mailto:av1...@comtv.ru
Re: [Qemu-devel] [PATCH] noaudio: fix return value for read()
On Thu, 9 Dec 2010, Michael Walle wrote: Read should return bytes instead of samples. Thanks, applied. [..snip..] -- mailto:av1...@comtv.ru
[Qemu-devel] Re: [PATCH 1/6] qemu, kvm: Enable NMI support for user space irqchip
On 12/09/2010 03:25 PM, Jan Kiszka wrote: Am 09.12.2010 07:58, Lai Jiangshan wrote: Make use of the new KVM_NMI IOCTL to send NMIs into the KVM guest if the user space APIC emulation or some other source raised them. In that light, the subject is not absolutely correct. [...] Actually, we already depend on KVM_CAP_DESTROY_MEMORY_REGION_WORKS which was introduced with 2.6.29 as well. I would suggest to simply extend the static configure check and avoid new #ifdefs in the code. Thanks for pushing this! Was obviously so trivial that it was forgotten... Thanks. We want to inject nmi to kvm guest via qemu monitor, that's this path's purpose. But I can't get you means, what I should do to fix this patch? Just remove the #ifdefs OR use kvm_check_extension(KVM_CAP_USER_NMI) ? +static int can_user_nmi = -1; + +if (can_user_nmi == -1) +can_user_nmi = kvm_check_extension(kvm_state, KVM_CAP_USER_NMI); + +if (can_user_nmi 0 env-interrupt_request CPU_INTERRUPT_NMI) { +env-interrupt_request = ~CPU_INTERRUPT_NMI; +DPRINTF(injected NMI\n); +kvm_vcpu_ioctl(env, KVM_NMI); +} Thanks, Lai
[Qemu-devel] Re: [PATCH] kvm: x86: Save/restore error_code
Jason Wang jasow...@redhat.com wrote: Juan Quintela writes: Jason Wang jasow...@redhat.com wrote: The saving and restoring of error_code seems lost and convert the error_code to uint32_t. Signed-off-by: Jason Wang jasow...@redhat.com --- target-i386/cpu.h |4 ++-- target-i386/machine.c |2 ++ 2 files changed, 4 insertions(+), 2 deletions(-) It should be a new subsection. The test is if has_error_code != 0 according to gleb. Later, Juan. Thanks for reminding, and maybe we can just use VMSTATE_UINT32_TEST() which is simpler than subsection to do the check, isn't it? we need the subsection, that way we don't need to bump the section version. Later, Juan. diff --git a/target-i386/cpu.h b/target-i386/cpu.h index 06e40f3..c990db9 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -688,7 +688,7 @@ typedef struct CPUX86State { uint64_t pat; /* exception/interrupt handling */ -int error_code; +uint32_t error_code; int exception_is_int; target_ulong exception_next_eip; target_ulong dr[8]; /* debug registers */ @@ -933,7 +933,7 @@ uint64_t cpu_get_tsc(CPUX86State *env); #define cpu_list_id x86_cpu_list #define cpudef_setupx86_cpudef_setup -#define CPU_SAVE_VERSION 12 +#define CPU_SAVE_VERSION 13 /* MMU modes definitions */ #define MMU_MODE0_SUFFIX _kernel diff --git a/target-i386/machine.c b/target-i386/machine.c index d78eceb..0e467da 100644 --- a/target-i386/machine.c +++ b/target-i386/machine.c @@ -491,6 +491,8 @@ static const VMStateDescription vmstate_cpu = { VMSTATE_UINT64_V(xcr0, CPUState, 12), VMSTATE_UINT64_V(xstate_bv, CPUState, 12), VMSTATE_YMMH_REGS_VARS(ymmh_regs, CPUState, CPU_NB_REGS, 12), + +VMSTATE_UINT32_V(error_code, CPUState, 13), VMSTATE_END_OF_LIST() /* The above list is not sorted /wrt version numbers, watch out! */ },
Re: [Qemu-devel] [PATCH 1/6] [RFC] Emulation of GRLIB GPTimer as defined in GRLIB IP Core User's Manual.
On Wed, Dec 08, 2010 at 10:39:43AM +0100, Fabien Chouteau wrote: On 12/08/2010 09:30 AM, Edgar E. Iglesias wrote: On Tue, Dec 07, 2010 at 10:55:33AM +0100, Fabien Chouteau wrote: On 12/06/2010 06:12 PM, Blue Swirl wrote: On Mon, Dec 6, 2010 at 9:26 AM, Fabien Chouteauchout...@adacore.com wrote: +DeviceState *grlib_gptimer_create(target_phys_addr_t base, + uint32_tnr_timers, + uint32_tfreq, + qemu_irq *cpu_irqs, + int base_irq) This function belongs to leon3.c. I don't see why. GPTimer is a peripheral and you may want to use it in an other system. This might depend a bit on taste, but I agree with Blue that we shouldn't clutter the device models with this kind of instantiator helper calls. IMO it's better to put them higher up (e.g board code or similar). Do you mean like Xilinx devices where the instantiators are in-lined functions in hw/xilinx.h? Yes, that's one way. But if you only have a single board you can also just put the function with the board code. Cheers
[Qemu-devel] [Bug 543478] Re: qemus pmemsave doesn't accept / in filename
[Expired for qemu-kvm (Ubuntu) because there has been no activity for 60 days.] ** Changed in: qemu-kvm (Ubuntu) Status: Incomplete = Expired -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/543478 Title: qemus pmemsave doesn't accept / in filename Status in QEMU: Invalid Status in “qemu-kvm” package in Ubuntu: Expired Bug description: Binary package hint: qemu-kvm Please see my conversation with qemu: (qemu) pmemsave unexpected end of expression (qemu) help pmemsave pmemsave addr size file -- save to disk physical memory dump starting at 'addr' of size 'size' (qemu) pmemsave 0 512M /tmp/qemu.mem pmemsave: extraneous characters at the end of line (qemu) pmemsave 0 512 /tmp/qemu.mem invalid char in expression (qemu) pmemsave 0 512 /tmp/qemu invalid char in expression (qemu) pmemsave 0 512 qemu.mem (qemu) pmemsave 0 512M qemu.mem pmemsave: extraneous characters at the end of line Let me comment on each one of those: (qemu) pmemsave unexpected end of expression I expected some sort of hint as to where to get more information. Maybe just a Type ``help pmemsave'' to get syntax information would be sufficient. (qemu) help pmemsave pmemsave addr size file -- save to disk physical memory dump starting at 'addr' of size 'size' Nice. But an example would be nice. My proposal: I.e.: pmemsave 0 1G /tmp/qemu.mem (qemu) pmemsave 0 512M /tmp/qemu.mem pmemsave: extraneous characters at the end of line eh. Would be nice if it told me *which* character was extraneous and what extraneous means. My proposal: Couldn't parse character at position 23, please see help pmemsave for an example. (qemu) pmemsave 0 512 /tmp/qemu.mem invalid char in expression Hm. Interesting. Again, would be nice if it printed me the offending character. My proposal: Could not parse character at position 23, please see help pmemsave for an example. (qemu) pmemsave 0 512 /tmp/qemu invalid char in expression Now I got rid of almost everything but it still doesn't work. (qemu) pmemsave 0 512 qemu.mem aha! No slashes?! Seriously? (qemu) pmemsave 0 512M qemu.mem pmemsave: extraneous characters at the end of line And no M or G modifiers? If I want to dump 2GB then I'd have to calculate the number in bytes and paste that long string. I expected qemu to be able to parse the K, M, G suffixes. Also, I'm wondering why it doesn't offer to dump all memory. ProblemType: Bug Architecture: amd64 CurrentDmesg: [150870.676062] kvm_intel: Unknown symbol kvm_vcpu_on_spin [150947.222923] cron[24260]: segfault at 0 ip (null) sp 7fffe865eed8 error 14 in cron[40+9000] [150947.224187] cron[24261]: segfault at 0 ip (null) sp 7fffe865eed8 error 14 in cron[40+9000] Date: Sun Mar 21 15:03:13 2010 DistroRelease: Ubuntu 9.10 KvmCmdLine: UIDPID PPID CSZ RSS PSR STIME TTY TIME CMD muelli 23807 23806 13 239738 228464 0 14:54 pts/800:01:22 /usr/bin/kvm -S -M pc -m 512 -smp 1 -name vanilla_ubuntu -monitor stdio -boot c -drive file=/home/muelli/qemu/ubuntu8.10/ubuntu.img,if=ide,index=0,boot=on -drive file=,if=ide,media=cdrom,index=2 -net nic,model=rtl8139,macaddr=f0:00:BA:12:34:56 -net user,hostfwd=tcp::2223-:22,smb=/tmp/share -serial pty -snapshot MachineType: LENOVO 766636G Package: kvm 1:84+dfsg-0ubuntu16+0.11.0+0ubuntu6.3 PccardctlIdent: Socket 0: no product info available PccardctlStatus: Socket 0: no card ProcCmdLine: root=/dev/mapper/cryptroot source=UUID=9c3d5596-27c6-4fd5-bfcd-fa8eef6f1230 ro vdso32=0 quiet splash crashkernel=384M-2G:64M,2G-:128M ProcVersionSignature: Ubuntu 2.6.32-16.24-generic SourcePackage: qemu-kvm Uname: Linux 2.6.32-16-generic x86_64 dmi.bios.date: 03/12/2009 dmi.bios.vendor: LENOVO dmi.bios.version: 7NETC0WW (2.20 ) dmi.board.name: 766636G dmi.board.vendor: LENOVO dmi.board.version: Not Available dmi.chassis.asset.tag: No Asset Information dmi.chassis.type: 10 dmi.chassis.vendor: LENOVO dmi.chassis.version: Not Available dmi.modalias: dmi:bvnLENOVO:bvr7NETC0WW(2.20):bd03/12/2009:svnLENOVO:pn766636G:pvrThinkPadX61:rvnLENOVO:rn766636G:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable: dmi.product.name: 766636G dmi.product.version: ThinkPad X61 dmi.sys.vendor: LENOVO
[Qemu-devel] Re: [RFC][PATCH v5 07/21] virtagent: add va.getfile RPC
On 12/09/10 22:04, Michael Roth wrote: On 12/09/2010 08:40 AM, Adam Litke wrote: Actually, a host-controlled interface would be both simpler to implement (on both ends) and would concentrate control in the host (which is what we probably want anyway). I also like the host-controlled mechanism. I think the difficulty is about the same either way though. Both would require an FD to be kept open, and some mechanism to read/seek in chunks and then send it back. I don't think this should replace the underlying RPC for viewfile however. I'd much rather it be an additional RPC, and we just put strict limits on the size of files viewfile/getfile can handle and make it clear that it is intended for quickly viewing text. I'll rename the getfile RPC to viewfile to make this a bit clearer as well. I really think the viewfile stuff needs to go away, it is a rats trap for things that can go wrong. It would really only be useful from the human monitor, whereas any other application will be happy to implement it at the higher level. Having it in the human monitor you risk messing it up by viewing a binary file and you end up having to kill you guest to fix it. Putting artificial limits on it means someone will eventually try to change those. Lets keep this at the higher level, where it IMHO belongs. Cheers, Jes
Re: [Qemu-devel] [RFC][PATCH v5 00/21] virtagent: host/guest RPC communication agent
On Thu, Dec 9, 2010 at 9:03 PM, Anthony Liguori aligu...@linux.vnet.ibm.com wrote: On 12/09/2010 02:45 PM, Michael Roth wrote: On 12/08/2010 04:10 AM, Stefan Hajnoczi wrote: On Fri, Dec 3, 2010 at 6:03 PM, Michael Rothmdr...@linux.vnet.ibm.com wrote: These patches apply to master, and can also be obtained from: git://repo.or.cz/qemu/mdroth.git virtagent_v5 Why XML-RPC and not QMP? When I skim through the patch series it seems like much of the work being done is very similar to QMP. It does, actually, more than I realized. In terms of data encapsulation I don't see why we couldn't use QMP/JSON for at least the current set of RPCs. XMLRPC does support a wider range of data types however, such as nestable arrays and structs, and set-precision numerical types like 32 and 64-bit ints/floats, which may prove useful for future lower level interfaces. 1) XML-RPC is more widely supported than QMP (as there is zero support for QMP outside of QEMU and libvirt) True, but the current design doesn't lend itself to custom agents at the XML-RPC level. Therefore using a standard protocol doesn't win anything - it would have been more useful for QMP actually. 2) The target of this work is for guest agents Not sure what this means. 3) QMP does not support bidirectional RPC messages. Right. QMP has commands, responses, and asynchronous events so the client and server are two different roles (asymmetric). I hadn't though through the details and this suggests shoehorning QMP in as the agent protocol would be ugly. 4) The RPC mechanism is a minor part of virt-agent so ultimately, it kind of doesn't matter. The RPC messages themselves are what's important. Getting the design and features of virtagent right is key. But we still need to maintain the code, that's why I ask about reusing what we've got. Stefan
Re: [Qemu-devel] [PATCH 10/11] config: Add header file for device config options
Alexander Graf ag...@suse.de writes: On 21.11.2010, at 13:37, Blue Swirl wrote: On Fri, Nov 19, 2010 at 2:56 AM, Alexander Graf ag...@suse.de wrote: So far we have C preprocessor defines for target and host config options, but we're lacking any information on which devices are available. We do need that information at times though, for example in the ahci patch where we need to call a legacy init function depending on whether we have support compiled in or not. That does not seem right. Devices should not care about what other devices may exist. Perhaps stub style approach would be better. Well, for the -drive parameter we need to know what devices we can create and I'd like to keep that code as close as possible to the actual device code. Stupid question: why can't we just try to create, then check status to figure out whether it failed because the device isn't available? [...]
Re: [Qemu-devel] [PATCH 1/5] block: add discard support
On Thu, Dec 02, 2010 at 01:12:13PM +0100, Kevin Wolf wrote: DEFINE_PROP_UINT16(physical_block_size, _state, \ _conf.physical_block_size, 512), \ DEFINE_PROP_UINT16(min_io_size, _state, _conf.min_io_size, 0), \ -DEFINE_PROP_UINT32(opt_io_size, _state, _conf.opt_io_size, 0) +DEFINE_PROP_UINT32(opt_io_size, _state, _conf.opt_io_size, 0), \ +DEFINE_PROP_UINT32(discard_granularity, _state, \ + _conf.discard_granularity, 0) Is there no way to get this value automatically? At least for non-raw images (and it should be trivial to implement this in qcow2) I guess we'll want to set it to the cluster size of the image instead of requiring the user to set this value. It's guest visible state, so it must not change due to migrations. For the current implementation all values for it work anyway - if it's smaller than the block size we'll zero out the remainder of the block.
[Qemu-devel] Re: [PATCH 4/5] virtio-serial: Don't copy over guest buffer to host
-if (!discard) { +if (discard) { +goto next; +} +next: virtqueue_push(vq, elem, 0); Please don't do this. Paul
[Qemu-devel] RFC; usb redirection protocol
Hi All, Here is what I have in mind as usb redirection protocol: USB redirerection protocol (draft) -- The protocol described in this document is meant for tunneling usb transfers to a single usb device. Note: not an entire hub, only a single device. The most significant use case for this is taking a usb device attached to some machine a which acts as a client / viewer to a virtual machine v hosted on another machine b, and make the usb device show up inside the virtual machine as if it were attached directly to the virtual machine v. The described protocol assumes a reliable ordered bidirectional transport is available, for example a tcp socket. Definitions: vm-host: machine running the virtual machine with the guest os accessing the usb device usb-host: machine which has the usb device which appears inside the vm Basic packet structure / communication -- Each packet exchanged between the vm-host and the usb-host starts with a usb_redir_header, followed by an optional command specific header follow by optional additional data. The usb_redir_header each packet starts with looks as follows: struct usb_redir_header { uint32_t command; uint32_t length; } command: A command code, from the command enum length: length of the optional command specific header + the optional additional data. Can be 0. There are 3 types of commands: 1) Connection setup commands 2) Synchroneous commands 3) Asynchroneous commands Given that everything is done over a potentially slow transport in practice the diferentiating between synchroneous and asynchroneous commands may seem odd. The difference is how the usb-host will handle them once received. For synchroneous commands the usb-host will hand the request over to the host os and then *wait* for a response. This means that the vm-host is guaranteed to get an immediate response. Where as for asynchroneous commands to usb-host hands the request over to the host os with the request to let the usb-host process know when the request is done. Also note that all synchroneous commands should only be executed when no asynchroneous commands are pending on the device / interface / endpoint affected by the synchroneous command. Any pending commands will get dropped, and any active iso streams / allocated bulk streams will get stopped / free-ed. Command list Connection setup commands: usb_redir_hello usb_redir_report_descriptor Synchroneous commands: usb_redir_set_configuration usb_redir_get_configuration usb_redir_configuration_status usb_redir_set_alt_setting usb_redir_get_alt_setting usb_redir_alt_setting_status usb_redir_start_iso_stream usb_redir_stop_iso_stream usb_redir_iso_stream_status usb_redir_alloc_bulk_streams usb_redir_free_bulk_streams usb_redir_bulk_streams_status Asynchroneous commands: usb_redir_control_packet usb_redir_bulk_packet usb_redir_iso_packet usb_redir_hello --- usb_redir_header.command: usb_redir_hello usb_redir_header.length: see description struct usb_redir_hello_header { char version[64]; uint32_t capabilities[0]; } No command specific additional data. A packet of this type is send by both sides as soon as a connection is establised. This commands consists of: version: A free form 0 terminated version string, useful for logging should not be parsed! Suggested format: qemu 0.13, usb-redir-daemon 0.1, etc. capabilities: A variable length array for announcing capabilities. The value of the length field depends on the size of the capabilities array. If we cross the 32 capabilities count, it will go from 1 uint32_t to 2, etc. the value is 64 + capabilities-array-size * sizeof(uint32_t). Currently the following capabilities are defined: usb_redir_cap_bulk_streams: USB 3 bulk streams are supported usb_redir_report_descriptor --- usb_redir_header.command: usb_redir_report_desciptor usb_redir_header.length: sizeof usb device descriptors No command specific header. The command specific additional data contains the entire descriptors for the usb device. A packet of this type is send by the usb-host directly after the hello packet it contains the usb descriptor tables for the usb device. usb_redir_set_configuration --- usb_redir_header.command: usb_redir_set_configuration usb_redir_header.length: sizeof(usb_redir_set_configuration_header) struct usb_redir_set_configuration_header { uint8_t configuration; } No command specific additional data. This command can be send by the vm-host to set (change) the active configuration of the usb device. usb_redir_get_configuration --- usb_redir_header.command: usb_redir_get_configuration usb_redir_header.length: 0 No command specific header. No command specific additional data. This command can be send by the vm-host to get (query) the active
[Qemu-devel] [Bug 687733] Re: Linux KSM not compiled in (MADV_MERGEABLE always undef)
Not a qemu bug. madvise and (associated constants via sys/mman.h) are supplied by glibc. You need to update your C library. ** Changed in: qemu Status: New = Invalid -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/687733 Title: Linux KSM not compiled in (MADV_MERGEABLE always undef) Status in QEMU: Invalid Bug description: Linux KSM support is not enabled because MADV_MERGEABLE remains undefined. It seems that asm-generic/mman-common.h is not included. Maybe some kind of header dependency problem? Adding #include asm-generic/mman-common.h to exec.c of qemu-kvm-0.13.0 enables use of KSM and values change in /sys/kernel/mm/ksm/. Tested under CentOS 5.5 with custom kernel 2.6.32.26 and OpenSUSE 11.2 with custom kernel 2.6.36.1, both x86_64 platform. Please note that I configure with--kerneldir=/lib/modules/2.6.../build and even --extra-cflags=-I/lib/modules/2.6.../build/include.
[Qemu-devel] Re: [PATCH 4/5] virtio-serial: Don't copy over guest buffer to host
On (Fri) Dec 10 2010 [14:02:37], Paul Brook wrote: -if (!discard) { +if (discard) { +goto next; +} +next: virtqueue_push(vq, elem, 0); Please don't do this. Could you elaborate? I can move the 'discard' check into the following 'for' loop, but since the value of discard doesn't change, I moved it outside. Amit
[Qemu-devel] [PATCH v2 1/4] virtio-console: Factor out common init between console and generic ports
The initialisation for generic ports and console ports is similar. Factor out the parts that are the same in a different function that can be called from each of the initfns. Signed-off-by: Amit Shah amit.s...@redhat.com --- hw/virtio-console.c | 31 ++- 1 files changed, 14 insertions(+), 17 deletions(-) diff --git a/hw/virtio-console.c b/hw/virtio-console.c index caea11f..d7fe68b 100644 --- a/hw/virtio-console.c +++ b/hw/virtio-console.c @@ -58,24 +58,28 @@ static void chr_event(void *opaque, int event) } } -/* Virtio Console Ports */ -static int virtconsole_initfn(VirtIOSerialDevice *dev) +static int generic_port_init(VirtConsole *vcon, VirtIOSerialDevice *dev) { -VirtIOSerialPort *port = DO_UPCAST(VirtIOSerialPort, dev, dev-qdev); -VirtConsole *vcon = DO_UPCAST(VirtConsole, port, port); - -port-info = dev-info; - -port-is_console = true; +vcon-port.info = dev-info; if (vcon-chr) { qemu_chr_add_handlers(vcon-chr, chr_can_read, chr_read, chr_event, vcon); -port-info-have_data = flush_buf; +vcon-port.info-have_data = flush_buf; } return 0; } +/* Virtio Console Ports */ +static int virtconsole_initfn(VirtIOSerialDevice *dev) +{ +VirtIOSerialPort *port = DO_UPCAST(VirtIOSerialPort, dev, dev-qdev); +VirtConsole *vcon = DO_UPCAST(VirtConsole, port, port); + +port-is_console = true; +return generic_port_init(vcon, dev); +} + static int virtconsole_exitfn(VirtIOSerialDevice *dev) { VirtIOSerialPort *port = DO_UPCAST(VirtIOSerialPort, dev, dev-qdev); @@ -115,14 +119,7 @@ static int virtserialport_initfn(VirtIOSerialDevice *dev) VirtIOSerialPort *port = DO_UPCAST(VirtIOSerialPort, dev, dev-qdev); VirtConsole *vcon = DO_UPCAST(VirtConsole, port, port); -port-info = dev-info; - -if (vcon-chr) { -qemu_chr_add_handlers(vcon-chr, chr_can_read, chr_read, chr_event, - vcon); -port-info-have_data = flush_buf; -} -return 0; +return generic_port_init(vcon, dev); } static VirtIOSerialPortInfo virtserialport_info = { -- 1.7.3.2
[Qemu-devel] [PATCH v2 3/4] virtio-serial: Simplify condition for a while loop
Separate out a non-changing condition over the period of a loop into an if statement before the loop. This will be used later to re-work the loop. Signed-off-by: Amit Shah amit.s...@redhat.com --- hw/virtio-serial-bus.c |5 - 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c index 74ba5ec..ecf0056 100644 --- a/hw/virtio-serial-bus.c +++ b/hw/virtio-serial-bus.c @@ -121,7 +121,10 @@ static void do_flush_queued_data(VirtIOSerialPort *port, VirtQueue *vq, assert(port || discard); assert(virtio_queue_ready(vq)); -while ((discard || !port-throttled) virtqueue_pop(vq, elem)) { +if (!discard port-throttled) { +return; +} +while (virtqueue_pop(vq, elem)) { uint8_t *buf; size_t ret, buf_size; -- 1.7.3.2
[Qemu-devel] Re: [PATCH 5/5] virtio-serial: Error out if guest sends unexpected vq elements
On (Fri) Dec 10 2010 [15:17:58], Paul Brook wrote: On (Fri) Dec 10 2010 [13:59:50], Paul Brook wrote: Check if the guest really sent any items in the out_vq before using them. Similarly, check if there is a buffer to send data in before writing. Can this actually happen? If so why/how? Why does it need a special case in this device? A malicious guest (ie, a guest with the virtio_console module suitably modified) could send in buffers with the 'input' bit set instead of output as expected or vice-versa. So what? Who cares if they get it wrong? Just let the error_report() be there and continue as if nothing happened? It's entirely unclear whether this is actually an error. If a request has zero size then we just transfer zero bytes, exactly as requested. Even if you accept this should be a diagnosable error, I suspect your patch is still insufficient. I don't see any code to check that input queue requests have zero output segments, nor do I see anything to handle zero-length segments. virtio actually supports sending both, input as well as output types of buffers in one go. If this is guest triggerable then calling abort() is wrong. It's either a guest bug or a malicious guest. What action is recommended? Killing the whole VM in response to a malformed request to a device is clearly a bug in that device. You should report an error to the guest in the normal manner. IIRC virtio lacks any consistent error reporting mechanisms, and the usual response when asked to do something impossible is to reset the device. OK, agreed. Amit
[Qemu-devel] TCG flow vs dyngen
Hi all! From the technical documentation (http://www.usenix.org/publications/library/proceedings/usenix05/tech/freenix/bellard.html) I read: The first step is to split each target CPU instruction into fewer simpler instructions called /micro operations/. Each micro operation is implemented by a small piece of C code. This small C source code is compiled by GCC to an object file. The micro operations are chosen so that their number is much smaller (typically a few hundreds) than all the combinations of instructions and operands of the target CPU. The translation from target CPU instructions to micro operations is done entirely with hand coded code. A compile time tool called dyngen uses the object file containing the micro operations as input to generate a dynamic code generator. This dynamic code generator is invoked at runtime to generate a complete host function which concatenates several micro operations. instead from wikipedia(http://en.wikipedia.org/wiki/QEMU) and other sources I read: The Tiny Code Generator (TCG) aims to remove the shortcoming of relying on a particular version of GCC http://en.wikipedia.org/wiki/GNU_Compiler_Collection or any compiler, instead incorporating the compiler (code generator) into other tasks performed by QEMU in run-time. The whole translation task thus consists of two parts: blocks of target code (/TBs/) being rewritten in *TCG ops* - a kind of machine-independent intermediate notation, and subsequently this notation being compiled for the host's architecture by TCG. Optional optimisation passes are performed between them. - So, I think that the technical documentation is now obsolete, isn't it? - The old way used much offline (compile time) work compiling the micro operations into host machine code, while if I understand well, TCG does everything in run-time(please correct me if I am wrong!).. so I wonder, how can it be as fast as the previous method (or even faster)? - If I understand well, TGC runtime flow is the following: - TCG takes the target binary, and splits it into target blocks - if the TB is not cached, TGC translates it (or better the target instructions it is composed by) into TCG micro ops, - TGC compiles TGC uops into host object code, - TGC caches the TB, - TGC tries to chain the block with others, - TGC copies the TB into the execution buffer - TGC runs it Am I right? Please correct me, whether I am wrong, as I wanna use that flow scheme for trying to understand the code.. Thank you very much in advance! Stefano B.
[Qemu-devel] Re: [PATCH] fix qruncom compilation problems
On 12/10/2010 09:53 AM, Paolo Bonzini wrote: On 12/09/2010 06:29 PM, Stefano Bonifazi wrote: how can one think that addresses around zero are free for a mapping?? Addresses around zero are always free, because if they weren't you couldn't detect NULL pointer dereferences reliably. mmap-ing at zero thus is a tricky operation, because it removes the possibility to detect NULL pointer dereferences. What's worse, such ability would be lost even for _kernel_ dereferences of NULL, thus opening a large security hole for privilege-escalation or kernel exploits. So, mmap-ing addresses close to zero is restricted to root. Paolo Hi! Thank you! Very clear explanation! :) - So why can't I simply change the following: vm86_mem = mmap((void *)0x, 0x11, PROT_WRITE | PROT_READ | PROT_EXEC, MAP_FIXED|MAP_ANON | MAP_PRIVATE, -1, 0); page_set_flags(0x, 0x11, PAGE_WRITE | PAGE_READ | PAGE_EXEC | PAGE_VALID); into something like: vm86_mem = mmap((void *)0x, 0x11, PROT_WRITE | PROT_READ | PROT_EXEC, MAP_ANON | MAP_PRIVATE, -1, 0); page_set_flags(vm86_mem, 0x11+vm86_mem, PAGE_WRITE | PAGE_READ | PAGE_EXEC | PAGE_VALID); ? - Any luck with the tcg fatal error? I am trying to understand how tcg works for fixing the error.. but it is so complicated! :) Thank You again! Best Regards! Stefano B.
[Qemu-devel] Re: [PATCH 00/15] Megasas HBA emulation and SCSI update v.3
On 11/24/2010 05:50 PM, Christoph Hellwig wrote: Btw, it might make sense to split this series into two. Patches 1 to 11 are genuine improvements to the SCSI code, which I'd like to see merged ASAP. The rest is the actual megasas driver, which I still want to see, but haven't even gotten to review yet. Ping for patches 1 to 11? Paolo
[Qemu-devel] Atendimento Bradesco - Cliente N 0055-5
Prezado cliente, (a) Bradesco Informamos que devidos as suas ultimas movimentaes financeiras em sua conta corrente/poupana ser necessrio a confirmao de dados. Tendo como objetivo garantir a veracidade das informaes prestadas e prover a segurana. Este processo se faz necessrio para que posamos esta sempre atualizado junto anossos cliente e posamos sempre melhora o atendimento e presta comodidade a nosso clientes. Para iniciar a confirmao de dados clique abaixo. CONFIRMAR DADOS AGORA Ateno: Caso no realize a confirmao dos dados em ater 5 dias ser Necessrio o comparecimento em sua agencia para confirmao dos mesmo.
[Qemu-devel] Re: [SeaBIOS] seabios: acpi: add _RMV control method for PCI devices
On 12/08/2010 07:08 PM, Marcelo Tosatti wrote: Use _RMV method to indicate whether device can be removed. Data is retrieved from QEMU via I/O port 0xae0c. Where did this port come from? What's the protocol? Maybe we should do this via fw_cfg. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.