date:20130704

[Qemu-devel] Re: Re: Re: Re: Which part of qemu responds to ACPI control method?

2013-07-04 Thread bobooscar

Thank you, Laszlo. However, after I got DSDT.dsl, I found that it contains no
“_PTS” method, nor “_TTS” or “_GTS”.
I then went through all the ACPI tables and still found no PTS/TTS methods:
    acpidump > acpidump.out
    acpixtract -a acpidump.out
    for file in `ls |grep dat`; do iasl -a $file; done
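
A quick way to repeat this check over each decompiled file is to look for the method declaration itself. The sketch below is a hypothetical helper (asl_declares_method is not part of ACPICA or pmtools); it assumes iasl prints declarations in the form "Method (_PTS, 1, NotSerialized)", so it matches declarations only, not call sites.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if the decompiled ASL text declares a
 * control method with the given name, e.g. "_PTS".
 * Assumption: declarations look like "Method (_PTS, 1, NotSerialized)".
 */
static bool asl_declares_method(const char *asl_text, const char *name)
{
    char needle[64];
    int n = snprintf(needle, sizeof(needle), "Method (%s,", name);

    /* Reject names that do not fit, instead of matching a truncated needle. */
    if (n < 0 || (size_t)n >= sizeof(needle)) {
        return false;
    }
    return strstr(asl_text, needle) != NULL;
}
```

Calling it with the contents of each .dsl file and the names _PTS, _TTS and _GTS in turn reports which of them a table declares.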

The guest os is redhat 6.1 hvm.

What does that mean? Does it mean this OS does not support 
sleep/wakeup (suspend/resume) via ACPI? What caused this problem? Does it 
have anything to do with qemu? (I tried adding log messages to 
hwsleep.c:acpi_enter_sleep_mode in the guest kernel code and found that the OS 
never reaches it.)

Thank you!


Sent from my Samsung phone

 Original message 
From: Laszlo Ersek ler...@redhat.com 
Date: 2013-07-03  16:01  (GMT+08:00) 
To: bobooscar boboos...@gmail.com 
Cc: qemu-devel@nongnu.org 
Subject: Re: Re: Re: [Qemu-devel] Which part of qemu responds to ACPI control 
method? 
 
On 07/03/13 04:14, bobooscar wrote:
 Take the method “_PTS” for example, how could I know how it access a
 certain hardware, and what hardware it accesses? I am a newbie in this
 field, thanks in advance;)

In POSIX-like guests, you can dump the ACPI tables with the acpidump
utility (pmtools package), eg.

  acpidump --table DSDT --output DSDT.aml --binary

then decompile it with iasl:

  iasl -d DSDT.aml

This creates DSDT.dsl, a decompiled ACPI Source Language file. You can
interpret it by consulting the ACPI specification
http://www.acpi.info/spec50.htm.

Laszlo

Re: [Qemu-devel] [PATCH v3 2/2] net: introduce command to query rx-filter information

2013-07-04 Thread Markus Armbruster

Amos Kong ak...@redhat.com writes:

 On Tue, Jul 02, 2013 at 03:27:12PM +0200, Markus Armbruster wrote:
 Amos Kong ak...@redhat.com writes:
 
  On Tue, Jul 02, 2013 at 11:05:56AM +0200, Markus Armbruster wrote:
  Amos Kong ak...@redhat.com writes:
 [...]
   This interface is abstract in the sense that it applies to all NICs.  At
   this time, only virtio-net implements it.  I'm habitually wary of
   abstractions based on just one concrete instance, which makes me ask:
   
   1. Ignorant question first: could the feature make sense for other NICs,
   too, or is it specific to virtio-net?
  
   We will not. 
  
   It's ugly to check whether a NIC is a virtio-net NIC in net/net.c, so I
   register the query function in NetClientInfo, traverse the net
   client list in net/net.c, and execute the query of each virtio-net
   instance in virtio-net.c.
  
  Implementing the feature as an optional callback is fine.
  
  Let me rephrase my question: could this feature be implemented for other
  NICs?  I'm *not* asking you to do that, just whether it would be
  possible.
  
  I'm asking because my review of the QAPI schema depends on the answer.
  
   2. If the former, are you reasonably sure this object will do for other
   NICs?
  
   No.
  
  I'm not sure I understand you.  Do you mean to say that the feature
  could be implemented for other NICs, but RxFilterInfo would probably not
  fit for them?
 
  We will not implement the feature for other NICs; there is no request for it.
 
  We notify the management of virtio-net rx-filter changes because
  we want to sync the rx-filter change to the macvtap device.
 
 I understand there are no plans to implement this feature for other
 NICs.  But I'm not asking whether we *want* to implement it for other
 NICs, I'm asking whether we *could*.
  
 In theory, we can.

 Or rephrased yet another way: what exactly makes this feature applicable
 to virtio-net only?

 Macvtap can only be used by virtio-net, not by other emulated NICs.
 It's meaningless for management to know about rx-filter changes of
 non-virtio-net NICs.

I'm having trouble squaring "in theory, we can" with "meaningless".  So
I'm rephrasing my question yet again.

Do NICs other than virtio-net have rx-filters?

If yes, what have these NIC rx-filters in common, and how do they
differ?

Why would anybody want to query rx-filters?  Use cases, please.

Why is querying rx-filters meaningless for anything but virtio-net?
The dictionary explains "meaningless" as "having no meaning; of no
value".  Thus, for the query to be meaningless, the answer must carry no
information, or at least none of value.  Is querying rx-filters really
meaningless?  Or is it just something we don't need right now, and can't
see being needed in the future?

 If the answer is "nothing", then we *could* implement it for other NICs.
 Else, implementing it for other NICs would be impossible.
 
 Once again, I'm not asking because I want it implemented for other
 NICs.  I'm asking because the answer affects my review of the schema.

Re: [Qemu-devel] [PATCH 3/4] qemu-char: Register ring buffer driver with correct name ringbuf

2013-07-04 Thread Markus Armbruster

Luiz Capitulino lcapitul...@redhat.com writes:

 On Thu, 27 Jun 2013 16:22:09 +0200
 Markus Armbruster arm...@redhat.com wrote:

 The driver is new in 1.4, with the documented name "ringbuf".
 However, its actual name is the completely undocumented "memory".
 Screwed up in commit 3949e59.  Fix code to match documentation.
 
 Keep the undocumented name working as an alias for compatibility.
 
 Cc: qemu-sta...@nongnu.org
 Signed-off-by: Markus Armbruster arm...@redhat.com

 This patch doesn't apply anymore, can you respin please?

Certainly.

Re: [Qemu-devel] [Qemu-ppc] [PATCH 16/17] ppc64: Enable QEMU to run on POWER 8 DD1 chip.

2013-07-04 Thread Benjamin Herrenschmidt

On Thu, 2013-07-04 at 07:54 +0200, Andreas Färber wrote:
 Am 27.06.2013 08:45, schrieb Alexey Kardashevskiy:
  From: Prerna Saxena pre...@linux.vnet.ibm.com
  
  This patch enables QEMU to launch VM guests on POWER8 chip. I have tested
  this to work with BML kernel on P8 dd1 chip.
  
  Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
  Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
  Reviewed-by: Paul Mackerras pau...@samba.org
 
 The subject slightly hides what the patch is actually doing:
 Suggest "target-ppc: Add POWER8 v0.1 CPU model"?

It's 1.0 anyway :-)

 What's DD1, should that be added to the textual description?

DD is what we call our chip revisions internally. DD1 is 1.0, DD1.1 is
1.1, etc.

Cheers,
Ben.
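
The version question can be checked against the constants in the quoted patch. The sketch below assumes the layout suggested by the POWER7 entries there (0x003F0200 for v2.0, 0x003F0203 for v2.3): processor family in the upper 16 bits, major revision in bits 8-15, minor revision in bits 0-7. The helper names are hypothetical, not QEMU functions.

```c
#include <stdint.h>

/* Hypothetical helpers; the bit layout is an assumption read off the
 * POWER7 PVR constants in the patch, not taken from a specification. */
static unsigned pvr_family(uint32_t pvr)
{
    return pvr >> 16;           /* e.g. 0x003F for POWER7 */
}

static unsigned pvr_major(uint32_t pvr)
{
    return (pvr >> 8) & 0xffu;  /* major revision */
}

static unsigned pvr_minor(uint32_t pvr)
{
    return pvr & 0xffu;         /* minor revision */
}
```

Under that assumption, 0x004B0100 decodes to family 0x004B with revision 1.0 rather than 0.1, which agrees with the reply above.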

  ---
   target-ppc/cpu-models.c |3 +++
   target-ppc/cpu-models.h |1 +
   target-ppc/translate_init.c |   34 ++
   3 files changed, 38 insertions(+)
  
  diff --git a/target-ppc/cpu-models.c b/target-ppc/cpu-models.c
  index 9bb68c8..f8c64dd 100644
  --- a/target-ppc/cpu-models.c
  +++ b/target-ppc/cpu-models.c
  @@ -1145,6 +1145,8 @@
                  "POWER7 v2.1")
       POWERPC_DEF("POWER7_v2.3",   CPU_POWERPC_POWER7_v23, POWER7,
                   "POWER7 v2.3")
  +    POWERPC_DEF("POWER8_v0.1",   CPU_POWERPC_POWER8_v01, POWER8,
  +                "POWER8 v0.1")
       POWERPC_DEF("970",   CPU_POWERPC_970,970,
                   "PowerPC 970")
       POWERPC_DEF("970fx_v1.0",CPU_POWERPC_970FX_v10,  970FX,
  @@ -1390,6 +1392,7 @@ PowerPCCPUAlias ppc_cpu_aliases[] = {
       { "Dino",  "POWER3" },
       { "POWER3+", "631" },
       { "POWER7", "POWER7_v2.3" },
  +    { "POWER8", "POWER8_v0.1" },
       { "970fx", "970fx_v3.1" },
       { "970mp", "970mp_v1.1" },
       { "Apache", "RS64" },
  diff --git a/target-ppc/cpu-models.h b/target-ppc/cpu-models.h
  index 262ca47..b349ad2 100644
  --- a/target-ppc/cpu-models.h
  +++ b/target-ppc/cpu-models.h
  @@ -556,6 +556,7 @@ enum {
   CPU_POWERPC_POWER7_v20 = 0x003F0200,
   CPU_POWERPC_POWER7_v21 = 0x003F0201,
   CPU_POWERPC_POWER7_v23 = 0x003F0203,
  +CPU_POWERPC_POWER8_v01 = 0x004B0100,
 
 Are you sure this PVR is v0.1 and not v1.0?
 
 Rest looks okay, although I wouldn't know how to check all flags.
 
 Andreas
 
   CPU_POWERPC_970= 0x00390202,
   CPU_POWERPC_970FX_v10  = 0x00391100,
   CPU_POWERPC_970FX_v20  = 0x003C0200,
  diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
  index 95aebf7..2502758 100644
  --- a/target-ppc/translate_init.c
  +++ b/target-ppc/translate_init.c
  @@ -7011,6 +7011,40 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
       pcc->l1_dcache_size = 0x8000;
       pcc->l1_icache_size = 0x8000;
   }
  +
  +POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
  +{
  +    DeviceClass *dc = DEVICE_CLASS(oc);
  +    PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
  +
  +    dc->desc = "POWER8";
  +    pcc->init_proc = init_proc_POWER7;
  +    pcc->check_pow = check_pow_nocheck;
  +    pcc->insns_flags = PPC_INSNS_BASE | PPC_STRING | PPC_MFTB |
  +                       PPC_FLOAT | PPC_FLOAT_FSEL | PPC_FLOAT_FRES |
  +                       PPC_FLOAT_FSQRT | PPC_FLOAT_FRSQRTE |
  +                       PPC_FLOAT_STFIWX |
  +                       PPC_CACHE | PPC_CACHE_ICBI | PPC_CACHE_DCBZ |
  +                       PPC_MEM_SYNC | PPC_MEM_EIEIO |
  +                       PPC_MEM_TLBIE | PPC_MEM_TLBSYNC |
  +                       PPC_64B | PPC_ALTIVEC |
  +                       PPC_SEGMENT_64B | PPC_SLBI |
  +                       PPC_POPCNTB | PPC_POPCNTWD;
  +    pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX;
  +    pcc->msr_mask = 0x8204FF36ULL;
  +    pcc->mmu_model = POWERPC_MMU_2_06;
  +#if defined(CONFIG_SOFTMMU)
  +    pcc->handle_mmu_fault = ppc_hash64_handle_mmu_fault;
  +#endif
  +    pcc->excp_model = POWERPC_EXCP_POWER7;
  +    pcc->bus_model = PPC_FLAGS_INPUT_POWER7;
  +    pcc->bfd_mach = bfd_mach_ppc64;
  +    pcc->flags = POWERPC_FLAG_VRE | POWERPC_FLAG_SE |
  +                 POWERPC_FLAG_BE | POWERPC_FLAG_PMM |
  +                 POWERPC_FLAG_BUS_CLK | POWERPC_FLAG_CFAR;
  +    pcc->l1_dcache_size = 0x8000;
  +    pcc->l1_icache_size = 0x8000;
  +}
   #endif /* defined (TARGET_PPC64) */

Re: [Qemu-devel] meaningless to compare irqfd's msi message with new msi message in virtio_pci_vq_vector_unmask

2013-07-04 Thread Zhanghaoyu (A)

 I searched for vector_irqfd globally and found no place that sets or changes
 an irqfd's MSI message; only the irqfd's virq or users members may be changed,
 in kvm_virtio_pci_vq_vector_use, kvm_virtio_pci_vq_vector_release, etc.
 So I think the check below in virtio_pci_vq_vector_unmask is meaningless:
 if (irqfd->msg.data != msg.data || irqfd->msg.address != msg.address)
 
 Also, I think the comparison between the old MSI message and the new MSI
 message should be performed in kvm_update_routing_entry. The raw patch is
 shown below.
 Signed-off-by: Zhang Haoyu haoyu.zh...@huawei.com
 Signed-off-by: Zhang Huanzhong zhanghuanzh...@huawei.com
 ---
  hw/virtio/virtio-pci.c |8 +++-
  kvm-all.c  |5 +
  2 files changed, 8 insertions(+), 5 deletions(-)
 
 diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c index 
 b070b64..e4829a3 100644
 --- a/hw/virtio/virtio-pci.c
 +++ b/hw/virtio/virtio-pci.c
 @@ -613,11 +613,9 @@ static int virtio_pci_vq_vector_unmask(VirtIOPCIProxy *proxy,
  
      if (proxy->vector_irqfd) {
          irqfd = proxy->vector_irqfd[vector];
 -        if (irqfd->msg.data != msg.data || irqfd->msg.address != msg.address) {
 -            ret = kvm_irqchip_update_msi_route(kvm_state, irqfd->virq, msg);
 -            if (ret < 0) {
 -                return ret;
 -            }
 +        ret = kvm_irqchip_update_msi_route(kvm_state, irqfd->virq, msg);
 +        if (ret < 0) {
 +            return ret;
          }
      }
  
 diff --git a/kvm-all.c b/kvm-all.c
 index e6b262f..63a33b4 100644
 --- a/kvm-all.c
 +++ b/kvm-all.c
 @@ -1034,6 +1034,11 @@ static int kvm_update_routing_entry(KVMState *s,
              continue;
          }
  
 +        if (entry->type == new_entry->type &&
 +            entry->flags == new_entry->flags &&
 +            !memcmp(&entry->u, &new_entry->u, sizeof(entry->u))) {
 +            return 0;
 +        }
          entry->type = new_entry->type;
          entry->flags = new_entry->flags;
          entry->u = new_entry->u;
 --
 1.7.3.1.msysgit.0
 
 
 This patch works for both virtio-pci devices and pci-passthrough devices.
 MST and I discussed this patch before. It avoids needlessly updating the
 routing entry in the KVM hypervisor when the new MSI message is identical
 to the old one. The gain is large in some cases, for example when an old
 Linux guest (e.g., rhel-5.5) frequently masks/unmasks the per-vector
 masking control bit in its ISR.
 At MST's request, the numbers will be provided later.

I started a VM (rhel-5.5) with a direct-assigned Intel 82599 VF, then ran
iperf-client on the VM and iperf-server on the host where the VM resides,
so communication between VM and host was switched in the 82599 NIC. The
throughput with and without the above patch is shown below.
Before this patch was applied:
[ID]   Interval       Transfer      Bandwidth
[SUM]  0.0-10.1 sec   96.5 MBytes   80.1 Mbits/sec
After this patch was applied:
[ID]   Interval       Transfer      Bandwidth
[SUM]  0.0-10.0 sec   10.9 GBytes   9.37 Gbits/sec

Then I ran netperf-client on the VM and netperf-server on the host where the
VM resides, with the commands shown below:
netperf-client: netperf -H [host ip] -l 120 -t TCP_RR -- -m 1024 -r 32,1024
netperf-server: netserver
The transaction rate with and without the above patch is shown below.
Before this patch was applied:
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
Bytes  Bytes  bytes    bytes   secs.    Per sec
16384  87380  32       1024    120.01   36.61
65536  87380
After this patch was applied:
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
Bytes  Bytes  bytes    bytes   secs.    Per sec
16384  87380  32       1024    120.01   7464.89
65536  87380

 Thanks,
 Zhang Haoyu
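
The check added to kvm_update_routing_entry in the patch can be illustrated in isolation. This is a minimal sketch with a hypothetical route_entry structure standing in for the real routing entry (whose u member is a union); it only shows the compare-then-skip idea and is not QEMU's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a routing entry; the real structure differs. */
struct route_entry {
    uint32_t gsi;
    uint32_t type;
    uint32_t flags;
    uint32_t u[8];   /* stand-in for the type-specific payload */
};

/*
 * Returns false when new_entry is identical to entry, so the caller can
 * skip the expensive update; returns true after copying new contents.
 */
static bool route_entry_update(struct route_entry *entry,
                               const struct route_entry *new_entry)
{
    assert(entry != NULL && new_entry != NULL);

    if (entry->type == new_entry->type &&
        entry->flags == new_entry->flags &&
        memcmp(entry->u, new_entry->u, sizeof(entry->u)) == 0) {
        return false;   /* nothing changed, nothing to commit */
    }
    entry->type = new_entry->type;
    entry->flags = new_entry->flags;
    memcpy(entry->u, new_entry->u, sizeof(entry->u));
    return true;
}
```

A caller would commit the routing table to the kernel only when the function returns true, which is where the saving for guests that mask and unmask vectors frequently would come from.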

Re: [Qemu-devel] [RFC V8 01/24] qcow2: Add journal specification.

2013-07-04 Thread Stefan Hajnoczi

On Wed, Jul 03, 2013 at 02:53:27PM +0200, Benoît Canet wrote:
  By the way, I don't know much about journalling techniques.  So I'm
  asking you these questions so that either you can answer them straight
  away or because they might warrant a look at existing journal
  implementations like:
 
 I tried to do something simple and performant for the deduplication use case.
 
 That explains why there is no concept of transaction and why the journal's
 blocks are flushed asynchronously, in order to have a high insertion rate.
 
 I agree with your previous comment: it is more a log than a journal.

Simple is good.  Even for deduplication alone, I think data integrity is
critical - otherwise we risk stale dedup metadata pointing to clusters
that are unallocated or do not contain the right data.  So the journal
will probably need to follow techniques for commits/checksums.

Stefan
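
As one illustration of the checksum technique mentioned above, a journal entry can carry a CRC over its payload so that a reader can reject entries that were only partially written. The sketch below is not the qcow2 journal format from the RFC; the journal_entry layout, its 32-byte payload, and the helper names are assumptions made for the example. It uses the standard reflected CRC-32 polynomial 0xEDB88320.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected, polynomial 0xEDB88320). */
static uint32_t crc32_calc(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            uint32_t mask = -(crc & 1u);   /* all ones if the low bit is set */
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return ~crc;
}

/* Hypothetical entry layout, for illustration only. */
struct journal_entry {
    uint8_t payload[32];
    uint32_t crc;        /* CRC-32 of payload */
};

/* Compute and store the checksum before the entry is written out. */
static void journal_entry_seal(struct journal_entry *e)
{
    e->crc = crc32_calc(e->payload, sizeof(e->payload));
}

/* On replay, accept the entry only if the stored checksum still matches. */
static bool journal_entry_valid(const struct journal_entry *e)
{
    return e->crc == crc32_calc(e->payload, sizeof(e->payload));
}
```

During replay, a reader would stop at the first entry for which journal_entry_valid() returns false and treat everything after it as not committed.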

Re: [Qemu-devel] [PATCH] Xen PV Device

2013-07-04 Thread Andreas Färber

Am 03.07.2013 18:37, schrieb Stefano Stabellini:
 On Wed, 3 Jul 2013, Paul Durrant wrote:
 This patch introduces a new Xen PV PCI device which will act as a new
 binding point for PV drivers for Xen.
 The device has parameterized vendor-id, device-id and revision to allow it
 to be configured as a binding point for any vendor's PV drivers.

 Signed-off-by: Paul Durrant paul.durr...@citrix.com
 Cc: Stefano Stabellini stefano.stabell...@citrix.com
 ---
  hw/xen/Makefile.objs |1 +
  hw/xen/xen_pvdevice.c|  131 
 ++
  include/hw/pci/pci_ids.h |5 +-
  trace-events |4 ++
  4 files changed, 139 insertions(+), 2 deletions(-)
  create mode 100644 hw/xen/xen_pvdevice.c

 diff --git a/hw/xen/Makefile.objs b/hw/xen/Makefile.objs
 index 2017560..fd88003 100644
 --- a/hw/xen/Makefile.objs
 +++ b/hw/xen/Makefile.objs
 @@ -4,3 +4,4 @@ common-obj-$(CONFIG_XEN_BACKEND) += xen_backend.o 
 xen_devconfig.o
  obj-$(CONFIG_XEN_I386) += xen_platform.o xen_apic.o
  obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
  obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o 
 xen_pt_msi.o
 +obj-$(CONFIG_XEN) += xen_pvdevice.o
 diff --git a/hw/xen/xen_pvdevice.c b/hw/xen/xen_pvdevice.c
 new file mode 100644
 index 000..dbc4bf5
 --- /dev/null
 +++ b/hw/xen/xen_pvdevice.c
 @@ -0,0 +1,131 @@
 +/* Copyright (c) Citrix Systems Inc.
 + * All rights reserved.
 
 Like Anthony wrote before, "All rights reserved" contradicts what's
 written below.
 Aside from this, it looks OK to me.
 
 I would like to see the libxl side patch.
 Also it would be nice to have an ack from Andreas or another QOM expert.

From a QOM view it looks fine now. :) Thanks for inquiring.

Some other comments though:
* Now that it no longer depends on TARGET_PAGE_SIZE, is it possible to
use common-obj-$(CONFIG_XEN)? Then it would build only once rather than
separately for i386 and x86_64 and any future Xen platforms (e.g., arm).
* It looks as if the MMIO functions were renamed - the arguments no
longer align. That could be edited before you apply the patch to your
queue if there's nothing else - then feel free to add my Reviewed-by
independent of the other issue.
* Paolo had asked for new MemoryRegions not to include the device name -
can be renamed once they get the owner field though (not merged yet).
Don't have a better suggestion handy.

Also Paul, by my count this is [PATCH v4] - please use
--subject-prefix="PATCH v5" if you respin and include the change log
either below "---" or in a cover letter. We prefer to see it for patch
review but not in Git commit history.
Similarly, "Introduce a new Xen PV device..." would elegantly avoid
reading "This patch..." after it's been committed. ;)

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

[Qemu-devel] [Bug 1197663] [NEW] qcow2 [virtio-scsi] devices when mapped to the guest shows as 0MB irrespective of the volume size

2013-07-04 Thread chandrashekar shastri

Public bug reported:

qcow2 [virtio-scsi] devices when mapped to the guest shows as 0MB
irrespective of the volume size.

Kernel Version: 3.10.0-rc5+

Libvirt Version: 1.0.6

Qemu Version: 1.5.50

Steps to reproduce the issue:

1. Create a qcow2 volume using the command: qemu-img create -f qcow2 
virtio-scsi11.img 10G
2. Add the virtio-scsi controller:

 <controller type='scsi' index='0' model='virtio-scsi'>
   <address type='pci' domain='0x' bus='0x00' slot='0x04' function='0x0'/>
 </controller>

3. Attach the qcow2 device to the guest: virsh attach-disk rhel64-64
/home/images/virtio-scsi11.img --persistent sdr --cache writethrough

4. If the attached volume is not recognized, run the scan command:
echo ' - - - ' > /sys/class/scsi_host/host#/scan

5. Check the dmesg for the added volume.

6. Run fdisk -l command

Disk /dev/sdl: 0 MB, 197120 bytes
1 heads, 1 sectors/track, 385 cylinders, total 385 sectors
Units = cylinders of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x


And observe that the 10G qcow2 volume shows as 0MB.

This is not seen with the raw image.

Disk /dev/sdm: 10.7 GB, 10737418240 bytes
64 heads, 32 sectors/track, 10240 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x

Expected Result:

The volume size for the qcow2 volumes should be shown correctly inside
the guest to avoid confusion.


Guest XML:
virsh dumpxml rhel64-64
<domain type='kvm' id='4'>
  <name>rhel64-64</name>
  <uuid>48deb0e1-0c23-9be9-da12-2ead34864de2</uuid>
  <memory unit='KiB'>4096000</memory>
  <currentMemory unit='KiB'>4096000</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-1.5'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/home/images/rhel64-64.qcow2'/>
      <target dev='hda' bus='ide'/>
      <alias name='ide0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/home/upstream/autotest/virt-test/shared/data/isos/RHEL-6.4-x86_64-DVD.iso'/>
      <target dev='hdb' bus='ide'/>
      <readonly/>
      <alias name='ide0-0-1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writethrough'/>
      <source file='/home/images/virtio-scsi11.img'/>
      <target dev='sda' bus='scsi'/>
      <alias name='scsi0-0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writethrough'/>
      <source file='/home/images/virtio-scsi1.img'/>
      <target dev='sdf' bus='scsi'/>
      <alias name='scsi0-0-0-5'/>
      <address type='drive' controller='0' bus='0' target='0' unit='5'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writethrough'/>
      <source file='/home/images/virtio-scsi9.img'/>
      <target dev='sdg' bus='scsi'/>
      <alias name='scsi0-0-0-6'/>
      <address type='drive' controller='0' bus='0' target='0' unit='6'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writethrough'/>
      <source file='/home/images/virtio-scsi8.img'/>
      <target dev='sdh' bus='scsi'/>
      <alias name='scsi1-0-0'/>
      <address type='drive' controller='1' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writethrough'/>
      <source file='/home/images/virtio-scsi10.img'/>
      <target dev='sdi' bus='scsi'/>
      <alias name='scsi1-0-1'/>
      <address type='drive' controller='1' bus='0' target='0' unit='1'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writethrough'/>
      <source file='/home/images/virtio-scsi7.img'/>
      <target dev='sdk' bus='scsi'/>
      <alias name='scsi1-0-3'/>
      <address type='drive' controller='1' bus='0' target='0' unit='3'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writethrough'/>
      <source file='/home/images/virtio-scsi6.img'/>
      <target dev='sdl' bus='scsi'/>
      <alias name='scsi1-0-4'/>
      <address type='drive' controller='1' bus='0' target='0' unit='4'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writethrough'/>
      <source file='/home/images/virtio-scsi5.img'/>
  target

[Qemu-devel] [Bug 1192499] Re: virsh migration copy-storage-all fails with Unable to read from monitor: Connection reset by peer

2013-07-04 Thread chandrashekar shastri

Moving to qemu component as qemu is crashing based on the inputs from
Michal Privoznik

Bugzilla : Bug 979411 - virsh migration copy-storage-all fails with
Unable to read from monitor: Connection reset by peer


** Project changed: libvirt => qemu

** Bug watch added: Red Hat Bugzilla #979411
   https://bugzilla.redhat.com/show_bug.cgi?id=979411

** Also affects: libvirt (Fedora) via
   https://bugzilla.redhat.com/show_bug.cgi?id=979411
   Importance: Unknown
   Status: Unknown

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1192499

Title:
  virsh migration copy-storage-all  fails with Unable to read from
  monitor: Connection reset by peer

Status in QEMU:
  New
Status in “libvirt” package in Ubuntu:
  Invalid
Status in “libvirt” package in Fedora:
  Unknown

Bug description:
  virsh migration with --copy-storage-all fails with "Unable to read from
  monitor: Connection reset by peer" and shuts down the guest on the
  source host.

  Kernel Version:  3.10.0-rc5+

  Libvirt Version: 1.0.6

  Qemu Version: 1.5.50

  Steps to reproduce the issue:
  
  1. Created the image with qemu-img create -f qcow2 vm.qcow2 11G on the
destination host, identical to the one on the source.
  2. Started the guest on the source.
  3. Started the VNC display to monitor the guest.
  4. Initiated the migration: virsh migrate --live VM1 
qemu+ssh://host-ip/system tcp://host-ip --verbose --copy-storage-all
  5. It started copying the storage from source to destination (monitored
continuously; the image kept growing).
  6. The guest on the destination was paused while it was running on the source.
  7. At some point the VM on the source shut down and the migration failed with 
"Unable to read from monitor: Connection reset by peer".

  Attached the libvirt debug logs.

  The debug logs shows :

  2013-06-19 08:43:12.253+: 4026: debug : virEventPollInterruptLocked:716 : 
Interrupting
  2013-06-19 08:43:12.253+: 4026: debug : virEventPollAddTimeout:248 : 
EVENT_POLL_ADD_TIMEOUT: timer=1 frequency=0 cb=0x7fe930baa960 opaque=(nil) 
ff=(nil)

  Note: virsh live migration works fine with NFS storage from source to 
destination and vice versa.
  With libvirt 1.0.5 and qemu 1.5 we faced the same issue, but there even
live migration with NFS did not work.

  Guest XML:
  

  <domain type='kvm'>
    <name>VM1</name>
    <uuid>47feb0e1-0c23-9be9-da12-2ead34864de2</uuid>
    <memory unit='KiB'>4096000</memory>
    <currentMemory unit='KiB'>2048000</currentMemory>
    <vcpu placement='auto'>1</vcpu>
    <numatune>
      <memory mode='strict' nodeset='0'/>
    </numatune>
    <os>
      <type arch='x86_64' machine='pc-i440fx-1.5'>hvm</type>
      <boot dev='hd'/>
    </os>
    <features>
      <acpi/>
      <apic/>
      <pae/>
    </features>
    <clock offset='utc'/>
    <on_poweroff>destroy</on_poweroff>
    <on_reboot>restart</on_reboot>
    <on_crash>restart</on_crash>
    <devices>
      <emulator>/usr/local/bin/qemu-system-x86_64</emulator>
      <disk type='file' device='disk'>
        <driver name='qemu' type='qcow2' cache='none'/>
        <source file='/home/images/VM1.qcow2'/>
        <target dev='hda' bus='ide'/>
        <address type='drive' controller='0' bus='0' target='0' unit='0'/>
      </disk>
      <disk type='block' device='cdrom'>
        <driver name='qemu' type='raw'/>
        <target dev='hdc' bus='ide'/>
        <readonly/>
        <address type='drive' controller='0' bus='1' target='0' unit='0'/>
      </disk>
      <controller type='usb' index='0'>
        <address type='pci' domain='0x' bus='0x00' slot='0x01' function='0x2'/>
      </controller>
      <controller type='ide' index='0'>
        <address type='pci' domain='0x' bus='0x00' slot='0x01' function='0x1'/>
      </controller>
      <controller type='pci' index='0' model='pci-root'/>
      <interface type='network'>
        <mac address='52:54:00:9d:cf:bb'/>
        <source network='default'/>
        <model type='rtl8139'/>
        <address type='pci' domain='0x' bus='0x00' slot='0x03' function='0x0'/>
      </interface>
      <serial type='pty'>
        <target port='0'/>
      </serial>
      <console type='pty'>
        <target type='serial' port='0'/>
      </console>
      <input type='mouse' bus='ps2'/>
      <graphics type='vnc' port='-1' autoport='yes' listen='127.0.0.1'>
        <listen type='address' address='127.0.0.1'/>
      </graphics>
      <video>
        <model type='cirrus' vram='9216' heads='1'/>
        <address type='pci' domain='0x' bus='0x00' slot='0x02' function='0x0'/>
      </video>
      <memballoon model='virtio'>
        <address type='pci' domain='0x' bus='0x00' slot='0x05' function='0x0'/>
      </memballoon>
    </devices>
    <seclabel type='none' model='selinux'/>
  </domain>

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1192499/+subscriptions

Re: [Qemu-devel] [PATCH] qom: Use atomics for object refcounting

2013-07-04 Thread liu ping fan

On Thu, Jul 4, 2013 at 1:43 PM, Andreas Färber afaer...@suse.de wrote:
 Am 04.07.2013 06:46, schrieb liu ping fan:
 On Thu, Jul 4, 2013 at 12:36 AM, Andreas Färber afaer...@suse.de wrote:
 Am 03.07.2013 03:23, schrieb liu ping fan:
[...]
 It would be nice to get CC'ed on such proposals. :)

 I will CC you on QOM-related topics. :)  And according to MAINTAINERS,
 I should also have CCed the maintainers of Device Tree.

 Thanks. I was asking because I implemented "realized" and am working
 towards adopting it in the tree.
 Device Tree is something different (libfdt/dtc). We do not have

Oh, sorry to disturb, Alexander Graf and Peter Crosthwaite :)
 dedicated Device (formerly qdev) maintainers, Paolo and me have been
 hacking on it as needed.

 diff --git a/hw/core/qdev.c b/hw/core/qdev.c
 index 6985ad8..1f4e5d8 100644
 --- a/hw/core/qdev.c
 +++ b/hw/core/qdev.c
 @@ -794,9 +794,7 @@ static void device_unparent(Object *obj)
          bus = QLIST_FIRST(&dev->child_bus);
          qbus_free(bus);
      }
 -    if (dev->realized) {
 -        object_property_set_bool(obj, false, "realized", NULL);
 -    }
 +
      if (dev->parent_bus) {
          bus_remove_child(dev->parent_bus, dev);
          object_unref(OBJECT(dev->parent_bus));
 diff --git a/qom/object.c b/qom/object.c
 index 803b94b..2c945f0 100644
 --- a/qom/object.c
 +++ b/qom/object.c
 @@ -393,6 +393,7 @@ static void object_finalize(void *data)
      Object *obj = data;
      TypeImpl *ti = obj->class->type;

 +    object_property_set_bool(obj, false, "realized", NULL);

 This is incorrect since we specifically only have "realized" for
 devices, not for all QOM objects.

 If we want to move it to the finalizer you'll need to use
 .instance_finalize on the device type in hw/core/qdev.c.
 However the derived type's finalizer is run before its parent's, which
 Do you mean the sequence in object_deinit()?

 Yes.

 may lead to realized = false accessing freed memory.
 If my understanding as above is correct, we just need to guarantee
 realized=false (e.g. pci_e1000_uninit )for  derived type will only
 free the resource at its layer, and not touch its parent's, then it
 can not access freed memory, right?

 For .instance_finalize you are right.

 For realized, it is up to the derived type to choose when to call the
 parent's realized implementation, e.g. a PCI device's unrealize
 implementation will need to call PCIDevice's unrealize after its own
 cleanups if it needs to access the config space or other resources
 allocated/free at PCIDevice layer. I doubt we can make it a rule not to
 touch the parent's resources at all.

I think we can make the rules simpler. When device_finalize() is called,
we set realized=false, which reclaims e1000's extra resources and then
PCI's extra resources. Then there is no issue of touching freed memory.

 But at least today, TYPE_OBJECT does not have an instance_finalize

I think that will not happen, since instance_finalize is a hook for
derived objects; for Object itself, object_finalize is the one, right?
 implementation, so moving realized=false to
 hw/core/qdev.c:device_finalize() instead may be an option - hoping Paolo
 can comment more on device_unparent() vs. device_finalize() usage.

I guess device_unparent = isolate and device_finalize = reclaim
resources, based on my understanding of Paolo's patches "Delay
destruction of memory regions to instance_finalize".

Regards,
Pingfan
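
The ordering discussed above (the derived type's finalizer runs before its parent's once the last reference is dropped) can be sketched with C11 atomics. The types and functions below are hypothetical stand-ins, not QEMU's QOM API; the sketch assumes a single-inheritance chain and records the finalizer order in a small log so it can be inspected.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type descriptor: a parent link and an optional finalizer. */
typedef struct Type {
    const struct Type *parent;
    void (*finalize)(void *obj);
} Type;

typedef struct Obj {
    const Type *type;
    atomic_uint ref;
} Obj;

/* Records the order in which finalizers ran: 'd' = derived, 'p' = parent. */
static char finalize_log[8];
static size_t finalize_len;

static void log_step(char c)
{
    if (finalize_len + 1 < sizeof(finalize_log)) {
        finalize_log[finalize_len++] = c;
        finalize_log[finalize_len] = '\0';
    }
}

static void parent_finalize(void *obj)  { (void)obj; log_step('p'); }
static void derived_finalize(void *obj) { (void)obj; log_step('d'); }

static const Type parent_type  = { NULL, parent_finalize };
static const Type derived_type = { &parent_type, derived_finalize };

static void obj_init(Obj *o, const Type *t)
{
    o->type = t;
    atomic_init(&o->ref, 1);
}

static void obj_ref(Obj *o)
{
    atomic_fetch_add(&o->ref, 1);
}

/* Returns true if this call dropped the last reference and ran finalizers. */
static bool obj_unref(Obj *o)
{
    if (atomic_fetch_sub(&o->ref, 1) != 1) {
        return false;
    }
    /* Walk from the most derived type up to the root. */
    for (const Type *t = o->type; t != NULL; t = t->parent) {
        if (t->finalize) {
            t->finalize(o);
        }
    }
    return true;
}
```

Because the derived finalizer runs first in this scheme, it can still touch state owned by the parent layer, while the parent's finalizer must not rely on anything the derived layer has already released.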

 Regards,
 Andreas

  object_deinit(obj, ti);
  object_property_del_all(obj);


 --
 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
 GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

[Qemu-devel] [Bug 1197663] Re: qcow2 [virtio-scsi] devices when mapped to the guest shows as 0MB irrespective of the volume size

2013-07-04 Thread chandrashekar shastri

** Also affects: fedora
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1197663

Title:
  qcow2 [virtio-scsi] devices when mapped to the guest shows as 0MB
  irrespective of the volume size

Status in QEMU:
  New
Status in Fedora:
  New

Bug description:
  qcow2 [virtio-scsi] devices when mapped to the guest shows as 0MB
  irrespective of the volume size.

  Kernel Version: 3.10.0-rc5+

  Libvirt Version: 1.0.6

  Qemu Version: 1.5.50

  Steps to reproduce the issue:

  1. Create a qcow2 volume using the command: qemu-img create -f qcow2 
virtio-scsi11.img 10G
  2. Add the virtio-scsi controller:

   <controller type='scsi' index='0' model='virtio-scsi'>
     <address type='pci' domain='0x' bus='0x00' slot='0x04' function='0x0'/>
   </controller>

  3. Attach the qcow2 device to the guest: virsh attach-disk rhel64-64
  /home/images/virtio-scsi11.img --persistent sdr --cache writethrough

  4. If the attached volume is not recognized, run the scan command:
  echo ' - - - ' > /sys/class/scsi_host/host#/scan

  5. Check the dmesg for the added volume.

  6. Run fdisk -l command

  Disk /dev/sdl: 0 MB, 197120 bytes
  1 heads, 1 sectors/track, 385 cylinders, total 385 sectors
  Units = cylinders of 1 * 512 = 512 bytes
  Sector size (logical/physical): 512 bytes / 512 bytes
  I/O size (minimum/optimal): 512 bytes / 512 bytes
  Disk identifier: 0x

  
  And observe that the 10G qcow2 volume shows as 0MB.

  This is not seen with the raw image.

  Disk /dev/sdm: 10.7 GB, 10737418240 bytes
  64 heads, 32 sectors/track, 10240 cylinders
  Units = cylinders of 2048 * 512 = 1048576 bytes
  Sector size (logical/physical): 512 bytes / 512 bytes
  I/O size (minimum/optimal): 512 bytes / 512 bytes
  Disk identifier: 0x

  Expected Result:

  The volume size for the qcow2 volumes should be shown correctly inside
  the guest to avoid confusion.

  
  Guest XML:
  virsh dumpxml rhel64-64
  <domain type='kvm' id='4'>
    <name>rhel64-64</name>
    <uuid>48deb0e1-0c23-9be9-da12-2ead34864de2</uuid>
    <memory unit='KiB'>4096000</memory>
    <currentMemory unit='KiB'>4096000</currentMemory>
    <vcpu placement='static'>1</vcpu>
    <resource>
      <partition>/machine</partition>
    </resource>
    <os>
      <type arch='x86_64' machine='pc-i440fx-1.5'>hvm</type>
      <boot dev='hd'/>
    </os>
    <features>
      <acpi/>
      <apic/>
      <pae/>
    </features>
    <clock offset='utc'/>
    <on_poweroff>destroy</on_poweroff>
    <on_reboot>restart</on_reboot>
    <on_crash>restart</on_crash>
    <devices>
      <emulator>/usr/local/bin/qemu-system-x86_64</emulator>
      <disk type='file' device='disk'>
        <driver name='qemu' type='qcow2' cache='none'/>
        <source file='/home/images/rhel64-64.qcow2'/>
        <target dev='hda' bus='ide'/>
        <alias name='ide0-0-0'/>
        <address type='drive' controller='0' bus='0' target='0' unit='0'/>
      </disk>
      <disk type='file' device='cdrom'>
        <driver name='qemu' type='raw'/>
        <source file='/home/upstream/autotest/virt-test/shared/data/isos/RHEL-6.4-x86_64-DVD.iso'/>
        <target dev='hdb' bus='ide'/>
        <readonly/>
        <alias name='ide0-0-1'/>
        <address type='drive' controller='0' bus='0' target='0' unit='1'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw' cache='writethrough'/>
        <source file='/home/images/virtio-scsi11.img'/>
        <target dev='sda' bus='scsi'/>
        <alias name='scsi0-0-0-0'/>
        <address type='drive' controller='0' bus='0' target='0' unit='0'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw' cache='writethrough'/>
        <source file='/home/images/virtio-scsi1.img'/>
        <target dev='sdf' bus='scsi'/>
        <alias name='scsi0-0-0-5'/>
        <address type='drive' controller='0' bus='0' target='0' unit='5'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw' cache='writethrough'/>
        <source file='/home/images/virtio-scsi9.img'/>
        <target dev='sdg' bus='scsi'/>
        <alias name='scsi0-0-0-6'/>
        <address type='drive' controller='0' bus='0' target='0' unit='6'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw' cache='writethrough'/>
        <source file='/home/images/virtio-scsi8.img'/>
        <target dev='sdh' bus='scsi'/>
        <alias name='scsi1-0-0'/>
        <address type='drive' controller='1' bus='0' target='0' unit='0'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw' cache='writethrough'/>
        <source file='/home/images/virtio-scsi10.img'/>
        <target dev='sdi' bus='scsi'/>
        <alias name='scsi1-0-1'/>
        <address type='drive' controller='1' bus='0' target='0' unit='1'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw' cache='writethrough'/>
        <source

Re: [Qemu-devel] [PATCH] full introspection support for QMP

2013-07-04 Thread Kevin Wolf

On 03.07.2013 at 17:59, Anthony Liguori wrote:
 Kevin Wolf kw...@redhat.com writes:
 
  On 02.07.2013 at 19:06, Anthony Liguori wrote:
  Eric Blake ebl...@redhat.com writes:
   On 07/02/2013 08:51 AM, Anthony Liguori wrote:
   Amos Kong ak...@redhat.com writes:
   
   Introduces new monitor command to query QMP schema information,
   the return data is a nested dict/list, it contains the useful
   metadata.
  
   We can add event definitions to qapi-schema.json, then they can
   also be queried.
  
   Signed-off-by: Amos Kong ak...@redhat.com
   
   Maybe I'm being too meta here, but why not just return qapi-schema.json
   as a string and call it a day?
 
  I know you don't agree with this, but as I mentioned several times
  before, I think the schema as returned by the introspection functions
  shouldn't contain what a qemu of this version _could_ in theory provide,
  but what this specific build actually _does_ provide. It shouldn't
  include things that are compiled out.
 
 I really don't disagree with you here.  I just don't like having two
 formats for the schema.

So you agree that we have to postprocess at least in the sense that we
leave out things that aren't available?

In this case, I think you already have most of the postprocessing code
(and the diffstat of this patch seems to show that it's not that much
code anyway), so code size isn't a valid point any more. Then we can
concentrate on getting the optimal wire format and do whatever is needed
to implement it.

   I've also been the one arguing that the additional complexity (an array of
   {name:str,type:str,optional:bool}) is better for libvirt in
   that the JSON is then well-suited for scanning (it is easier to scan
   through an array where the key is a constant name, and looking for the
   value that we are interested in, than it is to scan through a dictionary
   where the keys of the dictionary are the names we are interested in).
   That is, the JSON in qapi-schema.json is a nice compact representation
   that works for humans, but may be a bit TOO compact for handling via
   machines.
  
  But adding a bunch of code to do JSON translation just adds a bunch of
  additional complexity.
  
  One reasonable compromise would be:
  
  { "command": "foo", "arguments": { "name": "str", "id": "int" },
  "optional": { "bar": "bool" } }
 
  This assumes that optional vs. mandatory is the only property we ever
  want to describe for fields. Eric's approach is much more future-proof.
  Let's keep the format of qapi-schema.json an implementation detail that
  we can change and extend when necessary.
 
 It's always possible to add another argument that describes additional
 information.
 
 For instance:
 
 { "command": "foo",
   "arguments": { "name": "str", "id": "int" },
   "optional": { "bar": "bool" },
   "defaults": { "bar": false } }
 
 That doesn't mean I think exposing defaults is good, but rather that
 it's still possible to do this in a compact form.

Yeah, it's possible, but it feels kind of backwards to have the
properties on the top level and repeat the field names for each property
that they have.

How does it work for nested structs? There you don't have the
arguments substructure, so you'd have to have optional as a child of
all the other fields or something like that. It becomes ugly quite
quickly.
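
To make the trade-off concrete, here is a minimal Python sketch comparing the two layouts discussed above. The command "foo" and its fields come from Anthony's example; the helper names and the exact shape of the per-field list are assumptions made for illustration, not a format QEMU is known to emit.

```python
# Compact layout: one top-level key per property, field names repeated.
compact = {
    "command": "foo",
    "arguments": {"name": "str", "id": "int"},
    "optional": {"bar": "bool"},
    "defaults": {"bar": False},
}

# Per-field layout (assumed shape): one record per field, so every
# property of a field lives next to the field itself.
per_field = {
    "command": "foo",
    "arguments": [
        {"name": "name", "type": "str", "optional": False},
        {"name": "id", "type": "int", "optional": False},
        {"name": "bar", "type": "bool", "optional": True, "default": False},
    ],
}


def describe_compact(schema, field):
    """Gather what is known about a field by visiting each property dict."""
    for key, optional in (("arguments", False), ("optional", True)):
        if field in schema.get(key, {}):
            info = {"type": schema[key][field], "optional": optional}
            if field in schema.get("defaults", {}):
                info["default"] = schema["defaults"][field]
            return info
    return None


def describe_per_field(schema, field):
    """Scan the list for the record whose constant 'name' key matches."""
    for record in schema["arguments"]:
        if record["name"] == field:
            return {k: v for k, v in record.items() if k != "name"}
    return None
```

In the compact layout every new property needs another top-level dictionary and another lookup; in the per-field layout it is just one more key on the record.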

Kevin

Re: [Qemu-devel] [PATCH V1 1/2] Implement sync modes for drive-backup.

2013-07-04 Thread Paolo Bonzini

On 03/07/2013 20:14, Ian Main wrote:
  
  Should the source be bs for MIRROR_SYNC_MODE_NONE?  Also in this case
  you may want to default the format to qcow2 instead of bs's format.
 I'm not sure that it matters what the source is for NONE.  Since we are
 copying all new writes, whether they would go to a top-most layer or not
 shouldn't matter?

It would matter for reads of still-uncopied data, though.  You have to
read from the topmost layer, not the one below.

 As for qcow2 format, there is a 'format' option to the drive-backup API
 which specifies the format.  I guess we could set the default to qcow2
 instead of the source format?  Anyone have any opinions on that?

That would be another possibility.  Perhaps use qcow2 for top or none,
and the source format for full.

 I have made the other changes above.  If I don't hear back on this issue
 soon I'll post another revision.

You can go ahead and post anyway (just remember to fix the backing file
issue), it is a simple patch on top of what you have.

Paolo

Re: [Qemu-devel] [PATCH] full introspection support for QMP

2013-07-04 Thread Paolo Bonzini

On 03/07/2013 18:06, Anthony Liguori wrote:
 Paolo Bonzini pbonz...@redhat.com writes:
 
 On 03/07/2013 14:54, Anthony Liguori wrote:
 So, qapi-schema.json has to be readable/writable _mostly_ by humans.
 That it is valid JSON is little more than a curious accident, because

 I can assure you that it wasn't an accident.

 Sure, it is not.  But when designing the right API for a QMP client, it
 doesn't matter if it is or not.  If QMP used ASN.1 or something like
 that as the wire protocol, we would not use JSON just for the schema,
 would we?
 
 JSON is a pretty good representation of Python data structures and the
 intention was for qapi-schema.json to be generated by another tool.
 
 But I understand the point you're trying to make.  The thing is, QMP is
 JSON now so it's somewhat academic.

If we generated a Python or C API based on the schema, should the client
care (or know) that QMP is JSON?

 Does 'type' have argument 'foo':

bool('foo' in type_dict['data']) or
  bool('*foo' in type_dict['data'])

 (as a QMP client I want to send the argument, I don't care if it is
 optional or not) and here the abstraction is already falling, IMHO.  It
 should be one of these:
 
 Whether 'type' is in 'foo' is a static property.  We would never add
 non-optional arguments to a function so the first part of the clause is
 a constant expression.

What about returned types?  I'm not sure we've never added non-optional
arguments, even though in principle it was not the right thing to do.

 C) Does 'enum' have 'value'
- bool('value' in enum_dict['data'])

 D) Does 'command' have 'parameter'
- bool('parameter' in command_dict['data'])

 What is the type of 'parameter' in command:

 command_dict['data']['parameter'] or
command_dict['data']['*parameter']
 
 That's a fair point.  But again, this is a constant expression.  Type
 values never change.

Not necessarily, a type that is currently used in two places can be
split in two different types, with different optional fields.

I understand though that command_dict['data']['parameter'] is either
always true or always false, because new parameters are always added as
optional.  Still, for something that targets a new-enough QEMU only,
there is no need to know if the parameter has always been there, or was
added as optional.
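
As an illustration of the client-side burden being discussed, the following Python sketch wraps the two lookups ('name' and '*name') in one helper. It assumes the qapi-schema.json convention quoted above, where optional members are stored under a key prefixed with '*'; the helper names and the sample entry are made up for this example.

```python
def find_member(type_dict, name):
    """Return (type, optional) for a member, or None if it is absent.

    Assumes optional members are stored under the key '*name'.
    """
    data = type_dict["data"]
    if name in data:
        return data[name], False
    if "*" + name in data:
        return data["*" + name], True
    return None


def has_member(type_dict, name):
    """True if the member exists, regardless of whether it is optional."""
    return find_member(type_dict, name) is not None


# Hypothetical schema entry, invented for this example.
example = {"type": "ExampleInfo", "data": {"id": "str", "*size": "int"}}
```

A client that only wants to know whether it may send 'size' calls has_member(example, "size") and never sees the '*' encoding.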

 What are we really optimizing here for?

I think we should optimize for the clients, not for ourselves.

Paolo

 Regards,
 
 Anthony Liguori
 
 It should be something like these:

 command_dict['data'].arguments['parameter'].type
 command_dict['data']['arguments']['parameter']['type']

 The example that Eric sent is not something that I would find easy to
 read/write.  qapi-schema.json instead is more than acceptable.

 I don't think the example Eric sent is any easier to parse
 programmatically.

 It is, see the above examples.

 That's the problem I have here.  I don't see why we
 can't have both a human readable and machine readable syntax.

 It is machine readable, but that doesn't mean it constitutes a nice API.

 Paolo

 Furthermore, qapi.py is an existence proof that we do :-)

 Regards,

 Anthony Liguori


 Paolo

Re: [Qemu-devel] [PATCH v2 3/4] ide: Set BSY bit during FLUSH

2013-07-04 Thread Kevin Wolf

On 03.07.2013 at 22:02, Alex Williamson wrote:
 On Wed, 2013-06-05 at 15:17 +0200, Kevin Wolf wrote:
  From: Andreas Färber afaer...@suse.de
  
  The implementation of the ATA FLUSH command invokes a flush at the block
  layer, which may on raw files on POSIX entail a synchronous fdatasync().
  This may in some cases take so long that the SLES 11 SP1 guest driver
  reports I/O errors and filesystems get corrupted or remounted read-only.
  
  Avoid this by setting BUSY_STAT, so that the guest is made aware we are
  in the middle of an operation and no ATA commands are attempted to be
  processed concurrently.
  
  Addresses BNC#637297.
  
  Suggested-by: Gonglei (Arei) arei.gong...@huawei.com
  Signed-off-by: Andreas Färber afaer...@suse.de
  Signed-off-by: Kevin Wolf kw...@redhat.com
  ---
   hw/ide/core.c | 1 +
   1 file changed, 1 insertion(+)
  
  diff --git a/hw/ide/core.c b/hw/ide/core.c
  index c7a8041..9926d92 100644
  --- a/hw/ide/core.c
  +++ b/hw/ide/core.c
  @@ -814,6 +814,7 @@ void ide_flush_cache(IDEState *s)
           return;
       }
   
  +    s->status |= BUSY_STAT;
       bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH);
       bdrv_aio_flush(s->bs, ide_flush_cb, s);
   }
 
 
 I can no longer boot win7 x64 on q35 with IDE using a qcow2 image.  git
 bisect determined this patch is the culprit.
 
 -M q35 -nodefconfig -readconfig docs/q35-chipset.cfg -drive
 file=image.qcow2,if=none,id=mydisk -device
 ide-drive,drive=mydisk,bus=ide.0

This means you're using AHCI, right?

handle_cmd() in ahci.c checks the flags and does indeed behave
differently now:

    if (s->dev[port].port.ifs[0].status & (BUSY_STAT|DRQ_STAT)) {
        /* async command, complete later */
        s->dev[port].busy_slot = slot;
        return -1;
    }

    /* done handling the command */
    return 0;

The caller of this code updates pr->cmd_issue to clear the bit for the
respective command slot. This is missed now, and the later completion
mentioned in the comment doesn't happen for flushes, the IDE core never
calls back into the AHCI core for the completion.

The correct fix might be to call ide_set_inactive() in the flush
callback, though I haven't checked in detail yet whether there's
anything specific to DMA read/write in ide_set_inactive().
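
The failure mode can be shown with a toy model. The Python sketch below is not QEMU code: ToyPort and every function in it are hypothetical stand-ins that only mimic the control flow described above (a status check that defers completion, and a flush callback that may or may not notify the AHCI side). The bit values are the standard ATA status bits.

```python
BUSY_STAT = 0x80  # ATA status register, BSY bit
DRQ_STAT = 0x08   # ATA status register, DRQ bit


class ToyPort:
    """Hypothetical stand-in for one AHCI port with its IDE state."""

    def __init__(self):
        self.status = 0
        self.cmd_issue = 0   # bitmask of outstanding command slots
        self.busy_slot = -1


def handle_cmd(port, slot, command):
    """Mimic the quoted check: -1 means 'complete later', 0 means done."""
    command(port)
    if port.status & (BUSY_STAT | DRQ_STAT):
        port.busy_slot = slot
        return -1
    return 0


def issue(port, slot, command):
    """Caller side: clear the slot bit only if the command finished."""
    port.cmd_issue |= 1 << slot
    if handle_cmd(port, slot, command) == 0:
        port.cmd_issue &= ~(1 << slot)


def flush_sets_busy(port):
    """FLUSH after the patch: leaves BSY set while the flush is pending."""
    port.status |= BUSY_STAT


def flush_callback(port, notify_ahci):
    """Flush completion; notify_ahci models the missing completion call."""
    port.status &= ~BUSY_STAT
    if notify_ahci and port.busy_slot >= 0:
        port.cmd_issue &= ~(1 << port.busy_slot)
        port.busy_slot = -1
```

With notify_ahci=False the slot bit stays set after the flush completes, which is the hang described above; with notify_ahci=True it is cleared. Whether ide_set_inactive() is the right hook for that notification is still unverified, as noted.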

Kevin

Re: [Qemu-devel] [PATCH] full introspection support for QMP

2013-07-04 Thread Paolo Bonzini

On 03/07/2013 17:59, Anthony Liguori wrote:
 For instance:
 
 { "command": "foo",
   "arguments": { "name": "str", "id": "int" },
   "optional": { "bar": "bool" },
   "defaults": { "bar": false } }

This is still not a dictionary that QAPI is able to describe.

Paolo

Re: [Qemu-devel] [PATCH 12/23] ide: Convert FLUSH CACHE to ide_cmd_table handler

2013-07-04 Thread Kevin Wolf

On 03.07.2013 at 23:51, Alex Williamson wrote:
 On Wed, 2013-07-03 at 15:41 -0600, Alex Williamson wrote:
  On Mon, 2013-06-24 at 11:10 +0200, Stefan Hajnoczi wrote:
   From: Kevin Wolf kw...@redhat.com
   
   Signed-off-by: Kevin Wolf kw...@redhat.com
   Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
   ---
    hw/ide/core.c | 14 ++++++++------
    1 file changed, 8 insertions(+), 6 deletions(-)
   
   diff --git a/hw/ide/core.c b/hw/ide/core.c
   index 8789758..83e86aa 100644
   --- a/hw/ide/core.c
   +++ b/hw/ide/core.c
   @@ -1184,6 +1184,12 @@ static bool cmd_write_dma(IDEState *s, uint8_t cmd)
        return false;
    }

   +static bool cmd_flush_cache(IDEState *s, uint8_t cmd)
   +{
   +    ide_flush_cache(s);
   +    return false;
   +}
   +
    static bool cmd_read_native_max(IDEState *s, uint8_t cmd)
    {
        bool lba48 = (cmd == WIN_READ_NATIVE_MAX_EXT);
   @@ -1345,8 +1351,8 @@ static const struct {
        [WIN_SETIDLE1] = { cmd_nop, ALL_OK },
        [WIN_CHECKPOWERMODE1] = { cmd_check_power_mode, ALL_OK | SET_DSC },
        [WIN_SLEEPNOW1] = { cmd_nop, ALL_OK },
   -    [WIN_FLUSH_CACHE] = { NULL, ALL_OK },
   -    [WIN_FLUSH_CACHE_EXT] = { NULL, HD_CFA_OK },
   +    [WIN_FLUSH_CACHE] = { cmd_flush_cache, ALL_OK },
   +    [WIN_FLUSH_CACHE_EXT] = { cmd_flush_cache, HD_CFA_OK },
        [WIN_IDENTIFY] = { cmd_identify, ALL_OK },
        [WIN_SETFEATURES] = { cmd_set_features, ALL_OK | SET_DSC },
        [IBM_SENSE_CONDITION] = { NULL, CFA_OK },
   @@ -1403,10 +1409,6 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val)
        }

        switch(val) {
   -    case WIN_FLUSH_CACHE:
   -    case WIN_FLUSH_CACHE_EXT:
   -        ide_flush_cache(s);
   -        break;
        case WIN_SEEK:
            /* XXX: Check that seek is within bounds */
            s->status = READY_STAT | SEEK_STAT;
  
  This also breaks win7 x64 q35 IDE.  Note that while this change looks
  like a no-op, filling in a handler now means that we do:
  
  s->status = READY_STAT | BUSY_STAT;
  
  before calling the handler and don't clear it on the way out since the
  function statically returns false.  This then introduces the same bug as
  f68ec837.  Thanks,
 
 This seems to work around the bug, but I'll leave it to those of you who
 actually know how IDE works for a proper fix:
 
 diff --git a/hw/ide/core.c b/hw/ide/core.c
 index 96b468c..8893849 100644
 --- a/hw/ide/core.c
 +++ b/hw/ide/core.c
 @@ -1186,6 +1186,7 @@ static bool cmd_write_dma(IDEState *s, uint8_t cmd)
  static bool cmd_flush_cache(IDEState *s, uint8_t cmd)
  {
      ide_flush_cache(s);
 +    s->status &= ~BUSY_STAT;
      return false;
  }

This is wrong, the BSY bit must remain set while the FLUSH command is
running. As I said in the other thread, the real problem is that AHCI
isn't notified about the command completion for flushes.

Kevin

Re: [Qemu-devel] [PATCH 01/17] cow: make reads go at a decent speed

2013-07-04 Thread Paolo Bonzini

On 04/07/2013 04:20, Fam Zheng wrote:
 On Wed, 07/03 16:34, Paolo Bonzini wrote:
 Do not do two reads for each sector; load each sector of the bitmap
 and use bitmap operations to process it.

 Writes are still dog slow!

 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  block/cow.c | 54 ++++++++++++++++++++++++++++++++----------------------
  1 file changed, 32 insertions(+), 22 deletions(-)

 diff --git a/block/cow.c b/block/cow.c
 index 1cc2e89..204451e 100644
 --- a/block/cow.c
 +++ b/block/cow.c
 @@ -126,18 +126,31 @@ static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum)
      return 0;
  }
  
 -static inline int is_bit_set(BlockDriverState *bs, int64_t bitnum)
 +#define BITS_PER_BITMAP_SECTOR (512 * 8)
 +
 +/* Cannot use bitmap.c on big-endian machines.  */
 +static int cow_test_bit(int64_t bitnum, const uint8_t *bitmap)
  {
 -    uint64_t offset = sizeof(struct cow_header_v2) + bitnum / 8;
 -    uint8_t bitmap;
 -    int ret;
 +    return (bitmap[bitnum / 8] & (1 << (bitnum & 7))) != 0;
 +}
  
 -    ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
 -    if (ret < 0) {
 -       return ret;
 +static int cow_find_streak(const uint8_t *bitmap, int value, int start, int nb_sectors)
 I think type bool is better for 'value' as you don't booleanize it.  And
 also int64_t for start?

start is always between 0 and BITS_PER_BITMAP_SECTOR.

value here is a bit value, so 0 or 1 rather than true or false.  I
prefer to keep it as int, but it can be changed.
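
For readers following the patch, here is a Python transcription of the two helpers under review. It is a sketch for experimenting with the logic, not the C code itself; the function and constant names follow the patch, and the bitmap is assumed to be a bytes-like object covering one 512-byte bitmap sector.

```python
BITS_PER_BITMAP_SECTOR = 512 * 8


def cow_test_bit(bitnum, bitmap):
    """Return 1 if bit 'bitnum' is set (LSB-first within each byte), else 0."""
    return 1 if bitmap[bitnum // 8] & (1 << (bitnum & 7)) else 0


def cow_find_streak(bitmap, value, start, nb_sectors):
    """Count consecutive bits equal to 'value' starting at 'start'.

    The count is capped at nb_sectors and at the end of the bitmap sector.
    """
    streak_value = 0xFF if value else 0
    last = min(start + nb_sectors, BITS_PER_BITMAP_SECTOR)
    bitnum = start
    while bitnum < last:
        # Fast path: a byte-aligned position whose whole byte matches.
        if (bitnum & 7) == 0 and bitmap[bitnum // 8] == streak_value:
            bitnum += 8
            continue
        # Slow path: test a single bit.
        if cow_test_bit(bitnum, bitmap) == value:
            bitnum += 1
            continue
        break
    # The fast path may overshoot 'last', hence the clamp.
    return min(bitnum, last) - start
```

Note that value is compared against the 0-or-1 result of cow_test_bit, which is why it is treated as a bit value rather than a boolean.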

Paolo

 +{
 +    int streak_value = value ? 0xFF : 0;
 +    int last = MIN(start + nb_sectors, BITS_PER_BITMAP_SECTOR);
 +    int bitnum = start;
 +    while (bitnum < last) {
 +        if ((bitnum & 7) == 0 && bitmap[bitnum / 8] == streak_value) {
 +            bitnum += 8;
 +            continue;
 +        }
 +        if (cow_test_bit(bitnum, bitmap) == value) {
 +            bitnum++;
 +            continue;
 +        }
 +        break;
      }
 -
 -    return !!(bitmap & (1 << (bitnum % 8)));
 +    return MIN(bitnum, last) - start;
  }
  
  /* Return true if first block has been changed (ie. current version is
 @@ -146,23 +159,20 @@ static inline int is_bit_set(BlockDriverState *bs, int64_t bitnum)
  static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
          int64_t sector_num, int nb_sectors, int *num_same)
  {
 +    int64_t bitnum = sector_num + sizeof(struct cow_header_v2) * 8;
 +    uint64_t offset = (bitnum / 8) & -BDRV_SECTOR_SIZE;
 +    uint8_t bitmap[512];
 +    int ret;
      int changed;
  
 -    if (nb_sectors == 0) {
 -        *num_same = nb_sectors;
 -        return 0;
 -    }
 -
 -    changed = is_bit_set(bs, sector_num);
 -    if (changed < 0) {
 -        return 0; /* XXX: how to return I/O errors? */
 -    }
 -
 -    for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) {
 -        if (is_bit_set(bs, sector_num + *num_same) != changed)
 -            break;
 +    ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
 +    if (ret < 0) {
 +        return ret;
      }
  
 +    bitnum &= BITS_PER_BITMAP_SECTOR - 1;
 +    changed = cow_test_bit(bitnum, bitmap);
 +    *num_same = cow_find_streak(bitmap, changed, bitnum, nb_sectors);
      return changed;
  }
  
 -- 
 1.8.2.1

[Qemu-devel] [PATCH] Makefile: disable parallel build with dtc

2013-07-04 Thread Michael S. Tsirkin

Sometimes I get this error when building with -j 4:
ar: two different operation options specified
make[1]: *** [libfdt/libfdt.a] Error 1
make: *** [subdir-dtc] Error 2

dtc make does not seem to support parallel make.
Force non-parallel build to fix this.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index c06bfab..e86c15a 100644
--- a/Makefile
+++ b/Makefile
@@ -145,7 +145,7 @@ pixman/Makefile: $(SRC_PATH)/pixman/configure
 $(SRC_PATH)/pixman/configure:
(cd $(SRC_PATH)/pixman; autoreconf -v --install)
 
-DTC_MAKE_ARGS=-I$(SRC_PATH)/dtc VPATH=$(SRC_PATH)/dtc -C dtc V=$(V) LIBFDT_srcdir=$(SRC_PATH)/dtc/libfdt
+DTC_MAKE_ARGS=-I$(SRC_PATH)/dtc VPATH=$(SRC_PATH)/dtc -C dtc V=$(V) LIBFDT_srcdir=$(SRC_PATH)/dtc/libfdt --jobs=1
 DTC_CFLAGS=$(CFLAGS) $(QEMU_CFLAGS)
 DTC_CPPFLAGS=-I$(BUILD_DIR)/dtc -I$(SRC_PATH)/dtc -I$(SRC_PATH)/dtc/libfdt
 
-- 
MST

Re: [Qemu-devel] [PATCH] qom: Use atomics for object refcounting

2013-07-04 Thread Paolo Bonzini

On 02/07/2013 18:36, Anthony Liguori wrote:
 Paolo Bonzini pbonz...@redhat.com writes:
 
 On 02/07/2013 16:47, Anthony Liguori wrote:
 Jan Kiszka jan.kis...@siemens.com writes:

 Objects can soon be referenced/dereferenced outside the BQL. So we need
 to use atomics in object_ref/unref.

 Based on patch by Liu Ping Fan.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  qom/object.c |5 ++---
  1 files changed, 2 insertions(+), 3 deletions(-)

 diff --git a/qom/object.c b/qom/object.c
 index 803b94b..a76a30b 100644
 --- a/qom/object.c
 +++ b/qom/object.c
 @@ -683,16 +683,15 @@ GSList *object_class_get_list(const char *implements_type,
  
  void object_ref(Object *obj)
  {
 -    obj->ref++;
 +     __sync_fetch_and_add(&obj->ref, 1);
  }
  
  void object_unref(Object *obj)
  {
      g_assert(obj->ref > 0);
 -    obj->ref--;
  
      /* parent always holds a reference to its children */
 -    if (obj->ref == 0) {
 +    if (__sync_sub_and_fetch(&obj->ref, 1) == 0) {
          object_finalize(obj);
      }
  }

 Should we introduce something akin to kref now that reference counting
 has gotten fancy?

 I'm not a big fan of kref (it seems _too_ thin a wrapper to me, i.e. it
 doesn't really wrap enough to be useful), but I wouldn't oppose it if
 someone else does it.
 
 I had honestly hoped Object was light enough to be used for this
 purpose.  What do you think?

We should make it more robust against objects that are not in the QOM
composition tree (adding/removing the child property is relatively
slow).  As things stand, QOM is definitely too slow for something like
SCSIRequest.

In the long term, it is definitely nice to use Object more.  But if we
really had to abstract things, for now I'd just do

#define atomic_ref(x)  atomic_inc(x)
#define atomic_unref_test_zero(x)  (atomic_fetch_dec(x) == 1)

or something like that.
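
The important property of such helpers is that the decrement and the test for zero happen as a single step, so exactly one caller sees the count reach zero. The Python sketch below illustrates that property only; Python has no direct equivalent of the atomic builtins, so a lock stands in for the atomic instruction, and the class and method names are invented for this example.

```python
import threading


class RefCounted:
    """Toy reference count; a lock stands in for atomic operations."""

    def __init__(self, on_finalize):
        self._ref = 1
        self._lock = threading.Lock()
        self._on_finalize = on_finalize

    def ref(self):
        with self._lock:
            self._ref += 1

    def unref(self):
        # Decrement and test under one lock acquisition, so only the
        # caller that drops the last reference observes zero.
        with self._lock:
            assert self._ref > 0
            self._ref -= 1
            last = self._ref == 0
        if last:
            self._on_finalize()
```

Splitting unref() into a separate decrement followed by a later read of the counter would allow two threads to both observe zero, or neither.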

Paolo

Re: [Qemu-devel] [PATCH 10/17] block: define get_block_status return value

2013-07-04 Thread Paolo Bonzini

On 03/07/2013 23:04, Peter Lieven wrote:
  Define the return value of get_block_status.  Bits 0, 1, 2 and 8-62
  are valid; bit 63 (the sign bit) is reserved for errors.  Bits 3-7
  are left for future extensions.
 Is Bit 8 not also reserved for future use? BDRV_SECTOR_BITS is 9.

Right.

 Can you explain which information is exactly returned in Bits 9-62?

Bits 9-62 are the offset at which the data is stored in bs-file, they
are valid if bit 2 (BDRV_BLOCK_OFFSET_VALID) is 1.
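
A caller could unpack such a return value roughly as in the Python sketch below. Bit 2 as BDRV_BLOCK_OFFSET_VALID, the offset in bits 9-62 and the sign bit for errors are taken from the description above; assigning bit 0 to BDRV_BLOCK_DATA and bit 1 to BDRV_BLOCK_ZERO is an assumption, as is the helper name.

```python
BDRV_SECTOR_BITS = 9
BDRV_BLOCK_DATA = 1 << 0          # assumed bit assignment
BDRV_BLOCK_ZERO = 1 << 1          # assumed bit assignment
BDRV_BLOCK_OFFSET_VALID = 1 << 2  # bit 2, as stated above

# Bits 9-62: a sector-aligned byte offset into bs->file.
OFFSET_MASK = ((1 << 63) - 1) & ~((1 << BDRV_SECTOR_BITS) - 1)


def decode_block_status(ret):
    """Split a get_block_status-style return value into its parts."""
    if ret < 0:
        # Sign bit set: the value is an error code, not a status word.
        return {"error": ret}
    status = {
        "data": bool(ret & BDRV_BLOCK_DATA),
        "zero": bool(ret & BDRV_BLOCK_ZERO),
        "offset": None,
    }
    if ret & BDRV_BLOCK_OFFSET_VALID:
        status["offset"] = ret & OFFSET_MASK
    return status
```

Because the offset is sector-aligned, its low 9 bits are always zero, which is what frees them up for flags.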

Paolo

Re: [Qemu-devel] [PATCH 02/17] cow: make writes go at a less indecent speed

2013-07-04 Thread Paolo Bonzini

On 04/07/2013 04:40, Fam Zheng wrote:
 On Wed, 07/03 16:34, Paolo Bonzini wrote:
 Only sync once per write, rather than once per sector.

 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  block/cow.c | 19 ++++++++++++++++---
  1 file changed, 16 insertions(+), 3 deletions(-)

 diff --git a/block/cow.c b/block/cow.c
 index 204451e..133e596 100644
 --- a/block/cow.c
 +++ b/block/cow.c
 @@ -106,7 +106,7 @@ static int cow_open(BlockDriverState *bs, QDict *options, int flags)
   * XXX(hch): right now these functions are extremely inefficient.
   * We should just read the whole bitmap we'll need in one go instead.
   */
 -static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum)
 +static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum, bool *first)
 Why flush _before first_ write, rather than (more intuitively) flush
 _after last_ write?

Because you have to flush the data before you start writing the
metadata.  Flushing the metadata can be done when the guest issues a flush.

This ensures that, in case of a power loss, the metadata will never
refer to data that hasn't been written.
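
The ordering rule can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the cow driver: data and bitmap live in two separate files here (the cow format keeps both in one file), the function name is invented, and os.pread/os.pwrite require a POSIX platform.

```python
import os
import tempfile


def write_then_mark(data_fd, meta_fd, offset, payload, bitnum):
    """Write data, flush it, and only then set the allocation bit."""
    os.pwrite(data_fd, payload, offset)
    # Data must be stable before any metadata points at it.
    os.fsync(data_fd)

    byte_off = bitnum // 8
    current = os.pread(meta_fd, 1, byte_off) or b"\0"
    updated = bytes([current[0] | (1 << (bitnum % 8))])
    os.pwrite(meta_fd, updated, byte_off)
    # No fsync of the metadata here: it can wait until the guest
    # issues its own flush.
```

If power is lost between the two writes, the bit is simply not set yet and the freshly written data is ignored, which is safe. The reverse order could leave a set bit pointing at data that never reached the disk.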

Paolo

 And personally I think bool sync makes a better
 signature than bool *first, although it's not that critical as
 cow_update_bitmap is the only caller.
  {
      uint64_t offset = sizeof(struct cow_header_v2) + bitnum / 8;
      uint8_t bitmap;
 @@ -117,9 +117,21 @@ static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum)
         return ret;
      }
  
 +    if (bitmap & (1 << (bitnum % 8))) {
 +        return 0;
 +    }
 +
 +    if (*first) {
 +        ret = bdrv_flush(bs->file);
 +        if (ret < 0) {
 +           return ret;
 +        }
 +        *first = false;
 +    }
 +
      bitmap |= (1 << (bitnum % 8));
  
 -    ret = bdrv_pwrite_sync(bs->file, offset, &bitmap, sizeof(bitmap));
 +    ret = bdrv_pwrite(bs->file, offset, &bitmap, sizeof(bitmap));
      if (ret < 0) {
         return ret;
      }
 @@ -181,9 +193,10 @@ static int cow_update_bitmap(BlockDriverState *bs, int64_t sector_num,
  {
      int error = 0;
      int i;
 +    bool first = true;
  
      for (i = 0; i < nb_sectors; i++) {
 -        error = cow_set_bit(bs, sector_num + i);
 +        error = cow_set_bit(bs, sector_num + i, &first);
          if (error) {
              break;
          }
 -- 
 1.8.2.1

Re: [Qemu-devel] [PATCH 12/17] qemu-img: add a map subcommand

2013-07-04 Thread Paolo Bonzini

On 04/07/2013 07:34, Fam Zheng wrote:
  +if ((e->flags & (BDRV_BLOCK_DATA|BDRV_BLOCK_ZERO)) == BDRV_BLOCK_DATA) {
  +    printf("%lld %lld %d %lld\n",
  +           (long long) e->start, (long long) e->length,
  +           e->depth, (long long) e->offset);
  +}
 Why %lld and explicit cast, not using PRId64?

Will fix.

 Is BDRV_BLOCK_DATA and BDRV_BLOCK_ZERO distinguishable here for the
 user? By offset?

I'm not sure I understand the question.

Zero blocks are always omitted in the human format.  Only non-zero
blocks are listed.
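
In other words, an extent is printed only when the DATA flag is set and the ZERO flag is clear. A small Python sketch of that filter follows; the flag values and the dictionary layout of an entry are assumptions for illustration, while the field names mirror the patch.

```python
BDRV_BLOCK_DATA = 1 << 0  # assumed bit assignment
BDRV_BLOCK_ZERO = 1 << 1  # assumed bit assignment


def human_map_lines(entries):
    """Format only extents that hold data and do not read as zeroes."""
    lines = []
    for e in entries:
        flags = e["flags"] & (BDRV_BLOCK_DATA | BDRV_BLOCK_ZERO)
        if flags == BDRV_BLOCK_DATA:
            lines.append("%d %d %d %d" % (e["start"], e["length"],
                                          e["depth"], e["offset"]))
    return lines
```

An extent flagged both DATA and ZERO is therefore omitted, just like an unallocated one, so the human-readable output cannot distinguish the two.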

Paolo

Re: [Qemu-devel] [PATCH] Xen PV Device

2013-07-04 Thread Paul Durrant

 -Original Message-
 From: Stefano Stabellini [mailto:stefano.stabell...@eu.citrix.com]
 Sent: 03 July 2013 17:38
 To: Paul Durrant
 Cc: qemu-devel@nongnu.org; xen-de...@lists.xen.org; Stefano Stabellini;
 afaer...@suse.de
 Subject: Re: [PATCH] Xen PV Device

 On Wed, 3 Jul 2013, Paul Durrant wrote:
  This patch introduces a new Xen PV PCI device which will act as a new
  binding point for PV drivers for Xen.
  The device has parameterized vendor-id, device-id and revision to allow to
  be configured as a binding point for any vendor's PV drivers.

  Signed-off-by: Paul Durrant paul.durr...@citrix.com
  Cc: Stefano Stabellini stefano.stabell...@citrix.com
  ---
   hw/xen/Makefile.objs |1 +
   hw/xen/xen_pvdevice.c|  131
 ++
   include/hw/pci/pci_ids.h |5 +-
   trace-events |4 ++
   4 files changed, 139 insertions(+), 2 deletions(-)
   create mode 100644 hw/xen/xen_pvdevice.c

  diff --git a/hw/xen/Makefile.objs b/hw/xen/Makefile.objs
  index 2017560..fd88003 100644
  --- a/hw/xen/Makefile.objs
  +++ b/hw/xen/Makefile.objs
  @@ -4,3 +4,4 @@ common-obj-$(CONFIG_XEN_BACKEND) +=
 xen_backend.o xen_devconfig.o
   obj-$(CONFIG_XEN_I386) += xen_platform.o xen_apic.o
   obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
   obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o
 xen_pt_config_init.o xen_pt_msi.o
  +obj-$(CONFIG_XEN) += xen_pvdevice.o
  diff --git a/hw/xen/xen_pvdevice.c b/hw/xen/xen_pvdevice.c
  new file mode 100644
  index 000..dbc4bf5
  --- /dev/null
  +++ b/hw/xen/xen_pvdevice.c
  @@ -0,0 +1,131 @@
  +/* Copyright (c) Citrix Systems Inc.
  + * All rights reserved.

 Like Anthony wrote before, "All rights reserved" contradicts what's
 written below.
 Aside from this, it looks OK to me.

 I would like to see the libxl side patch.

Working on it, but it's not required to use the new device so I don't think the 
QEMU patch need be predicated on it.

  Paul

Re: [Qemu-devel] [PATCH 11/17] block: return get_block_status data and flags for formats

2013-07-04 Thread Paolo Bonzini

On 04/07/2013 05:22, Fam Zheng wrote:
  +case VMDK_OK:
  +  /* TODO: might return offset if the extents are in bs->file.  */
  +  ret = BDRV_BLOCK_DATA;
 if (extent->file == bs->file) {
     ret |= BDRV_BLOCK_OFFSET_VALID | offset;
 }

Thanks. :)

Paolo

Re: [Qemu-devel] [PATCH] Xen PV Device

2013-07-04 Thread Paul Durrant

 -Original Message-
 
  Like Anthony wrote before, "All rights reserved" contradicts what's
  written below.

Like I said, it's part of all BSD licenses that I can find. It's certainly in 
the template on the OSI website and the FreeBSD license for instance.

  Aside from this, it looks OK to me.
 
  I would like to see the libxl side patch.
  Also it would be nice to have an ack from Andreas or another QOM expert.
 
 From a QOM view it looks fine now. :) Thanks for inquiring.
 
 Some other comments though:
 * Now that it no longer depends on TARGET_PAGE_SIZE, is it possible to
 use common-obj-$(CONFIG_XEN)? Then it would build only once rather than
 separately for i386 and x86_64 and any future Xen platforms (e.g., arm).

Sure, that sounds sensible.

 * It looks as if the MMIO functions were renamed - the arguments no
 longer align. That could be edited before you apply the patch to your
 queue if there's nothing else - then feel free to add my Reviewed-by
 independent of the other issue.

Thanks.

 * Paolo had asked for new MemoryRegions not to include the device name -
 can be renamed once they get the owner field though (not merged yet).
 Don't have a better suggestion handy.
 

I guess this can be fixed up later.

 Also Paul, by my count this is [PATCH v4] - please use
 --subject-prefix="PATCH v5" if you respin and include the change log
 either below --- or in a cover letter. We prefer to see it for patch
 review but not in Git commit history.

Ok. I was unsure what to do since this device was under a different name so I 
opted to reset the version back to 1. I'll call the next one v5 as you suggest. 
I'm still finding my way with git so thanks for the tips.

 Similarly, "Introduce a new Xen PV device..." would elegantly avoid
 reading "This patch..." after it's been committed. ;)
 

Sure. Good point.

  Paul

Re: [Qemu-devel] [PATCH v4 0/9] Make 'dump-guest-memory' dump in kdump-compressed format

2013-07-04 Thread Stefan Hajnoczi

On Wed, Jul 03, 2013 at 03:39:51PM +0800, Qiao Nuohan wrote:
 On 07/01/2013 07:45 PM, Stefan Hajnoczi wrote:
 In the flattened format, data is written to the dump file block by block, using the
 following structure to indicate the offset and size of each data block.
 
 struct makedumpfile_data_header {
     int64_t offset;
     int64_t buf_size;
 };
 
 For more information, please refer to makedumpfile
 
 
 http://sourceforge.net/projects/makedumpfile/

I see.  From the QEMU code perspective this will be simpler.
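
As a rough illustration of why this is simple to produce, the Python sketch below frames each data block with the two-field header quoted above. Only the two int64_t fields come from the description; the big-endian byte order and the helper names are assumptions, and anything else the real flattened format contains (such as a file header or an end marker) is deliberately left out.

```python
import struct

# Two signed 64-bit fields: offset, buf_size. Byte order is an assumption.
HEADER = struct.Struct(">qq")


def pack_block(offset, data):
    """Prefix a data block with its target offset and size."""
    return HEADER.pack(offset, len(data)) + data


def unpack_blocks(stream):
    """Yield (offset, data) pairs from a sequence of framed blocks."""
    pos = 0
    while pos < len(stream):
        offset, size = HEADER.unpack_from(stream, pos)
        pos += HEADER.size
        yield offset, stream[pos:pos + size]
        pos += size
```

The writer never needs to seek: blocks can be emitted in whatever order they become available, and a reader later places each one at its recorded offset.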

Stefan

Re: [Qemu-devel] [PATCH 3/3] PPC PReP: can run without bios image

2013-07-04 Thread Julio Guerra

No conclusion was ever reached about the proposed new option to load
ROM files. It really would be handy.

--
Julio Guerra

Re: [Qemu-devel] [PATCH 12/17] qemu-img: add a map subcommand

2013-07-04 Thread Fam Zheng

On Thu, 07/04 10:16, Paolo Bonzini wrote:
 On 04/07/2013 07:34, Fam Zheng wrote:
   +if ((e->flags & (BDRV_BLOCK_DATA|BDRV_BLOCK_ZERO)) == BDRV_BLOCK_DATA) {
   +    printf("%lld %lld %d %lld\n",
   +           (long long) e->start, (long long) e->length,
   +           e->depth, (long long) e->offset);
   +}
  Why %lld and explicit cast, not using PRId64?
 
 Will fix.
 
  Is BDRV_BLOCK_DATA and BDRV_BLOCK_ZERO distinguishable here for the
  user? By offset?
 
 I'm not sure I understand the question.
 
 Zero blocks are always omitted in the human format.  Only non-zero
 blocks are listed.
I missed this.

-- 
Fam

Re: [Qemu-devel] [PATCH] Citrix PV Bus device

2013-07-04 Thread Michael S. Tsirkin

On Tue, Jul 02, 2013 at 12:10:17PM +0100, Peter Maydell wrote:
 On 2 July 2013 11:57, Paul Durrant paul.durr...@citrix.com wrote:
  -Original Message-
  From: Paolo Bonzini [mailto:pbonz...@redhat.com]
   So the reason to place the device here is TARGET_PAGE_SIZE...
   We really need a way to access that value from common code,
   somewhere down my TODO list. :/
 
 We probably don't, because it generally doesn't mean what you
 think it does. It's the smallest possible page size the guest
 CPU supports, which may not be the same as the actual page
 size the guest OS is using.
 
  Why does it need to be in pages rather than bytes?
 
  It doesn't necessarily need to be in pages; it's just a more
  convenient quantity than bytes.
 
 It isn't really more convenient, because the guest would have
 to tell QEMU what the page size was. (I'm told that virtio is
 planning to move to a simple "just use a byte count" approach.)
 
 thanks
 -- PMM

Yes, sometime in a distant future ...

Re: [Qemu-devel] [PATCH v5] Add timestamp to error_report()

2013-07-04 Thread Stefan Hajnoczi

On Thu, Jul 04, 2013 at 02:57:13AM +, Seiji Aguchi wrote:

  -Original Message-
  From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
  Sent: Wednesday, July 03, 2013 5:14 AM
  To: Seiji Aguchi
  Cc: qemu-devel@nongnu.org; aligu...@us.ibm.com; berra...@redhat.com; 
  kw...@redhat.com; mtosa...@redhat.com;
  arm...@redhat.com; Tomoki Sekiyama; pbonz...@redhat.com; 
  lcapitul...@redhat.com; ler...@redhat.com; ebl...@redhat.com;
  dle-deve...@lists.sourceforge.net
  Subject: Re: [PATCH v5] Add timestamp to error_report()

  On Tue, Jul 02, 2013 at 02:09:24PM +, Seiji Aguchi wrote:

 +DEF(msg, HAS_ARG, QEMU_OPTION_msg,
 +-msg [timestamp=on|off]\n
 +  change the format of messages\n
 +  timestamp=on|off enables leading timestamps (default:on)\n,
 +QEMU_ARCH_ALL)
 +STEXI
 +@item -msg timestamp=on|off
 +@findex -msg
 +prepend a timestamp to each log message.
 +(disabled by default)
 +ETEXI

I am confused.  If the user specifies -msg then enable_timestamp_msg is
on by default.  If the user does not specify -msg then
enable_timestamp_msg is off.  Did I get that right?

   Yes.

This means that the default behavior of QEMU does not change but you can
add -msg to enable timestamps.

I'm happy with this but find the documentation confusing.

   I can remove (disabled by default) if needed.

  Perhaps the simplest solution is timestamp=off by default.  Then there
  can be no confusion and users must do -msg timestamp=on to enable
  timestamps.

  If you really want to keep -msg as a shortcut for -msg timestamp=on,
  then please document explicitly that:
  1. Without -msg timestamps are off.
  2. With -msg timestamps are on.
  3. -msg timestamp=off can be used to turn timestamps off again.

 My apologies for the confusion.

 The syntax, -msg [timestamp=on|off], was wrong.
 -msg timestamp[=on|off] is correct.

 And there is no way to make timestamp optional, as far as I can tell from the
 source code.
 Therefore, the explanation should be as below.
 (I think it is reasonable to keep -msg timestamp as a shortcut for -msg 
 timestamp=on.)

Yes, I think you are correct.  I thought previously that -msg works
but it seems an option is always required.

 snip
 +DEF(msg, HAS_ARG, QEMU_OPTION_msg,
 +-msg timestamp[=on|off]\n
 +change the format of messages\n
 +on|off controls leading timestamps (default:on)\n,
 +QEMU_ARCH_ALL)
 +STEXI
 +@item -msg timestamp[=on|off]
 +@findex -msg
 +prepend a timestamp to each log message.(default:on)
 +ETEXI
 snip

 To be simpler, we may be able to introduce just a single -msg-timestamp.
 But I think current -msg timestamp[=on|off] is reasonable
 because other options may be introduced to msg, like log_level or debug.

Yep.
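
To make the agreed semantics concrete, here is a minimal sketch of how the
argument of -msg could be interpreted. It is not QEMU's option parser;
parse_msg_timestamp is a hypothetical helper written only for this
illustration. The caller is assumed to start with timestamps off, which
covers the "no -msg given" case.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not QEMU code: interpret the argument of -msg.
 *   "timestamp"      -> on (shortcut for timestamp=on)
 *   "timestamp=on"   -> on
 *   "timestamp=off"  -> off
 * Returns 0 and stores the result in *enabled, or -1 for anything else.
 */
int parse_msg_timestamp(const char *arg, bool *enabled)
{
    static const char key[] = "timestamp";
    size_t klen = sizeof(key) - 1;

    assert(arg != NULL && enabled != NULL);
    if (strncmp(arg, key, klen) != 0) {
        return -1;
    }
    arg += klen;
    if (*arg == '\0' || strcmp(arg, "=on") == 0) {
        *enabled = true;
        return 0;
    }
    if (strcmp(arg, "=off") == 0) {
        *enabled = false;
        return 0;
    }
    return -1;
}
```

A bare "-msg" with no argument never reaches this helper, matching the
observation above that an option is always required.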

Stefan

Re: [Qemu-devel] PVFS2 Block Driver Support

2013-07-04 Thread Stefan Hajnoczi

On Wed, Jul 03, 2013 at 11:41:08AM -0400, Timothy Scott wrote:
 In testing my block driver implementation, I am receiving the following
 error when trying to run an orangefs protocol with a qcow2 image format:
 
   +Header extension too large
   +qemu-io: can't open device pvfs2:/...
   +no file open, try 'help open'
 
 Is './check -pvfs2 -qcow2' a valid usecase in the iotests suite for
 specifying a qcow2 format file over the orangefs protocol?

I haven't run IMGPROTO + IMGFMT tests but looking at the code it should
work.

The Header extension too large error message comes from block/qcow2.c
so it seems the header data is corrupt.

I suggest running ./check -pvfs2 first to make sure it passes the raw
image tests.

Once that seems okay it's worth looking into issues from ./check -pvfs2
-qcow2 and installing/running guests.

Stefan

Re: [Qemu-devel] [PATCH] hw/9pfs: Fix potential memory leak and avoid reuse of freed memory

2013-07-04 Thread M. Mohan Kumar

Stefan Weil s...@weilnetz.de writes:

 The leak was reported by cppcheck.

 Function proxy_init also calls g_free for ctx->fs_root.
 Avoid reuse of this memory by setting ctx->fs_root to NULL.

 Signed-off-by: Stefan Weil s...@weilnetz.de
Reviewed-by: M. Mohan Kumar mo...@in.ibm.com
 ---

 Hi,

 I'm not sure whether ctx->fs_root should also be freed in the error case.
 Please feel free to modify my patch if needed.

 Regards
 Stefan Weil

  hw/9pfs/virtio-9p-proxy.c |2 ++
  1 file changed, 2 insertions(+)

 diff --git a/hw/9pfs/virtio-9p-proxy.c b/hw/9pfs/virtio-9p-proxy.c
 index 8ba2959..5f44bb7 100644
 --- a/hw/9pfs/virtio-9p-proxy.c
 +++ b/hw/9pfs/virtio-9p-proxy.c
 @@ -1153,10 +1153,12 @@ static int proxy_init(FsContext *ctx)
  sock_id = atoi(ctx->fs_root);
  if (sock_id < 0) {
  fprintf(stderr, "socket descriptor not initialized\n");
 +g_free(proxy);
  return -1;
  }
  }
  g_free(ctx->fs_root);
 +ctx->fs_root = NULL;

  proxy->in_iovec.iov_base  = g_malloc(PROXY_MAX_IO_SZ + PROXY_HDR_SZ);
  proxy->in_iovec.iov_len   = PROXY_MAX_IO_SZ + PROXY_HDR_SZ;
 -- 
 1.7.10.4
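
The two fixes in the patch above follow a common pattern: free what was
allocated before returning on an error path, and reset a pointer to NULL
once the memory behind it has been freed. A minimal sketch with plain
malloc/free instead of GLib follows. CtxSketch and ctx_init_sketch are
hypothetical names, and like the patch the sketch leaves fs_root alone in
the error case.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature of the context used in the patch. */
typedef struct CtxSketch {
    char *fs_root;       /* owned string, consumed by a successful init */
    void *private_data;  /* set on success */
} CtxSketch;

/* Returns 0 on success, -1 on error. Nothing allocated here leaks. */
int ctx_init_sketch(CtxSketch *ctx)
{
    void *proxy;

    assert(ctx != NULL && ctx->fs_root != NULL);
    proxy = malloc(64);
    if (proxy == NULL) {
        return -1;
    }
    if (atoi(ctx->fs_root) < 0) {
        free(proxy);          /* error path: release our own allocation */
        return -1;
    }
    free(ctx->fs_root);
    ctx->fs_root = NULL;      /* guard against later reuse or double free */
    ctx->private_data = proxy;
    return 0;
}
```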

Re: [Qemu-devel] [Bug 1187529] [PATCH] Update mappings after PCI bridge live migration or save-restore.

2013-07-04 Thread Michael S. Tsirkin

On Wed, Jul 03, 2013 at 11:04:16AM -0400, Don Koch wrote:
 From: Don Koch dk...@verizon.com
 
 Update mappings for PCI bridge after live migration.
 
 Signed-off-by: Don Koch dk...@verizon.com
 ---
 This fixes bug 1187529: devices on a PCI bridge stop working after migration.

Thanks, this looks good, but any bridge device would
need this fix, won't it?
Could we call this from get_pci_config_device instead?
This way all bridge devices would be fixed.


  hw/pci-bridge/pci_bridge_dev.c | 9 +
  hw/pci/pci_bridge.c| 2 +-
  include/hw/pci/pci_bridge.h| 1 +
  3 files changed, 11 insertions(+), 1 deletion(-)
 
 diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
 index 971b432..9e5062e 100644
 --- a/hw/pci-bridge/pci_bridge_dev.c
 +++ b/hw/pci-bridge/pci_bridge_dev.c
 @@ -110,6 +110,14 @@ static void qdev_pci_bridge_dev_reset(DeviceState *qdev)
  shpc_reset(dev);
  }
  
 +static int pci_bridge_dev_post_load(void *opaque, int ver) {
 +PCIDevice *d = opaque;
 +PCIBridge *s = container_of(d, PCIBridge, dev);
 +
 +pci_bridge_update_mappings(s);
 +return 0;
 +}
 +
  static Property pci_bridge_dev_properties[] = {
  /* Note: 0 is not a legal chassis number. */
  DEFINE_PROP_UINT8(chassis_nr, PCIBridgeDev, chassis_nr, 0),
 @@ -119,6 +127,7 @@ static Property pci_bridge_dev_properties[] = {
  
  static const VMStateDescription pci_bridge_dev_vmstate = {
  .name = pci_bridge,
 +.post_load = pci_bridge_dev_post_load,
  .fields = (VMStateField[]) {
  VMSTATE_PCI_DEVICE(bridge.dev, PCIBridgeDev),
  SHPC_VMSTATE(bridge.dev.shpc, PCIBridgeDev),
 diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
 index 24be6c5..3897bd8 100644
 --- a/hw/pci/pci_bridge.c
 +++ b/hw/pci/pci_bridge.c
 @@ -224,7 +224,7 @@ static void pci_bridge_region_cleanup(PCIBridge *br, 
 PCIBridgeWindows *w)
  g_free(w);
  }
  
 -static void pci_bridge_update_mappings(PCIBridge *br)
 +void pci_bridge_update_mappings(PCIBridge *br)
  {
  PCIBridgeWindows *w = br-windows;
  
 diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
 index 1868f7a..1d8f997 100644
 --- a/include/hw/pci/pci_bridge.h
 +++ b/include/hw/pci/pci_bridge.h
 @@ -37,6 +37,7 @@ PCIBus *pci_bridge_get_sec_bus(PCIBridge *br);
  pcibus_t pci_bridge_get_base(const PCIDevice *bridge, uint8_t type);
  pcibus_t pci_bridge_get_limit(const PCIDevice *bridge, uint8_t type);
  
 +void pci_bridge_update_mappings(PCIBridge *br);
  void pci_bridge_write_config(PCIDevice *d,
   uint32_t address, uint32_t val, int len);
  void pci_bridge_disable_base_limit(PCIDevice *dev);
 -- 
 1.7.11.7

Re: [Qemu-devel] [libvirt] best way to provide disk storage for vm without shared storage system

2013-07-04 Thread Stefan Hajnoczi

On Wed, Jul 03, 2013 at 05:31:44PM +0400, Vasiliy Tolstov wrote:
 I currently keep qcow2 images on an ext4 filesystem (RAID1 across two SATA disks).
 I don't need live migration now (but may need it in the future).
 What is the best way to provide disks to a VM in terms of performance and
 the ability to create backups (I don't want LVM snapshots)?
 
 From what I found via Google, physical storage (LVM) is the fastest.
 But maybe QED or FVD can provide near-LVM performance?

Best really depends.  If you don't want to use LVM you could use raw
image files (fast) and perform backups inside the guest just like on a
physical machine.

qcow2 has pretty good performance nowadays.  If you care about
performance then benchmark your workload to decide which configuration
is best.  There is no single answer because it depends on your workload and
additional constraints (like no LVM).

Stefan

Re: [Qemu-devel] [PULL v2 00/21] pci,kvm,misc enhancements

2013-07-04 Thread Michael S. Tsirkin

On Fri, Jun 28, 2013 at 12:44:17PM -0500, Anthony Liguori wrote:
 Markus Armbruster arm...@redhat.com writes:
 
  Michael S. Tsirkin m...@redhat.com writes:
 
pvpanic: fix fwcfg for big endian hosts
 
  Umm, 1+10+9 is 20, but the pull is for 21 patches; what's going on here?
 
 Funny :-)
 
 Note that the series is missing 2/21.
 
 The branch has 20 commits, I suspect Michael did git-format-patch to a
 directory, deleted one of the files, and never bothered changing the N
 of M.
 
 Regards,
 
 Anthony Liguori

Actually no, apparently something went wrong when sending mail.

-- 
MST

Re: [Qemu-devel] [PATCH v6] add timestamp to error_report()

2013-07-04 Thread Stefan Hajnoczi

On Wed, Jul 03, 2013 at 11:02:46PM -0400, Seiji Aguchi wrote:
 [Issue]
 When we offer a customer support service and a problem happens
 in a customer's system, we try to understand the problem by
 comparing what the customer reports with message logs of the
 customer's system.
 
 In this case, we often need to know when the problem happens.
 
 But, currently, there is no timestamp in qemu's error messages.
 Therefore, we may not be able to understand the problem based on
 error messages.
 
 [Solution]
 Add a timestamp to qemu's error message logged by
 error_report() with g_time_val_to_iso8601().
 
 Signed-off-by: Seiji Aguchi seiji.agu...@hds.com
 ---
 Changelog
  v5 - v6
  - Remove include/qemu/time.h and utils/qemu-time.c.
  - Fix a syntax and indent of messages in msg option's DEF().
  - Change explanation of the msg option.
 
  v4 - v5
  - Fix descriptions of msg option.
  - Rename TIME_H to QEMU_TIME_H. (avoiding double inclusion of qemu/time.h)
  - Change argument of qemu_get_timestamp_str to char * and size_t.
  - Confirmed msg option is displayed by query-command-line-options.
 
  v3 - v4
  - Correct email address of Signed-off-by.
 
  v2 - v3
  - Use g_time_val_to_iso8601() to get timestamp instead of
copying libvirt's time-handling functions.
 
According to discussion below, qemu doesn't need to take care
if timestamp functions are async-signal safe or not.
 
http://marc.info/?l=qemu-develm=136741841921265w=2
 
Also, in the review of the v2 patch, strftime() was recommended for
formatting the string, but it is not suitable for handling msec.
 
Then, simply call g_time_val_to_iso8601().
 
  - Introduce a common time-handling function to util/qemu-time.c.
(Suggested by Daniel P. Berrange)
 
  - Add testing for g_time_val_to_iso8601() to tests/test-time.c.
The test cases are copied from libvirt's virtimetest.
(Suggested by Daniel P. Berrange)
 
  v1 - v2
 
  - add an option, -msg timestamp={on|off}, to enable output message with 
 timestamp
 ---
  include/qemu/error-report.h |2 ++
  qemu-options.hx |   11 +++
  util/qemu-error.c   |   10 ++
  vl.c|   26 ++
  4 files changed, 49 insertions(+), 0 deletions(-)

Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
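
As an illustration of the kind of timestamp the patch prepends, the sketch
below formats seconds plus microseconds as an ISO 8601 UTC string. The
patch itself uses GLib's g_time_val_to_iso8601(); this stand-in uses only
standard C so that it is self-contained, always prints six fractional
digits, and is not claimed to match GLib's output byte for byte.
format_iso8601_sketch is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/*
 * Format seconds + microseconds since the Epoch as an ISO 8601 UTC
 * string of the form YYYY-MM-DDTHH:MM:SS.uuuuuuZ.
 * Returns 0 on success, -1 on failure or if buf is too small.
 * Uses gmtime(), which is not thread-safe; acceptable for a sketch.
 */
int format_iso8601_sketch(time_t sec, long usec, char *buf, size_t len)
{
    struct tm *tm;
    char date[32];
    int n;

    assert(buf != NULL && usec >= 0 && usec < 1000000);
    tm = gmtime(&sec);
    if (tm == NULL) {
        return -1;
    }
    if (strftime(date, sizeof date, "%Y-%m-%dT%H:%M:%S", tm) == 0) {
        return -1;
    }
    n = snprintf(buf, len, "%s.%06ldZ", date, usec);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

This also shows the limitation mentioned in the changelog: strftime() has
no conversion for sub-second fields, so they have to be appended
separately.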

[Qemu-devel] [PATCH v3 01/18] range: add Range structure

2013-07-04 Thread Michael S. Tsirkin

Sometimes we need to pass ranges around, add a
handy structure for this purpose.

Note: memory.c defines its own concept of AddrRange structure for
working with 128-bit addresses.  It's necessary there for doing range math.
This is not needed for most users: struct Range is
much simpler, and is only used for passing the range around.

Cc: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 include/qemu/range.h | 16 
 1 file changed, 16 insertions(+)

diff --git a/include/qemu/range.h b/include/qemu/range.h
index 3502372..b76cc0d 100644
--- a/include/qemu/range.h
+++ b/include/qemu/range.h
@@ -1,6 +1,22 @@
 #ifndef QEMU_RANGE_H
 #define QEMU_RANGE_H
 
+#include <inttypes.h>
+
+/*
+ * Operations on 64 bit address ranges.
+ * Notes:
+ *   - ranges must not wrap around 0, but can include the last byte ~0x0LL.
+ *   - this can not represent a full 0 to ~0x0LL range.
+ */
+
+/* A structure representing a range of addresses. */
+struct Range {
+uint64_t begin; /* First byte of the range, or 0 if empty. */
+uint64_t end;   /* 1 + the last byte. 0 if range empty or ends at ~0x0LL. */
+};
+typedef struct Range Range;
+
 /* Get last byte of a range from offset + length.
  * Undefined for ranges that wrap around 0. */
 static inline uint64_t range_get_last(uint64_t offset, uint64_t len)
-- 
MST
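
To illustrate the encoding described in the comments of the new struct,
here is a small sketch. The struct layout is taken from the patch;
range_make_sketch and range_is_empty_sketch are hypothetical helpers that
are not part of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same layout as the struct added by the patch. */
typedef struct Range {
    uint64_t begin; /* First byte of the range, or 0 if empty. */
    uint64_t end;   /* 1 + the last byte. 0 if empty or ends at ~0x0LL. */
} Range;

/*
 * Hypothetical helper: build a Range from offset + length.
 * The range must not wrap around 0, but it may end exactly at ~0x0LL,
 * in which case end overflows to 0 as the patch comment describes.
 */
Range range_make_sketch(uint64_t offset, uint64_t len)
{
    Range r = { 0, 0 };

    if (len == 0) {
        return r;                          /* empty range */
    }
    assert(offset + (len - 1) >= offset);  /* last byte must not wrap */
    r.begin = offset;
    r.end = offset + len;                  /* wraps to 0 if last == ~0 */
    return r;
}

/* Hypothetical helper: an empty range has begin == 0 and end == 0. */
bool range_is_empty_sketch(const Range *r)
{
    return r->begin == 0 && r->end == 0;
}
```

Because begin == 0 and end == 0 already means "empty", the full
0 to ~0x0LL range has no representation, which is the second note in the
patch comment.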

[Qemu-devel] [PATCH v3 02/18] pci: store PCI hole ranges in guestinfo structure

2013-07-04 Thread Michael S. Tsirkin

Will be used to pass hole ranges to guests.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/i386/pc.c  | 46 +-
 hw/i386/pc_piix.c | 14 +-
 hw/i386/pc_q35.c  |  6 +-
 hw/pci-host/q35.c |  8 
 include/hw/i386/pc.h  | 19 ++-
 include/hw/pci-host/q35.h |  2 ++
 include/qemu/typedefs.h   |  1 +
 7 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 78f92e2..8af1e4e 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -989,6 +989,48 @@ void pc_cpus_init(const char *cpu_model, DeviceState 
*icc_bridge)
 }
 }
 
+typedef struct PcGuestInfoState {
+PcGuestInfo info;
+Notifier machine_done;
+} PcGuestInfoState;
+
+static
+void pc_guest_info_machine_done(Notifier *notifier, void *data)
+{
+PcGuestInfoState *guest_info_state = container_of(notifier,
+  PcGuestInfoState,
+  machine_done);
+}
+
+PcGuestInfo *pc_guest_info_init(ram_addr_t below_4g_mem_size,
+ram_addr_t above_4g_mem_size)
+{
+PcGuestInfoState *guest_info_state = g_malloc0(sizeof *guest_info_state);
+PcGuestInfo *guest_info = guest_info_state-info;
+
+guest_info-pci_info.w32.end = IO_APIC_DEFAULT_ADDRESS;
+if (sizeof(hwaddr) == 4) {
+guest_info-pci_info.w64.begin = 0;
+guest_info-pci_info.w64.end = 0;
+} else {
+/*
+ * BIOS does not set MTRR entries for the 64 bit window, so no need to
+ * align address to power of two.  Align address at 1G, this makes sure
+ * it can be exactly covered with a PAT entry even when using huge
+ * pages.
+ */
+guest_info-pci_info.w64.begin =
+ROUND_UP((0x1ULL  32) + above_4g_mem_size, 0x1ULL  30);
+guest_info-pci_info.w64.end = guest_info-pci_info.w64.begin +
+(0x1ULL  62);
+assert(guest_info-pci_info.w64.begin = guest_info-pci_info.w64.end);
+}
+
+guest_info_state-machine_done.notify = pc_guest_info_machine_done;
+qemu_add_machine_init_done_notifier(guest_info_state-machine_done);
+return guest_info;
+}
+
 void pc_acpi_init(const char *default_dsdt)
 {
 char *filename;
@@ -1030,7 +1072,8 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory,
ram_addr_t below_4g_mem_size,
ram_addr_t above_4g_mem_size,
MemoryRegion *rom_memory,
-   MemoryRegion **ram_memory)
+   MemoryRegion **ram_memory,
+   PcGuestInfo *guest_info)
 {
 int linux_boot, i;
 MemoryRegion *ram, *option_rom_mr;
@@ -1082,6 +1125,7 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory,
 for (i = 0; i  nb_option_roms; i++) {
 rom_add_option(option_rom[i].name, option_rom[i].bootindex);
 }
+guest_info-fw_cfg = fw_cfg;
 return fw_cfg;
 }
 
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index fa59a0c..4637bde 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -90,6 +90,7 @@ static void pc_init1(MemoryRegion *system_memory,
 MemoryRegion *rom_memory;
 DeviceState *icc_bridge;
 FWCfgState *fw_cfg = NULL;
+PcGuestInfo *guest_info;
 
 if (xen_enabled()  xen_hvm_init() != 0) {
 fprintf(stderr, xen hardware virtual machine initialisation 
failed\n);
@@ -124,12 +125,23 @@ static void pc_init1(MemoryRegion *system_memory,
 rom_memory = system_memory;
 }
 
+guest_info = pc_guest_info_init(below_4g_mem_size, above_4g_mem_size);
+
+/* Set PCI window size the way seabios has always done it. */
+/* Power of 2 so bios can cover it with a single MTRR */
+if (ram_size = 0x8000)
+guest_info-pci_info.w32.begin = 0x8000;
+else if (ram_size = 0xc000)
+guest_info-pci_info.w32.begin = 0xc000;
+else
+guest_info-pci_info.w32.begin = 0xe000;
+
 /* allocate ram and load rom/bios */
 if (!xen_enabled()) {
 fw_cfg = pc_memory_init(system_memory,
kernel_filename, kernel_cmdline, initrd_filename,
below_4g_mem_size, above_4g_mem_size,
-   rom_memory, ram_memory);
+   rom_memory, ram_memory, guest_info);
 }
 
 gsi_state = g_malloc0(sizeof(*gsi_state));
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index bb0ce6a..a13acf2 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -77,6 +77,7 @@ static void pc_q35_init(QEMUMachineInitArgs *args)
 ICH9LPCState *ich9_lpc;
 PCIDevice *ahci;
 DeviceState *icc_bridge;
+PcGuestInfo *guest_info;
 
 icc_bridge = qdev_create(NULL, TYPE_ICC_BRIDGE);
 object_property_add_child(qdev_get_machine(), icc-bridge,
@@ -105,11 +106,13 @@ static void
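
The placement of the 64 bit window in pc_guest_info_init() above can be
sketched in isolation: take 4G plus the RAM above 4G and round up to the
next 1G boundary. The helpers below are hypothetical stand-ins written for
this note (QEMU uses its own ROUND_UP macro), so treat them as an
illustration of the arithmetic only.

```c
#include <assert.h>
#include <stdint.h>

/* Round n up to the next multiple of d; d must be a power of two. */
uint64_t round_up_pow2_sketch(uint64_t n, uint64_t d)
{
    assert(d != 0 && (d & (d - 1)) == 0);
    return (n + d - 1) & ~(d - 1);
}

/*
 * Start of the 64 bit PCI window: the first 1G boundary at or above
 * 4G + the amount of RAM mapped above 4G.
 */
uint64_t pci_w64_begin_sketch(uint64_t above_4g_mem_size)
{
    return round_up_pow2_sketch((UINT64_C(1) << 32) + above_4g_mem_size,
                                UINT64_C(1) << 30);
}
```

With no RAM above 4G the window starts at exactly 4G; with a single byte
above 4G it moves up to 5G.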

[Qemu-devel] [PATCH v3 04/18] pc_piix: cleanup init compat handling

2013-07-04 Thread Michael S. Tsirkin

Make sure 1.4 calls 1.5, 1.3 calls 1.4 etc.
This way it's enough to add a new compat hook
in a single place in piix.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/i386/pc_piix.c | 18 --
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 8a18dbe..e393022 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -270,38 +270,28 @@ static void pc_init_pci_1_5(QEMUMachineInitArgs *args)
 
 static void pc_init_pci_1_4(QEMUMachineInitArgs *args)
 {
-has_pci_info = false;
 has_pvpanic = false;
 x86_cpu_compat_set_features(n270, FEAT_1_ECX, 0, CPUID_EXT_MOVBE);
-pc_init_pci(args);
+pc_init_pci_1_5(args);
 }
 
 static void pc_init_pci_1_3(QEMUMachineInitArgs *args)
 {
-has_pci_info = false;
 enable_compat_apic_id_mode();
-has_pvpanic = false;
-pc_init_pci(args);
+pc_init_pci_1_4(args);
 }
 
 /* PC machine init function for pc-1.1 to pc-1.2 */
 static void pc_init_pci_1_2(QEMUMachineInitArgs *args)
 {
-has_pci_info = false;
 disable_kvm_pv_eoi();
-enable_compat_apic_id_mode();
-has_pvpanic = false;
-pc_init_pci(args);
+pc_init_pci_1_3(args);
 }
 
 /* PC machine init function for pc-0.14 to pc-1.0 */
 static void pc_init_pci_1_0(QEMUMachineInitArgs *args)
 {
-has_pci_info = false;
-disable_kvm_pv_eoi();
-enable_compat_apic_id_mode();
-has_pvpanic = false;
-pc_init_pci(args);
+pc_init_pci_1_2(args);
 }
 
 /* PC init function for pc-0.10 to pc-0.13, and reused by xenfv */
-- 
MST
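
The chaining idea can be shown without any QEMU types. In the sketch
below each older machine version applies only its own compat tweak and
then defers to the next newer version, so a new tweak has to be added in
exactly one function. All names are hypothetical, and the flags are kept
in a struct rather than in file-scope statics as in pc_piix.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct CompatFlagsSketch {
    bool has_pci_info;
    bool has_pvpanic;
    bool compat_apic_id;
    bool kvm_pv_eoi;
} CompatFlagsSketch;

/* Defaults for the newest machine type. */
CompatFlagsSketch compat_defaults_sketch(void)
{
    CompatFlagsSketch f = { true, true, false, true };
    return f;
}

void init_latest_sketch(CompatFlagsSketch *f)
{
    assert(f != NULL); /* a real init function would build the machine */
}

void init_1_5_sketch(CompatFlagsSketch *f)
{
    f->has_pci_info = false;
    init_latest_sketch(f);
}

void init_1_4_sketch(CompatFlagsSketch *f)
{
    f->has_pvpanic = false;
    init_1_5_sketch(f);
}

void init_1_3_sketch(CompatFlagsSketch *f)
{
    f->compat_apic_id = true;
    init_1_4_sketch(f);
}

void init_1_2_sketch(CompatFlagsSketch *f)
{
    f->kvm_pv_eoi = false;
    init_1_3_sketch(f);
}
```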

[Qemu-devel] [PULL v3 00/18] pci,misc enhancements

2013-07-04 Thread Michael S. Tsirkin

Changes from v2:
- rebased to origin/master
- fixed up botched posting

The following changes since commit ab8bf29078e0ab8347e2ff8b4e5542f7a0c751cf:

  Merge remote-tracking branch 'qemu-kvm/uq/master' into staging (2013-07-03 
08:37:00 -0500)

are available in the git repository at:


  git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_anthony

for you to fetch changes up to e34cc4adf3106ff5bed9723b8f9b4730f1662f7d:

  pci: Fold host_buses list into PCIHostState functionality (2013-07-04 
10:45:32 +0300)


pci,misc enhancements

This includes some pci enhancements:

Better support for systems with multiple PCI root buses
FW cfg interface for more robust pci programming in BIOS
Minor fixes/cleanups for fw cfg and cross-version migration -
because of dependencies with other patches

Signed-off-by: Michael S. Tsirkin m...@redhat.com


Andrew Jones (1):
  e1000: cleanup process_tx_desc

David Gibson (10):
  pci: Cleanup configuration for pci-hotplug.c
  pci: Move pci_read_devaddr to pci-hotplug-old.c
  pci: Abolish pci_find_root_bus()
  pci: Use helper to find device's root bus in pci_find_domain()
  pci: Replace pci_find_domain() with more general pci_root_bus_path()
  pci: Add root bus argument to pci_get_bus_devfn()
  pci: Add root bus parameter to pci_nic_init()
  pci: Simpler implementation of primary PCI bus
  pci: Remove domain from PCIHostBus
  pci: Fold host_buses list into PCIHostState functionality

Michael S. Tsirkin (7):
  range: add Range structure
  pci: store PCI hole ranges in guestinfo structure
  pc: pass PCI hole ranges to Guests
  pc_piix: cleanup init compat handling
  MAINTAINERS: s/Marcelo/Paolo/
  pvpanic: initialization cleanup
  pvpanic: fix fwcfg for big endian hosts

 MAINTAINERS |   2 +-
 default-configs/i386-softmmu.mak|   3 +-
 default-configs/ppc64-softmmu.mak   |   2 -
 default-configs/x86_64-softmmu.mak  |   3 +-
 hmp-commands.hx |   4 +-
 hw/alpha/dp264.c|   2 +-
 hw/arm/realview.c   |   6 +-
 hw/arm/versatilepb.c|   2 +-
 hw/i386/pc.c|  74 ++-
 hw/i386/pc_piix.c   |  40 +---
 hw/i386/pc_q35.c|  18 +++-
 hw/mips/mips_fulong2e.c |   6 +-
 hw/mips/mips_malta.c|   6 +-
 hw/misc/pvpanic.c   |  31 ---
 hw/net/e1000.c  |  18 ++--
 hw/pci-host/piix.c  |   9 ++
 hw/pci-host/q35.c   |  17 
 hw/pci/Makefile.objs|   2 +-
 hw/pci/{pci-hotplug.c = pci-hotplug-old.c} |  75 ---
 hw/pci/pci.c| 137 ++--
 hw/pci/pci_host.c   |   1 +
 hw/pci/pcie_aer.c   |   9 +-
 hw/ppc/e500.c   |   2 +-
 hw/ppc/mac_newworld.c   |   2 +-
 hw/ppc/mac_oldworld.c   |   2 +-
 hw/ppc/ppc440_bamboo.c  |   2 +-
 hw/ppc/prep.c   |   2 +-
 hw/ppc/spapr.c  |   2 +-
 hw/ppc/spapr_pci.c  |  10 ++
 hw/sh4/r2d.c|   5 +-
 hw/sparc64/sun4u.c  |   2 +-
 include/hw/i386/pc.h|  22 -
 include/hw/pci-host/q35.h   |   2 +
 include/hw/pci/pci.h|  17 ++--
 include/hw/pci/pci_host.h   |  12 +++
 include/qemu/range.h|  16 
 include/qemu/typedefs.h |   1 +
 37 files changed, 404 insertions(+), 162 deletions(-)
 rename hw/pci/{pci-hotplug.c = pci-hotplug-old.c} (78%)

[Qemu-devel] [PATCH v3 07/18] pvpanic: initialization cleanup

2013-07-04 Thread Michael S. Tsirkin

Avoid use of static variables: PC systems
initialize pvpanic device through pvpanic_init,
so we can simply create the fw_cfg file at that point.
This also makes it possible to skip device
creation completely if fw_cfg is not there, e.g. for xen -
so the ports it reserves are not discoverable by guests.

Also, make pvpanic_init void since callers ignore return
status anyway.

Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Laszlo Ersek ler...@redhat.com
Cc: Paul Durrant paul.durr...@citrix.com
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/misc/pvpanic.c| 30 --
 include/hw/i386/pc.h |  2 +-
 2 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/hw/misc/pvpanic.c b/hw/misc/pvpanic.c
index 060099b..83ed226 100644
--- a/hw/misc/pvpanic.c
+++ b/hw/misc/pvpanic.c
@@ -97,26 +97,28 @@ static void pvpanic_isa_realizefn(DeviceState *dev, Error 
**errp)
 {
 ISADevice *d = ISA_DEVICE(dev);
 PVPanicState *s = ISA_PVPANIC_DEVICE(dev);
-static bool port_configured;
-FWCfgState *fw_cfg;
 
 isa_register_ioport(d, s-io, s-ioport);
+}
 
-if (!port_configured) {
-fw_cfg = fw_cfg_find();
-if (fw_cfg) {
-fw_cfg_add_file(fw_cfg, etc/pvpanic-port,
-g_memdup(s-ioport, sizeof(s-ioport)),
-sizeof(s-ioport));
-port_configured = true;
-}
-}
+static void pvpanic_fw_cfg(ISADevice *dev, FWCfgState *fw_cfg)
+{
+PVPanicState *s = ISA_PVPANIC_DEVICE(dev);
+
+fw_cfg_add_file(fw_cfg, etc/pvpanic-port,
+g_memdup(s-ioport, sizeof(s-ioport)),
+sizeof(s-ioport));
 }
 
-int pvpanic_init(ISABus *bus)
+void pvpanic_init(ISABus *bus)
 {
-isa_create_simple(bus, TYPE_ISA_PVPANIC_DEVICE);
-return 0;
+ISADevice *dev;
+FWCfgState *fw_cfg = fw_cfg_find();
+if (!fw_cfg) {
+return;
+}
+dev = isa_create_simple (bus, TYPE_ISA_PVPANIC_DEVICE);
+pvpanic_fw_cfg(dev, fw_cfg);
 }
 
 static Property pvpanic_isa_properties[] = {
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index dbdd523..5949e7e 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -193,7 +193,7 @@ static inline bool isa_ne2000_init(ISABus *bus, int base, 
int irq, NICInfo *nd)
 void pc_system_firmware_init(MemoryRegion *rom_memory);
 
 /* pvpanic.c */
-int pvpanic_init(ISABus *bus);
+void pvpanic_init(ISABus *bus);
 
 /* e820 types */
 #define E820_RAM1
-- 
MST

[Qemu-devel] [PATCH v3 03/18] pc: pass PCI hole ranges to Guests

2013-07-04 Thread Michael S. Tsirkin

Guest currently has to jump through lots of hoops to guess the PCI hole
ranges.  It's fragile, and makes us change BIOS each time we add a new
chipset.  Let's report the window in a ROM file, to make BIOS do exactly
what QEMU intends.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/i386/pc.c | 26 ++
 hw/i386/pc_piix.c| 16 +++-
 hw/i386/pc_q35.c | 12 ++--
 include/hw/i386/pc.h |  1 +
 4 files changed, 52 insertions(+), 3 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 8af1e4e..7c4794c 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -989,6 +989,31 @@ void pc_cpus_init(const char *cpu_model, DeviceState 
*icc_bridge)
 }
 }
 
+/* pci-info ROM file. Little endian format */
+typedef struct PcRomPciInfo {
+uint64_t w32_min;
+uint64_t w32_max;
+uint64_t w64_min;
+uint64_t w64_max;
+} PcRomPciInfo;
+
+static void pc_fw_cfg_guest_info(PcGuestInfo *guest_info)
+{
+PcRomPciInfo *info;
+if (!guest_info-has_pci_info) {
+return;
+}
+
+info = g_malloc(sizeof *info);
+info-w32_min = cpu_to_le64(guest_info-pci_info.w32.begin);
+info-w32_max = cpu_to_le64(guest_info-pci_info.w32.end);
+info-w64_min = cpu_to_le64(guest_info-pci_info.w64.begin);
+info-w64_max = cpu_to_le64(guest_info-pci_info.w64.end);
+/* Pass PCI hole info to guest via a side channel.
+ * Required so guest PCI enumeration does the right thing. */
+fw_cfg_add_file(guest_info-fw_cfg, etc/pci-info, info, sizeof *info);
+}
+
 typedef struct PcGuestInfoState {
 PcGuestInfo info;
 Notifier machine_done;
@@ -1000,6 +1025,7 @@ void pc_guest_info_machine_done(Notifier *notifier, void 
*data)
 PcGuestInfoState *guest_info_state = container_of(notifier,
   PcGuestInfoState,
   machine_done);
+pc_fw_cfg_guest_info(guest_info_state-info);
 }
 
 PcGuestInfo *pc_guest_info_init(ram_addr_t below_4g_mem_size,
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 4637bde..8a18dbe 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -57,6 +57,7 @@ static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
 static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
 
 static bool has_pvpanic = true;
+static bool has_pci_info = true;
 
 /* PC hardware initialisation */
 static void pc_init1(MemoryRegion *system_memory,
@@ -126,6 +127,7 @@ static void pc_init1(MemoryRegion *system_memory,
 }
 
 guest_info = pc_guest_info_init(below_4g_mem_size, above_4g_mem_size);
+guest_info-has_pci_info = has_pci_info;
 
 /* Set PCI window size the way seabios has always done it. */
 /* Power of 2 so bios can cover it with a single MTRR */
@@ -260,8 +262,15 @@ static void pc_init_pci(QEMUMachineInitArgs *args)
  initrd_filename, cpu_model, 1, 1);
 }
 
+static void pc_init_pci_1_5(QEMUMachineInitArgs *args)
+{
+has_pci_info = false;
+pc_init_pci(args);
+}
+
 static void pc_init_pci_1_4(QEMUMachineInitArgs *args)
 {
+has_pci_info = false;
 has_pvpanic = false;
 x86_cpu_compat_set_features(n270, FEAT_1_ECX, 0, CPUID_EXT_MOVBE);
 pc_init_pci(args);
@@ -269,6 +278,7 @@ static void pc_init_pci_1_4(QEMUMachineInitArgs *args)
 
 static void pc_init_pci_1_3(QEMUMachineInitArgs *args)
 {
+has_pci_info = false;
 enable_compat_apic_id_mode();
 has_pvpanic = false;
 pc_init_pci(args);
@@ -277,6 +287,7 @@ static void pc_init_pci_1_3(QEMUMachineInitArgs *args)
 /* PC machine init function for pc-1.1 to pc-1.2 */
 static void pc_init_pci_1_2(QEMUMachineInitArgs *args)
 {
+has_pci_info = false;
 disable_kvm_pv_eoi();
 enable_compat_apic_id_mode();
 has_pvpanic = false;
@@ -286,6 +297,7 @@ static void pc_init_pci_1_2(QEMUMachineInitArgs *args)
 /* PC machine init function for pc-0.14 to pc-1.0 */
 static void pc_init_pci_1_0(QEMUMachineInitArgs *args)
 {
+has_pci_info = false;
 disable_kvm_pv_eoi();
 enable_compat_apic_id_mode();
 has_pvpanic = false;
@@ -302,6 +314,7 @@ static void pc_init_pci_no_kvmclock(QEMUMachineInitArgs 
*args)
 const char *initrd_filename = args-initrd_filename;
 const char *boot_device = args-boot_device;
 has_pvpanic = false;
+has_pci_info = false;
 disable_kvm_pv_eoi();
 enable_compat_apic_id_mode();
 pc_init1(get_system_memory(),
@@ -320,6 +333,7 @@ static void pc_init_isa(QEMUMachineInitArgs *args)
 const char *initrd_filename = args-initrd_filename;
 const char *boot_device = args-boot_device;
 has_pvpanic = false;
+has_pci_info = false;
 if (cpu_model == NULL)
 cpu_model = 486;
 disable_kvm_pv_eoi();
@@ -359,7 +373,7 @@ static QEMUMachine pc_i440fx_machine_v1_6 = {
 static QEMUMachine pc_i440fx_machine_v1_5 = {
 .name = pc-i440fx-1.5,
 .desc = Standard PC (i440FX + PIIX, 1996),
-.init = pc_init_pci,
+.init = pc_init_pci_1_5,

[Qemu-devel] [PATCH v3 06/18] MAINTAINERS: s/Marcelo/Paolo/

2013-07-04 Thread Michael S. Tsirkin

Marcelo doesn't maintain kvm anymore,
Paolo is taking over the job.
Update MAINTAINERS to stop flooding Marcelo with mail.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index ad9c860..11dffee 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -155,7 +155,7 @@ Guest CPU Cores (KVM):
 
 Overall
 M: Gleb Natapov g...@redhat.com
-M: Marcelo Tosatti mtosa...@redhat.com
+M: Paolo Bonzini pbonz...@redhat.com
 L: k...@vger.kernel.org
 S: Supported
 F: kvm-*
-- 
MST

[Qemu-devel] [PATCH v3 08/18] pvpanic: fix fwcfg for big endian hosts

2013-07-04 Thread Michael S. Tsirkin

Convert port number to little endian when
exposing it in fw cfg.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/misc/pvpanic.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/misc/pvpanic.c b/hw/misc/pvpanic.c
index 83ed226..792d8e4 100644
--- a/hw/misc/pvpanic.c
+++ b/hw/misc/pvpanic.c
@@ -104,10 +104,11 @@ static void pvpanic_isa_realizefn(DeviceState *dev, Error 
**errp)
 static void pvpanic_fw_cfg(ISADevice *dev, FWCfgState *fw_cfg)
 {
 PVPanicState *s = ISA_PVPANIC_DEVICE(dev);
+uint16_t *pvpanic_port = g_malloc(sizeof(*pvpanic_port));
+*pvpanic_port = cpu_to_le16(s-ioport);
 
-fw_cfg_add_file(fw_cfg, etc/pvpanic-port,
-g_memdup(s-ioport, sizeof(s-ioport)),
-sizeof(s-ioport));
+fw_cfg_add_file(fw_cfg, etc/pvpanic-port, pvpanic_port,
+sizeof(*pvpanic_port));
 }
 
 void pvpanic_init(ISABus *bus)
-- 
MST

[Qemu-devel] [PATCH v3 12/18] pci: Use helper to find device's root bus in pci_find_domain()

2013-07-04 Thread Michael S. Tsirkin

From: David Gibson da...@gibson.dropbear.id.au

Currently pci_find_domain() performs two functions - it locates the PCI
root bus above the given bus, then looks up that root bus's domain number.
This patch adds a helper function to perform the first task, finding the
root bus for a given PCI device.  This is then used in pci_find_domain().
This changes pci_find_domain()'s signature slightly, taking a PCIDevice
instead of a PCIBus - since all callers passed something of the form
dev->bus, this simplifies things slightly.
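
The walk performed by the new helper can be sketched with two minimal
stand-in types. BusSketch and DevSketch are hypothetical and carry only
the two pointers the loop needs; the loop itself follows the one in the
diff.

```c
#include <assert.h>
#include <stddef.h>

typedef struct BusSketch BusSketch;
typedef struct DevSketch DevSketch;

struct DevSketch {
    BusSketch *bus;         /* bus the device sits on */
};

struct BusSketch {
    DevSketch *parent_dev;  /* bridge above this bus, NULL at the root */
};

/* Follow parent bridges upward until a bus without a parent is found. */
BusSketch *device_root_bus_sketch(const DevSketch *d)
{
    BusSketch *bus;

    assert(d != NULL);
    bus = d->bus;
    while ((d = bus->parent_dev) != NULL) {
        bus = d->bus;
    }
    return bus;
}
```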

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/pci/pci-hotplug-old.c |  2 +-
 hw/pci/pci.c | 20 +---
 hw/pci/pcie_aer.c|  3 +--
 include/hw/pci/pci.h |  3 ++-
 4 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/hw/pci/pci-hotplug-old.c b/hw/pci/pci-hotplug-old.c
index 7a47d6b..37e0720 100644
--- a/hw/pci/pci-hotplug-old.c
+++ b/hw/pci/pci-hotplug-old.c
@@ -276,7 +276,7 @@ void pci_device_hot_add(Monitor *mon, const QDict *qdict)
 
     if (dev) {
         monitor_printf(mon, "OK domain %d, bus %d, slot %d, function %d\n",
-                       pci_find_domain(dev->bus),
+                       pci_find_domain(dev),
                        pci_bus_num(dev->bus), PCI_SLOT(dev->devfn),
                        PCI_FUNC(dev->devfn));
     } else
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index fc99e3b..69a6995 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -259,18 +259,24 @@ PCIBus *pci_find_primary_bus(void)
     return NULL;
 }
 
-int pci_find_domain(const PCIBus *bus)
+PCIBus *pci_device_root_bus(const PCIDevice *d)
 {
-    PCIDevice *d;
-    struct PCIHostBus *host;
+    PCIBus *bus = d->bus;
 
-    /* obtain root bus */
     while ((d = bus->parent_dev) != NULL) {
         bus = d->bus;
     }
 
+    return bus;
+}
+
+int pci_find_domain(const PCIDevice *dev)
+{
+    const PCIBus *rootbus = pci_device_root_bus(dev);
+    struct PCIHostBus *host;
+
     QLIST_FOREACH(host, &host_buses, next) {
-        if (host->bus == bus) {
+        if (host->bus == rootbus) {
             return host->domain;
         }
     }
@@ -1997,7 +2003,7 @@ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
             fprintf(stderr, "ERROR: %04x:%02x:%02x.%x "
                     "Attempt to add PCI capability %x at offset "
                     "%x overlaps existing capability %x at offset %x\n",
-                    pci_find_domain(pdev->bus), pci_bus_num(pdev->bus),
+                    pci_find_domain(pdev), pci_bus_num(pdev->bus),
                     PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn),
                     cap_id, offset, overlapping_cap, i);
             return -EINVAL;
@@ -2152,7 +2158,7 @@ static char *pcibus_get_dev_path(DeviceState *dev)
     path[path_len] = '\0';
 
     /* First field is the domain. */
-    s = snprintf(domain, sizeof domain, "%04x:00", pci_find_domain(d->bus));
+    s = snprintf(domain, sizeof domain, "%04x:00", pci_find_domain(d));
     assert(s == domain_len);
     memcpy(path, domain, domain_len);
 
diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index 1ce72ce..06f77ac 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -1022,8 +1022,7 @@ int do_pcie_aer_inject_error(Monitor *mon,
     *ret_data = qobject_from_jsonf("{'id': %s, "
                                    "'domain': %d, 'bus': %d, 'devfn': %d, "
                                    "'ret': %d}",
-                                   id,
-                                   pci_find_domain(dev->bus),
+                                   id, pci_find_domain(dev),
                                    pci_bus_num(dev->bus), dev->devfn,
                                    ret);
     assert(*ret_data);
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 7b89d88..f2bf1ed 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -390,7 +390,8 @@ void pci_for_each_device(PCIBus *bus, int bus_num,
  void (*fn)(PCIBus *bus, PCIDevice *d, void *opaque),
  void *opaque);
 PCIBus *pci_find_primary_bus(void);
-int pci_find_domain(const PCIBus *bus);
+PCIBus *pci_device_root_bus(const PCIDevice *d);
+int pci_find_domain(const PCIDevice *dev);
 PCIDevice *pci_find_device(PCIBus *bus, int bus_num, uint8_t devfn);
 int pci_qdev_find_device(const char *id, PCIDevice **pdev);
 PCIBus *pci_get_bus_devfn(int *devfnp, const char *devaddr);
-- 
MST

[Qemu-devel] [PATCH v3 10/18] pci: Move pci_read_devaddr to pci-hotplug-old.c

2013-07-04 Thread Michael S. Tsirkin

From: David Gibson da...@gibson.dropbear.id.au

pci_read_devaddr() is only used by the legacy functions for the old PCI
hotplug interface in pci-hotplug-old.c.  So we move the function there,
and make it static.
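
For readers unfamiliar with the function being moved: besides calling
pci_parse_devaddr(), its only extra work is skipping an optional legacy
"pci_addr=" prefix.  A minimal sketch of that step in isolation follows;
the helper name sketch_strip_legacy_tag is made up for illustration and
does not exist in QEMU.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer past the legacy "pci_addr=" tag if the string starts
 * with it, otherwise return the string unchanged.
 */
const char *sketch_strip_legacy_tag(const char *addr)
{
    static const char tag[] = "pci_addr=";
    const size_t tag_len = sizeof(tag) - 1; /* 9, as hard-coded in the patch */

    if (!strncmp(addr, tag, tag_len)) {
        addr += tag_len;
    }
    return addr;
}
```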

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/pci/pci-hotplug-old.c | 14 ++
 hw/pci/pci.c | 16 +---
 include/hw/pci/pci.h |  4 ++--
 3 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/hw/pci/pci-hotplug-old.c b/hw/pci/pci-hotplug-old.c
index b3c233c..a0b5558 100644
--- a/hw/pci/pci-hotplug-old.c
+++ b/hw/pci/pci-hotplug-old.c
@@ -36,6 +36,20 @@
 #include "sysemu/blockdev.h"
 #include "qapi/error.h"
 
+static int pci_read_devaddr(Monitor *mon, const char *addr, int *domp,
+                            int *busp, unsigned *slotp)
+{
+    /* strip legacy tag */
+    if (!strncmp(addr, "pci_addr=", 9)) {
+        addr += 9;
+    }
+    if (pci_parse_devaddr(addr, domp, busp, slotp, NULL)) {
+        monitor_printf(mon, "Invalid pci address\n");
+        return -1;
+    }
+    return 0;
+}
+
 static PCIDevice *qemu_pci_hot_add_nic(Monitor *mon,
                                        const char *devaddr,
                                        const char *opts_str)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 61b681a..adf4da5 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -522,7 +522,7 @@ static void pci_set_default_subsystem_id(PCIDevice *pci_dev)
  * Parse [[domain:]bus:]slot, return -1 on error if funcp == NULL
  *   [[domain:]bus:]slot.func, return -1 on error
  */
-static int pci_parse_devaddr(const char *addr, int *domp, int *busp,
+int pci_parse_devaddr(const char *addr, int *domp, int *busp,
   unsigned int *slotp, unsigned int *funcp)
 {
     const char *p;
@@ -581,20 +581,6 @@ static int pci_parse_devaddr(const char *addr, int *domp, int *busp,
     return 0;
 }
 
-int pci_read_devaddr(Monitor *mon, const char *addr, int *domp, int *busp,
-                     unsigned *slotp)
-{
-    /* strip legacy tag */
-    if (!strncmp(addr, "pci_addr=", 9)) {
-        addr += 9;
-    }
-    if (pci_parse_devaddr(addr, domp, busp, slotp, NULL)) {
-        monitor_printf(mon, "Invalid pci address\n");
-        return -1;
-    }
-    return 0;
-}
-
 PCIBus *pci_get_bus_devfn(int *devfnp, const char *devaddr)
 {
 int dom, bus;
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 6ef1f97..b5edef8 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -395,8 +395,8 @@ PCIDevice *pci_find_device(PCIBus *bus, int bus_num, uint8_t devfn);
 int pci_qdev_find_device(const char *id, PCIDevice **pdev);
 PCIBus *pci_get_bus_devfn(int *devfnp, const char *devaddr);
 
-int pci_read_devaddr(Monitor *mon, const char *addr, int *domp, int *busp,
- unsigned *slotp);
+int pci_parse_devaddr(const char *addr, int *domp, int *busp,
+  unsigned int *slotp, unsigned int *funcp);
 
 void pci_device_deassert_intx(PCIDevice *dev);
 
-- 
MST

[Qemu-devel] [PATCH v3 05/18] e1000: cleanup process_tx_desc

2013-07-04 Thread Michael S. Tsirkin

From: Andrew Jones drjo...@redhat.com

Coverity complains about two overruns in process_tx_desc(). The
complaints are false positives, but we might as well eliminate
them. The problem is that hdr is defined as an unsigned int,
but then used to offset an array of size 65536, and another of
size 256 bytes. hdr will actually never be greater than 255
though, as it's assigned only once and to the value of
tp->hdr_len, which is a uint8_t. This patch simply gets rid of
hdr, replacing it with tp->hdr_len, which makes it consistent
with all other tp member use in the function.
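
To make the bound argument concrete, here is a small standalone sketch.
SketchTx is a hypothetical cut-down version of the transmit state that
keeps only the two buffers and the length field; the sizes follow the
description above (65536 and 256 bytes).  Because hdr_len is a uint8_t,
a copy of hdr_len bytes cannot exceed 255 bytes and so cannot overrun
either buffer.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced transmit state; not the real struct e1000_tx. */
typedef struct SketchTx {
    uint8_t header[256];   /* saved protocol header */
    uint8_t data[0x10000]; /* packet assembly buffer, 65536 bytes */
    uint8_t hdr_len;       /* at most 255 by its type */
} SketchTx;

/* Save the first hdr_len bytes of data as the header template. */
void sketch_save_header(SketchTx *tp)
{
    /* tp->hdr_len <= 255 < sizeof(tp->header), so this stays in bounds. */
    memmove(tp->header, tp->data, tp->hdr_len);
}
```

Using the uint8_t member directly, rather than a copy widened to
unsigned int, lets a static analyser see the bound from the type alone.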

v2:
 - also cleanup coding style issues in the touched lines

Signed-off-by: Andrew Jones drjo...@redhat.com
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/net/e1000.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index e6f46f0..620f947 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -556,7 +556,7 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
     uint32_t txd_lower = le32_to_cpu(dp->lower.data);
     uint32_t dtype = txd_lower & (E1000_TXD_CMD_DEXT | E1000_TXD_DTYP_D);
     unsigned int split_size = txd_lower & 0xffff, bytes, sz, op;
-    unsigned int msh = 0xfffff, hdr = 0;
+    unsigned int msh = 0xfffff;
     uint64_t addr;
     struct e1000_context_desc *xp = (struct e1000_context_desc *)dp;
     struct e1000_tx *tp = &s->tx;
@@ -603,8 +603,7 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
 
     addr = le64_to_cpu(dp->buffer_addr);
     if (tp->tse && tp->cptse) {
-        hdr = tp->hdr_len;
-        msh = hdr + tp->mss;
+        msh = tp->hdr_len + tp->mss;
         do {
             bytes = split_size;
             if (tp->size + bytes > msh)
@@ -612,14 +611,16 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
 
             bytes = MIN(sizeof(tp->data) - tp->size, bytes);
             pci_dma_read(&s->dev, addr, tp->data + tp->size, bytes);
-            if ((sz = tp->size + bytes) >= hdr && tp->size < hdr)
-                memmove(tp->header, tp->data, hdr);
+            sz = tp->size + bytes;
+            if (sz >= tp->hdr_len && tp->size < tp->hdr_len) {
+                memmove(tp->header, tp->data, tp->hdr_len);
+            }
             tp->size = sz;
             addr += bytes;
             if (sz == msh) {
                 xmit_seg(s);
-                memmove(tp->data, tp->header, hdr);
-                tp->size = hdr;
+                memmove(tp->data, tp->header, tp->hdr_len);
+                tp->size = tp->hdr_len;
             }
         } while (split_size -= bytes);
     } else if (!tp->tse && tp->cptse) {
@@ -633,8 +634,9 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
 
     if (!(txd_lower & E1000_TXD_CMD_EOP))
         return;
-    if (!(tp->tse && tp->cptse && tp->size < hdr))
+    if (!(tp->tse && tp->cptse && tp->size < tp->hdr_len)) {
         xmit_seg(s);
+    }
     tp->tso_frames = 0;
     tp->sum_needed = 0;
     tp->vlan_needed = 0;
-- 
MST

[Qemu-devel] [PATCH v3 09/18] pci: Cleanup configuration for pci-hotplug.c

2013-07-04 Thread Michael S. Tsirkin

From: David Gibson da...@gibson.dropbear.id.au

pci-hotplug.c and the CONFIG_PCI_HOTPLUG variable which controls its
compilation are misnamed.  They're not about PCI hotplug in general, but
rather about the pci_add/pci_del interface, which is now deprecated in
favour of the more general device_add/device_del interface.  This patch
therefore renames them to pci-hotplug-old.c and CONFIG_PCI_HOTPLUG_OLD.

CONFIG_PCI_HOTPLUG=y was listed twice in {i386,x86_64}-softmmu.mak for no
particular reason, so we clean that up too.  In addition it was included in
ppc64-softmmu.mak for which the old hotplug interface was never used and is
unsuitable, so we remove that too.

Most of pci-hotplug.c was additionally protected by #ifdef TARGET_I386.  The
small piece which wasn't is only called from the pci_add and pci_del hooks
in hmp-commands.hx, which themselves were protected by #ifdef TARGET_I386.
This patch therefore also removes the #ifdef from pci-hotplug-old.c,
and changes the ifdefs in hmp-commands.hx to use CONFIG_PCI_HOTPLUG_OLD.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 default-configs/i386-softmmu.mak| 3 +--
 default-configs/ppc64-softmmu.mak   | 2 --
 default-configs/x86_64-softmmu.mak  | 3 +--
 hmp-commands.hx | 4 ++--
 hw/pci/Makefile.objs| 2 +-
 hw/pci/{pci-hotplug.c => pci-hotplug-old.c} | 6 +++---
 6 files changed, 8 insertions(+), 12 deletions(-)
 rename hw/pci/{pci-hotplug.c => pci-hotplug-old.c} (98%)

diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 03deca2..4a0fc9c 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -28,11 +28,10 @@ CONFIG_APPLESMC=y
 CONFIG_I8259=y
 CONFIG_PFLASH_CFI01=y
 CONFIG_TPM_TIS=$(CONFIG_TPM)
-CONFIG_PCI_HOTPLUG=y
+CONFIG_PCI_HOTPLUG_OLD=y
 CONFIG_MC146818RTC=y
 CONFIG_PAM=y
 CONFIG_PCI_PIIX=y
-CONFIG_PCI_HOTPLUG=y
 CONFIG_WDT_IB700=y
 CONFIG_PC_SYSFW=y
 CONFIG_XEN_I386=$(CONFIG_XEN)
diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
index cb279cb..5a72b5f 100644
--- a/default-configs/ppc64-softmmu.mak
+++ b/default-configs/ppc64-softmmu.mak
@@ -45,7 +45,5 @@ CONFIG_OPENPIC=y
 CONFIG_PSERIES=y
 CONFIG_E500=y
 CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
-# For pSeries
-CONFIG_PCI_HOTPLUG=y
 # For PReP
 CONFIG_MC146818RTC=y
diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak
index 599b630..10bb0c6 100644
--- a/default-configs/x86_64-softmmu.mak
+++ b/default-configs/x86_64-softmmu.mak
@@ -28,11 +28,10 @@ CONFIG_APPLESMC=y
 CONFIG_I8259=y
 CONFIG_PFLASH_CFI01=y
 CONFIG_TPM_TIS=$(CONFIG_TPM)
-CONFIG_PCI_HOTPLUG=y
+CONFIG_PCI_HOTPLUG_OLD=y
 CONFIG_MC146818RTC=y
 CONFIG_PAM=y
 CONFIG_PCI_PIIX=y
-CONFIG_PCI_HOTPLUG=y
 CONFIG_WDT_IB700=y
 CONFIG_PC_SYSFW=y
 CONFIG_XEN_I386=$(CONFIG_XEN)
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 915b0d1..d1cdcfb 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1077,7 +1077,7 @@ STEXI
 Add drive to PCI storage controller.
 ETEXI
 
-#if defined(TARGET_I386)
+#if defined(CONFIG_PCI_HOTPLUG_OLD)
     {
         .name       = "pci_add",
         .args_type  = "pci_addr:s,type:s,opts:s?",
@@ -1093,7 +1093,7 @@ STEXI
 Hot-add PCI device.
 ETEXI
 
-#if defined(TARGET_I386)
+#if defined(CONFIG_PCI_HOTPLUG_OLD)
     {
         .name       = "pci_del",
         .args_type  = "pci_addr:s",
diff --git a/hw/pci/Makefile.objs b/hw/pci/Makefile.objs
index a7fb9d0..720f438 100644
--- a/hw/pci/Makefile.objs
+++ b/hw/pci/Makefile.objs
@@ -8,4 +8,4 @@ common-obj-$(CONFIG_PCI) += pcie.o pcie_aer.o pcie_port.o
 common-obj-$(CONFIG_NO_PCI) += pci-stub.o
 common-obj-$(CONFIG_ALL) += pci-stub.o
 
-obj-$(CONFIG_PCI_HOTPLUG) += pci-hotplug.o
+common-obj-$(CONFIG_PCI_HOTPLUG_OLD) += pci-hotplug-old.o
diff --git a/hw/pci/pci-hotplug.c b/hw/pci/pci-hotplug-old.c
similarity index 98%
rename from hw/pci/pci-hotplug.c
rename to hw/pci/pci-hotplug-old.c
index 12287d1..b3c233c 100644
--- a/hw/pci/pci-hotplug.c
+++ b/hw/pci/pci-hotplug-old.c
@@ -1,5 +1,7 @@
 /*
- * QEMU PCI hotplug support
+ * Deprecated PCI hotplug interface support
+ * This covers the old pci_add / pci_del command, whereas the more general
+ * device_add / device_del commands are now preferred.
  *
  * Copyright (c) 2004 Fabrice Bellard
  *
@@ -34,7 +36,6 @@
 #include "sysemu/blockdev.h"
 #include "qapi/error.h"
 
-#if defined(TARGET_I386)
 static PCIDevice *qemu_pci_hot_add_nic(Monitor *mon,
                                        const char *devaddr,
                                        const char *opts_str)
@@ -257,7 +258,6 @@ void pci_device_hot_add(Monitor *mon, const QDict *qdict)
     } else
         monitor_printf(mon, "failed to add %s\n", opts);
 }
-#endif
 
 static int pci_device_hot_remove(Monitor *mon, const char *pci_addr)
 {
-- 
MST

[Qemu-devel] [PATCH v3 13/18] pci: Replace pci_find_domain() with more general pci_root_bus_path()

2013-07-04 Thread Michael S. Tsirkin

From: David Gibson da...@gibson.dropbear.id.au

pci_find_domain() is used in a number of places where we want an id for a
whole PCI domain (i.e. the subtree under a PCI root bus).  The trouble is
that many platforms may support multiple independent host bridges with no
hardware supplied notion of domain number.

This patch, therefore, replaces calls to pci_find_domain() with calls to
a new pci_root_bus_path() returning a string.  The new call is implemented
in terms of a new callback in the host bridge class, so it can be defined
in some way that's well defined for the platform.  When no callback is
available we fall back on the qbus name.
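
The dispatch described above can be sketched in a few lines.  This is
only an illustration under simplifying assumptions: the Sketch* types
and function names are hypothetical, and the class lookup that QOM
would perform is replaced by passing the class structure in directly.

```c
#include <stddef.h>

/* Hypothetical stand-in for a root bus; only its qbus name is kept. */
typedef struct SketchBus {
    const char *name;
} SketchBus;

/* Optional per-host-bridge hook returning a platform-defined identifier. */
typedef const char *(*SketchRootBusPathFn)(const SketchBus *rootbus);

typedef struct SketchHostBridgeClass {
    SketchRootBusPathFn root_bus_path; /* NULL when not provided */
} SketchHostBridgeClass;

/* Use the bridge's callback when present, else fall back on the bus name. */
const char *sketch_root_bus_path(const SketchHostBridgeClass *hc,
                                 const SketchBus *rootbus)
{
    if (hc->root_bus_path) {
        return hc->root_bus_path(rootbus);
    }
    return rootbus->name;
}

/* Example hook: a fixed string equal to domain 0 printed with "%04x". */
const char *sketch_legacy_root_bus_path(const SketchBus *rootbus)
{
    (void)rootbus;
    return "0000";
}
```

Returning a string rather than an integer is what lets each platform
pick an identifier that is meaningful for its own host bridges.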

Most current uses of pci_find_domain() are for error or informational
messages, so the change in identifiers should be harmless.  The exception
is pci_get_dev_path(), whose results form part of migration streams.  To
maintain compatibility with old migration streams, the PIIX PCI host is
altered to always supply "0000" for this path, which matches the old domain
number (since the code didn't actually support domains other than 0).

For the pseries (spapr) PCI bridge we use a different platform-unique
identifier (pseries machines can routinely have dozens of PCI host
bridges).  Theoretically that breaks migration streams, but given that we
don't yet have migration support for pseries, it doesn't matter.

Any other machines that have working migration support including PCI
devices will need to be updated to maintain migration stream compatibility.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/pci-host/piix.c|  9 +
 hw/pci-host/q35.c |  9 +
 hw/pci/pci-hotplug-old.c  |  4 ++--
 hw/pci/pci.c  | 38 --
 hw/pci/pci_host.c |  1 +
 hw/pci/pcie_aer.c |  8 
 hw/ppc/spapr_pci.c| 10 ++
 include/hw/pci/pci.h  |  2 +-
 include/hw/pci/pci_host.h | 10 ++
 9 files changed, 66 insertions(+), 25 deletions(-)

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index f9e68c3..c36e725 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -629,11 +629,20 @@ static const TypeInfo i440fx_info = {
     .class_init    = i440fx_class_init,
 };
 
+static const char *i440fx_pcihost_root_bus_path(PCIHostState *host_bridge,
+                                                PCIBus *rootbus)
+{
+    /* For backwards compat with old device paths */
+    return "0000";
+}
+
 static void i440fx_pcihost_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
     SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
+    PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass);
 
+    hc->root_bus_path = i440fx_pcihost_root_bus_path;
     k->init = i440fx_pcihost_initfn;
     dc->fw_name = "pci";
     dc->no_user = 1;
diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 3a5cff9..13148ed 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -63,6 +63,13 @@ static int q35_host_init(SysBusDevice *dev)
     return 0;
 }
 
+static const char *q35_host_root_bus_path(PCIHostState *host_bridge,
+                                          PCIBus *rootbus)
+{
+    /* For backwards compat with old device paths */
+    return "0000";
+}
+
 static Property mch_props[] = {
     DEFINE_PROP_UINT64("MCFG", Q35PCIHost, host.base_addr,
                         MCH_HOST_BRIDGE_PCIEXBAR_DEFAULT),
@@ -73,7 +80,9 @@ static void q35_host_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
     SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
+    PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass);
 
+    hc->root_bus_path = q35_host_root_bus_path;
     k->init = q35_host_init;
     dc->props = mch_props;
     dc->fw_name = "pci";
diff --git a/hw/pci/pci-hotplug-old.c b/hw/pci/pci-hotplug-old.c
index 37e0720..e251810 100644
--- a/hw/pci/pci-hotplug-old.c
+++ b/hw/pci/pci-hotplug-old.c
@@ -275,8 +275,8 @@ void pci_device_hot_add(Monitor *mon, const QDict *qdict)
 }
 
     if (dev) {
-        monitor_printf(mon, "OK domain %d, bus %d, slot %d, function %d\n",
-                       pci_find_domain(dev),
+        monitor_printf(mon, "OK root bus %s, bus %d, slot %d, function %d\n",
+                       pci_root_bus_path(dev),
                        pci_bus_num(dev->bus), PCI_SLOT(dev->devfn),
                        PCI_FUNC(dev->devfn));
     } else
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 69a6995..350b872 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -25,6 +25,7 @@
 #include "hw/pci/pci.h"
 #include "hw/pci/pci_bridge.h"
 #include "hw/pci/pci_bus.h"
+#include "hw/pci/pci_host.h"
 #include "monitor/monitor.h"
 #include "net/net.h"
 #include "sysemu/sysemu.h"
@@ -270,19 +271,20 @@ PCIBus *pci_device_root_bus(const PCIDevice *d)
 return bus;
 }
 
-int pci_find_domain(const PCIDevice *dev)
+const char *pci_root_bus_path(PCIDevice *dev)
 {
-const PCIBus *rootbus =

[Qemu-devel] [PATCH v3 15/18] pci: Add root bus parameter to pci_nic_init()

2013-07-04 Thread Michael S. Tsirkin

From: David Gibson da...@gibson.dropbear.id.au

At present, pci_nic_init() and pci_nic_init_nofail() assume that they will
only create a NIC under the primary PCI root.  As we add support for
multiple PCI roots, that may no longer be the case.  This patch adds a root
bus parameter to pci_nic_init() (and updates callers accordingly) to allow
the machine init code using it to specify the right PCI root for NICs
created by old-style -net nic parameters.  NICs created new-style, with
-device can of course be put anywhere.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/alpha/dp264.c |  2 +-
 hw/arm/realview.c|  6 --
 hw/arm/versatilepb.c |  2 +-
 hw/i386/pc.c |  2 +-
 hw/mips/mips_fulong2e.c  |  6 +++---
 hw/mips/mips_malta.c |  6 +++---
 hw/pci/pci-hotplug-old.c |  3 ++-
 hw/pci/pci.c | 10 ++
 hw/ppc/e500.c|  2 +-
 hw/ppc/mac_newworld.c|  2 +-
 hw/ppc/mac_oldworld.c|  2 +-
 hw/ppc/ppc440_bamboo.c   |  2 +-
 hw/ppc/prep.c|  2 +-
 hw/ppc/spapr.c   |  2 +-
 hw/sh4/r2d.c |  5 -
 hw/sparc64/sun4u.c   |  2 +-
 include/hw/pci/pci.h |  6 --
 17 files changed, 36 insertions(+), 26 deletions(-)

diff --git a/hw/alpha/dp264.c b/hw/alpha/dp264.c
index 8695efb..8dad08f 100644
--- a/hw/alpha/dp264.c
+++ b/hw/alpha/dp264.c
@@ -89,7 +89,7 @@ static void clipper_init(QEMUMachineInitArgs *args)
 
     /* Network setup.  e1000 is good enough, failing Tulip support.  */
     for (i = 0; i < nb_nics; i++) {
-        pci_nic_init_nofail(&nd_table[i], "e1000", NULL);
+        pci_nic_init_nofail(&nd_table[i], pci_bus, "e1000", NULL);
     }
 
     /* IDE disk setup.  */
diff --git a/hw/arm/realview.c b/hw/arm/realview.c
index d6f47bf..036a188 100644
--- a/hw/arm/realview.c
+++ b/hw/arm/realview.c
@@ -59,7 +59,7 @@ static void realview_init(QEMUMachineInitArgs *args,
 qemu_irq *irqp;
 qemu_irq pic[64];
 qemu_irq mmc_irq[2];
-PCIBus *pci_bus;
+PCIBus *pci_bus = NULL;
 NICInfo *nd;
 i2c_bus *i2c;
 int n;
@@ -250,7 +250,9 @@ static void realview_init(QEMUMachineInitArgs *args,
             }
             done_nic = 1;
         } else {
-            pci_nic_init_nofail(nd, "rtl8139", NULL);
+            if (pci_bus) {
+                pci_nic_init_nofail(nd, pci_bus, "rtl8139", NULL);
+            }
         }
     }
 
diff --git a/hw/arm/versatilepb.c b/hw/arm/versatilepb.c
index 753757e..15eb086 100644
--- a/hw/arm/versatilepb.c
+++ b/hw/arm/versatilepb.c
@@ -244,7 +244,7 @@ static void versatile_init(QEMUMachineInitArgs *args, int board_id)
             smc91c111_init(nd, 0x10010000, sic[25]);
             done_smc = 1;
         } else {
-            pci_nic_init_nofail(nd, "rtl8139", NULL);
+            pci_nic_init_nofail(nd, pci_bus, "rtl8139", NULL);
         }
     }
 if (usb_enabled(false)) {
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 7c4794c..80c27d6 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1310,7 +1310,7 @@ void pc_nic_init(ISABus *isa_bus, PCIBus *pci_bus)
         if (!pci_bus || (nd->model && strcmp(nd->model, "ne2k_isa") == 0)) {
             pc_init_ne2k_isa(isa_bus, nd);
         } else {
-            pci_nic_init_nofail(nd, "e1000", NULL);
+            pci_nic_init_nofail(nd, pci_bus, "e1000", NULL);
         }
     }
 }
diff --git a/hw/mips/mips_fulong2e.c b/hw/mips/mips_fulong2e.c
index 00c9071..db67966 100644
--- a/hw/mips/mips_fulong2e.c
+++ b/hw/mips/mips_fulong2e.c
@@ -231,7 +231,7 @@ static void audio_init (PCIBus *pci_bus)
 }
 
 /* Network support */
-static void network_init (void)
+static void network_init (PCIBus *pci_bus)
 {
 int i;
 
@@ -244,7 +244,7 @@ static void network_init (void)
             default_devaddr = "07";
         }
 
-        pci_nic_init_nofail(nd, "rtl8139", default_devaddr);
+        pci_nic_init_nofail(nd, pci_bus, "rtl8139", default_devaddr);
     }
 }
 
@@ -393,7 +393,7 @@ static void mips_fulong2e_init(QEMUMachineInitArgs *args)
 /* Sound card */
 audio_init(pci_bus);
 /* Network card */
-network_init();
+network_init(pci_bus);
 }
 
 static QEMUMachine mips_fulong2e_machine = {
diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c
index 8a4459d..5843fad 100644
--- a/hw/mips/mips_malta.c
+++ b/hw/mips/mips_malta.c
@@ -468,7 +468,7 @@ static MaltaFPGAState *malta_fpga_init(MemoryRegion *address_space,
 }
 
 /* Network support */
-static void network_init(void)
+static void network_init(PCIBus *pci_bus)
 {
 int i;
 
@@ -480,7 +480,7 @@ static void network_init(void)
             /* The malta board has a PCNet card using PCI SLOT 11 */
             default_devaddr = "0b";
 
-        pci_nic_init_nofail(nd, "pcnet", default_devaddr);
+        pci_nic_init_nofail(nd, pci_bus, "pcnet", default_devaddr);
     }
 }
 
@@ -985,7 +985,7 @@ void mips_malta_init(QEMUMachineInitArgs *args)
 fdctrl_init_isa(isa_bus, fd);
 
 /* Network card */
-

[Qemu-devel] [PATCH v3 11/18] pci: Abolish pci_find_root_bus()

2013-07-04 Thread Michael S. Tsirkin

From: David Gibson da...@gibson.dropbear.id.au

pci_find_root_bus() takes a domain parameter.  Currently PCI root buses
with domain other than 0 can't be created, so this is more or less a
long-winded way of retrieving the main PCI root bus.  Numbered domains don't
actually properly cover the (non-x86) possibilities for multiple PCI root
buses, so this patch for now enforces the domain == 0 restriction in other
places to replace pci_find_root_bus() with an explicit
pci_find_primary_bus().

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/pci/pci-hotplug-old.c | 34 +-
 hw/pci/pci.c | 19 +++
 include/hw/pci/pci.h |  2 +-
 3 files changed, 41 insertions(+), 14 deletions(-)

diff --git a/hw/pci/pci-hotplug-old.c b/hw/pci/pci-hotplug-old.c
index a0b5558..7a47d6b 100644
--- a/hw/pci/pci-hotplug-old.c
+++ b/hw/pci/pci-hotplug-old.c
@@ -36,17 +36,23 @@
 #include sysemu/blockdev.h
 #include qapi/error.h
 
-static int pci_read_devaddr(Monitor *mon, const char *addr, int *domp,
+static int pci_read_devaddr(Monitor *mon, const char *addr,
                             int *busp, unsigned *slotp)
 {
+    int dom;
+
     /* strip legacy tag */
     if (!strncmp(addr, "pci_addr=", 9)) {
         addr += 9;
     }
-    if (pci_parse_devaddr(addr, domp, busp, slotp, NULL)) {
+    if (pci_parse_devaddr(addr, &dom, busp, slotp, NULL)) {
         monitor_printf(mon, "Invalid pci address\n");
         return -1;
     }
+    if (dom != 0) {
+        monitor_printf(mon, "Multiple PCI domains not supported, use device_add\n");
+        return -1;
+    }
     return 0;
 }
 
@@ -128,18 +134,22 @@ static int scsi_hot_add(Monitor *mon, DeviceState *adapter,
 
 int pci_drive_hot_add(Monitor *mon, const QDict *qdict, DriveInfo *dinfo)
 {
-    int dom, pci_bus;
+    int pci_bus;
     unsigned slot;
+    PCIBus *root = pci_find_primary_bus();
     PCIDevice *dev;
     const char *pci_addr = qdict_get_str(qdict, "pci_addr");
 
     switch (dinfo->type) {
     case IF_SCSI:
-        if (pci_read_devaddr(mon, pci_addr, &dom, &pci_bus, &slot)) {
+        if (!root) {
+            monitor_printf(mon, "no primary PCI bus\n");
+            goto err;
+        }
+        if (pci_read_devaddr(mon, pci_addr, &pci_bus, &slot)) {
             goto err;
         }
-        dev = pci_find_device(pci_find_root_bus(dom), pci_bus,
-                              PCI_DEVFN(slot, 0));
+        dev = pci_find_device(root, pci_bus, PCI_DEVFN(slot, 0));
         if (!dev) {
             monitor_printf(mon, "no pci device with address %s\n", pci_addr);
             goto err;
@@ -275,16 +285,22 @@ void pci_device_hot_add(Monitor *mon, const QDict *qdict)
 
 static int pci_device_hot_remove(Monitor *mon, const char *pci_addr)
 {
+    PCIBus *root = pci_find_primary_bus();
     PCIDevice *d;
-    int dom, bus;
+    int bus;
     unsigned slot;
     Error *local_err = NULL;
 
-    if (pci_read_devaddr(mon, pci_addr, &dom, &bus, &slot)) {
+    if (!root) {
+        monitor_printf(mon, "no primary PCI bus\n");
+        return -1;
+    }
+
+    if (pci_read_devaddr(mon, pci_addr, &bus, &slot)) {
         return -1;
     }
 
-    d = pci_find_device(pci_find_root_bus(dom), bus, PCI_DEVFN(slot, 0));
+    d = pci_find_device(root, bus, PCI_DEVFN(slot, 0));
     if (!d) {
         monitor_printf(mon, "slot %d empty\n", slot);
         return -1;
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index adf4da5..fc99e3b 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -246,12 +246,12 @@ static void pci_host_bus_register(int domain, PCIBus *bus)
     QLIST_INSERT_HEAD(&host_buses, host, next);
 }
 
-PCIBus *pci_find_root_bus(int domain)
+PCIBus *pci_find_primary_bus(void)
 {
     struct PCIHostBus *host;
 
     QLIST_FOREACH(host, &host_buses, next) {
-        if (host->domain == domain) {
+        if (host->domain == 0) {
             return host->bus;
         }
     }
@@ -583,20 +583,31 @@ int pci_parse_devaddr(const char *addr, int *domp, int *busp,
 
 PCIBus *pci_get_bus_devfn(int *devfnp, const char *devaddr)
 {
+    PCIBus *root = pci_find_primary_bus();
     int dom, bus;
     unsigned slot;
 
+    if (!root) {
+        fprintf(stderr, "No primary PCI bus\n");
+        return NULL;
+    }
+
     if (!devaddr) {
         *devfnp = -1;
-        return pci_find_bus_nr(pci_find_root_bus(0), 0);
+        return pci_find_bus_nr(root, 0);
     }
 
     if (pci_parse_devaddr(devaddr, &dom, &bus, &slot, NULL) < 0) {
         return NULL;
     }
 
+    if (dom != 0) {
+        fprintf(stderr, "No support for non-zero PCI domains\n");
+        return NULL;
+    }
+
     *devfnp = PCI_DEVFN(slot, 0);
-    return pci_find_bus_nr(pci_find_root_bus(dom), bus);
+    return pci_find_bus_nr(root, bus);
 }
 
 static void pci_init_cmask(PCIDevice *dev)
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index b5edef8..7b89d88 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -389,7 +389,7

[Qemu-devel] [PATCH v3 16/18] pci: Simpler implementation of primary PCI bus

2013-07-04 Thread Michael S. Tsirkin

From: David Gibson da...@gibson.dropbear.id.au

Currently pci_find_primary_bus() searches the list of root buses for one
with domain 0.  But since host buses are always registered with domain 0,
this just amounts to finding the only PCI host bus.  The only remaining
users of pci_find_primary_bus() are in pci-hotplug-old.c, which implements
the old style pci_add/pci_del commands.

Therefore, this patch redefines pci_find_primary_bus() to find the only
PCI root bus, returning an error if there are multiple roots.  The callers
in pci-hotplug-old.c are updated correspondingly, to produce sensible
error messages.
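
The selection rule described above ("the only root bus, or nothing") can
be illustrated with a plain singly linked list.  This is a sketch under
assumptions: SketchHostBus and sketch_find_primary are hypothetical names,
and the list head is passed in as a parameter instead of being a global
QLIST.

```c
#include <stddef.h>

/* Hypothetical stand-in for one registered PCI host bus. */
typedef struct SketchHostBus {
    int id;                     /* arbitrary tag, for illustration only */
    struct SketchHostBus *next; /* next registered host bus, or NULL */
} SketchHostBus;

/*
 * Return the single registered host bus.  Return NULL both when the list
 * is empty and when it holds more than one entry, since in the latter
 * case no bus can sensibly be called "primary".
 */
const SketchHostBus *sketch_find_primary(const SketchHostBus *head)
{
    const SketchHostBus *primary = NULL;
    const SketchHostBus *host;

    for (host = head; host != NULL; host = host->next) {
        if (primary) {
            /* A second root bus was found: refuse to pick one. */
            return NULL;
        }
        primary = host;
    }
    return primary;
}
```

Since the two failure cases are indistinguishable to the caller, error
messages at the call sites have to cover both possibilities.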

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/pci/pci-hotplug-old.c | 26 --
 hw/pci/pci.c |  9 ++---
 2 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/hw/pci/pci-hotplug-old.c b/hw/pci/pci-hotplug-old.c
index 807260c..8077289 100644
--- a/hw/pci/pci-hotplug-old.c
+++ b/hw/pci/pci-hotplug-old.c
@@ -62,10 +62,17 @@ static PCIDevice *qemu_pci_hot_add_nic(Monitor *mon,
 {
     Error *local_err = NULL;
     QemuOpts *opts;
+    PCIBus *root = pci_find_primary_bus();
     PCIBus *bus;
     int ret, devfn;
 
-    bus = pci_get_bus_devfn(&devfn, pci_find_primary_bus(), devaddr);
+    if (!root) {
+        monitor_printf(mon, "no primary PCI bus (if there are multiple"
+                       " PCI roots, you must use device_add instead)");
+        return NULL;
+    }
+
+    bus = pci_get_bus_devfn(&devfn, root, devaddr);
     if (!bus) {
         monitor_printf(mon, "Invalid PCI device address %s\n", devaddr);
         return NULL;
@@ -92,8 +99,7 @@ static PCIDevice *qemu_pci_hot_add_nic(Monitor *mon,
         monitor_printf(mon, "Parameter addr not supported\n");
         return NULL;
     }
-    return pci_nic_init(&nd_table[ret], pci_find_primary_bus(),
-                        "rtl8139", devaddr);
+    return pci_nic_init(&nd_table[ret], root, "rtl8139", devaddr);
 }
 
 static int scsi_hot_add(Monitor *mon, DeviceState *adapter,
@@ -144,7 +150,8 @@ int pci_drive_hot_add(Monitor *mon, const QDict *qdict, DriveInfo *dinfo)
     switch (dinfo->type) {
     case IF_SCSI:
         if (!root) {
-            monitor_printf(mon, "no primary PCI bus\n");
+            monitor_printf(mon, "no primary PCI bus (if there are multiple"
+                           " PCI roots, you must use device_add instead)");
             goto err;
         }
         if (pci_read_devaddr(mon, pci_addr, &pci_bus, &slot)) {
@@ -177,6 +184,7 @@ static PCIDevice *qemu_pci_hot_add_storage(Monitor *mon,
     DriveInfo *dinfo = NULL;
     int type = -1;
     char buf[128];
+    PCIBus *root = pci_find_primary_bus();
     PCIBus *bus;
     int devfn;
 
@@ -206,7 +214,12 @@ static PCIDevice *qemu_pci_hot_add_storage(Monitor *mon,
         dinfo = NULL;
     }
 
-    bus = pci_get_bus_devfn(&devfn, pci_find_primary_bus(), devaddr);
+    if (!root) {
+        monitor_printf(mon, "no primary PCI bus (if there are multiple"
+                       " PCI roots, you must use device_add instead)");
+        return NULL;
+    }
+    bus = pci_get_bus_devfn(&devfn, root, devaddr);
     if (!bus) {
         monitor_printf(mon, "Invalid PCI device address %s\n", devaddr);
         return NULL;
@@ -293,7 +306,8 @@ static int pci_device_hot_remove(Monitor *mon, const char *pci_addr)
     Error *local_err = NULL;
 
     if (!root) {
-        monitor_printf(mon, "no primary PCI bus\n");
+        monitor_printf(mon, "no primary PCI bus (if there are multiple"
+                       " PCI roots, you must use device_del instead)");
         return -1;
     }
 
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 2f2db0f..e0995aa 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -249,15 +249,18 @@ static void pci_host_bus_register(int domain, PCIBus *bus)
 
 PCIBus *pci_find_primary_bus(void)
 {
+    PCIBus *primary_bus = NULL;
     struct PCIHostBus *host;
 
     QLIST_FOREACH(host, &host_buses, next) {
-        if (host->domain == 0) {
-            return host->bus;
+        if (primary_bus) {
+            /* We have multiple root buses, refuse to select a primary */
+            return NULL;
         }
+        primary_bus = host->bus;
     }
 
-    return NULL;
+    return primary_bus;
 }
 
 PCIBus *pci_device_root_bus(const PCIDevice *d)
-- 
MST

Re: [Qemu-devel] [PATCH] Makefile: disable parallel build with dtc

2013-07-04 Thread Peter Maydell

On 4 July 2013 09:06, Michael S. Tsirkin m...@redhat.com wrote:
> Sometimes I get this error when building with -j 4:
> ar: two different operation options specified
> make[1]: *** [libfdt/libfdt.a] Error 1
> make: *** [subdir-dtc] Error 2
>
> dtc make does not seem to support parallel make.
> Force non-parallel build to fix this.

So, this is the second time somebody's reported this, and
I think it would be better to try to figure out what's
going on. Can you report what the actual ar command is
when run with V=1 ?

Also, can you confirm that you haven't got an environment
that sets ARFLAGS to something weird (including "") ?

thanks
-- PMM

[Qemu-devel] [PATCH v3 14/18] pci: Add root bus argument to pci_get_bus_devfn()

2013-07-04 Thread Michael S. Tsirkin

From: David Gibson da...@gibson.dropbear.id.au

pci_get_bus_devfn() interprets a full PCI address string to give a PCIBus *
and device/function number within that bus.  Currently it assumes it is
working on an address under the primary PCI root bus.  This patch extends
it to allow the caller to specify a root bus.  This might seem a little odd
since the supplied address can (theoretically) include a PCI domain number.
However, attempting to use a non-zero domain number there is currently an
error, so that shouldn't really cause problems.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/pci/pci-hotplug-old.c | 4 ++--
 hw/pci/pci.c             | 7 ++++---
 include/hw/pci/pci.h     | 2 +-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/hw/pci/pci-hotplug-old.c b/hw/pci/pci-hotplug-old.c
index e251810..e92d646 100644
--- a/hw/pci/pci-hotplug-old.c
+++ b/hw/pci/pci-hotplug-old.c
@@ -65,7 +65,7 @@ static PCIDevice *qemu_pci_hot_add_nic(Monitor *mon,
     PCIBus *bus;
     int ret, devfn;
 
-    bus = pci_get_bus_devfn(&devfn, devaddr);
+    bus = pci_get_bus_devfn(&devfn, pci_find_primary_bus(), devaddr);
     if (!bus) {
         monitor_printf(mon, "Invalid PCI device address %s\n", devaddr);
         return NULL;
@@ -205,7 +205,7 @@ static PCIDevice *qemu_pci_hot_add_storage(Monitor *mon,
         dinfo = NULL;
     }
 
-    bus = pci_get_bus_devfn(&devfn, devaddr);
+    bus = pci_get_bus_devfn(&devfn, pci_find_primary_bus(), devaddr);
     if (!bus) {
         monitor_printf(mon, "Invalid PCI device address %s\n", devaddr);
         return NULL;
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 350b872..c4f63ad 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -589,12 +589,13 @@ int pci_parse_devaddr(const char *addr, int *domp, int *busp,
     return 0;
 }
 
-PCIBus *pci_get_bus_devfn(int *devfnp, const char *devaddr)
+PCIBus *pci_get_bus_devfn(int *devfnp, PCIBus *root, const char *devaddr)
 {
-    PCIBus *root = pci_find_primary_bus();
     int dom, bus;
     unsigned slot;
 
+    assert(!root->parent_dev);
+
     if (!root) {
         fprintf(stderr, "No primary PCI bus\n");
         return NULL;
@@ -1588,7 +1589,7 @@ PCIDevice *pci_nic_init(NICInfo *nd, const char *default_model,
     if (i < 0)
         return NULL;
 
-    bus = pci_get_bus_devfn(&devfn, devaddr);
+    bus = pci_get_bus_devfn(&devfn, pci_find_primary_bus(), devaddr);
     if (!bus) {
         error_report("Invalid PCI device address %s for device %s",
                      devaddr, pci_nic_names[i]);
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index e0597b7..3a43fba 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -394,7 +394,7 @@ PCIBus *pci_device_root_bus(const PCIDevice *d);
 const char *pci_root_bus_path(PCIDevice *dev);
 PCIDevice *pci_find_device(PCIBus *bus, int bus_num, uint8_t devfn);
 int pci_qdev_find_device(const char *id, PCIDevice **pdev);
-PCIBus *pci_get_bus_devfn(int *devfnp, const char *devaddr);
+PCIBus *pci_get_bus_devfn(int *devfnp, PCIBus *root, const char *devaddr);
 
 int pci_parse_devaddr(const char *addr, int *domp, int *busp,
   unsigned int *slotp, unsigned int *funcp);
-- 
MST

[Qemu-devel] [PATCH v3 18/18] pci: Fold host_buses list into PCIHostState functionality

2013-07-04 Thread Michael S. Tsirkin

From: David Gibson da...@gibson.dropbear.id.au

The host_buses list is an odd structure - a list of pointers to PCI root
buses existing in parallel to the normal qdev tree structure.  This patch
removes it, instead putting the link pointers into the PCIHostState
structures, which have a 1:1 relationship to PCIHostBus structures anyway.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/pci/pci.c  | 33 ++---
 include/hw/pci/pci_host.h |  2 ++
 2 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index d861b40..8680063 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -90,11 +90,7 @@ static void pci_del_option_rom(PCIDevice *pdev);
 static uint16_t pci_default_sub_vendor_id = PCI_SUBVENDOR_ID_REDHAT_QUMRANET;
 static uint16_t pci_default_sub_device_id = PCI_SUBDEVICE_ID_QEMU;
 
-struct PCIHostBus {
-    struct PCIBus *bus;
-    QLIST_ENTRY(PCIHostBus) next;
-};
-static QLIST_HEAD(, PCIHostBus) host_buses;
+static QLIST_HEAD(, PCIHostState) pci_host_bridges;
 
 static const VMStateDescription vmstate_pcibus = {
     .name = "PCIBUS",
@@ -237,20 +233,19 @@ static int pcibus_reset(BusState *qbus)
     return 1;
 }
 
-static void pci_host_bus_register(PCIBus *bus)
+static void pci_host_bus_register(PCIBus *bus, DeviceState *parent)
 {
-    struct PCIHostBus *host;
-    host = g_malloc0(sizeof(*host));
-    host->bus = bus;
-    QLIST_INSERT_HEAD(&host_buses, host, next);
+    PCIHostState *host_bridge = PCI_HOST_BRIDGE(parent);
+
+    QLIST_INSERT_HEAD(&pci_host_bridges, host_bridge, next);
 }
 
 PCIBus *pci_find_primary_bus(void)
 {
     PCIBus *primary_bus = NULL;
-    struct PCIHostBus *host;
+    PCIHostState *host;
 
-    QLIST_FOREACH(host, &host_buses, next) {
+    QLIST_FOREACH(host, &pci_host_bridges, next) {
         if (primary_bus) {
             /* We have multiple root buses, refuse to select a primary */
             return NULL;
@@ -302,7 +297,7 @@ static void pci_bus_init(PCIBus *bus, DeviceState *parent,
     /* host bridge */
     QLIST_INIT(&bus->child);
 
-    pci_host_bus_register(bus);
+    pci_host_bus_register(bus, parent);
 
     vmstate_register(NULL, -1, &vmstate_pcibus, bus);
 }
@@ -1533,11 +1528,11 @@ static PciInfo *qmp_query_pci_bus(PCIBus *bus, int bus_num)
 PciInfoList *qmp_query_pci(Error **errp)
 {
     PciInfoList *info, *head = NULL, *cur_item = NULL;
-    struct PCIHostBus *host;
+    PCIHostState *host_bridge;
 
-    QLIST_FOREACH(host, &host_buses, next) {
+    QLIST_FOREACH(host_bridge, &pci_host_bridges, next) {
         info = g_malloc0(sizeof(*info));
-        info->value = qmp_query_pci_bus(host->bus, 0);
+        info->value = qmp_query_pci_bus(host_bridge->bus, 0);
 
         /* XXX: waiting for the qapi to support GSList */
         if (!cur_item) {
@@ -2201,11 +2196,11 @@ static int pci_qdev_find_recursive(PCIBus *bus,
 
 int pci_qdev_find_device(const char *id, PCIDevice **pdev)
 {
-    struct PCIHostBus *host;
+    PCIHostState *host_bridge;
     int rc = -ENODEV;
 
-    QLIST_FOREACH(host, &host_buses, next) {
-        int tmp = pci_qdev_find_recursive(host->bus, id, pdev);
+    QLIST_FOREACH(host_bridge, &pci_host_bridges, next) {
+        int tmp = pci_qdev_find_recursive(host_bridge->bus, id, pdev);
         if (!tmp) {
             rc = 0;
             break;
diff --git a/include/hw/pci/pci_host.h b/include/hw/pci/pci_host.h
index 44052f2..ba31595 100644
--- a/include/hw/pci/pci_host.h
+++ b/include/hw/pci/pci_host.h
@@ -46,6 +46,8 @@ struct PCIHostState {
     MemoryRegion mmcfg;
     uint32_t config_reg;
     PCIBus *bus;
+
+    QLIST_ENTRY(PCIHostState) next;
 };
 
 typedef struct PCIHostBridgeClass {
-- 
MST
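
The idea of this patch, keeping the list link inside the object instead of
in a separately allocated wrapper node, can be sketched with the BSD
<sys/queue.h> macros that QEMU's QLIST macros are modelled on. This is only
an illustration under that assumption: struct host_bridge, bridge_register()
and bridge_count() are hypothetical names, and an int stands in for the
PCIBus pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical host bridge; only the fields needed for the sketch. */
struct host_bridge {
    int bus_id;                    /* stands in for the PCIBus pointer */
    LIST_ENTRY(host_bridge) next;  /* link lives inside the object itself */
};

LIST_HEAD(bridge_list, host_bridge);

/* Register a bridge: no allocation is needed, the link is embedded. */
static void bridge_register(struct bridge_list *list, struct host_bridge *hb)
{
    assert(list != NULL && hb != NULL);
    LIST_INSERT_HEAD(list, hb, next);
}

/* Walk the list and count the registered bridges. */
static size_t bridge_count(struct bridge_list *list)
{
    struct host_bridge *hb;
    size_t n = 0;

    LIST_FOREACH(hb, list, next) {
        n++;
    }
    return n;
}
```

Because the link is embedded, registration cannot fail for lack of memory
and there is no wrapper node whose lifetime has to be tracked separately.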

[Qemu-devel] [PATCH 1/1] hw/9pfs: Fix memory leak in error path

2013-07-04 Thread M. Mohan Kumar

From: M. Mohan Kumar mo...@in.ibm.com

Fix a few more memory leaks in virtio-9p-device.c detected using valgrind.

Signed-off-by: M. Mohan Kumar mo...@in.ibm.com
---
 hw/9pfs/virtio-9p-device.c | 26 +-
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index dc6f4e4..35e2af4 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -68,14 +68,14 @@ static int virtio_9p_device_init(VirtIODevice *vdev)
         fprintf(stderr, "Virtio-9p device couldn't find fsdev with the "
                 "id = %s\n",
                 s->fsconf.fsdev_id ? s->fsconf.fsdev_id : "NULL");
-        return -1;
+        goto out;
     }
 
     if (!s->fsconf.tag) {
         /* we haven't specified a mount_tag */
         fprintf(stderr, "fsdev with id %s needs mount_tag arguments\n",
                 s->fsconf.fsdev_id);
-        return -1;
+        goto out;
     }
 
     s->ctx.export_flags = fse->export_flags;
@@ -85,10 +85,10 @@ static int virtio_9p_device_init(VirtIODevice *vdev)
     if (len > MAX_TAG_LEN - 1) {
         fprintf(stderr, "mount tag '%s' (%d bytes) is longer than "
                 "maximum (%d bytes)", s->fsconf.tag, len, MAX_TAG_LEN - 1);
-        return -1;
+        goto out;
     }
 
-    s->tag = strdup(s->fsconf.tag);
+    s->tag = g_strdup(s->fsconf.tag);
     s->ctx.uid = -1;
 
     s->ops = fse->ops;
@@ -99,11 +99,11 @@ static int virtio_9p_device_init(VirtIODevice *vdev)
     if (s->ops->init(&s->ctx) < 0) {
         fprintf(stderr, "Virtio-9p Failed to initialize fs-driver with id:%s"
                 " and export path:%s\n", s->fsconf.fsdev_id, s->ctx.fs_root);
-        return -1;
+        goto out;
     }
     if (v9fs_init_worker_threads() < 0) {
         fprintf(stderr, "worker thread initialization failed\n");
-        return -1;
+        goto out;
     }
 
     /*
@@ -115,18 +115,26 @@ static int virtio_9p_device_init(VirtIODevice *vdev)
     if (s->ops->name_to_path(&s->ctx, NULL, "/", &path) < 0) {
         fprintf(stderr,
                 "error in converting name to path %s", strerror(errno));
-        return -1;
+        goto out;
     }
     if (s->ops->lstat(&s->ctx, &path, &stat)) {
         fprintf(stderr, "share path %s does not exist\n", fse->path);
-        return -1;
+        goto out;
     } else if (!S_ISDIR(stat.st_mode)) {
         fprintf(stderr, "share path %s is not a directory\n", fse->path);
-        return -1;
+        goto out;
     }
     v9fs_path_free(&path);
 
     return 0;
+out:
+    g_free(s->ctx.fs_root);
+    g_free(s->tag);
+    virtio_cleanup(vdev);
+    v9fs_path_free(&path);
+
+    return -1;
+
 }
 
 /* virtio-9p device */
-- 
1.7.11.7
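
The pattern this patch switches to, one cleanup label shared by every
error path, can be sketched in isolation like this. It is a minimal
illustration using plain malloc/free rather than glib; struct dev_state,
dup_string() and dev_state_init() are hypothetical names and do not exist
in the 9pfs code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device state; field names are illustrative only. */
struct dev_state {
    char *root;
    char *tag;
};

/* Heap-allocate a copy of src; returns NULL on allocation failure. */
static char *dup_string(const char *src)
{
    size_t n = strlen(src) + 1;
    char *copy = malloc(n);

    if (copy != NULL) {
        memcpy(copy, src, n);
    }
    return copy;
}

/*
 * Fill in *s.  Returns 0 on success and -1 on failure; on failure every
 * allocation made so far is released and both pointers are reset to NULL.
 */
static int dev_state_init(struct dev_state *s, const char *root,
                          const char *tag, size_t max_tag_len)
{
    assert(s != NULL && root != NULL && tag != NULL);
    s->root = NULL;
    s->tag = NULL;

    s->root = dup_string(root);
    if (s->root == NULL) {
        goto out;
    }
    if (strlen(tag) > max_tag_len) {
        goto out; /* s->root is already allocated at this point */
    }
    s->tag = dup_string(tag);
    if (s->tag == NULL) {
        goto out;
    }
    return 0;

out:
    free(s->tag);  /* free(NULL) is a no-op */
    free(s->root);
    s->tag = NULL;
    s->root = NULL;
    return -1;
}
```

Initialising the pointers to NULL before the first goto is what makes the
shared label safe: the cleanup code can free everything unconditionally.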

Re: [Qemu-devel] [PATCH v3 00/14] tcg: remainder and tcg-arm updates

2013-07-04 Thread Claudio Fontana

On 03.07.2013 23:29, Richard Henderson wrote:
> Changes v2->v3:
>   * Add myself to tcg maintainers, as per afaerber's suggestion.
>   * Fix rebase error wrt aarch64, as per claudio.
>   * Include tcg-arm unwind patch set; no point in half measures.
>
>
> r~
>
>
> The following changes since commit ab8bf29078e0ab8347e2ff8b4e5542f7a0c751cf:
>
>   Merge remote-tracking branch 'qemu-kvm/uq/master' into staging (2013-07-03 08:37:00 -0500)
>
> are available in the git repository at:
>
>   git://github.com/rth7680/qemu.git tcg-next
>
> for you to fetch changes up to 6688d5d7eefa67b5f50b6f03a2456e4635781b3b:
>
>   tcg-arm: Implement tcg_register_jit (2013-07-03 11:17:57 -0700)
>
>
> Richard Henderson (14):
>   tcg: Add myself to general TCG maintainership
>   tcg: Split rem requirement from div requirement
>   tcg-arm: Don't implement rem
>   tcg-ppc: Don't implement rem
>   tcg-ppc64: Don't implement rem
>   tcg: Allow non-constant control macros
>   tcg: Simplify logic using TCG_OPF_NOT_PRESENT
>   tcg-arm: Make use of conditional availability of opcodes for divide
>   tcg-arm: Simplify logic in detecting the ARM ISA in use
>   tcg-arm: Use AT_PLATFORM to detect the host ISA
>   tcg: Fix high_pc fields in .debug_info
>   tcg: Move the CIE and FDE header definitions to common code
>   tcg-i386: Use QEMU_BUILD_BUG_ON instead of assert for frame size
>   tcg-arm: Implement tcg_register_jit
>
>  MAINTAINERS              |   1 +
>  tcg/aarch64/tcg-target.h |   2 +
>  tcg/arm/tcg-target.c     | 172 ++-
>  tcg/arm/tcg-target.h     |  15 +++--
>  tcg/hppa/tcg-target.c    |  35 +++---
>  tcg/hppa/tcg-target.h    |   1 +
>  tcg/i386/tcg-target.c    |  45 +
>  tcg/ia64/tcg-target.h    |   2 +
>  tcg/mips/tcg-target.h    |   1 +
>  tcg/ppc/tcg-target.c     |  14 
>  tcg/ppc/tcg-target.h     |   1 +
>  tcg/ppc64/tcg-target.c   |  26 ---
>  tcg/ppc64/tcg-target.h   |   2 +
>  tcg/sparc/tcg-target.c   |  35 +++---
>  tcg/sparc/tcg-target.h   |   2 +
>  tcg/tcg-op.h             |  32 +++--
>  tcg/tcg-opc.h            |  36 +-
>  tcg/tcg.c                |  26 +--
>  tcg/tcg.h                |   6 +-
>  tcg/tci/tcg-target.h     |   2 +
>  20 files changed, 242 insertions(+), 214 deletions(-)
>

Tested tcg/aarch64 on Aarch64 Foundationv8 (sparc-softmmu, arm-softmmu, 
x86_64-softmmu).

Tested-by: Claudio Fontana claudio.font...@huawei.com
Reviewed-by: Claudio Fontana claudio.font...@huawei.com

[Qemu-devel] [PATCH v5] Xen PV Device

2013-07-04 Thread Paul Durrant

Introduces a new Xen PV PCI device which will act as a binding point for
PV drivers for Xen.
The device has parameterized vendor-id, device-id and revision to allow it
to be configured as a binding point for any vendor's PV drivers.

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Stefano Stabellini stefano.stabell...@citrix.com
Reviewed-by: Andreas Färber afaer...@suse.de
---

V5:
- Addresses comments from Andreas Färber

V4:
- Renamed from 'Citrix PV Bus' to 'Xen PV Device'
- Parameterized vendor-id and device-id as requested by Stefano Stabellini

V3:
- Addresses comments from Anthony Liguori and Peter Maydell

V2:
- Addresses comments from Andreas Farber and Paolo Bonzini

 hw/xen/Makefile.objs |1 +
 hw/xen/xen_pvdevice.c|  131 ++
 include/hw/pci/pci_ids.h |5 +-
 trace-events |4 ++
 4 files changed, 139 insertions(+), 2 deletions(-)
 create mode 100644 hw/xen/xen_pvdevice.c

diff --git a/hw/xen/Makefile.objs b/hw/xen/Makefile.objs
index 2017560..cd2df6a 100644
--- a/hw/xen/Makefile.objs
+++ b/hw/xen/Makefile.objs
@@ -1,5 +1,6 @@
 # xen backend driver support
 common-obj-$(CONFIG_XEN_BACKEND) += xen_backend.o xen_devconfig.o
+common-obj-y += xen_pvdevice.o
 
 obj-$(CONFIG_XEN_I386) += xen_platform.o xen_apic.o
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
diff --git a/hw/xen/xen_pvdevice.c b/hw/xen/xen_pvdevice.c
new file mode 100644
index 000..93dfab2
--- /dev/null
+++ b/hw/xen/xen_pvdevice.c
@@ -0,0 +1,131 @@
+/* Copyright (c) Citrix Systems Inc.
+ * All rights reserved.
+ * 
+ * Redistribution and use in source and binary forms, 
+ * with or without modification, are permitted provided 
+ * that the following conditions are met:
+ * 
+ * *   Redistributions of source code must retain the above 
+ * copyright notice, this list of conditions and the 
+ * following disclaimer.
+ * *   Redistributions in binary form must reproduce the above 
+ * copyright notice, this list of conditions and the 
+ * following disclaimer in the documentation and/or other 
+ * materials provided with the distribution.
+ * 
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND 
+ * CONTRIBUTORS AS IS AND ANY EXPRESS OR IMPLIED WARRANTIES, 
+ * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 
+ * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 
+ * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR 
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 
+ * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF 
+ * SUCH DAMAGE.
+ */
+
+#include "hw/hw.h"
+#include "hw/pci/pci.h"
+#include "trace.h"
+
+#define TYPE_XEN_PV_DEVICE  "xen-pvdevice"
+
+#define XEN_PV_DEVICE(obj) \
+     OBJECT_CHECK(XenPVDevice, (obj), TYPE_XEN_PV_DEVICE)
+
+typedef struct XenPVDevice {
+    /* private */
+    PCIDevice       parent_obj;
+    /* public */
+    uint16_t        vendor_id;
+    uint16_t        device_id;
+    uint8_t         revision;
+    uint32_t        size;
+    MemoryRegion    mmio;
+} XenPVDevice;
+
+static uint64_t xen_pv_mmio_read(void *opaque, hwaddr addr,
+                                 unsigned size)
+{
+    trace_xen_pv_mmio_read(addr);
+
+    return ~(uint64_t)0;
+}
+
+static void xen_pv_mmio_write(void *opaque, hwaddr addr,
+                              uint64_t val, unsigned size)
+{
+    trace_xen_pv_mmio_write(addr);
+}
+
+static const MemoryRegionOps xen_pv_mmio_ops = {
+    .read = xen_pv_mmio_read,
+    .write = xen_pv_mmio_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+};
+
+static int xen_pv_init(PCIDevice *pci_dev)
+{
+    XenPVDevice *d = XEN_PV_DEVICE(pci_dev);
+    uint8_t *pci_conf;
+
+    pci_conf = pci_dev->config;
+
+    pci_set_word(pci_conf + PCI_VENDOR_ID, d->vendor_id);
+    pci_set_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID, d->vendor_id);
+    pci_set_word(pci_conf + PCI_DEVICE_ID, d->device_id);
+    pci_set_word(pci_conf + PCI_SUBSYSTEM_ID, d->device_id);
+    pci_set_byte(pci_conf + PCI_REVISION_ID, d->revision);
+
+    pci_set_word(pci_conf + PCI_COMMAND, PCI_COMMAND_MEMORY);
+
+    pci_config_set_prog_interface(pci_conf, 0);
+
+    pci_conf[PCI_INTERRUPT_PIN] = 1;
+
+    memory_region_init_io(&d->mmio, &xen_pv_mmio_ops, d,
+                          "mmio", d->size);
+
+    pci_register_bar(pci_dev, 1, PCI_BASE_ADDRESS_MEM_PREFETCH,
+                     &d->mmio);
+
+    return 0;
+}
+
+static Property xen_pv_props[] = {
+    DEFINE_PROP_UINT16("vendor-id", XenPVDevice, vendor_id, PCI_VENDOR_ID_XEN),
+    DEFINE_PROP_UINT16("device-id", XenPVDevice, device_id, PCI_DEVICE_ID_XEN_PVDEVICE),
+

[Qemu-devel] [PATCH v3 17/18] pci: Remove domain from PCIHostBus

2013-07-04 Thread Michael S. Tsirkin

From: David Gibson da...@gibson.dropbear.id.au

There are now no users of the domain field of PCIHostBus, so remove it
from the structure, and as a parameter from the pci_host_bus_register()
function which sets it.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/pci/pci.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index e0995aa..d861b40 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -91,7 +91,6 @@ static uint16_t pci_default_sub_vendor_id = PCI_SUBVENDOR_ID_REDHAT_QUMRANET;
 static uint16_t pci_default_sub_device_id = PCI_SUBDEVICE_ID_QEMU;
 
 struct PCIHostBus {
-    int domain;
     struct PCIBus *bus;
     QLIST_ENTRY(PCIHostBus) next;
 };
@@ -238,11 +237,10 @@ static int pcibus_reset(BusState *qbus)
     return 1;
 }
 
-static void pci_host_bus_register(int domain, PCIBus *bus)
+static void pci_host_bus_register(PCIBus *bus)
 {
     struct PCIHostBus *host;
     host = g_malloc0(sizeof(*host));
-    host->domain = domain;
     host->bus = bus;
     QLIST_INSERT_HEAD(&host_buses, host, next);
 }
@@ -303,7 +301,8 @@ static void pci_bus_init(PCIBus *bus, DeviceState *parent,
 
     /* host bridge */
     QLIST_INIT(&bus->child);
-    pci_host_bus_register(0, bus); /* for now only pci domain 0 is supported */
+
+    pci_host_bus_register(bus);
 
     vmstate_register(NULL, -1, &vmstate_pcibus, bus);
 }
-- 
MST

Re: [Qemu-devel] [PATCH] Makefile: disable parallel build with dtc

2013-07-04 Thread Andreas Färber

On 04.07.2013 11:17, Peter Maydell wrote:
> On 4 July 2013 09:06, Michael S. Tsirkin m...@redhat.com wrote:
>> Sometimes I get this error when building with -j 4:
>> ar: two different operation options specified
>> make[1]: *** [libfdt/libfdt.a] Error 1
>> make: *** [subdir-dtc] Error 2
>>
>> dtc make does not seem to support parallel make.
>> Force non-parallel build to fix this.
>
> So, this is the second time somebody's reported this, and
> I think it would be better to try to figure out what's
> going on. Can you report what the actual ar command is
> when run with V=1 ?
>
> Also, can you confirm that you haven't got an environment
> that sets ARFLAGS to something weird (including "") ?

I did confirm that my environment does not have ARFLAGS set; I believe
the issue is that ARFLAGS="$(ARFLAGS)" is being passed in the Makefile,
effectively setting it to "".

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCH] Makefile: disable parallel build with dtc

2013-07-04 Thread Michael S. Tsirkin

On Thu, Jul 04, 2013 at 10:17:46AM +0100, Peter Maydell wrote:
> On 4 July 2013 09:06, Michael S. Tsirkin m...@redhat.com wrote:
> > Sometimes I get this error when building with -j 4:
> > ar: two different operation options specified
> > make[1]: *** [libfdt/libfdt.a] Error 1
> > make: *** [subdir-dtc] Error 2
> >
> > dtc make does not seem to support parallel make.
> > Force non-parallel build to fix this.
>
> So, this is the second time somebody's reported this, and
> I think it would be better to try to figure out what's
> going on. Can you report what the actual ar command is
> when run with V=1 ?

it stopped reproducing now :(

> Also, can you confirm that you haven't got an environment
> that sets ARFLAGS to something weird (including "") ?
>
> thanks
> -- PMM

I can confirm that.

Re: [Qemu-devel] [PATCH] Makefile: disable parallel build with dtc

2013-07-04 Thread Peter Maydell

On 4 July 2013 10:45, Andreas Färber afaer...@suse.de wrote:
> On 04.07.2013 11:17, Peter Maydell wrote:
>> Also, can you confirm that you haven't got an environment
>> that sets ARFLAGS to something weird (including "") ?
>
> I did confirm that my environment does not have ARFLAGS set; I believe
> the issue is that ARFLAGS="$(ARFLAGS)" is being passed in the Makefile,
> effectively setting it to "".

That should set it to rv, because the top level make will
default ARFLAGS to that if you haven't set it explicitly.

You can test this by seeing whether a V=1 build runs the libfdt make
with ARFLAGS=rv or something else:

cam-vm-266:precise:qemu$ make -C build/x86 -j4 V=1
make: Entering directory `/home/petmay01/linaro/qemu-from-laptop/qemu/build/x86'
make -I/home/petmay01/linaro/qemu-from-laptop/qemu/dtc
VPATH=/home/petmay01/linaro/qemu-from-laptop/qemu/dtc -C dtc V=1
LIBFDT_srcdir=/home/petmay01/linaro/qemu-from-laptop/qemu/dtc/libfdt
CPPFLAGS=-I/home/petmay01/linaro/qemu-from-laptop/qemu/build/x86/dtc
-I/home/petmay01/linaro/qemu-from-laptop/qemu/dtc
-I/home/petmay01/linaro/qemu-from-laptop/qemu/dtc/libfdt CFLAGS=-O2
-D_FORTIFY_SOURCE=2 -g  -Werror -m64 -D_GNU_SOURCE
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes
-Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes
-fno-strict-aliasing  -Wendif-labels -Wmissing-include-dirs
-Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k
-Winit-self -Wignored-qualifiers -Wold-style-declaration
-Wold-style-definition -Wtype-limits -fstack-protector-all
-I/usr/include/libpng12   -I/usr/include/pixman-1
-I/home/petmay01/linaro/qemu-from-laptop/qemu/dtc/libfdt -pthread
-I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include
-I/home/petmay01/linaro/qemu-from-laptop/qemu/tests
LDFLAGS=-Wl,--warn-common -m64 -static -g  ARFLAGS=rv CC=ccache
gcc AR=ar LD=ld
BUILD_DIR=/home/petmay01/linaro/qemu-from-laptop/qemu/build/x86
libfdt/libfdt.a
make[1]: Entering directory
`/home/petmay01/linaro/qemu-from-laptop/qemu/build/x86/dtc'

(this is GNU Make 3.81 from ubuntu package 3.81-8.1ubuntu1.1)

-- PMM

[Qemu-devel] [PATCH V4 06/10] NUMA: split out the common range parser

2013-07-04 Thread Wanlong Gao

The cpus parser and the hostnode parser share the same range-parsing
logic, so split it out into a common range parser to avoid duplicated
code.

Reviewed-by: Bandan Das b...@redhat.com
Signed-off-by: Wanlong Gao gaowanl...@cn.fujitsu.com
---
 vl.c | 89 
 1 file changed, 37 insertions(+), 52 deletions(-)

diff --git a/vl.c b/vl.c
index 38e0d3d..6e86dcf 100644
--- a/vl.c
+++ b/vl.c
@@ -1338,47 +1338,55 @@ char *get_boot_devices_list(size_t *size)
     return list;
 }
 
-static void numa_node_parse_cpus(int nodenr, const char *cpus, Error **errp)
+static int numa_node_parse_common(const char *str,
+                                  unsigned long long *value,
+                                  unsigned long long *endvalue)
 {
     char *endptr;
-    unsigned long long value, endvalue;
-
-    /* Empty CPU range strings will be considered valid, they will simply
-     * not set any bit in the CPU bitmap.
-     */
-    if (!*cpus) {
-        return;
+    if (parse_uint(str, value, &endptr, 10) < 0) {
+        return -1;
     }
 
-    if (parse_uint(cpus, &value, &endptr, 10) < 0) {
-        goto error;
-    }
     if (*endptr == '-') {
-        if (parse_uint_full(endptr + 1, &endvalue, 10) < 0) {
-            goto error;
+        if (parse_uint_full(endptr + 1, endvalue, 10) < 0) {
+            return -1;
         }
     } else if (*endptr == '\0') {
-        endvalue = value;
+        *endvalue = *value;
     } else {
-        goto error;
+        return -1;
     }
 
-    if (endvalue >= MAX_CPUMASK_BITS) {
-        endvalue = MAX_CPUMASK_BITS - 1;
-        fprintf(stderr,
-            "qemu: NUMA: A max of %d VCPUs are supported\n",
-             MAX_CPUMASK_BITS);
+    if (*endvalue >= MAX_CPUMASK_BITS) {
+        *endvalue = MAX_CPUMASK_BITS - 1;
+        fprintf(stderr, "qemu: NUMA: A max number %d is supported\n",
+                MAX_CPUMASK_BITS);
     }
 
-    if (endvalue < value) {
-        goto error;
+    if (*endvalue < *value) {
+        return -1;
     }
 
-    bitmap_set(numa_info[nodenr].node_cpu, value, endvalue-value+1);
-    return;
+    return 0;
+}
 
-error:
-    error_setg(errp, "Invalid NUMA CPU range: %s\n", cpus);
+static void numa_node_parse_cpus(int nodenr, const char *cpus, Error **errp)
+{
+    unsigned long long value, endvalue;
+
+    /* Empty CPU range strings will be considered valid, they will simply
+     * not set any bit in the CPU bitmap.
+     */
+    if (!*cpus) {
+        return;
+    }
+
+    if (numa_node_parse_common(cpus, &value, &endvalue) < 0) {
+        error_setg(errp, "Invalid NUMA CPU range: %s", cpus);
+        return;
+    }
+
+    bitmap_set(numa_info[nodenr].node_cpu, value, endvalue-value+1);
     return;
 }
 
@@ -1403,7 +1411,6 @@ void numa_node_parse_mpol(int nodenr, const char *mpol, Error **errp)
 void numa_node_parse_hostnode(int nodenr, const char *hostnode, Error **errp)
 {
     unsigned long long value, endvalue;
-    char *endptr;
     bool clear = false;
     unsigned long *bm = numa_info[nodenr].host_mem;
 
@@ -1422,27 +1429,9 @@ void numa_node_parse_hostnode(int nodenr, const char *hostnode, Error **errp)
         return;
     }
 
-    if (parse_uint(hostnode, &value, &endptr, 10) < 0)
-        goto error;
-    if (*endptr == '-') {
-        if (parse_uint_full(endptr + 1, &endvalue, 10) < 0) {
-            goto error;
-        }
-    } else if (*endptr == '\0') {
-        endvalue = value;
-    } else {
-        goto error;
-    }
-
-    if (endvalue >= MAX_CPUMASK_BITS) {
-        endvalue = MAX_CPUMASK_BITS - 1;
-        fprintf(stderr,
-            "qemu: NUMA: A max of %d host nodes are supported\n",
-             MAX_CPUMASK_BITS);
-    }
-
-    if (endvalue < value) {
-        goto error;
+    if (numa_node_parse_common(hostnode, &value, &endvalue) < 0) {
+        error_setg(errp, "Invalid host NUMA nodes range: %s", hostnode);
+        return;
     }
 
     if (clear)
@@ -1451,10 +1440,6 @@ void numa_node_parse_hostnode(int nodenr, const char *hostnode, Error **errp)
     bitmap_set(bm, value, endvalue - value + 1);
 
     return;
-
-error:
-    error_setg(errp, "Invalid host NUMA nodes range: %s", hostnode);
-    return;
 }
 
 static int numa_add_cpus(const char *name, const char *value, void *opaque)
-- 
1.8.3.2.634.g7a3187e
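
For readers without the QEMU tree at hand, the "N" or "N-M" parsing that
this patch factors out can be approximated in standard C as below. This is
a sketch under stated assumptions, not the patch's code: parse_range() is a
hypothetical name, strtoull() replaces QEMU's parse_uint() helpers, and the
clamping to MAX_CPUMASK_BITS is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse "N" or "N-M" (decimal) into *first and *last.
 * Returns 0 on success, -1 on malformed input or when M < N.
 */
static int parse_range(const char *str, unsigned long long *first,
                       unsigned long long *last)
{
    char *end;

    assert(str != NULL && first != NULL && last != NULL);

    /* strtoull() accepts leading blanks and signs; reject them up front. */
    if (!isdigit((unsigned char)str[0])) {
        return -1;
    }
    errno = 0;
    *first = strtoull(str, &end, 10);
    if (errno != 0) {
        return -1;
    }

    if (*end == '\0') {
        *last = *first;            /* single value: a range of one */
        return 0;
    }
    if (*end != '-' || !isdigit((unsigned char)end[1])) {
        return -1;
    }
    errno = 0;
    *last = strtoull(end + 1, &end, 10);
    if (errno != 0 || *end != '\0') {
        return -1;                 /* overflow or trailing garbage */
    }
    return (*last < *first) ? -1 : 0;
}
```

Both callers in the patch then only differ in the error message they
report and in which bitmap they update with the parsed range.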

[Qemu-devel] [PATCH V4 01/10] NUMA: Support multiple CPU ranges on -numa option

2013-07-04 Thread Wanlong Gao

From: Bandan Das b...@redhat.com

This allows us to use the cpus property multiple times
to specify multiple cpu (ranges) to the -numa option :

-numa node,cpus=1,cpus=2,cpus=4
or
-numa node,cpus=1-3,cpus=5

Note that after this patch, the default suffix of -numa node,mem=N
will no longer be M. So we must add the suffix M like -numa node,mem=NM
when assigning N MB of node memory size.
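
As an illustration of what repeated cpus= values amount to, each value
contributes one range and the ranges are OR-ed into the node's CPU mask.
The sketch below uses a single 64-bit word instead of QEMU's bitmap API,
and mask_set_range() is a hypothetical helper, not a function from vl.c.

```c
#include <assert.h>
#include <stdint.h>

/* Set bits first..last (inclusive) in a 64-bit mask; both must be < 64. */
static uint64_t mask_set_range(uint64_t mask, unsigned first, unsigned last)
{
    unsigned bit;

    assert(first <= last && last < 64);
    for (bit = first; bit <= last; bit++) {
        mask |= UINT64_C(1) << bit;
    }
    return mask;
}
```

For "-numa node,cpus=1-3,cpus=5" the two calls would be
mask_set_range(m, 1, 3) followed by mask_set_range(m, 5, 5), leaving
CPUs 1, 2, 3 and 5 set.
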

Signed-off-by: Bandan Das b...@redhat.com
Signed-off-by: Wanlong Gao gaowanl...@cn.fujitsu.com
---
 qemu-options.hx |   3 +-
 vl.c| 108 ++--
 2 files changed, 67 insertions(+), 44 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 137a39b..449cf36 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -100,7 +100,8 @@ STEXI
 @item -numa @var{opts}
 @findex -numa
 Simulate a multi node NUMA system. If mem and cpus are omitted, resources
-are split equally.
+are split equally. The -cpus property may be specified multiple times
+to denote multiple cpus or cpu ranges.
 ETEXI
 
 DEF(add-fd, HAS_ARG, QEMU_OPTION_add_fd,
diff --git a/vl.c b/vl.c
index 6d9fd7d..6f2e17a 100644
--- a/vl.c
+++ b/vl.c
@@ -516,6 +516,32 @@ static QemuOptsList qemu_realtime_opts = {
     },
 };
 
+static QemuOptsList qemu_numa_opts = {
+    .name = "numa",
+    .implied_opt_name = "type",
+    .head = QTAILQ_HEAD_INITIALIZER(qemu_numa_opts.head),
+    .desc = {
+        {
+            .name = "type",
+            .type = QEMU_OPT_STRING,
+            .help = "node type"
+        },{
+            .name = "nodeid",
+            .type = QEMU_OPT_NUMBER,
+            .help = "node ID"
+        },{
+            .name = "mem",
+            .type = QEMU_OPT_SIZE,
+            .help = "memory size"
+        },{
+            .name = "cpus",
+            .type = QEMU_OPT_STRING,
+            .help = "cpu number or range"
+        },
+        { /* end of list */ }
+    },
+};
+
 const char *qemu_get_vm_name(void)
 {
     return qemu_name;
@@ -1349,56 +1375,37 @@ error:
     exit(1);
 }
 
-static void numa_add(const char *optarg)
+
+static int numa_add_cpus(const char *name, const char *value, void *opaque)
 {
-    char option[128];
-    char *endptr;
-    unsigned long long nodenr;
+    int *nodenr = opaque;
 
-    optarg = get_opt_name(option, 128, optarg, ',');
-    if (*optarg == ',') {
-        optarg++;
+    if (!strcmp(name, "cpu")) {
+        numa_node_parse_cpus(*nodenr, value);
     }
-    if (!strcmp(option, "node")) {
-
-        if (nb_numa_nodes >= MAX_NODES) {
-            fprintf(stderr, "qemu: too many NUMA nodes\n");
-            exit(1);
-        }
+    return 0;
+}
 
-        if (get_param_value(option, 128, "nodeid", optarg) == 0) {
-            nodenr = nb_numa_nodes;
-        } else {
-            if (parse_uint_full(option, &nodenr, 10) < 0) {
-                fprintf(stderr, "qemu: Invalid NUMA nodeid: %s\n", option);
-                exit(1);
-            }
-        }
+static int numa_init_func(QemuOpts *opts, void *opaque)
+{
+    uint64_t nodenr, mem_size;
 
-        if (nodenr >= MAX_NODES) {
-            fprintf(stderr, "qemu: invalid NUMA nodeid: %llu\n", nodenr);
-            exit(1);
-        }
+    nodenr = qemu_opt_get_number(opts, "nodeid", nb_numa_nodes++);
 
-        if (get_param_value(option, 128, "mem", optarg) == 0) {
-            node_mem[nodenr] = 0;
-        } else {
-            int64_t sval;
-            sval = strtosz(option, &endptr);
-            if (sval < 0 || *endptr) {
-                fprintf(stderr, "qemu: invalid numa mem size: %s\n", optarg);
-                exit(1);
-            }
-            node_mem[nodenr] = sval;
-        }
-        if (get_param_value(option, 128, "cpus", optarg) != 0) {
-            numa_node_parse_cpus(nodenr, option);
-        }
-        nb_numa_nodes++;
-    } else {
-        fprintf(stderr, "Invalid -numa option: %s\n", option);
+    if (nodenr >= MAX_NODES) {
+        fprintf(stderr, "qemu: Max number of NUMA nodes reached : %d\n",
+                (int)nodenr);
         exit(1);
     }
+
+    mem_size = qemu_opt_get_size(opts, "mem", 0);
+    node_mem[nodenr] = mem_size;
+
+    if (qemu_opt_foreach(opts, numa_add_cpus, &nodenr, 1) < 0) {
+        return -1;
+    }
+
+    return 0;
 }
 
 static QemuOptsList qemu_smp_opts = {
@@ -2933,6 +2940,7 @@ int main(int argc, char **argv, char **envp)
     qemu_add_opts(&qemu_object_opts);
     qemu_add_opts(&qemu_tpmdev_opts);
     qemu_add_opts(&qemu_realtime_opts);
+    qemu_add_opts(&qemu_numa_opts);
 
     runstate_init();
 
@@ -3119,7 +3127,16 @@ int main(int argc, char **argv, char **envp)
                 }
                 break;
             case QEMU_OPTION_numa:
-                numa_add(optarg);
+                olist = qemu_find_opts("numa");
+                opts = qemu_opts_parse(olist, optarg, 1);
+                if (!opts) {
+                    exit(1);
+                }
+                optarg = qemu_opt_get(opts, "type");
+                if (!optarg || strcmp(optarg, "node")) {
+

[Qemu-devel] [PATCH V4 00/10] Add support for binding guest numa nodes to host numa nodes

2013-07-04 Thread Wanlong Gao

As you know, QEMU can't direct its memory allocation now; this may cause
a performance regression from cross-node memory access in the guest.
Worse, if PCI passthrough is used, the directly attached device uses
DMA transfer between the device and the qemu process, so
all pages of the guest will be pinned by get_user_pages().

KVM_ASSIGN_PCI_DEVICE ioctl
  kvm_vm_ioctl_assign_device()
    => kvm_assign_device()
      => kvm_iommu_map_memslots()
        => kvm_iommu_map_pages()
           => kvm_pin_pages()

So, with a directly attached device, the page count of every guest page
is raised by one and page migration will not work. AutoNUMA won't work
either.

So, we should set the guest nodes memory allocation policy before
the pages are really mapped.

With this patch set, we are able to set the guest nodes' memory policy
like the following:

 -numa node,nodeid=0,mem=1024,cpus=0,mem-policy=membind,mem-hostnode=0-1
 -numa node,nodeid=1,mem=1024,cpus=1,mem-policy=interleave,mem-hostnode=1

This supports a format like
mem-policy={membind|interleave|preferred},mem-hostnode=[+|!]{all|N-N}.

And patch 8/10 adds a QMP command "set-mpol" to set the memory policy for
every guest node:
set-mpol nodeid=0 mem-policy=membind mem-hostnode=0-1

And patch 9/10 adds a monitor command "set-mpol" whose format is like:
set-mpol 0 mem-policy=membind,mem-hostnode=0-1

And with patch 10/10, we can get the current memory policy of each guest node
using the monitor command "info numa", for example:

(qemu) info numa
2 nodes
node 0 cpus: 0
node 0 size: 1024 MB
node 0 mempolicy: membind=0,1
node 1 cpus: 1
node 1 size: 1024 MB
node 1 mempolicy: interleave=1


V1->V2:
    change to use QemuOpts in numa options (Paolo)
    handle Error in mpol parser (Paolo)
    change qmp command format to mem-policy=membind,mem-hostnode=0-1 like (Paolo)
V2->V3:
    also handle Error in cpus parser (5/10)
    split out common parser from cpus and hostnode parser (Bandan 6/10)
V3->V4:
    rebase to request for comments


Bandan Das (1):
  NUMA: Support multiple CPU ranges on -numa option

Wanlong Gao (9):
  NUMA: Add numa_info structure to contain numa nodes info
  NUMA: Add Linux libnuma detection
  NUMA: parse guest numa nodes memory policy
  NUMA: handle Error in cpus, mpol and hostnode parser
  NUMA: split out the common range parser
  NUMA: set guest numa nodes memory policy
  NUMA: add qmp command set-mpol to set memory policy for NUMA node
  NUMA: add hmp command set-mpol
  NUMA: show host memory policy info in info numa command

 configure   |  32 ++
 cpus.c  | 143 +++-
 hmp-commands.hx |  16 +++
 hmp.c   |  35 ++
 hmp.h   |   1 +
 hw/i386/pc.c|   4 +-
 hw/net/eepro100.c   |   1 -
 include/sysemu/sysemu.h |  20 +++-
 monitor.c   |  44 +++-
 qapi-schema.json|  15 +++
 qemu-options.hx |   3 +-
 qmp-commands.hx |  35 ++
 vl.c| 285 +++-
 13 files changed, 553 insertions(+), 81 deletions(-)

-- 
1.8.3.1.448.gfb7dfaa

[Qemu-devel] [PATCH V4 10/10] NUMA: show host memory policy info in info numa command

2013-07-04 Thread Wanlong Gao

Show the host memory policy of nodes in the "info numa" monitor command.
After this patch, the monitor command "info numa" will show
information like the following if host NUMA support is enabled:

(qemu) info numa
2 nodes
node 0 cpus: 0
node 0 size: 1024 MB
node 0 mempolicy: membind=0,1
node 1 cpus: 1
node 1 size: 1024 MB
node 1 mempolicy: interleave=1
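
The "membind=0,1" part of the output above is a host node bitmap printed as a
comma-separated list. A minimal sketch of that formatting step, assuming a
plain 64-bit mask instead of QEMU's bitmap type (format_node_list is a
hypothetical helper, not part of this patch):

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Write the set bits of mask into buf as "a,b,c".
 * Returns the number of characters written (excluding the NUL),
 * or -1 if buf is too small.
 */
static int format_node_list(uint64_t mask, char *buf, size_t len)
{
    size_t used = 0;
    int bit;

    if (len == 0) {
        return -1;
    }
    buf[0] = '\0';

    for (bit = 0; bit < 64; bit++) {
        int n;

        if (!(mask & ((uint64_t)1 << bit))) {
            continue;
        }
        /* Prefix every entry except the first with a comma. */
        n = snprintf(buf + used, len - used, used ? ",%d" : "%d", bit);
        if (n < 0 || (size_t)n >= len - used) {
            return -1;
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

With mask 0x3 this fills buf with "0,1"; an empty mask yields an empty string.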

Signed-off-by: Wanlong Gao gaowanl...@cn.fujitsu.com
---
 monitor.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/monitor.c b/monitor.c
index 93ac045..a40415d 100644
--- a/monitor.c
+++ b/monitor.c
@@ -74,6 +74,11 @@
 #endif
 #include "hw/lm32/lm32_pic.h"
 
+#ifdef CONFIG_NUMA
+#include <numa.h>
+#include <numaif.h>
+#endif
+
 //#define DEBUG
 //#define DEBUG_COMPLETION
 
@@ -1808,6 +1813,7 @@ static void do_info_numa(Monitor *mon, const QDict *qdict)
     int i;
     CPUArchState *env;
     CPUState *cpu;
+    unsigned long first, next;
 
     monitor_printf(mon, "%d nodes\n", nb_numa_nodes);
     for (i = 0; i < nb_numa_nodes; i++) {
@@ -1821,6 +1827,42 @@ static void do_info_numa(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, "\n");
         monitor_printf(mon, "node %d size: %" PRId64 " MB\n", i,
             numa_info[i].node_mem >> 20);
+
+#ifdef CONFIG_NUMA
+        monitor_printf(mon, "node %d mempolicy: ", i);
+        switch (numa_info[i].flags & NODE_HOST_POLICY_MASK) {
+        case NODE_HOST_BIND:
+            monitor_printf(mon, "membind=");
+            break;
+        case NODE_HOST_INTERLEAVE:
+            monitor_printf(mon, "interleave=");
+            break;
+        case NODE_HOST_PREFERRED:
+            monitor_printf(mon, "preferred=");
+            break;
+        default:
+            monitor_printf(mon, "default\n");
+            continue;
+        }
+
+        if (numa_info[i].flags & NODE_HOST_RELATIVE)
+            monitor_printf(mon, "+");
+
+        next = first = find_first_bit(numa_info[i].host_mem, MAX_CPUMASK_BITS);
+        monitor_printf(mon, "%lu", first);
+        do {
+            if (next == numa_max_node())
+                break;
+            next = find_next_bit(numa_info[i].host_mem, MAX_CPUMASK_BITS,
+                                 next + 1);
+            if (next > numa_max_node() || next == MAX_CPUMASK_BITS)
+                break;
+
+            monitor_printf(mon, ",%lu", next);
+        } while (true);
+
+        monitor_printf(mon, "\n");
+#endif
     }
 }
 
-- 
1.8.3.2.634.g7a3187e

[Qemu-devel] [PATCH V4 03/10] NUMA: Add Linux libnuma detection

2013-07-04 Thread Wanlong Gao

Add detection of libnuma (mostly contained in the numactl package)
to the configure script. It can be enabled or disabled on the command line;
the default is to use it if available.

Signed-off-by: Andre Przywara andre.przyw...@amd.com
Signed-off-by: Wanlong Gao gaowanl...@cn.fujitsu.com
---
 configure | 32 
 1 file changed, 32 insertions(+)

diff --git a/configure b/configure
index 0e0adde..9d3b4ce 100755
--- a/configure
+++ b/configure
@@ -242,6 +242,7 @@ gtk=
 gtkabi=2.0
 tpm=no
 libssh2=
+numa=
 
 # parse CC options first
 for opt do
@@ -944,6 +945,10 @@ for opt do
   ;;
   --enable-libssh2) libssh2=yes
   ;;
+  --disable-numa) numa=no
+  ;;
+  --enable-numa) numa=yes
+  ;;
   *) echo ERROR: unknown option $opt; show_help=yes
   ;;
   esac
@@ -1158,6 +1163,8 @@ echo   --gcov=GCOV  use specified gcov 
[$gcov_tool]
 echo   --enable-tpm enable TPM support
 echo   --disable-libssh2disable ssh block device support
 echo   --enable-libssh2 enable ssh block device support
+echo   --disable-numa   disable libnuma support
+echo   --enable-numaenable libnuma support
 echo 
 echo NOTE: The object files are built at the place where configure is 
launched
 exit 1
@@ -2389,6 +2396,27 @@ EOF
 fi
 
 ##
+# libnuma probe
+
+if test "$numa" != "no" ; then
+  numa=no
+  cat > $TMPC << EOF
+#include <numa.h>
+int main(void) { return numa_available(); }
+EOF
+
+  if compile_prog "" "-lnuma" ; then
+    numa=yes
+    libs_softmmu="-lnuma $libs_softmmu"
+  else
+    if test "$numa" = "yes" ; then
+      feature_not_found "linux NUMA (install numactl?)"
+    fi
+    numa=no
+  fi
+fi
+
+##
 # linux-aio probe
 
 if test $linux_aio != no ; then
@@ -3557,6 +3585,7 @@ echo TPM support   $tpm
 echo libssh2 support   $libssh2
 echo TPM passthrough   $tpm_passthrough
 echo QOM debugging $qom_cast_debug
+echo NUMA host support $numa
 
 if test $sdl_too_old = yes; then
 echo - Your SDL version is too old - please upgrade to have SDL support
@@ -3590,6 +3619,9 @@ echo "extra_cflags=$EXTRA_CFLAGS" >> $config_host_mak
 echo "extra_ldflags=$EXTRA_LDFLAGS" >> $config_host_mak
 echo "qemu_localedir=$qemu_localedir" >> $config_host_mak
 echo "libs_softmmu=$libs_softmmu" >> $config_host_mak
+if test "$numa" = "yes"; then
+  echo "CONFIG_NUMA=y" >> $config_host_mak
+fi
 
 echo "ARCH=$ARCH" >> $config_host_mak
 
-- 
1.8.3.2.634.g7a3187e

[Qemu-devel] [PATCH V4 09/10] NUMA: add hmp command set-mpol

2013-07-04 Thread Wanlong Gao

Add the hmp command set-mpol to set the host memory policy for a guest
NUMA node. A node's memory policy can then also be set from the monitor
with a command like:
(qemu) set-mpol 0 mem-policy=membind,mem-hostnode=0-1

Signed-off-by: Wanlong Gao gaowanl...@cn.fujitsu.com
---
 hmp-commands.hx | 16 
 hmp.c   | 35 +++
 hmp.h   |  1 +
 3 files changed, 52 insertions(+)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 915b0d1..417b69f 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1567,6 +1567,22 @@ Executes a qemu-io command on the given block device.
 ETEXI
 
     {
+        .name       = "set-mpol",
+        .args_type  = "nodeid:i,args:s?",
+        .params     = "nodeid [args]",
+        .help       = "set host memory policy for a guest NUMA node",
+        .mhandler.cmd = hmp_set_mpol,
+    },
+
+STEXI
+@item set-mpol @var{nodeid} @var{args}
+@findex set-mpol
+
+Set host memory policy for a guest NUMA node
+
+ETEXI
+
+    {
         .name       = "info",
         .args_type  = "item:s?",
         .params     = "[subcommand]",
diff --git a/hmp.c b/hmp.c
index 2daed43..57a5730 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1482,3 +1482,38 @@ void hmp_qemu_io(Monitor *mon, const QDict *qdict)
 
     hmp_handle_error(mon, &err);
 }
+
+void hmp_set_mpol(Monitor *mon, const QDict *qdict)
+{
+    Error *local_err = NULL;
+    bool has_mpol = true;
+    bool has_hostnode = true;
+    const char *mpol = NULL;
+    const char *hostnode = NULL;
+    QemuOpts *opts;
+
+    uint64_t nodeid = qdict_get_int(qdict, "nodeid");
+    const char *args = qdict_get_try_str(qdict, "args");
+
+    if (args == NULL) {
+        has_mpol = false;
+        has_hostnode = false;
+    } else {
+        opts = qemu_opts_parse(qemu_find_opts("numa"), args, 1);
+        if (opts == NULL) {
+            error_setg(&local_err, "Parsing memory policy args failed");
+        } else {
+            mpol = qemu_opt_get(opts, "mem-policy");
+            if (mpol == NULL) {
+                has_mpol = false;
+            }
+            hostnode = qemu_opt_get(opts, "mem-hostnode");
+            if (hostnode == NULL) {
+                has_hostnode = false;
+            }
+        }
+    }
+
+    qmp_set_mpol(nodeid, has_mpol, mpol, has_hostnode, hostnode, &local_err);
+    hmp_handle_error(mon, &local_err);
+}
diff --git a/hmp.h b/hmp.h
index 56d2e92..81f631b 100644
--- a/hmp.h
+++ b/hmp.h
@@ -86,5 +86,6 @@ void hmp_nbd_server_stop(Monitor *mon, const QDict *qdict);
 void hmp_chardev_add(Monitor *mon, const QDict *qdict);
 void hmp_chardev_remove(Monitor *mon, const QDict *qdict);
 void hmp_qemu_io(Monitor *mon, const QDict *qdict);
+void hmp_set_mpol(Monitor *mon, const QDict *qdict);
 
 #endif
-- 
1.8.3.2.634.g7a3187e

[Qemu-devel] [PATCH V4 04/10] NUMA: parse guest numa nodes memory policy

2013-07-04 Thread Wanlong Gao

The memory policy setting format is:
mem-policy={membind|interleave|preferred},mem-hostnode=[+|!]{all|N-N}
We add this setting as a suboption of -numa, so the memory policy
can be set as follows:
 -numa node,nodeid=0,mem=1024,cpus=0,mem-policy=membind,mem-hostnode=0-1
 -numa node,nodeid=1,mem=1024,cpus=1,mem-policy=interleave,mem-hostnode=!1
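
To illustrate the mem-hostnode=[+|!]{all|N-N} grammar above, here is a
minimal standalone sketch of a parser for it. It is not the code from this
patch: it assumes at most 64 host nodes held in a uint64_t, treats '+' and
'!' as mutually exclusive prefixes, and treats "!all" as the empty set;
hostnode_spec and parse_hostnode are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_NODES 64   /* hypothetical limit for this sketch */

struct hostnode_spec {
    uint64_t mask;      /* bit n set => host node n is selected */
    bool relative;      /* '+' prefix was given */
};

/* Parse "[+|!]{all|N|N-M}". Returns 0 on success, -1 on malformed input. */
static int parse_hostnode(const char *s, struct hostnode_spec *out)
{
    bool invert = false;
    char *end;
    unsigned long lo, hi;
    uint64_t range;

    out->mask = 0;
    out->relative = false;

    if (*s == '!') {
        invert = true;          /* select every node except the range */
        s++;
    } else if (*s == '+') {
        out->relative = true;   /* node numbers are relative */
        s++;
    }

    if (strcmp(s, "all") == 0) {
        out->mask = invert ? 0 : UINT64_MAX;
        return 0;
    }

    if (*s < '0' || *s > '9') {
        return -1;
    }
    lo = strtoul(s, &end, 10);
    if (*end == '-') {
        const char *p = end + 1;

        if (*p < '0' || *p > '9') {
            return -1;
        }
        hi = strtoul(p, &end, 10);
    } else {
        hi = lo;
    }
    if (*end != '\0' || hi < lo || hi >= SKETCH_MAX_NODES) {
        return -1;
    }

    /* Bits lo..hi set; the full-width case avoids an undefined shift by 64. */
    range = (hi - lo == 63)
        ? UINT64_MAX
        : ((((uint64_t)1 << (hi - lo + 1)) - 1) << lo);
    out->mask = invert ? ~range : range;
    return 0;
}
```

Under these assumptions "0-1" yields mask 0x3, and "!1" yields a mask with
every bit set except bit 1.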

Reviewed-by: Bandan Das b...@redhat.com
Signed-off-by: Andre Przywara andre.przyw...@amd.com
Signed-off-by: Wanlong Gao gaowanl...@cn.fujitsu.com
---
 include/sysemu/sysemu.h |   8 
 vl.c| 110 
 2 files changed, 118 insertions(+)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 70fd2ed..993b8e0 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -130,10 +130,18 @@ extern QEMUClock *rtc_clock;
 
 #define MAX_NODES 64
 #define MAX_CPUMASK_BITS 255
+#define NODE_HOST_NONE        0x00
+#define NODE_HOST_BIND        0x01
+#define NODE_HOST_INTERLEAVE  0x02
+#define NODE_HOST_PREFERRED   0x03
+#define NODE_HOST_POLICY_MASK 0x03
+#define NODE_HOST_RELATIVE    0x04
 extern int nb_numa_nodes;
 struct node_info {
     uint64_t node_mem;
     DECLARE_BITMAP(node_cpu, MAX_CPUMASK_BITS);
+    DECLARE_BITMAP(host_mem, MAX_CPUMASK_BITS);
+    unsigned int flags;
 };
 extern struct node_info numa_info[MAX_NODES];
 
diff --git a/vl.c b/vl.c
index 5207b8e..495b3a8 100644
--- a/vl.c
+++ b/vl.c
@@ -536,6 +536,14 @@ static QemuOptsList qemu_numa_opts = {
             .name = "cpus",
             .type = QEMU_OPT_STRING,
             .help = "cpu number or range"
+        },{
+            .name = "mem-policy",
+            .type = QEMU_OPT_STRING,
+            .help = "memory policy"
+        },{
+            .name = "mem-hostnode",
+            .type = QEMU_OPT_STRING,
+            .help = "host node number or range for memory policy"
         },
         { /* end of list */ }
     },
@@ -1374,6 +1382,79 @@ error:
 exit(1);
 }
 
+static void numa_node_parse_mpol(int nodenr, const char *mpol)
+{
+    if (!mpol) {
+        return;
+    }
+
+    if (!strcmp(mpol, "interleave")) {
+        numa_info[nodenr].flags |= NODE_HOST_INTERLEAVE;
+    } else if (!strcmp(mpol, "preferred")) {
+        numa_info[nodenr].flags |= NODE_HOST_PREFERRED;
+    } else if (!strcmp(mpol, "membind")) {
+        numa_info[nodenr].flags |= NODE_HOST_BIND;
+    } else {
+        fprintf(stderr, "qemu: Invalid memory policy: %s\n", mpol);
+    }
+}
+
+static void numa_node_parse_hostnode(int nodenr, const char *hostnode)
+{
+    unsigned long long value, endvalue;
+    char *endptr;
+    bool clear = false;
+    unsigned long *bm = numa_info[nodenr].host_mem;
+
+    if (hostnode[0] == '!') {
+        clear = true;
+        bitmap_fill(bm, MAX_CPUMASK_BITS);
+        hostnode++;
+    }
+    if (hostnode[0] == '+') {
+        numa_info[nodenr].flags |= NODE_HOST_RELATIVE;
+        hostnode++;
+    }
+
+    if (!strcmp(hostnode, "all")) {
+        bitmap_fill(bm, MAX_CPUMASK_BITS);
+        return;
+    }
+
+    if (parse_uint(hostnode, &value, &endptr, 10) < 0)
+        goto error;
+    if (*endptr == '-') {
+        if (parse_uint_full(endptr + 1, &endvalue, 10) < 0) {
+            goto error;
+        }
+    } else if (*endptr == '\0') {
+        endvalue = value;
+    } else {
+        goto error;
+    }
+
+    if (endvalue >= MAX_CPUMASK_BITS) {
+        endvalue = MAX_CPUMASK_BITS - 1;
+        fprintf(stderr,
+            "qemu: NUMA: A max of %d host nodes are supported\n",
+             MAX_CPUMASK_BITS);
+    }
+
+    if (endvalue < value) {
+        goto error;
+    }
+
+    if (clear)
+        bitmap_clear(bm, value, endvalue - value + 1);
+    else
+        bitmap_set(bm, value, endvalue - value + 1);
+
+    return;
+
+error:
+    fprintf(stderr, "qemu: Invalid host NUMA nodes range: %s\n", hostnode);
+    return;
+}
 
 static int numa_add_cpus(const char *name, const char *value, void *opaque)
 {
@@ -1385,6 +1466,25 @@ static int numa_add_cpus(const char *name, const char *value, void *opaque)
     return 0;
 }
 
+static int numa_add_mpol(const char *name, const char *value, void *opaque)
+{
+    int *nodenr = opaque;
+
+    if (!strcmp(name, "mem-policy")) {
+        numa_node_parse_mpol(*nodenr, value);
+    }
+    return 0;
+}
+
+static int numa_add_hostnode(const char *name, const char *value, void *opaque)
+{
+    int *nodenr = opaque;
+    if (!strcmp(name, "mem-hostnode")) {
+        numa_node_parse_hostnode(*nodenr, value);
+    }
+    return 0;
+}
+
 static int numa_init_func(QemuOpts *opts, void *opaque)
 {
     uint64_t nodenr, mem_size;
@@ -1404,6 +1504,14 @@ static int numa_init_func(QemuOpts *opts, void *opaque)
         return -1;
     }
 
+    if (qemu_opt_foreach(opts, numa_add_mpol, &nodenr, 1) < 0) {
+        return -1;
+    }
+
+    if (qemu_opt_foreach(opts, numa_add_hostnode, &nodenr, 1) < 0) {
+        return -1;
+    }
+
     return 0;
 }
 
@@ -2962,6 +3070,8 @@ int main(int argc, char

[Qemu-devel] [PATCH V4 02/10] NUMA: Add numa_info structure to contain numa nodes info

2013-07-04 Thread Wanlong Gao

Add the numa_info structure to hold each NUMA node's memory size and
VCPU information, as well as the host memory policies that later
patches will add.

Signed-off-by: Andre Przywara andre.przyw...@amd.com
Signed-off-by: Wanlong Gao gaowanl...@cn.fujitsu.com
---
 cpus.c  |  2 +-
 hw/i386/pc.c|  4 ++--
 hw/net/eepro100.c   |  1 -
 include/sysemu/sysemu.h |  8 ++--
 monitor.c   |  2 +-
 vl.c| 24 
 6 files changed, 22 insertions(+), 19 deletions(-)

diff --git a/cpus.c b/cpus.c
index 20958e5..496d5ce 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1180,7 +1180,7 @@ void set_numa_modes(void)
     for (env = first_cpu; env != NULL; env = env->next_cpu) {
         cpu = ENV_GET_CPU(env);
         for (i = 0; i < nb_numa_nodes; i++) {
-            if (test_bit(cpu->cpu_index, node_cpumask[i])) {
+            if (test_bit(cpu->cpu_index, numa_info[i].node_cpu)) {
                 cpu->numa_node = i;
             }
         }
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 78f92e2..78b5a72 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -650,14 +650,14 @@ static FWCfgState *bochs_bios_init(void)
         unsigned int apic_id = x86_cpu_apic_id_from_index(i);
         assert(apic_id < apic_id_limit);
         for (j = 0; j < nb_numa_nodes; j++) {
-            if (test_bit(i, node_cpumask[j])) {
+            if (test_bit(i, numa_info[j].node_cpu)) {
                 numa_fw_cfg[apic_id + 1] = cpu_to_le64(j);
                 break;
             }
         }
     }
     for (i = 0; i < nb_numa_nodes; i++) {
-        numa_fw_cfg[apic_id_limit + 1 + i] = cpu_to_le64(node_mem[i]);
+        numa_fw_cfg[apic_id_limit + 1 + i] = cpu_to_le64(numa_info[i].node_mem);
     }
     fw_cfg_add_bytes(fw_cfg, FW_CFG_NUMA, numa_fw_cfg,
                      (1 + apic_id_limit + nb_numa_nodes) *
diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c
index dc99ea6..478c688 100644
--- a/hw/net/eepro100.c
+++ b/hw/net/eepro100.c
@@ -105,7 +105,6 @@
 #define PCI_IO_SIZE 64
 #define PCI_FLASH_SIZE  (128 * KiB)
 
-#define BIT(n) (1 << (n))
 #define BITS(n, m) (((0xffffffffU << (31 - n)) >> (31 - n + m)) << m)
 
 /* The SCB accepts the following controls for the Tx and Rx units: */
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 2fb71af..70fd2ed 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -9,6 +9,7 @@
 #include "qapi-types.h"
 #include "qemu/notify.h"
 #include "qemu/main-loop.h"
+#include "qemu/bitmap.h"
 
 /* vl.c */
 
@@ -130,8 +131,11 @@ extern QEMUClock *rtc_clock;
 #define MAX_NODES 64
 #define MAX_CPUMASK_BITS 255
 extern int nb_numa_nodes;
-extern uint64_t node_mem[MAX_NODES];
-extern unsigned long *node_cpumask[MAX_NODES];
+struct node_info {
+    uint64_t node_mem;
+    DECLARE_BITMAP(node_cpu, MAX_CPUMASK_BITS);
+};
+extern struct node_info numa_info[MAX_NODES];
 
 #define MAX_OPTION_ROMS 16
 typedef struct QEMUOptionRom {
diff --git a/monitor.c b/monitor.c
index 9be515c..93ac045 100644
--- a/monitor.c
+++ b/monitor.c
@@ -1820,7 +1820,7 @@ static void do_info_numa(Monitor *mon, const QDict *qdict)
         }
         monitor_printf(mon, "\n");
         monitor_printf(mon, "node %d size: %" PRId64 " MB\n", i,
-            node_mem[i] >> 20);
+            numa_info[i].node_mem >> 20);
     }
 }
 
diff --git a/vl.c b/vl.c
index 6f2e17a..5207b8e 100644
--- a/vl.c
+++ b/vl.c
@@ -250,8 +250,7 @@ static QTAILQ_HEAD(, FWBootEntry) fw_boot_order =
 QTAILQ_HEAD_INITIALIZER(fw_boot_order);
 
 int nb_numa_nodes;
-uint64_t node_mem[MAX_NODES];
-unsigned long *node_cpumask[MAX_NODES];
+struct node_info numa_info[MAX_NODES];
 
 uint8_t qemu_uuid[16];
 
@@ -1367,7 +1366,7 @@ static void numa_node_parse_cpus(int nodenr, const char *cpus)
         goto error;
     }
 
-    bitmap_set(node_cpumask[nodenr], value, endvalue-value+1);
+    bitmap_set(numa_info[nodenr].node_cpu, value, endvalue-value+1);
     return;
 
 error:
@@ -1399,7 +1398,7 @@ static int numa_init_func(QemuOpts *opts, void *opaque)
     }
 
     mem_size = qemu_opt_get_size(opts, "mem", 0);
-    node_mem[nodenr] = mem_size;
+    numa_info[nodenr].node_mem = mem_size;
 
     if (qemu_opt_foreach(opts, numa_add_cpus, &nodenr, 1) < 0) {
         return -1;
@@ -2961,8 +2960,8 @@ int main(int argc, char **argv, char **envp)
     translation = BIOS_ATA_TRANSLATION_AUTO;
 
     for (i = 0; i < MAX_NODES; i++) {
-        node_mem[i] = 0;
-        node_cpumask[i] = bitmap_new(MAX_CPUMASK_BITS);
+        numa_info[i].node_mem = 0;
+        bitmap_zero(numa_info[i].node_cpu, MAX_CPUMASK_BITS);
     }
 
     nb_numa_nodes = 0;
@@ -4228,7 +4227,7 @@ int main(int argc, char **argv, char **envp)
          * and distribute the available memory equally across all nodes
          */
         for (i = 0; i < nb_numa_nodes; i++) {
-            if (node_mem[i] != 0)
+            if (numa_info[i].node_mem != 0)
                 break;
         }
         if (i == nb_numa_nodes) {
@@ -4238,14 +4237,15 @@ int

[Qemu-devel] [PATCH V4 07/10] NUMA: set guest numa nodes memory policy

2013-07-04 Thread Wanlong Gao

Set the guest NUMA nodes' memory policies node by node using the mbind(2)
system call.
After this patch, we are able to set guest nodes' memory policies
through the QEMU options. This aims to solve the performance issue of
guest memory accesses that cross host nodes.
And as you all know, if PCI passthrough is used, the
direct-attached device uses DMA transfers between the device and the qemu
process. All pages of the guest will be pinned by get_user_pages().

KVM_ASSIGN_PCI_DEVICE ioctl
  kvm_vm_ioctl_assign_device()
    => kvm_assign_device()
      => kvm_iommu_map_memslots()
        => kvm_iommu_map_pages()
          => kvm_pin_pages()

So, with a direct-attached device, the page count of every guest page is
incremented by one and page migration will not work. AutoNUMA won't work either.

So, we should set the guest nodes memory allocation policies before
the pages are really mapped.
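
Binding "node by node" relies on each guest node occupying a contiguous
slice of the guest RAM block, so the address handed to mbind(2) is the block
base plus the sizes of all lower-numbered nodes. A minimal sketch of that
offset computation, assuming the node sizes are available as a plain array
(node_ram_offset is a hypothetical helper, not a function from this patch):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Byte offset of guest node `nodeid` inside one contiguous RAM block:
 * the sum of the sizes of all lower-numbered nodes.
 */
static uint64_t node_ram_offset(const uint64_t *node_mem, size_t nb_nodes,
                                size_t nodeid)
{
    uint64_t off = 0;
    size_t i;

    for (i = 0; i < nodeid && i < nb_nodes; i++) {
        off += node_mem[i];
    }
    return off;
}
```

For two 1024 MB nodes, node 0 starts at offset 0 and node 1 at 1 GiB; the
range to bind for node i is then [base + offset, base + offset + node_mem[i]).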

Signed-off-by: Andre Przywara andre.przyw...@amd.com
Signed-off-by: Wanlong Gao gaowanl...@cn.fujitsu.com
---
 cpus.c | 87 ++
 1 file changed, 87 insertions(+)

diff --git a/cpus.c b/cpus.c
index 496d5ce..7240de7 100644
--- a/cpus.c
+++ b/cpus.c
@@ -60,6 +60,15 @@
 
 #endif /* CONFIG_LINUX */
 
+#ifdef CONFIG_NUMA
+#include <numa.h>
+#include <numaif.h>
+#ifndef MPOL_F_RELATIVE_NODES
+#define MPOL_F_RELATIVE_NODES (1 << 14)
+#define MPOL_F_STATIC_NODES   (1 << 15)
+#endif
+#endif
+
 static CPUArchState *next_cpu;
 
 static bool cpu_thread_is_idle(CPUState *cpu)
@@ -1171,6 +1180,75 @@ static void tcg_exec_all(void)
 exit_request = 0;
 }
 
+#ifdef CONFIG_NUMA
+static int node_parse_bind_mode(unsigned int nodeid)
+{
+    int bind_mode;
+
+    switch (numa_info[nodeid].flags & NODE_HOST_POLICY_MASK) {
+    case NODE_HOST_BIND:
+        bind_mode = MPOL_BIND;
+        break;
+    case NODE_HOST_INTERLEAVE:
+        bind_mode = MPOL_INTERLEAVE;
+        break;
+    case NODE_HOST_PREFERRED:
+        bind_mode = MPOL_PREFERRED;
+        break;
+    default:
+        bind_mode = MPOL_DEFAULT;
+        return bind_mode;
+    }
+
+    bind_mode |= (numa_info[nodeid].flags & NODE_HOST_RELATIVE) ?
+        MPOL_F_RELATIVE_NODES : MPOL_F_STATIC_NODES;
+
+    return bind_mode;
+}
+#endif
+
+static int set_node_mpol(unsigned int nodeid)
+{
+#ifdef CONFIG_NUMA
+    void *ram_ptr;
+    RAMBlock *block;
+    ram_addr_t len, ram_offset = 0;
+    int bind_mode;
+    int i;
+
+    QTAILQ_FOREACH(block, &ram_list.blocks, next) {
+        if (!strcmp(block->mr->name, "pc.ram")) {
+            break;
+        }
+    }
+
+    if (block->host == NULL)
+        return -1;
+
+    ram_ptr = block->host;
+    for (i = 0; i < nodeid; i++) {
+        len = numa_info[i].node_mem;
+        ram_offset += len;
+    }
+
+    len = numa_info[i].node_mem;
+    bind_mode = node_parse_bind_mode(i);
+
+    /* This is a workaround for a long standing bug in Linux'
+     * mbind implementation, which cuts off the last specified
+     * node. To stay compatible should this bug be fixed, we
+     * specify one more node and zero this one out.
+     */
+    clear_bit(numa_num_configured_nodes() + 1, numa_info[i].host_mem);
+    if (mbind(ram_ptr + ram_offset, len, bind_mode,
+        numa_info[i].host_mem, numa_num_configured_nodes() + 1, 0)) {
+            perror("mbind");
+            return -1;
+    }
+#endif
+    return 0;
+}
+
 void set_numa_modes(void)
 {
 CPUArchState *env;
@@ -1185,6 +1263,15 @@ void set_numa_modes(void)
 }
 }
 }
+
+#ifdef CONFIG_NUMA
+    for (i = 0; i < nb_numa_nodes; i++) {
+        if (set_node_mpol(i) == -1) {
+            fprintf(stderr,
+                    "qemu: can't set host memory policy for node%d\n", i);
+        }
+    }
+#endif
 }
 
 void list_cpus(FILE *f, fprintf_function cpu_fprintf, const char *optarg)
-- 
1.8.3.2.634.g7a3187e

[Qemu-devel] [PATCH V4 05/10] NUMA: handle Error in cpus, mpol and hostnode parser

2013-07-04 Thread Wanlong Gao

As Paolo pointed out, handling Error in the mpol and hostnode parsers
makes them easier to reuse, for example for memory hotplug in the future.
This will also be used later by the set-mpol QMP command.
Also handle Error in cpus parser to be consistent with others.
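
The idea is that the parser only reports what went wrong, and each caller
decides how to react (print and fail for the command line, return an error
object for QMP). A minimal standalone sketch of that split, using a
hypothetical sketch_error struct as a stand-in for QEMU's Error object;
parse_policy and apply_policy_option are likewise illustrative names:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object, used only in this sketch. */
struct sketch_error {
    int set;
    char msg[96];
};

/* Policy codes follow the NODE_HOST_* values introduced in patch 04/10. */
enum {
    POL_NONE = 0x00,
    POL_BIND = 0x01,
    POL_INTERLEAVE = 0x02,
    POL_PREFERRED = 0x03
};

/* The parser never prints or exits; it only fills in *err on failure. */
static int parse_policy(const char *s, struct sketch_error *err)
{
    err->set = 0;
    err->msg[0] = '\0';

    if (s == NULL) {
        err->set = 1;
        snprintf(err->msg, sizeof(err->msg), "Should specify memory policy");
        return POL_NONE;
    }
    if (strcmp(s, "membind") == 0) {
        return POL_BIND;
    }
    if (strcmp(s, "interleave") == 0) {
        return POL_INTERLEAVE;
    }
    if (strcmp(s, "preferred") == 0) {
        return POL_PREFERRED;
    }
    err->set = 1;
    snprintf(err->msg, sizeof(err->msg), "Invalid memory policy: %s", s);
    return POL_NONE;
}

/* Command-line style caller: report the error on stderr and fail. */
static int apply_policy_option(const char *value, unsigned *flags)
{
    struct sketch_error err;
    int pol = parse_policy(value, &err);

    if (err.set) {
        fprintf(stderr, "qemu: %s\n", err.msg);
        return -1;
    }
    *flags |= (unsigned)pol;
    return 0;
}
```

A monitor-side caller could use the same parse_policy() and forward err.msg
to the client instead of printing it.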

Signed-off-by: Wanlong Gao gaowanl...@cn.fujitsu.com
---
 include/sysemu/sysemu.h |  4 
 vl.c| 42 --
 2 files changed, 36 insertions(+), 10 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 993b8e0..0f135fe 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -144,6 +144,10 @@ struct node_info {
 unsigned int flags;
 };
 extern struct node_info numa_info[MAX_NODES];
+extern void numa_node_parse_mpol(int nodenr, const char *hostnode,
+ Error **errp);
+extern void numa_node_parse_hostnode(int nodenr, const char *hostnode,
+ Error **errp);
 
 #define MAX_OPTION_ROMS 16
 typedef struct QEMUOptionRom {
diff --git a/vl.c b/vl.c
index 495b3a8..38e0d3d 100644
--- a/vl.c
+++ b/vl.c
@@ -1338,7 +1338,7 @@ char *get_boot_devices_list(size_t *size)
 return list;
 }
 
-static void numa_node_parse_cpus(int nodenr, const char *cpus)
+static void numa_node_parse_cpus(int nodenr, const char *cpus, Error **errp)
 {
     char *endptr;
     unsigned long long value, endvalue;
@@ -1378,13 +1378,14 @@ static void numa_node_parse_cpus(int nodenr, const char *cpus)
     return;
 
 error:
-    fprintf(stderr, "qemu: Invalid NUMA CPU range: %s\n", cpus);
-    exit(1);
+    error_setg(errp, "Invalid NUMA CPU range: %s\n", cpus);
+    return;
 }
 
-static void numa_node_parse_mpol(int nodenr, const char *mpol)
+void numa_node_parse_mpol(int nodenr, const char *mpol, Error **errp)
 {
     if (!mpol) {
+        error_setg(errp, "Should specify memory policy");
         return;
     }
 
@@ -1395,11 +1396,11 @@ static void numa_node_parse_mpol(int nodenr, const char *mpol)
     } else if (!strcmp(mpol, "membind")) {
         numa_info[nodenr].flags |= NODE_HOST_BIND;
     } else {
-        fprintf(stderr, "qemu: Invalid memory policy: %s\n", mpol);
+        error_setg(errp, "Invalid memory policy: %s", mpol);
     }
 }
 
-static void numa_node_parse_hostnode(int nodenr, const char *hostnode)
+void numa_node_parse_hostnode(int nodenr, const char *hostnode, Error **errp)
 {
     unsigned long long value, endvalue;
     char *endptr;
@@ -1452,16 +1453,22 @@ static void numa_node_parse_hostnode(int nodenr, const char *hostnode)
     return;
 
 error:
-    fprintf(stderr, "qemu: Invalid host NUMA nodes range: %s\n", hostnode);
+    error_setg(errp, "Invalid host NUMA nodes range: %s", hostnode);
     return;
 }
 
 static int numa_add_cpus(const char *name, const char *value, void *opaque)
 {
     int *nodenr = opaque;
+    Error *err = NULL;
 
     if (!strcmp(name, "cpu")) {
-        numa_node_parse_cpus(*nodenr, value);
+        numa_node_parse_cpus(*nodenr, value, &err);
+    }
+    if (error_is_set(&err)) {
+        fprintf(stderr, "qemu: %s\n", error_get_pretty(err));
+        error_free(err);
+        return -1;
     }
     return 0;
 }
@@ -1469,19 +1476,34 @@ static int numa_add_cpus(const char *name, const char *value, void *opaque)
 static int numa_add_mpol(const char *name, const char *value, void *opaque)
 {
     int *nodenr = opaque;
+    Error *err = NULL;
 
     if (!strcmp(name, "mem-policy")) {
-        numa_node_parse_mpol(*nodenr, value);
+        numa_node_parse_mpol(*nodenr, value, &err);
+    }
+    if (error_is_set(&err)) {
+        fprintf(stderr, "qemu: %s\n", error_get_pretty(err));
+        error_free(err);
+        return -1;
     }
+
     return 0;
 }
 
 static int numa_add_hostnode(const char *name, const char *value, void *opaque)
 {
     int *nodenr = opaque;
+    Error *err = NULL;
+
     if (!strcmp(name, "mem-hostnode")) {
-        numa_node_parse_hostnode(*nodenr, value);
+        numa_node_parse_hostnode(*nodenr, value, &err);
     }
+    if (error_is_set(&err)) {
+        fprintf(stderr, "qemu: %s\n", error_get_pretty(err));
+        error_free(err);
+        return -1;
+    }
+
     return 0;
 }
 
-- 
1.8.3.2.634.g7a3187e

[Qemu-devel] [PATCH V4 08/10] NUMA: add qmp command set-mpol to set memory policy for NUMA node

2013-07-04 Thread Wanlong Gao

This QMP command makes it possible to set a node's memory policy
through the QMP protocol. In qmp-shell, the command looks like:
set-mpol nodeid=0 mem-policy=membind mem-hostnode=0-1
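
The handler in this patch saves the node's current policy, resets it, applies
the new settings, and restores the saved state if any step fails. A minimal
standalone sketch of that save/apply/rollback pattern, assuming a 64-bit host
node mask and omitting the actual mbind step; node_policy and update_policy
are hypothetical names, not the functions added here:

```c
#include <stdint.h>
#include <string.h>

struct node_policy {
    unsigned flags;       /* one of the POLICY_* values below */
    uint64_t host_mask;   /* bit n set => host node n */
};

enum {
    POLICY_NONE = 0,
    POLICY_BIND = 1,
    POLICY_INTERLEAVE = 2,
    POLICY_PREFERRED = 3
};

/*
 * policy == NULL    : reset the node to the default policy.
 * host_mask == NULL : use all host nodes.
 * Returns 0 on success; on failure returns -1 and leaves *node unchanged.
 */
static int update_policy(struct node_policy *node, const char *policy,
                         const uint64_t *host_mask)
{
    struct node_policy saved = *node;   /* kept for rollback */

    node->flags = POLICY_NONE;
    node->host_mask = 0;

    if (policy == NULL) {
        return 0;
    }

    if (strcmp(policy, "membind") == 0) {
        node->flags = POLICY_BIND;
    } else if (strcmp(policy, "interleave") == 0) {
        node->flags = POLICY_INTERLEAVE;
    } else if (strcmp(policy, "preferred") == 0) {
        node->flags = POLICY_PREFERRED;
    } else {
        *node = saved;                  /* roll back on invalid input */
        return -1;
    }

    node->host_mask = host_mask ? *host_mask : UINT64_MAX;
    return 0;
}
```

An invalid policy string therefore leaves the previously configured policy
in place rather than a half-updated one.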

Signed-off-by: Wanlong Gao gaowanl...@cn.fujitsu.com
---
 cpus.c   | 54 ++
 qapi-schema.json | 15 +++
 qmp-commands.hx  | 35 +++
 3 files changed, 104 insertions(+)

diff --git a/cpus.c b/cpus.c
index 7240de7..ff42b9d 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1417,3 +1417,57 @@ void qmp_inject_nmi(Error **errp)
     error_set(errp, QERR_UNSUPPORTED);
 #endif
 }
+
+void qmp_set_mpol(int64_t nodeid, bool has_mpol, const char *mpol,
+                  bool has_hostnode, const char *hostnode, Error **errp)
+{
+    unsigned int flags;
+    DECLARE_BITMAP(host_mem, MAX_CPUMASK_BITS);
+
+    if (nodeid >= nb_numa_nodes) {
+        error_setg(errp, "Only has '%d' NUMA nodes", nb_numa_nodes);
+        return;
+    }
+
+    bitmap_copy(host_mem, numa_info[nodeid].host_mem, MAX_CPUMASK_BITS);
+    flags = numa_info[nodeid].flags;
+
+    numa_info[nodeid].flags = NODE_HOST_NONE;
+    bitmap_zero(numa_info[nodeid].host_mem, MAX_CPUMASK_BITS);
+
+    if (!has_mpol) {
+        if (set_node_mpol(nodeid) == -1) {
+            error_setg(errp, "Failed to set memory policy for node%lu", nodeid);
+            goto error;
+        }
+        return;
+    }
+
+    numa_node_parse_mpol(nodeid, mpol, errp);
+    if (error_is_set(errp)) {
+        goto error;
+    }
+
+    if (!has_hostnode) {
+        bitmap_fill(numa_info[nodeid].host_mem, MAX_CPUMASK_BITS);
+    }
+
+    if (hostnode) {
+        numa_node_parse_hostnode(nodeid, hostnode, errp);
+        if (error_is_set(errp)) {
+            goto error;
+        }
+    }
+
+    if (set_node_mpol(nodeid) == -1) {
+        error_setg(errp, "Failed to set memory policy for node%lu", nodeid);
+        goto error;
+    }
+
+    return;
+
+error:
+    bitmap_copy(numa_info[nodeid].host_mem, host_mem, MAX_CPUMASK_BITS);
+    numa_info[nodeid].flags = flags;
+    return;
+}
diff --git a/qapi-schema.json b/qapi-schema.json
index 5c32528..0870da2 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3712,3 +3712,18 @@
 '*cpuid-input-ecx': 'int',
 'cpuid-register': 'X86CPURegister32',
 'features': 'int' } }
+
+# @set-mpol:
+#
+# Set the host memory binding policy for guest NUMA node.
+#
+# @nodeid: The node ID of guest NUMA node to set memory policy to.
+#
+# @mem-policy: The memory policy string to set.
+#
+# @mem-hostnode: The host node or node range for memory policy.
+#
+# Since: 1.6.0
+##
+{ 'command': 'set-mpol', 'data': {'nodeid': 'int', '*mem-policy': 'str',
+  '*mem-hostnode': 'str'} }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 362f0e1..ccab51b 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -3043,3 +3043,38 @@ Example:
 - { return: {} }
 
 EQMP
+
+    {
+        .name      = "set-mpol",
+        .args_type = "nodeid:i,mem-policy:s?,mem-hostnode:s?",
+        .help      = "Set the host memory binding policy for guest NUMA node",
+        .mhandler.cmd_new = qmp_marshal_input_set_mpol,
+    },
+
+SQMP
+set-mpol
+--
+
+Set the host memory binding policy for guest NUMA node
+
+Arguments:
+
+- "nodeid": The nodeid of guest NUMA node to set memory policy to.
+            (json-int)
+- "mem-policy": The memory policy string to set.
+                (json-string, optional)
+- "mem-hostnode": The host nodes contained to mpol.
+                  (json-string, optional)
+
+Example:
+
+-> { "execute": "set-mpol", "arguments": { "nodeid": 0, "mem-policy": "membind",
+                                           "mem-hostnode": "0-1" }}
+<- { "return": {} }
+
+Notes:
+    1. If mem-policy is not set, the memory policy of this nodeid will be set
+       to default.
+    2. If mem-hostnode is not set, the node mask of this mpol will be set
+       to all.
+EQMP
-- 
1.8.3.2.634.g7a3187e

Re: [Qemu-devel] [RFC V8 01/24] qcow2: Add journal specification.

2013-07-04 Thread Benoît Canet

> Simple is good.  Even for deduplication alone, I think data integrity is
> critical - otherwise we risk stale dedup metadata pointing to clusters
> that are unallocated or do not contain the right data.  So the journal
> will probably need to follow techniques for commits/checksums.

I agree that checksums are missing for the dedup.
Maybe we could even use some kind of error correcting code instead of a
checksum.

Concerning data integrity, the events that the deduplication code cannot lose
are hash deletions, because they mark a previously inserted hash as obsolete.

The problem with a commit/flush mechanism on hash deletion is that it will slow
down the store insertion speed and also cause some extra SSD wear.

To solve this I relied on the fact that the dedup metadata as a whole is
disposable.

So I implemented a dedup dirty bit.

When QEMU stops, the journal is flushed and the dirty bit is cleared.
When QEMU starts and the dirty bit is set, a crash is detected and _all_ the
deduplication metadata is dropped.
The QCOW2 data integrity won't suffer; only the dedup ratio will be lower.

As you said once on irc crashes don't happen often.
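
The dirty-bit scheme can be sketched as follows. This is only an in-memory
illustration, not the qcow2 code: dedup_store and its helpers are
hypothetical, and the real journal flush is reduced to a comment.

```c
#include <stdbool.h>
#include <stddef.h>

struct dedup_store {
    bool dirty;         /* set while the image is open */
    size_t nb_hashes;   /* amount of dedup metadata currently held */
};

/*
 * Open the store. If the dirty bit is still set, the previous session
 * never reached dedup_close(), so all dedup metadata is dropped.
 * Returns true if a crash was detected.
 */
static bool dedup_open(struct dedup_store *s)
{
    bool crashed = s->dirty;

    if (crashed) {
        s->nb_hashes = 0;   /* metadata is disposable: start over */
    }
    s->dirty = true;
    return crashed;
}

/* Clean shutdown: the journal would be flushed here, then the bit cleared. */
static void dedup_close(struct dedup_store *s)
{
    s->dirty = false;
}
```

Only a missing dedup_close() costs anything, and the cost is a lower dedup
ratio rather than any loss of guest data.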

Benoît

Re: [Qemu-devel] Reply: Re: Reply: Re: Which part of qemu responds to ACPI control method?

2013-07-04 Thread Laszlo Ersek

On 07/04/13 08:05, bobooscar wrote:
> Thank you laszlo. however, after I got DSDT.dsl, I found that there is
> no “_PTS” method, even if “_TTS” “_GTS”.
> Then I go through all the acpi tables, still found no PTS/TTS methods:
>     acpidump > acpidump.out
>     acpixtract -a acpidump.out
>     for file in `ls |grep dat`; do iasl -a $file; done
>
> The guest os is redhat 6.1 hvm.
>
> What does that mean? Does that mean this OS does not support
> sleep/wakeup(suspend/resume) with acpi? What caused this problem? Does
> that have anything to do with qemu? (I tried to add logs in
> hwsleep.c:acpi_enter_sleep_mode in the guest kernel code, and found that
> the os does not get here)

Some tables can have several instances, like SSDT; see the --skip option.

But, it seems reasonable that you have found no _PTS method, as SeaBIOS
doesn't seem to define such. Since you started your email with _PTS, I
treated the method as something given in your case. Now I'm supposing
you use SeaBIOS and _PTS not being there is consistent with that.

So, back to square 1, what is your *actual* problem?

In any of the dumped / decompiled SSDT tables, do you see _S3, _S4, _S5
objects?

Maybe try passing the following options to qemu:

  -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0

Laszlo

Re: [Qemu-devel] [PATCH v3 01/14] tcg: Add myself to general TCG maintainership

2013-07-04 Thread Peter Maydell

On 3 July 2013 22:29, Richard Henderson r...@twiddle.net wrote:
> Signed-off-by: Richard Henderson r...@twiddle.net

I think this is definitely a good idea; thanks!

Acked-by: Peter Maydell peter.mayd...@linaro.org

-- PMM

Re: [Qemu-devel] [PATCH v3 02/14] tcg: Split rem requirement from div requirement

2013-07-04 Thread Peter Maydell

On 3 July 2013 22:29, Richard Henderson r...@twiddle.net wrote:
> There are several hosts with only a div insn.  Remainder is computed
> manually from the quotient and inputs.  We can do this generically.
>
> Signed-off-by: Richard Henderson r...@twiddle.net
> --- a/tcg/tci/tcg-target.h
> +++ b/tcg/tci/tcg-target.h
> @@ -61,6 +61,7 @@
>  #define TCG_TARGET_HAS_bswap32_i32      1
>  /* Not more than one of the next two defines must be 1. */
>  #define TCG_TARGET_HAS_div_i32          1
> +#define TCG_TARGET_HAS_rem_i32          1
>  #define TCG_TARGET_HAS_div2_i32         0
>  #define TCG_TARGET_HAS_ext8s_i32        1
>  #define TCG_TARGET_HAS_ext16s_i32       1
> @@ -85,6 +86,7 @@
>  #define TCG_TARGET_HAS_deposit_i64      1
>  /* Not more than one of the next two defines must be 1. */
>  #define TCG_TARGET_HAS_div_i64          0
> +#define TCG_TARGET_HAS_rem_i64          0
>  #define TCG_TARGET_HAS_div2_i64         0
>  #define TCG_TARGET_HAS_ext8s_i64        1
>  #define TCG_TARGET_HAS_ext16s_i64       1

The added line in these two hunks makes the comments wrong,
doesn't it?

Other than that,
Reviewed-by: Peter Maydell peter.mayd...@linaro.org

-- PMM
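
For reference, the generic fallback described in the quoted commit message
(remainder computed from the quotient and the inputs on hosts that only have
a divide instruction) amounts to rem = a - (a / b) * b. A minimal sketch in
plain C rather than TCG ops; the function names are illustrative only, and
the caller is assumed to rule out b == 0 and INT32_MIN / -1:

```c
#include <stdint.h>

/* Signed remainder from a truncating divide: a - (a / b) * b. */
static int32_t rem_from_div(int32_t a, int32_t b)
{
    int32_t q = a / b;      /* the only "hardware" operation assumed */

    return a - q * b;       /* multiply and subtract recover the remainder */
}

/* Unsigned variant of the same identity. */
static uint32_t remu_from_divu(uint32_t a, uint32_t b)
{
    uint32_t q = a / b;

    return a - q * b;
}
```

Because C division truncates toward zero, the result carries the sign of the
dividend, matching the % operator.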

Re: [Qemu-devel] [PATCH v3 06/14] tcg: Allow non-constant control macros

2013-07-04 Thread Peter Maydell

On 3 July 2013 22:29, Richard Henderson r...@twiddle.net wrote:
> This allows TCG_TARGET_HAS_* to be a variable rather than a constant,
> which allows easier support for differing ISA levels for the host.

The effect of this is that TCG_OPF_NOT_PRESENT means "if set,
op is definitely not present; if not set, op might or might
not be present", right? Which is OK because it's just a debug
guard/sanity check. (That might be worth noting in a comment
I guess.)

Reviewed-by: Peter Maydell peter.mayd...@linaro.org

-- PMM

Re: [Qemu-devel] [PATCH v3 07/14] tcg: Simplify logic using TCG_OPF_NOT_PRESENT

2013-07-04 Thread Peter Maydell

On 3 July 2013 22:29, Richard Henderson r...@twiddle.net wrote:
> Expand the definition of "not present" to include "should not be present".
> This means we can simplify the logic surrounding the generic tcg opcodes
> for which the host backend ought not be providing definitions.
>
> Signed-off-by: Richard Henderson r...@twiddle.net

Reviewed-by: Peter Maydell peter.mayd...@linaro.org

-- PMM

Re: [Qemu-devel] [PATCH v3 03/14] tcg-arm: Don't implement rem

2013-07-04 Thread Peter Maydell

On 3 July 2013 22:29, Richard Henderson r...@twiddle.net wrote:
 Signed-off-by: Richard Henderson r...@twiddle.net

Reviewed-by: Peter Maydell peter.mayd...@linaro.org

-- PMM

Re: [Qemu-devel] [PATCH v3 08/14] tcg-arm: Make use of conditional availability of opcodes for divide

2013-07-04 Thread Peter Maydell

On 3 July 2013 22:29, Richard Henderson r...@twiddle.net wrote:
 We can now detect and use divide instructions at runtime, rather than
 having to restrict their availability to compile-time.

 Signed-off-by: Richard Henderson r...@twiddle.net
 ---
  tcg/arm/tcg-target.c | 16 ++--
  tcg/arm/tcg-target.h | 14 --
  2 files changed, 22 insertions(+), 8 deletions(-)

 diff --git a/tcg/arm/tcg-target.c b/tcg/arm/tcg-target.c
 index 8321f80..2c46ceb 100644
 --- a/tcg/arm/tcg-target.c
 +++ b/tcg/arm/tcg-target.c
 @@ -67,6 +67,13 @@ static const int use_armv7_instructions = 0;
  #endif
  #undef USE_ARMV7_INSTRUCTIONS

 +#ifndef use_idiv_instructions
 +bool use_idiv_instructions;
 +#endif
 +#ifdef CONFIG_GETAUXVAL
 +# include <sys/auxv.h>
 +#endif

My ARM system doesn't have a sys/auxv.h, which renders most of this patch
a bit moot (and certainly untestable :-)). Do newer glibc have this?

 +
  #ifndef NDEBUG
  static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
  "%r0",
 @@ -2029,16 +2036,21 @@ static const TCGTargetOpDef arm_op_defs[] = {

  { INDEX_op_deposit_i32, { "r", "0", "rZ" } },

 -#if TCG_TARGET_HAS_div_i32
  { INDEX_op_div_i32, { "r", "r", "r" } },
  { INDEX_op_divu_i32, { "r", "r", "r" } },
 -#endif

  { -1 },
  };

  static void tcg_target_init(TCGContext *s)
  {
 +#if defined(CONFIG_GETAUXVAL) && !defined(use_idiv_instructions)
 +{
 +unsigned long hwcap = getauxval(AT_HWCAP);
 +use_idiv_instructions = hwcap & (HWCAP_ARM_IDIVA | HWCAP_ARM_IDIVT);

Doesn't this mean we'll try to use the ARM division
insns even if the CPU only supports the Thumb encodings?
I think you should only be testing for whether HWCAP_ARM_IDIVA
is set.
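
The difference between the two tests can be sketched as follows. This is
only an illustration, not code from the patch: the SKETCH_* constants are
assumptions that mirror the bit positions Linux uses for the ARM hwcaps,
and the function names are made up. Real code would take the
HWCAP_ARM_IDIVA and HWCAP_ARM_IDIVT definitions from the libc headers.

```c
#include <stdbool.h>

/* Assumed bit positions, mirroring the Linux ARM hwcap layout. */
#define SKETCH_HWCAP_IDIVA (1UL << 17)  /* SDIV/UDIV available in ARM state   */
#define SKETCH_HWCAP_IDIVT (1UL << 18)  /* SDIV/UDIV available in Thumb state */

/* A backend that emits ARM-state code can use divide only when the
 * ARM-state encodings exist; Thumb-only support is not sufficient. */
static bool arm_state_idiv_usable(unsigned long hwcap)
{
    return (hwcap & SKETCH_HWCAP_IDIVA) != 0;
}

/* The test as written in the quoted hunk, kept for comparison:
 * it is true when either encoding is reported. */
static bool either_idiv_reported(unsigned long hwcap)
{
    return (hwcap & (SKETCH_HWCAP_IDIVA | SKETCH_HWCAP_IDIVT)) != 0;
}
```

On a CPU that reports only the Thumb bit, either_idiv_reported() is true
while arm_state_idiv_usable() is false, which is the case the comment above
is worried about.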

 +}
 +#endif

thanks
-- PMM

Re: [Qemu-devel] [PATCH v3 09/14] tcg-arm: Simplify logic in detecting the ARM ISA in use

2013-07-04 Thread Peter Maydell

On 3 July 2013 22:29, Richard Henderson r...@twiddle.net wrote:
 -#if defined(__ARM_ARCH_7__) ||  \
 -defined(__ARM_ARCH_7A__) || \
 -defined(__ARM_ARCH_7EM__) || \
 -defined(__ARM_ARCH_7M__) || \
 -defined(__ARM_ARCH_7R__)
 -#define USE_ARMV7_INSTRUCTIONS
 +/* The __ARM_ARCH define is provided by gcc 4.8.  Construct it otherwise.  */
 +#ifndef __ARM_ARCH
 +# if defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) \
 + || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) \
 + || defined(__ARM_ARCH_7EM__)
 +#  define __ARM_ARCH 7
 +# elif defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) \
 +   || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__) \
 +   || defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6T2__)
 +#  define __ARM_ARCH 6
 +# elif defined(__ARM_ARCH_5__) || defined(__ARM_ARCH_5E__) \
 +   || defined(__ARM_ARCH_5T__) || defined(__ARM_ARCH_5TE__) \
 +   || defined(__ARM_ARCH_5TEJ__)
 +#  define __ARM_ARCH 5
 +# else
 +#  define __ARM_ARCH 4
 +# endif
  #endif

 -#if defined(USE_ARMV7_INSTRUCTIONS) || \
 -defined(__ARM_ARCH_6J__) || \
 -defined(__ARM_ARCH_6K__) || \
 -defined(__ARM_ARCH_6T2__) || \
 -defined(__ARM_ARCH_6Z__) || \
 -defined(__ARM_ARCH_6ZK__)
 -#define USE_ARMV6_INSTRUCTIONS
 -#endif
 -
 -#if defined(USE_ARMV6_INSTRUCTIONS) || \
 -defined(__ARM_ARCH_5T__) || \
 -defined(__ARM_ARCH_5TE__) || \
 -defined(__ARM_ARCH_5TEJ__)
 -#define USE_ARMV5_INSTRUCTIONS
 -#endif

This change means we now set use_armv5_instructions
for __ARM_ARCH_5__ and __ARM_ARCH_5E__, which we didn't
before. However one of the things that bool is gating is
whether we use the 'blx' insn, which is ARMv5T and above only.
So this will break v5-but-not-v5T CPUs.

(use_armv6_instructions is similarly now set for __ARM_ARCH_6__
where it was not before, but none of the things we guard with
that test are insns that aren't in base v6.)
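
A small sketch of the distinction, under stated assumptions: the HostIsa
type and both function names are hypothetical, and the rule that v6 and
later always provide blx is my assumption, not something stated in the
patch. The point is only that a check on the major version alone accepts
plain v5, where blx is missing.

```c
#include <stdbool.h>

/* Hypothetical description of the host ISA. */
typedef struct {
    int  arch;       /* architecture major version: 4, 5, 6, 7 */
    bool has_thumb;  /* true for the "T" variants such as v5T, v5TE, v5TEJ */
} HostIsa;

/* blx is ARMv5T and above; assumed always present from v6 on. */
static bool host_has_blx(HostIsa isa)
{
    if (isa.arch >= 6) {
        return true;
    }
    return isa.arch == 5 && isa.has_thumb;
}

/* What a guard on the major version alone amounts to. */
static bool version_only_check(HostIsa isa)
{
    return isa.arch >= 5;
}
```

For a v5-but-not-v5T host the two functions disagree, which is the breakage
described above.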

thanks
-- PMM

Re: [Qemu-devel] [PATCH 6/9] vhost-scsi: new device supporting the tcm_vhost Linux kernel module

2013-07-04 Thread Paolo Bonzini

On 03/07/2013 14:33, Libaiqing wrote:
 Hi asias,
 I found the root cause: the guest was installed on a raw image with an LVM
 partition, which vhost does not support.

You mean LVM in the host or the guest?  I guess in the host, but I'd
rather make sure because otherwise you've found a bug.

Paolo

 Now vhost-scsi can be used as a bootable device.

Re: [Qemu-devel] [PATCH v3 12/14] tcg: Move the CIE and FDE header definitions to common code

2013-07-04 Thread Peter Maydell

On 3 July 2013 22:29, Richard Henderson r...@twiddle.net wrote:
 These will necessarily be the same layout for all hosts.  This limits
 the amount of boilerplate required to implement jit debug for a host.

 Signed-off-by: Richard Henderson r...@twiddle.net

Reviewed-by: Peter Maydell peter.mayd...@linaro.org

-- PMM

Re: [Qemu-devel] [PATCH v3 14/14] tcg-arm: Implement tcg_register_jit

2013-07-04 Thread Peter Maydell

On 3 July 2013 22:29, Richard Henderson r...@twiddle.net wrote:
 Allows unwinding past the code_gen_buffer.

 Signed-off-by: Richard Henderson r...@twiddle.net

Reviewed-by: Peter Maydell peter.mayd...@linaro.org

-- PMM

Re: [Qemu-devel] [PATCH v3 11/14] tcg: Fix high_pc fields in .debug_info

2013-07-04 Thread Peter Maydell

On 3 July 2013 22:29, Richard Henderson r...@twiddle.net wrote:
 I don't think the debugger actually looks at this for anything,
 using the correct .debug_frame contents, but might as well get
 it all correct.

 Signed-off-by: Richard Henderson r...@twiddle.net

Reviewed-by: Peter Maydell peter.mayd...@linaro.org

-- PMM

Re: [Qemu-devel] [PATCH] highbank: add initial Calxeda Midway A15 support

2013-07-04 Thread Peter Maydell

On 28 June 2013 12:59, Andre Przywara andre.przyw...@calxeda.com wrote:
 From: Rob Herring rob.herr...@calxeda.com

 While the Calxeda Midway part is actually a bit more than a Highbank
 with A15s, for QEMU's purposes this view is sufficient. So to allow
 both emulation with that chip as well as KVM guests using that model
 add an A15 CPU and its peripherals as an option. The use of:
 -M highbank -cpu cortex-a15 simply gives the new chip without the
 need for a new model.

I don't think we have any other board models which do "I'm going
to guess which board you actually wanted based on which CPU
you specified", do we? I think it would be nicer just to have
a '-M midway' which gave you the right CPU and peripherals.

thanks
-- PMM

[Qemu-devel] [Bug 1197663] Re: qcow2 [virtio-scsi] devices when mapped to the guest shows as 0MB irrespective of the volume size

2013-07-04 Thread chandrashekar shastri

I tried booting with qemu directly and observed the same thing.

qemu-system-x86_64 /home/images/rhel64-64.qcow2 \
  -drive if=none,id=hd,file=/home/images/virtio-scsi11.img \
  -device virtio-scsi-pci,id=scsi --enable-kvm \
  -device scsi-hd,drive=hd -m 2000

After creating the filesystem, I ran iozone and hit a disk out of space
error.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1197663

Title:
  qcow2 [virtio-scsi] devices when mapped to the guest shows as 0MB
  irrespective of the volume size

Status in QEMU:
  New
Status in Fedora:
  New

Bug description:
  qcow2 [virtio-scsi] devices when mapped to the guest shows as 0MB
  irrespective of the volume size.

  Kernel Version: 3.10.0-rc5+

  Libvirt Version: 1.0.6

  Qemu Version: 1.5.50

  Steps to reproduce the issue:

  1. Create a qcow2 volume using the command
     qemu-img create -f qcow2 virtio-scsi11.img 10G

  2. Add the virtio-scsi controller:

     <controller type='scsi' index='0' model='virtio-scsi'>
       <address type='pci' domain='0x' bus='0x00' slot='0x04' function='0x0'/>
     </controller>

  3. Attach the qcow2 device to the guest:
     virsh attach-disk rhel64-64 /home/images/virtio-scsi11.img --persistent sdr --cache writethrough

  4. If the attached volume is not recognized, run the scan command
     echo ' - - - ' > /sys/class/scsi_host/host#/scan

  5. Check the dmesg for the added volume.

  6. Run fdisk -l command

  Disk /dev/sdl: 0 MB, 197120 bytes
  1 heads, 1 sectors/track, 385 cylinders, total 385 sectors
  Units = cylinders of 1 * 512 = 512 bytes
  Sector size (logical/physical): 512 bytes / 512 bytes
  I/O size (minimum/optimal): 512 bytes / 512 bytes
  Disk identifier: 0x

  
  And observe that the 10G qcow2 volume shows as 0MB.

  This is not seen with the raw image.

  Disk /dev/sdm: 10.7 GB, 10737418240 bytes
  64 heads, 32 sectors/track, 10240 cylinders
  Units = cylinders of 2048 * 512 = 1048576 bytes
  Sector size (logical/physical): 512 bytes / 512 bytes
  I/O size (minimum/optimal): 512 bytes / 512 bytes
  Disk identifier: 0x
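
The two fdisk reports can be cross-checked with simple arithmetic. The
sketch below only restates the numbers shown above (385 sectors of 512
bytes, and a 10G volume); the helper names are made up for illustration
and it makes no claim about the cause of the mismatch.

```c
#include <stdint.h>

/* Size in bytes of a device with the given sector count and sector size. */
static uint64_t sectors_to_bytes(uint64_t sectors, uint32_t sector_size)
{
    return sectors * sector_size;
}

/* Size in bytes of a volume given in GiB (1 GiB = 2^30 bytes). */
static uint64_t gib_to_bytes(uint64_t gib)
{
    return gib << 30;
}
```

With these, 385 sectors of 512 bytes gives the 197120 bytes reported for
/dev/sdl, and 10 GiB gives the 10737418240 bytes reported for the raw
image on /dev/sdm (64 heads * 32 sectors/track * 10240 cylinders =
20971520 sectors of 512 bytes).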

  Expected Result:

  The volume size for the qcow2 volumes should be shown correctly inside
  the guest to avoid confusion.

  
  Guest XML:
  virsh dumpxml rhel64-64
  <domain type='kvm' id='4'>
    <name>rhel64-64</name>
    <uuid>48deb0e1-0c23-9be9-da12-2ead34864de2</uuid>
    <memory unit='KiB'>4096000</memory>
    <currentMemory unit='KiB'>4096000</currentMemory>
    <vcpu placement='static'>1</vcpu>
    <resource>
      <partition>/machine</partition>
    </resource>
    <os>
      <type arch='x86_64' machine='pc-i440fx-1.5'>hvm</type>
      <boot dev='hd'/>
    </os>
    <features>
      <acpi/>
      <apic/>
      <pae/>
    </features>
    <clock offset='utc'/>
    <on_poweroff>destroy</on_poweroff>
    <on_reboot>restart</on_reboot>
    <on_crash>restart</on_crash>
    <devices>
      <emulator>/usr/local/bin/qemu-system-x86_64</emulator>
      <disk type='file' device='disk'>
        <driver name='qemu' type='qcow2' cache='none'/>
        <source file='/home/images/rhel64-64.qcow2'/>
        <target dev='hda' bus='ide'/>
        <alias name='ide0-0-0'/>
        <address type='drive' controller='0' bus='0' target='0' unit='0'/>
      </disk>
      <disk type='file' device='cdrom'>
        <driver name='qemu' type='raw'/>
        <source file='/home/upstream/autotest/virt-test/shared/data/isos/RHEL-6.4-x86_64-DVD.iso'/>
        <target dev='hdb' bus='ide'/>
        <readonly/>
        <alias name='ide0-0-1'/>
        <address type='drive' controller='0' bus='0' target='0' unit='1'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw' cache='writethrough'/>
        <source file='/home/images/virtio-scsi11.img'/>
        <target dev='sda' bus='scsi'/>
        <alias name='scsi0-0-0-0'/>
        <address type='drive' controller='0' bus='0' target='0' unit='0'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw' cache='writethrough'/>
        <source file='/home/images/virtio-scsi1.img'/>
        <target dev='sdf' bus='scsi'/>
        <alias name='scsi0-0-0-5'/>
        <address type='drive' controller='0' bus='0' target='0' unit='5'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw' cache='writethrough'/>
        <source file='/home/images/virtio-scsi9.img'/>
        <target dev='sdg' bus='scsi'/>
        <alias name='scsi0-0-0-6'/>
        <address type='drive' controller='0' bus='0' target='0' unit='6'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw' cache='writethrough'/>
        <source file='/home/images/virtio-scsi8.img'/>
        <target dev='sdh' bus='scsi'/>
        <alias name='scsi1-0-0'/>
        <address type='drive' controller='1' bus='0' target='0' unit='0'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw' cache='writethrough'/>
        <source file='/home/images/virtio-scsi10.img'/>
        <target

[Qemu-devel] [Bug 1197663] Re: qcow2 [virtio-scsi] devices when mapped to the guest shows as 0MB irrespective of the volume size

2013-07-04 Thread chandrashekar shastri

** Attachment added: "Screen shot"
   https://bugs.launchpad.net/fedora/+bug/1197663/+attachment/3724283/+files/Screenshot2.png

Re: [Qemu-devel] [PATCH V3 4/9] qmp: add internal snapshot support in qmp_transaction

2013-07-04 Thread Stefan Hajnoczi

On Thu, Jun 27, 2013 at 10:41:43AM +0800, Wenchao Xia wrote:
 +/* check whether a snapshot with name exist, no need to check id, since
 +   name will be checked later to make sure it does not mess up with id. 
 */
 +ret = bdrv_snapshot_find_by_id_and_name(bs, NULL, name, sn, errp);
 +if (error_is_set(errp)) {
 +return;
 +}
 +if (ret) {
 +error_setg(errp,
 +   "Snapshot with name '%s' already exist on device '%s'",

s/exist/exists/

 +   name, device);
 +return;
 +}
 +
 +/* Forbid having a name similar to id, empty name is also forbidden. */
 +if (!snapshot_name_wellformed(name)) {
 +error_setg(errp, "Name '%s' on device '%s' is not a valid one",
 +   name, device);
 +return;
 +}
 +
 +/* 3. take the snapshot */
 +sn1 = &state->sn;
 +pstrcpy(sn1->name, sizeof(sn1->name), name);
 +qemu_gettimeofday(&tv);
 +sn1->date_sec = tv.tv_sec;
 +sn1->date_nsec = tv.tv_usec * 1000;
 +sn1->vm_clock_nsec = qemu_get_clock_ns(vm_clock);
 +
 +if (bdrv_snapshot_create(bs, sn1) < 0) {
 +error_setg(errp, "Failed to create snapshot '%s' on device '%s'",
 +   name, device);

Please use error_setg_errno() to include the bdrv_snapshot_create()
error message.

 @@ -1009,6 +1010,18 @@ the new image file has the same contents as the 
 current one; QEMU cannot
  perform any meaningful check.  Typically this is achieved by using the
  current image file as the backing file for the new image.
  
 +On failure, the original disks pre-snapshot attempt will be used.
 +
 +For internal snapshots, the dictionary contains the device and the snapshot's
 +name.  If name is a numeric string which will mess up with ID, the request 
 will

This is about namespace collision.  Collide or conflict are usually
used to describe identical naming problems.  Instead of mess up I
would say something like:

The name must not be a numeric string since this collides with snapshot
IDs and an error will be returned.
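
The rule described here (no empty names, no purely numeric names) could be
checked roughly as below. This is a hypothetical stand-in written for
illustration; the body of the patch's snapshot_name_wellformed() is not
shown in this thread and may well differ.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns false for NULL, the empty string, and names made up only of
 * decimal digits (which would collide with snapshot IDs); true otherwise. */
static bool name_is_wellformed(const char *name)
{
    if (name == NULL || name[0] == '\0') {
        return false;
    }
    for (const char *p = name; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p)) {
            return true;   /* one non-digit makes the name unambiguous */
        }
    }
    return false;          /* every character was a digit */
}
```

Under this sketch "99" is rejected while "snap1" and "9a" are accepted.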

 +be rejected.  For example, name 99 is not a valid name.  If an internal
 +snapshot matching name already exists, the request will be also rejected.  
 Only
 +some image formats support it, for example, qcow2, rbd, and sheepdog.
 +
 +On failure, qemu will try delete new created internal snapshot in the

s/new created/the newly created/

 +transaction.  When I/O error causes deletion failure, the user needs to fix 
 it

When an I/O error occurs during deletion, ...

Re: [Qemu-devel] [PATCHv2 02/11] iscsi: read unmap info from block limits vpd page

2013-07-04 Thread Paolo Bonzini

On 03/07/2013 23:23, Peter Lieven wrote:
 BDC is not used. I had an implementation that sent multiple descriptors out, 
 but
 at least for my storage the maximum unmap counts not for each descriptors, 
 but for all
 together. So in this case we do not need the field at all. I forgot to remove 
 it.
 
 discard and write_zeroes will both only send one request up to max_unmap in 
 size.
 
 apropos write_zeroes: do you know if UNMAP is guaranteed to unmap data if 
 lbprz == 1?

Yes.  On the other hand note that WRITE_SAME should be guaranteed _not_
to unmap if lbprz == 0 and you do WRITE_SAME with UNMAP and a zero
payload, but I suspect there may be buggy targets here.

 I have read in the specs something that the target might unmap the blocks or 
 not touch them at all.
 Maybe you have more information.

That's even true of UNMAP itself, actually. :)

The storage can always upgrade a block from unmapped to anchored and
from anchored to allocated, so UNMAP can be a no-op and still comply
with the standard.
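
One way to picture the paragraph above is as an ordering on block states.
The enum and function below are illustrative assumptions, not taken from
QEMU or from the SCSI specification text: states are ordered so that the
target may always move a block "upward" on its own.

```c
#include <stdbool.h>

/* Ordered so that a larger value means "more provisioned". */
typedef enum {
    BLK_UNMAPPED  = 0,
    BLK_ANCHORED  = 1,
    BLK_ALLOCATED = 2
} BlkState;

/* True if the target may present `actual` when the initiator asked for
 * `requested`: the target can always upgrade, never silently downgrade. */
static bool target_may_present(BlkState requested, BlkState actual)
{
    return actual >= requested;
}
```

An UNMAP asks for BLK_UNMAPPED, so any resulting state passes this check,
including the block staying allocated, which is the no-op case mentioned
above.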

Paolo

Re: [Qemu-devel] [PATCH V3 5/9] qmp: add interface blockdev-snapshot-internal-sync

2013-07-04 Thread Stefan Hajnoczi

On Thu, Jun 27, 2013 at 10:41:44AM +0800, Wenchao Xia wrote:
 diff --git a/qapi-schema.json b/qapi-schema.json
 index 2547a7d..fba9b15 100644
 --- a/qapi-schema.json
 +++ b/qapi-schema.json
 @@ -1738,6 +1738,28 @@
  '*mode': 'NewImageMode'} }
  
  ##
 +# @blockdev-snapshot-internal-sync
 +#
 +# Synchronously take an internal snapshot of a block device, when the format
 +# of the image used supports it.
 +#
 +# @device: the name of the device to generate the snapshot from
 +#
 +# @name: the name of new snapshot name
 +#
 +# Returns: nothing on success
 +#  If @device is not a valid block device, DeviceNotFound
 +#  If any snapshot matching @name exists, or the name string is 
 invalid
 +#  which may mess up with snapshot ID, or name is empty, GenericError

s/mess up with snapshot ID/collide with snapshot IDs/

 +image used supports it.  If the name is a numeric string which may mess up 
 with

Same here.

Re: [Qemu-devel] [PATCH V3 6/9] qmp: add interface blockdev-snapshot-delete-internal-sync

2013-07-04 Thread Stefan Hajnoczi

On Thu, Jun 27, 2013 at 10:41:45AM +0800, Wenchao Xia wrote:
 diff --git a/blockdev.c b/blockdev.c
 index ce89f83..35fffd6 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -790,6 +790,67 @@ void qmp_blockdev_snapshot_internal_sync(const char 
 *device,
 snapshot, errp);
  }
  
 +SnapshotInfo *qmp_blockdev_snapshot_delete_internal_sync(const char *device,
 + bool has_id,
 + const char *id,
 + bool has_name,
 + const char *name,
 + Error **errp)
 +{
 +BlockDriverState *bs = bdrv_find(device);
 +QEMUSnapshotInfo sn;
 +Error *local_err = NULL;
 +SnapshotInfo *info = NULL;
 +int ret;
 +
 +if (!bs) {
 +error_set(errp, QERR_DEVICE_NOT_FOUND, device);
 +return NULL;
 +};

Spurious ';'

 +
 +if (!has_id) {
 +id = NULL;
 +}
 +
 +if (!has_name) {
 +name = NULL;
 +}
 +
 +if (!id && !name) {
 +error_setg(errp, "Name or id must be provided");
 +return NULL;
 +}
 +
 +ret = bdrv_snapshot_find_by_id_and_name(bs, id, name, &sn, &local_err);
 +if (error_is_set(&local_err)) {
 +error_propagate(errp, local_err);
 +return NULL;
 +}
 +if (!ret) {
 +error_setg(errp,
 +   "Snapshot with id '%s' and name '%s' do not exist on "

s/do not exist/does not exist/

 diff --git a/qapi-schema.json b/qapi-schema.json
 index fba9b15..ffcdca7 100644
 --- a/qapi-schema.json
 +++ b/qapi-schema.json
 @@ -1760,6 +1760,33 @@
'data': { 'device': 'str', 'name': 'str'} }
  
  ##
 +# @blockdev-snapshot-delete-internal-sync
 +#
 +# Synchronously delete an internal snapshot of a block device, when the 
 format
 +# of the image used support it. The snapshot is identified by name or id or
 +# both. One of the name or id is required. It will returns SnapshotInfo of
 +# successfully deleted snapshot.

Return SnapshotInfo for the successfully deleted snapshot.

 +SQMP
 +blockdev-snapshot-delete-internal-sync
 +--
 +
 +Synchronously delete an internal snapshot of a block device when the format 
 of
 +image used support it.  The snapshot is identified by name or id or both.  
 One

s/support/supports/

 +of the name or id is required.  If the snapshot is not found, operation will
 +fail.

s/One of the name or id/One of name or id/

s/operation will fail/the operation will fail/

Re: [Qemu-devel] [PATCH V3 0/9] add internal snapshot support at block device level

2013-07-04 Thread Stefan Hajnoczi

On Wed, Jul 03, 2013 at 09:52:10AM +0800, Wenchao Xia wrote:
   Any comments for this version?

I'm happy with the code and left comments on error messages and
documentation.

Re: [Qemu-devel] [Xen-devel] [PATCH] libxl: Spice usbredirection support for upstream qemu

2013-07-04 Thread Stefano Stabellini

Please don't use HTML in emails

On Thu, 4 Jul 2013, Fabio Fantoni wrote:
 On 04/07/2013 12:32, Wei Liu wrote:
 
 On Thu, Jul 04, 2013 at 12:16:43PM +0200, Fabio Fantoni wrote:
 
 On 04/07/2013 12:12, Wei Liu wrote:
 
 On Thu, Jul 04, 2013 at 12:05:59PM +0200, Fabio Fantoni wrote:
 
 Usage: spiceusbredirection=1|0 (default=0)
 Enables spice usbredirection. The Spice usbredirection creates usb2
 controller and 4 usbredirection channels for redirection of up to 4
 usb devices from spice client to domU's qemu.
 
 Signed-off-by: Fabio Fantoni fabio.fant...@m2r.biz
 ---
  docs/man/xl.cfg.pod.5   |8 
  tools/libxl/libxl_create.c  |1 +
  tools/libxl/libxl_dm.c  |   18 ++
  tools/libxl/libxl_types.idl |1 +
  tools/libxl/xl_cmdimpl.c|2 ++
  5 files changed, 30 insertions(+)
 
 diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
 index 766862d..a450800 100644
 --- a/docs/man/xl.cfg.pod.5
 +++ b/docs/man/xl.cfg.pod.5
 @@ -1134,6 +1134,14 @@ requires vdagent service installed on domU o.s. to 
 work. The default is 0.
  =back
 +=item B<spiceusbredirection=BOOLEAN>
 +
 +Enables spice usbredirection. The Spice usbredirection creates usb2
 +controller and 4 usbredirection channels for redirection of up to 4 usb
 +devices from spice client to domU's qemu. The default is 0.
 +
 +=back
 +
  =head3 Miscellaneous Emulated Hardware
  =over 4
 diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
 index 8db5460..58df106 100644
 --- a/tools/libxl/libxl_create.c
 +++ b/tools/libxl/libxl_create.c
 @@ -289,6 +289,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
   false);
  libxl_defbool_setdefault(&b_info->u.hvm.spice.agent_mouse, true);
  libxl_defbool_setdefault(&b_info->u.hvm.spice.vdagent, false);
 +libxl_defbool_setdefault(&b_info->u.hvm.spice.usbredirection, false);
  }
  libxl_defbool_setdefault(&b_info->u.hvm.nographic, false);
 diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
 index bc605e4..4f625e0 100644
 --- a/tools/libxl/libxl_dm.c
 +++ b/tools/libxl/libxl_dm.c
 @@ -471,6 +471,24 @@ static char ** 
 libxl__build_device_model_args_new(libxl__gc *gc,
  virtserialport,chardev=vdagent,name=com.redhat.spice.0,
  NULL);
  }
 +
 +if (libxl_defbool_val(b_info->u.hvm.spice.usbredirection)) {
 +flexarray_vappend(dm_args, -device,ich9-usb-ehci1,id=usb,
 +bus=pci.0,addr=0x1d.0x7, -device,ich9-usb-uhci1,
 +masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,
 +addr=0x1d.0x0, 
 -device,ich9-usb-uhci2,masterbus=usb.0,
 +firstport=2,bus=pci.0,addr=0x1d.0x1, -device,
 +ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,
 +addr=0x1d.0x2, -chardev,spicevmc,name=usbredir,
 +
 id=usbrc1,-device,usb-redir,chardev=usbrc1,id=usbrc1,
 +bus=usb.0, 
 -chardev,spicevmc,name=usbredir,id=usbrc2,
 +-device,usb-redir,chardev=usbrc2,id=usbrc2,bus=usb.0,
 +-chardev,spicevmc,name=usbredir,id=usbrc3,-device,
 +usb-redir,chardev=usbrc3,id=usbrc3,bus=usb.0, 
 -chardev,
 +spicevmc,name=usbredir,id=usbrc4,-device,usb-redir,
 +chardev=usbrc4,id=usbrc4,bus=usb.0, NULL);
 
 Any reason for so many hardcoded options?
 
 I searched and requested for one year on spice-devel and qemu-devel
 about alternative methods but nothing found for now.
 Already tried usb=1 which creates usb1 controller that is not
 working with usb redirection.
 
 What if QEMU upstream changes and these options don't work any more? In
 that case this functionality is broken and users have no way to
 work around it. IMHO unless they are clearly documented we should not
 consider adding these hardcoded options in libxl.
 
 
 Added spice-devel and qemu-devel to Cc to ask again whether there is a
 better way to do this.
 
 @spice-devel and qemu-devel:
 Can someone help improve the qemu options above for enabling usb
 redirection, please?
 Thanks for any reply.

It's not about improving qemu options for usb redirection (even though
they could use a simplification), it's whether they are guaranteed to be
stable. Are you sure that a future QEMU release is going to work with
these options?
For example, is libvirt using something similar to this?

Usually QEMU cmdline options are considered a stable interface, so I
wouldn't worry too much about it.
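
As an aside on the "so many hardcoded options" point: the four usbredir
channel pairs in the quoted hunk differ only in their index, so they could
be generated rather than spelled out. The helper below is a sketch under
that assumption; its name and signature are invented here and it is not
part of libxl.

```c
#include <stdio.h>

/* Formats the values for "-chardev" and "-device" for usbredir channel n,
 * following the pattern in the quoted patch.
 * Returns 0 on success, -1 if either buffer is too small. */
static int usbredir_channel_args(int n,
                                 char *chardev, size_t chardev_len,
                                 char *device, size_t device_len)
{
    int a = snprintf(chardev, chardev_len,
                     "spicevmc,name=usbredir,id=usbrc%d", n);
    int b = snprintf(device, device_len,
                     "usb-redir,chardev=usbrc%d,id=usbrc%d,bus=usb.0", n, n);

    if (a < 0 || (size_t)a >= chardev_len ||
        b < 0 || (size_t)b >= device_len) {
        return -1;
    }
    return 0;
}
```

A caller would loop n from 1 to 4 and append "-chardev", the first string,
"-device", and the second string to the argument array. This does not
address the stability question raised above; it only shortens the list.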

Re: [Qemu-devel] [PATCH V16 0/7] replace QEMUOptionParameter with QemuOpts parser

2013-07-04 Thread Stefan Hajnoczi

On Tue, Jun 18, 2013 at 05:31:52PM +0800, Dong Xu Wang wrote:
 These patches will replace QEMUOptionParameter with QemuOpts. Change logs
 please go to each patch's commit message.
 
 Dong Xu Wang (7):
   add def_value_str in QemuOptDesc struct and rewrite qemu_opts_print
   avoid duplication of default value in QemuOpts
   Create four QemuOptsList related functions
   Create some QemuOpts functons
   Use QemuOpts support in block layer
   query-command-line-options outputs def_value_str
   remove QEMUOptionParameter related functions and struct
 
  block.c   | 100 -
  block/cow.c   |  52 ++---
  block/gluster.c   |  37 ++-
  block/iscsi.c |  31 ++-
  block/qcow.c  |  67 +++---
  block/qcow2.c | 199 
  block/qed.c   | 108 +
  block/qed.h   |   2 +-
  block/raw-posix.c |  59 +++--
  block/raw-win32.c |  31 +--
  block/raw.c   |  30 +--
  block/rbd.c   |  62 +++--
  block/sheepdog.c  |  81 ---
  block/ssh.c   |  29 ++-
  block/vdi.c   |  70 +++---
  block/vmdk.c  | 129 ++-
  block/vpc.c   |  65 +++---
  block/vvfat.c |  11 +-
  include/block/block.h |   5 +-
  include/block/block_int.h |   6 +-
  include/qemu/option.h |  56 ++---
  qapi-schema.json  |   5 +-
  qemu-img.c|  65 +++---
  qmp-commands.hx   |   2 +
  util/qemu-config.c|   4 +
  util/qemu-option.c| 562 +-
  26 files changed, 906 insertions(+), 962 deletions(-)
 
 -- 
 V15-V16:
 1) discard double-initialization.
 2) use pointer directly, not g_strdup.
 3) modify query-command-line-options related code.
 V14-V15:
 1) Only delete enum QEMUOptionParType.

eblake: You commented on the last revision.  Are you happy with v16?

Stefan

Re: [Qemu-devel] [PATCH v5 10/11] qemu-ga: Install Windows VSS provider on `qemu-ga -s install'

2013-07-04 Thread Paolo Bonzini

On 03/07/2013 18:19, Tomoki Sekiyama wrote:
 On 7/3/13 11:58 , Paolo Bonzini pbonz...@redhat.com wrote:
 
 On 03/07/2013 17:49, Tomoki Sekiyama wrote:
 -return ga_install_service(path, log_filepath,
 fixed_state_dir);
 +if (ga_install_vss_provider()) {
 +return EXIT_FAILURE;
 +}
 +if (ga_install_service(path, log_filepath,
 fixed_state_dir)) {
 +ga_uninstall_vss_provider();
 +return EXIT_FAILURE;
 +}
 +return 0;
  } else if (strcmp(service, "uninstall") == 0) {
 +ga_uninstall_vss_provider();
  return ga_uninstall_service();

 I think this shouldn't be a hard failure.  Only the freeze/thaw commands
 should fail.

 Paolo
 
 Do you mean that qemu-ga should work without qga-provider.dll etc.
 even if it is configured --with-vss-sdk ?

Yes, and I'm even wondering if we should move all VSS code to a DLL
(provider and requestor---they are very tied to each other anyway
because of hEventFrozen/hEventThaw), and have qemu-ga simply look for
qga-provider.dll dropped into the executable directory.

Then qemu-ga can look for it even if it is not configured --with-vss-sdk.

This is because the license of the SDK may be problematic for
distributions that compile qemu-ga from source.  These distributions
cannot distribute the SDK, and thus they will not be able to compile and
distribute the provider DLL.  Still, we should make it as easy as
possible to combine a DLL and executable from separate sources
into---for example---a single MSI.

Paolo

Re: [Qemu-devel] [PATCH 1/2] Refine and export infinite loop checking in collect_image_info_list()

2013-07-04 Thread Stefan Hajnoczi

On Fri, Jun 28, 2013 at 02:37:52PM -0600, Eric Blake wrote:
 On 06/27/2013 01:38 AM, Xu Wang wrote:
  +filenames = g_hash_table_new_full(g_str_hash, str_equal_func, NULL, 
  NULL);
  +
  +/* If backing file exists, filename will insert into hash table and 
  seek
  + * the whole backing file chain from @backing_file.
  + */
  +if (backing_file) {
  +g_hash_table_insert(filenames, (gpointer)filename, NULL);
 
 Does this have any false positives (perhaps mishandling due to relative
 names) or false negatives (perhaps hard links allow different spellings
 of the same file to create a loop, although the difference in names
 won't indicate the problem)?  I'd really like to see you add a testcase
 before this patch gets committed, although I agree that a patch along
 these lines is worthwhile.  For example, make sure the following chain
 is not rejected:
 
 /dir1/base.img <- /dir1/wrap.img (relative backing 'base.img') <-
 /dir2/base.img (absolute backing '/dir1/base.img') <-
 /dir2/wrap.img (relative backing 'base.img')
 
 whether opened in /dir2/ via relative name 'wrap.img' or absolute name
 '/dir2/wrap.img'.  Likewise, make sure you can detect this loop:
 
 create directory 'dir'
 create './dir/b.img'
 create './b.img' with relative backing 'dir/b.img'
 remove ./dir/b.img and dir
 ln -s . dir
 now 'b.img' refers to itself as backing file, even though the names
 ./b.img and ./dir/b.img are not equal by strcmp.

Yes, a test case should be added in tests/qemu-iotests/.  Please see
this wiki page for documentation:

http://qemu-project.org/Documentation/QemuIoTests

Stefan
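
Editorial sketch of the symlink case raised above: a name-keyed hash table cannot see that ./b.img and ./dir/b.img are the same file, but comparing file identity can. The code below is a minimal illustration, not what the patch does. It assumes every name in the chain has already been resolved to a path that stat() can open, and the names file_id_of() and chain_has_loop() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Identity of a file on disk: two names with the same (dev, ino)
 * pair refer to the same file, however they are spelled. */
struct file_id {
    dev_t dev;
    ino_t ino;
};

/* Fill *id for path, following symlinks.  Returns 0 on success,
 * -1 if stat() fails. */
static int file_id_of(const char *path, struct file_id *id)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        return -1;
    }
    id->dev = st.st_dev;
    id->ino = st.st_ino;
    return 0;
}

/* paths[0..n-1] is a backing chain, top image first.
 * Returns 1 if some file appears twice, 0 if not, -1 on error. */
static int chain_has_loop(const char *const *paths, size_t n)
{
    struct file_id *seen;
    size_t i, j;
    int result = 0;

    assert(paths != NULL || n == 0);

    seen = calloc(n ? n : 1, sizeof(*seen));
    if (seen == NULL) {
        return -1;
    }
    for (i = 0; i < n && result == 0; i++) {
        if (file_id_of(paths[i], &seen[i]) != 0) {
            result = -1;
            break;
        }
        /* Compare against every file already visited in the chain. */
        for (j = 0; j < i; j++) {
            if (seen[j].dev == seen[i].dev && seen[j].ino == seen[i].ino) {
                result = 1;
                break;
            }
        }
    }
    free(seen);
    return result;
}
```

In real code the relative backing names would first have to be resolved against the directory of the image that references them; that step is left out here.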

[Qemu-devel] [RFC PATCH] elfload: load PIE executables to right address

2013-07-04 Thread Timo Teräs

PIE images are ET_DYN images. Check for pinterp_name first to make
sure the main executable is always loaded at the correct place.

See below for current behaviour of PIE executables:

Reserved 0x7f00 bytes of guest address space
host mmap_min_addr=0x1000
guest_base  0x7f7cb41d5000
start    end      size     prot
0037f400-003fe400 0007f000 r-x
003fe400-003ff400 1000 ---
003ff400-003fe400 f000 rw-
003fe400-003ff400 1000 ---
003ff400-003ffc00 0800 rw-
003ffc00-003fec00 f000 r-x
003fec00-003ffc00 1000 ---
003ffc00-0007f000 ffc7f400 rw-
start_brk   0x
end_code0x7eff7ac0
start_code  0x7eff7000
start_data  0x7efffac0
end_data0x7efffc18
start_stack 0x7eff6dc8
brk 0x7efffc34
entry   0x7e799b30
-5000 ---p  00:00 0
5000-00015000 rw-p  00:00 0
00015000-7e77d000 ---p  00:00 0
7e77d000-7e7ec000 r-xp  68:03 14326298  /lib/libc.so
7e7ec000-7e7f3000 ---p  00:00 0
7e7f3000-7e7f4000 rw-p 0006e000 68:03 14326298  /lib/libc.so
7e7f4000-7e7f6000 rw-p  00:00 0
7e7f6000-7e7f7000 ---p  00:00 0
7e7f7000-7eff7000 rw-p  00:00 0
7eff7000-7eff8000 r-xp  68:03 9731305  /usr/bin/brk
7eff8000-7efff000 ---p  00:00 0
7e7f7000-7eff7000 rw-p  00:00 0  [stack]

This shows how the main binary got loaded at the wrong place.

Signed-off-by: Timo Teräs timo.te...@iki.fi
---
I assume pinterp_name is only ever set for the main executable.
Quick grep would indicate that this is indeed the case.

 linux-user/elfload.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index ddef23e..d6e00cd 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1660,7 +1660,12 @@ static void load_elf_image(const char *image_name, int image_fd,
     }
 
     load_addr = loaddr;
-    if (ehdr->e_type == ET_DYN) {
+    if (pinterp_name != NULL) {
+        /* This is the main executable.  Make sure that the low
+           address does not conflict with MMAP_MIN_ADDR or the
+           QEMU application itself.  */
+        probe_guest_base(image_name, loaddr, hiaddr);
+    } else if (ehdr->e_type == ET_DYN) {
         /* The image indicates that it can be loaded anywhere.  Find a
            location that can hold the memory space required.  If the
            image is pre-linked, LOADDR will be non-zero.  Since we do
@@ -1672,11 +1677,6 @@ static void load_elf_image(const char *image_name, int image_fd,
         if (load_addr == -1) {
             goto exit_perror;
         }
-    } else if (pinterp_name != NULL) {
-        /* This is the main executable.  Make sure that the low
-           address does not conflict with MMAP_MIN_ADDR or the
-           QEMU application itself.  */
-        probe_guest_base(image_name, loaddr, hiaddr);
     }
     load_bias = load_addr - loaddr;
 
 
-- 
1.8.3.2
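
Editorial sketch of why the branch order matters: a PIE main executable is both ET_DYN and "the main image", so whichever test comes first decides how it is placed. The code below models only that decision and is not QEMU code. The enum placement values, old_order(), new_order() and check_non_pie_unchanged() are hypothetical names, and is_main_image stands in for the patch's `pinterp_name != NULL` test.

```c
#include <assert.h>
#include <stdbool.h>

/* ELF e_type values as defined by the ELF specification. */
enum { SKETCH_ET_EXEC = 2, SKETCH_ET_DYN = 3 };

enum placement {
    PLACE_AS_LINKED,        /* neither branch taken: load_addr stays loaddr */
    PLACE_ANYWHERE,         /* ET_DYN branch: find any free region */
    PLACE_PROBE_GUEST_BASE  /* main-image branch: probe_guest_base() */
};

/* Branch order before the patch: ET_DYN is tested first. */
static enum placement old_order(int e_type, bool is_main_image)
{
    if (e_type == SKETCH_ET_DYN) {
        return PLACE_ANYWHERE;
    } else if (is_main_image) {
        return PLACE_PROBE_GUEST_BASE;
    }
    return PLACE_AS_LINKED;
}

/* Branch order after the patch: the main image is tested first. */
static enum placement new_order(int e_type, bool is_main_image)
{
    if (is_main_image) {
        return PLACE_PROBE_GUEST_BASE;
    } else if (e_type == SKETCH_ET_DYN) {
        return PLACE_ANYWHERE;
    }
    return PLACE_AS_LINKED;
}

/* Every combination other than "ET_DYN main image" is placed the
 * same way under both orders. */
static void check_non_pie_unchanged(void)
{
    assert(old_order(SKETCH_ET_EXEC, true) == new_order(SKETCH_ET_EXEC, true));
    assert(old_order(SKETCH_ET_EXEC, false) == new_order(SKETCH_ET_EXEC, false));
    assert(old_order(SKETCH_ET_DYN, false) == new_order(SKETCH_ET_DYN, false));
}
```

Under this model the only input whose result changes is (ET_DYN, main image), which is the PIE case the patch targets.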
