Re: [PATCH V4 3/7] kvm tools: Add SPAPR PPC64 hcall rtascall structure

2012-01-31 Thread Pekka Enberg
On Tue, Jan 31, 2012 at 8:34 AM, Matt Evans m...@ozlabs.org wrote:
 +#define DEBUG_SPAPR_HCALLS

I suppose this shouldn't be defined by default?

 +#ifdef DEBUG_SPAPR_HCALLS
 +#define hcall_dprintf(fmt, ...) \
 +    do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
 +#else
 +#define hcall_dprintf(fmt, ...) \
 +    do { } while (0)
 +#endif

I don't expect you to fix this but this sort of thing really cries out
for userspace tracepoints.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm tools: Repair running on non ioeventfd-capable platforms

2012-01-31 Thread Pekka Enberg
On Tue, Jan 31, 2012 at 8:30 AM, Matt Evans m...@ozlabs.org wrote:
 Commit d3923126a24212f1e746a84a575dadbd9f259418 added a bunch of
 nice error checking around ioevent__init() but the init may gracefully
 fail if ioevents simply aren't supported (PPC64 KVM).

 This commit adds a new return code for the init -- positive, but
 identifiable as 'not success 0'.

 Signed-off-by: Matt Evans m...@ozlabs.org

Applied, thanks!
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V4 0/7] Add initial SPAPR PPC64 architecture support

2012-01-31 Thread Pekka Enberg
On Tue, Jan 31, 2012 at 8:34 AM, Matt Evans m...@ozlabs.org wrote:
 This series adds support for a PPC64 platform, SPAPR, on top of the previous
 more general PPC64 CPU support.  This platform is paravirtualised, with most
 of the MMU hypercalls being dealt with in the kernel.  Userland needs to
 describe the environment to the machine with a device tree, cope with some
 hypercalls (and RTAS calls) for hvconsole, PCI config and IRQs, and emulate
 the interrupt controller (XICS) and PCI Host Bridge of the SPAPR-esque
 machine.

I think the patches are good to go once we've sorted out the minor
nits I pointed out.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4 v3] KVM: Introduce kvm_memory_slot::arch and move lpage_info into it

2012-01-31 Thread Avi Kivity
On 01/31/2012 03:17 AM, Takuya Yoshikawa wrote:
 Added s390 and ppc developers to Cc,

 (2012/01/30 14:35), Takuya Yoshikawa wrote:
 Some members of kvm_memory_slot are not used by every architecture.

 This patch is the first step to make this difference clear by
 introducing kvm_memory_slot::arch;  lpage_info is moved into it.

 I am planning to move rmap stuff into arch next if this patch is
 accepted.

 Please let me know if you have some opinion about which members should be
 moved into this.


Is there anything else?  Everything else seems to be generic.

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4 v3] KVM: Introduce kvm_memory_slot::arch and move lpage_info into it

2012-01-31 Thread Takuya Yoshikawa

(2012/01/31 18:18), Avi Kivity wrote:

On 01/31/2012 03:17 AM, Takuya Yoshikawa wrote:

Added s390 and ppc developers to Cc,

(2012/01/30 14:35), Takuya Yoshikawa wrote:

Some members of kvm_memory_slot are not used by every architecture.

This patch is the first step to make this difference clear by
introducing kvm_memory_slot::arch;  lpage_info is moved into it.


I am planning to move rmap stuff into arch next if this patch is
accepted.

Please let me know if you have some opinion about which members should be
moved into this.



Is there anything else?  Everything else seems to be generic.



About members, I agree.

But dirty_bitmap allocation/destruction should be implemented by
kvm_arch_create_*_dirty_bitmap().


Of course I may want to add another x86 specific member in the future.

Takuya
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch

2012-01-31 Thread Gleb Natapov
On Mon, Jan 30, 2012 at 04:26:06PM +0100, Kevin Wolf wrote:
  Would that be enough or would we have to avoid clearing it in all other
  places as well? Where would it be initialised if it's not enough?
  
  Maybe vmx_vcpu_reset().
 
 Do all CPL changes go through set_cr0/segment/rflags/cpl? I guess yes,
 so initialising on reset and keeping it valid all the time should be
 possible indeed.
 
CPL can be changed while guest is running. SVM saves it for us in cpl
field. VMX does not, so we either will have to update cpl on each exit
(cpl = cs  3) or somehow mark it not up-to-date and recalculate on
access. Can VMX exit while cpl != cs  3 or can this happen only during
emulation? If it can we cannot know real cpl after exit.
  
--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Bug 42563] rhel5u5 guest panic when booting up

2012-01-31 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=42563





--- Comment #8 from Gleb g...@redhat.com  2012-01-31 09:39:50 ---
(In reply to comment #7)
 (In reply to comment #6)
  (In reply to comment #5)
   (In reply to comment #4)
   
Yes. With 'ignore_msrs=y' added when loading kvm module, rhel5.x guest 
can boot
up now. As it will effect all the RHEL5.x (e.g.5.3, 5.3) guest, this 
issue
still need to be fixed.
   
   Agree.  Gleb, I think we need to restore the perf msrs behaviour to what 
   it was
   before, when the PMU is not enabled.
  
  Yes, they were special cased for a reason.
 
 I found the bug got fixed in kvm upstream. Gleb, you fixed it by the following
 commit? 

Yes.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Bug 42563] rhel5u5 guest panic when booting up

2012-01-31 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=42563


Jay Ren yongjie@intel.com changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||CODE_FIX




-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Bug 42563] rhel5u5 guest panic when booting up

2012-01-31 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=42563


Jay Ren yongjie@intel.com changed:

   What|Removed |Added

 Status|RESOLVED|VERIFIED




-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Bug 42563] rhel5u5 guest panic when booting up

2012-01-31 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=42563


Jay Ren yongjie@intel.com changed:

   What|Removed |Added

 Status|VERIFIED|CLOSED




--- Comment #9 from Jay Ren yongjie@intel.com  2012-01-31 09:44:11 ---
verified and closed it.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V4 2/7] kvm tools: Generate SPAPR PPC64 guest device tree

2012-01-31 Thread Matt Evans

On 31 Jan 2012, at 18:59, Pekka Enberg wrote:

 On Tue, Jan 31, 2012 at 8:34 AM, Matt Evans m...@ozlabs.org wrote:
 +static struct cpu_info cpu_power7_info = {
 +   POWER7,
 +   power7_page_sizes_prop, sizeof(power7_page_sizes_prop),
 +   power7_segment_sizes_prop, sizeof(power7_segment_sizes_prop),
 +   32, /* SLB size */
 +   51200,  /* TB frequency */
 +   128,/* d-cache block size */
 +   128,/* i-cache block size */
 +   CPUINFO_FLAG_DFP | CPUINFO_FLAG_VSX | CPUINFO_FLAG_VMX
 +};
 
 Why don't you use C99 initializers here?

...just out of bad bad habit :)  Initialisers'd be cleaner, I'll convert them.


Thanks!


Matt

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V4 3/7] kvm tools: Add SPAPR PPC64 hcall rtascall structure

2012-01-31 Thread Matt Evans

On 31 Jan 2012, at 19:11, Pekka Enberg wrote:

 On Tue, Jan 31, 2012 at 8:34 AM, Matt Evans m...@ozlabs.org wrote:
 +#define DEBUG_SPAPR_HCALLS
 
 I suppose this shouldn't be defined by default?

Well, I had a bit of a debate about it.  I left it on as it is actually 
interesting whilst experimenting with different SPAPR guests.  For instance 
(and this is really out there, there are other reasons why this won't work) if 
someone tries an AIX guest it'd be good to see lots of Unsupported! spew.  
(The 'unsupported' message could be a warning but that might also get annoying, 
OTOH.)  I would prefer to leave it on for the moment.

 
 +#ifdef DEBUG_SPAPR_HCALLS
 +#define hcall_dprintf(fmt, ...) \
 +do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
 +#else
 +#define hcall_dprintf(fmt, ...) \
 +do { } while (0)
 +#endif
 
 I don't expect you to fix this but this sort of thing really cries out
 for userspace tracepoints.

Ha!  Yes...


Cheers,


Matt--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch

2012-01-31 Thread Avi Kivity
On 01/31/2012 11:37 AM, Gleb Natapov wrote:
 On Mon, Jan 30, 2012 at 04:26:06PM +0100, Kevin Wolf wrote:
   Would that be enough or would we have to avoid clearing it in all other
   places as well? Where would it be initialised if it's not enough?
   
   Maybe vmx_vcpu_reset().
  
  Do all CPL changes go through set_cr0/segment/rflags/cpl? I guess yes,
  so initialising on reset and keeping it valid all the time should be
  possible indeed.
  
 CPL can be changed while guest is running. SVM saves it for us in cpl
 field. VMX does not, so we either will have to update cpl on each exit
 (cpl = cs  3) or somehow mark it not up-to-date and recalculate on
 access. Can VMX exit while cpl != cs  3 or can this happen only during
 emulation? If it can we cannot know real cpl after exit.


Perhaps it can, with unrestricted guests, but I think we don't allow
those conditions (we trap cr0 writes).

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Increase KVMSlots size

2012-01-31 Thread Bhushan Bharat-R65777
Hi All,

In my development I came across the state where I need more than 32 kvmslots

QEMU/kvm-all.c:

cut
struct KVMState
{
KVMSlot slots[32];

cut


Followings are the solution I thought of:
Solution 1: 
Increase the slot to 64 in both KVM and QEMU. I think this will also 
require to add a new ioctl call so that qemu can come to know on which KVM it 
is running (whether it is running on 32 slots KVM or 64 slots KVM).

Other solutions:
a) Decouple the qemu and kvm data structure. QEMU and KVM use some 
variable size data structure or lists etc. QEMU can keep on adding as many as 
it wants without knowing how much slots KVM supports.
b) There are linear search in the KVM slots array in critical path of 
code of KVM. Does this make some sense using RB-tree, heapsort, balanced binary 
tree type of solutions?

Looking forward for suggestions.

Thanks
-Bharat

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[VT-d reboot problems] Re: [PATCH] x86 / reboot: Blacklist Dell OptiPlex 990 known to require PCI reboot

2012-01-31 Thread Ingo Molnar

(added KVM folks to the Cc:)

* Bastien ROUCARIES roucaries.bast...@gmail.com wrote:

 Ping^2
 
 Bastien
 On Mon, Jan 23, 2012 at 11:28 AM, Bastien ROUCARIES
 roucaries.bast...@gmail.com wrote:
  On Mon, Jan 16, 2012 at 8:21 PM, H. Peter Anvin h...@zytor.com wrote:
  On 01/16/2012 03:27 AM, Bastien ROUCARIES wrote:
 
  Does it work if you disable VT-d in the firmware? If so, then adding it
  to the reboot method blacklist is the wrong fix - we need to figure out
  why VT-d interferes with Dell's reboot code.
 
  Yes it work
 
 
  This is particularly so since we are very close to having a full Dell
  model catalogue in the kernel...
 
  Ping ? Do you need some dump ? testing ?

So disabling VT-d in the BIOS fixes the reboot problem and 
Matthew Garrett suggests we should figure out why and how VT-d 
on this Dell box interferes with the reboot method.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: VMXON region vs VMCS region?

2012-01-31 Thread Orit Wasserman
On 01/31/2012 05:35 AM, Zhi Yong Wu wrote:
 HI,
 
 Can anyone let me know know the difference  between VMXON region and
 VMCS region? relationship?
 

There is no relationship between them:
VMXON region is created per logical processor and used by it for VMX ops.
VMCS region is created for each guest vcpu and used both by the hypervisor and 
the processor.

 It will be appreciated if you can make some comments.
 
 

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] pci-assign: Optionally enable 64bit BARs in guest

2012-01-31 Thread Avi Kivity
On 01/28/2012 04:21 PM, Alex Williamson wrote:
 To date we've only exposed BARs as 32bit even if the device
 physically supports 64bit BARs.  Enable 64bit BARs to be
 exposed as such in the guest, which may free up MMIO below
 4G should the guest choose to use it.

 This adds a new mem64= option to pci-assign, with the
 default being off for testing and enablement.  The goal
 is to eventually make this enabled by default.


Seems fine, but do we really need the option?  If it doesn't work we
should treat it as an ordinary but and fix it.

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4 v3] KVM: Introduce kvm_memory_slot::arch and move lpage_info into it

2012-01-31 Thread Takuya Yoshikawa
Christian Borntraeger borntrae...@de.ibm.com wrote:

  Some members of kvm_memory_slot are not used by every architecture.
 
  This patch is the first step to make this difference clear by
  introducing kvm_memory_slot::arch;  lpage_info is moved into it.
 
 Patch series seems to work on s390.
 
 Christian
 

Thanks!

Takuya
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] pci-assign: Optionally enable 64bit BARs in guest

2012-01-31 Thread Jan Kiszka
On 2012-01-31 13:40, Avi Kivity wrote:
 On 01/28/2012 04:21 PM, Alex Williamson wrote:
 To date we've only exposed BARs as 32bit even if the device
 physically supports 64bit BARs.  Enable 64bit BARs to be
 exposed as such in the guest, which may free up MMIO below
 4G should the guest choose to use it.

 This adds a new mem64= option to pci-assign, with the
 default being off for testing and enablement.  The goal
 is to eventually make this enabled by default.

 
 Seems fine, but do we really need the option?  If it doesn't work we
 should treat it as an ordinary but and fix it.

So far it's against the architecture of the emulated system: our current
chipset predates 64 bit PCI.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/9] pci-assign: Update MSI-X MMIO to Memory API

2012-01-31 Thread Avi Kivity
On 01/28/2012 04:21 PM, Alex Williamson wrote:
 Stop using compatibility mode and at the same time fix available
 access sizes.  The PCI spec indicates that the MSI-X table may
 only be accessed as DWORD or QWORD.

  
  static const MemoryRegionOps msix_mmio_ops = {
 -.old_mmio = {
 -.read = { msix_mmio_readb, msix_mmio_readw, msix_mmio_readl, },
 -.write = { msix_mmio_writeb, msix_mmio_writew, msix_mmio_writel, },
 -},
 +.read = msix_mmio_read,
 +.write = msix_mmio_write,
  .endianness = DEVICE_NATIVE_ENDIAN,
 +.impl = {
 +.min_access_size = 4,
 +.max_access_size = 8,
 +},
  };
  

.impl.min_access_size = 4 means the core will convert 1-byte I/O to
4-byte I/O (using rmw if needed).  That's not what we want, I think you
can leave it at 1 and explicitly ignore small accesses in the callbacks.

Have you tested 8-byte I/O?  This is the first user.  Don't you need to
set .valid.max_access_size?


-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 9/9] pci-assign: Update MSI-X config based on table writes

2012-01-31 Thread Avi Kivity
On 01/28/2012 04:22 PM, Alex Williamson wrote:
 We currently only update MSI-X configuration with the enable bit
 in PCI config space is toggled.  This is pretty sketchy and part
 of the reason for the odd checks for vector data is to guess
 whether the guest is going to use the vector so we can pre-enable
 it.

 Two key things missed by doing this is that we don't catch vector
 changes after enabling (ex. setting smp_affinity on an irq) and
 we don't support guests that don't touch the vector table prior
 to enabling the MSI-X capability (ex. freebsd).  This patch
 fixes both of these problems.

 I'm not able to get good behavior attempting to disable masked
 vectors, so we don't actually do anything on mask yet.

  
 +static inline bool msix_masked(MSIXTableEntry *entry)
 +{
 +return !!(entry-ctrl  0x1);
 +}

!! !needed (nor 'inline').

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] pci-assign: Optionally enable 64bit BARs in guest

2012-01-31 Thread Avi Kivity
On 01/31/2012 02:45 PM, Jan Kiszka wrote:
 On 2012-01-31 13:40, Avi Kivity wrote:
  On 01/28/2012 04:21 PM, Alex Williamson wrote:
  To date we've only exposed BARs as 32bit even if the device
  physically supports 64bit BARs.  Enable 64bit BARs to be
  exposed as such in the guest, which may free up MMIO below
  4G should the guest choose to use it.
 
  This adds a new mem64= option to pci-assign, with the
  default being off for testing and enablement.  The goal
  is to eventually make this enabled by default.
 
  
  Seems fine, but do we really need the option?  If it doesn't work we
  should treat it as an ordinary but and fix it.

 So far it's against the architecture of the emulated system: our current
 chipset predates 64 bit PCI.


Then it should be enabled/disabled at the chipset level.

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/9] pci-assign: 64bit MMIO + better MSI-X table support

2012-01-31 Thread Avi Kivity
On 01/30/2012 03:44 PM, Alex Williamson wrote:
 On Mon, 2012-01-30 at 11:11 +0100, Jan Kiszka wrote:
  On 2012-01-28 15:21, Alex Williamson wrote:
   Patch 1  2 here are independent of the rest, but I include them
   here to avoid conflicts.  The first patch enables exposing MMIO
   BARs as their native width to the guest.  I added a config option
   for this with the default to use the existing behavior as I
   suspect we may have some latent issues there.  Patch 2 is just
   some trivial debug build warning fixes.
   
   The rest of the patches work on improving MSI-X table support.
   Particularly, vectors can now be updated by the guest after
   MSI-X is enabled to support things like irqbalance for SMP
   affinity tuning.  We also now update MSI-X configuration as
   new vectors are unmasked, which enables assignment of MSI-X
   devices on FreeBSD.  I was able to assign and use an 82576
   (PF  VF) on a FreeBSD 9.0 guest with this series.  Hopefully
   Shashidhar can report whether this improves the behavior he
   as seeing with an 82599.
   
   I wasn't able to get masking to work reliably, so I left that
   as is for now.  Perhaps someone has suggestions on getting that
   to work.  Thanks,
  
  Unless it's urging, let's focus on getting this implemented via the
  MSI/MSI-X core, not widely duplicated in device-assignment.

 I disagree.  This isn't making the code duplication worse and it solves
 at least two use cases that are currently broken.  This won't make it
 any more difficult to eventually move to msix.c, if it does, the core
 needs more work.  Thanks,


I agree (with Alex), but maybe I missed something?  Patch 9 does call
kvm directly instead of going through msi services, but I don't think
this should hold the patches.

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/9] pci-assign: 64bit MMIO + better MSI-X table support

2012-01-31 Thread Jan Kiszka
On 2012-01-31 13:52, Avi Kivity wrote:
 On 01/30/2012 03:44 PM, Alex Williamson wrote:
 On Mon, 2012-01-30 at 11:11 +0100, Jan Kiszka wrote:
 On 2012-01-28 15:21, Alex Williamson wrote:
 Patch 1  2 here are independent of the rest, but I include them
 here to avoid conflicts.  The first patch enables exposing MMIO
 BARs as their native width to the guest.  I added a config option
 for this with the default to use the existing behavior as I
 suspect we may have some latent issues there.  Patch 2 is just
 some trivial debug build warning fixes.

 The rest of the patches work on improving MSI-X table support.
 Particularly, vectors can now be updated by the guest after
 MSI-X is enabled to support things like irqbalance for SMP
 affinity tuning.  We also now update MSI-X configuration as
 new vectors are unmasked, which enables assignment of MSI-X
 devices on FreeBSD.  I was able to assign and use an 82576
 (PF  VF) on a FreeBSD 9.0 guest with this series.  Hopefully
 Shashidhar can report whether this improves the behavior he
 as seeing with an 82599.

 I wasn't able to get masking to work reliably, so I left that
 as is for now.  Perhaps someone has suggestions on getting that
 to work.  Thanks,

 Unless it's urging, let's focus on getting this implemented via the
 MSI/MSI-X core, not widely duplicated in device-assignment.

 I disagree.  This isn't making the code duplication worse and it solves
 at least two use cases that are currently broken.  This won't make it
 any more difficult to eventually move to msix.c, if it does, the core
 needs more work.  Thanks,

 
 I agree (with Alex), but maybe I missed something?  Patch 9 does call
 kvm directly instead of going through msi services, but I don't think
 this should hold the patches.

If this solves the issues, ok. That all needs quite some refactoring
anyway (including the kvm core services).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] pci-assign: Optionally enable 64bit BARs in guest

2012-01-31 Thread Jan Kiszka
On 2012-01-31 13:51, Avi Kivity wrote:
 On 01/31/2012 02:45 PM, Jan Kiszka wrote:
 On 2012-01-31 13:40, Avi Kivity wrote:
 On 01/28/2012 04:21 PM, Alex Williamson wrote:
 To date we've only exposed BARs as 32bit even if the device
 physically supports 64bit BARs.  Enable 64bit BARs to be
 exposed as such in the guest, which may free up MMIO below
 4G should the guest choose to use it.

 This adds a new mem64= option to pci-assign, with the
 default being off for testing and enablement.  The goal
 is to eventually make this enabled by default.


 Seems fine, but do we really need the option?  If it doesn't work we
 should treat it as an ordinary but and fix it.

 So far it's against the architecture of the emulated system: our current
 chipset predates 64 bit PCI.

 
 Then it should be enabled/disabled at the chipset level.

Makes me wonder if we already do some filtering if the device supports
64 bit but the next bridge does not.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] pci-assign: Optionally enable 64bit BARs in guest

2012-01-31 Thread Avi Kivity
On 01/31/2012 02:57 PM, Jan Kiszka wrote:
  Seems fine, but do we really need the option?  If it doesn't work we
  should treat it as an ordinary but and fix it.
 
  So far it's against the architecture of the emulated system: our current
  chipset predates 64 bit PCI.
 
  
  Then it should be enabled/disabled at the chipset level.

 Makes me wonder if we already do some filtering if the device supports
 64 bit but the next bridge does not.


Our 440fx does support 64-bit bars, so the question doesn't arise for
x86.  Instead we violate the spec.

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] KVM call agenda for tuesday 31

2012-01-31 Thread Andreas Färber
Am 31.01.2012 00:53, schrieb Anthony Liguori:
 On 01/30/2012 05:41 PM, Andreas Färber wrote:
 Am 30.01.2012 19:55, schrieb Juan Quintela:
 Please send in any agenda items you are interested in covering.

 VMState:
 Anthony specifically said that VMState were not affected by QOM and that
 patches should not be deferred until the merge. Yet there's no review
 and/or decision-making for a month now. Ping^2 for AHCI+SDHC.
 
 Do you have pointers (to pending VMState patches)?

http://patchwork.ozlabs.org/patch/137732/ (PATCH v4)

It's basically about how to deal with variable-sized arrays. (Alex
mentioned it on one call around November.) I found ways to deal with
subsets of arrays embedded within the struct and variable-sized list of
pointers to structs but no solution for a malloc()'ed array of structs.
Maybe I'm just too stupid to see. Anyway, no one commented since Xmas.

Igor posted (and refined for v2) a patch with a callback-based approach
that I find promising. From my view, unofficially Juan is the VMState
guy, he's been cc'ed. Are we lacking an official maintainer that cares?
Or is Juan the official, undocumented maintainer but simply busy?

SUSE's interest is making AHCI migratable, and my VMState workaround for
that is simply ugly:

http://patchwork.ozlabs.org/patch/133066/ (RFC)

Therefore I'm waiting for some resolution.

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] pci-assign: Optionally enable 64bit BARs in guest

2012-01-31 Thread Jan Kiszka
On 2012-01-31 14:10, Avi Kivity wrote:
 On 01/31/2012 02:57 PM, Jan Kiszka wrote:
 Seems fine, but do we really need the option?  If it doesn't work we
 should treat it as an ordinary but and fix it.

 So far it's against the architecture of the emulated system: our current
 chipset predates 64 bit PCI.


 Then it should be enabled/disabled at the chipset level.

 Makes me wonder if we already do some filtering if the device supports
 64 bit but the next bridge does not.

 
 Our 440fx does support 64-bit bars, so the question doesn't arise for
 x86.  Instead we violate the spec.

If you mean by our the 440fx-qemu, not the real 440fx. That one does
not even support 1GB RAM.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] pci-assign: Optionally enable 64bit BARs in guest

2012-01-31 Thread Avi Kivity
On 01/31/2012 03:21 PM, Jan Kiszka wrote:
 On 2012-01-31 14:10, Avi Kivity wrote:
  On 01/31/2012 02:57 PM, Jan Kiszka wrote:
  Seems fine, but do we really need the option?  If it doesn't work we
  should treat it as an ordinary but and fix it.
 
  So far it's against the architecture of the emulated system: our current
  chipset predates 64 bit PCI.
 
 
  Then it should be enabled/disabled at the chipset level.
 
  Makes me wonder if we already do some filtering if the device supports
  64 bit but the next bridge does not.
 
  
  Our 440fx does support 64-bit bars, so the question doesn't arise for
  x86.  Instead we violate the spec.

 If you mean by our the 440fx-qemu, not the real 440fx. That one does
 not even support 1GB RAM.

Yes, that's what I meant.  It also supports pci hotplug, more slots, cpu
hotplug, etc.

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm tools: Repair running on non ioeventfd-capable platforms

2012-01-31 Thread Sasha Levin
On Tue, 2012-01-31 at 17:30 +1100, Matt Evans wrote:
 
 Commit d3923126a24212f1e746a84a575dadbd9f259418 added a bunch of
 nice error checking around ioevent__init() but the init may gracefully
 fail if ioevents simply aren't supported (PPC64 KVM).
 
 This commit adds a new return code for the init -- positive, but
 identifiable as 'not success 0'.
 
 Signed-off-by: Matt Evans m...@ozlabs.org
 ---
 
 Sasha, you screwed PPC64!  And it took me a month to notice hahaha! 

I only broke it two days ago :)

-- 

Sasha.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-01-31 Thread Avi Kivity
On 01/31/2012 03:32 PM, Peter Zijlstra wrote:
 Subject: srcu: Implement call_srcu()
 From: Peter Zijlstra a.p.zijls...@chello.nl
 Date: Mon Jan 30 23:20:49 CET 2012

 Implement call_srcu() by using a state machine driven by
 call_rcu_sched() and timer callbacks.

 The state machine is a direct derivation of the existing
 synchronize_srcu() code and replaces synchronize_sched() calls with a
 call_rcu_sched() callback and the schedule_timeout() calls with simple
 timer callbacks.

 It then re-implements synchronize_srcu() using a completion where we
 send the complete through call_srcu().

 It completely wrecks synchronize_srcu_expedited() which is only used
 by KVM. 3 of the 5 

2 of the 5

 use cases look like they really want to use
 call_srcu() instead, the remaining 2 I don't know but hope they can,
 which would let us remove it.

They really need to return quickly to userspace, and they really need to
perform some operation between rcu_assign_pointer() and returning, so no.


 Compile tested only!! :-)


How much did synchronize_srcu_expedited() regress?  Presumably your
compiler didn't tell you that.

Can we get it back to speed by scheduling a work function on all cpus? 
wouldn't that force a quiescent state and allow call_srcu() to fire?

In kvm's use case synchronize_srcu_expedited() is usually called when no
thread is in a critical section, so we don't have to wait for anything
except the srcu machinery.

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-01-31 Thread Peter Zijlstra
On Tue, 2012-01-31 at 15:47 +0200, Avi Kivity wrote:

 They really need to return quickly to userspace, and they really need to
 perform some operation between rcu_assign_pointer() and returning, so no.

Bugger :/

 
  Compile tested only!! :-)
 
 
 How much did synchronize_srcu_expedited() regress?  Presumably your
 compiler didn't tell you that.

Nope, quite a lot I figure.

 Can we get it back to speed by scheduling a work function on all cpus? 
 wouldn't that force a quiescent state and allow call_srcu() to fire?
 
 In kvm's use case synchronize_srcu_expedited() is usually called when no
 thread is in a critical section, so we don't have to wait for anything
 except the srcu machinery.

OK, I'll try and come up with means of making it go fast again ;-)
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] KVM call agenda for tuesday 31

2012-01-31 Thread Anthony Liguori

On 01/30/2012 05:41 PM, Andreas Färber wrote:

Am 30.01.2012 19:55, schrieb Juan Quintela:

Please send in any agenda items you are interested in covering.


QOM roadmap update:
* Series 3/4 is on the list.
-  Please officially designate a merge date (Friday?).
-  To make review sensible, I ask for a hard device freeze until merged.
I.e., no new devices and no conflicting changes to DeviceInfo.


I put together a few slides to help this discussion:

http://www.codemonkey.ws/files/qom-overview.pdf

Regards,

Anthony Liguori



* Further roadmap: i440FX, CPU, MemoryRegion (when, who, where, how)

VMState:
Anthony specifically said that VMState were not affected by QOM and that
patches should not be deferred until the merge. Yet there's no review
and/or decision-making for a month now. Ping^2 for AHCI+SDHC.

Mascot contest:
Any update? Were there too few entries? Has it been forgotten?
If the contest is still open then that should be stated on the list.

Andreas



--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] KVM call agenda for tuesday 31

2012-01-31 Thread Mitsyanko Igor

On 01/31/2012 05:15 PM, Andreas Färber wrote:

Am 31.01.2012 00:53, schrieb Anthony Liguori:

On 01/30/2012 05:41 PM, Andreas Färber wrote:

Am 30.01.2012 19:55, schrieb Juan Quintela:

Please send in any agenda items you are interested in covering.



VMState:
Anthony specifically said that VMState were not affected by QOM and that
patches should not be deferred until the merge. Yet there's no review
and/or decision-making for a month now. Ping^2 for AHCI+SDHC.


Do you have pointers (to pending VMState patches)?


http://patchwork.ozlabs.org/patch/137732/ (PATCH v4)

It's basically about how to deal with variable-sized arrays. (Alex
mentioned it on one call around November.) I found ways to deal with
subsets of arrays embedded within the struct and variable-sized list of
pointers to structs but no solution for a malloc()'ed array of structs.
Maybe I'm just too stupid to see. Anyway, no one commented since Xmas.

Igor posted (and refined for v2) a patch with a callback-based approach
that I find promising. From my view, unofficially Juan is the VMState
guy, he's been cc'ed. Are we lacking an official maintainer that cares?
Or is Juan the official, undocumented maintainer but simply busy?

SUSE's interest is making AHCI migratable, and my VMState workaround for
that is simply ugly:

http://patchwork.ozlabs.org/patch/133066/ (RFC)



If I'm not mistaken, if you change AHCIState's .ports type to uint32_t 
you can use existing VMSTATE_BUFFER_MULTIPLY macro like this:


VMSTATE_BUFFER_MULTIPLY(dev, AHCIState, 0, NULL, 0, ports, 
sizeof(AHCIDevice))


VMSTATE_BUFFER_MULTIPLY currently lacks VMS_POINTER flag and therefore 
doesn't make any use of _start field (you don't need it anyway)


Nevertheless, VMSTATE_BUFFER_MULTIPLY is just a partial solution to a 
bigger set of possible problems. SD card's vmstate implementation 
requires shift operation, SDHC gets size from switch {} statement, 
something else later may require division or addition e.t.c., 
get_bufsize callback will cover all possible cases.




--
Mitsyanko Igor
ASWG, Moscow RD center, Samsung Electronics
email: i.mitsya...@samsung.com
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Kernel bug?

2012-01-31 Thread Bram De Wilde
Hi,

We are setting up a ubuntu / openstack / KVM based private cloud. Of the 3 
servers involved in this configuration 1 has been crashing frequently over the 
last month.
Since it is the only server responsible for the virtualization (so the only one 
running kvm virtual machines) my guess was to find some help with you guys. We 
are running the same install on 2 other machines (not running virtual 
machines!) who have been completely stable over the last three months, so a 
hardware issue might also be at play here?

I have collected some stack traces over the last month and was hoping one of 
you might point us in the right direction as to what is going wrong on our 
system.
We are running a default ubuntu 11.10 server install. The kernel has been 
upgraded from the 3.0.0-12 release to the 3.0.0-14 release over the last month, 
but we keep getting these crashes. We are now running the 3.2.2-030202-generic 
hoping that this is a kernel problem, and it is fixed in this latest version.

If any additional information is required or I should file an official bug with 
this information, please let me know...

kind regards
Bram

Jan 27 13:41:28 cmggcn01 kernel: [871350.761867] general protection fault:  
[#2] SMP 
Jan 27 13:41:28 cmggcn01 kernel: [871350.790117] CPU 14 
Jan 27 13:41:28 cmggcn01 kernel: [871350.790387] Modules linked in: btrfs 
zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs 
reiserfs ebt_arp ebt_ip 8021q garp ip6table_filter ip6_tables ebtable_nat 
ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 
xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp 
iptable_filter ip_tables x_tables bridge stp kvm_intel kvm nbd vesafb ib_iser 
rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp 
libiscsi scsi_transport_iscsi psmouse dcdbas dm_multipath serio_raw joydev ghes 
hed acpi_power_meter bonding lp parport i7core_edac edac_core ses enclosure 
usbhid hid megaraid_sas bnx2
Jan 27 13:41:28 cmggcn01 kernel: [871351.005151] 
Jan 27 13:41:28 cmggcn01 kernel: [871351.036187] Pid: 90, comm: kswapd0 
Tainted: G  D 3.0.0-14-server #23-Ubuntu Dell Inc. PowerEdge R710/0MD99X
Jan 27 13:41:28 cmggcn01 kernel: [871351.072809] RIP: 0010:[a01a5890] 
 [a01a5890] kvm_unmap_rmapp+0x20/0x60 [kvm]
Jan 27 13:41:28 cmggcn01 kernel: [871351.105190] RSP: 0018:8817f3e27a60  
EFLAGS: 00010202
Jan 27 13:41:28 cmggcn01 kernel: [871351.141329] RAX: 8817f5d067f8 RBX: 
c9001fd41ff8 RCX: a01a58d0
Jan 27 13:41:28 cmggcn01 kernel: [871351.179076] RDX:  RSI: 
 RDI: 8817f5d067f8
Jan 27 13:41:28 cmggcn01 kernel: [871351.212086] RBP: 8817f3e27a80 R08: 
8817f315b3e0 R09: 0100
Jan 27 13:41:28 cmggcn01 kernel: [871351.245788] R10: 000e R11: 
0002 R12: 8817f2f0c000
Jan 27 13:41:28 cmggcn01 kernel: [871351.277514] R13:  R14: 
880be235e000 R15: 000d3cff
Jan 27 13:41:28 cmggcn01 kernel: [871351.308421] FS:  () 
GS:88183fce() knlGS:
Jan 27 13:41:28 cmggcn01 kernel: [871351.339685] CS:  0010 DS:  ES:  
CR0: 8005003b
Jan 27 13:41:28 cmggcn01 kernel: [871351.370089] CR2: 7f8836442000 CR3: 
01c03000 CR4: 26e0
Jan 27 13:41:28 cmggcn01 kernel: [871351.399771] DR0:  DR1: 
 DR2: 
Jan 27 13:41:28 cmggcn01 kernel: [871351.428208] DR3:  DR6: 
0ff0 DR7: 0400
Jan 27 13:41:28 cmggcn01 kernel: [871351.456153] Process kswapd0 (pid: 90, 
threadinfo 8817f3e26000, task 8817f63ac560)
Jan 27 13:41:28 cmggcn01 kernel: [871351.484943] Stack:
Jan 27 13:41:28 cmggcn01 kernel: [871351.512984]   
c9001fd41ff8 0001 7f834a87e000
Jan 27 13:41:28 cmggcn01 kernel: [871351.542025]  8817f3e27aa0 
a01a5945 880be235e060 0001
Jan 27 13:41:28 cmggcn01 kernel: [871351.571050]  8817f3e27b10 
a01a1dd9 8817f3e27ae0 a01a58d0
Jan 27 13:41:28 cmggcn01 kernel: [871351.600455] Call Trace:
Jan 27 13:41:28 cmggcn01 kernel: [871351.628903]  [a01a5945] 
kvm_age_rmapp+0x75/0x90 [kvm]
Jan 27 13:41:28 cmggcn01 kernel: [871351.659242]  [a01a1dd9] 
kvm_handle_hva+0x99/0x180 [kvm]
Jan 27 13:41:28 cmggcn01 kernel: [871351.687386]  [a01a58d0] ? 
kvm_unmap_rmapp+0x60/0x60 [kvm]
Jan 27 13:41:28 cmggcn01 kernel: [871351.716053]  [a01a8af7] 
kvm_age_hva+0x17/0x20 [kvm]
Jan 27 13:41:28 cmggcn01 kernel: [871351.746652]  [a018a4dd] 
kvm_mmu_notifier_clear_flush_young+0x4d/0x90 [kvm]
Jan 27 13:41:28 cmggcn01 kernel: [871351.774490]  [8114d7b8] 
__mmu_notifier_clear_flush_young+0x48/0x60
Jan 27 13:41:28 cmggcn01 kernel: [871351.801948]  [81138f1b] 
page_referenced_one+0x18b/0x1f0
Jan 27 13:41:28 cmggcn01 kernel: [871351.827654]  

Re: [Qemu-devel] KVM call agenda for tuesday 31

2012-01-31 Thread Avi Kivity
On 01/31/2012 03:59 PM, Anthony Liguori wrote:
 On 01/30/2012 05:41 PM, Andreas Färber wrote:
 Am 30.01.2012 19:55, schrieb Juan Quintela:
 Please send in any agenda items you are interested in covering.

 QOM roadmap update:
 * Series 3/4 is on the list.
 -  Please officially designate a merge date (Friday?).
 -  To make review sensible, I ask for a hard device freeze until
 merged.
 I.e., no new devices and no conflicting changes to DeviceInfo.

 I put together a few slides to help this discussion:

 http://www.codemonkey.ws/files/qom-overview.pdf


That was helpful, thanks.

Can you clarify

- Types and their properties will be ABI  compatible
- Types and properties will not be backwards  compatible
– We can re-examine this as the device model  matures and stabilizes

the first two seem very similar, except for the not.


-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] KVM call agenda for tuesday 31

2012-01-31 Thread Anthony Liguori

On 01/31/2012 07:15 AM, Andreas Färber wrote:

Am 31.01.2012 00:53, schrieb Anthony Liguori:

On 01/30/2012 05:41 PM, Andreas Färber wrote:

Am 30.01.2012 19:55, schrieb Juan Quintela:

Please send in any agenda items you are interested in covering.



VMState:
Anthony specifically said that VMState were not affected by QOM and that
patches should not be deferred until the merge. Yet there's no review
and/or decision-making for a month now. Ping^2 for AHCI+SDHC.


Do you have pointers (to pending VMState patches)?


http://patchwork.ozlabs.org/patch/137732/ (PATCH v4)

It's basically about how to deal with variable-sized arrays. (Alex
mentioned it on one call around November.) I found ways to deal with
subsets of arrays embedded within the struct and variable-sized list of
pointers to structs but no solution for a malloc()'ed array of structs.
Maybe I'm just too stupid to see. Anyway, no one commented since Xmas.


/me puts on his flame proof suit

Don't use VMState.  Just open code a save/restore function.  VMState is too 
limited in how it handles complex data structures.


I really believe the only long term solution we're going to get to here is 
something that uses a builder interface (like Visitors).


Regards,

Anthony Liguori



Igor posted (and refined for v2) a patch with a callback-based approach
that I find promising. From my view, unofficially Juan is the VMState
guy, he's been cc'ed. Are we lacking an official maintainer that cares?
Or is Juan the official, undocumented maintainer but simply busy?

SUSE's interest is making AHCI migratable, and my VMState workaround for
that is simply ugly:

http://patchwork.ozlabs.org/patch/133066/ (RFC)

Therefore I'm waiting for some resolution.

Regards,
Andreas



--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] KVM call agenda for tuesday 31

2012-01-31 Thread Anthony Liguori

On 01/31/2012 08:09 AM, Avi Kivity wrote:

On 01/31/2012 03:59 PM, Anthony Liguori wrote:

On 01/30/2012 05:41 PM, Andreas Färber wrote:

Am 30.01.2012 19:55, schrieb Juan Quintela:

Please send in any agenda items you are interested in covering.


QOM roadmap update:
* Series 3/4 is on the list.
-   Please officially designate a merge date (Friday?).
-   To make review sensible, I ask for a hard device freeze until
merged.
 I.e., no new devices and no conflicting changes to DeviceInfo.


I put together a few slides to help this discussion:

http://www.codemonkey.ws/files/qom-overview.pdf



That was helpful, thanks.

Can you clarify

- Types and their properties will be ABI  compatible
- Types and properties will not be backwards  compatible
– We can re-examine this as the device model  matures and stabilizes

the first two seem very similar, except for the not.


I guess I really mean QMP compatible.  The expectation is that well written 
management code does:


if (check_for_command(qom-list)) {
   run_command(qom-list, /);
}

ABI compatible means that if qom-list is there, it's semantics never change. 
Backwards compatibility would mean that once qom-list is introduced, it never 
goes away.


In terms of QOM types, we won't guarantee that a Type will stick around forever, 
but we will guarantee if it's there, it'll behave in a certain way.


When the device model stabilizes over time, we can revisit this, but in the 1.x 
series, I'm sure we're going to go through a few significant refactorings that 
change types significantly.


Regards,

Anthony Liguori






--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Kernel bug?

2012-01-31 Thread Avi Kivity
On 01/31/2012 04:06 PM, Bram De Wilde wrote:
 Hi,

 We are setting up a ubuntu / openstack / KVM based private cloud. Of the 3 
 servers involved in this configuration 1 has been crashing frequently over 
 the last month.
 Since it is the only server responsible for the virtualization (so the only 
 one running kvm virtual machines) my guess was to find some help with you 
 guys. We are running the same install on 2 other machines (not running 
 virtual machines!) who have been completely stable over the last three 
 months, so a hardware issue might also be at play here?

 I have collected some stack traces over the last month and was hoping one of 
 you might point us in the right direction as to what is going wrong on our 
 system.
 We are running a default ubuntu 11.10 server install. The kernel has been 
 upgraded from the 3.0.0-12 release to the 3.0.0-14 release over the last 
 month, but we keep getting these crashes. We are now running the 
 3.2.2-030202-generic hoping that this is a kernel problem, and it is fixed in 
 this latest version.

 If any additional information is required or I should file an official bug 
 with this information, please let me know...

File a bug please.

Meanwhile:

 Jan 27 13:41:28 cmggcn01 kernel: [871350.761867] general protection fault: 
  [#2] SMP 
 Jan 27 13:41:28 cmggcn01 kernel: [871350.790117] CPU 14 
 Jan 27 13:41:28 cmggcn01 kernel: [871350.790387] Modules linked in: btrfs 
 zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs 
 reiserfs ebt_arp ebt_ip 8021q garp ip6table_filter ip6_tables ebtable_nat 
 ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 
 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp 
 iptable_filter ip_tables x_tables bridge stp kvm_intel kvm nbd vesafb ib_iser 
 rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp 
 libiscsi scsi_transport_iscsi psmouse dcdbas dm_multipath serio_raw joydev 
 ghes hed acpi_power_meter bonding lp parport i7core_edac edac_core ses 
 enclosure usbhid hid megaraid_sas bnx2
 Jan 27 13:41:28 cmggcn01 kernel: [871351.005151] 
 Jan 27 13:41:28 cmggcn01 kernel: [871351.036187] Pid: 90, comm: kswapd0 
 Tainted: G  D 3.0.0-14-server #23-Ubuntu Dell Inc. PowerEdge 
 R710/0MD99X
 Jan 27 13:41:28 cmggcn01 kernel: [871351.072809] RIP: 
 0010:[a01a5890]  [a01a5890] kvm_unmap_rmapp+0x20/0x60 
 [kvm]
 Jan 27 13:41:28 cmggcn01 kernel: [871351.105190] RSP: 0018:8817f3e27a60  
 EFLAGS: 00010202
 Jan 27 13:41:28 cmggcn01 kernel: [871351.141329] RAX: 8817f5d067f8 RBX: 
 c9001fd41ff8 RCX: a01a58d0
 Jan 27 13:41:28 cmggcn01 kernel: [871351.179076] RDX:  RSI: 
  RDI: 8817f5d067f8
 Jan 27 13:41:28 cmggcn01 kernel: [871351.212086] RBP: 8817f3e27a80 R08: 
 8817f315b3e0 R09: 0100
 Jan 27 13:41:28 cmggcn01 kernel: [871351.245788] R10: 000e R11: 
 0002 R12: 8817f2f0c000
 Jan 27 13:41:28 cmggcn01 kernel: [871351.277514] R13:  R14: 
 880be235e000 R15: 000d3cff
 Jan 27 13:41:28 cmggcn01 kernel: [871351.308421] FS:  () 
 GS:88183fce() knlGS:
 Jan 27 13:41:28 cmggcn01 kernel: [871351.339685] CS:  0010 DS:  ES:  
 CR0: 8005003b
 Jan 27 13:41:28 cmggcn01 kernel: [871351.370089] CR2: 7f8836442000 CR3: 
 01c03000 CR4: 26e0
 Jan 27 13:41:28 cmggcn01 kernel: [871351.399771] DR0:  DR1: 
  DR2: 
 Jan 27 13:41:28 cmggcn01 kernel: [871351.428208] DR3:  DR6: 
 0ff0 DR7: 0400
 Jan 27 13:41:28 cmggcn01 kernel: [871351.456153] Process kswapd0 (pid: 90, 
 threadinfo 8817f3e26000, task 8817f63ac560)
 Jan 27 13:41:28 cmggcn01 kernel: [871351.484943] Stack:
 Jan 27 13:41:28 cmggcn01 kernel: [871351.512984]   
 c9001fd41ff8 0001 7f834a87e000
 Jan 27 13:41:28 cmggcn01 kernel: [871351.542025]  8817f3e27aa0 
 a01a5945 880be235e060 0001
 Jan 27 13:41:28 cmggcn01 kernel: [871351.571050]  8817f3e27b10 
 a01a1dd9 8817f3e27ae0 a01a58d0

snip

 Jan 27 13:41:28 cmggcn01 kernel: [871352.080582] Code: e7 d0 e8 e0 66 90 e9 
 a2 fe ff ff 55 48 89 e5 41 55 41 54 53 48 83 ec 08 66 66 66 66 90 45 31 ed 49 
 89 fc 48 89 f3 eb 20 0f 1f 00 f6 00 01 74 35 48 8b 15 74 7a 02 00 48 89 c6 
 4c 89 e7 41 bd 01 


   0:e8 e0 66 90 e9   callq  0xe99066e5
   5:a2 fe ff ff 55 48 89 mov%al,0x41e5894855fe
   c:e5 41
   e:55   push   %rbp
   f:41 54push   %r12
  11:53   push   %rbx
  12:48 83 ec 08  sub$0x8,%rsp
  16:66 66 66 66 90   data32 data32 data32 xchg %ax,%ax
  1b:45 31 ed xor%r13d,%r13d
  1e:49 89 fc

Re: [Android-virt] [PATCH v5 00/13] KVM/ARM Implementation

2012-01-31 Thread Antonios Motakis

On 01/30/2012 11:46 PM, Peter Maydell wrote:

On 20 January 2012 02:59, Christoffer Dall
c.d...@virtualopensystems.com  wrote:

There's a new list of issues available at:

https://github.com/virtualopensystems/linux-kvm-arm/issues

Thanks for putting this up. Here's a couple more for you :-)

  * Support guest kernels configured for LPAE

At the moment (well, if you have Marc's 3.3rc1 tree with the A15 L2
cache control register bodge in it) you can boot an A15 kernel
configured without LPAE as a KVM guest, but an LPAE kernel with LPAE
enabled will not boot. This probably mostly requires supporting the
64 bit wide cp15 registers that LPAE implies.

  * handle QEMU being ^C'd

Currently if you ^C the qemu process then instead of a nice
clean exit things go a bit pear shaped, with apparently part
of the qemu/kvm combo having quit and the rest not and lots
of error messages being emitted. (I admit to not having looked
at this one enough to be certain tht it's a kernel side thing
rather than a QEMU one; I'm just guessing.)

-- PMM


I took the initiative to add them to that list, thanks.

-Antonios
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[KVM-Autotest][PATCH 1/5] virt: Check illegal instruction code

2012-01-31 Thread Jiří Župka
Add a check on the base VM class for illegal
instruction code executed by the VM. That
check is performed on the serial console
output.

Signed-off-by: Jiří Župka jzu...@redhat.com
---
 client/virt/virt_vm.py |   27 +++
 1 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/client/virt/virt_vm.py b/client/virt/virt_vm.py
index 32593c1..d4fb47d 100644
--- a/client/virt/virt_vm.py
+++ b/client/virt/virt_vm.py
@@ -133,6 +133,18 @@ class VMDeadKernelCrashError(VMError):
 return (VM is dead due to a kernel crash:\n%s % self.kernel_crash)
 
 
+class VMInvalidInstructionCode(VMError):
+def __init__(self, invalid_code):
+VMError.__init__(self, invalid_code)
+self.invalid_code = invalid_code
+
+def __str__(self):
+error = 
+for invalid_code in self.invalid_code:
+error += %s % (invalid_code)
+return (Invalid instruction was executed on VM:\n%s % error)
+
+
 class VMAddressError(VMError):
 pass
 
@@ -656,6 +668,21 @@ class BaseVM(object):
 raise VMDeadKernelCrashError(match.group(0))
 
 
+def verify_illegal_instructonn(self):
+
+Find illegal instruction code on VM serial console output.
+
+@raise: VMInvalidInstructionCode, in case a wrong instruction code.
+
+if self.serial_console is not None:
+data = self.serial_console.get_output()
+match = re.findall(r.*trap invalid opcode.*\n, data,
+   re.MULTILINE)
+
+if match:
+raise VMInvalidInstructionCode(match)
+
+
 def get_params(self):
 
 Return the VM's params dict. Most modified params take effect only
-- 
1.7.7.6

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[KVM-Autotest][PATCH 2/5] virt: Add aliases to class Flag.

2012-01-31 Thread Jiří Župka
The patch solve problem with doubled  name of cpuflags
sse4_1, sse4.1 etc. in cpuflag test.

Signed-off-by: Jiří Župka jzu...@redhat.com
---
 client/virt/virt_utils.py |   14 --
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/client/virt/virt_utils.py b/client/virt/virt_utils.py
index a367ffe..20ed4ba 100644
--- a/client/virt/virt_utils.py
+++ b/client/virt/virt_utils.py
@@ -1294,8 +1294,12 @@ class Flag(str):
 
 Class for easy merge cpuflags.
 
-def __init__(self,  *args, **kwargs):
-super(Flag, self).__init__( *args, **kwargs)
+aliases = {}
+
+def __new__(cls, flag):
+if flag in Flag.aliases:
+flag = Flag.aliases[flag]
+return str.__new__(cls, flag)
 
 def __eq__(self, other):
 s = set(self.split(|))
@@ -1324,6 +1328,12 @@ kvm_map_flags_to_test = {
 }
 
 
+kvm_map_flags_aliases = {
+'sse4.1'  :'sse4_1',
+'sse4.2'  :'sse4_2',
+}
+
+
 def kvm_flags_to_stresstests(flags):
 
 Covert [cpu flags] to [tests]
-- 
1.7.7.6

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[KVM-Autotest][PATCH 3/5] kvm test: Introduce multi_host.srv

2012-01-31 Thread Jiří Župka
We had migration_control.srv which worked with old
version of kvm config files (tests.cfg, subtest.cfg,
etc..). Because config files were changed this control
code stopped working.

This patch repairs this problem and renames the file
to the more correct and generic name multi-host.srv.
Also, added new configuration file tests-shared.cfg
which contains shared data among single host and multi
host tests.

The changes are useful for multihost tests, since
we can use a lot of configuration from regular single
host tests.

Signed-off-by: Jiří Župka jzu...@redhat.com
---
 client/tests/kvm/migration_control.srv   |  123 --
 client/tests/kvm/multi-host-tests.cfg.sample |   43 +
 client/tests/kvm/multi_host.srv  |  106 ++
 client/tests/kvm/tests-shared.cfg.sample |   48 ++
 client/tests/kvm/tests.cfg.sample|   43 +-
 5 files changed, 198 insertions(+), 165 deletions(-)
 delete mode 100644 client/tests/kvm/migration_control.srv
 create mode 100644 client/tests/kvm/multi-host-tests.cfg.sample
 create mode 100644 client/tests/kvm/multi_host.srv
 create mode 100644 client/tests/kvm/tests-shared.cfg.sample

diff --git a/client/tests/kvm/migration_control.srv 
b/client/tests/kvm/migration_control.srv
deleted file mode 100644
index c669ccd..000
--- a/client/tests/kvm/migration_control.srv
+++ /dev/null
@@ -1,123 +0,0 @@
-AUTHOR = Yolkfull Chow yz...@redhat.com
-TIME = SHORT
-NAME = KVM Test (migration across multiple hosts)
-TEST_CATEGORY = Functional
-TEST_CLASS = Virtualization
-TEST_TYPE = Server
-DOC = 
-Migrate KVM guest between two hosts. It parses the base config file, restricts
-it with appropriate parameters, generates the test dicts, modify the test_dicts
-so there's a distinction between the migration roles ('dest' or 'source').
-
-
-import sys, os, commands, glob, shutil, logging, random
-from autotest_lib.server import utils
-from autotest_lib.client.common_lib import cartesian_config
-
-# Specify the directory of autotest before you start this test
-AUTOTEST_DIR = '/usr/local/autotest'
-
-# Specify the root directory that on client machines
-rootdir = '/tmp/kvm_autotest_root'
-
-KVM_DIR = os.path.join(AUTOTEST_DIR, 'client/tests/kvm')
-
-
-def generate_mac_address():
-r = random.SystemRandom()
-mac = 9a:%02x:%02x:%02x:%02x:%02x % (r.randint(0x00, 0xff),
-   r.randint(0x00, 0xff),
-   r.randint(0x00, 0xff),
-   r.randint(0x00, 0xff),
-   r.randint(0x00, 0xff))
-return mac
-
-
-def run(pair):
-logging.info(KVM migration running on source host [%s] and destination 
- host [%s]\n, pair[0], pair[1])
-
-source = hosts.create_host(pair[0])
-dest = hosts.create_host(pair[1])
-source_at = autotest.Autotest(source)
-dest_at = autotest.Autotest(dest)
-
-cfg_file = os.path.join(KVM_DIR, tests_base.cfg)
-
-if not os.path.exists(cfg_file):
-raise error.JobError(Config file %s was not found, cfg_file)
-
-# Get test set (dictionary list) from the configuration file
-parser = cartesian_config.Parser()
-test_variants = 
-image_name(_.*)? ?= /tmp/kvm_autotest_root/images/
-cdrom(_.*)? ?= /tmp/kvm_autotest_root/
-floppy ?= /tmp/kvm_autotest_root/
-Linux:
-unattended_install:
-kernel ?= /tmp/kvm_autotest_root/
-initrd ?= /tmp/kvm_autotest_root/
-qemu_binary = /usr/libexec/qemu-kvm
-qemu_img_binary = /usr/bin/qemu-img
-only qcow2
-only virtio_net
-only virtio_blk
-only smp2
-only no_pci_assignable
-only smallpages
-only Fedora.14.64
-only migrate_multi_host
-nic_mode = tap
-nic_mac_nic1 = %s
- % (generate_mac_address())
-parser.parse_file(cfg_file)
-parser.parse_string(test_variants)
-test_dicts = parser.get_dicts()
-
-source_control_file = dest_control_file = 
-kvm_test_dir = os.path.join(os.environ['AUTODIR'],'tests/kvm')
-sys.path.append(kvm_test_dir)\n
-
-for params in test_dicts:
-params['srchost'] = source.ip
-params['dsthost'] = dest.ip
-params['rootdir'] = rootdir
-
-source_params = params.copy()
-source_params['role'] = source
-
-dest_params = params.copy()
-dest_params['role'] = destination
-dest_params['migration_mode'] = tcp
-
-# Report the parameters we've received
-print Test parameters:
-keys = params.keys()
-keys.sort()
-for key in keys:
-logging.debug(%s = %s, key, params[key])
-
-source_control_file += (job.run_test('kvm', tag='%s', params=%s) %
-(source_params['shortname'], source_params))
-dest_control_file += (job.run_test('kvm', tag='%s', params=%s) %
-  (dest_params['shortname'], dest_params))
-
-logging.info('Source control 

[KVM-Autotest][PATCH 5/5] kvm test: Add multihost migration support to cpuflag test

2012-01-31 Thread Jiří Župka
Signed-off-by: Jiří Župka jzu...@redhat.com
---
 client/tests/kvm/tests/cpuflags.py |  284 +++-
 client/virt/subtests.cfg.sample|   17 ++-
 2 files changed, 263 insertions(+), 38 deletions(-)

diff --git a/client/tests/kvm/tests/cpuflags.py 
b/client/tests/kvm/tests/cpuflags.py
index 6f281d0..c2111d4 100644
--- a/client/tests/kvm/tests/cpuflags.py
+++ b/client/tests/kvm/tests/cpuflags.py
@@ -1,4 +1,4 @@
-import logging, re, random, os, time
+import logging, re, random, os, time, socket, pickle
 from autotest_lib.client.common_lib import error, utils
 from autotest_lib.client.virt import kvm_vm
 from autotest_lib.client.virt import virt_utils, aexpect
@@ -14,10 +14,9 @@ def run_cpuflags(test, params, env):
 @param params: Dictionary with the test parameters.
 @param env: Dictionary with test environment.
 
+virt_utils.Flag.aliases = virt_utils.kvm_map_flags_aliases
 qemu_binary = virt_utils.get_path('.', params.get(qemu_binary, qemu))
 
-cpuflags_path = os.path.join(test.virtdir, deps)
-cpuflags_tar = cpuflags-test.tar.bz2
 cpuflags_src = os.path.join(test.virtdir, deps, test_cpu_flags)
 smp = int(params.get(smp, 1))
 
@@ -25,8 +24,9 @@ def run_cpuflags(test, params, env):
 
 mig_timeout = float(params.get(mig_timeout, 3600))
 mig_protocol = params.get(migration_protocol, tcp)
-mig_speed = params.get(mig_speed, 1G)
+mig_speed = params.get(mig_speed, 100M)
 
+multi_host_migration = params.get(multi_host_migration, no)
 
 class HgFlags(object):
 def __init__(self, cpu_model, extra_flags=set([])):
@@ -62,7 +62,8 @@ def run_cpuflags(test, params, env):
 virtual_flags)
 
 
-def start_guest_with_cpuflags(cpuflags, smp=None):
+def start_guest_with_cpuflags(cpuflags, smp=None, migration=False,
+  wait=True):
 
 Try to boot guest with special cpu flags and try login in to them.
 
@@ -74,10 +75,15 @@ def run_cpuflags(test, params, env):
 vm_name = vm1-cpuflags
 vm = kvm_vm.VM(vm_name, params_b, test.bindir, env['address_cache'])
 env.register_vm(vm_name, vm)
-vm.create()
+if (migration is True):
+vm.create(migration_mode=mig_protocol)
+else:
+vm.create()
 vm.verify_alive()
 
-session = vm.wait_for_login()
+session = None
+if wait:
+session = vm.wait_for_login()
 
 return (vm, session)
 
@@ -228,8 +234,10 @@ def run_cpuflags(test, params, env):
 
 session = vm.wait_for_login()
 vm.copy_files_to(cpuflags_src, dst_dir)
+session.cmd(sync)
 session.cmd(cd %s; make EXTRA_FLAGS=''; %
 os.path.join(dst_dir, test_cpu_flags))
+session.cmd(sync)
 session.close()
 
 
@@ -293,7 +301,30 @@ def run_cpuflags(test, params, env):
 return cpu_model
 
 
-def test_qemu_interface():
+def parse_cpu_model():
+
+Parse cpu_models from config file.
+
+@return: [(cpumodel, extra_flags)]
+
+cpu_models = params.get(cpu_models,).split()
+if not cpu_models:
+cpu_models = get_cpu_models()
+
+logging.debug(CPU models found: %s, str(cpu_models))
+models = []
+for cpu_model in cpu_models:
+try:
+(cpu_model, extra_flags) = cpu_model.split(:)
+extra_flags = set(map(virt_utils.Flag, extra_flags.split(,)))
+except ValueError:
+cpu_model = cpu_model
+extra_flags = set([])
+models.append((cpu_model,extra_flags))
+return models
+
+
+def test_qemu_interface_group():
 
 1) qemu-kvm-cmd -cpu ?model
 2) qemu-kvm-cmd -cpu ?dump
@@ -349,23 +380,20 @@ def run_cpuflags(test, params, env):
 test_qemu_dump()
 test_qemu_cpuid()
 
+
 class test_temp(Subtest):
 def clean(self):
 logging.info(cleanup)
 if (hasattr(self, vm)):
 self.vm.destroy(gracefully=False)
 
-def test_boot_guest():
+
+def test_boot_guest_group():
 
 1) boot with cpu_model
 2) migrate with flags
 3) qemu-kvm-cmd -cpu model_name,+Flag
 
-cpu_models = params.get(cpu_models,).split()
-if not cpu_models:
-cpu_models = get_cpu_models()
-logging.debug(CPU models found: %s, str(cpu_models))
-
 # 1) boot with cpu_model
 class test_boot_cpu_model(test_temp):
 def test(self, cpu_model):
@@ -460,29 +488,18 @@ def run_cpuflags(test, params, env):
 if fwarn_flags:
 raise error.TestFail(Qemu did not warn the use of 
  flags %s % str(fwarn_flags))
-for cpu_model in cpu_models:
-try:
-

Re: [Qemu-devel] KVM call agenda for tuesday 31

2012-01-31 Thread Avi Kivity
On 01/31/2012 04:17 PM, Anthony Liguori wrote:
 On 01/31/2012 08:09 AM, Avi Kivity wrote:
 On 01/31/2012 03:59 PM, Anthony Liguori wrote:
 On 01/30/2012 05:41 PM, Andreas Färber wrote:
 Am 30.01.2012 19:55, schrieb Juan Quintela:
 Please send in any agenda items you are interested in covering.

 QOM roadmap update:
 * Series 3/4 is on the list.
 -   Please officially designate a merge date (Friday?).
 -   To make review sensible, I ask for a hard device freeze until
 merged.
  I.e., no new devices and no conflicting changes to DeviceInfo.

 I put together a few slides to help this discussion:

 http://www.codemonkey.ws/files/qom-overview.pdf


 That was helpful, thanks.

 Can you clarify

 - Types and their properties will be ABI  compatible
 - Types and properties will not be backwards  compatible
 – We can re-examine this as the device model  matures and stabilizes

 the first two seem very similar, except for the not.

 I guess I really mean QMP compatible.  The expectation is that well
 written management code does:

 if (check_for_command(qom-list)) {
run_command(qom-list, /);
 }

 ABI compatible means that if qom-list is there, it's semantics never
 change. Backwards compatibility would mean that once qom-list is
 introduced, it never goes away.

 In terms of QOM types, we won't guarantee that a Type will stick
 around forever, but we will guarantee if it's there, it'll behave in a
 certain way.

 When the device model stabilizes over time, we can revisit this, but
 in the 1.x series, I'm sure we're going to go through a few
 significant refactorings that change types significantly.

Ok.  We have to communicate this very carefully.  If a management tool
uses something that is ABI compatible, but not backwards compatible, and
lacks a known alternative at the time of writing the management tool,
then the ABI-compatible-but-not-backwards-compatible property becomes
not backwards compatible to the management tool's users.

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KVM call agenda for tuesday 31

2012-01-31 Thread Michael S. Tsirkin
On Tue, Jan 31, 2012 at 08:12:29AM -0600, Anthony Liguori wrote:
 On 01/31/2012 07:15 AM, Andreas Färber wrote:
 Am 31.01.2012 00:53, schrieb Anthony Liguori:
 On 01/30/2012 05:41 PM, Andreas Färber wrote:
 Am 30.01.2012 19:55, schrieb Juan Quintela:
 Please send in any agenda items you are interested in covering.
 
 VMState:
 Anthony specifically said that VMState were not affected by QOM and that
 patches should not be deferred until the merge. Yet there's no review
 and/or decision-making for a month now. Ping^2 for AHCI+SDHC.
 
 Do you have pointers (to pending VMState patches)?
 
 http://patchwork.ozlabs.org/patch/137732/ (PATCH v4)
 
 It's basically about how to deal with variable-sized arrays. (Alex
 mentioned it on one call around November.) I found ways to deal with
 subsets of arrays embedded within the struct and variable-sized list of
 pointers to structs but no solution for a malloc()'ed array of structs.
 Maybe I'm just too stupid to see. Anyway, no one commented since Xmas.
 
 /me puts on his flame proof suit
 
 Don't use VMState.  Just open code a save/restore function.  VMState
 is too limited in how it handles complex data structures.
 
 I really believe the only long term solution we're going to get to
 here is something that uses a builder interface (like Visitors).
 
 Regards,
 
 Anthony Liguori

So the TPM patches started implementing a visitor interface
for BER, they are still blocked, right?

 
 Igor posted (and refined for v2) a patch with a callback-based approach
 that I find promising. From my view, unofficially Juan is the VMState
 guy, he's been cc'ed. Are we lacking an official maintainer that cares?
 Or is Juan the official, undocumented maintainer but simply busy?
 
 SUSE's interest is making AHCI migratable, and my VMState workaround for
 that is simply ugly:
 
 http://patchwork.ozlabs.org/patch/133066/ (RFC)
 
 Therefore I'm waiting for some resolution.
 
 Regards,
 Andreas
 
 
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KVM call agenda for tuesday 31

2012-01-31 Thread Michael S. Tsirkin
On Tue, Jan 31, 2012 at 05:04:48PM +0200, Michael S. Tsirkin wrote:
 On Tue, Jan 31, 2012 at 08:12:29AM -0600, Anthony Liguori wrote:
  On 01/31/2012 07:15 AM, Andreas Färber wrote:
  Am 31.01.2012 00:53, schrieb Anthony Liguori:
  On 01/30/2012 05:41 PM, Andreas Färber wrote:
  Am 30.01.2012 19:55, schrieb Juan Quintela:
  Please send in any agenda items you are interested in covering.
  
  VMState:
  Anthony specifically said that VMState were not affected by QOM and that
  patches should not be deferred until the merge. Yet there's no review
  and/or decision-making for a month now. Ping^2 for AHCI+SDHC.
  
  Do you have pointers (to pending VMState patches)?
  
  http://patchwork.ozlabs.org/patch/137732/ (PATCH v4)
  
  It's basically about how to deal with variable-sized arrays. (Alex
  mentioned it on one call around November.) I found ways to deal with
  subsets of arrays embedded within the struct and variable-sized list of
  pointers to structs but no solution for a malloc()'ed array of structs.
  Maybe I'm just too stupid to see. Anyway, no one commented since Xmas.
  
  /me puts on his flame proof suit
  
  Don't use VMState.  Just open code a save/restore function.  VMState
  is too limited in how it handles complex data structures.
  
  I really believe the only long term solution we're going to get to
  here is something that uses a builder interface (like Visitors).
  
  Regards,
  
  Anthony Liguori
 
 So the TPM patches started implementing a visitor interface
 for BER, they are still blocked, right?

And by the way, I believe we really should switch to
something like BER and just drop the legacy migration
format - being non self delimiting makes it just too
painful for words to abstract, and attempts to
hide that mess behind a visitor interface will just cause
more cruft to accumulate, and cause inefficiencies
such as extra data copies.

  
  Igor posted (and refined for v2) a patch with a callback-based approach
  that I find promising. From my view, unofficially Juan is the VMState
  guy, he's been cc'ed. Are we lacking an official maintainer that cares?
  Or is Juan the official, undocumented maintainer but simply busy?
  
  SUSE's interest is making AHCI migratable, and my VMState workaround for
  that is simply ugly:
  
  http://patchwork.ozlabs.org/patch/133066/ (RFC)
  
  Therefore I'm waiting for some resolution.
  
  Regards,
  Andreas
  
  
  --
  To unsubscribe from this list: send the line unsubscribe kvm in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] KVM call agenda for tuesday 31

2012-01-31 Thread Paolo Bonzini

On 01/31/2012 03:12 PM, Anthony Liguori wrote:


Don't use VMState.  Just open code a save/restore function.  VMState is
too limited in how it handles complex data structures.

I really believe the only long term solution we're going to get to here
is something that uses a builder interface (like Visitors).


Visitors for complex data structures still require you separate get/set 
functions.  VMState was created to avoid having to write things twice 
and avoid getting the two out of sync.


Paolo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/4 V11] Add ioctl for KVMCLOCK_GUEST_STOPPED

2012-01-31 Thread Eric B Munson
Now that we have a flag that will tell the guest it was suspended, create an
interface for that communication using a KVM ioctl.

Signed-off-by: Eric B Munson emun...@mgebm.net

Cc: mi...@redhat.com
Cc: h...@zytor.com
Cc: ry...@linux.vnet.ibm.com
Cc: aligu...@us.ibm.com
Cc: mtosa...@redhat.com
Cc: kvm@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-ker...@vger.kernel.org
---
Changes from V10:
 Return the ioctl to per vcpu instead of per vm
Changes from V9:
 Use kvm_for_each_vcpu to iterate online vcpu's
Changes from V8:
 Make KVM_GUEST_PAUSED a per vm ioctl instead of per vcpu
Changes from V7:
 Define KVM_CAP_GUEST_PAUSED and support check
 Call mark_page_dirty () after setting PVCLOCK_GUEST_STOPPED
Changes from V4:
 Rename KVM_GUEST_PAUSED to KVMCLOCK_GUEST_PAUSED
 Add new ioctl description to api.txt

 Documentation/virtual/kvm/api.txt |   13 +
 arch/x86/kvm/x86.c|   21 +
 include/linux/kvm.h   |3 +++
 3 files changed, 37 insertions(+), 0 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index e1d94bf..1931e5c 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1491,6 +1491,19 @@ following algorithm:
 Some guests configure the LINT1 NMI input to cause a panic, aiding in
 debugging.
 
+4.65 KVMCLOCK_GUEST_PAUSED
+
+Capability: KVM_CAP_GUEST_PAUSED
+Architechtures: Any that implement pvclocks (currently x86 only)
+Type: vcpu ioctl
+Parameters: None
+Returns: 0 on success, -1 on error
+
+This signals to the host kernel that the specified guest is being paused by
+userspace.  The host will set a flag in the pvclock structure that is checked
+from the soft lockup watchdog.  This ioctl can be called during pause or
+unpause.
+
 5. The kvm_run structure
 
 Application code obtains a pointer to the kvm_run structure by
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 14d6cad..c9cabba 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2056,6 +2056,7 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_X86_ROBUST_SINGLESTEP:
case KVM_CAP_XSAVE:
case KVM_CAP_ASYNC_PF:
+   case KVM_CAP_GUEST_PAUSED:
case KVM_CAP_GET_TSC_KHZ:
r = 1;
break;
@@ -2503,6 +2504,22 @@ static int kvm_vcpu_ioctl_x86_set_xcrs(struct kvm_vcpu 
*vcpu,
return r;
 }
 
+/*
+ * kvm_set_guest_paused() indicates to the guest kernel that it has been
+ * stopped by the hypervisor.  This function will be called from the host only.
+ * EINVAL is returned when the host attempts to set the flag for a guest that
+ * does not support pv clocks.
+ */
+static int kvm_set_guest_paused(struct kvm_vcpu *vcpu)
+{
+   struct pvclock_vcpu_time_info *src = vcpu-arch.hv_clock;
+   if (!vcpu-arch.time_page)
+   return -EINVAL;
+   src-flags |= PVCLOCK_GUEST_STOPPED;
+   mark_page_dirty(vcpu-kvm, vcpu-arch.time  PAGE_SHIFT);
+   return 0;
+}
+
 long kvm_arch_vcpu_ioctl(struct file *filp,
 unsigned int ioctl, unsigned long arg)
 {
@@ -2784,6 +2801,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 
goto out;
}
+   case KVMCLOCK_GUEST_PAUSED: {
+   r = kvm_set_guest_paused(vcpu);
+   break;
+   }
default:
r = -EINVAL;
}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 68e67e5..4ffe0df 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -558,6 +558,7 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_PPC_PAPR 68
 #define KVM_CAP_S390_GMAP 71
 #define KVM_CAP_TSC_DEADLINE_TIMER 72
+#define KVM_CAP_GUEST_PAUSED 73
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -763,6 +764,8 @@ struct kvm_clock_data {
 #define KVM_CREATE_SPAPR_TCE _IOW(KVMIO,  0xa8, struct 
kvm_create_spapr_tce)
 /* Available with KVM_CAP_RMA */
 #define KVM_ALLOCATE_RMA _IOR(KVMIO,  0xa9, struct kvm_allocate_rma)
+/* VM is being stopped by host */
+#define KVMCLOCK_GUEST_PAUSED_IO(KVMIO,   0xaa)
 
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU(1  0)
 
-- 
1.7.5.4

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/4 V11] Add flag to indicate that a vm was stopped by the host

2012-01-31 Thread Eric B Munson
This flag will be used to check if the vm was stopped by the host when a soft
lockup was detected.  The host will set the flag when it stops the guest.  On
resume, the guest will check this flag if a soft lockup is detected and skip
issuing the warning.

Signed-off-by: Eric B Munson emun...@mgebm.net
Cc: mi...@redhat.com
Cc: h...@zytor.com
Cc: ry...@linux.vnet.ibm.com
Cc: aligu...@us.ibm.com
Cc: mtosa...@redhat.com
Cc: kvm@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-ker...@vger.kernel.org
---
 arch/x86/include/asm/pvclock-abi.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/pvclock-abi.h 
b/arch/x86/include/asm/pvclock-abi.h
index 35f2d19..6167fd7 100644
--- a/arch/x86/include/asm/pvclock-abi.h
+++ b/arch/x86/include/asm/pvclock-abi.h
@@ -40,5 +40,6 @@ struct pvclock_wall_clock {
 } __attribute__((__packed__));
 
 #define PVCLOCK_TSC_STABLE_BIT (1  0)
+#define PVCLOCK_GUEST_STOPPED  (1  1)
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_PVCLOCK_ABI_H */
-- 
1.7.5.4

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/4 V11] Add check for suspended vm in softlockup detector

2012-01-31 Thread Eric B Munson
A suspended VM can cause spurious soft lockup warnings.  To avoid these, the
watchdog now checks if the kernel knows it was stopped by the host and skips
the warning if so.  When the watchdog is reset successfully, clear the guest
paused flag.

Signed-off-by: Eric B Munson emun...@mgebm.net
Cc: mi...@redhat.com
Cc: h...@zytor.com
Cc: ry...@linux.vnet.ibm.com
Cc: aligu...@us.ibm.com
Cc: mtosa...@redhat.com
Cc: kvm@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-ker...@vger.kernel.org
---
Changes from V3:
 Clear the PAUSED flag when the watchdog is reset

 kernel/watchdog.c |   12 
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 1d7bca7..91485e5 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -25,6 +25,7 @@
 #include linux/sysctl.h
 
 #include asm/irq_regs.h
+#include linux/kvm_para.h
 #include linux/perf_event.h
 
 int watchdog_enabled = 1;
@@ -280,6 +281,9 @@ static enum hrtimer_restart watchdog_timer_fn(struct 
hrtimer *hrtimer)
__this_cpu_write(softlockup_touch_sync, false);
sched_clock_tick();
}
+
+   /* Clear the guest paused flag on watchdog reset */
+   kvm_check_and_clear_guest_paused();
__touch_watchdog();
return HRTIMER_RESTART;
}
@@ -292,6 +296,14 @@ static enum hrtimer_restart watchdog_timer_fn(struct 
hrtimer *hrtimer)
 */
duration = is_softlockup(touch_ts);
if (unlikely(duration)) {
+   /*
+* If a virtual machine is stopped by the host it can look to
+* the watchdog like a soft lockup, check to see if the host
+* stopped the vm before we issue the warning
+*/
+   if (kvm_check_and_clear_guest_paused())
+   return HRTIMER_RESTART;
+
/* only warn once */
if (__this_cpu_read(soft_watchdog_warn) == true)
return HRTIMER_RESTART;
-- 
1.7.5.4

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/4 V11] Add functions to check if the host has stopped the vm

2012-01-31 Thread Eric B Munson
When a host stops or suspends a VM it will set a flag to show this.  The
watchdog will use these functions to determine if a softlockup is real, or the
result of a suspended VM.

Signed-off-by: Eric B Munson emun...@mgebm.net
asm-generic changes Acked-by: Arnd Bergmann a...@arndb.de
Cc: mi...@redhat.com
Cc: h...@zytor.com
Cc: ry...@linux.vnet.ibm.com
Cc: aligu...@us.ibm.com
Cc: mtosa...@redhat.com
Cc: kvm@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-ker...@vger.kernel.org
---
Changes from V6:
 Use __this_cpu_and when clearing the PVCLOCK_GUEST_STOPPED flag
Changes from V5:
 Collapse generic stubs into this patch
 check_and_clear_guest_stopped() takes no args and uses __get_cpu_var()
 Include individual definitions in ia64, s390, and powerpc

 arch/ia64/include/asm/kvm_para.h|5 +
 arch/powerpc/include/asm/kvm_para.h |5 +
 arch/s390/include/asm/kvm_para.h|5 +
 arch/x86/include/asm/kvm_para.h |8 
 arch/x86/kernel/kvmclock.c  |   21 +
 5 files changed, 44 insertions(+), 0 deletions(-)

diff --git a/arch/ia64/include/asm/kvm_para.h b/arch/ia64/include/asm/kvm_para.h
index 1588aee..2019cb9 100644
--- a/arch/ia64/include/asm/kvm_para.h
+++ b/arch/ia64/include/asm/kvm_para.h
@@ -26,6 +26,11 @@ static inline unsigned int kvm_arch_para_features(void)
return 0;
 }
 
+static inline bool kvm_check_and_clear_guest_paused(void)
+{
+   return false;
+}
+
 #endif
 
 #endif
diff --git a/arch/powerpc/include/asm/kvm_para.h 
b/arch/powerpc/include/asm/kvm_para.h
index 50533f9..1f80293 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -169,6 +169,11 @@ static inline unsigned int kvm_arch_para_features(void)
return r;
 }
 
+static inline bool kvm_check_and_clear_guest_paused(void)
+{
+   return false;
+}
+
 #endif /* __KERNEL__ */
 
 #endif /* __POWERPC_KVM_PARA_H__ */
diff --git a/arch/s390/include/asm/kvm_para.h b/arch/s390/include/asm/kvm_para.h
index 6964db2..a988329 100644
--- a/arch/s390/include/asm/kvm_para.h
+++ b/arch/s390/include/asm/kvm_para.h
@@ -149,6 +149,11 @@ static inline unsigned int kvm_arch_para_features(void)
return 0;
 }
 
+static inline bool kvm_check_and_clear_guest_paused(void)
+{
+   return false;
+}
+
 #endif
 
 #endif /* __S390_KVM_PARA_H */
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 734c376..99c4bbe 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -95,6 +95,14 @@ struct kvm_vcpu_pv_apf_data {
 extern void kvmclock_init(void);
 extern int kvm_register_clock(char *txt);
 
+#ifdef CONFIG_KVM_CLOCK
+bool kvm_check_and_clear_guest_paused(void);
+#else
+static inline bool kvm_check_and_clear_guest_paused(void)
+{
+   return false;
+}
+#endif /* CONFIG_KVMCLOCK */
 
 /* This instruction is vmcall.  On non-VT architectures, it will generate a
  * trap that we will then rewrite to the appropriate instruction.
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 44842d7..bdf6423 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -22,6 +22,7 @@
 #include asm/msr.h
 #include asm/apic.h
 #include linux/percpu.h
+#include linux/hardirq.h
 
 #include asm/x86_init.h
 #include asm/reboot.h
@@ -114,6 +115,26 @@ static void kvm_get_preset_lpj(void)
preset_lpj = lpj;
 }
 
+bool kvm_check_and_clear_guest_paused(void)
+{
+   bool ret = false;
+   struct pvclock_vcpu_time_info *src;
+
+   /*
+* per_cpu() is safe here because this function is only called from
+* timer functions where preemption is already disabled.
+*/
+   WARN_ON(!in_atomic());
+   src = __get_cpu_var(hv_clock);
+   if ((src-flags  PVCLOCK_GUEST_STOPPED) != 0) {
+   __this_cpu_and(hv_clock.flags, ~PVCLOCK_GUEST_STOPPED);
+   ret = true;
+   }
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(kvm_check_and_clear_guest_paused);
+
 static struct clocksource kvm_clock = {
.name = kvm-clock,
.read = kvm_clock_get_cycles,
-- 
1.7.5.4

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/4 V11] Avoid soft lockup message when KVM is stopped by host

2012-01-31 Thread Eric B Munson
Changes from V10:
Return ioctl to per vcpu instead of per vm

Changes from V9:
Use kvm_for_each_vcpu to iterate online vcpu's

Changes from V8:
Make KVM_GUEST_PAUSED a per vm ioctl instead of per vcpu

Changes from V7:
Define KVM_CAP_GUEST_PAUSED and support check
Call mark_page_dirty () after setting PVCLOCK_GUEST_STOPPED

Changes from V6:
Use __this_cpu_and when clearing the PVCLOCK_GUEST_STOPPED flag

Changes from V5:
Collapse generic check_and_clear_guest_stopped into patch 2
Include check_and_clear_guest_stopped defintion to ia64, s390, and powerpc
Change check_and_clear_guest_stopped to use __get_cpu_var instead of taking the
 cpuid arg.
Protect check_and_clear_guest_stopped declaration with CONFIG_KVM_CLOCK check

Changes from V4:
Rename KVM_GUEST_PAUSED to KVMCLOCK_GUEST_PAUSED
Add description of KVMCLOCK_GUEST_PAUSED ioctl to api.txt

Changes from V3:
Include CC's on patch 3
Drop clear flag ioctl and have the watchdog clear the flag when it is reset

Changes from V2:
A new kvm functions defined in kvm_para.h, the only change to pvclock is the
initial flag definition

Changes from V1:
(Thanks Marcelo)
Host code has all been moved to arch/x86/kvm/x86.c
KVM_PAUSE_GUEST was renamed to KVM_GUEST_PAUSED

Cc: mi...@redhat.com
Cc: h...@zytor.com
Cc: ry...@linux.vnet.ibm.com
Cc: aligu...@us.ibm.com
Cc: mtosa...@redhat.com
Cc: kvm@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-ker...@vger.kernel.org

Eric B Munson (4):
  Add flag to indicate that a vm was stopped by the host
  Add functions to check if the host has stopped the vm
  Add ioctl for KVMCLOCK_GUEST_STOPPED
  Add check for suspended vm in softlockup detector

 Documentation/virtual/kvm/api.txt   |   13 +
 arch/ia64/include/asm/kvm_para.h|5 +
 arch/powerpc/include/asm/kvm_para.h |5 +
 arch/s390/include/asm/kvm_para.h|5 +
 arch/x86/include/asm/kvm_para.h |8 
 arch/x86/include/asm/pvclock-abi.h  |1 +
 arch/x86/kernel/kvmclock.c  |   21 +
 arch/x86/kvm/x86.c  |   21 +
 include/linux/kvm.h |3 +++
 kernel/watchdog.c   |   12 
 10 files changed, 94 insertions(+), 0 deletions(-)

-- 
1.7.5.4

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] reset edge sense circuit of i8259 on init

2012-01-31 Thread Gleb Natapov
On Tue, Jan 24, 2012 at 03:06:05PM +0200, Gleb Natapov wrote:
 The spec says that during initialization The edge sense circuit is
 reset which means that following initialization an interrupt request
 (IR) input must make a low-to-high transition to generate an interrupt,
 but currently if edge triggered interrupt is in IRR it is delivered
 after i8259 initialization.
 
ping

 Signed-off-by: Gleb Natapov g...@redhat.com
 diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
 index b6a7353..81cf4fa 100644
 --- a/arch/x86/kvm/i8259.c
 +++ b/arch/x86/kvm/i8259.c
 @@ -307,6 +307,7 @@ static void pic_ioport_write(void *opaque, u32 addr, u32 
 val)
   if (val  0x10) {
   s-init4 = val  1;
   s-last_irr = 0;
 + s-irr = s-elcr;
   s-imr = 0;
   s-priority_add = 0;
   s-special_mask = 0;
 --
   Gleb.
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] KVM call agenda for tuesday 31

2012-01-31 Thread Andreas Färber
Am 31.01.2012 14:59, schrieb Anthony Liguori:
 On 01/30/2012 05:41 PM, Andreas Färber wrote:
 Am 30.01.2012 19:55, schrieb Juan Quintela:
 Please send in any agenda items you are interested in covering.

 QOM roadmap update:
 * Series 3/4 is on the list.
 -  Please officially designate a merge date (Friday?).
 -  To make review sensible, I ask for a hard device freeze until merged.
 I.e., no new devices and no conflicting changes to DeviceInfo.
 
 I put together a few slides to help this discussion:
 
 http://www.codemonkey.ws/files/qom-overview.pdf

So basically we barely managed to walk through the slides without
getting into any of the requested agenda points. ;)

Re slide 5 I could use a more detailed explanation.
From what I understand, ...
...types are registered through device_init() and module_call_init().
...classes are lazily initialized, eventually calling class_init.
...objects are instantiated through object_new[_with_type](), eventually
calling instance_init, by convention named initfn.
Up to here would be step 1.

How does the realize step 2 fit in technically? What types does this
apply to? All? Who calls realize? Where should class authors draw the
line between the two? (DOs and DON'Ts)

Then some conceptual questions:

How is inheritance (virtual methods in C++) supposed to work? (in my
case a reset function overwritten by each subclass) I've seen code in
series 3/4 calling methods the parent class defines in its header,
replacing the parent class' function pointer? Or should derived classes
store their parent's function pointer in a new ObjectClass and invoke
that? (i.e., are such parent functions supposed to be public or static)

When you talk about setting properties, does that refer to using the
property accessor functions moved to Object in series 3/4? Or should
qdev devices that currently expose only arbitrary setup functions
instead move their state to a header file so that it can be written into
by the parent? If so, should such headers reside in hw/ or include/qemu/?

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Bug 42703] New: random hangs on virtualization host

2012-01-31 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=42703

   Summary: random hangs on virtualization host
   Product: Virtualization
   Version: unspecified
Kernel Version: 3.0.0-16
  Platform: All
OS/Version: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: kvm
AssignedTo: virtualization_...@kernel-bugs.osdl.org
ReportedBy: gbramdewi...@gmail.com
Regression: No


Hi,

We are experiencing random system crashes on our ubuntu 11.10 server running an
openstack / KVM cloud. Unfortunately we did not observe any specific behavior
that trigers the problem, but I have collected a couple of stack traces.
Running on a dell 710  dual Intel Xeon X5650 CPU's

Please let me know if I can provide more information.

Kind regards
bram

Jan 13 17:06:46 cmggcn01 kernel: [876590.142455] general protection fault: 
[#1] SMP 
Jan 13 17:06:46 cmggcn01 kernel: [876590.165966] CPU 4 
Jan 13 17:06:46 cmggcn01 kernel: [876590.166135] Modules linked in: ebt_arp
ebt_ip 8021q garp ip6table_filter ip6_tables ebtable_nat ebtables
ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state
nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter
ip_tables x_tables bridge stp kvm_intel kvm nbd vesafb ib_iser rdma_cm ib_cm
iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi
scsi_transport_iscsi dcdbas dm_multipath psmouse serio_raw ghes hed
acpi_power_meter bonding i7core_edac lp joydev parport edac_core ses enclosure
usbhid hid megaraid_sas bnx2
Jan 13 17:06:46 cmggcn01 kernel: [876590.313703] 
Jan 13 17:06:46 cmggcn01 kernel: [876590.337109] Pid: 93, comm: ksmd Not
tainted 3.0.0-14-server #23-Ubuntu Dell Inc. PowerEdge R710/0MD99X
Jan 13 17:06:46 cmggcn01 kernel: [876590.362347] RIP: 0010:[a01b3411]
 [a01b3411] kvm_set_pte_rmapp+0x51/0x130 [kvm]
Jan 13 17:06:46 cmggcn01 kernel: [876590.386424] RSP: 0018:8817f4129bc0 
EFLAGS: 00010202
Jan 13 17:06:46 cmggcn01 kernel: [876590.411497] RAX: 88050d943ff8 RBX:
88050d943ff8 RCX: a01b33c0
Jan 13 17:06:46 cmggcn01 kernel: [876590.435831] RDX: 8817f4129c88 RSI:
 RDI: 88050d943ff8
Jan 13 17:06:46 cmggcn01 kernel: [876590.460571] RBP: 8817f4129c00 R08:
880bf6012960 R09: 0100
Jan 13 17:06:46 cmggcn01 kernel: [876590.484258] R10: 00ab R11:
0002 R12: c900288c3ff8
Jan 13 17:06:46 cmggcn01 kernel: [876590.507470] R13: 8817f4129c88 R14:
880b917d8000 R15: 002f7279
Jan 13 17:06:46 cmggcn01 kernel: [876590.531508] FS:  ()
GS:88183fc4() knlGS:
Jan 13 17:06:46 cmggcn01 kernel: [876590.554747] CS:  0010 DS:  ES: 
CR0: 8005003b
Jan 13 17:06:46 cmggcn01 kernel: [876590.577706] CR2: 7f3ac168a000 CR3:
01c03000 CR4: 26e0
Jan 13 17:06:46 cmggcn01 kernel: [876590.602479] DR0:  DR1:
 DR2: 
Jan 13 17:06:46 cmggcn01 kernel: [876590.624973] DR3:  DR6:
0ff0 DR7: 0400
Jan 13 17:06:46 cmggcn01 kernel: [876590.647214] Process ksmd (pid: 93,
threadinfo 8817f4128000, task 8817f63fdc80)
Jan 13 17:06:46 cmggcn01 kernel: [876590.669444] Stack:
Jan 13 17:06:46 cmggcn01 kernel: [876590.690742]  8817f4129c10
a01b3409 008f 880b6bea70b0
Jan 13 17:06:46 cmggcn01 kernel: [876590.713167]  0002
7f54c405 880b6bea7000 0010d9ff
Jan 13 17:06:46 cmggcn01 kernel: [876590.735595]  8817f4129c70
a01b0dd9 8817f4129c80 a01b33c0
Jan 13 17:06:46 cmggcn01 kernel: [876590.758191] Call Trace:
Jan 13 17:06:46 cmggcn01 kernel: [876590.780877]  [a01b3409] ?
kvm_set_pte_rmapp+0x49/0x130 [kvm]
Jan 13 17:06:46 cmggcn01 kernel: [876590.804088]  [a01b0dd9]
kvm_handle_hva+0x99/0x180 [kvm]
Jan 13 17:06:46 cmggcn01 kernel: [876590.829628]  [a01b33c0] ?
rmap_write_protect+0x150/0x150 [kvm]
Jan 13 17:06:46 cmggcn01 kernel: [876590.853510]  [a01b7ad1]
kvm_set_spte_hva+0x21/0x30 [kvm]
Jan 13 17:06:46 cmggcn01 kernel: [876590.876966]  [a019591d]
kvm_mmu_notifier_change_pte+0x5d/0x90 [kvm]
Jan 13 17:06:46 cmggcn01 kernel: [876590.901395]  [8114d87e]
__mmu_notifier_change_pte+0x3e/0x80
Jan 13 17:06:46 cmggcn01 kernel: [876590.925191]  [8114e07f]
write_protect_page+0x10f/0x170
Jan 13 17:06:46 cmggcn01 kernel: [876590.950156]  [8114e2df] ?
replace_page+0x1ff/0x280
Jan 13 17:06:46 cmggcn01 kernel: [876590.974177]  [8114e3e0]
try_to_merge_one_page+0x80/0x220
Jan 13 17:06:46 cmggcn01 kernel: [876590.999032]  [8114e5f7]
try_to_merge_with_ksm_page+0x77/0xc0
Jan 13 17:06:46 cmggcn01 kernel: [876591.022944]  [8114f616]
cmp_and_merge_page+0xe6/0x260
Jan 13 17:06:46 cmggcn01 kernel: [876591.046993]  [8114f83f]

[Bug 42703] random hangs on virtualization host

2012-01-31 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=42703


Avi Kivity a...@redhat.com changed:

   What|Removed |Added

 CC||a...@redhat.com
 AssignedTo|virtualization_kvm@kernel-b |a...@redhat.com
   |ugs.osdl.org|




--- Comment #1 from Avi Kivity a...@redhat.com  2012-01-31 17:15:48 ---
 Jan 27 13:41:28 cmggcn01 kernel: [871350.761867] general protection fault: 
  [#2] SMP 
 Jan 27 13:41:28 cmggcn01 kernel: [871350.790117] CPU 14 
 Jan 27 13:41:28 cmggcn01 kernel: [871350.790387] Modules linked in: btrfs 
 zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs 
 reiserfs ebt_arp ebt_ip 8021q garp ip6table_filter ip6_tables ebtable_nat 
 ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 
 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp 
 iptable_filter ip_tables x_tables bridge stp kvm_intel kvm nbd vesafb ib_iser 
 rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp 
 libiscsi scsi_transport_iscsi psmouse dcdbas dm_multipath serio_raw joydev 
 ghes hed acpi_power_meter bonding lp parport i7core_edac edac_core ses 
 enclosure usbhid hid megaraid_sas bnx2
 Jan 27 13:41:28 cmggcn01 kernel: [871351.005151] 
 Jan 27 13:41:28 cmggcn01 kernel: [871351.036187] Pid: 90, comm: kswapd0 
 Tainted: G  D 3.0.0-14-server #23-Ubuntu Dell Inc. PowerEdge 
 R710/0MD99X
 Jan 27 13:41:28 cmggcn01 kernel: [871351.072809] RIP: 
 0010:[a01a5890]  [a01a5890] kvm_unmap_rmapp+0x20/0x60 
 [kvm]
 Jan 27 13:41:28 cmggcn01 kernel: [871351.105190] RSP: 0018:8817f3e27a60  
 EFLAGS: 00010202
 Jan 27 13:41:28 cmggcn01 kernel: [871351.141329] RAX: 8817f5d067f8 RBX: 
 c9001fd41ff8 RCX: a01a58d0
 Jan 27 13:41:28 cmggcn01 kernel: [871351.179076] RDX:  RSI: 
  RDI: 8817f5d067f8
 Jan 27 13:41:28 cmggcn01 kernel: [871351.212086] RBP: 8817f3e27a80 R08: 
 8817f315b3e0 R09: 0100
 Jan 27 13:41:28 cmggcn01 kernel: [871351.245788] R10: 000e R11: 
 0002 R12: 8817f2f0c000
 Jan 27 13:41:28 cmggcn01 kernel: [871351.277514] R13:  R14: 
 880be235e000 R15: 000d3cff
 Jan 27 13:41:28 cmggcn01 kernel: [871351.308421] FS:  () 
 GS:88183fce() knlGS:
 Jan 27 13:41:28 cmggcn01 kernel: [871351.339685] CS:  0010 DS:  ES:  
 CR0: 8005003b
 Jan 27 13:41:28 cmggcn01 kernel: [871351.370089] CR2: 7f8836442000 CR3: 
 01c03000 CR4: 26e0
 Jan 27 13:41:28 cmggcn01 kernel: [871351.399771] DR0:  DR1: 
  DR2: 
 Jan 27 13:41:28 cmggcn01 kernel: [871351.428208] DR3:  DR6: 
 0ff0 DR7: 0400
 Jan 27 13:41:28 cmggcn01 kernel: [871351.456153] Process kswapd0 (pid: 90, 
 threadinfo 8817f3e26000, task 8817f63ac560)
 Jan 27 13:41:28 cmggcn01 kernel: [871351.484943] Stack:
 Jan 27 13:41:28 cmggcn01 kernel: [871351.512984]   
 c9001fd41ff8 0001 7f834a87e000
 Jan 27 13:41:28 cmggcn01 kernel: [871351.542025]  8817f3e27aa0 
 a01a5945 880be235e060 0001
 Jan 27 13:41:28 cmggcn01 kernel: [871351.571050]  8817f3e27b10 
 a01a1dd9 8817f3e27ae0 a01a58d0
snip

 Jan 27 13:41:28 cmggcn01 kernel: [871352.080582] Code: e7 d0 e8 e0 66 90 e9 
 a2 fe ff ff 55 48 89 e5 41 55 41 54 53 48 83 ec 08 66 66 66 66 90 45 31 ed 49 
 89 fc 48 89 f3 eb 20 0f 1f 00 f6 00 01 74 35 48 8b 15 74 7a 02 00 48 89 c6 
 4c 89 e7 41 bd 01 

   0:e8 e0 66 90 e9   callq  0xe99066e5
   5:a2 fe ff ff 55 48 89 mov%al,0x41e5894855fe
   c:e5 41
   e:55   push   %rbp
   f:41 54push   %r12
  11:53   push   %rbx
  12:48 83 ec 08  sub$0x8,%rsp
  16:66 66 66 66 90   data32 data32 data32 xchg %ax,%ax
  1b:45 31 ed xor%r13d,%r13d
  1e:49 89 fc mov%rdi,%r12
  21:48 89 f3 mov%rsi,%rbx
  24:eb 20jmp0x46
  26:0f 1f 00 nopl   (%rax)
  29:f6 00 01 testb  $0x1,(%rax)

^ dies here, %rax is non-canonical.

  2c:74 35je 0x63
  2e:48 8b 15 74 7a 02 00 mov0x27a74(%rip),%rdx# 0x27aa9
  35:48 89 c6 mov%rax,%rsi
  38:4c 89 e7 mov%r12,%rdi


static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp,
   unsigned long data)
{
u64 *spte;
int need_tlb_flush = 0;

while ((spte = rmap_next(kvm, rmapp, NULL))) {
BUG_ON(!(*spte  PT_PRESENT_MASK));

^ here, 

Re: performance trouble

2012-01-31 Thread David Cure
Le Mon, Jan 30, 2012 at 07:21:20PM +0200, Avi Kivity ecrivait :
 
 Start with top and vmstat to see if the cpu or I/O are the bottleneck

with top at the kvm layer, the VM take 50% of CPU (the physical
box have 16 real cores-not HT- : AMD Opteron 6134).
one result of vmstat :

procs ---memory-- ---swap-- -io -system--
cpu
 r  b   swpd   free   buff  cache   si   sobibo   in   cs us sy
id wa
 5  0  0 9904544  19612 19544000 01110  6  5
88  0

and I run it 1H and the result is always the same (just free
result change time to time).

 (also helpful to run the Windows performance monitor in the guest).  If

between the start of the function and the response, the are no
performance pic or something like that in Windows.

What we can see : 
o we start the function, 
o we see network communication with the database server during 
~3/4s 
o after nothing during 10s before display the result

David.  
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: performance trouble

2012-01-31 Thread Avi Kivity
What's up with the cc list?


On 01/31/2012 07:15 PM, David Cure wrote:
 Le Mon, Jan 30, 2012 at 07:21:20PM +0200, Avi Kivity ecrivait :
  
  Start with top and vmstat to see if the cpu or I/O are the bottleneck

   with top at the kvm layer, the VM take 50% of CPU (the physical
 box have 16 real cores-not HT- : AMD Opteron 6134).
   one result of vmstat :

 procs ---memory-- ---swap-- -io -system--
 cpu
  r  b   swpd   free   buff  cache   si   sobibo   in   cs us sy
 id wa
  5  0  0 9904544  19612 19544000 01110  6  5
 88  0

   and I run it 1H and the result is always the same (just free
 result change time to time).

Running vmstat with no arguments gives the average for all uptime.  It's
meaningless.  Run 'vmstat 1'.

How many vcpus are there in total (over all guests)?

What does top show?


  (also helpful to run the Windows performance monitor in the guest).  If

   between the start of the function and the response, the are no
 performance pic or something like that in Windows.

   What we can see : 
   o we start the function, 
   o we see network communication with the database server during 
 ~3/4s 
   o after nothing during 10s before display the result



That's really strange.  No network traffic/cpu/block I/O?

What do 'vmstat 1' and 'top -d 1' say during this?

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 1/7] i8254: Do not raise IRQ level on reset

2012-01-31 Thread Jan Kiszka
Avoid changing the IRQ level to high on reset as it may trigger spurious
events. Instead, open-code the effects of pit_load_count(0) in the reset
handler.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/i8254.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/hw/i8254.c b/hw/i8254.c
index add1fab..d4a08d7 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -481,7 +481,13 @@ static void pit_reset(DeviceState *dev)
 s = pit-channels[i];
 s-mode = 3;
 s-gate = (i != 2);
-pit_load_count(s, 0);
+s-count_load_time = qemu_get_clock_ns(vm_clock);
+s-count = 0x1;
+if (i == 0) {
+s-next_transition_time =
+pit_get_next_transition_time(s, s-count_load_time);
+qemu_mod_timer(s-irq_timer, s-next_transition_time);
+}
 }
 }
 
-- 
1.7.3.4

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 2/7] hpet: Save/restore cached RTC IRQ level

2012-01-31 Thread Jan Kiszka
In legacy mode, the HPET suppresses the RTC interrupt delivery via IRQ
8 but keeps track of the RTC output level and applies it when legacy
mode is turned off again. This value has to be preserved across save/
restore as it cannot be reconstructed otherwise.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/hpet.c |   26 ++
 1 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index aba9ea9..39b291f 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -240,6 +240,24 @@ static int hpet_post_load(void *opaque, int version_id)
 return 0;
 }
 
+static bool hpet_rtc_irq_level_needed(void *opaque)
+{
+HPETState *s = opaque;
+
+return s-rtc_irq_level != 0;
+}
+
+static const VMStateDescription vmstate_hpet_rtc_irq_level = {
+.name = hpet/rtc_irq_level,
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.fields  = (VMStateField[]) {
+VMSTATE_UINT8(rtc_irq_level, HPETState),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static const VMStateDescription vmstate_hpet_timer = {
 .name = hpet_timer,
 .version_id = 1,
@@ -273,6 +291,14 @@ static const VMStateDescription vmstate_hpet = {
 VMSTATE_STRUCT_VARRAY_UINT8(timer, HPETState, num_timers, 0,
 vmstate_hpet_timer, HPETTimer),
 VMSTATE_END_OF_LIST()
+},
+.subsections = (VMStateSubsection[]) {
+{
+.vmsd = vmstate_hpet_rtc_irq_level,
+.needed = hpet_rtc_irq_level_needed,
+}, {
+/* empty */
+}
 }
 };
 
-- 
1.7.3.4

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 6/7] pcspk: Convert to qdev

2012-01-31 Thread Jan Kiszka
Convert the PC speaker device to a qdev ISA model. Move the public
interface to a dedicated header file at this chance.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 arch_init.c|1 +
 hw/i82378.c|3 +-
 hw/mips_jazz.c |3 +-
 hw/pc.c|3 +-
 hw/pc.h|4 ---
 hw/pcspk.c |   62 ++-
 hw/pcspk.h |   45 
 7 files changed, 104 insertions(+), 17 deletions(-)
 create mode 100644 hw/pcspk.h

diff --git a/arch_init.c b/arch_init.c
index 2366511..a45485b 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -42,6 +42,7 @@
 #include gdbstub.h
 #include hw/smbios.h
 #include exec-memory.h
+#include hw/pcspk.h
 
 #ifdef TARGET_SPARC
 int graphic_width = 1024;
diff --git a/hw/i82378.c b/hw/i82378.c
index 127cadf..79fa8b7 100644
--- a/hw/i82378.c
+++ b/hw/i82378.c
@@ -20,6 +20,7 @@
 #include pci.h
 #include pc.h
 #include i8254.h
+#include pcspk.h
 
 //#define DEBUG_I82378
 
@@ -195,7 +196,7 @@ static void i82378_init(DeviceState *dev, I82378State *s)
 pit = pit_init(isabus, 0x40, 0, NULL);
 
 /* speaker */
-pcspk_init(pit);
+pcspk_init(isabus, pit);
 
 /* 2 82C37 (dma) */
 DMA_init(1, s-out[1]);
diff --git a/hw/mips_jazz.c b/hw/mips_jazz.c
index b61b218..65608dc 100644
--- a/hw/mips_jazz.c
+++ b/hw/mips_jazz.c
@@ -37,6 +37,7 @@
 #include loader.h
 #include mc146818rtc.h
 #include i8254.h
+#include pcspk.h
 #include blockdev.h
 #include sysbus.h
 #include exec-memory.h
@@ -193,7 +194,7 @@ static void mips_jazz_init(MemoryRegion *address_space,
 cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
 DMA_init(0, cpu_exit_irq);
 pit = pit_init(isa_bus, 0x40, 0, NULL);
-pcspk_init(pit);
+pcspk_init(isa_bus, pit);
 
 /* ISA IO space at 0x9000 */
 isa_mmio_init(0x9000, 0x0100);
diff --git a/hw/pc.c b/hw/pc.c
index 6fb1de7..f6a91a9 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -37,6 +37,7 @@
 #include multiboot.h
 #include mc146818rtc.h
 #include i8254.h
+#include pcspk.h
 #include msi.h
 #include sysbus.h
 #include sysemu.h
@@ -1169,7 +1170,7 @@ void pc_basic_device_init(ISABus *isa_bus, qemu_irq *gsi,
 /* connect PIT to output control line of the HPET */
 qdev_connect_gpio_out(hpet, 0, qdev_get_gpio_in(pit-qdev, 0));
 }
-pcspk_init(pit);
+pcspk_init(isa_bus, pit);
 
 for(i = 0; i  MAX_SERIAL_PORTS; i++) {
 if (serial_hds[i]) {
diff --git a/hw/pc.h b/hw/pc.h
index b08708d..1b47bbd 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -149,10 +149,6 @@ void piix4_smbus_register_device(SMBusDevice *dev, uint8_t 
addr);
 /* hpet.c */
 extern int no_hpet;
 
-/* pcspk.c */
-void pcspk_init(ISADevice *pit);
-int pcspk_audio_init(ISABus *bus);
-
 /* piix_pci.c */
 struct PCII440FXState;
 typedef struct PCII440FXState PCII440FXState;
diff --git a/hw/pcspk.c b/hw/pcspk.c
index 43df818..5bd9e32 100644
--- a/hw/pcspk.c
+++ b/hw/pcspk.c
@@ -28,6 +28,7 @@
 #include audio/audio.h
 #include qemu-timer.h
 #include i8254.h
+#include pcspk.h
 
 #define PCSPK_BUF_LEN 1792
 #define PCSPK_SAMPLE_RATE 32000
@@ -35,10 +36,13 @@
 #define PCSPK_MIN_COUNT ((PIT_FREQ + PCSPK_MAX_FREQ - 1) / PCSPK_MAX_FREQ)
 
 typedef struct {
+ISADevice dev;
+MemoryRegion ioport;
+uint32_t iobase;
 uint8_t sample_buf[PCSPK_BUF_LEN];
 QEMUSoundCard card;
 SWVoiceOut *voice;
-ISADevice *pit;
+void *pit;
 unsigned int pit_count;
 unsigned int samples;
 unsigned int play_pos;
@@ -47,7 +51,7 @@ typedef struct {
 } PCSpkState;
 
 static const char *s_spk = pcspk;
-static PCSpkState pcspk_state;
+static PCSpkState *pcspk_state;
 
 static inline void generate_samples(PCSpkState *s)
 {
@@ -99,7 +103,7 @@ static void pcspk_callback(void *opaque, int free)
 
 int pcspk_audio_init(ISABus *bus)
 {
-PCSpkState *s = pcspk_state;
+PCSpkState *s = pcspk_state;
 struct audsettings as = {PCSPK_SAMPLE_RATE, 1, AUD_FMT_U8, 0};
 
 AUD_register_card(s_spk, s-card);
@@ -113,7 +117,8 @@ int pcspk_audio_init(ISABus *bus)
 return 0;
 }
 
-static uint32_t pcspk_ioport_read(void *opaque, uint32_t addr)
+static uint64_t pcspk_io_read(void *opaque, target_phys_addr_t addr,
+  unsigned size)
 {
 PCSpkState *s = opaque;
 int out;
@@ -124,7 +129,8 @@ static uint32_t pcspk_ioport_read(void *opaque, uint32_t 
addr)
 return pit_get_gate(s-pit, 2) | (s-data_on  1) | 
s-dummy_refresh_clock | out;
 }
 
-static void pcspk_ioport_write(void *opaque, uint32_t addr, uint32_t val)
+static void pcspk_io_write(void *opaque, target_phys_addr_t addr, uint64_t val,
+   unsigned size)
 {
 PCSpkState *s = opaque;
 const int gate = val  1;
@@ -138,11 +144,47 @@ static void pcspk_ioport_write(void *opaque, uint32_t 
addr, uint32_t val)
 }
 }
 
-void pcspk_init(ISADevice *pit)
+static const MemoryRegionOps pcspk_io_ops = {
+.read = pcspk_io_read,
+.write = 

[PATCH v3 4/7] i8254: Pass alternative IRQ output object on initialization

2012-01-31 Thread Jan Kiszka
HPET legacy emulation will require control over the PIT IRQ output. To
enable this, add support for an alternative IRQ output object to the PIT
factory function. If the isa_irq number is  0, this object will be
used.

This also removes the IRQ number property from the PIT class as we now
use a generic GPIO output pin that is connected by the factory function.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/alpha_dp264.c   |2 +-
 hw/i82378.c|2 +-
 hw/i8254.c |4 +---
 hw/i8254.h |6 --
 hw/mips_fulong2e.c |2 +-
 hw/mips_jazz.c |2 +-
 hw/mips_malta.c|2 +-
 hw/mips_r4k.c  |2 +-
 hw/pc.c|2 +-
 9 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/hw/alpha_dp264.c b/hw/alpha_dp264.c
index 4c0efd3..ea0fd95 100644
--- a/hw/alpha_dp264.c
+++ b/hw/alpha_dp264.c
@@ -73,7 +73,7 @@ static void clipper_init(ram_addr_t ram_size,
clipper_pci_map_irq);
 
 rtc_init(isa_bus, 1980, rtc_irq);
-pit_init(isa_bus, 0x40, 0);
+pit_init(isa_bus, 0x40, 0, NULL);
 isa_create_simple(isa_bus, i8042);
 
 /* VGA setup.  Don't bother loading the bios.  */
diff --git a/hw/i82378.c b/hw/i82378.c
index 652970e..127cadf 100644
--- a/hw/i82378.c
+++ b/hw/i82378.c
@@ -192,7 +192,7 @@ static void i82378_init(DeviceState *dev, I82378State *s)
 isa_bus_irqs(isabus, s-i8259);
 
 /* 1 82C54 (pit) */
-pit = pit_init(isabus, 0x40, 0);
+pit = pit_init(isabus, 0x40, 0, NULL);
 
 /* speaker */
 pcspk_init(pit);
diff --git a/hw/i8254.c b/hw/i8254.c
index df83880..326211a 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -57,7 +57,6 @@ typedef struct PITChannelState {
 typedef struct PITState {
 ISADevice dev;
 MemoryRegion ioports;
-uint32_t irq;
 uint32_t iobase;
 PITChannelState channels[3];
 } PITState;
@@ -532,7 +531,7 @@ static int pit_initfn(ISADevice *dev)
 s = pit-channels[0];
 /* the timer 0 is connected to an IRQ */
 s-irq_timer = qemu_new_timer_ns(vm_clock, pit_irq_timer, s);
-s-irq = isa_get_irq(dev, pit-irq);
+qdev_init_gpio_out(dev-qdev, s-irq, 1);
 
 memory_region_init_io(pit-ioports, pit_ioport_ops, pit, pit, 4);
 isa_register_ioport(dev, pit-ioports, pit-iobase);
@@ -556,7 +555,6 @@ static DeviceInfo pit_info = {
 .no_user  = 1,
 .class_init  = pit_class_initfn,
 .props = (Property[]) {
-DEFINE_PROP_UINT32(irq, PITState, irq,  -1),
 DEFINE_PROP_HEX32(iobase, PITState, iobase,  -1),
 DEFINE_PROP_END_OF_LIST(),
 },
diff --git a/hw/i8254.h b/hw/i8254.h
index cd3111c..fc64a63 100644
--- a/hw/i8254.h
+++ b/hw/i8254.h
@@ -30,14 +30,16 @@
 
 #define PIT_FREQ 1193182
 
-static inline ISADevice *pit_init(ISABus *bus, int base, int irq)
+static inline ISADevice *pit_init(ISABus *bus, int base, int isa_irq,
+  qemu_irq alt_irq)
 {
 ISADevice *dev;
 
 dev = isa_create(bus, isa-pit);
 qdev_prop_set_uint32(dev-qdev, iobase, base);
-qdev_prop_set_uint32(dev-qdev, irq, irq);
 qdev_init_nofail(dev-qdev);
+qdev_connect_gpio_out(dev-qdev, 0,
+  isa_irq = 0 ? isa_get_irq(dev, isa_irq) : alt_irq);
 
 return dev;
 }
diff --git a/hw/mips_fulong2e.c b/hw/mips_fulong2e.c
index ead72ae..e3ba9dd 100644
--- a/hw/mips_fulong2e.c
+++ b/hw/mips_fulong2e.c
@@ -364,7 +364,7 @@ static void mips_fulong2e_init(ram_addr_t ram_size, const 
char *boot_device,
 smbus_eeprom_init(smbus, 1, eeprom_spd, sizeof(eeprom_spd));
 
 /* init other devices */
-pit = pit_init(isa_bus, 0x40, 0);
+pit = pit_init(isa_bus, 0x40, 0, NULL);
 cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
 DMA_init(0, cpu_exit_irq);
 
diff --git a/hw/mips_jazz.c b/hw/mips_jazz.c
index 61dee4d..b61b218 100644
--- a/hw/mips_jazz.c
+++ b/hw/mips_jazz.c
@@ -192,7 +192,7 @@ static void mips_jazz_init(MemoryRegion *address_space,
 isa_bus_irqs(isa_bus, i8259);
 cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
 DMA_init(0, cpu_exit_irq);
-pit = pit_init(isa_bus, 0x40, 0);
+pit = pit_init(isa_bus, 0x40, 0, NULL);
 pcspk_init(pit);
 
 /* ISA IO space at 0x9000 */
diff --git a/hw/mips_malta.c b/hw/mips_malta.c
index 32f9f65..f611701 100644
--- a/hw/mips_malta.c
+++ b/hw/mips_malta.c
@@ -970,7 +970,7 @@ void mips_malta_init (ram_addr_t ram_size,
   isa_get_irq(NULL, 9), NULL, NULL, 0);
 /* TODO: Populate SPD eeprom data.  */
 smbus_eeprom_init(smbus, 8, NULL, 0);
-pit = pit_init(isa_bus, 0x40, 0);
+pit = pit_init(isa_bus, 0x40, 0, NULL);
 cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
 DMA_init(0, cpu_exit_irq);
 
diff --git a/hw/mips_r4k.c b/hw/mips_r4k.c
index 1b3ec2d..83401f0 100644
--- a/hw/mips_r4k.c
+++ b/hw/mips_r4k.c
@@ -270,7 +270,7 @@ void mips_r4k_init (ram_addr_t ram_size,
 isa_mmio_init(0x1400, 0x0001);
 

[PATCH v3 7/7] i8254: Factor out pit_get_channel_info

2012-01-31 Thread Jan Kiszka
Instead of providing 4 individual query functions for mode, gate, output
and initial counter state, introduce a service that queries all
information at once. This comes with tiny additional costs for
pcspk_callback but with a much cleaner interface. Also, it will simplify
the implementation of the KVM in-kernel PIT model.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/i8254.c |   35 ++-
 hw/i8254.h |   12 
 hw/pcspk.c |   16 +++-
 3 files changed, 29 insertions(+), 34 deletions(-)

diff --git a/hw/i8254.c b/hw/i8254.c
index 6412277..ff253bb 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -90,7 +90,7 @@ static int pit_get_count(PITChannelState *s)
 }
 
 /* get pit output bit */
-static int pit_get_out1(PITChannelState *s, int64_t current_time)
+static int pit_get_out(PITChannelState *s, int64_t current_time)
 {
 uint64_t d;
 int out;
@@ -122,13 +122,6 @@ static int pit_get_out1(PITChannelState *s, int64_t 
current_time)
 return out;
 }
 
-int pit_get_out(ISADevice *dev, int channel, int64_t current_time)
-{
-PITState *pit = DO_UPCAST(PITState, dev, dev);
-PITChannelState *s = pit-channels[channel];
-return pit_get_out1(s, current_time);
-}
-
 /* return -1 if no transition will occur.  */
 static int64_t pit_get_next_transition_time(PITChannelState *s,
 int64_t current_time)
@@ -215,25 +208,15 @@ void pit_set_gate(ISADevice *dev, int channel, int val)
 s-gate = val;
 }
 
-int pit_get_gate(ISADevice *dev, int channel)
-{
-PITState *pit = DO_UPCAST(PITState, dev, dev);
-PITChannelState *s = pit-channels[channel];
-return s-gate;
-}
-
-int pit_get_initial_count(ISADevice *dev, int channel)
+void pit_get_channel_info(ISADevice *dev, int channel, PITChannelInfo *info)
 {
 PITState *pit = DO_UPCAST(PITState, dev, dev);
 PITChannelState *s = pit-channels[channel];
-return s-count;
-}
 
-int pit_get_mode(ISADevice *dev, int channel)
-{
-PITState *pit = DO_UPCAST(PITState, dev, dev);
-PITChannelState *s = pit-channels[channel];
-return s-mode;
+info-gate = s-gate;
+info-mode = s-mode;
+info-initial_count = s-count;
+info-out = pit_get_out(s, qemu_get_clock_ns(vm_clock));
 }
 
 static inline void pit_load_count(PITChannelState *s, int val)
@@ -274,7 +257,9 @@ static void pit_ioport_write(void *opaque, uint32_t addr, 
uint32_t val)
 if (!(val  0x10)  !s-status_latched) {
 /* status latch */
 /* XXX: add BCD and null count */
-s-status =  (pit_get_out1(s, 
qemu_get_clock_ns(vm_clock))  7) |
+s-status =
+(pit_get_out(s,
+ qemu_get_clock_ns(vm_clock))  7) |
 (s-rw_mode  4) |
 (s-mode  1) |
 s-bcd;
@@ -381,7 +366,7 @@ static void pit_irq_timer_update(PITChannelState *s, 
int64_t current_time)
 return;
 }
 expire_time = pit_get_next_transition_time(s, current_time);
-irq_level = pit_get_out1(s, current_time);
+irq_level = pit_get_out(s, current_time);
 qemu_set_irq(s-irq, irq_level);
 #ifdef DEBUG_PIT
 printf(irq_level=%d next_delay=%f\n,
diff --git a/hw/i8254.h b/hw/i8254.h
index 8ad8e07..a1d2e98 100644
--- a/hw/i8254.h
+++ b/hw/i8254.h
@@ -30,6 +30,13 @@
 
 #define PIT_FREQ 1193182
 
+typedef struct PITChannelInfo {
+int gate;
+int mode;
+int initial_count;
+int out;
+} PITChannelInfo;
+
 static inline ISADevice *pit_init(ISABus *bus, int base, int isa_irq,
   qemu_irq alt_irq)
 {
@@ -45,9 +52,6 @@ static inline ISADevice *pit_init(ISABus *bus, int base, int 
isa_irq,
 }
 
 void pit_set_gate(ISADevice *dev, int channel, int val);
-int pit_get_gate(ISADevice *dev, int channel);
-int pit_get_initial_count(ISADevice *dev, int channel);
-int pit_get_mode(ISADevice *dev, int channel);
-int pit_get_out(ISADevice *dev, int channel, int64_t current_time);
+void pit_get_channel_info(ISADevice *dev, int channel, PITChannelInfo *info);
 
 #endif /* !HW_I8254_H */
diff --git a/hw/pcspk.c b/hw/pcspk.c
index 5bd9e32..834fb0c 100644
--- a/hw/pcspk.c
+++ b/hw/pcspk.c
@@ -75,12 +75,16 @@ static inline void generate_samples(PCSpkState *s)
 static void pcspk_callback(void *opaque, int free)
 {
 PCSpkState *s = opaque;
+PITChannelInfo ch;
 unsigned int n;
 
-if (pit_get_mode(s-pit, 2) != 3)
+pit_get_channel_info(s-pit, 2, ch);
+
+if (ch.mode != 3) {
 return;
+}
 
-n = pit_get_initial_count(s-pit, 2);
+n = ch.initial_count;
 /* avoid frequencies that are not reproducible with sample rate */
 if (n  PCSPK_MIN_COUNT)
 n = 0;
@@ -121,12 +125,14 @@ static uint64_t pcspk_io_read(void *opaque, 
target_phys_addr_t addr,
   unsigned size)

[PATCH v3 0/7] pit, hpet, pcspk: fixes preparation for KVM

2012-01-31 Thread Jan Kiszka
This is a preparatory series to allow the introduction of the KVM
in-kernel PIT. It also fixes various bugs in the PIT and HPET code,
see patches for details.

Changes in V3:
 - rebased over master
 - tuned pic_init interface to avoid isa_get_irq(NULL, ...)

Jan Kiszka (7):
  i8254: Do not raise IRQ level on reset
  hpet: Save/restore cached RTC IRQ level
  i8254: Factor out interface header
  i8254: Pass alternative IRQ output object on initialization
  i8254: Rework  fix interaction with HPET in legacy mode
  pcspk: Convert to qdev
  i8254: Factor out pit_get_channel_info

 arch_init.c|1 +
 hw/alpha_dp264.c   |3 +-
 hw/hpet.c  |   65 ++--
 hw/hpet_emul.h |3 ++
 hw/i82378.c|6 ++-
 hw/i8254.c |   92 ++-
 hw/i8254.h |   57 
 hw/mips_fulong2e.c |3 +-
 hw/mips_jazz.c |6 ++-
 hw/mips_malta.c|3 +-
 hw/mips_r4k.c  |3 +-
 hw/pc.c|   19 --
 hw/pc.h|   29 
 hw/pcspk.c |   79 
 hw/pcspk.h |   45 +
 15 files changed, 288 insertions(+), 126 deletions(-)
 create mode 100644 hw/i8254.h
 create mode 100644 hw/pcspk.h

-- 
1.7.3.4

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 5/7] i8254: Rework fix interaction with HPET in legacy mode

2012-01-31 Thread Jan Kiszka
When the HPET enters legacy mode, the IRQ output of the PIT is
suppressed and replaced by the HPET timer 0. But the current code to
emulate this was broken in many ways. It reset the PIT state after
re-enabling, it worked against a stale static PIT structure, and it did
not properly saved/restored the IRQ output mask in the PIT vmstate.

This patch solves the PIT IRQ control in a different way. On x86, it
both redirects the PIT IRQ to the HPET, just like the RTC. But it also
keeps the control line from the HPET to the PIT. This allows to disable
the PIT QEMU timer when it is not needed. The PIT's view on the control
line state is now saved in the same format that qemu-kvm is already
using.

Note that, in contrast to the suppressed RTC IRQ line, we do not need to
save/restore the PIT line state in the HPET. As we trigger a PIT IRQ
update via the control line, the line state is reconstructed on mode
switch.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/hpet.c  |   38 --
 hw/hpet_emul.h |3 +++
 hw/i8254.c |   46 ++
 hw/i8254.h |3 ---
 hw/pc.c|   15 ---
 5 files changed, 57 insertions(+), 48 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index c921ec9..9fbbd47 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -65,6 +65,7 @@ typedef struct HPETState {
 qemu_irq irqs[HPET_NUM_IRQ_ROUTES];
 uint32_t flags;
 uint8_t rtc_irq_level;
+qemu_irq pit_enabled;
 uint8_t num_timers;
 HPETTimer timer[HPET_MAX_TIMERS];
 
@@ -573,12 +574,15 @@ static void hpet_ram_write(void *opaque, 
target_phys_addr_t addr,
 hpet_del_timer(s-timer[i]);
 }
 }
-/* i8254 and RTC are disabled when HPET is in legacy mode */
+/* i8254 and RTC output pins are disabled
+ * when HPET is in legacy mode */
 if (activating_bit(old_val, new_val, HPET_CFG_LEGACY)) {
-hpet_pit_disable();
+qemu_set_irq(s-pit_enabled, 0);
+qemu_irq_lower(s-irqs[0]);
 qemu_irq_lower(s-irqs[RTC_ISA_IRQ]);
 } else if (deactivating_bit(old_val, new_val, HPET_CFG_LEGACY)) {
-hpet_pit_enable();
+qemu_irq_lower(s-irqs[0]);
+qemu_set_irq(s-pit_enabled, 1);
 qemu_set_irq(s-irqs[RTC_ISA_IRQ], s-rtc_irq_level);
 }
 break;
@@ -632,7 +636,6 @@ static void hpet_reset(DeviceState *d)
 {
 HPETState *s = FROM_SYSBUS(HPETState, sysbus_from_qdev(d));
 int i;
-static int count = 0;
 
 for (i = 0; i  s-num_timers; i++) {
 HPETTimer *timer = s-timer[i];
@@ -649,29 +652,27 @@ static void hpet_reset(DeviceState *d)
 timer-wrap_flag = 0;
 }
 
+qemu_set_irq(s-pit_enabled, 1);
 s-hpet_counter = 0ULL;
 s-hpet_offset = 0ULL;
 s-config = 0ULL;
-if (count  0) {
-/* we don't enable pit when hpet_reset is first called (by hpet_init)
- * because hpet is taking over for pit here. On subsequent invocations,
- * hpet_reset is called due to system reset. At this point control must
- * be returned to pit until SW reenables hpet.
- */
-hpet_pit_enable();
-}
 hpet_cfg.hpet[s-hpet_id].event_timer_block_id = (uint32_t)s-capability;
 hpet_cfg.hpet[s-hpet_id].address = sysbus_from_qdev(d)-mmio[0].addr;
-count = 1;
 }
 
-static void hpet_handle_rtc_irq(void *opaque, int n, int level)
+static void hpet_handle_legacy_irq(void *opaque, int n, int level)
 {
 HPETState *s = FROM_SYSBUS(HPETState, opaque);
 
-s-rtc_irq_level = level;
-if (!hpet_in_legacy_mode(s)) {
-qemu_set_irq(s-irqs[RTC_ISA_IRQ], level);
+if (n == HPET_LEGACY_PIT_INT) {
+if (!hpet_in_legacy_mode(s)) {
+qemu_set_irq(s-irqs[0], level);
+}
+} else {
+s-rtc_irq_level = level;
+if (!hpet_in_legacy_mode(s)) {
+qemu_set_irq(s-irqs[RTC_ISA_IRQ], level);
+}
 }
 }
 
@@ -714,7 +715,8 @@ static int hpet_init(SysBusDevice *dev)
 s-capability |= (s-num_timers - 1)  HPET_ID_NUM_TIM_SHIFT;
 s-capability |= ((HPET_CLK_PERIOD)  32);
 
-qdev_init_gpio_in(dev-qdev, hpet_handle_rtc_irq, 1);
+qdev_init_gpio_in(dev-qdev, hpet_handle_legacy_irq, 2);
+qdev_init_gpio_out(dev-qdev, s-pit_enabled, 1);
 
 /* HPET Area */
 memory_region_init_io(s-iomem, hpet_ram_ops, s, hpet, 0x400);
diff --git a/hw/hpet_emul.h b/hw/hpet_emul.h
index 6128702..757f79f 100644
--- a/hw/hpet_emul.h
+++ b/hw/hpet_emul.h
@@ -22,6 +22,9 @@
 
 #define HPET_NUM_IRQ_ROUTES 32
 
+#define HPET_LEGACY_PIT_INT 0
+#define HPET_LEGACY_RTC_INT 1
+
 #define HPET_CFG_ENABLE 0x001
 #define HPET_CFG_LEGACY 0x002
 
diff --git a/hw/i8254.c b/hw/i8254.c
index 326211a..6412277 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -52,6 +52,7 @@ typedef struct PITChannelState {
 

[PATCH v3 3/7] i8254: Factor out interface header

2012-01-31 Thread Jan Kiszka
Move the public interface of the PIT into its own header file and update
all users.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/alpha_dp264.c   |1 +
 hw/hpet.c  |1 +
 hw/i82378.c|1 +
 hw/i8254.c |1 +
 hw/i8254.h |   54 
 hw/mips_fulong2e.c |1 +
 hw/mips_jazz.c |1 +
 hw/mips_malta.c|1 +
 hw/mips_r4k.c  |1 +
 hw/pc.c|1 +
 hw/pc.h|   25 
 hw/pcspk.c |1 +
 12 files changed, 64 insertions(+), 25 deletions(-)
 create mode 100644 hw/i8254.h

diff --git a/hw/alpha_dp264.c b/hw/alpha_dp264.c
index 876335a..4c0efd3 100644
--- a/hw/alpha_dp264.c
+++ b/hw/alpha_dp264.c
@@ -14,6 +14,7 @@
 #include sysemu.h
 #include mc146818rtc.h
 #include ide.h
+#include i8254.h
 
 #define MAX_IDE_BUS 2
 
diff --git a/hw/hpet.c b/hw/hpet.c
index 39b291f..c921ec9 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -31,6 +31,7 @@
 #include hpet_emul.h
 #include sysbus.h
 #include mc146818rtc.h
+#include i8254.h
 
 //#define HPET_DEBUG
 #ifdef HPET_DEBUG
diff --git a/hw/i82378.c b/hw/i82378.c
index 99b453a..652970e 100644
--- a/hw/i82378.c
+++ b/hw/i82378.c
@@ -19,6 +19,7 @@
 
 #include pci.h
 #include pc.h
+#include i8254.h
 
 //#define DEBUG_I82378
 
diff --git a/hw/i8254.c b/hw/i8254.c
index d4a08d7..df83880 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -25,6 +25,7 @@
 #include pc.h
 #include isa.h
 #include qemu-timer.h
+#include i8254.h
 
 //#define DEBUG_PIT
 
diff --git a/hw/i8254.h b/hw/i8254.h
new file mode 100644
index 000..cd3111c
--- /dev/null
+++ b/hw/i8254.h
@@ -0,0 +1,54 @@
+/*
+ * QEMU 8253/8254 interval timer emulation
+ *
+ * Copyright (c) 2003-2004 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the Software), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef HW_I8254_H
+#define HW_I8254_H
+
+#include hw.h
+#include isa.h
+
+#define PIT_FREQ 1193182
+
+static inline ISADevice *pit_init(ISABus *bus, int base, int irq)
+{
+ISADevice *dev;
+
+dev = isa_create(bus, isa-pit);
+qdev_prop_set_uint32(dev-qdev, iobase, base);
+qdev_prop_set_uint32(dev-qdev, irq, irq);
+qdev_init_nofail(dev-qdev);
+
+return dev;
+}
+
+void pit_set_gate(ISADevice *dev, int channel, int val);
+int pit_get_gate(ISADevice *dev, int channel);
+int pit_get_initial_count(ISADevice *dev, int channel);
+int pit_get_mode(ISADevice *dev, int channel);
+int pit_get_out(ISADevice *dev, int channel, int64_t current_time);
+
+void hpet_pit_disable(void);
+void hpet_pit_enable(void);
+
+#endif /* !HW_I8254_H */
diff --git a/hw/mips_fulong2e.c b/hw/mips_fulong2e.c
index 163a668..ead72ae 100644
--- a/hw/mips_fulong2e.c
+++ b/hw/mips_fulong2e.c
@@ -40,6 +40,7 @@
 #include elf.h
 #include vt82c686.h
 #include mc146818rtc.h
+#include i8254.h
 #include blockdev.h
 #include exec-memory.h
 
diff --git a/hw/mips_jazz.c b/hw/mips_jazz.c
index 63165b9..61dee4d 100644
--- a/hw/mips_jazz.c
+++ b/hw/mips_jazz.c
@@ -36,6 +36,7 @@
 #include mips-bios.h
 #include loader.h
 #include mc146818rtc.h
+#include i8254.h
 #include blockdev.h
 #include sysbus.h
 #include exec-memory.h
diff --git a/hw/mips_malta.c b/hw/mips_malta.c
index 64603ce..32f9f65 100644
--- a/hw/mips_malta.c
+++ b/hw/mips_malta.c
@@ -45,6 +45,7 @@
 #include loader.h
 #include elf.h
 #include mc146818rtc.h
+#include i8254.h
 #include blockdev.h
 #include exec-memory.h
 #include sysbus.h /* SysBusDevice */
diff --git a/hw/mips_r4k.c b/hw/mips_r4k.c
index 1c0615c..1b3ec2d 100644
--- a/hw/mips_r4k.c
+++ b/hw/mips_r4k.c
@@ -22,6 +22,7 @@
 #include loader.h
 #include elf.h
 #include mc146818rtc.h
+#include i8254.h
 #include blockdev.h
 #include exec-memory.h
 
diff --git a/hw/pc.c b/hw/pc.c
index 31608d3..080a731 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -36,6 +36,7 @@
 #include elf.h
 #include multiboot.h
 #include mc146818rtc.h
+#include i8254.h
 #include msi.h
 #include sysbus.h
 #include sysemu.h

[PATCH uq/master] kvm: Implement kvm_irqchip_in_kernel like kvm_enabled

2012-01-31 Thread Jan Kiszka
To both avoid that kvm_irqchip_in_kernel always has to be paired with
kvm_enabled and that the former ends up in a function call, implement it
like the latter. This means keeping the state in a global variable and
defining kvm_irqchip_in_kernel as a preprocessor macro.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/pc.c   |4 ++--
 hw/pc_piix.c  |6 +++---
 kvm-all.c |   13 -
 kvm-stub.c|5 -
 kvm.h |8 +---
 target-i386/kvm.c |4 ++--
 6 files changed, 16 insertions(+), 24 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index f6a91a9..d298dc2 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -889,7 +889,7 @@ static DeviceState *apic_init(void *env, uint8_t apic_id)
 DeviceState *dev;
 static int apic_mapped;
 
-if (kvm_enabled()  kvm_irqchip_in_kernel()) {
+if (kvm_irqchip_in_kernel()) {
 dev = qdev_create(NULL, kvm-apic);
 } else {
 dev = qdev_create(NULL, apic);
@@ -908,7 +908,7 @@ static DeviceState *apic_init(void *env, uint8_t apic_id)
 }
 
 /* KVM does not support MSI yet. */
-if (!kvm_enabled() || !kvm_irqchip_in_kernel()) {
+if (!kvm_irqchip_in_kernel()) {
 msi_supported = true;
 }
 
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index a285ad2..70b40bc 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -104,7 +104,7 @@ static void ioapic_init(GSIState *gsi_state)
 SysBusDevice *d;
 unsigned int i;
 
-if (kvm_enabled()  kvm_irqchip_in_kernel()) {
+if (kvm_irqchip_in_kernel()) {
 dev = qdev_create(NULL, kvm-ioapic);
 } else {
 dev = qdev_create(NULL, ioapic);
@@ -183,7 +183,7 @@ static void pc_init1(MemoryRegion *system_memory,
 }
 
 gsi_state = g_malloc0(sizeof(*gsi_state));
-if (kvm_enabled()  kvm_irqchip_in_kernel()) {
+if (kvm_irqchip_in_kernel()) {
 kvm_piix3_setup_irq_routing(pci_enabled);
 gsi = qemu_allocate_irqs(kvm_piix3_gsi_handler, gsi_state,
  GSI_NUM_PINS);
@@ -209,7 +209,7 @@ static void pc_init1(MemoryRegion *system_memory,
 }
 isa_bus_irqs(isa_bus, gsi);
 
-if (kvm_enabled()  kvm_irqchip_in_kernel()) {
+if (kvm_irqchip_in_kernel()) {
 i8259 = kvm_i8259_init(isa_bus);
 } else if (xen_enabled()) {
 i8259 = xen_interrupt_controller_init();
diff --git a/kvm-all.c b/kvm-all.c
index 7d4e544..3f2460f 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -74,7 +74,6 @@ struct KVMState
 #ifdef KVM_CAP_SET_GUEST_DEBUG
 struct kvm_sw_breakpoint_head kvm_sw_breakpoints;
 #endif
-int irqchip_in_kernel;
 int pit_in_kernel;
 int xsave, xcrs;
 int many_ioeventfds;
@@ -88,6 +87,7 @@ struct KVMState
 };
 
 KVMState *kvm_state;
+bool kvm_kernel_irqchip;
 
 static const KVMCapabilityInfo kvm_required_capabilites[] = {
 KVM_CAP_INFO(USER_MEMORY),
@@ -193,11 +193,6 @@ static void kvm_reset_vcpu(void *opaque)
 kvm_arch_reset_vcpu(env);
 }
 
-int kvm_irqchip_in_kernel(void)
-{
-return kvm_state-irqchip_in_kernel;
-}
-
 int kvm_pit_in_kernel(void)
 {
 return kvm_state-pit_in_kernel;
@@ -742,7 +737,7 @@ int kvm_irqchip_set_irq(KVMState *s, int irq, int level)
 struct kvm_irq_level event;
 int ret;
 
-assert(s-irqchip_in_kernel);
+assert(kvm_irqchip_in_kernel());
 
 event.level = level;
 event.irq = irq;
@@ -862,7 +857,7 @@ static int kvm_irqchip_create(KVMState *s)
 if (kvm_check_extension(s, KVM_CAP_IRQ_INJECT_STATUS)) {
 s-irqchip_inject_ioctl = KVM_IRQ_LINE_STATUS;
 }
-s-irqchip_in_kernel = 1;
+kvm_kernel_irqchip = true;
 
 kvm_init_irq_routing(s);
 
@@ -1311,7 +1306,7 @@ int kvm_has_gsi_routing(void)
 
 int kvm_allows_irq0_override(void)
 {
-return !kvm_enabled() || !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
+return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
 }
 
 void kvm_setup_guest_memory(void *start, size_t size)
diff --git a/kvm-stub.c b/kvm-stub.c
index 6c2b06b..f63a0d2 100644
--- a/kvm-stub.c
+++ b/kvm-stub.c
@@ -16,11 +16,6 @@
 #include gdbstub.h
 #include kvm.h
 
-int kvm_irqchip_in_kernel(void)
-{
-return 0;
-}
-
 int kvm_pit_in_kernel(void)
 {
 return 0;
diff --git a/kvm.h b/kvm.h
index 40b5ffc..f9f1dc8 100644
--- a/kvm.h
+++ b/kvm.h
@@ -23,11 +23,14 @@
 #endif
 
 extern int kvm_allowed;
+extern bool kvm_kernel_irqchip;
 
 #if defined CONFIG_KVM || !defined NEED_CPU_H
-#define kvm_enabled() (kvm_allowed)
+#define kvm_enabled()   (kvm_allowed)
+#define kvm_irqchip_in_kernel() (kvm_kernel_irqchip)
 #else
-#define kvm_enabled() (0)
+#define kvm_enabled()   (0)
+#define kvm_irqchip_in_kernel() (false)
 #endif
 
 struct kvm_run;
@@ -80,7 +83,6 @@ int kvm_set_signal_mask(CPUState *env, const sigset_t 
*sigset);
 #endif
 
 int kvm_pit_in_kernel(void);
-int kvm_irqchip_in_kernel(void);
 
 int kvm_on_sigbus_vcpu(CPUState *env, int code, void *addr);
 int kvm_on_sigbus(int code, void *addr);
diff --git a/target-i386/kvm.c 

[PATCH 4/4] kvm: x86: Add user space part for in-kernel i8254

2012-01-31 Thread Jan Kiszka
This provides the required user space stubs to enable the in-kernel
i8254 emulation of KVM.

The in-kernel model supports lost tick compensation according to the
delay policy. This is enabled by default and can be switched off via a
device property.

Depending on the feature set of the host kernel (before 2.6.32), we may
have to disable the HPET or lack sound output from the PC speaker.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 Makefile.target |2 +-
 hw/i8254.h  |   11 +++
 hw/kvm/i8254.c  |  237 +++
 hw/pc.c |   14 +++-
 4 files changed, 261 insertions(+), 3 deletions(-)
 create mode 100644 hw/kvm/i8254.c

diff --git a/Makefile.target b/Makefile.target
index 68481a3..2af5511 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -235,7 +235,7 @@ obj-i386-y += vmport.o
 obj-i386-y += pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
 obj-i386-y += pc_piix.o
-obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o kvm/i8259.o kvm/ioapic.o
+obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o kvm/i8259.o kvm/ioapic.o 
kvm/i8254.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
diff --git a/hw/i8254.h b/hw/i8254.h
index a1d2e98..ba6b598 100644
--- a/hw/i8254.h
+++ b/hw/i8254.h
@@ -51,6 +51,17 @@ static inline ISADevice *pit_init(ISABus *bus, int base, int 
isa_irq,
 return dev;
 }
 
+static inline ISADevice *kvm_pit_init(ISABus *bus, int base)
+{
+ISADevice *dev;
+
+dev = isa_create(bus, kvm-pit);
+qdev_prop_set_uint32(dev-qdev, iobase, base);
+qdev_init_nofail(dev-qdev);
+
+return dev;
+}
+
 void pit_set_gate(ISADevice *dev, int channel, int val);
 void pit_get_channel_info(ISADevice *dev, int channel, PITChannelInfo *info);
 
diff --git a/hw/kvm/i8254.c b/hw/kvm/i8254.c
new file mode 100644
index 000..0d7196a
--- /dev/null
+++ b/hw/kvm/i8254.c
@@ -0,0 +1,237 @@
+/*
+ * KVM in-kernel PIT (i8254) support
+ *
+ * Copyright (c) 2012 Siemens AG
+ *
+ * Authors:
+ *  Jan Kiszka  jan.kis...@siemens.com
+ *
+ * This work is licensed under the terms of the GNU GPL version 2.
+ * See the COPYING file in the top-level directory.
+ */
+#include qemu-timer.h
+#include hw/i8254.h
+#include hw/i8254_internal.h
+#include kvm.h
+
+#define KVM_PIT_REINJECT_BIT 0
+
+typedef struct KVMPITState {
+PITCommonState pit;
+LostTickPolicy lost_tick_policy;
+} KVMPITState;
+
+static void kvm_pit_get(PITCommonState *s)
+{
+struct kvm_pit_state2 kpit;
+struct kvm_pit_channel_state *kchan;
+struct PITChannelState *sc;
+int i, ret;
+
+if (kvm_has_pit_state2()) {
+ret = kvm_vm_ioctl(kvm_state, KVM_GET_PIT2, kpit);
+if (ret  0) {
+fprintf(stderr, KVM_GET_PIT2 failed: %s\n, strerror(ret));
+abort();
+}
+s-channels[0].irq_disabled = kpit.flags  KVM_PIT_FLAGS_HPET_LEGACY;
+} else {
+/*
+ * kvm_pit_state2 is superset of kvm_pit_state struct,
+ * so we can use it for KVM_GET_PIT as well.
+ */
+ret = kvm_vm_ioctl(kvm_state, KVM_GET_PIT, kpit);
+if (ret  0) {
+fprintf(stderr, KVM_GET_PIT failed: %s\n, strerror(ret));
+abort();
+}
+}
+for (i = 0; i  3; i++) {
+kchan = kpit.channels[i];
+sc = s-channels[i];
+sc-count = kchan-count;
+sc-latched_count = kchan-latched_count;
+sc-count_latched = kchan-count_latched;
+sc-status_latched = kchan-status_latched;
+sc-status = kchan-status;
+sc-read_state = kchan-read_state;
+sc-write_state = kchan-write_state;
+sc-write_latch = kchan-write_latch;
+sc-rw_mode = kchan-rw_mode;
+sc-mode = kchan-mode;
+sc-bcd = kchan-bcd;
+sc-gate = kchan-gate;
+sc-count_load_time = kchan-count_load_time;
+}
+
+sc = s-channels[0];
+sc-next_transition_time =
+pit_get_next_transition_time(sc, sc-count_load_time);
+}
+
+static void kvm_pit_put(PITCommonState *s)
+{
+struct kvm_pit_state2 kpit;
+struct kvm_pit_channel_state *kchan;
+struct PITChannelState *sc;
+int i, ret;
+
+kpit.flags = s-channels[0].irq_disabled ? KVM_PIT_FLAGS_HPET_LEGACY : 0;
+for (i = 0; i  3; i++) {
+kchan = kpit.channels[i];
+sc = s-channels[i];
+kchan-count = sc-count;
+kchan-latched_count = sc-latched_count;
+kchan-count_latched = sc-count_latched;
+kchan-status_latched = sc-status_latched;
+kchan-status = sc-status;
+kchan-read_state = sc-read_state;
+kchan-write_state = sc-write_state;
+kchan-write_latch = sc-write_latch;
+kchan-rw_mode = sc-rw_mode;
+kchan-mode = sc-mode;
+kchan-bcd = sc-bcd;
+kchan-gate = sc-gate;
+kchan-count_load_time = sc-count_load_time;
+}
+
+ret = kvm_vm_ioctl(kvm_state,
+   

[PATCH 1/4] i8254: Factor out base class for KVM reuse

2012-01-31 Thread Jan Kiszka
Applying the concept used for the *PICs once again: establish a base
class for the i8254 that can be used both by the current user space
emulation and the upcoming KVM in-kernel version. We share most of the
public interface of the i8254, specifically to the pcspk, vmstate, reset
and certain init parts.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 Makefile.objs   |2 +-
 hw/i8254.c  |  277 ++---
 hw/i8254_common.c   |  311 +++
 hw/i8254_internal.h |   87 ++
 4 files changed, 434 insertions(+), 243 deletions(-)
 create mode 100644 hw/i8254_common.c
 create mode 100644 hw/i8254_internal.h

diff --git a/Makefile.objs b/Makefile.objs
index b942625..6a733fa 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -210,7 +210,7 @@ hw-obj-$(CONFIG_EMPTY_SLOT) += empty_slot.o
 
 hw-obj-$(CONFIG_SERIAL) += serial.o
 hw-obj-$(CONFIG_PARALLEL) += parallel.o
-hw-obj-$(CONFIG_I8254) += i8254.o
+hw-obj-$(CONFIG_I8254) += i8254_common.o i8254.o
 hw-obj-$(CONFIG_PCSPK) += pcspk.o
 hw-obj-$(CONFIG_PCKBD) += pckbd.o
 hw-obj-$(CONFIG_USB_UHCI) += usb-uhci.o
diff --git a/hw/i8254.c b/hw/i8254.c
index ff253bb..90e9a87 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -26,6 +26,7 @@
 #include isa.h
 #include qemu-timer.h
 #include i8254.h
+#include i8254_internal.h
 
 //#define DEBUG_PIT
 
@@ -34,34 +35,6 @@
 #define RW_STATE_WORD0 3
 #define RW_STATE_WORD1 4
 
-typedef struct PITChannelState {
-int count; /* can be 65536 */
-uint16_t latched_count;
-uint8_t count_latched;
-uint8_t status_latched;
-uint8_t status;
-uint8_t read_state;
-uint8_t write_state;
-uint8_t write_latch;
-uint8_t rw_mode;
-uint8_t mode;
-uint8_t bcd; /* not supported */
-uint8_t gate; /* timer start */
-int64_t count_load_time;
-/* irq handling */
-int64_t next_transition_time;
-QEMUTimer *irq_timer;
-qemu_irq irq;
-uint32_t irq_disabled;
-} PITChannelState;
-
-typedef struct PITState {
-ISADevice dev;
-MemoryRegion ioports;
-uint32_t iobase;
-PITChannelState channels[3];
-} PITState;
-
 static void pit_irq_timer_update(PITChannelState *s, int64_t current_time);
 
 static int pit_get_count(PITChannelState *s)
@@ -89,99 +62,11 @@ static int pit_get_count(PITChannelState *s)
 return counter;
 }
 
-/* get pit output bit */
-static int pit_get_out(PITChannelState *s, int64_t current_time)
-{
-uint64_t d;
-int out;
-
-d = muldiv64(current_time - s-count_load_time, PIT_FREQ,
- get_ticks_per_sec());
-switch(s-mode) {
-default:
-case 0:
-out = (d = s-count);
-break;
-case 1:
-out = (d  s-count);
-break;
-case 2:
-if ((d % s-count) == 0  d != 0)
-out = 1;
-else
-out = 0;
-break;
-case 3:
-out = (d % s-count)  ((s-count + 1)  1);
-break;
-case 4:
-case 5:
-out = (d == s-count);
-break;
-}
-return out;
-}
-
-/* return -1 if no transition will occur.  */
-static int64_t pit_get_next_transition_time(PITChannelState *s,
-int64_t current_time)
-{
-uint64_t d, next_time, base;
-int period2;
-
-d = muldiv64(current_time - s-count_load_time, PIT_FREQ,
- get_ticks_per_sec());
-switch(s-mode) {
-default:
-case 0:
-case 1:
-if (d  s-count)
-next_time = s-count;
-else
-return -1;
-break;
-case 2:
-base = (d / s-count) * s-count;
-if ((d - base) == 0  d != 0)
-next_time = base + s-count;
-else
-next_time = base + s-count + 1;
-break;
-case 3:
-base = (d / s-count) * s-count;
-period2 = ((s-count + 1)  1);
-if ((d - base)  period2)
-next_time = base + period2;
-else
-next_time = base + s-count;
-break;
-case 4:
-case 5:
-if (d  s-count)
-next_time = s-count;
-else if (d == s-count)
-next_time = s-count + 1;
-else
-return -1;
-break;
-}
-/* convert to timer units */
-next_time = s-count_load_time + muldiv64(next_time, get_ticks_per_sec(),
-  PIT_FREQ);
-/* fix potential rounding problems */
-/* XXX: better solution: use a clock at PIT_FREQ Hz */
-if (next_time = current_time)
-next_time = current_time + 1;
-return next_time;
-}
-
 /* val must be 0 or 1 */
-void pit_set_gate(ISADevice *dev, int channel, int val)
+static void pit_set_channel_gate(PITCommonState *s, PITChannelState *sc,
+ int val)
 {
-PITState *pit = DO_UPCAST(PITState, dev, dev);
-PITChannelState *s = pit-channels[channel];
-
-switch(s-mode) {
+switch (sc-mode) {
 default:
 

[PATCH 3/4] kvm: Add kvm_has_pit_state2 helper

2012-01-31 Thread Jan Kiszka
To be used for in-kernel PIT emulation.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 kvm-all.c  |   10 ++
 kvm-stub.c |5 +
 kvm.h  |1 +
 3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 3f2460f..8df5235 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -75,6 +75,7 @@ struct KVMState
 struct kvm_sw_breakpoint_head kvm_sw_breakpoints;
 #endif
 int pit_in_kernel;
+int pit_state2;
 int xsave, xcrs;
 int many_ioeventfds;
 int irqchip_inject_ioctl;
@@ -954,6 +955,10 @@ int kvm_init(void)
 s-xcrs = kvm_check_extension(s, KVM_CAP_XCRS);
 #endif
 
+#ifdef KVM_CAP_PIT_STATE2
+s-pit_state2 = kvm_check_extension(s, KVM_CAP_PIT_STATE2);
+#endif
+
 ret = kvm_arch_init(s);
 if (ret  0) {
 goto err;
@@ -1291,6 +1296,11 @@ int kvm_has_xcrs(void)
 return kvm_state-xcrs;
 }
 
+int kvm_has_pit_state2(void)
+{
+return kvm_state-pit_state2;
+}
+
 int kvm_has_many_ioeventfds(void)
 {
 if (!kvm_enabled()) {
diff --git a/kvm-stub.c b/kvm-stub.c
index f63a0d2..1f1c686 100644
--- a/kvm-stub.c
+++ b/kvm-stub.c
@@ -78,6 +78,11 @@ int kvm_allows_irq0_override(void)
 return 1;
 }
 
+int kvm_has_pit_state2(void)
+{
+return 0;
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
 }
diff --git a/kvm.h b/kvm.h
index f9f1dc8..8ef4476 100644
--- a/kvm.h
+++ b/kvm.h
@@ -54,6 +54,7 @@ int kvm_has_robust_singlestep(void);
 int kvm_has_debugregs(void);
 int kvm_has_xsave(void);
 int kvm_has_xcrs(void);
+int kvm_has_pit_state2(void);
 int kvm_has_many_ioeventfds(void);
 int kvm_has_gsi_routing(void);
 
-- 
1.7.3.4

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/4] uq/master: Introduce KVM PIT support

2012-01-31 Thread Jan Kiszka
This adds another piece of qemu-kvm to upstream: The accelerated
in-kernel model of the i8254. It does this in the same fashion as the
interrupt controllers were already introduced. And it even has one bug
less than qemu-kvm: PC speaker output still works with KVM acceleration
enabled.

The patches depend on the follow upstream or uq/master patches:
- pit, hpet, pcspk: fixes  preparation for KVM (v7, 7 patches)
  http://thread.gmane.org/gmane.comp.emulators.qemu/134393
- Generic tick reinjection control (2 patches)
  http://thread.gmane.org/gmane.comp.emulators.qemu/133306
- kvm: Implement kvm_irqchip_in_kernel
  http://thread.gmane.org/gmane.comp.emulators.qemu/134399

The series plus dependencies is also available at

git://git.kiszka.org/qemu-kvm.git queues/kvm-pit

Please review / merge.

Jan Kiszka (4):
  i8254: Factor out base class for KVM reuse
  i8254: Open-code timer restore
  kvm: Add kvm_has_pit_state2 helper
  kvm: x86: Add user space part for in-kernel i8254

 Makefile.objs   |2 +-
 Makefile.target |2 +-
 hw/i8254.c  |  289 ---
 hw/i8254.h  |   11 ++
 hw/i8254_common.c   |  311 +++
 hw/i8254_internal.h |   87 ++
 hw/kvm/i8254.c  |  237 +++
 hw/pc.c |   14 ++-
 kvm-all.c   |   10 ++
 kvm-stub.c  |5 +
 kvm.h   |1 +
 11 files changed, 723 insertions(+), 246 deletions(-)
 create mode 100644 hw/i8254_common.c
 create mode 100644 hw/i8254_internal.h
 create mode 100644 hw/kvm/i8254.c

-- 
1.7.3.4

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/4] i8254: Open-code timer restore

2012-01-31 Thread Jan Kiszka
Same as for the APIC: To enable migration between accelerated and
non-accelerated models, we need to arm the channel 0 timer only inside
the emulated PIT model. The common code just saves/restores that timer
to the the next_transition_time field.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/i8254.c|   12 
 hw/i8254_common.c |6 +++---
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/hw/i8254.c b/hw/i8254.c
index 90e9a87..9020845 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -300,6 +300,17 @@ static const MemoryRegionOps pit_ioport_ops = {
 .old_portio = pit_portio
 };
 
+static void pit_post_load(PITCommonState *s)
+{
+PITChannelState *sc = s-channels[0];
+
+if (sc-next_transition_time != -1) {
+qemu_mod_timer(sc-irq_timer, sc-next_transition_time);
+} else {
+qemu_del_timer(sc-irq_timer);
+}
+}
+
 static int pit_initfn(PITCommonState *pit)
 {
 PITChannelState *s;
@@ -323,6 +334,7 @@ static void pit_class_init(ObjectClass *klass, void *data)
 k-init = pit_initfn;
 k-set_channel_gate = pit_set_channel_gate;
 k-get_channel_info = pit_get_channel_info_common;
+k-post_load= pit_post_load;
 }
 
 static DeviceInfo pit_info = {
diff --git a/hw/i8254_common.c b/hw/i8254_common.c
index 6ff91fb..371c3ed 100644
--- a/hw/i8254_common.c
+++ b/hw/i8254_common.c
@@ -234,9 +234,8 @@ static int pit_load_old(QEMUFile *f, void *opaque, int 
version_id)
 qemu_get_8s(f, s-gate);
 s-count_load_time = qemu_get_be64(f);
 s-irq_disabled = 0;
-if (s-irq_timer) {
+if (i == 0) {
 s-next_transition_time = qemu_get_be64(f);
-qemu_get_timer(f, s-irq_timer);
 }
 }
 return 0;
@@ -275,7 +274,8 @@ static const VMStateDescription vmstate_pit_common = {
 VMSTATE_UINT32_V(channels[0].irq_disabled, PITCommonState, 3),
 VMSTATE_STRUCT_ARRAY(channels, PITCommonState, 3, 2,
  vmstate_pit_channel, PITChannelState),
-VMSTATE_TIMER(channels[0].irq_timer, PITCommonState),
+VMSTATE_INT64(channels[0].next_transition_time,
+  PITCommonState), /* formerly irq_timer */
 VMSTATE_END_OF_LIST()
 }
 };
-- 
1.7.3.4

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/9] pci-assign: Proper initialization for MSI-X table

2012-01-31 Thread Alex Williamson
On Tue, 2012-01-31 at 19:40 +0200, Michael S. Tsirkin wrote:
 On Sat, Jan 28, 2012 at 07:22:09AM -0700, Alex Williamson wrote:
  Per the PCI spec, all vectors should be masked at handoff.
  
  Signed-off-by: Alex Williamson alex.william...@redhat.com
  ---
  
   hw/device-assignment.c |   20 +++-
   1 files changed, 19 insertions(+), 1 deletions(-)
  
  diff --git a/hw/device-assignment.c b/hw/device-assignment.c
  index af614d3..6efa219 100644
  --- a/hw/device-assignment.c
  +++ b/hw/device-assignment.c
  @@ -1462,6 +1462,22 @@ static const MemoryRegionOps msix_mmio_ops = {
   },
   };
   
  +static void msix_reset(AssignedDevice *dev)
  +{
  +MSIXTableEntry *entry;
  +int i;
  +
  +if (!dev-msix_table) {
  +return;
  +}
  +
  +memset(dev-msix_table, 0, 0x1000);
  +
  +for (i = 0, entry = dev-msix_table; i  dev-msix_max; i++, entry++) {
  +entry-ctrl = 0x1; /* Masked */
 
 This is broken for BE hosts.

Show me a BE host that even remotely works with this device assignment
implementation.  Thanks,

Alex

  +}
  +}
  +
   static int assigned_dev_register_msix_mmio(AssignedDevice *dev)
   {
   dev-msix_table = mmap(NULL, 0x1000, PROT_READ|PROT_WRITE,
  @@ -1470,7 +1486,9 @@ static int 
  assigned_dev_register_msix_mmio(AssignedDevice *dev)
   fprintf(stderr, fail allocate msix_table! %s\n, strerror(errno));
   return -EFAULT;
   }
  -memset(dev-msix_table, 0, 0x1000);
  +
  +msix_reset(dev);
  +
   memory_region_init_io(dev-mmio, msix_mmio_ops, dev,
 assigned-dev-msix, MSIX_PAGE_SIZE);
   return 0;
  
  --
  To unsubscribe from this list: send the line unsubscribe kvm in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/9] pci-assign: Proper initialization for MSI-X table

2012-01-31 Thread Michael S. Tsirkin
On Sat, Jan 28, 2012 at 07:22:09AM -0700, Alex Williamson wrote:
 Per the PCI spec, all vectors should be masked at handoff.
 
 Signed-off-by: Alex Williamson alex.william...@redhat.com
 ---
 
  hw/device-assignment.c |   20 +++-
  1 files changed, 19 insertions(+), 1 deletions(-)
 
 diff --git a/hw/device-assignment.c b/hw/device-assignment.c
 index af614d3..6efa219 100644
 --- a/hw/device-assignment.c
 +++ b/hw/device-assignment.c
 @@ -1462,6 +1462,22 @@ static const MemoryRegionOps msix_mmio_ops = {
  },
  };
  
 +static void msix_reset(AssignedDevice *dev)
 +{
 +MSIXTableEntry *entry;
 +int i;
 +
 +if (!dev-msix_table) {
 +return;
 +}
 +
 +memset(dev-msix_table, 0, 0x1000);
 +
 +for (i = 0, entry = dev-msix_table; i  dev-msix_max; i++, entry++) {
 +entry-ctrl = 0x1; /* Masked */

This is broken for BE hosts.

 +}
 +}
 +
  static int assigned_dev_register_msix_mmio(AssignedDevice *dev)
  {
  dev-msix_table = mmap(NULL, 0x1000, PROT_READ|PROT_WRITE,
 @@ -1470,7 +1486,9 @@ static int 
 assigned_dev_register_msix_mmio(AssignedDevice *dev)
  fprintf(stderr, fail allocate msix_table! %s\n, strerror(errno));
  return -EFAULT;
  }
 -memset(dev-msix_table, 0, 0x1000);
 +
 +msix_reset(dev);
 +
  memory_region_init_io(dev-mmio, msix_mmio_ops, dev,
assigned-dev-msix, MSIX_PAGE_SIZE);
  return 0;
 
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/9] pci-assign: Use struct for MSI-X table

2012-01-31 Thread Michael S. Tsirkin
On Sat, Jan 28, 2012 at 07:21:58AM -0700, Alex Williamson wrote:
 This makes access much easier.
 
 Signed-off-by: Alex Williamson alex.william...@redhat.com

Yes but this also makes it easier to forget
to handle endian-ness.
How about using pci_get/pci_set instead?

 ---
 
  hw/device-assignment.c |   55 
 ++--
  hw/device-assignment.h |9 +++-
  2 files changed, 33 insertions(+), 31 deletions(-)
 
 diff --git a/hw/device-assignment.c b/hw/device-assignment.c
 index 1fc7ffa..422ee00 100644
 --- a/hw/device-assignment.c
 +++ b/hw/device-assignment.c
 @@ -973,10 +973,9 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
 *pci_dev)
  AssignedDevice *adev = DO_UPCAST(AssignedDevice, dev, pci_dev);
  uint16_t entries_nr = 0, entries_max_nr;
  int pos = 0, i, r = 0;
 -uint32_t msg_addr, msg_upper_addr, msg_data, msg_ctrl;
  struct kvm_assigned_msix_nr msix_nr;
  struct kvm_assigned_msix_entry msix_entry;
 -void *va = adev-msix_table_page;
 +MSIXTableEntry *entry = adev-msix_table;
  
  pos = pci_find_capability(pci_dev, PCI_CAP_ID_MSIX);
  
 @@ -985,12 +984,11 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
 *pci_dev)
  entries_max_nr += 1;
  
  /* Get the usable entry number for allocating */
 -for (i = 0; i  entries_max_nr; i++) {
 -memcpy(msg_ctrl, va + i * 16 + 12, 4);
 -memcpy(msg_data, va + i * 16 + 8, 4);
 +for (i = 0; i  entries_max_nr; i++, entry++) {
  /* Ignore unused entry even it's unmasked */
 -if (msg_data == 0)
 +if (entry-data == 0) {
  continue;
 +}
  entries_nr ++;
  }
  
 @@ -1013,16 +1011,13 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
 *pci_dev)
  
  msix_entry.assigned_dev_id = msix_nr.assigned_dev_id;
  entries_nr = 0;
 -for (i = 0; i  entries_max_nr; i++) {
 +entry = adev-msix_table;
 +for (i = 0; i  entries_max_nr; i++, entry++) {
  if (entries_nr = msix_nr.entry_nr)
  break;
 -memcpy(msg_ctrl, va + i * 16 + 12, 4);
 -memcpy(msg_data, va + i * 16 + 8, 4);
 -if (msg_data == 0)
 +if (entry-data == 0) {
  continue;
 -
 -memcpy(msg_addr, va + i * 16, 4);
 -memcpy(msg_upper_addr, va + i * 16 + 4, 4);
 +}
  
  r = kvm_get_irq_route_gsi();
  if (r  0)
 @@ -1031,10 +1026,11 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
 *pci_dev)
  adev-entry[entries_nr].gsi = r;
  adev-entry[entries_nr].type = KVM_IRQ_ROUTING_MSI;
  adev-entry[entries_nr].flags = 0;
 -adev-entry[entries_nr].u.msi.address_lo = msg_addr;
 -adev-entry[entries_nr].u.msi.address_hi = msg_upper_addr;
 -adev-entry[entries_nr].u.msi.data = msg_data;
 -DEBUG(MSI-X data 0x%x, MSI-X addr_lo 0x%x\n!, msg_data, msg_addr);
 +adev-entry[entries_nr].u.msi.address_lo = entry-addr_lo;
 +adev-entry[entries_nr].u.msi.address_hi = entry-addr_hi;
 +adev-entry[entries_nr].u.msi.data = entry-data;
 +DEBUG(MSI-X data 0x%x, MSI-X addr_lo 0x%x\n!,
 +  entry-data, entry-addr_lo);
   kvm_add_routing_entry(adev-entry[entries_nr]);
  
  msix_entry.gsi = adev-entry[entries_nr].gsi;
 @@ -1443,7 +1439,7 @@ static uint64_t msix_mmio_read(void *opaque, 
 target_phys_addr_t addr,
  AssignedDevice *adev = opaque;
  uint64_t val;
  
 -memcpy(val, (void *)((uint8_t *)adev-msix_table_page + addr), size);
 +memcpy(val, (void *)((uint8_t *)adev-msix_table + addr), size);
  
  return val;
  }
 @@ -1456,7 +1452,7 @@ static void msix_mmio_write(void *opaque, 
 target_phys_addr_t addr,
  DEBUG(write to MSI-X entry table mmio offset 0x%lx, val 0x%lx\n,
addr, val);
  
 -memcpy((void *)((uint8_t *)adev-msix_table_page + addr), val, size);
 +memcpy((void *)((uint8_t *)adev-msix_table + addr), val, size);
  }
  
  static const MemoryRegionOps msix_mmio_ops = {
 @@ -1471,15 +1467,13 @@ static const MemoryRegionOps msix_mmio_ops = {
  
  static int assigned_dev_register_msix_mmio(AssignedDevice *dev)
  {
 -dev-msix_table_page = mmap(NULL, 0x1000,
 -PROT_READ|PROT_WRITE,
 -MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
 -if (dev-msix_table_page == MAP_FAILED) {
 -fprintf(stderr, fail allocate msix_table_page! %s\n,
 -strerror(errno));
 +dev-msix_table = mmap(NULL, 0x1000, PROT_READ|PROT_WRITE,
 +   MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
 +if (dev-msix_table == MAP_FAILED) {
 +fprintf(stderr, fail allocate msix_table! %s\n, strerror(errno));
  return -EFAULT;
  }
 -memset(dev-msix_table_page, 0, 0x1000);
 +memset(dev-msix_table, 0, 0x1000);
  memory_region_init_io(dev-mmio, msix_mmio_ops, dev,
assigned-dev-msix, MSIX_PAGE_SIZE);
  

Re: [PATCH v2] KVM: Fix assigned device MSI-X entry setting leak

2012-01-31 Thread Michael S. Tsirkin
On Mon, Jan 30, 2012 at 02:05:54PM -0700, Alex Williamson wrote:
 We need to prioritize our matching when setting MSI-X vector
 entries.  Unused entries should only be used if we don't find
 an exact match or else we risk duplicating entries.  This was
 causing an ENOSPC return when trying to mask and unmask MSI-X
 vectors based on guest MSI-X table updates.
 
 The new KVM_CAP_DEVICE_MSIX_MASK extension indicates the
 presence of this fix.
 
 Signed-off-by: Alex Williamson alex.william...@redhat.com
 ---
 
 v2: Add capability indicating MSIX_MASKing now works.
 
 The faulting sequence went something like:
 
 Start:
 [0] entry 0, vector A
 [1] entry 1, vector B
 [2] entry 2, vector C
 
 Set entry 1 to 0:
 [0] entry 0, vector A
 [1] entry 1-1, vector B-0
 [2] entry 2, vector C
 
 Set entry 2 to 0:
 [0] entry 0, vector A
 [1] entry 1-2, vector 0-0 - incorrectly matched
 [2] entry 2, vector C
 
 Set entry 2 to C:
 [0] entry 0, vector A
 [1] entry 2-2, vector 0-C - incorrectly matched again
 [2] entry 2, vector C
 
 Set entry 1 to B:
 [0] entry 0, vector A
 [1] entry 2, vector C
 [2] entry 2, vector C
 -ENOSPC
 
  Documentation/virtual/kvm/api.txt |5 +++--
  arch/x86/kvm/x86.c|1 +
  include/linux/kvm.h   |1 +
  virt/kvm/assigned-dev.c   |   21 ++---
  4 files changed, 19 insertions(+), 9 deletions(-)

I don't object to fixing this bug. But I don't think
we need a capability for this. Here's why:

setting an entry to zero is not really matching
what devices need, since that would lose interrupts.
What devices need is mask entries to disable them,
then unmask and get an interrupt if it was delayed.

So, if we are going to add a new capability, how about
sticking a mask bit in the padding instead?

As was shown in the past this can even improve performance since we can
propagate the mask bit back to the host device.


 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index e1d94bf..dd0b554 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -1304,8 +1304,9 @@ Type: vm ioctl
  Parameters: struct kvm_assigned_msix_entry (in)
  Returns: 0 on success, -1 on error
  
 -Specifies the routing of an MSI-X assigned device interrupt to a GSI. Setting
 -the GSI vector to zero means disabling the interrupt.
 +Specifies the routing of an MSI-X assigned device interrupt to a GSI. If
 +KVM_CAP_DEVICE_MSIX_MASK is available, setting the GSI vector to zero means
 +disabling the interrupt.
  
  struct kvm_assigned_msix_entry {
   __u32 assigned_dev_id;
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 14d6cad..a35255e 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -2057,6 +2057,7 @@ int kvm_dev_ioctl_check_extension(long ext)
   case KVM_CAP_XSAVE:
   case KVM_CAP_ASYNC_PF:
   case KVM_CAP_GET_TSC_KHZ:
 + case KVM_CAP_DEVICE_MSIX_MASK:
   r = 1;
   break;
   case KVM_CAP_COALESCED_MMIO:
 diff --git a/include/linux/kvm.h b/include/linux/kvm.h
 index 68e67e5..64438e6 100644
 --- a/include/linux/kvm.h
 +++ b/include/linux/kvm.h
 @@ -558,6 +558,7 @@ struct kvm_ppc_pvinfo {
  #define KVM_CAP_PPC_PAPR 68
  #define KVM_CAP_S390_GMAP 71
  #define KVM_CAP_TSC_DEADLINE_TIMER 72
 +#define KVM_CAP_DEVICE_MSIX_MASK 73
  
  #ifdef KVM_CAP_IRQ_ROUTING
  
 diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
 index 758e3b3..58f3f6b 100644
 --- a/virt/kvm/assigned-dev.c
 +++ b/virt/kvm/assigned-dev.c
 @@ -728,7 +728,7 @@ msix_nr_out:
  static int kvm_vm_ioctl_set_msix_entry(struct kvm *kvm,
  struct kvm_assigned_msix_entry *entry)
  {
 - int r = 0, i;
 + int r = 0, i, unused = -1;
   struct kvm_assigned_dev_kernel *adev;
  
   mutex_lock(kvm-lock);
 @@ -741,17 +741,24 @@ static int kvm_vm_ioctl_set_msix_entry(struct kvm *kvm,
   goto msix_entry_out;
   }
  
 - for (i = 0; i  adev-entries_nr; i++)
 - if (adev-guest_msix_entries[i].vector == 0 ||
 - adev-guest_msix_entries[i].entry == entry-entry) {
 + for (i = 0; i  adev-entries_nr; i++) {
 + if (adev-guest_msix_entries[i].entry == entry-entry) {
   adev-guest_msix_entries[i].entry = entry-entry;
   adev-guest_msix_entries[i].vector = entry-gsi;
   adev-host_msix_entries[i].entry = entry-entry;
   break;
 - }
 + } else if (unused  0  !adev-guest_msix_entries[i].vector)
 + unused = i;
 + }
 +
   if (i == adev-entries_nr) {
 - r = -ENOSPC;
 - goto msix_entry_out;
 + if (unused  0) {
 + r = -ENOSPC;
 + goto msix_entry_out;
 + }
 + adev-guest_msix_entries[unused].entry = entry-entry;
 + adev-guest_msix_entries[unused].vector = 

Re: [PATCH 6/9] pci-assign: Proper initialization for MSI-X table

2012-01-31 Thread Michael S. Tsirkin
On Tue, Jan 31, 2012 at 12:07:39PM -0700, Alex Williamson wrote:
 On Tue, 2012-01-31 at 19:40 +0200, Michael S. Tsirkin wrote:
  On Sat, Jan 28, 2012 at 07:22:09AM -0700, Alex Williamson wrote:
   Per the PCI spec, all vectors should be masked at handoff.
   
   Signed-off-by: Alex Williamson alex.william...@redhat.com
   ---
   
hw/device-assignment.c |   20 +++-
1 files changed, 19 insertions(+), 1 deletions(-)
   
   diff --git a/hw/device-assignment.c b/hw/device-assignment.c
   index af614d3..6efa219 100644
   --- a/hw/device-assignment.c
   +++ b/hw/device-assignment.c
   @@ -1462,6 +1462,22 @@ static const MemoryRegionOps msix_mmio_ops = {
},
};

   +static void msix_reset(AssignedDevice *dev)
   +{
   +MSIXTableEntry *entry;
   +int i;
   +
   +if (!dev-msix_table) {
   +return;
   +}
   +
   +memset(dev-msix_table, 0, 0x1000);
   +
   +for (i = 0, entry = dev-msix_table; i  dev-msix_max; i++, 
   entry++) {
   +entry-ctrl = 0x1; /* Masked */
  
  This is broken for BE hosts.
 
 Show me a BE host that even remotely works with this device assignment
 implementation.  Thanks,
 
 Alex

I don't get it. Yes lots of cleanup is needed but why add more broken
code?

   +}
   +}
   +
static int assigned_dev_register_msix_mmio(AssignedDevice *dev)
{
dev-msix_table = mmap(NULL, 0x1000, PROT_READ|PROT_WRITE,
   @@ -1470,7 +1486,9 @@ static int 
   assigned_dev_register_msix_mmio(AssignedDevice *dev)
fprintf(stderr, fail allocate msix_table! %s\n, 
   strerror(errno));
return -EFAULT;
}
   -memset(dev-msix_table, 0, 0x1000);
   +
   +msix_reset(dev);
   +
memory_region_init_io(dev-mmio, msix_mmio_ops, dev,
  assigned-dev-msix, MSIX_PAGE_SIZE);
return 0;
   
   --
   To unsubscribe from this list: send the line unsubscribe kvm in
   the body of a message to majord...@vger.kernel.org
   More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
 
 
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/9] pci-assign: Proper initialization for MSI-X table

2012-01-31 Thread Jan Kiszka
On 2012-01-31 20:12, Michael S. Tsirkin wrote:
 On Tue, Jan 31, 2012 at 12:07:39PM -0700, Alex Williamson wrote:
 On Tue, 2012-01-31 at 19:40 +0200, Michael S. Tsirkin wrote:
 On Sat, Jan 28, 2012 at 07:22:09AM -0700, Alex Williamson wrote:
 Per the PCI spec, all vectors should be masked at handoff.

 Signed-off-by: Alex Williamson alex.william...@redhat.com
 ---

  hw/device-assignment.c |   20 +++-
  1 files changed, 19 insertions(+), 1 deletions(-)

 diff --git a/hw/device-assignment.c b/hw/device-assignment.c
 index af614d3..6efa219 100644
 --- a/hw/device-assignment.c
 +++ b/hw/device-assignment.c
 @@ -1462,6 +1462,22 @@ static const MemoryRegionOps msix_mmio_ops = {
  },
  };
  
 +static void msix_reset(AssignedDevice *dev)
 +{
 +MSIXTableEntry *entry;
 +int i;
 +
 +if (!dev-msix_table) {
 +return;
 +}
 +
 +memset(dev-msix_table, 0, 0x1000);
 +
 +for (i = 0, entry = dev-msix_table; i  dev-msix_max; i++, entry++) 
 {
 +entry-ctrl = 0x1; /* Masked */

 This is broken for BE hosts.

 Show me a BE host that even remotely works with this device assignment
 implementation.  Thanks,

 Alex
 
 I don't get it. Yes lots of cleanup is needed but why add more broken
 code?

At some point, we will just do this via the PCI core, and that will have
to do it correctly anyway. This code here is supposed to be removed again.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/9] pci-assign: Use struct for MSI-X table

2012-01-31 Thread Alex Williamson
On Tue, 2012-01-31 at 19:40 +0200, Michael S. Tsirkin wrote:
 On Sat, Jan 28, 2012 at 07:21:58AM -0700, Alex Williamson wrote:
  This makes access much easier.
  
  Signed-off-by: Alex Williamson alex.william...@redhat.com
 
 Yes but this also makes it easier to forget
 to handle endian-ness.
 How about using pci_get/pci_set instead?

Do we really think existing device assignment is ever going to work on
anything but x86/x86?

Alex

  ---
  
   hw/device-assignment.c |   55 
  ++--
   hw/device-assignment.h |9 +++-
   2 files changed, 33 insertions(+), 31 deletions(-)
  
  diff --git a/hw/device-assignment.c b/hw/device-assignment.c
  index 1fc7ffa..422ee00 100644
  --- a/hw/device-assignment.c
  +++ b/hw/device-assignment.c
  @@ -973,10 +973,9 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
  *pci_dev)
   AssignedDevice *adev = DO_UPCAST(AssignedDevice, dev, pci_dev);
   uint16_t entries_nr = 0, entries_max_nr;
   int pos = 0, i, r = 0;
  -uint32_t msg_addr, msg_upper_addr, msg_data, msg_ctrl;
   struct kvm_assigned_msix_nr msix_nr;
   struct kvm_assigned_msix_entry msix_entry;
  -void *va = adev-msix_table_page;
  +MSIXTableEntry *entry = adev-msix_table;
   
   pos = pci_find_capability(pci_dev, PCI_CAP_ID_MSIX);
   
  @@ -985,12 +984,11 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
  *pci_dev)
   entries_max_nr += 1;
   
   /* Get the usable entry number for allocating */
  -for (i = 0; i  entries_max_nr; i++) {
  -memcpy(msg_ctrl, va + i * 16 + 12, 4);
  -memcpy(msg_data, va + i * 16 + 8, 4);
  +for (i = 0; i  entries_max_nr; i++, entry++) {
   /* Ignore unused entry even it's unmasked */
  -if (msg_data == 0)
  +if (entry-data == 0) {
   continue;
  +}
   entries_nr ++;
   }
   
  @@ -1013,16 +1011,13 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
  *pci_dev)
   
   msix_entry.assigned_dev_id = msix_nr.assigned_dev_id;
   entries_nr = 0;
  -for (i = 0; i  entries_max_nr; i++) {
  +entry = adev-msix_table;
  +for (i = 0; i  entries_max_nr; i++, entry++) {
   if (entries_nr = msix_nr.entry_nr)
   break;
  -memcpy(msg_ctrl, va + i * 16 + 12, 4);
  -memcpy(msg_data, va + i * 16 + 8, 4);
  -if (msg_data == 0)
  +if (entry-data == 0) {
   continue;
  -
  -memcpy(msg_addr, va + i * 16, 4);
  -memcpy(msg_upper_addr, va + i * 16 + 4, 4);
  +}
   
   r = kvm_get_irq_route_gsi();
   if (r  0)
  @@ -1031,10 +1026,11 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
  *pci_dev)
   adev-entry[entries_nr].gsi = r;
   adev-entry[entries_nr].type = KVM_IRQ_ROUTING_MSI;
   adev-entry[entries_nr].flags = 0;
  -adev-entry[entries_nr].u.msi.address_lo = msg_addr;
  -adev-entry[entries_nr].u.msi.address_hi = msg_upper_addr;
  -adev-entry[entries_nr].u.msi.data = msg_data;
  -DEBUG(MSI-X data 0x%x, MSI-X addr_lo 0x%x\n!, msg_data, 
  msg_addr);
  +adev-entry[entries_nr].u.msi.address_lo = entry-addr_lo;
  +adev-entry[entries_nr].u.msi.address_hi = entry-addr_hi;
  +adev-entry[entries_nr].u.msi.data = entry-data;
  +DEBUG(MSI-X data 0x%x, MSI-X addr_lo 0x%x\n!,
  +  entry-data, entry-addr_lo);
  kvm_add_routing_entry(adev-entry[entries_nr]);
   
   msix_entry.gsi = adev-entry[entries_nr].gsi;
  @@ -1443,7 +1439,7 @@ static uint64_t msix_mmio_read(void *opaque, 
  target_phys_addr_t addr,
   AssignedDevice *adev = opaque;
   uint64_t val;
   
  -memcpy(val, (void *)((uint8_t *)adev-msix_table_page + addr), size);
  +memcpy(val, (void *)((uint8_t *)adev-msix_table + addr), size);
   
   return val;
   }
  @@ -1456,7 +1452,7 @@ static void msix_mmio_write(void *opaque, 
  target_phys_addr_t addr,
   DEBUG(write to MSI-X entry table mmio offset 0x%lx, val 0x%lx\n,
 addr, val);
   
  -memcpy((void *)((uint8_t *)adev-msix_table_page + addr), val, size);
  +memcpy((void *)((uint8_t *)adev-msix_table + addr), val, size);
   }
   
   static const MemoryRegionOps msix_mmio_ops = {
  @@ -1471,15 +1467,13 @@ static const MemoryRegionOps msix_mmio_ops = {
   
   static int assigned_dev_register_msix_mmio(AssignedDevice *dev)
   {
  -dev-msix_table_page = mmap(NULL, 0x1000,
  -PROT_READ|PROT_WRITE,
  -MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
  -if (dev-msix_table_page == MAP_FAILED) {
  -fprintf(stderr, fail allocate msix_table_page! %s\n,
  -strerror(errno));
  +dev-msix_table = mmap(NULL, 0x1000, PROT_READ|PROT_WRITE,
  +   MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
  +if (dev-msix_table == MAP_FAILED) {
  +fprintf(stderr, fail allocate 

Re: [PATCH 4/9] pci-assign: Use struct for MSI-X table

2012-01-31 Thread Michael S. Tsirkin
On Tue, Jan 31, 2012 at 12:05:49PM -0700, Alex Williamson wrote:
 On Tue, 2012-01-31 at 19:40 +0200, Michael S. Tsirkin wrote:
  On Sat, Jan 28, 2012 at 07:21:58AM -0700, Alex Williamson wrote:
   This makes access much easier.
   
   Signed-off-by: Alex Williamson alex.william...@redhat.com
  
  Yes but this also makes it easier to forget
  to handle endian-ness.
  How about using pci_get/pci_set instead?
 
 Do we really think existing device assignment is ever going to work on
 anything but x86/x86?
 
 Alex

It's a question of whether anyone bothers to
fix all portability bugs. Besides endian-ness,
and msix vector decoding, there's not much that is
x86 specific there.

   ---
   
hw/device-assignment.c |   55 
   ++--
hw/device-assignment.h |9 +++-
2 files changed, 33 insertions(+), 31 deletions(-)
   
   diff --git a/hw/device-assignment.c b/hw/device-assignment.c
   index 1fc7ffa..422ee00 100644
   --- a/hw/device-assignment.c
   +++ b/hw/device-assignment.c
   @@ -973,10 +973,9 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
   *pci_dev)
AssignedDevice *adev = DO_UPCAST(AssignedDevice, dev, pci_dev);
uint16_t entries_nr = 0, entries_max_nr;
int pos = 0, i, r = 0;
   -uint32_t msg_addr, msg_upper_addr, msg_data, msg_ctrl;
struct kvm_assigned_msix_nr msix_nr;
struct kvm_assigned_msix_entry msix_entry;
   -void *va = adev-msix_table_page;
   +MSIXTableEntry *entry = adev-msix_table;

pos = pci_find_capability(pci_dev, PCI_CAP_ID_MSIX);

   @@ -985,12 +984,11 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
   *pci_dev)
entries_max_nr += 1;

/* Get the usable entry number for allocating */
   -for (i = 0; i  entries_max_nr; i++) {
   -memcpy(msg_ctrl, va + i * 16 + 12, 4);
   -memcpy(msg_data, va + i * 16 + 8, 4);
   +for (i = 0; i  entries_max_nr; i++, entry++) {
/* Ignore unused entry even it's unmasked */
   -if (msg_data == 0)
   +if (entry-data == 0) {
continue;
   +}
entries_nr ++;
}

   @@ -1013,16 +1011,13 @@ static int 
   assigned_dev_update_msix_mmio(PCIDevice *pci_dev)

msix_entry.assigned_dev_id = msix_nr.assigned_dev_id;
entries_nr = 0;
   -for (i = 0; i  entries_max_nr; i++) {
   +entry = adev-msix_table;
   +for (i = 0; i  entries_max_nr; i++, entry++) {
if (entries_nr = msix_nr.entry_nr)
break;
   -memcpy(msg_ctrl, va + i * 16 + 12, 4);
   -memcpy(msg_data, va + i * 16 + 8, 4);
   -if (msg_data == 0)
   +if (entry-data == 0) {
continue;
   -
   -memcpy(msg_addr, va + i * 16, 4);
   -memcpy(msg_upper_addr, va + i * 16 + 4, 4);
   +}

r = kvm_get_irq_route_gsi();
if (r  0)
   @@ -1031,10 +1026,11 @@ static int 
   assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
adev-entry[entries_nr].gsi = r;
adev-entry[entries_nr].type = KVM_IRQ_ROUTING_MSI;
adev-entry[entries_nr].flags = 0;
   -adev-entry[entries_nr].u.msi.address_lo = msg_addr;
   -adev-entry[entries_nr].u.msi.address_hi = msg_upper_addr;
   -adev-entry[entries_nr].u.msi.data = msg_data;
   -DEBUG(MSI-X data 0x%x, MSI-X addr_lo 0x%x\n!, msg_data, 
   msg_addr);
   +adev-entry[entries_nr].u.msi.address_lo = entry-addr_lo;
   +adev-entry[entries_nr].u.msi.address_hi = entry-addr_hi;
   +adev-entry[entries_nr].u.msi.data = entry-data;
   +DEBUG(MSI-X data 0x%x, MSI-X addr_lo 0x%x\n!,
   +  entry-data, entry-addr_lo);
 kvm_add_routing_entry(adev-entry[entries_nr]);

msix_entry.gsi = adev-entry[entries_nr].gsi;
   @@ -1443,7 +1439,7 @@ static uint64_t msix_mmio_read(void *opaque, 
   target_phys_addr_t addr,
AssignedDevice *adev = opaque;
uint64_t val;

   -memcpy(val, (void *)((uint8_t *)adev-msix_table_page + addr), 
   size);
   +memcpy(val, (void *)((uint8_t *)adev-msix_table + addr), size);

return val;
}
   @@ -1456,7 +1452,7 @@ static void msix_mmio_write(void *opaque, 
   target_phys_addr_t addr,
DEBUG(write to MSI-X entry table mmio offset 0x%lx, val 0x%lx\n,
  addr, val);

   -memcpy((void *)((uint8_t *)adev-msix_table_page + addr), val, 
   size);
   +memcpy((void *)((uint8_t *)adev-msix_table + addr), val, size);
}

static const MemoryRegionOps msix_mmio_ops = {
   @@ -1471,15 +1467,13 @@ static const MemoryRegionOps msix_mmio_ops = {

static int assigned_dev_register_msix_mmio(AssignedDevice *dev)
{
   -dev-msix_table_page = mmap(NULL, 0x1000,
   -PROT_READ|PROT_WRITE,
   -MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
   -

Re: Increase KVMSlots size

2012-01-31 Thread Marcelo Tosatti
On Tue, Jan 31, 2012 at 11:47:32AM +, Bhushan Bharat-R65777 wrote:
 Hi All,
 
 In my development I came across the state where I need more than 32 kvmslots
 
 QEMU/kvm-all.c:
 
 cut
 struct KVMState
 {
 KVMSlot slots[32];
   
 cut
 
 
 Followings are the solution I thought of:
 Solution 1: 
   Increase the slot to 64 in both KVM and QEMU. I think this will also 
 require to add a new ioctl call so that qemu can come to know on which KVM it 
 is running (whether it is running on 32 slots KVM or 64 slots KVM).

It already exists, its called KVM_CAP_NR_MEMSLOTS (implemented by x86).

 Other solutions:
   a) Decouple the qemu and kvm data structure. QEMU and KVM use some 
 variable size data structure or lists etc. QEMU can keep on adding as many as 
 it wants without knowing how much slots KVM supports.
   b) There are linear search in the KVM slots array in critical path of 
 code of KVM. Does this make some sense using RB-tree, heapsort, balanced 
 binary tree type of solutions?

See discussion at http://lwn.net/Articles/429308/.

x86 optimizes the case of a miss (GFN not mapped by any memslot)
by caching the result into a specially invalid pagetable entry (so 
an invalid fault optimizes away the need to walk all memslots).

It would be good to know how much of a problem an increased memslot
array is for PPC.

 
 Looking forward for suggestions.
 
 Thanks
 -Bharat
 
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/9] pci-assign: Only calculate maximum MSI-X vector entries once

2012-01-31 Thread Michael S. Tsirkin
On Sat, Jan 28, 2012 at 07:22:04AM -0700, Alex Williamson wrote:
 Signed-off-by: Alex Williamson alex.william...@redhat.com
 ---

Why? Optimization?

  hw/device-assignment.c |   17 +++--
  hw/device-assignment.h |1 +
  2 files changed, 8 insertions(+), 10 deletions(-)
 
 diff --git a/hw/device-assignment.c b/hw/device-assignment.c
 index 422ee00..af614d3 100644
 --- a/hw/device-assignment.c
 +++ b/hw/device-assignment.c
 @@ -971,20 +971,14 @@ static void assigned_dev_update_msi(PCIDevice *pci_dev)
  static int assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
  {
  AssignedDevice *adev = DO_UPCAST(AssignedDevice, dev, pci_dev);
 -uint16_t entries_nr = 0, entries_max_nr;
 -int pos = 0, i, r = 0;
 +uint16_t entries_nr = 0;
 +int i, r = 0;
  struct kvm_assigned_msix_nr msix_nr;
  struct kvm_assigned_msix_entry msix_entry;
  MSIXTableEntry *entry = adev-msix_table;
  
 -pos = pci_find_capability(pci_dev, PCI_CAP_ID_MSIX);
 -
 -entries_max_nr = *(uint16_t *)(pci_dev-config + pos + 2);
 -entries_max_nr = PCI_MSIX_FLAGS_QSIZE;
 -entries_max_nr += 1;
 -
  /* Get the usable entry number for allocating */
 -for (i = 0; i  entries_max_nr; i++, entry++) {
 +for (i = 0; i  adev-msix_max; i++, entry++) {
  /* Ignore unused entry even it's unmasked */
  if (entry-data == 0) {
  continue;
 @@ -1012,7 +1006,7 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
 *pci_dev)
  msix_entry.assigned_dev_id = msix_nr.assigned_dev_id;
  entries_nr = 0;
  entry = adev-msix_table;
 -for (i = 0; i  entries_max_nr; i++, entry++) {
 +for (i = 0; i  adev-msix_max; i++, entry++) {
  if (entries_nr = msix_nr.entry_nr)
  break;
  if (entry-data == 0) {
 @@ -1225,6 +1219,9 @@ static int assigned_device_pci_cap_init(PCIDevice 
 *pci_dev)
  bar_nr = msix_table_entry  PCI_MSIX_FLAGS_BIRMASK;
  msix_table_entry = ~PCI_MSIX_FLAGS_BIRMASK;
  dev-msix_table_addr = pci_region[bar_nr].base_addr + 
 msix_table_entry;
 +dev-msix_max = pci_get_word(pci_dev-config + pos + PCI_MSIX_FLAGS);
 +dev-msix_max = PCI_MSIX_FLAGS_QSIZE;
 +dev-msix_max += 1;
  }
  
  /* Minimal PM support, nothing writable, device appears to NAK changes */
 diff --git a/hw/device-assignment.h b/hw/device-assignment.h
 index e229303..a909fb2 100644
 --- a/hw/device-assignment.h
 +++ b/hw/device-assignment.h
 @@ -119,6 +119,7 @@ typedef struct AssignedDevice {
  struct kvm_irq_routing_entry *entry;
  MSIXTableEntry *msix_table;
  target_phys_addr_t msix_table_addr;
 +uint16_t msix_max;
  MemoryRegion mmio;
  char *configfd_name;
  int32_t bootindex;
 
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/9] pci-assign: Proper initialization for MSI-X table

2012-01-31 Thread Michael S. Tsirkin
On Tue, Jan 31, 2012 at 08:16:31PM +0100, Jan Kiszka wrote:
 On 2012-01-31 20:12, Michael S. Tsirkin wrote:
  On Tue, Jan 31, 2012 at 12:07:39PM -0700, Alex Williamson wrote:
  On Tue, 2012-01-31 at 19:40 +0200, Michael S. Tsirkin wrote:
  On Sat, Jan 28, 2012 at 07:22:09AM -0700, Alex Williamson wrote:
  Per the PCI spec, all vectors should be masked at handoff.
 
  Signed-off-by: Alex Williamson alex.william...@redhat.com
  ---
 
   hw/device-assignment.c |   20 +++-
   1 files changed, 19 insertions(+), 1 deletions(-)
 
  diff --git a/hw/device-assignment.c b/hw/device-assignment.c
  index af614d3..6efa219 100644
  --- a/hw/device-assignment.c
  +++ b/hw/device-assignment.c
  @@ -1462,6 +1462,22 @@ static const MemoryRegionOps msix_mmio_ops = {
   },
   };
   
  +static void msix_reset(AssignedDevice *dev)
  +{
  +MSIXTableEntry *entry;
  +int i;
  +
  +if (!dev-msix_table) {
  +return;
  +}
  +
  +memset(dev-msix_table, 0, 0x1000);
  +
  +for (i = 0, entry = dev-msix_table; i  dev-msix_max; i++, 
  entry++) {
  +entry-ctrl = 0x1; /* Masked */
 
  This is broken for BE hosts.
 
  Show me a BE host that even remotely works with this device assignment
  implementation.  Thanks,
 
  Alex
  
  I don't get it. Yes lots of cleanup is needed but why add more broken
  code?
 
 At some point, we will just do this via the PCI core, and that will have
 to do it correctly anyway. This code here is supposed to be removed again.
 
 Jan

Either we say this is all dead code, then
let's just fix bugs until it is rewritten.
To fix the bug, all we need is patch 9, maybe 6 and 7.

Or there is intent to maintain it going forward
then, methinks, it needs to be brought up to
some minimal standards.


 -- 
 Siemens AG, Corporate Technology, CT T DE IT 1
 Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/9] pci-assign: Only calculate maximum MSI-X vector entries once

2012-01-31 Thread Alex Williamson
On Tue, 2012-01-31 at 22:18 +0200, Michael S. Tsirkin wrote:
 On Sat, Jan 28, 2012 at 07:22:04AM -0700, Alex Williamson wrote:
  Signed-off-by: Alex Williamson alex.william...@redhat.com
  ---
 
 Why? Optimization?

Because in 6/9 we'd have to calculate it again for resetting the msix
table and in 9/9 we use it to drop writes outside of the valid vector
range.  I suspect we might actually want to call msix_reset on device
reset, so that means we'd be recalculating it every time we reset the
device, write to the msix table, and again if we're actually updating
msix entries.  Redundancy exceeded my cost of two bytes metric.  Thanks,

Alex

   hw/device-assignment.c |   17 +++--
   hw/device-assignment.h |1 +
   2 files changed, 8 insertions(+), 10 deletions(-)
  
  diff --git a/hw/device-assignment.c b/hw/device-assignment.c
  index 422ee00..af614d3 100644
  --- a/hw/device-assignment.c
  +++ b/hw/device-assignment.c
  @@ -971,20 +971,14 @@ static void assigned_dev_update_msi(PCIDevice 
  *pci_dev)
   static int assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
   {
   AssignedDevice *adev = DO_UPCAST(AssignedDevice, dev, pci_dev);
  -uint16_t entries_nr = 0, entries_max_nr;
  -int pos = 0, i, r = 0;
  +uint16_t entries_nr = 0;
  +int i, r = 0;
   struct kvm_assigned_msix_nr msix_nr;
   struct kvm_assigned_msix_entry msix_entry;
   MSIXTableEntry *entry = adev-msix_table;
   
  -pos = pci_find_capability(pci_dev, PCI_CAP_ID_MSIX);
  -
  -entries_max_nr = *(uint16_t *)(pci_dev-config + pos + 2);
  -entries_max_nr = PCI_MSIX_FLAGS_QSIZE;
  -entries_max_nr += 1;
  -
   /* Get the usable entry number for allocating */
  -for (i = 0; i  entries_max_nr; i++, entry++) {
  +for (i = 0; i  adev-msix_max; i++, entry++) {
   /* Ignore unused entry even it's unmasked */
   if (entry-data == 0) {
   continue;
  @@ -1012,7 +1006,7 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
  *pci_dev)
   msix_entry.assigned_dev_id = msix_nr.assigned_dev_id;
   entries_nr = 0;
   entry = adev-msix_table;
  -for (i = 0; i  entries_max_nr; i++, entry++) {
  +for (i = 0; i  adev-msix_max; i++, entry++) {
   if (entries_nr = msix_nr.entry_nr)
   break;
   if (entry-data == 0) {
  @@ -1225,6 +1219,9 @@ static int assigned_device_pci_cap_init(PCIDevice 
  *pci_dev)
   bar_nr = msix_table_entry  PCI_MSIX_FLAGS_BIRMASK;
   msix_table_entry = ~PCI_MSIX_FLAGS_BIRMASK;
   dev-msix_table_addr = pci_region[bar_nr].base_addr + 
  msix_table_entry;
  +dev-msix_max = pci_get_word(pci_dev-config + pos + 
  PCI_MSIX_FLAGS);
  +dev-msix_max = PCI_MSIX_FLAGS_QSIZE;
  +dev-msix_max += 1;
   }
   
   /* Minimal PM support, nothing writable, device appears to NAK changes 
  */
  diff --git a/hw/device-assignment.h b/hw/device-assignment.h
  index e229303..a909fb2 100644
  --- a/hw/device-assignment.h
  +++ b/hw/device-assignment.h
  @@ -119,6 +119,7 @@ typedef struct AssignedDevice {
   struct kvm_irq_routing_entry *entry;
   MSIXTableEntry *msix_table;
   target_phys_addr_t msix_table_addr;
  +uint16_t msix_max;
   MemoryRegion mmio;
   char *configfd_name;
   int32_t bootindex;
  
  --
  To unsubscribe from this list: send the line unsubscribe kvm in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH v3 6/7] pcspk: Convert to qdev

2012-01-31 Thread Anthony Liguori

On 01/31/2012 11:41 AM, Jan Kiszka wrote:

Convert the PC speaker device to a qdev ISA model. Move the public
interface to a dedicated header file at this chance.

Signed-off-by: Jan Kiszkajan.kis...@siemens.com


Heh, I did this too more or less the same way.  Some comments below:


---
  arch_init.c|1 +
  hw/i82378.c|3 +-
  hw/mips_jazz.c |3 +-
  hw/pc.c|3 +-
  hw/pc.h|4 ---
  hw/pcspk.c |   62 ++-
  hw/pcspk.h |   45 
  7 files changed, 104 insertions(+), 17 deletions(-)
  create mode 100644 hw/pcspk.h

diff --git a/arch_init.c b/arch_init.c
index 2366511..a45485b 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -42,6 +42,7 @@
  #include gdbstub.h
  #include hw/smbios.h
  #include exec-memory.h
+#include hw/pcspk.h

  #ifdef TARGET_SPARC
  int graphic_width = 1024;
diff --git a/hw/i82378.c b/hw/i82378.c
index 127cadf..79fa8b7 100644
--- a/hw/i82378.c
+++ b/hw/i82378.c
@@ -20,6 +20,7 @@
  #include pci.h
  #include pc.h
  #include i8254.h
+#include pcspk.h

  //#define DEBUG_I82378

@@ -195,7 +196,7 @@ static void i82378_init(DeviceState *dev, I82378State *s)
  pit = pit_init(isabus, 0x40, 0, NULL);

  /* speaker */
-pcspk_init(pit);
+pcspk_init(isabus, pit);

  /* 2 82C37 (dma) */
  DMA_init(1,s-out[1]);
diff --git a/hw/mips_jazz.c b/hw/mips_jazz.c
index b61b218..65608dc 100644
--- a/hw/mips_jazz.c
+++ b/hw/mips_jazz.c
@@ -37,6 +37,7 @@
  #include loader.h
  #include mc146818rtc.h
  #include i8254.h
+#include pcspk.h
  #include blockdev.h
  #include sysbus.h
  #include exec-memory.h
@@ -193,7 +194,7 @@ static void mips_jazz_init(MemoryRegion *address_space,
  cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
  DMA_init(0, cpu_exit_irq);
  pit = pit_init(isa_bus, 0x40, 0, NULL);
-pcspk_init(pit);
+pcspk_init(isa_bus, pit);

  /* ISA IO space at 0x9000 */
  isa_mmio_init(0x9000, 0x0100);
diff --git a/hw/pc.c b/hw/pc.c
index 6fb1de7..f6a91a9 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -37,6 +37,7 @@
  #include multiboot.h
  #include mc146818rtc.h
  #include i8254.h
+#include pcspk.h
  #include msi.h
  #include sysbus.h
  #include sysemu.h
@@ -1169,7 +1170,7 @@ void pc_basic_device_init(ISABus *isa_bus, qemu_irq *gsi,
  /* connect PIT to output control line of the HPET */
  qdev_connect_gpio_out(hpet, 0, qdev_get_gpio_in(pit-qdev, 0));
  }
-pcspk_init(pit);
+pcspk_init(isa_bus, pit);

  for(i = 0; i  MAX_SERIAL_PORTS; i++) {
  if (serial_hds[i]) {
diff --git a/hw/pc.h b/hw/pc.h
index b08708d..1b47bbd 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -149,10 +149,6 @@ void piix4_smbus_register_device(SMBusDevice *dev, uint8_t 
addr);
  /* hpet.c */
  extern int no_hpet;

-/* pcspk.c */
-void pcspk_init(ISADevice *pit);
-int pcspk_audio_init(ISABus *bus);
-
  /* piix_pci.c */
  struct PCII440FXState;
  typedef struct PCII440FXState PCII440FXState;
diff --git a/hw/pcspk.c b/hw/pcspk.c
index 43df818..5bd9e32 100644
--- a/hw/pcspk.c
+++ b/hw/pcspk.c
@@ -28,6 +28,7 @@
  #include audio/audio.h
  #include qemu-timer.h
  #include i8254.h
+#include pcspk.h

  #define PCSPK_BUF_LEN 1792
  #define PCSPK_SAMPLE_RATE 32000
@@ -35,10 +36,13 @@
  #define PCSPK_MIN_COUNT ((PIT_FREQ + PCSPK_MAX_FREQ - 1) / PCSPK_MAX_FREQ)

  typedef struct {
+ISADevice dev;
+MemoryRegion ioport;
+uint32_t iobase;
  uint8_t sample_buf[PCSPK_BUF_LEN];
  QEMUSoundCard card;
  SWVoiceOut *voice;
-ISADevice *pit;
+void *pit;
  unsigned int pit_count;
  unsigned int samples;
  unsigned int play_pos;
@@ -47,7 +51,7 @@ typedef struct {
  } PCSpkState;

  static const char *s_spk = pcspk;
-static PCSpkState pcspk_state;
+static PCSpkState *pcspk_state;

  static inline void generate_samples(PCSpkState *s)
  {
@@ -99,7 +103,7 @@ static void pcspk_callback(void *opaque, int free)

  int pcspk_audio_init(ISABus *bus)
  {
-PCSpkState *s =pcspk_state;
+PCSpkState *s = pcspk_state;
  struct audsettings as = {PCSPK_SAMPLE_RATE, 1, AUD_FMT_U8, 0};



This can be a follow up, but what this is really doing is enabling audio for 
this device.  ISABus is unused as a parameter.


I think it would be better to make this a qdev bool property (audio_enabled) and 
then modify the code that calls this function to either set the property as a 
global, or use the QOM path to specifically set it on this device.


Either way, I think setting an audio_enabled flag via qdev makes a whole lot 
more sense conceptionally than using the -soundhw option.




  AUD_register_card(s_spk,s-card);
@@ -113,7 +117,8 @@ int pcspk_audio_init(ISABus *bus)
  return 0;
  }

-static uint32_t pcspk_ioport_read(void *opaque, uint32_t addr)
+static uint64_t pcspk_io_read(void *opaque, target_phys_addr_t addr,
+  unsigned size)
  {
  PCSpkState *s = opaque;
  

Re: [PATCH 5/9] pci-assign: Only calculate maximum MSI-X vector entries once

2012-01-31 Thread Michael S. Tsirkin
On Tue, Jan 31, 2012 at 01:31:59PM -0700, Alex Williamson wrote:
 On Tue, 2012-01-31 at 22:18 +0200, Michael S. Tsirkin wrote:
  On Sat, Jan 28, 2012 at 07:22:04AM -0700, Alex Williamson wrote:
   Signed-off-by: Alex Williamson alex.william...@redhat.com
   ---
  
  Why? Optimization?
 
 Because in 6/9 we'd have to calculate it again for resetting the msix
 table and in 9/9 we use it to drop writes outside of the valid vector
 range.  I suspect we might actually want to call msix_reset on device
 reset, so that means we'd be recalculating it every time we reset the
 device, write to the msix table, and again if we're actually updating
 msix entries.  Redundancy exceeded my cost of two bytes metric.  Thanks,
 
 Alex

No problem but motivation should probably go into commit log.

hw/device-assignment.c |   17 +++--
hw/device-assignment.h |1 +
2 files changed, 8 insertions(+), 10 deletions(-)
   
   diff --git a/hw/device-assignment.c b/hw/device-assignment.c
   index 422ee00..af614d3 100644
   --- a/hw/device-assignment.c
   +++ b/hw/device-assignment.c
   @@ -971,20 +971,14 @@ static void assigned_dev_update_msi(PCIDevice 
   *pci_dev)
static int assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
{
AssignedDevice *adev = DO_UPCAST(AssignedDevice, dev, pci_dev);
   -uint16_t entries_nr = 0, entries_max_nr;
   -int pos = 0, i, r = 0;
   +uint16_t entries_nr = 0;
   +int i, r = 0;
struct kvm_assigned_msix_nr msix_nr;
struct kvm_assigned_msix_entry msix_entry;
MSIXTableEntry *entry = adev-msix_table;

   -pos = pci_find_capability(pci_dev, PCI_CAP_ID_MSIX);
   -
   -entries_max_nr = *(uint16_t *)(pci_dev-config + pos + 2);
   -entries_max_nr = PCI_MSIX_FLAGS_QSIZE;
   -entries_max_nr += 1;
   -
/* Get the usable entry number for allocating */
   -for (i = 0; i  entries_max_nr; i++, entry++) {
   +for (i = 0; i  adev-msix_max; i++, entry++) {
/* Ignore unused entry even it's unmasked */
if (entry-data == 0) {
continue;
   @@ -1012,7 +1006,7 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
   *pci_dev)
msix_entry.assigned_dev_id = msix_nr.assigned_dev_id;
entries_nr = 0;
entry = adev-msix_table;
   -for (i = 0; i  entries_max_nr; i++, entry++) {
   +for (i = 0; i  adev-msix_max; i++, entry++) {
if (entries_nr = msix_nr.entry_nr)
break;
if (entry-data == 0) {
   @@ -1225,6 +1219,9 @@ static int assigned_device_pci_cap_init(PCIDevice 
   *pci_dev)
bar_nr = msix_table_entry  PCI_MSIX_FLAGS_BIRMASK;
msix_table_entry = ~PCI_MSIX_FLAGS_BIRMASK;
dev-msix_table_addr = pci_region[bar_nr].base_addr + 
   msix_table_entry;
   +dev-msix_max = pci_get_word(pci_dev-config + pos + 
   PCI_MSIX_FLAGS);
   +dev-msix_max = PCI_MSIX_FLAGS_QSIZE;
   +dev-msix_max += 1;
}

/* Minimal PM support, nothing writable, device appears to NAK 
   changes */
   diff --git a/hw/device-assignment.h b/hw/device-assignment.h
   index e229303..a909fb2 100644
   --- a/hw/device-assignment.h
   +++ b/hw/device-assignment.h
   @@ -119,6 +119,7 @@ typedef struct AssignedDevice {
struct kvm_irq_routing_entry *entry;
MSIXTableEntry *msix_table;
target_phys_addr_t msix_table_addr;
   +uint16_t msix_max;
MemoryRegion mmio;
char *configfd_name;
int32_t bootindex;
   
   --
   To unsubscribe from this list: send the line unsubscribe kvm in
   the body of a message to majord...@vger.kernel.org
   More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
 
 
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/9] pci-assign: Proper initialization for MSI-X table

2012-01-31 Thread Alex Williamson
On Tue, 2012-01-31 at 22:19 +0200, Michael S. Tsirkin wrote:
 On Tue, Jan 31, 2012 at 08:16:31PM +0100, Jan Kiszka wrote:
  On 2012-01-31 20:12, Michael S. Tsirkin wrote:
   On Tue, Jan 31, 2012 at 12:07:39PM -0700, Alex Williamson wrote:
   On Tue, 2012-01-31 at 19:40 +0200, Michael S. Tsirkin wrote:
   On Sat, Jan 28, 2012 at 07:22:09AM -0700, Alex Williamson wrote:
   Per the PCI spec, all vectors should be masked at handoff.
  
   Signed-off-by: Alex Williamson alex.william...@redhat.com
   ---
  
hw/device-assignment.c |   20 +++-
1 files changed, 19 insertions(+), 1 deletions(-)
  
   diff --git a/hw/device-assignment.c b/hw/device-assignment.c
   index af614d3..6efa219 100644
   --- a/hw/device-assignment.c
   +++ b/hw/device-assignment.c
   @@ -1462,6 +1462,22 @@ static const MemoryRegionOps msix_mmio_ops = {
},
};

   +static void msix_reset(AssignedDevice *dev)
   +{
   +MSIXTableEntry *entry;
   +int i;
   +
   +if (!dev-msix_table) {
   +return;
   +}
   +
   +memset(dev-msix_table, 0, 0x1000);
   +
   +for (i = 0, entry = dev-msix_table; i  dev-msix_max; i++, 
   entry++) {
   +entry-ctrl = 0x1; /* Masked */
  
   This is broken for BE hosts.
  
   Show me a BE host that even remotely works with this device assignment
   implementation.  Thanks,
  
   Alex
   
   I don't get it. Yes lots of cleanup is needed but why add more broken
   code?
  
  At some point, we will just do this via the PCI core, and that will have
  to do it correctly anyway. This code here is supposed to be removed again.
  
  Jan
 
 Either we say this is all dead code, then
 let's just fix bugs until it is rewritten.
 To fix the bug, all we need is patch 9, maybe 6 and 7.
 
 Or there is intent to maintain it going forward
 then, methinks, it needs to be brought up to
 some minimal standards.

There's minimal standards and then there's endian neutrality and non-x86
support for a piece of code that, I think, is probably never going to
see qemu.git.  There's a big gap between those.  I'd love for us to move
to vfio, but for now we're stuck with this and I'll try to update it and
make it a little better each time.  Looking out for big endian hosts
isn't high on my priority list for this code though.  Thanks,

Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] pci-assign: Optionally enable 64bit BARs in guest

2012-01-31 Thread Alex Williamson
On Tue, 2012-01-31 at 15:33 +0200, Avi Kivity wrote:
 On 01/31/2012 03:21 PM, Jan Kiszka wrote:
  On 2012-01-31 14:10, Avi Kivity wrote:
   On 01/31/2012 02:57 PM, Jan Kiszka wrote:
   Seems fine, but do we really need the option?  If it doesn't work we
   should treat it as an ordinary but and fix it.
  
   So far it's against the architecture of the emulated system: our 
   current
   chipset predates 64 bit PCI.
  
  
   Then it should be enabled/disabled at the chipset level.
  
   Makes me wonder if we already do some filtering if the device supports
   64 bit but the next bridge does not.
  
   
   Our 440fx does support 64-bit bars, so the question doesn't arise for
   x86.  Instead we violate the spec.
 
  If you mean by our the 440fx-qemu, not the real 440fx. That one does
  not even support 1GB RAM.
 
 Yes, that's what I meant.  It also supports pci hotplug, more slots, cpu
 hotplug, etc.

I'll drop this patch for now, it was just something I enabled based on a
query from MST and didn't want to lose it.  Maybe we need the option in
PCI core, but I'm just turn it on and hope for the best w/o giving users
a way to disable it.  Thanks,

Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/9] pci-assign: Update MSI-X MMIO to Memory API

2012-01-31 Thread Alex Williamson
On Tue, 2012-01-31 at 14:45 +0200, Avi Kivity wrote:
 On 01/28/2012 04:21 PM, Alex Williamson wrote:
  Stop using compatibility mode and at the same time fix available
  access sizes.  The PCI spec indicates that the MSI-X table may
  only be accessed as DWORD or QWORD.
 
   
   static const MemoryRegionOps msix_mmio_ops = {
  -.old_mmio = {
  -.read = { msix_mmio_readb, msix_mmio_readw, msix_mmio_readl, },
  -.write = { msix_mmio_writeb, msix_mmio_writew, msix_mmio_writel, },
  -},
  +.read = msix_mmio_read,
  +.write = msix_mmio_write,
   .endianness = DEVICE_NATIVE_ENDIAN,
  +.impl = {
  +.min_access_size = 4,
  +.max_access_size = 8,
  +},
   };
   
 
 .impl.min_access_size = 4 means the core will convert 1-byte I/O to
 4-byte I/O (using rmw if needed).  That's not what we want, I think you
 can leave it at 1 and explicitly ignore small accesses in the callbacks.
 
 Have you tested 8-byte I/O?  This is the first user.  Don't you need to
 set .valid.max_access_size?

I have not explicitly tested 8-byte I/O, figured it might just work.
Hmm, I wonder if we really need to be strict enough to reject byte and
word access.  It doesn't follow the spec, but I don't know that it buys
us anything to be strict about it.  Anyway, I'll look at .valid and
respin.  Thanks,

Alex



--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] pci-assign: Optionally enable 64bit BARs in guest

2012-01-31 Thread Michael S. Tsirkin
On Tue, Jan 31, 2012 at 02:08:38PM -0700, Alex Williamson wrote:
 On Tue, 2012-01-31 at 15:33 +0200, Avi Kivity wrote:
  On 01/31/2012 03:21 PM, Jan Kiszka wrote:
   On 2012-01-31 14:10, Avi Kivity wrote:
On 01/31/2012 02:57 PM, Jan Kiszka wrote:
Seems fine, but do we really need the option?  If it doesn't work we
should treat it as an ordinary but and fix it.
   
So far it's against the architecture of the emulated system: our 
current
chipset predates 64 bit PCI.
   
   
Then it should be enabled/disabled at the chipset level.
   
Makes me wonder if we already do some filtering if the device supports
64 bit but the next bridge does not.
   

Our 440fx does support 64-bit bars, so the question doesn't arise for
x86.  Instead we violate the spec.
  
   If you mean by our the 440fx-qemu, not the real 440fx. That one does
   not even support 1GB RAM.
  
  Yes, that's what I meant.  It also supports pci hotplug, more slots, cpu
  hotplug, etc.
 
 I'll drop this patch for now, it was just something I enabled based on a
 query from MST and didn't want to lose it.  Maybe we need the option in
 PCI core, but I'm just turn it on and hope for the best w/o giving users
 a way to disable it.  Thanks,
 
 Alex

The patch itself makes sense to me.
Avi found some bugs in 64 bit BAR handling, let's wait till that is
fixed then merge this patch.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/7] hpet: Save/restore cached RTC IRQ level

2012-01-31 Thread Anthony Liguori

On 01/31/2012 11:41 AM, Jan Kiszka wrote:

In legacy mode, the HPET suppresses the RTC interrupt delivery via IRQ
8 but keeps track of the RTC output level and applies it when legacy
mode is turned off again. This value has to be preserved across save/
restore as it cannot be reconstructed otherwise.

Signed-off-by: Jan Kiszkajan.kis...@siemens.com
---
  hw/hpet.c |   26 ++
  1 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index aba9ea9..39b291f 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -240,6 +240,24 @@ static int hpet_post_load(void *opaque, int version_id)
  return 0;
  }

+static bool hpet_rtc_irq_level_needed(void *opaque)
+{
+HPETState *s = opaque;
+
+return s-rtc_irq_level != 0;
+}
+
+static const VMStateDescription vmstate_hpet_rtc_irq_level = {
+.name = hpet/rtc_irq_level,
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.fields  = (VMStateField[]) {
+VMSTATE_UINT8(rtc_irq_level, HPETState),
+VMSTATE_END_OF_LIST()
+}
+};
+


This won't work.  We don't clear rtc_irq_level on reset so rtc_irq_level may be 
high or low after reset.  As such, we can't use a subsection here.  We need to 
bump the savevm state.


Regards,

Anthony Liguori


  static const VMStateDescription vmstate_hpet_timer = {
  .name = hpet_timer,
  .version_id = 1,
@@ -273,6 +291,14 @@ static const VMStateDescription vmstate_hpet = {
  VMSTATE_STRUCT_VARRAY_UINT8(timer, HPETState, num_timers, 0,
  vmstate_hpet_timer, HPETTimer),
  VMSTATE_END_OF_LIST()
+},
+.subsections = (VMStateSubsection[]) {
+{
+.vmsd =vmstate_hpet_rtc_irq_level,
+.needed = hpet_rtc_irq_level_needed,
+}, {
+/* empty */
+}
  }
  };



--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/9] pci-assign: Use struct for MSI-X table

2012-01-31 Thread Alex Williamson
On Tue, 2012-01-31 at 22:00 +0200, Michael S. Tsirkin wrote:
 On Tue, Jan 31, 2012 at 12:05:49PM -0700, Alex Williamson wrote:
  On Tue, 2012-01-31 at 19:40 +0200, Michael S. Tsirkin wrote:
   On Sat, Jan 28, 2012 at 07:21:58AM -0700, Alex Williamson wrote:
This makes access much easier.

Signed-off-by: Alex Williamson alex.william...@redhat.com
   
   Yes but this also makes it easier to forget
   to handle endian-ness.
   How about using pci_get/pci_set instead?
  
  Do we really think existing device assignment is ever going to work on
  anything but x86/x86?
  
  Alex
 
 It's a question of whether anyone bothers to
 fix all portability bugs. Besides endian-ness,
 and msix vector decoding, there's not much that is
 x86 specific there.

The way interrupt routing is updated from the chipset is disgustingly
x86 specific.  The entire kernel side implementation is only enabled for
x86 (well, not counting ia64).  I expect the kernel side interrupt
injection is very x86, or at least ioapic, specific.  Beyond that, yeah,
it may be trivial ;)  Thanks,

Alex

---

 hw/device-assignment.c |   55 
++--
 hw/device-assignment.h |9 +++-
 2 files changed, 33 insertions(+), 31 deletions(-)

diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index 1fc7ffa..422ee00 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -973,10 +973,9 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
*pci_dev)
 AssignedDevice *adev = DO_UPCAST(AssignedDevice, dev, pci_dev);
 uint16_t entries_nr = 0, entries_max_nr;
 int pos = 0, i, r = 0;
-uint32_t msg_addr, msg_upper_addr, msg_data, msg_ctrl;
 struct kvm_assigned_msix_nr msix_nr;
 struct kvm_assigned_msix_entry msix_entry;
-void *va = adev-msix_table_page;
+MSIXTableEntry *entry = adev-msix_table;
 
 pos = pci_find_capability(pci_dev, PCI_CAP_ID_MSIX);
 
@@ -985,12 +984,11 @@ static int 
assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
 entries_max_nr += 1;
 
 /* Get the usable entry number for allocating */
-for (i = 0; i  entries_max_nr; i++) {
-memcpy(msg_ctrl, va + i * 16 + 12, 4);
-memcpy(msg_data, va + i * 16 + 8, 4);
+for (i = 0; i  entries_max_nr; i++, entry++) {
 /* Ignore unused entry even it's unmasked */
-if (msg_data == 0)
+if (entry-data == 0) {
 continue;
+}
 entries_nr ++;
 }
 
@@ -1013,16 +1011,13 @@ static int 
assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
 
 msix_entry.assigned_dev_id = msix_nr.assigned_dev_id;
 entries_nr = 0;
-for (i = 0; i  entries_max_nr; i++) {
+entry = adev-msix_table;
+for (i = 0; i  entries_max_nr; i++, entry++) {
 if (entries_nr = msix_nr.entry_nr)
 break;
-memcpy(msg_ctrl, va + i * 16 + 12, 4);
-memcpy(msg_data, va + i * 16 + 8, 4);
-if (msg_data == 0)
+if (entry-data == 0) {
 continue;
-
-memcpy(msg_addr, va + i * 16, 4);
-memcpy(msg_upper_addr, va + i * 16 + 4, 4);
+}
 
 r = kvm_get_irq_route_gsi();
 if (r  0)
@@ -1031,10 +1026,11 @@ static int 
assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
 adev-entry[entries_nr].gsi = r;
 adev-entry[entries_nr].type = KVM_IRQ_ROUTING_MSI;
 adev-entry[entries_nr].flags = 0;
-adev-entry[entries_nr].u.msi.address_lo = msg_addr;
-adev-entry[entries_nr].u.msi.address_hi = msg_upper_addr;
-adev-entry[entries_nr].u.msi.data = msg_data;
-DEBUG(MSI-X data 0x%x, MSI-X addr_lo 0x%x\n!, msg_data, 
msg_addr);
+adev-entry[entries_nr].u.msi.address_lo = entry-addr_lo;
+adev-entry[entries_nr].u.msi.address_hi = entry-addr_hi;
+adev-entry[entries_nr].u.msi.data = entry-data;
+DEBUG(MSI-X data 0x%x, MSI-X addr_lo 0x%x\n!,
+  entry-data, entry-addr_lo);
kvm_add_routing_entry(adev-entry[entries_nr]);
 
 msix_entry.gsi = adev-entry[entries_nr].gsi;
@@ -1443,7 +1439,7 @@ static uint64_t msix_mmio_read(void *opaque, 
target_phys_addr_t addr,
 AssignedDevice *adev = opaque;
 uint64_t val;
 
-memcpy(val, (void *)((uint8_t *)adev-msix_table_page + addr), 
size);
+memcpy(val, (void *)((uint8_t *)adev-msix_table + addr), size);
 
 return val;
 }
@@ -1456,7 +1452,7 @@ static void msix_mmio_write(void *opaque, 
target_phys_addr_t addr,
 DEBUG(write to MSI-X entry table mmio offset 0x%lx, val 0x%lx\n,
   addr, val);
 
-memcpy((void *)((uint8_t 

Re: [PATCH 4/9] pci-assign: Use struct for MSI-X table

2012-01-31 Thread Michael S. Tsirkin
On Tue, Jan 31, 2012 at 02:17:30PM -0700, Alex Williamson wrote:
 On Tue, 2012-01-31 at 22:00 +0200, Michael S. Tsirkin wrote:
  On Tue, Jan 31, 2012 at 12:05:49PM -0700, Alex Williamson wrote:
   On Tue, 2012-01-31 at 19:40 +0200, Michael S. Tsirkin wrote:
On Sat, Jan 28, 2012 at 07:21:58AM -0700, Alex Williamson wrote:
 This makes access much easier.
 
 Signed-off-by: Alex Williamson alex.william...@redhat.com

Yes but this also makes it easier to forget
to handle endian-ness.
How about using pci_get/pci_set instead?
   
   Do we really think existing device assignment is ever going to work on
   anything but x86/x86?
   
   Alex
  
  It's a question of whether anyone bothers to
  fix all portability bugs. Besides endian-ness,
  and msix vector decoding, there's not much that is
  x86 specific there.
 
 The way interrupt routing is updated from the chipset is disgustingly
 x86 specific.

Right, that's what I mean - the decoding of msix vector to destination.

 The entire kernel side implementation is only enabled for
 x86 (well, not counting ia64).  I expect the kernel side interrupt
 injection is very x86, or at least ioapic, specific.  Beyond that, yeah,
 it may be trivial ;)  Thanks,
 
 Alex
 
 ---
 
  hw/device-assignment.c |   55 
 ++--
  hw/device-assignment.h |9 +++-
  2 files changed, 33 insertions(+), 31 deletions(-)
 
 diff --git a/hw/device-assignment.c b/hw/device-assignment.c
 index 1fc7ffa..422ee00 100644
 --- a/hw/device-assignment.c
 +++ b/hw/device-assignment.c
 @@ -973,10 +973,9 @@ static int 
 assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
  AssignedDevice *adev = DO_UPCAST(AssignedDevice, dev, pci_dev);
  uint16_t entries_nr = 0, entries_max_nr;
  int pos = 0, i, r = 0;
 -uint32_t msg_addr, msg_upper_addr, msg_data, msg_ctrl;
  struct kvm_assigned_msix_nr msix_nr;
  struct kvm_assigned_msix_entry msix_entry;
 -void *va = adev-msix_table_page;
 +MSIXTableEntry *entry = adev-msix_table;
  
  pos = pci_find_capability(pci_dev, PCI_CAP_ID_MSIX);
  
 @@ -985,12 +984,11 @@ static int 
 assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
  entries_max_nr += 1;
  
  /* Get the usable entry number for allocating */
 -for (i = 0; i  entries_max_nr; i++) {
 -memcpy(msg_ctrl, va + i * 16 + 12, 4);
 -memcpy(msg_data, va + i * 16 + 8, 4);
 +for (i = 0; i  entries_max_nr; i++, entry++) {
  /* Ignore unused entry even it's unmasked */
 -if (msg_data == 0)
 +if (entry-data == 0) {
  continue;
 +}
  entries_nr ++;
  }
  
 @@ -1013,16 +1011,13 @@ static int 
 assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
  
  msix_entry.assigned_dev_id = msix_nr.assigned_dev_id;
  entries_nr = 0;
 -for (i = 0; i  entries_max_nr; i++) {
 +entry = adev-msix_table;
 +for (i = 0; i  entries_max_nr; i++, entry++) {
  if (entries_nr = msix_nr.entry_nr)
  break;
 -memcpy(msg_ctrl, va + i * 16 + 12, 4);
 -memcpy(msg_data, va + i * 16 + 8, 4);
 -if (msg_data == 0)
 +if (entry-data == 0) {
  continue;
 -
 -memcpy(msg_addr, va + i * 16, 4);
 -memcpy(msg_upper_addr, va + i * 16 + 4, 4);
 +}
  
  r = kvm_get_irq_route_gsi();
  if (r  0)
 @@ -1031,10 +1026,11 @@ static int 
 assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
  adev-entry[entries_nr].gsi = r;
  adev-entry[entries_nr].type = KVM_IRQ_ROUTING_MSI;
  adev-entry[entries_nr].flags = 0;
 -adev-entry[entries_nr].u.msi.address_lo = msg_addr;
 -adev-entry[entries_nr].u.msi.address_hi = msg_upper_addr;
 -adev-entry[entries_nr].u.msi.data = msg_data;
 -DEBUG(MSI-X data 0x%x, MSI-X addr_lo 0x%x\n!, msg_data, 
 msg_addr);
 +adev-entry[entries_nr].u.msi.address_lo = entry-addr_lo;
 +adev-entry[entries_nr].u.msi.address_hi = entry-addr_hi;
 +adev-entry[entries_nr].u.msi.data = entry-data;
 +DEBUG(MSI-X data 0x%x, MSI-X addr_lo 0x%x\n!,
 +  entry-data, entry-addr_lo);
   kvm_add_routing_entry(adev-entry[entries_nr]);
  
  msix_entry.gsi = adev-entry[entries_nr].gsi;
 @@ -1443,7 +1439,7 @@ static uint64_t msix_mmio_read(void *opaque, 
 target_phys_addr_t addr,
  AssignedDevice *adev = opaque;
  uint64_t val;
  
 -memcpy(val, (void *)((uint8_t *)adev-msix_table_page + addr), 
 size);
 +memcpy(val, (void *)((uint8_t *)adev-msix_table + addr), size);
  
  return val;
 

Re: [PATCH 4/9] pci-assign: Use struct for MSI-X table

2012-01-31 Thread Alex Williamson
On Tue, 2012-01-31 at 23:24 +0200, Michael S. Tsirkin wrote:
 On Tue, Jan 31, 2012 at 02:17:30PM -0700, Alex Williamson wrote:
  On Tue, 2012-01-31 at 22:00 +0200, Michael S. Tsirkin wrote:
   On Tue, Jan 31, 2012 at 12:05:49PM -0700, Alex Williamson wrote:
On Tue, 2012-01-31 at 19:40 +0200, Michael S. Tsirkin wrote:
 On Sat, Jan 28, 2012 at 07:21:58AM -0700, Alex Williamson wrote:
  This makes access much easier.
  
  Signed-off-by: Alex Williamson alex.william...@redhat.com
 
 Yes but this also makes it easier to forget
 to handle endian-ness.
 How about using pci_get/pci_set instead?

Do we really think existing device assignment is ever going to work on
anything but x86/x86?

Alex
   
   It's a question of whether anyone bothers to
   fix all portability bugs. Besides endian-ness,
   and msix vector decoding, there's not much that is
   x86 specific there.
  
  The way interrupt routing is updated from the chipset is disgustingly
  x86 specific.
 
 Right, that's what I mean - the decoding of msix vector to destination.

Yes, that, but I was referring more to INTx and the ugly hard coded hook
to get notified when the 440fx changes PCI interrupt link setup.
Thanks,

Alex

 
  The entire kernel side implementation is only enabled for
  x86 (well, not counting ia64).  I expect the kernel side interrupt
  injection is very x86, or at least ioapic, specific.  Beyond that, yeah,
  it may be trivial ;)  Thanks,
  
  Alex
  
  ---
  
   hw/device-assignment.c |   55 
  ++--
   hw/device-assignment.h |9 +++-
   2 files changed, 33 insertions(+), 31 deletions(-)
  
  diff --git a/hw/device-assignment.c b/hw/device-assignment.c
  index 1fc7ffa..422ee00 100644
  --- a/hw/device-assignment.c
  +++ b/hw/device-assignment.c
  @@ -973,10 +973,9 @@ static int 
  assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
   AssignedDevice *adev = DO_UPCAST(AssignedDevice, dev, pci_dev);
   uint16_t entries_nr = 0, entries_max_nr;
   int pos = 0, i, r = 0;
  -uint32_t msg_addr, msg_upper_addr, msg_data, msg_ctrl;
   struct kvm_assigned_msix_nr msix_nr;
   struct kvm_assigned_msix_entry msix_entry;
  -void *va = adev-msix_table_page;
  +MSIXTableEntry *entry = adev-msix_table;
   
   pos = pci_find_capability(pci_dev, PCI_CAP_ID_MSIX);
   
  @@ -985,12 +984,11 @@ static int 
  assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
   entries_max_nr += 1;
   
   /* Get the usable entry number for allocating */
  -for (i = 0; i  entries_max_nr; i++) {
  -memcpy(msg_ctrl, va + i * 16 + 12, 4);
  -memcpy(msg_data, va + i * 16 + 8, 4);
  +for (i = 0; i  entries_max_nr; i++, entry++) {
   /* Ignore unused entry even it's unmasked */
  -if (msg_data == 0)
  +if (entry-data == 0) {
   continue;
  +}
   entries_nr ++;
   }
   
  @@ -1013,16 +1011,13 @@ static int 
  assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
   
   msix_entry.assigned_dev_id = msix_nr.assigned_dev_id;
   entries_nr = 0;
  -for (i = 0; i  entries_max_nr; i++) {
  +entry = adev-msix_table;
  +for (i = 0; i  entries_max_nr; i++, entry++) {
   if (entries_nr = msix_nr.entry_nr)
   break;
  -memcpy(msg_ctrl, va + i * 16 + 12, 4);
  -memcpy(msg_data, va + i * 16 + 8, 4);
  -if (msg_data == 0)
  +if (entry-data == 0) {
   continue;
  -
  -memcpy(msg_addr, va + i * 16, 4);
  -memcpy(msg_upper_addr, va + i * 16 + 4, 4);
  +}
   
   r = kvm_get_irq_route_gsi();
   if (r  0)
  @@ -1031,10 +1026,11 @@ static int 
  assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
   adev-entry[entries_nr].gsi = r;
   adev-entry[entries_nr].type = KVM_IRQ_ROUTING_MSI;
   adev-entry[entries_nr].flags = 0;
  -adev-entry[entries_nr].u.msi.address_lo = msg_addr;
  -adev-entry[entries_nr].u.msi.address_hi = msg_upper_addr;
  -adev-entry[entries_nr].u.msi.data = msg_data;
  -DEBUG(MSI-X data 0x%x, MSI-X addr_lo 0x%x\n!, msg_data, 
  msg_addr);
  +adev-entry[entries_nr].u.msi.address_lo = entry-addr_lo;
  +adev-entry[entries_nr].u.msi.address_hi = entry-addr_hi;
  +adev-entry[entries_nr].u.msi.data = entry-data;
  +DEBUG(MSI-X data 0x%x, MSI-X addr_lo 0x%x\n!,
  +  entry-data, entry-addr_lo);
  kvm_add_routing_entry(adev-entry[entries_nr]);
   
   msix_entry.gsi = adev-entry[entries_nr].gsi;
  @@ -1443,7 +1439,7 @@ static 

Re: [Qemu-devel] [PATCH 1/4] i8254: Factor out base class for KVM reuse

2012-01-31 Thread Anthony Liguori

On 01/31/2012 12:46 PM, Jan Kiszka wrote:

Applying the concept used for the *PICs once again: establish a base
class for the i8254 that can be used both by the current user space
emulation and the upcoming KVM in-kernel version. We share most of the
public interface of the i8254, specifically to the pcspk, vmstate, reset
and certain init parts.

Signed-off-by: Jan Kiszkajan.kis...@siemens.com


Now that we have QOM bits, there's no need to factor out a common base class.

Just make the methods that you want to override virtual with the default 
implementation and then make a KVMPIT that inherits from the PIT and then 
overrides whatever virtual functions it needs to.


Regards,

Anthony Liguori


---
  Makefile.objs   |2 +-
  hw/i8254.c  |  277 ++---
  hw/i8254_common.c   |  311 +++
  hw/i8254_internal.h |   87 ++
  4 files changed, 434 insertions(+), 243 deletions(-)
  create mode 100644 hw/i8254_common.c
  create mode 100644 hw/i8254_internal.h

diff --git a/Makefile.objs b/Makefile.objs
index b942625..6a733fa 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -210,7 +210,7 @@ hw-obj-$(CONFIG_EMPTY_SLOT) += empty_slot.o

  hw-obj-$(CONFIG_SERIAL) += serial.o
  hw-obj-$(CONFIG_PARALLEL) += parallel.o
-hw-obj-$(CONFIG_I8254) += i8254.o
+hw-obj-$(CONFIG_I8254) += i8254_common.o i8254.o
  hw-obj-$(CONFIG_PCSPK) += pcspk.o
  hw-obj-$(CONFIG_PCKBD) += pckbd.o
  hw-obj-$(CONFIG_USB_UHCI) += usb-uhci.o
diff --git a/hw/i8254.c b/hw/i8254.c
index ff253bb..90e9a87 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -26,6 +26,7 @@
  #include isa.h
  #include qemu-timer.h
  #include i8254.h
+#include i8254_internal.h

  //#define DEBUG_PIT

@@ -34,34 +35,6 @@
  #define RW_STATE_WORD0 3
  #define RW_STATE_WORD1 4

-typedef struct PITChannelState {
-int count; /* can be 65536 */
-uint16_t latched_count;
-uint8_t count_latched;
-uint8_t status_latched;
-uint8_t status;
-uint8_t read_state;
-uint8_t write_state;
-uint8_t write_latch;
-uint8_t rw_mode;
-uint8_t mode;
-uint8_t bcd; /* not supported */
-uint8_t gate; /* timer start */
-int64_t count_load_time;
-/* irq handling */
-int64_t next_transition_time;
-QEMUTimer *irq_timer;
-qemu_irq irq;
-uint32_t irq_disabled;
-} PITChannelState;
-
-typedef struct PITState {
-ISADevice dev;
-MemoryRegion ioports;
-uint32_t iobase;
-PITChannelState channels[3];
-} PITState;
-
  static void pit_irq_timer_update(PITChannelState *s, int64_t current_time);

  static int pit_get_count(PITChannelState *s)
@@ -89,99 +62,11 @@ static int pit_get_count(PITChannelState *s)
  return counter;
  }

-/* get pit output bit */
-static int pit_get_out(PITChannelState *s, int64_t current_time)
-{
-uint64_t d;
-int out;
-
-d = muldiv64(current_time - s-count_load_time, PIT_FREQ,
- get_ticks_per_sec());
-switch(s-mode) {
-default:
-case 0:
-out = (d= s-count);
-break;
-case 1:
-out = (d  s-count);
-break;
-case 2:
-if ((d % s-count) == 0  d != 0)
-out = 1;
-else
-out = 0;
-break;
-case 3:
-out = (d % s-count)  ((s-count + 1)  1);
-break;
-case 4:
-case 5:
-out = (d == s-count);
-break;
-}
-return out;
-}
-
-/* return -1 if no transition will occur.  */
-static int64_t pit_get_next_transition_time(PITChannelState *s,
-int64_t current_time)
-{
-uint64_t d, next_time, base;
-int period2;
-
-d = muldiv64(current_time - s-count_load_time, PIT_FREQ,
- get_ticks_per_sec());
-switch(s-mode) {
-default:
-case 0:
-case 1:
-if (d  s-count)
-next_time = s-count;
-else
-return -1;
-break;
-case 2:
-base = (d / s-count) * s-count;
-if ((d - base) == 0  d != 0)
-next_time = base + s-count;
-else
-next_time = base + s-count + 1;
-break;
-case 3:
-base = (d / s-count) * s-count;
-period2 = ((s-count + 1)  1);
-if ((d - base)  period2)
-next_time = base + period2;
-else
-next_time = base + s-count;
-break;
-case 4:
-case 5:
-if (d  s-count)
-next_time = s-count;
-else if (d == s-count)
-next_time = s-count + 1;
-else
-return -1;
-break;
-}
-/* convert to timer units */
-next_time = s-count_load_time + muldiv64(next_time, get_ticks_per_sec(),
-  PIT_FREQ);
-/* fix potential rounding problems */
-/* XXX: better solution: use a clock at PIT_FREQ Hz */
-if (next_time= current_time)
-next_time = current_time + 1;
-return next_time;
-}
-

Re: [Qemu-devel] [PATCH 1/4] i8254: Factor out base class for KVM reuse

2012-01-31 Thread Jan Kiszka
On 2012-01-31 22:40, Anthony Liguori wrote:
 On 01/31/2012 12:46 PM, Jan Kiszka wrote:
 Applying the concept used for the *PICs once again: establish a base
 class for the i8254 that can be used both by the current user space
 emulation and the upcoming KVM in-kernel version. We share most of the
 public interface of the i8254, specifically to the pcspk, vmstate, reset
 and certain init parts.

 Signed-off-by: Jan Kiszkajan.kis...@siemens.com
 
 Now that we have QOM bits, there's no need to factor out a common base
 class.
 
 Just make the methods that you want to override virtual with the default
 implementation and then make a KVMPIT that inherits from the PIT and
 then overrides whatever virtual functions it needs to.

That doesn't sound like the proper design for this purpose. Rather, we
have an abstract base class that both implementations are derived from.
If I'm not using QOM properly to achieve this, please tell me.

Thanks,
Jan



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 1/4] i8254: Factor out base class for KVM reuse

2012-01-31 Thread Anthony Liguori

On 01/31/2012 03:49 PM, Jan Kiszka wrote:

On 2012-01-31 22:40, Anthony Liguori wrote:

On 01/31/2012 12:46 PM, Jan Kiszka wrote:

Applying the concept used for the *PICs once again: establish a base
class for the i8254 that can be used both by the current user space
emulation and the upcoming KVM in-kernel version. We share most of the
public interface of the i8254, specifically to the pcspk, vmstate, reset
and certain init parts.

Signed-off-by: Jan Kiszkajan.kis...@siemens.com


Now that we have QOM bits, there's no need to factor out a common base
class.

Just make the methods that you want to override virtual with the default
implementation and then make a KVMPIT that inherits from the PIT and
then overrides whatever virtual functions it needs to.


That doesn't sound like the proper design for this purpose.


It's hard to say really.  There's a lot more in the common class that I expected 
(like initialization of the ISA regions.  I would expect the base class to look 
a lot more like an interface such that the KVM PIT implementation was 
trivialized more than it is.


But I can argue it both ways so if you feel strongly here, I won't object.

Regards,

Anthony Liguori

 Rather, we

have an abstract base class that both implementations are derived from.
If I'm not using QOM properly to achieve this, please tell me.

Thanks,
Jan


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH v3 6/7] pcspk: Convert to qdev

2012-01-31 Thread Jan Kiszka
On 2012-01-31 21:49, Anthony Liguori wrote:
 On 01/31/2012 11:41 AM, Jan Kiszka wrote:
 Convert the PC speaker device to a qdev ISA model. Move the public
 interface to a dedicated header file at this chance.

 Signed-off-by: Jan Kiszkajan.kis...@siemens.com
 
 Heh, I did this too more or less the same way.  Some comments below:
 
 ---
   arch_init.c|1 +
   hw/i82378.c|3 +-
   hw/mips_jazz.c |3 +-
   hw/pc.c|3 +-
   hw/pc.h|4 ---
   hw/pcspk.c |   62
 ++-
   hw/pcspk.h |   45 
   7 files changed, 104 insertions(+), 17 deletions(-)
   create mode 100644 hw/pcspk.h

 diff --git a/arch_init.c b/arch_init.c
 index 2366511..a45485b 100644
 --- a/arch_init.c
 +++ b/arch_init.c
 @@ -42,6 +42,7 @@
   #include gdbstub.h
   #include hw/smbios.h
   #include exec-memory.h
 +#include hw/pcspk.h

   #ifdef TARGET_SPARC
   int graphic_width = 1024;
 diff --git a/hw/i82378.c b/hw/i82378.c
 index 127cadf..79fa8b7 100644
 --- a/hw/i82378.c
 +++ b/hw/i82378.c
 @@ -20,6 +20,7 @@
   #include pci.h
   #include pc.h
   #include i8254.h
 +#include pcspk.h

   //#define DEBUG_I82378

 @@ -195,7 +196,7 @@ static void i82378_init(DeviceState *dev,
 I82378State *s)
   pit = pit_init(isabus, 0x40, 0, NULL);

   /* speaker */
 -pcspk_init(pit);
 +pcspk_init(isabus, pit);

   /* 2 82C37 (dma) */
   DMA_init(1,s-out[1]);
 diff --git a/hw/mips_jazz.c b/hw/mips_jazz.c
 index b61b218..65608dc 100644
 --- a/hw/mips_jazz.c
 +++ b/hw/mips_jazz.c
 @@ -37,6 +37,7 @@
   #include loader.h
   #include mc146818rtc.h
   #include i8254.h
 +#include pcspk.h
   #include blockdev.h
   #include sysbus.h
   #include exec-memory.h
 @@ -193,7 +194,7 @@ static void mips_jazz_init(MemoryRegion
 *address_space,
   cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
   DMA_init(0, cpu_exit_irq);
   pit = pit_init(isa_bus, 0x40, 0, NULL);
 -pcspk_init(pit);
 +pcspk_init(isa_bus, pit);

   /* ISA IO space at 0x9000 */
   isa_mmio_init(0x9000, 0x0100);
 diff --git a/hw/pc.c b/hw/pc.c
 index 6fb1de7..f6a91a9 100644
 --- a/hw/pc.c
 +++ b/hw/pc.c
 @@ -37,6 +37,7 @@
   #include multiboot.h
   #include mc146818rtc.h
   #include i8254.h
 +#include pcspk.h
   #include msi.h
   #include sysbus.h
   #include sysemu.h
 @@ -1169,7 +1170,7 @@ void pc_basic_device_init(ISABus *isa_bus,
 qemu_irq *gsi,
   /* connect PIT to output control line of the HPET */
   qdev_connect_gpio_out(hpet, 0, qdev_get_gpio_in(pit-qdev,
 0));
   }
 -pcspk_init(pit);
 +pcspk_init(isa_bus, pit);

   for(i = 0; i  MAX_SERIAL_PORTS; i++) {
   if (serial_hds[i]) {
 diff --git a/hw/pc.h b/hw/pc.h
 index b08708d..1b47bbd 100644
 --- a/hw/pc.h
 +++ b/hw/pc.h
 @@ -149,10 +149,6 @@ void piix4_smbus_register_device(SMBusDevice
 *dev, uint8_t addr);
   /* hpet.c */
   extern int no_hpet;

 -/* pcspk.c */
 -void pcspk_init(ISADevice *pit);
 -int pcspk_audio_init(ISABus *bus);
 -
   /* piix_pci.c */
   struct PCII440FXState;
   typedef struct PCII440FXState PCII440FXState;
 diff --git a/hw/pcspk.c b/hw/pcspk.c
 index 43df818..5bd9e32 100644
 --- a/hw/pcspk.c
 +++ b/hw/pcspk.c
 @@ -28,6 +28,7 @@
   #include audio/audio.h
   #include qemu-timer.h
   #include i8254.h
 +#include pcspk.h

   #define PCSPK_BUF_LEN 1792
   #define PCSPK_SAMPLE_RATE 32000
 @@ -35,10 +36,13 @@
   #define PCSPK_MIN_COUNT ((PIT_FREQ + PCSPK_MAX_FREQ - 1) /
 PCSPK_MAX_FREQ)

   typedef struct {
 +ISADevice dev;
 +MemoryRegion ioport;
 +uint32_t iobase;
   uint8_t sample_buf[PCSPK_BUF_LEN];
   QEMUSoundCard card;
   SWVoiceOut *voice;
 -ISADevice *pit;
 +void *pit;
   unsigned int pit_count;
   unsigned int samples;
   unsigned int play_pos;
 @@ -47,7 +51,7 @@ typedef struct {
   } PCSpkState;

   static const char *s_spk = pcspk;
 -static PCSpkState pcspk_state;
 +static PCSpkState *pcspk_state;

   static inline void generate_samples(PCSpkState *s)
   {
 @@ -99,7 +103,7 @@ static void pcspk_callback(void *opaque, int free)

   int pcspk_audio_init(ISABus *bus)
   {
 -PCSpkState *s =pcspk_state;
 +PCSpkState *s = pcspk_state;
   struct audsettings as = {PCSPK_SAMPLE_RATE, 1, AUD_FMT_U8, 0};
 
 
 This can be a follow up, but what this is really doing is enabling audio
 for this device.  ISABus is unused as a parameter.

Yes, but this is a callback for the audio subsystem, look at
arch_init.c. Nothing we can do here, only if we change that callback
prototype.

 
 I think it would be better to make this a qdev bool property
 (audio_enabled) and then modify the code that calls this function to
 either set the property as a global, or use the QOM path to specifically
 set it on this device.
 
 Either way, I think setting an audio_enabled flag via qdev makes a whole
 lot more sense conceptionally than using the -soundhw option.

Yep, 

Re: [PATCH v3 2/7] hpet: Save/restore cached RTC IRQ level

2012-01-31 Thread Jan Kiszka
On 2012-01-31 22:02, Anthony Liguori wrote:
 On 01/31/2012 11:41 AM, Jan Kiszka wrote:
 In legacy mode, the HPET suppresses the RTC interrupt delivery via IRQ
 8 but keeps track of the RTC output level and applies it when legacy
 mode is turned off again. This value has to be preserved across save/
 restore as it cannot be reconstructed otherwise.

 Signed-off-by: Jan Kiszkajan.kis...@siemens.com
 ---
   hw/hpet.c |   26 ++
   1 files changed, 26 insertions(+), 0 deletions(-)

 diff --git a/hw/hpet.c b/hw/hpet.c
 index aba9ea9..39b291f 100644
 --- a/hw/hpet.c
 +++ b/hw/hpet.c
 @@ -240,6 +240,24 @@ static int hpet_post_load(void *opaque, int
 version_id)
   return 0;
   }

 +static bool hpet_rtc_irq_level_needed(void *opaque)
 +{
 +HPETState *s = opaque;
 +
 +return s-rtc_irq_level != 0;
 +}
 +
 +static const VMStateDescription vmstate_hpet_rtc_irq_level = {
 +.name = hpet/rtc_irq_level,
 +.version_id = 1,
 +.minimum_version_id = 1,
 +.minimum_version_id_old = 1,
 +.fields  = (VMStateField[]) {
 +VMSTATE_UINT8(rtc_irq_level, HPETState),
 +VMSTATE_END_OF_LIST()
 +}
 +};
 +
 
 This won't work.  We don't clear rtc_irq_level on reset so rtc_irq_level
 may be high or low after reset.  As such, we can't use a subsection
 here.  We need to bump the savevm state.

rtc_irq_level is updated during reset by the connected RTC device. It
clears its IRQ line.

Jan



signature.asc
Description: OpenPGP digital signature


  1   2   >