Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-17 Thread Gabriel L. Somlo
On Sun, Feb 16, 2014 at 06:23:11PM +0200, Michael S. Tsirkin wrote:
 Well there is a bigger issue: any interrupt with
 multiple sources is broken.
 
 __kvm_irq_line_state does a logical OR of all sources,
 before XOR with polarity.
 
 This makes no sense if polarity is active low.

So, do you think something like this would make sense, to address
active-low polarity in __kvm_irq_line_state ?
(this would be independent of the subsequent xor in
kvm_ioapic_set_irq()):

-static inline int __kvm_irq_line_state(unsigned long *irq_state,
+static inline int __kvm_irq_line_state(unsigned long *irq_state, int polarity,
int irq_source_id, int level)
{
-/* Logical OR for level trig interrupt */
if (level)
__set_bit(irq_source_id, irq_state);
else
__clear_bit(irq_source_id, irq_state);

-   return !!(*irq_state);
+   if (polarity) {
+   /* Logical OR for level trig interrupt, active-high */
+   return !!(*irq_state);
+   } else { // active-low
+   /* Logical AND for level trig interrupt, active-low */
+   return !~(*irq_state);
+   }
}

Thanks,
--Gabriel



Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-17 Thread Gabriel L. Somlo
On Mon, Feb 17, 2014 at 12:57:00PM -0500, Gabriel L. Somlo wrote:
 On Sun, Feb 16, 2014 at 06:23:11PM +0200, Michael S. Tsirkin wrote:
  Well there is a bigger issue: any interrupt with
  multiple sources is broken.
  
  __kvm_irq_line_state does a logical OR of all sources,
  before XOR with polarity.
  
  This makes no sense if polarity is active low.
 
 So, do you think something like this would make sense, to address
 active-low polarity in __kvm_irq_line_state ?
 (this would be independent of the subsequent xor in
 kvm_ioapic_set_irq()):
 
Make that rather:

-static inline int __kvm_irq_line_state(unsigned long *irq_state,
+static inline int __kvm_irq_line_state(unsigned long *irq_state, int polarity,
int irq_source_id, int level)
{
-/* Logical OR for level trig interrupt */
if (level)
__set_bit(irq_source_id, irq_state);
else
__clear_bit(irq_source_id, irq_state);

-   return !!(*irq_state);
+   if (polarity) {
+   /* Logical AND for level trig interrupt, active-low */
+   return !~(*irq_state);
+   } else {
+   /* Logical OR for level trig interrupt, active-high */
+   return !!(*irq_state);
+   }
}

Thanks, and sorry for the noise :)
--Gabriel



Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-17 Thread Paolo Bonzini

Il 17/02/2014 19:01, Gabriel L. Somlo ha scritto:

On Mon, Feb 17, 2014 at 12:57:00PM -0500, Gabriel L. Somlo wrote:

On Sun, Feb 16, 2014 at 06:23:11PM +0200, Michael S. Tsirkin wrote:

Well there is a bigger issue: any interrupt with
multiple sources is broken.

__kvm_irq_line_state does a logical OR of all sources,
before XOR with polarity.

This makes no sense if polarity is active low.


So, do you think something like this would make sense, to address
active-low polarity in __kvm_irq_line_state ?
(this would be independent of the subsequent xor in
kvm_ioapic_set_irq()):


Make that rather:

-static inline int __kvm_irq_line_state(unsigned long *irq_state,
+static inline int __kvm_irq_line_state(unsigned long *irq_state, int polarity,
int irq_source_id, int level)
{
-/* Logical OR for level trig interrupt */
if (level)
__set_bit(irq_source_id, irq_state);
else
__clear_bit(irq_source_id, irq_state);

-   return !!(*irq_state);
+   if (polarity) {
+   /* Logical AND for level trig interrupt, active-low */
+   return !~(*irq_state);


This is ~*irq_state == 0, i.e. *irq_state == ~0.

What if high-order bits of *irq_state are never used?  That is, do you 
need to consider the maximum valid irq_source_id too?




+   } else {
+   /* Logical OR for level trig interrupt, active-high */
+   return !!(*irq_state);


Better rewrite this as *irq_state != 0.

Paolo


+   }
}

Thanks, and sorry for the noise :)
--Gabriel






Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-17 Thread Gabriel L. Somlo
On Mon, Feb 17, 2014 at 07:06:11PM +0100, Paolo Bonzini wrote:
 Il 17/02/2014 19:01, Gabriel L. Somlo ha scritto:
 On Mon, Feb 17, 2014 at 12:57:00PM -0500, Gabriel L. Somlo wrote:
 On Sun, Feb 16, 2014 at 06:23:11PM +0200, Michael S. Tsirkin wrote:
 Well there is a bigger issue: any interrupt with
 multiple sources is broken.
 
 __kvm_irq_line_state does a logical OR of all sources,
 before XOR with polarity.
 
 This makes no sense if polarity is active low.
 
 So, do you think something like this would make sense, to address
 active-low polarity in __kvm_irq_line_state ?
 (this would be independent of the subsequent xor in
 kvm_ioapic_set_irq()):
 
 -return !!(*irq_state);
 +if (polarity) {
 +/* Logical AND for level trig interrupt, active-low */
 +return !~(*irq_state);
 
 This is ~*irq_state == 0, i.e. *irq_state == ~0.
 
 What if high-order bits of *irq_state are never used?  That is, do
 you need to consider the maximum valid irq_source_id too?

Oh, I think I'm starting to comprehend the problem here. The bits of
*irq_state are indexed by irq_source_id, which is dynamically
assigned by kvm_request_irq_source_id().

So, doing the OR thing when assuming always-active-high makes
sense. Doing AND based on an active-low assumption doesn't make
sense, because there could ALWAYS be 0 bits that just weren't
allocated (yet), and I'm having trouble imagining how I'd keep
track of where the current allocation boundary is in a sane way :)

Which I *think* was Michael's original point...

--Gabriel



Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-17 Thread Gabriel L. Somlo
On Mon, Feb 17, 2014 at 02:38:09PM -0500, Gabriel L. Somlo wrote:
 Oh, I think I'm starting to comprehend the problem here. The bits of
 *irq_state are indexed by irq_source_id, which is dynamically
 assigned by kvm_request_irq_source_id().
 
 So, doing the OR thing when assuming always-active-high makes
 sense. Doing AND based on an active-low assumption doesn't make
 sense, because there could ALWAYS be 0 bits that just weren't
 allocated (yet), and I'm having trouble imagining how I'd keep
 track of where the current allocation boundary is in a sane way :)

Hmm, I thought maybe I could use kvm-arch.irq_sources_bitmap, but
that's global across the whole VM, whereas irq_state belongs to
one given GSI. So, the per-GSI bitmap is sparse, so it's at least
as bad as I thought earlier, if not worse :)

Am I missing anything that would put this in a better light ?

Thanks,
--Gabriel



Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-16 Thread Michael S. Tsirkin
On Fri, Feb 14, 2014 at 11:13:04PM +0100, Alexander Graf wrote:
 
 On 14.02.2014, at 23:06, Gabriel L. Somlo gso...@gmail.com wrote:
 
  On Fri, Feb 14, 2014 at 10:21:09PM +0100, Alexander Graf wrote:
  
  Can't you just turn the polarity around in the pci host adapter?
  
  I tried this:
  
  diff --git a/hw/pci/pci.c b/hw/pci/pci.c
  index 1221f32..0e86d21 100644
  --- a/hw/pci/pci.c
  +++ b/hw/pci/pci.c
  @@ -118,13 +118,13 @@ static int pci_bar(PCIDevice *d, int reg)
  
  static inline int pci_irq_state(PCIDevice *d, int irq_num)
  {
  -   return (d-irq_state  irq_num)  0x1;
  +   return !(d-irq_state  irq_num)  0x1;
  }
  
  static inline void pci_set_irq_state(PCIDevice *d, int irq_num, int level)
  {
  d-irq_state = ~(0x1  irq_num);
  -   d-irq_state |= level  irq_num;
  +   d-irq_state = ~(level  irq_num);
  }
  
  static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int 
  change)
  @@ -229,7 +229,7 @@ static void pcibus_reset(BusState *qbus)
  }
  
  for (i = 0; i  bus-nirq; i++) {
  -assert(bus-irq_count[i] == 0);
  +assert(bus-irq_count[i] != 0);
  }
  }
  
  ---
  
  but now OS X freezes during boot right after
  
  [ PCI configuration begin ]
  [ PCI configuration end, bridges 1, devices 10 ]
  RTC: Only single RAM bank (128 bytes)
  
  which all looks normal, except the process is supposed to continue on
  from there and doesn't :)
  
  On Linux, I get Fedora 20 live all the way up with no obvious/loud
  complaints, but mouse and keyboard don't work at all...
  
  I have to admit I'm a bit out of my depth here, though :)
 
 Yeah, another thing we have to take into account is vhost-net which generates 
 IRQs directly through irqfd. I guess for those we'll have to configure the 
 polarity in the irq routing table?
 
 
 Alex

This is using MSI-X interrupts which are edge though,
not going through IOAPIC at all.




Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-16 Thread Michael S. Tsirkin
On Fri, Feb 14, 2014 at 04:13:11PM -0500, Gabriel L. Somlo wrote:
 On Tue, Feb 11, 2014 at 09:54:44PM +0200, Michael S. Tsirkin wrote:
  On Tue, Feb 11, 2014 at 01:23:31PM -0500, Gabriel L. Somlo wrote:
   1. Regarding KVM and the polarity xor line in the patch above: Does
   anyone have experience with any *other* guests which insist on setting
   level-triggered interrupt polarity to 1/active-low ? Is that xor line
   actually doing anything useful in practice, for any other guest, on
   either QEMU or any other platform ?
   
   
   2. Is there anything in QEMU (besides the ACPI DSDT .dsl files) which
   has a hardcoded assumption re. polarity == 0, or active-high, for
   level-triggered interrupts? I tried to dig through hw/i386/kvm/ioapic.c
   and a bunch of other files, but couldn't isolate anything that I could
   flip to fix things in userspace.
   
   
   Any ideas or suggestions about the appropriate way to move forward would
   be much appreciated !!!
   
   
   Thanks much,
   --Gabriel
  
  I think changing ACPI is the right thing to
  do really. But we'll need to fix some things
  first of course.
 
 So I followed your advice, and was able to boot OS X just fine (but
 booting Linux after this patch still resulted in multiple no one
 cared complaints on IRQs 17, 18, 19, etc.:
 
 diff --git a/hw/i386/q35-acpi-dsdt.dsl b/hw/i386/q35-acpi-dsdt.dsl
 index d618e9e..9c52f64 100644
 --- a/hw/i386/q35-acpi-dsdt.dsl
 +++ b/hw/i386/q35-acpi-dsdt.dsl
 @@ -353,7 +353,7 @@ DefinitionBlock (
  Method(IQCR, 1, Serialized) {
  // _CRS method - get current settings
  Name(PRR0, ResourceTemplate() {
 -Interrupt(, Level, ActiveHigh, Shared) { 0 }
 +Interrupt(, Level, ActiveLow, Shared) { 0 }
  })
  CreateDWordField(PRR0, 0x05, PRRI)
  Store(And(Arg0, 0x0F), PRRI)
 @@ -365,7 +365,7 @@ DefinitionBlock (
  Name(_HID, EISAID(PNP0C0F))   \
  Name(_UID, uid) \
  Name(_PRS, ResourceTemplate() { \
 -Interrupt(, Level, ActiveHigh, Shared) {\
 +Interrupt(, Level, ActiveLow, Shared) {\
  5, 10, 11   \
  }   \
  })  \
 @@ -398,12 +398,12 @@ DefinitionBlock (
  Name(_HID, EISAID(PNP0C0F))   \
  Name(_UID, uid) \
  Name(_PRS, ResourceTemplate() { \
 -Interrupt(, Level, ActiveHigh, Shared) {\
 +Interrupt(, Level, ActiveLow, Shared) {\
  gsi \
  }   \
  })  \
  Name(_CRS, ResourceTemplate() { \
 -Interrupt(, Level, ActiveHigh, Shared) {\
 +Interrupt(, Level, ActiveLow, Shared) {\
  gsi \
  }   \
  })  \
 diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
 index 51ce12d..fe1527a 100644
 --- a/hw/isa/lpc_ich9.c
 +++ b/hw/isa/lpc_ich9.c
 @@ -206,17 +206,17 @@ static void ich9_lpc_update_pic(ICH9LPCState *lpc, int 
 pic_irq)
  int i, pic_level;
  
  /* The pic level is the logical OR of all the PCI irqs mapped to it */
 -pic_level = 0;
 +pic_level = 1;
  for (i = 0; i  ICH9_LPC_NB_PIRQS; i++) {
  int tmp_irq;
  int tmp_dis;
  ich9_lpc_pic_irq(lpc, i, tmp_irq, tmp_dis);
  if (!tmp_dis  pic_irq == tmp_irq) {
 -pic_level |= pci_bus_get_irq_level(lpc-d.bus, i);
 +pic_level = !pci_bus_get_irq_level(lpc-d.bus, i);
  }
  }
  if (pic_irq == ich9_lpc_sci_irq(lpc)) {
 -pic_level |= lpc-sci_level;
 +pic_level = !lpc-sci_level;
  }
  
  qemu_set_irq(lpc-pic[pic_irq], pic_level);
 --
 
 However, even on OS X, the Ethernet (e1000) card won't link up at all.
 Fixing that requires another patch:
 
 diff --git a/hw/net/e1000.c b/hw/net/e1000.c
 index 58ba93b..c7a2c07 100644
 --- a/hw/net/e1000.c
 +++ b/hw/net/e1000.c
 @@ -301,7 +301,7 @@ set_interrupt_cause(E1000State *s, int index, uint32_t 
 val)
  s-mac_reg[ICS] = val;
  
  pending_ints = (s-mac_reg[IMS]  s-mac_reg[ICR]);
 -if (!s-mit_irq_level  pending_ints) {
 +if (s-mit_irq_level  pending_ints) {
  /*
   * Here we detect a potential raising edge. We postpone raising the
   * interrupt line if we are inside the mitigation delay 

Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-16 Thread Michael S. Tsirkin
On Fri, Feb 14, 2014 at 04:13:11PM -0500, Gabriel L. Somlo wrote:
 On Tue, Feb 11, 2014 at 09:54:44PM +0200, Michael S. Tsirkin wrote:
  On Tue, Feb 11, 2014 at 01:23:31PM -0500, Gabriel L. Somlo wrote:
   1. Regarding KVM and the polarity xor line in the patch above: Does
   anyone have experience with any *other* guests which insist on setting
   level-triggered interrupt polarity to 1/active-low ? Is that xor line
   actually doing anything useful in practice, for any other guest, on
   either QEMU or any other platform ?
   
   
   2. Is there anything in QEMU (besides the ACPI DSDT .dsl files) which
   has a hardcoded assumption re. polarity == 0, or active-high, for
   level-triggered interrupts? I tried to dig through hw/i386/kvm/ioapic.c
   and a bunch of other files, but couldn't isolate anything that I could
   flip to fix things in userspace.
   
   
   Any ideas or suggestions about the appropriate way to move forward would
   be much appreciated !!!
   
   
   Thanks much,
   --Gabriel
  
  I think changing ACPI is the right thing to
  do really. But we'll need to fix some things
  first of course.
 
 So I followed your advice, and was able to boot OS X just fine (but
 booting Linux after this patch still resulted in multiple no one
 cared complaints on IRQs 17, 18, 19, etc.:
 
 diff --git a/hw/i386/q35-acpi-dsdt.dsl b/hw/i386/q35-acpi-dsdt.dsl
 index d618e9e..9c52f64 100644
 --- a/hw/i386/q35-acpi-dsdt.dsl
 +++ b/hw/i386/q35-acpi-dsdt.dsl
 @@ -353,7 +353,7 @@ DefinitionBlock (
  Method(IQCR, 1, Serialized) {
  // _CRS method - get current settings
  Name(PRR0, ResourceTemplate() {
 -Interrupt(, Level, ActiveHigh, Shared) { 0 }
 +Interrupt(, Level, ActiveLow, Shared) { 0 }
  })
  CreateDWordField(PRR0, 0x05, PRRI)
  Store(And(Arg0, 0x0F), PRRI)
 @@ -365,7 +365,7 @@ DefinitionBlock (
  Name(_HID, EISAID(PNP0C0F))   \
  Name(_UID, uid) \
  Name(_PRS, ResourceTemplate() { \
 -Interrupt(, Level, ActiveHigh, Shared) {\
 +Interrupt(, Level, ActiveLow, Shared) {\
  5, 10, 11   \
  }   \
  })  \
 @@ -398,12 +398,12 @@ DefinitionBlock (
  Name(_HID, EISAID(PNP0C0F))   \
  Name(_UID, uid) \
  Name(_PRS, ResourceTemplate() { \
 -Interrupt(, Level, ActiveHigh, Shared) {\
 +Interrupt(, Level, ActiveLow, Shared) {\
  gsi \
  }   \
  })  \
  Name(_CRS, ResourceTemplate() { \
 -Interrupt(, Level, ActiveHigh, Shared) {\
 +Interrupt(, Level, ActiveLow, Shared) {\
  gsi \
  }   \
  })  \
 diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
 index 51ce12d..fe1527a 100644
 --- a/hw/isa/lpc_ich9.c
 +++ b/hw/isa/lpc_ich9.c
 @@ -206,17 +206,17 @@ static void ich9_lpc_update_pic(ICH9LPCState *lpc, int 
 pic_irq)
  int i, pic_level;
  
  /* The pic level is the logical OR of all the PCI irqs mapped to it */
 -pic_level = 0;
 +pic_level = 1;
  for (i = 0; i  ICH9_LPC_NB_PIRQS; i++) {
  int tmp_irq;
  int tmp_dis;
  ich9_lpc_pic_irq(lpc, i, tmp_irq, tmp_dis);
  if (!tmp_dis  pic_irq == tmp_irq) {
 -pic_level |= pci_bus_get_irq_level(lpc-d.bus, i);
 +pic_level = !pci_bus_get_irq_level(lpc-d.bus, i);
  }
  }
  if (pic_irq == ich9_lpc_sci_irq(lpc)) {
 -pic_level |= lpc-sci_level;
 +pic_level = !lpc-sci_level;
  }
  
  qemu_set_irq(lpc-pic[pic_irq], pic_level);
 --
 
 However, even on OS X, the Ethernet (e1000) card won't link up at all.
 Fixing that requires another patch:
 
 diff --git a/hw/net/e1000.c b/hw/net/e1000.c
 index 58ba93b..c7a2c07 100644
 --- a/hw/net/e1000.c
 +++ b/hw/net/e1000.c
 @@ -301,7 +301,7 @@ set_interrupt_cause(E1000State *s, int index, uint32_t 
 val)
  s-mac_reg[ICS] = val;
  
  pending_ints = (s-mac_reg[IMS]  s-mac_reg[ICR]);
 -if (!s-mit_irq_level  pending_ints) {
 +if (s-mit_irq_level  pending_ints) {
  /*
   * Here we detect a potential raising edge. We postpone raising the
   * interrupt line if we are inside the mitigation delay 

Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-16 Thread Michael S. Tsirkin
On Fri, Feb 14, 2014 at 11:13:04PM +0100, Alexander Graf wrote:
 
 On 14.02.2014, at 23:06, Gabriel L. Somlo gso...@gmail.com wrote:
 
  On Fri, Feb 14, 2014 at 10:21:09PM +0100, Alexander Graf wrote:
  
  Can't you just turn the polarity around in the pci host adapter?
  
  I tried this:
  
  diff --git a/hw/pci/pci.c b/hw/pci/pci.c
  index 1221f32..0e86d21 100644
  --- a/hw/pci/pci.c
  +++ b/hw/pci/pci.c
  @@ -118,13 +118,13 @@ static int pci_bar(PCIDevice *d, int reg)
  
  static inline int pci_irq_state(PCIDevice *d, int irq_num)
  {
  -   return (d-irq_state  irq_num)  0x1;
  +   return !(d-irq_state  irq_num)  0x1;
  }
  
  static inline void pci_set_irq_state(PCIDevice *d, int irq_num, int level)
  {
  d-irq_state = ~(0x1  irq_num);
  -   d-irq_state |= level  irq_num;
  +   d-irq_state = ~(level  irq_num);
  }
  
  static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int 
  change)
  @@ -229,7 +229,7 @@ static void pcibus_reset(BusState *qbus)
  }
  
  for (i = 0; i  bus-nirq; i++) {
  -assert(bus-irq_count[i] == 0);
  +assert(bus-irq_count[i] != 0);
  }
  }
  
  ---
  
  but now OS X freezes during boot right after
  
  [ PCI configuration begin ]
  [ PCI configuration end, bridges 1, devices 10 ]
  RTC: Only single RAM bank (128 bytes)
  
  which all looks normal, except the process is supposed to continue on
  from there and doesn't :)
  
  On Linux, I get Fedora 20 live all the way up with no obvious/loud
  complaints, but mouse and keyboard don't work at all...
  
  I have to admit I'm a bit out of my depth here, though :)
 
 Yeah, another thing we have to take into account is vhost-net which generates 
 IRQs directly through irqfd. I guess for those we'll have to configure the 
 polarity in the irq routing table?
 
 
 Alex

What will be affected is VFIO which uses IRQFD
for level interrupts with KVM_IRQFD_FLAG_RESAMPLE.
I suspect this will need a kernel change, maybe
a new flag for IRQFD: KVM_IRQFD_FLAG_ACTIVE_LOW,
since at the moment that does:

static void
irqfd_inject(struct work_struct *work)
{
struct _irqfd *irqfd = container_of(work, struct _irqfd, inject);
struct kvm *kvm = irqfd-kvm;

if (!irqfd-resampler) {
kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd-gsi, 1,
false);
kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd-gsi, 0,
false);
} else
kvm_set_irq(kvm, KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID,
irqfd-gsi, 1, false);
}



-- 
MST



Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-16 Thread Alex Williamson
On Sun, 2014-02-16 at 13:41 +0200, Michael S. Tsirkin wrote:
 On Fri, Feb 14, 2014 at 11:13:04PM +0100, Alexander Graf wrote:
  
  On 14.02.2014, at 23:06, Gabriel L. Somlo gso...@gmail.com wrote:
  
   On Fri, Feb 14, 2014 at 10:21:09PM +0100, Alexander Graf wrote:
   
   Can't you just turn the polarity around in the pci host adapter?
   
   I tried this:
   
   diff --git a/hw/pci/pci.c b/hw/pci/pci.c
   index 1221f32..0e86d21 100644
   --- a/hw/pci/pci.c
   +++ b/hw/pci/pci.c
   @@ -118,13 +118,13 @@ static int pci_bar(PCIDevice *d, int reg)
   
   static inline int pci_irq_state(PCIDevice *d, int irq_num)
   {
   - return (d-irq_state  irq_num)  0x1;
   + return !(d-irq_state  irq_num)  0x1;
   }
   
   static inline void pci_set_irq_state(PCIDevice *d, int irq_num, int level)
   {
 d-irq_state = ~(0x1  irq_num);
   - d-irq_state |= level  irq_num;
   + d-irq_state = ~(level  irq_num);
   }
   
   static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int 
   change)
   @@ -229,7 +229,7 @@ static void pcibus_reset(BusState *qbus)
   }
   
   for (i = 0; i  bus-nirq; i++) {
   -assert(bus-irq_count[i] == 0);
   +assert(bus-irq_count[i] != 0);
   }
   }
   
   ---
   
   but now OS X freezes during boot right after
   
 [ PCI configuration begin ]
 [ PCI configuration end, bridges 1, devices 10 ]
 RTC: Only single RAM bank (128 bytes)
   
   which all looks normal, except the process is supposed to continue on
   from there and doesn't :)
   
   On Linux, I get Fedora 20 live all the way up with no obvious/loud
   complaints, but mouse and keyboard don't work at all...
   
   I have to admit I'm a bit out of my depth here, though :)
  
  Yeah, another thing we have to take into account is vhost-net which 
  generates IRQs directly through irqfd. I guess for those we'll have to 
  configure the polarity in the irq routing table?
  
  
  Alex
 
 What will be affected is VFIO which uses IRQFD
 for level interrupts with KVM_IRQFD_FLAG_RESAMPLE.
 I suspect this will need a kernel change, maybe
 a new flag for IRQFD: KVM_IRQFD_FLAG_ACTIVE_LOW,
 since at the moment that does:
 
 static void
 irqfd_inject(struct work_struct *work)
 {
 struct _irqfd *irqfd = container_of(work, struct _irqfd, inject);
 struct kvm *kvm = irqfd-kvm;
 
 if (!irqfd-resampler) {
 kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd-gsi, 1,
 false);
 kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd-gsi, 0,
 false);
 } else
 kvm_set_irq(kvm, KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID,
 irqfd-gsi, 1, false);
 }



As you said in a previous message, devices just want assert  de-assert,
1  0, which is what we have here.  I would think that what asserted
means only needs to be interpreted at the IOAPIC, so I'd hope we could
get it right w/o an API change.  Thanks,

Alex




Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-16 Thread Peter Maydell
On 16 February 2014 11:34, Michael S. Tsirkin m...@redhat.com wrote:
 Hmm no this is all wrong, from API point of view,
 devices shoud not care about value of interrupt.
 They just assert/deassert interrupts.
 It so happens that 1 means assert 0 means deassert.

Yeah, we generally model things as active-high even if the
hardware really treats the signal as active-low. (Among other
things there are some issues around how exactly device reset
should interact with a signal that is supposed to be high coming
out of reset, given you don't know whether the device at the
other end of the line has reset yet or not.)
This is great up until the point where you have a generic
GPIO device one of whose GPIO output lines happens to
be wired to an interrupt controller, of course.

thanks
-- PMM



Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-16 Thread Michael S. Tsirkin
On Sun, Feb 16, 2014 at 07:47:00AM -0700, Alex Williamson wrote:
 On Sun, 2014-02-16 at 13:41 +0200, Michael S. Tsirkin wrote:
  On Fri, Feb 14, 2014 at 11:13:04PM +0100, Alexander Graf wrote:
   
   On 14.02.2014, at 23:06, Gabriel L. Somlo gso...@gmail.com wrote:
   
On Fri, Feb 14, 2014 at 10:21:09PM +0100, Alexander Graf wrote:

Can't you just turn the polarity around in the pci host adapter?

I tried this:

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 1221f32..0e86d21 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -118,13 +118,13 @@ static int pci_bar(PCIDevice *d, int reg)

static inline int pci_irq_state(PCIDevice *d, int irq_num)
{
-   return (d-irq_state  irq_num)  0x1;
+   return !(d-irq_state  irq_num)  0x1;
}

static inline void pci_set_irq_state(PCIDevice *d, int irq_num, int 
level)
{
d-irq_state = ~(0x1  irq_num);
-   d-irq_state |= level  irq_num;
+   d-irq_state = ~(level  irq_num);
}

static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int 
change)
@@ -229,7 +229,7 @@ static void pcibus_reset(BusState *qbus)
}

for (i = 0; i  bus-nirq; i++) {
-assert(bus-irq_count[i] == 0);
+assert(bus-irq_count[i] != 0);
}
}

---

but now OS X freezes during boot right after

[ PCI configuration begin ]
[ PCI configuration end, bridges 1, devices 10 ]
RTC: Only single RAM bank (128 bytes)

which all looks normal, except the process is supposed to continue on
from there and doesn't :)

On Linux, I get Fedora 20 live all the way up with no obvious/loud
complaints, but mouse and keyboard don't work at all...

I have to admit I'm a bit out of my depth here, though :)
   
   Yeah, another thing we have to take into account is vhost-net which 
   generates IRQs directly through irqfd. I guess for those we'll have to 
   configure the polarity in the irq routing table?
   
   
   Alex
  
  What will be affected is VFIO which uses IRQFD
  for level interrupts with KVM_IRQFD_FLAG_RESAMPLE.
  I suspect this will need a kernel change, maybe
  a new flag for IRQFD: KVM_IRQFD_FLAG_ACTIVE_LOW,
  since at the moment that does:
  
  static void
  irqfd_inject(struct work_struct *work)
  {
  struct _irqfd *irqfd = container_of(work, struct _irqfd, inject);
  struct kvm *kvm = irqfd-kvm;
  
  if (!irqfd-resampler) {
  kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd-gsi, 1,
  false);
  kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd-gsi, 0,
  false);
  } else
  kvm_set_irq(kvm, KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID,
  irqfd-gsi, 1, false);
  }
 
 
 
 As you said in a previous message, devices just want assert  de-assert,
 1  0, which is what we have here.  I would think that what asserted
 means only needs to be interpreted at the IOAPIC, so I'd hope we could
 get it right w/o an API change.


Well there is a bigger issue: any interrupt with
multiple sources is broken.

__kvm_irq_line_state does a logical OR of all sources,
before XOR with polarity.

This makes no sense if polarity is active low.


One is beginning to think the simplest fix
would be Gabriel's patch after all:
-  irq_level ^= entry.fields.polarity;


although it's ugly in that it perpetuates the
bug in more places instead of fixing it.


  Thanks,
 
 Alex



Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-14 Thread Gabriel L. Somlo
On Tue, Feb 11, 2014 at 09:54:44PM +0200, Michael S. Tsirkin wrote:
 On Tue, Feb 11, 2014 at 01:23:31PM -0500, Gabriel L. Somlo wrote:
  1. Regarding KVM and the polarity xor line in the patch above: Does
  anyone have experience with any *other* guests which insist on setting
  level-triggered interrupt polarity to 1/active-low ? Is that xor line
  actually doing anything useful in practice, for any other guest, on
  either QEMU or any other platform ?
  
  
  2. Is there anything in QEMU (besides the ACPI DSDT .dsl files) which
  has a hardcoded assumption re. polarity == 0, or active-high, for
  level-triggered interrupts? I tried to dig through hw/i386/kvm/ioapic.c
  and a bunch of other files, but couldn't isolate anything that I could
  flip to fix things in userspace.
  
  
  Any ideas or suggestions about the appropriate way to move forward would
  be much appreciated !!!
  
  
  Thanks much,
  --Gabriel
 
 I think changing ACPI is the right thing to
 do really. But we'll need to fix some things
 first of course.

So I followed your advice, and was able to boot OS X just fine (but
booting Linux after this patch still resulted in multiple no one
cared complaints on IRQs 17, 18, 19, etc.:

diff --git a/hw/i386/q35-acpi-dsdt.dsl b/hw/i386/q35-acpi-dsdt.dsl
index d618e9e..9c52f64 100644
--- a/hw/i386/q35-acpi-dsdt.dsl
+++ b/hw/i386/q35-acpi-dsdt.dsl
@@ -353,7 +353,7 @@ DefinitionBlock (
 Method(IQCR, 1, Serialized) {
 // _CRS method - get current settings
 Name(PRR0, ResourceTemplate() {
-Interrupt(, Level, ActiveHigh, Shared) { 0 }
+Interrupt(, Level, ActiveLow, Shared) { 0 }
 })
 CreateDWordField(PRR0, 0x05, PRRI)
 Store(And(Arg0, 0x0F), PRRI)
@@ -365,7 +365,7 @@ DefinitionBlock (
 Name(_HID, EISAID(PNP0C0F))   \
 Name(_UID, uid) \
 Name(_PRS, ResourceTemplate() { \
-Interrupt(, Level, ActiveHigh, Shared) {\
+Interrupt(, Level, ActiveLow, Shared) {\
 5, 10, 11   \
 }   \
 })  \
@@ -398,12 +398,12 @@ DefinitionBlock (
 Name(_HID, EISAID(PNP0C0F))   \
 Name(_UID, uid) \
 Name(_PRS, ResourceTemplate() { \
-Interrupt(, Level, ActiveHigh, Shared) {\
+Interrupt(, Level, ActiveLow, Shared) {\
 gsi \
 }   \
 })  \
 Name(_CRS, ResourceTemplate() { \
-Interrupt(, Level, ActiveHigh, Shared) {\
+Interrupt(, Level, ActiveLow, Shared) {\
 gsi \
 }   \
 })  \
diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
index 51ce12d..fe1527a 100644
--- a/hw/isa/lpc_ich9.c
+++ b/hw/isa/lpc_ich9.c
@@ -206,17 +206,17 @@ static void ich9_lpc_update_pic(ICH9LPCState *lpc, int 
pic_irq)
 int i, pic_level;
 
 /* The pic level is the logical OR of all the PCI irqs mapped to it */
-pic_level = 0;
+pic_level = 1;
 for (i = 0; i  ICH9_LPC_NB_PIRQS; i++) {
 int tmp_irq;
 int tmp_dis;
 ich9_lpc_pic_irq(lpc, i, tmp_irq, tmp_dis);
 if (!tmp_dis  pic_irq == tmp_irq) {
-pic_level |= pci_bus_get_irq_level(lpc-d.bus, i);
+pic_level = !pci_bus_get_irq_level(lpc-d.bus, i);
 }
 }
 if (pic_irq == ich9_lpc_sci_irq(lpc)) {
-pic_level |= lpc-sci_level;
+pic_level = !lpc-sci_level;
 }
 
 qemu_set_irq(lpc-pic[pic_irq], pic_level);
--

However, even on OS X, the Ethernet (e1000) card won't link up at all.
Fixing that requires another patch:

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 58ba93b..c7a2c07 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -301,7 +301,7 @@ set_interrupt_cause(E1000State *s, int index, uint32_t val)
 s-mac_reg[ICS] = val;
 
 pending_ints = (s-mac_reg[IMS]  s-mac_reg[ICR]);
-if (!s-mit_irq_level  pending_ints) {
+if (s-mit_irq_level  pending_ints) {
 /*
  * Here we detect a potential raising edge. We postpone raising the
  * interrupt line if we are inside the mitigation delay window
@@ -339,7 +339,7 @@ set_interrupt_cause(E1000State *s, int index, uint32_t val)
 }
 }
 
-s-mit_irq_level = (pending_ints != 0);
+s-mit_irq_level = 

Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-14 Thread Alexander Graf


 Am 14.02.2014 um 22:13 schrieb Gabriel L. Somlo gso...@gmail.com:
 
 On Tue, Feb 11, 2014 at 09:54:44PM +0200, Michael S. Tsirkin wrote:
 On Tue, Feb 11, 2014 at 01:23:31PM -0500, Gabriel L. Somlo wrote:
 1. Regarding KVM and the polarity xor line in the patch above: Does
 anyone have experience with any *other* guests which insist on setting
 level-triggered interrupt polarity to 1/active-low ? Is that xor line
 actually doing anything useful in practice, for any other guest, on
 either QEMU or any other platform ?
 
 
 2. Is there anything in QEMU (besides the ACPI DSDT .dsl files) which
 has a hardcoded assumption re. polarity == 0, or active-high, for
 level-triggered interrupts? I tried to dig through hw/i386/kvm/ioapic.c
 and a bunch of other files, but couldn't isolate anything that I could
 flip to fix things in userspace.
 
 
 Any ideas or suggestions about the appropriate way to move forward would
 be much appreciated !!!
 
 
 Thanks much,
 --Gabriel
 
 I think changing ACPI is the right thing to
 do really. But we'll need to fix some things
 first of course.
 
 So I followed your advice, and was able to boot OS X just fine (but
 booting Linux after this patch still resulted in multiple no one
 cared complaints on IRQs 17, 18, 19, etc.:
 
 diff --git a/hw/i386/q35-acpi-dsdt.dsl b/hw/i386/q35-acpi-dsdt.dsl
 index d618e9e..9c52f64 100644
 --- a/hw/i386/q35-acpi-dsdt.dsl
 +++ b/hw/i386/q35-acpi-dsdt.dsl
 @@ -353,7 +353,7 @@ DefinitionBlock (
 Method(IQCR, 1, Serialized) {
 // _CRS method - get current settings
 Name(PRR0, ResourceTemplate() {
 -Interrupt(, Level, ActiveHigh, Shared) { 0 }
 +Interrupt(, Level, ActiveLow, Shared) { 0 }
 })
 CreateDWordField(PRR0, 0x05, PRRI)
 Store(And(Arg0, 0x0F), PRRI)
 @@ -365,7 +365,7 @@ DefinitionBlock (
 Name(_HID, EISAID(PNP0C0F))   \
 Name(_UID, uid) \
 Name(_PRS, ResourceTemplate() { \
 -Interrupt(, Level, ActiveHigh, Shared) {\
 +Interrupt(, Level, ActiveLow, Shared) {\
 5, 10, 11   \
 }   \
 })  \
 @@ -398,12 +398,12 @@ DefinitionBlock (
 Name(_HID, EISAID(PNP0C0F))   \
 Name(_UID, uid) \
 Name(_PRS, ResourceTemplate() { \
 -Interrupt(, Level, ActiveHigh, Shared) {\
 +Interrupt(, Level, ActiveLow, Shared) {\
 gsi \
 }   \
 })  \
 Name(_CRS, ResourceTemplate() { \
 -Interrupt(, Level, ActiveHigh, Shared) {\
 +Interrupt(, Level, ActiveLow, Shared) {\
 gsi \
 }   \
 })  \
 diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
 index 51ce12d..fe1527a 100644
 --- a/hw/isa/lpc_ich9.c
 +++ b/hw/isa/lpc_ich9.c
 @@ -206,17 +206,17 @@ static void ich9_lpc_update_pic(ICH9LPCState *lpc, int 
 pic_irq)
 int i, pic_level;
 
 /* The pic level is the logical OR of all the PCI irqs mapped to it */
 -pic_level = 0;
 +pic_level = 1;
 for (i = 0; i  ICH9_LPC_NB_PIRQS; i++) {
 int tmp_irq;
 int tmp_dis;
 ich9_lpc_pic_irq(lpc, i, tmp_irq, tmp_dis);
 if (!tmp_dis  pic_irq == tmp_irq) {
 -pic_level |= pci_bus_get_irq_level(lpc-d.bus, i);
 +pic_level = !pci_bus_get_irq_level(lpc-d.bus, i);
 }
 }
 if (pic_irq == ich9_lpc_sci_irq(lpc)) {
 -pic_level |= lpc-sci_level;
 +pic_level = !lpc-sci_level;
 }
 
 qemu_set_irq(lpc-pic[pic_irq], pic_level);
 --
 
 However, even on OS X, the Ethernet (e1000) card won't link up at all.
 Fixing that requires another patch:
 
 diff --git a/hw/net/e1000.c b/hw/net/e1000.c
 index 58ba93b..c7a2c07 100644
 --- a/hw/net/e1000.c
 +++ b/hw/net/e1000.c
 @@ -301,7 +301,7 @@ set_interrupt_cause(E1000State *s, int index, uint32_t 
 val)
 s-mac_reg[ICS] = val;
 
 pending_ints = (s-mac_reg[IMS]  s-mac_reg[ICR]);
 -if (!s-mit_irq_level  pending_ints) {
 +if (s-mit_irq_level  pending_ints) {
 /*
  * Here we detect a potential raising edge. We postpone raising the
  * interrupt line if we are inside the mitigation delay window
 @@ -339,7 +339,7 @@ set_interrupt_cause(E1000State *s, int index, 

Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-14 Thread Gabriel L. Somlo
On Fri, Feb 14, 2014 at 10:21:09PM +0100, Alexander Graf wrote:
 
 Can't you just turn the polarity around in the pci host adapter?

I tried this:

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 1221f32..0e86d21 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -118,13 +118,13 @@ static int pci_bar(PCIDevice *d, int reg)
 
 static inline int pci_irq_state(PCIDevice *d, int irq_num)
 {
-   return (d-irq_state  irq_num)  0x1;
+   return !(d-irq_state  irq_num)  0x1;
 }
 
 static inline void pci_set_irq_state(PCIDevice *d, int irq_num, int level)
 {
d-irq_state = ~(0x1  irq_num);
-   d-irq_state |= level  irq_num;
+   d-irq_state = ~(level  irq_num);
 }
 
 static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int change)
@@ -229,7 +229,7 @@ static void pcibus_reset(BusState *qbus)
 }
 
 for (i = 0; i  bus-nirq; i++) {
-assert(bus-irq_count[i] == 0);
+assert(bus-irq_count[i] != 0);
 }
 }
 
---

but now OS X freezes during boot right after

[ PCI configuration begin ]
[ PCI configuration end, bridges 1, devices 10 ]
RTC: Only single RAM bank (128 bytes)

which all looks normal, except the process is supposed to continue on
from there and doesn't :)

On Linux, I get Fedora 20 live all the way up with no obvious/loud
complaints, but mouse and keyboard don't work at all...

I have to admit I'm a bit out of my depth here, though :)

--Gabriel



Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-14 Thread Alexander Graf

On 14.02.2014, at 23:06, Gabriel L. Somlo gso...@gmail.com wrote:

 On Fri, Feb 14, 2014 at 10:21:09PM +0100, Alexander Graf wrote:
 
 Can't you just turn the polarity around in the pci host adapter?
 
 I tried this:
 
 diff --git a/hw/pci/pci.c b/hw/pci/pci.c
 index 1221f32..0e86d21 100644
 --- a/hw/pci/pci.c
 +++ b/hw/pci/pci.c
 @@ -118,13 +118,13 @@ static int pci_bar(PCIDevice *d, int reg)
 
 static inline int pci_irq_state(PCIDevice *d, int irq_num)
 {
 - return (d-irq_state  irq_num)  0x1;
 + return !(d-irq_state  irq_num)  0x1;
 }
 
 static inline void pci_set_irq_state(PCIDevice *d, int irq_num, int level)
 {
   d-irq_state = ~(0x1  irq_num);
 - d-irq_state |= level  irq_num;
 + d-irq_state = ~(level  irq_num);
 }
 
 static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int change)
 @@ -229,7 +229,7 @@ static void pcibus_reset(BusState *qbus)
 }
 
 for (i = 0; i  bus-nirq; i++) {
 -assert(bus-irq_count[i] == 0);
 +assert(bus-irq_count[i] != 0);
 }
 }
 
 ---
 
 but now OS X freezes during boot right after
 
   [ PCI configuration begin ]
   [ PCI configuration end, bridges 1, devices 10 ]
   RTC: Only single RAM bank (128 bytes)
 
 which all looks normal, except the process is supposed to continue on
 from there and doesn't :)
 
 On Linux, I get Fedora 20 live all the way up with no obvious/loud
 complaints, but mouse and keyboard don't work at all...
 
 I have to admit I'm a bit out of my depth here, though :)

Yeah, another thing we have to take into account is vhost-net which generates 
IRQs directly through irqfd. I guess for those we'll have to configure the 
polarity in the irq routing table?


Alex




[Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-11 Thread Gabriel L. Somlo
Hi,

I'm trying to get OS X to work as a QEMU guest, and one of the few
remaining mysteries I need to solve is that the OS X guest hangs
during boot, waiting for its boot disk to be available, unless the
following KVM patch is applied:


diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index ce9ed99..1539d37 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -328,7 +328,6 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, 
int irq_source_id,
irq_level = __kvm_irq_line_state(ioapic-irq_states[irq],
 irq_source_id, level);
entry = ioapic-redirtbl[irq];
-   irq_level ^= entry.fields.polarity;
if (!irq_level) {
ioapic-irr = ~mask;
ret = 1;
--


After digging around the KVM source for a bit, and printk-ing things
from Windows 7, Fedora 20, and OS X (10.9), I figured out the following:


1. Edge-triggered interrupts are invariably unaffected by the xor line
being removed by the patch. On all three guest types, edge-triggered
interrupts have polarity set to 0, so the xor is essentially a no-op,
and we can forget about it altogether.


2. Windows and Linux always configure all level-triggered interrupts
with polarity 0 (active-high, consistent with QEMU's ACPI/DSDT, in
particular q35-acpi-dsdt.dsl, which is what I'm using with -M q35).
As such, on Windows and Linux, the xor line in question is still a
no-op.


3. OS X (all versions I tried, at least since 10.5/Leopard) always
configures all level-triggered interrupts with polarity 1 (active-low),
regardless of what the QEMU DSDT says. As such, the xor line acts as
a negation of irq_level, which at first glance sounds reasonable.

However: when KVM negates irq_level due to polarity == 1, the OS X
guest hangs during boot.

OS X works fine when polarity == 1 is ignored (with the xor line
commented out).

This may be another instance (similar to how OS X didn't use to check
with CPUID regarding monitor/mwait instruction availability) where
apple devs know that any of their supported hardware advertises
active-low in the DSDT, so no need to check, just hardcode that
assumption... :)


4. With s/ActiveHigh/ActiveLow/ in QEMU's q35-acpi-dsdt.dsl, Linux
actually switches to polarity == 1 (active-low), and works fine
*with the xor line removed* !!!. With the xor line left intact (i.e.
without the above patch), the active-low fedora guest worked extremely
poorly, and printed out multiple error messages during boot:

irq XX: nobody cared (try booting with the irqpoll option)
...
Disabling IRQ #XX

for XX in [16, 18, 19, ...].


So, right now, I'm wondering about the following:


1. Regarding KVM and the polarity xor line in the patch above: Does
anyone have experience with any *other* guests which insist on setting
level-triggered interrupt polarity to 1/active-low ? Is that xor line
actually doing anything useful in practice, for any other guest, on
either QEMU or any other platform ?


2. Is there anything in QEMU (besides the ACPI DSDT .dsl files) which
has a hardcoded assumption re. polarity == 0, or active-high, for
level-triggered interrupts? I tried to dig through hw/i386/kvm/ioapic.c
and a bunch of other files, but couldn't isolate anything that I could
flip to fix things in userspace.


Any ideas or suggestions about the appropriate way to move forward would
be much appreciated !!!


Thanks much,
--Gabriel



Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-11 Thread Michael S. Tsirkin
On Tue, Feb 11, 2014 at 01:23:31PM -0500, Gabriel L. Somlo wrote:
 Hi,
 
 I'm trying to get OS X to work as a QEMU guest, and one of the few
 remaining mysteries I need to solve is that the OS X guest hangs
 during boot, waiting for its boot disk to be available, unless the
 following KVM patch is applied:
 
 
 diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
 index ce9ed99..1539d37 100644
 --- a/virt/kvm/ioapic.c
 +++ b/virt/kvm/ioapic.c
 @@ -328,7 +328,6 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int 
 irq, int irq_source_id,
   irq_level = __kvm_irq_line_state(ioapic-irq_states[irq],
irq_source_id, level);
   entry = ioapic-redirtbl[irq];
 - irq_level ^= entry.fields.polarity;
   if (!irq_level) {
   ioapic-irr = ~mask;
   ret = 1;
 --
 
 
 After digging around the KVM source for a bit, and printk-ing things
 from Windows 7, Fedora 20, and OS X (10.9), I figured out the following:
 
 
 1. Edge-triggered interrupts are invariably unaffected by the xor line
 being removed by the patch. On all three guest types, edge-triggered
 interrupts have polarity set to 0, so the xor is essentially a no-op,
 and we can forget about it altogether.
 
 
 2. Windows and Linux always configure all level-triggered interrupts
 with polarity 0 (active-high, consistent with QEMU's ACPI/DSDT, in
 particular q35-acpi-dsdt.dsl, which is what I'm using with -M q35).
 As such, on Windows and Linux, the xor line in question is still a
 no-op.
 
 
 3. OS X (all versions I tried, at least since 10.5/Leopard) always
 configures all level-triggered interrupts with polarity 1 (active-low),
 regardless of what the QEMU DSDT says. As such, the xor line acts as
 a negation of irq_level, which at first glance sounds reasonable.
 
 However: when KVM negates irq_level due to polarity == 1, the OS X
 guest hangs during boot.
 
 OS X works fine when polarity == 1 is ignored (with the xor line
 commented out).
 
 This may be another instance (similar to how OS X didn't use to check
 with CPUID regarding monitor/mwait instruction availability) where
 apple devs know that any of their supported hardware advertises
 active-low in the DSDT, so no need to check, just hardcode that
 assumption... :)
 
 
 4. With s/ActiveHigh/ActiveLow/ in QEMU's q35-acpi-dsdt.dsl, Linux
 actually switches to polarity == 1 (active-low), and works fine
 *with the xor line removed* !!!. With the xor line left intact (i.e.
 without the above patch), the active-low fedora guest worked extremely
 poorly, and printed out multiple error messages during boot:
 
   irq XX: nobody cared (try booting with the irqpoll option)
   ...
   Disabling IRQ #XX
 
 for XX in [16, 18, 19, ...].
 
 
 So, right now, I'm wondering about the following:
 
 
 1. Regarding KVM and the polarity xor line in the patch above: Does
 anyone have experience with any *other* guests which insist on setting
 level-triggered interrupt polarity to 1/active-low ? Is that xor line
 actually doing anything useful in practice, for any other guest, on
 either QEMU or any other platform ?
 
 
 2. Is there anything in QEMU (besides the ACPI DSDT .dsl files) which
 has a hardcoded assumption re. polarity == 0, or active-high, for
 level-triggered interrupts? I tried to dig through hw/i386/kvm/ioapic.c
 and a bunch of other files, but couldn't isolate anything that I could
 flip to fix things in userspace.
 
 
 Any ideas or suggestions about the appropriate way to move forward would
 be much appreciated !!!
 
 
 Thanks much,
 --Gabriel

I think changing ACPI is the right thing to
do really. But we'll need to fix some things
first of course.

I think it's PC Q35 that has this assumption.
hw/i386/pc_q35.c

gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
 GSI_NUM_PINS);

kvm_pc_gsi_handler simply forwards interrupts to kvm.

and

hw/isa/lpc_ich9.c
static void ich9_lpc_update_pic(ICH9LPCState *lpc, int pic_irq)
{
int i, pic_level;

/* The pic level is the logical OR of all the PCI irqs mapped to it */
/* The pic level is the logical OR of all the PCI irqs mapped to it
 * */
pic_level = 0;
for (i = 0; i  ICH9_LPC_NB_PIRQS; i++) {
int tmp_irq;
int tmp_dis;
ich9_lpc_pic_irq(lpc, i, tmp_irq, tmp_dis);
if (!tmp_dis  pic_irq == tmp_irq) {
pic_level |= pci_bus_get_irq_level(lpc-d.bus, i);
}
}

so somewhere we need to flip it, I am guessing in ich9
along the lines of:

-pic_level = 0;
-pic_level |= pci_bus_get_irq_level(lpc-d.bus, i);
+pic_level = 1;
+pic_level = !pci_bus_get_irq_level(lpc-d.bus, i);




-- 
MST



Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest

2014-02-11 Thread Gabriel L. Somlo
On Tue, Feb 11, 2014 at 09:54:44PM +0200, Michael S. Tsirkin wrote:
 On Tue, Feb 11, 2014 at 01:23:31PM -0500, Gabriel L. Somlo wrote:
  I'm trying to get OS X to work as a QEMU guest, and one of the few
  remaining mysteries I need to solve is that the OS X guest hangs
  during boot, waiting for its boot disk to be available, unless the
  following KVM patch is applied:
  [...]
  2. Is there anything in QEMU (besides the ACPI DSDT .dsl files) which
  has a hardcoded assumption re. polarity == 0, or active-high, for
  level-triggered interrupts? I tried to dig through hw/i386/kvm/ioapic.c
  and a bunch of other files, but couldn't isolate anything that I could
  flip to fix things in userspace.
  
  
  Any ideas or suggestions about the appropriate way to move forward would
  be much appreciated !!!
  
  
  Thanks much,
  --Gabriel
 
 I think changing ACPI is the right thing to
 do really. But we'll need to fix some things
 first of course.
 
 I think it's PC Q35 that has this assumption.
 hw/i386/pc_q35.c
 
 gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
  GSI_NUM_PINS);
 
 kvm_pc_gsi_handler simply forwards interrupts to kvm.
 
 and
 
 hw/isa/lpc_ich9.c
 static void ich9_lpc_update_pic(ICH9LPCState *lpc, int pic_irq)
 {
 int i, pic_level;
 
 /* The pic level is the logical OR of all the PCI irqs mapped to it */
 /* The pic level is the logical OR of all the PCI irqs mapped to it
  * */
 pic_level = 0;
 for (i = 0; i  ICH9_LPC_NB_PIRQS; i++) {
 int tmp_irq;
 int tmp_dis;
 ich9_lpc_pic_irq(lpc, i, tmp_irq, tmp_dis);
 if (!tmp_dis  pic_irq == tmp_irq) {
 pic_level |= pci_bus_get_irq_level(lpc-d.bus, i);
 }
 }
 
 so somewhere we need to flip it, I am guessing in ich9
 along the lines of:
 
 -pic_level = 0;
 -pic_level |= pci_bus_get_irq_level(lpc-d.bus, i);
 +pic_level = 1;
 +pic_level = !pci_bus_get_irq_level(lpc-d.bus, i);

I think now we're on to something!

I managed to boot OS X on q35 with absolutely no kernel patches, but
Linux still hated it (irqXX: nobody cared). At least now I know what
I'm looking for, so I'll try to come up with a way to flip
level-triggered polarity to ActiveLow across all of i386, in a way
that works for Linux and Windows guests as well.

Thanks again for getting me unstuck!
--Gabriel