Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu

2015-12-14 Thread Sander Eikelenboom

On 2015-12-14 20:48, Eric Shelton wrote:

Please note that the same issue appears to have been introduced in the
recent 4.2.7 kernel.  It perhaps has to do with
b4ff8389ed14b849354b59ce9b360bdefcdbf99c having a matching commit
e8d097151d309eb71f750bbf34e6a7ef6256da7e in linux-stable.git.  The below
patch to arch/x86/kernel/rtc.c was also effective for 4.2.7.

Eric


Hi Eric,

Yeah, it's unfortunate that the patch fixing up the other patches destined
for stable didn't make it in time for stable :(.
Anyhow, the chosen solution wasn't ideal, so there is now a V2 patch by
Boris. It hasn't been picked up yet, but hopefully will be soon (for the
patch see http://lkml.iu.edu/hypermail/linux/kernel/1512.1/03504.html).


--
Sander


On 2015-12-02 18:30, Sander Eikelenboom wrote:

On 2015-12-02 15:55, David Vrabel wrote:
> On 28/11/15 15:47, Sander Eikelenboom wrote:
>> genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)
>
> We shouldn't register an rtc_cmos device because its legacy irq
> conflicts with the irq needed for hvc0.  For a multi VCPU guest irq 8 is
> in use for the pv spinlocks and this gets requested first, preventing
> the rtc device from probing.
>
> Does this patch fix it for you?
>
> David

It does, thanks.

Reported-and-tested-by: Sander Eikelenboom 

--
Sander

> 8<
> x86: rtc_cmos platform device requires legacy irqs
>
> Adding the rtc platform device when there are no legacy irqs (no
> legacy PIC) causes a conflict with other devices that end up using the
> same irq number.
>
> In a single VCPU PV guest we should have:
>
> /proc/interrupts:
>            CPU0
>   0:       4934  xen-percpu-virq  timer0
>   1:          0  xen-percpu-ipi   spinlock0
>   2:          0  xen-percpu-ipi   resched0
>   3:          0  xen-percpu-ipi   callfunc0
>   4:          0  xen-percpu-virq  debug0
>   5:          0  xen-percpu-ipi   callfuncsingle0
>   6:          0  xen-percpu-ipi   irqwork0
>   7:        321  xen-dyn-event    xenbus
>   8:         90  xen-dyn-event    hvc_console
>   ...
>
> But hvc_console cannot get its interrupt because it is already in use
> by rtc0 and the console does not work.
>
>   genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)
>
> The rtc_cmos device requires a particular legacy irq so don't add it
> if there are no legacy irqs.
>
> Signed-off-by: David Vrabel 
> ---
>  arch/x86/kernel/rtc.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/arch/x86/kernel/rtc.c b/arch/x86/kernel/rtc.c
> index cd96852..07c70f1 100644
> --- a/arch/x86/kernel/rtc.c
> +++ b/arch/x86/kernel/rtc.c
> @@ -14,6 +14,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #ifdef CONFIG_X86_32
>  /*
> @@ -200,6 +201,10 @@ static __init int add_rtc_cmos(void)
>  	}
>  #endif
>
> +	/* RTC uses legacy IRQs. */
> +	if (!nr_legacy_irqs())
> +		return -ENODEV;
> +
>  	platform_device_register(&rtc_device);
>  	dev_info(&rtc_device.dev,
>  		 "registered platform RTC device (no PNP device found)\n");

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-02 Thread Sander Eikelenboom

On 2015-12-02 15:55, David Vrabel wrote:

On 28/11/15 15:47, Sander Eikelenboom wrote:
genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)

We shouldn't register an rtc_cmos device because its legacy irq
conflicts with the irq needed for hvc0.  For a multi VCPU guest irq 8 is
in use for the pv spinlocks and this gets requested first, preventing
the rtc device from probing.

Does this patch fix it for you?

David


It does, thanks.

Reported-and-tested-by: Sander Eikelenboom 

--
Sander


8<
x86: rtc_cmos platform device requires legacy irqs

Adding the rtc platform device when there are no legacy irqs (no
legacy PIC) causes a conflict with other devices that end up using the
same irq number.

In a single VCPU PV guest we should have:

/proc/interrupts:
           CPU0
  0:       4934  xen-percpu-virq  timer0
  1:          0  xen-percpu-ipi   spinlock0
  2:          0  xen-percpu-ipi   resched0
  3:          0  xen-percpu-ipi   callfunc0
  4:          0  xen-percpu-virq  debug0
  5:          0  xen-percpu-ipi   callfuncsingle0
  6:          0  xen-percpu-ipi   irqwork0
  7:        321  xen-dyn-event    xenbus
  8:         90  xen-dyn-event    hvc_console
  ...

But hvc_console cannot get its interrupt because it is already in use
by rtc0 and the console does not work.

  genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)


The rtc_cmos device requires a particular legacy irq so don't add it
if there are no legacy irqs.

Signed-off-by: David Vrabel 
---
 arch/x86/kernel/rtc.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kernel/rtc.c b/arch/x86/kernel/rtc.c
index cd96852..07c70f1 100644
--- a/arch/x86/kernel/rtc.c
+++ b/arch/x86/kernel/rtc.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 

 #ifdef CONFIG_X86_32
 /*
@@ -200,6 +201,10 @@ static __init int add_rtc_cmos(void)
 	}
 #endif

+	/* RTC uses legacy IRQs. */
+	if (!nr_legacy_irqs())
+		return -ENODEV;
+
 	platform_device_register(&rtc_device);
 	dev_info(&rtc_device.dev,
 		 "registered platform RTC device (no PNP device found)\n");



Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-02 Thread David Vrabel
On 28/11/15 15:47, Sander Eikelenboom wrote:
> genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)

We shouldn't register an rtc_cmos device because its legacy irq
conflicts with the irq needed for hvc0.  For a multi VCPU guest irq 8 is
in use for the pv spinlocks and this gets requested first, preventing
the rtc device from probing.

Does this patch fix it for you?

David
8<
x86: rtc_cmos platform device requires legacy irqs

Adding the rtc platform device when there are no legacy irqs (no
legacy PIC) causes a conflict with other devices that end up using the
same irq number.

In a single VCPU PV guest we should have:

/proc/interrupts:
           CPU0
  0:       4934  xen-percpu-virq  timer0
  1:          0  xen-percpu-ipi   spinlock0
  2:          0  xen-percpu-ipi   resched0
  3:          0  xen-percpu-ipi   callfunc0
  4:          0  xen-percpu-virq  debug0
  5:          0  xen-percpu-ipi   callfuncsingle0
  6:          0  xen-percpu-ipi   irqwork0
  7:        321  xen-dyn-event    xenbus
  8:         90  xen-dyn-event    hvc_console
  ...

But hvc_console cannot get its interrupt because it is already in use
by rtc0 and the console does not work.

  genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)

The rtc_cmos device requires a particular legacy irq so don't add it
if there are no legacy irqs.

Signed-off-by: David Vrabel 
---
 arch/x86/kernel/rtc.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kernel/rtc.c b/arch/x86/kernel/rtc.c
index cd96852..07c70f1 100644
--- a/arch/x86/kernel/rtc.c
+++ b/arch/x86/kernel/rtc.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 

 #ifdef CONFIG_X86_32
 /*
@@ -200,6 +201,10 @@ static __init int add_rtc_cmos(void)
 	}
 #endif

+	/* RTC uses legacy IRQs. */
+	if (!nr_legacy_irqs())
+		return -ENODEV;
+
 	platform_device_register(&rtc_device);
 	dev_info(&rtc_device.dev,
 		 "registered platform RTC device (no PNP device found)\n");
-- 
2.1.4
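
The effect of the guard in the patch above can be sketched in user space. This is only a rough model, not kernel code: `struct legacy_pic_model`, `pv_guest_pic`, `i8259_pic` and `add_rtc_cmos_model()` are hypothetical names invented for illustration, and the real function goes on to call platform_device_register().

```c
#include <errno.h>

/* Hypothetical stand-in for the kernel's legacy PIC descriptor. */
struct legacy_pic_model {
	int nr_legacy_irqs;
};

/* 0 legacy irqs models a Xen PV guest; 16 models a PC with an i8259. */
static const struct legacy_pic_model pv_guest_pic = { .nr_legacy_irqs = 0 };
static const struct legacy_pic_model i8259_pic = { .nr_legacy_irqs = 16 };

/*
 * Model of the check added to add_rtc_cmos(): do not register the
 * platform RTC device when there are no legacy irqs to back it.
 * Returns 0 when the device would be registered, -ENODEV otherwise.
 */
static int add_rtc_cmos_model(const struct legacy_pic_model *pic)
{
	/* RTC uses legacy IRQs. */
	if (!pic->nr_legacy_irqs)
		return -ENODEV;

	/* The real code registers the platform device at this point. */
	return 0;
}
```

With this model, a PV guest (no legacy irqs) never gets an rtc0 device, so nothing competes with hvc_console for irq 8.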





Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-02 Thread David Vrabel
On 28/11/15 15:47, Sander Eikelenboom wrote:
> 
> -rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock
> +hctosys: unable to open rtc device (rtc0)
> 
> -genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)
> -hvc_open: request_irq failed with rc -16.

I have reproduced this issue.  We really shouldn't have an RTC device in
a PV guest and I think this irq conflict breaks hvc0.

David
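
The "rc -16" in the hvc_open message is -EBUSY on Linux. A minimal user-space sketch of why a second request for an irq line fails when the owners do not both allow sharing is given below. It is a simplified model, not the genirq implementation: `request_irq_model()`, `irq_table` and `MODEL_NR_IRQS` are hypothetical names, and the real flag check covers more than the shared bit.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_IRQF_SHARED 0x80u	/* same value as the kernel's IRQF_SHARED */
#define MODEL_NR_IRQS 32	/* arbitrary table size for this sketch */

struct irq_owner {
	const char *name;	/* NULL while the line is free */
	unsigned int flags;
};

static struct irq_owner irq_table[MODEL_NR_IRQS];

/*
 * Claim an irq line. A line that already has an owner can only be
 * claimed again if both the old and the new owner pass the shared flag;
 * otherwise the request fails with -EBUSY (-16).
 */
static int request_irq_model(unsigned int irq, unsigned int flags,
			     const char *name)
{
	struct irq_owner *cur;

	if (irq >= MODEL_NR_IRQS || name == NULL)
		return -EINVAL;

	cur = &irq_table[irq];
	if (cur->name != NULL) {
		if (!(cur->flags & flags & MODEL_IRQF_SHARED))
			return -EBUSY;
		return 0;	/* shared second owner accepted, not recorded */
	}

	cur->name = name;
	cur->flags = flags;
	return 0;
}
```

In this model, rtc0 taking irq 8 first makes the later hvc_console request on irq 8 fail, which matches the order of events described in the thread.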


Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-02 Thread Sander Eikelenboom

On 2015-12-02 00:41, Boris Ostrovsky wrote:

On 12/01/2015 06:30 PM, Sander Eikelenboom wrote:

On 2015-12-02 00:19, Boris Ostrovsky wrote:

On 12/01/2015 06:00 PM, Sander Eikelenboom wrote:

On 2015-12-01 23:47, Boris Ostrovsky wrote:

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:
On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current Linus tree) with the tip tree
pulled on top.

Running this kernel under Xen on PV guests with multiple vcpus goes well
(< 10% cpu usage when idle), but a guest with only a single vcpu doesn't
idle at all; it seems a kworker thread is stuck:

root       569 98.0  0.0      0     0 ?        R    16:02  12:47 [kworker/0:1]

Running a 4.3 kernel works fine with a single vcpu. Bisecting would probably
be quite painful, since there were some breakages this merge window with
respect to Xen PV guests.

There are some differences in the diffs between booting 4.3, 4.4 with a
single vcpu, and 4.4 with multiple vcpus:


Boris has been tracking a bunch of them. I am attaching the latest set of
patches I have to carry on top of v4.4-rc3.


Hi Konrad,

I will test those, see if they fix all my issues, and report back.


They shouldn't help you ;-( (and I just saw a message from you
confirming this).

The first one fixes a 32-bit bug (on bare metal too). The second fixes
a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug I was having with a PCI passthrough
device in an HVM guest that wasn't working (depending on which dom0 kernel
I was using (4.3 or 4.4)), but which I hadn't reported yet.

Fingers crossed, but I think this single-vcpu PV guest issue is the
last one I'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.


Hmm, that's unpleasant :-\

Hmm, another strange thing is that it doesn't seem to affect dom0 (which is
also a PV guest), but only unprivileged ones.
All unprivileged PV guests seem to have the irq issue, but only with
a single vcpu do I seem to get the stuck kworker thread that got my
attention. With 2 vcpus that doesn't seem to happen (but you still
get the dmesg output and warnings about hvc).


Could it be that:

arch/x86/include/asm/i8259.h
static inline int nr_legacy_irqs(void)
{
return legacy_pic->nr_legacy_irqs;
}

returns something different in some circumstances ?


It should return 16 pre-8c058b0b9c34d8c8d7912880956543769323e2d8 and 0
after that commit.

This is the last number that you see in the
NR_IRQS:4352 nr_irqs:48 0
line.
line.

I think you should be able to safely revert both
b4ff8389ed14b849354b59ce9b360bdefcdbf99c and
8c058b0b9c34d8c8d7912880956543769323e2d8 and see if it makes any
difference.


-boris
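
To check the value Boris refers to, the third number can be pulled out of the boot line quoted above. The following is a small sketch that assumes the line has exactly the "NR_IRQS:4352 nr_irqs:48 0" shape shown in this thread; `parse_nr_legacy_irqs()` is a hypothetical helper name, and other kernel versions may print this line differently.

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract the trailing number (the legacy irq count, per the thread)
 * from a boot line of the form "NR_IRQS:<a> nr_irqs:<b> <c>".
 * Any prefix before "NR_IRQS:" (such as a dmesg timestamp) is skipped.
 * Returns 0 on success and -1 if the line does not match.
 */
static int parse_nr_legacy_irqs(const char *line, int *nr_legacy)
{
	int nr_irqs_max, nr_irqs;
	const char *p = strstr(line, "NR_IRQS:");

	if (p == NULL)
		return -1;
	if (sscanf(p, "NR_IRQS:%d nr_irqs:%d %d",
		   &nr_irqs_max, &nr_irqs, nr_legacy) != 3)
		return -1;
	return 0;
}
```

A result of 0 corresponds to the post-8c058b0b behaviour described above, and 16 to the earlier behaviour.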



That was already underway compiling :)

And it does reveal that reverting both fixes the issue, no stuck
kworker thread .. and no:
   genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)
   hvc_open: request_irq failed with rc -16.



Let me try it again tomorrow. Can you post your guest config file, Xen
version and host HW (Intel or AMD)? 'xl info' maybe?

-boris


Hi Boris,

A fresh new day .. a fresh new thought.
If I look at the /proc/interrupts from a broken kernel and from a kernel
with both commits reverted, the thing that catches the eye is irq 8, just as
the dmesg message was telling.

In my PV guest rtc0 now seems to try and take irq 8, which was already
assigned to HVC?
Sounds like some assumptions around the legacy range are broken
somewhere.

What is the benefit of not just reserving the legacy range?

Attached the /proc/interrupts from both boots.

--
Sander






What I did get was a conflict when reverting
b4ff8389ed14b849354b59ce9b360bdefcdbf99c in
arch/arm64/include/asm/irq.h, although that shouldn't matter because
we are on x86 and not on arm.


-- Sander




-- Sander



-boris


___
Xen-devel mailing list
xen-de...@lists.xen.org
http://lists.xen.org/xen-devel

           CPU0
 16:     315536  xen-percpu-virq  timer0
 17:          0  xen-percpu-ipi   spinlock0
 18:          0  xen-percpu-ipi   resched0
 19:          0  xen-percpu-ipi   callfunc0
 20:          0  xen-percpu-virq  debug0
 21:          0  xen-percpu-ipi   callfuncsingle0
 22:          0  xen-percpu-ipi   irqwork0
 23:        346  xen-dyn-event    xenbus
 24:        134  xen-dyn-event    hvc_console
 25:      11464  xen-dyn-event    blkif
 26:      28710  xen-dyn-event    eth0-q0-tx
 27:      40136  xen-dyn-event    eth0-q0-rx
NMI:          0   Non-maskable interrupts
LOC:          0   Local timer interrupts
SPU:          0   Spurious interrupts
PMI:          0   Performance monitoring interrupts
IWI:          0   IRQ work interrupts
RTR:          0   APIC ICR read retries
RES:          0   Rescheduling
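
The two /proc/interrupts listings in this thread differ in where the Xen event-channel irqs start (0 versus 16). A rough way to spot whether a handler such as hvc_console landed in the legacy range is sketched below. `irq_for_handler()`, `in_legacy_range()` and `LEGACY_IRQ_COUNT` are hypothetical names, and the parser assumes the handler name is the last whitespace-separated column of a row.

```c
#include <stdio.h>
#include <string.h>

#define LEGACY_IRQ_COUNT 16	/* irqs 0..15 are the legacy range on x86 */

/*
 * If `line` is a numbered /proc/interrupts row whose last column equals
 * `handler`, return its irq number; otherwise return -1. Named rows such
 * as "NMI:" are rejected because they do not start with a number.
 */
static int irq_for_handler(const char *line, const char *handler)
{
	int irq;
	size_t len = strlen(line);
	const char *last;

	if (sscanf(line, " %d:", &irq) != 1)
		return -1;

	/* Drop trailing whitespace, then walk back to the last column. */
	while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t' ||
			   line[len - 1] == '\n'))
		len--;
	last = line + len;
	while (last > line && last[-1] != ' ' && last[-1] != '\t')
		last--;

	if ((size_t)(line + len - last) != strlen(handler) ||
	    strncmp(last, handler, strlen(handler)) != 0)
		return -1;
	return irq;
}

/* True when the irq number falls inside the legacy range. */
static int in_legacy_range(int irq)
{
	return irq >= 0 && irq < LEGACY_IRQ_COUNT;
}
```

Applied to the rows quoted in the thread, hvc_console at irq 8 is inside the legacy range, while hvc_console at irq 24 is outside it.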

Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Sander Eikelenboom

On 2015-12-02 00:41, Boris Ostrovsky wrote:


Let me try it again tomorrow. Can you post your guest config file, Xen
version and host HW (Intel or AMD)? 'xl info' maybe?

-boris


Guest config file == dom0 config file == the one i send you earlier.
Host is an AMD Phenom X6.

# xl info
host                   : serveerstertje
release                : 4.4.0-rc3-20151201-linus-doflr-boris+
version                : #1 SMP Tue Dec 1 19:02:58 CET 2015
machine                : x86_64
nr_cpus                : 6
max_cpu_id             : 5
nr_nodes               : 1
cores_per_socket       : 6
threads_per_core       : 1
cpu_mhz                : 3200
hw_caps                : 178bf3ff:efd3fbff::00011300:00802001::37ff:
virt_caps              : hvm hvm_directio
total_memory           : 20479
free_memory            : 7745
sharing_freed_memory   : 0
sharing_used_memory    : 0
outstanding_claims     : 0
free_cpus              : 0
xen_major              : 4
xen_minor              : 7
xen_extra              : -unstable
xen_version            : 4.7-unstable
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0x8000
xen_changeset          : Thu Nov 26 20:58:13 2015 +0100 git:5252636-dirty
xen_commandline        : dom0_mem=1536M,max:1536M loglvl=all loglvl_guest=all console_timestamps=datems vga=gfx-1280x1024x32 cpuidle cpufreq=xen com1=38400,8n1 console=vga,com1 ivrs_ioapic[6]=00:14.0 iommu=on,verbose,debug,amd-iommu-debug conring_size=128k ucode=-1
cc_compiler            : gcc-4.9.real (Debian 4.9.2-10) 4.9.2
cc_compile_by          : root
cc_compile_domain      : dyndns.org
cc_compile_date        : Thu Nov 26 21:18:41 CET 2015
xend_config_format     : 4

If you need and can get more info by 

Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Boris Ostrovsky

On 12/01/2015 06:30 PM, Sander Eikelenboom wrote:

On 2015-12-02 00:19, Boris Ostrovsky wrote:

On 12/01/2015 06:00 PM, Sander Eikelenboom wrote:

On 2015-12-01 23:47, Boris Ostrovsky wrote:

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:
On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom 
wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the 
tip tree

pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus 
goes well (on

idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it 
seems a kworker

thread is stuck:
root   569 98.0  0.0  0 0 ?R 16:02 12:47
[kworker/0:1]

Running a 4.3 kernel works fine with a single vpcu, bisecting 
would probably
quite painful since there were some breakages this merge 
window with respect

to Xen pv-guests.

There are some differences in the diff's from booting a 4.3, 
4.4-single,

4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the 
latest set of

patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you 
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second 
fixes

a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a 
pci-passthrough device in
a HVM that wasn't working (depending on which dom0-kernel i was 
using (4.3 or 4.4)),

but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the 
last i'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.


Hmm that's unpleasant :-\

Hmm other strange thing is it doesn't seem to affect dom0 (which is 
also a PV guest), but only unprivileged ones
All unprivileged pv-guests seem to have the irq issue, but only with 
a single vcpu i see to get the stuck kworker thread that got my 
attention, with a 2 vcpu that doesn't seem to happen, but you still 
get the dmesg output and warnings about hvc)


Could it be that:

arch/x86/include/asm/i8259.h
static inline int nr_legacy_irqs(void)
{
return legacy_pic->nr_legacy_irqs;
}

returns something different in some circumstances ?


It should return 16 pre-8c058b0b9c34d8c8d7912880956543769323e2d8 and 0
after that commit.

This is the last number that you see in
NR_IRQS:4352 nr_irqs:48 0
line.

I think you should be able to safely revert both
b4ff8389ed14b849354b59ce9b360bdefcdbf99c and
8c058b0b9c34d8c8d7912880956543769323e2d8 and see if it makes any
difference.


-boris



That was already underway compiling :)

And it does reveal that reverting both fixes the issue, no stuck 
kworker thread .. and no:
   genirq: Flags mismatch irq 8.  (hvc_console) vs.  
(rtc0)

   hvc_open: request_irq failed with rc -16.



Let me try it again tomorrow. Can you post your guest config file, Xen 
version and host HW (Intel or AMD)? 'xl info' maybe?


-boris




What i did get was an conflict reverting 
b4ff8389ed14b849354b59ce9b360bdefcdbf99c:
arch/arm64/include/asm/irq.h, although that shouldn't matter because 
we are on x86 and not on arm.


--
Sander




-- Sander



-boris


___
Xen-devel mailing list
xen-de...@lists.xen.org
http://lists.xen.org/xen-devel


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Sander Eikelenboom

On 2015-12-02 00:19, Boris Ostrovsky wrote:

On 12/01/2015 06:00 PM, Sander Eikelenboom wrote:

On 2015-12-01 23:47, Boris Ostrovsky wrote:

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:
On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom 
wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the 
tip tree

pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus 
goes well (on

idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it 
seems a kworker

thread is stuck:
root   569 98.0  0.0  0 0 ?R 16:02 12:47
[kworker/0:1]

Running a 4.3 kernel works fine with a single vpcu, bisecting 
would probably
quite painful since there were some breakages this merge window 
with respect

to Xen pv-guests.

There are some differences in the diff's from booting a 4.3, 
4.4-single,

4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the 
latest set of

patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you 
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second 
fixes

a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a 
pci-passthrough device in
a HVM that wasn't working (depending on which dom0-kernel i was 
using (4.3 or 4.4)),

but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the 
last i'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.


Hmm that's unpleasant :-\

Hmm other strange thing is it doesn't seem to affect dom0 (which is 
also a PV guest), but only unprivileged ones
All unprivileged pv-guests seem to have the irq issue, but only with a 
single vcpu i see to get the stuck kworker thread that got my 
attention, with a 2 vcpu that doesn't seem to happen, but you still 
get the dmesg output and warnings about hvc)


Could it be that:

arch/x86/include/asm/i8259.h
static inline int nr_legacy_irqs(void)
{
return legacy_pic->nr_legacy_irqs;
}

returns something different in some circumstances ?


It should return 16 pre-8c058b0b9c34d8c8d7912880956543769323e2d8 and 0
after that commit.

This is the last number that you see in
NR_IRQS:4352 nr_irqs:48 0
line.

I think you should be able to safely revert both
b4ff8389ed14b849354b59ce9b360bdefcdbf99c and
8c058b0b9c34d8c8d7912880956543769323e2d8 and see if it makes any
difference.


-boris



That was already underway compiling :)

And it does reveal that reverting both fixes the issue: no stuck kworker 
thread, and no more:
   genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)

   hvc_open: request_irq failed with rc -16.

What I did get was a conflict when reverting 
b4ff8389ed14b849354b59ce9b360bdefcdbf99c:
arch/arm64/include/asm/irq.h, although that shouldn't matter because we 
are on x86 and not on arm.


--
Sander




-- Sander



-boris




Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Boris Ostrovsky

On 12/01/2015 06:00 PM, Sander Eikelenboom wrote:

On 2015-12-01 23:47, Boris Ostrovsky wrote:

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:

On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the 
tip tree

pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus 
goes well (on

idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it 
seems a kworker

thread is stuck:
root   569 98.0  0.0  0 0 ?R 16:02 12:47
[kworker/0:1]

Running a 4.3 kernel works fine with a single vpcu, bisecting 
would probably
quite painful since there were some breakages this merge window 
with respect

to Xen pv-guests.

There are some differences in the diff's from booting a 4.3, 
4.4-single,

4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the 
latest set of

patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you 
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second fixes
a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a 
pci-passthrough device in
a HVM that wasn't working (depending on which dom0-kernel i was 
using (4.3 or 4.4)),

but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the 
last i'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.


Hmm that's unpleasant :-\

Hmm other strange thing is it doesn't seem to affect dom0 (which is 
also a PV guest), but only unprivileged ones
All unprivileged pv-guests seem to have the irq issue, but only with a 
single vcpu i see to get the stuck kworker thread that got my 
attention, with a 2 vcpu that doesn't seem to happen, but you still 
get the dmesg output and warnings about hvc)


Could it be that:

arch/x86/include/asm/i8259.h
static inline int nr_legacy_irqs(void)
{
return legacy_pic->nr_legacy_irqs;
}

returns something different in some circumstances ?


It should return 16 pre-8c058b0b9c34d8c8d7912880956543769323e2d8 and 0 
after that commit.


This is the last number that you see in
NR_IRQS:4352 nr_irqs:48 0
line.
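
To compare that last number across boot logs without reading them by eye, something like the following could be used. It is only a sketch: `parse_nr_irqs` is a hypothetical helper, and the field layout is taken from the lines quoted in this thread, with the third number being the legacy irq count as described above.

```python
import re

# Matches e.g. "NR_IRQS:4352 nr_irqs:48 0", optionally after a dmesg timestamp.
NR_IRQS_RE = re.compile(r"NR_IRQS:\s*(\d+)\s+nr_irqs:\s*(\d+)\s+(\d+)")


def parse_nr_irqs(line):
    """Return the three counters from an NR_IRQS boot line, or None."""
    match = NR_IRQS_RE.search(line)
    if match is None:
        return None
    compile_time_max, nr_irqs, nr_legacy = (int(g) for g in match.groups())
    return {
        "NR_IRQS": compile_time_max,
        "nr_irqs": nr_irqs,
        "nr_legacy_irqs": nr_legacy,
    }


for line in ("NR_IRQS:4352 nr_irqs:32 16", "NR_IRQS:4352 nr_irqs:48 0"):
    print(parse_nr_irqs(line))
```

A legacy count of 16 would correspond to the pre-8c058b0b9c behaviour, and 0 to the behaviour after it.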

I think you should be able to safely revert both 
b4ff8389ed14b849354b59ce9b360bdefcdbf99c and 
8c058b0b9c34d8c8d7912880956543769323e2d8 and see if it makes any 
difference.



-boris



--
Sander



-boris




Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Boris Ostrovsky

On 12/01/2015 05:51 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:

On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the tip tree pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus goes well (on idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all; it seems a kworker thread is stuck:
root   569 98.0  0.0      0     0 ?        R    16:02  12:47 [kworker/0:1]
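
A stuck kernel thread like the one in the ps line above can also be picked out of `ps aux` output mechanically. The following is a rough sketch, not something used in this thread: `busy_kernel_threads` and the 50% threshold are made up for illustration, and it assumes the standard `ps aux` column order.

```python
def busy_kernel_threads(ps_text, min_cpu=50.0):
    """Return (pid, cpu_percent, name) for kernel threads at or above min_cpu.

    Assumes `ps aux` columns:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    hits = []
    for line in ps_text.splitlines():
        fields = line.split(None, 10)
        if len(fields) < 11 or fields[1] == "PID":
            continue  # skip the header and short or malformed lines
        command = fields[10].strip()
        # ps shows kernel threads with their name in square brackets.
        if not (command.startswith("[") and command.endswith("]")):
            continue
        cpu = float(fields[2])
        if cpu >= min_cpu:
            hits.append((int(fields[1]), cpu, command.strip("[]")))
    return hits


# The kworker line is from the report; the header and init line are
# illustrative filler, not taken from the original system.
sample = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  28600  4500 ?        Ss   16:02   0:01 /sbin/init
root       569 98.0  0.0      0     0 ?        R    16:02  12:47 [kworker/0:1]
"""

print(busy_kernel_threads(sample))
```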

Running a 4.3 kernel works fine with a single vcpu; bisecting would probably be
quite painful since there were some breakages this merge window with respect
to Xen pv-guests.

There are some differences in the diffs from booting a 4.3, 4.4-single and
4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the latest 
set of

patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you 
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second fixes
a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.




Thanks :)

-- Sander


Between 4.3 and 4.4-single:

-NR_IRQS:4352 nr_irqs:32 16
+Using NULL legacy PIC
+NR_IRQS:4352 nr_irqs:32 0


This is fine, as long as you have 
b4ff8389ed14b849354b59ce9b360bdefcdbf99c.




-cpu 0 spinlock event irq 17
+cpu 0 spinlock event irq 1


This is strange. I wouldn't expect spinlocks to use legacy irqs.



Could it be .. that with your fixup:
xen/events: Always allocate legacy interrupts on PV guests
(b4ff8389ed14b849354b59ce9b360bdefcdbf99c)
for commit:
x86/irq: Probe for PIC presence before allocating descs for legacy 
IRQs

(8c058b0b9c34d8c8d7912880956543769323e2d8)

that we now have the situation described in the commit message of 
8c058b0b9c, but now for Xen PV instead of

Hyper-V ?
(seems both Xen and Hyper-V want to achieve the same but have 
different competing implementations ?)


(BTW 8c058b0b9c has a CC for stable ... so could be destined to cause 
more trouble).



You mean my statement that irq 1 looks bad? That was a red herring, it 
should be fine.


-boris




--
Sander




and later on:

-hctosys: unable to open rtc device (rtc0)
+rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock

+genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)

+hvc_open: request_irq failed with rc -16.
+Warning: unable to open an initial console.
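
The two added lines above belong together: on Linux, rc -16 is -EBUSY, so the console's request_irq() is being refused because irq 8 is already held with incompatible flags. A small sketch for spotting this pair in a saved boot log follows; `find_irq_conflicts` and `describe_rc` are hypothetical helpers, and the message formats are taken only from the lines quoted in this thread.

```python
import errno
import os
import re

IRQ_MISMATCH_RE = re.compile(r"genirq: Flags mismatch irq (\d+)\.")
REQUEST_FAILED_RE = re.compile(r"(\w+): request_irq failed with rc (-?\d+)")


def describe_rc(rc):
    """Map a negative kernel return code to (symbolic name, message)."""
    code = abs(rc)
    return errno.errorcode.get(code, "unknown"), os.strerror(code)


def find_irq_conflicts(log_text):
    """Collect irq flag mismatches and failed request_irq calls from a log."""
    findings = []
    for line in log_text.splitlines():
        match = IRQ_MISMATCH_RE.search(line)
        if match:
            findings.append(("flags-mismatch", int(match.group(1))))
            continue
        match = REQUEST_FAILED_RE.search(line)
        if match:
            name, _message = describe_rc(int(match.group(2)))
            findings.append(("request-failed", match.group(1), name))
    return findings


log = """\
+genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)
+hvc_open: request_irq failed with rc -16.
"""
for finding in find_irq_conflicts(log):
    print(finding)
```

The leading "+" from the diff does not matter, since the patterns are searched anywhere in the line.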


between 4.4-single and 4.4-multi:

 Using NULL legacy PIC
-NR_IRQS:4352 nr_irqs:32 0
+NR_IRQS:4352 nr_irqs:48 0


This is probably OK too since nr_irqs depend on number of CPUs.

I think something is messed up with IRQ. Last week I saw something
from setup_irq() generating a stack dump (warning) for rtc_cmos, but
it appeared harmless at the time and now I don't see it anymore.

-boris




and later on:

-rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock
+hctosys: unable to open rtc device (rtc0)

-genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)

-hvc_open: request_irq failed with rc -16.
-Warning: unable to open an initial console.

attached:
- dmesg with 4.3 kernel with 1 vcpu
- dmesg with 4.4 kernel with 1 vcpu
- dmesg with 4.4 kernel with 2 vcpus
- .config of the 4.4 kernel

-- Sander






Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Sander Eikelenboom

On 2015-12-01 23:47, Boris Ostrovsky wrote:

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:

On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the tip 
tree

pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus 
goes well (on

idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it seems 
a kworker

thread is stuck:
root   569 98.0  0.0  0 0 ?R16:02 12:47
[kworker/0:1]

Running a 4.3 kernel works fine with a single vpcu, bisecting 
would probably
quite painful since there were some breakages this merge window 
with respect

to Xen pv-guests.

There are some differences in the diff's from booting a 4.3, 
4.4-single,

4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the latest 
set of

patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you 
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second 
fixes

a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a 
pci-passthrough device in
a HVM that wasn't working (depending on which dom0-kernel i was using 
(4.3 or 4.4)),

but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the 
last i'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.


Hmm that's unpleasant :-\

Hmm, the other strange thing is that it doesn't seem to affect dom0 (which is 
also a PV guest), only unprivileged ones.
All unprivileged pv-guests seem to have the irq issue, but only with a 
single vcpu do I get the stuck kworker thread that caught my attention. 
With 2 vcpus that doesn't seem to happen, but you still get the dmesg 
output and warnings about hvc.


Could it be that:

arch/x86/include/asm/i8259.h
static inline int nr_legacy_irqs(void)
{
return legacy_pic->nr_legacy_irqs;
}

returns something different in some circumstances ?

--
Sander



-boris



Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Sander Eikelenboom

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:

On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the tip 
tree

pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus goes 
well (on

idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it seems a 
kworker

thread is stuck:
root   569 98.0  0.0  0 0 ?R16:02 12:47
[kworker/0:1]

Running a 4.3 kernel works fine with a single vpcu, bisecting would 
probably
quite painful since there were some breakages this merge window with 
respect

to Xen pv-guests.

There are some differences in the diff's from booting a 4.3, 
4.4-single,

4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the latest 
set of

patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you 
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second fixes
a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.




Thanks :)

-- Sander


Between 4.3 and 4.4-single:

-NR_IRQS:4352 nr_irqs:32 16
+Using NULL legacy PIC
+NR_IRQS:4352 nr_irqs:32 0


This is fine, as long as you have 
b4ff8389ed14b849354b59ce9b360bdefcdbf99c.




-cpu 0 spinlock event irq 17
+cpu 0 spinlock event irq 1


This is strange. I wouldn't expect spinlocks to use legacy irqs.



Could it be that, with your fixup:
xen/events: Always allocate legacy interrupts on PV guests
(b4ff8389ed14b849354b59ce9b360bdefcdbf99c)
for commit:
x86/irq: Probe for PIC presence before allocating descs for legacy IRQs
(8c058b0b9c34d8c8d7912880956543769323e2d8)

we now have the situation described in the commit message of 
8c058b0b9c, but for Xen PV instead of Hyper-V?
(It seems both Xen and Hyper-V want to achieve the same thing but have 
different, competing implementations.)


(BTW, 8c058b0b9c has a CC for stable, so it could be destined to cause 
more trouble.)


--
Sander




and later on:

-hctosys: unable to open rtc device (rtc0)
+rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock

+genirq: Flags mismatch irq 8.  (hvc_console) vs.  
(rtc0)

+hvc_open: request_irq failed with rc -16.
+Warning: unable to open an initial console.


between 4.4-single and 4.4-multi:

 Using NULL legacy PIC
-NR_IRQS:4352 nr_irqs:32 0
+NR_IRQS:4352 nr_irqs:48 0


This is probably OK too since nr_irqs depend on number of CPUs.

I think something is messed up with IRQ. I saw last week something
from setup_irq() generating a stack dump (warninig) for rtc_cmos but
it appeared harmless at that time and now I don't see it anymore.

-boris




and later on:

-rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock
+hctosys: unable to open rtc device (rtc0)

-genirq: Flags mismatch irq 8.  (hvc_console) vs.  
(rtc0)

-hvc_open: request_irq failed with rc -16.
-Warning: unable to open an initial console.

attached:
- dmesg with 4.3 kernel with 1 vcpu
- dmesg with 4.4 kernel with 1 vpcu
- dmesg with 4.4 kernel with 2 vpcus
- .config of the 4.4 kernel is attached.

-- Sander





Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Boris Ostrovsky

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:

On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the tip 
tree

pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus 
goes well (on

idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it seems 
a kworker

thread is stuck:
root   569 98.0  0.0  0 0 ?R16:02 12:47
[kworker/0:1]

Running a 4.3 kernel works fine with a single vpcu, bisecting 
would probably
quite painful since there were some breakages this merge window 
with respect

to Xen pv-guests.

There are some differences in the diff's from booting a 4.3, 
4.4-single,

4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the latest 
set of

patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you 
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second fixes
a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a 
pci-passthrough device in
a HVM that wasn't working (depending on which dom0-kernel i was using 
(4.3 or 4.4)),

but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the 
last i'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.

-boris




Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Sander Eikelenboom

On 2015-12-02 00:19, Boris Ostrovsky wrote:

On 12/01/2015 06:00 PM, Sander Eikelenboom wrote:

On 2015-12-01 23:47, Boris Ostrovsky wrote:

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:
On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom 
wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the 
tip tree

pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus 
goes well (on

idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it 
seems a kworker

thread is stuck:
root   569 98.0  0.0  0 0 ?R 16:02 12:47
[kworker/0:1]

Running a 4.3 kernel works fine with a single vpcu, bisecting 
would probably
quite painful since there were some breakages this merge window 
with respect

to Xen pv-guests.

There are some differences in the diff's from booting a 4.3, 
4.4-single,

4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the 
latest set of

patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you 
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second 
fixes

a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a 
pci-passthrough device in
a HVM that wasn't working (depending on which dom0-kernel i was 
using (4.3 or 4.4)),

but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the 
last i'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.


Hmm that's unpleasant :-\

Hmm other strange thing is it doesn't seem to affect dom0 (which is 
also a PV guest), but only unprivileged ones
All unprivileged pv-guests seem to have the irq issue, but only with a 
single vcpu i see to get the stuck kworker thread that got my 
attention, with a 2 vcpu that doesn't seem to happen, but you still 
get the dmesg output and warnings about hvc)


Could it be that:

arch/x86/include/asm/i8259.h
static inline int nr_legacy_irqs(void)
{
return legacy_pic->nr_legacy_irqs;
}

returns something different in some circumstances ?


It should return 16 pre-8c058b0b9c34d8c8d7912880956543769323e2d8 and 0
after that commit.

This is the last number that you see in
NR_IRQS:4352 nr_irqs:48 0
line.

I think you should be able to safely revert both
b4ff8389ed14b849354b59ce9b360bdefcdbf99c and
8c058b0b9c34d8c8d7912880956543769323e2d8 and see if it makes any
difference.


-boris



That was already underway compiling :)

And it does reveal that reverting both fixes the issue, no stuck kworker
thread .. and no:
   genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)
   hvc_open: request_irq failed with rc -16.
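
For readers unfamiliar with the two messages: rc -16 is a negative errno, and 16 is EBUSY on Linux. The sketch below decodes it and models, in heavily simplified form, why a second handler on an already claimed irq is refused. It is an illustration only: the function names are hypothetical, the flag constants are written from memory of include/linux/interrupt.h, and the flag values in the example are invented because the real ones are missing from the log lines above.

```python
import errno

# Assumed values, as recalled from include/linux/interrupt.h.
IRQF_TRIGGER_MASK = 0x0000000F
IRQF_SHARED = 0x00000080


def can_share_irq(old_flags, new_flags):
    """Simplified sharing rule: both sides must ask for IRQF_SHARED
    and agree on the trigger type."""
    both_shared = bool(old_flags & new_flags & IRQF_SHARED)
    same_trigger = ((old_flags ^ new_flags) & IRQF_TRIGGER_MASK) == 0
    return both_shared and same_trigger


def request_irq_model(existing_flags, new_flags):
    """Return 0 or a negative errno, following the kernel convention.

    existing_flags is None when nobody has claimed the irq yet.
    """
    if existing_flags is None:
        return 0
    if not can_share_irq(existing_flags, new_flags):
        return -errno.EBUSY
    return 0


def describe_rc(rc):
    """Map a negative return code to its symbolic errno name."""
    if rc >= 0:
        return "ok"
    return errno.errorcode.get(-rc, "unknown")


if __name__ == "__main__":
    # Invented flags: the first owner did not request a shared line.
    rc = request_irq_model(existing_flags=0, new_flags=0)
    print(rc, describe_rc(rc))
```

In this model whichever of rtc0 and hvc_console asks second is refused, which is consistent with the console failing to open once rtc_cmos has claimed irq 8.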

What i did get was a conflict in arch/arm64/include/asm/irq.h when
reverting b4ff8389ed14b849354b59ce9b360bdefcdbf99c, although that
shouldn't matter because we are on x86 and not on arm.


--
Sander




-- Sander



-boris


___
Xen-devel mailing list
xen-de...@lists.xen.org
http://lists.xen.org/xen-devel

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Boris Ostrovsky

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:

On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the tip tree
pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus goes well
(on idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it seems a
kworker thread is stuck:
root       569 98.0  0.0      0     0 ?        R    16:02  12:47 [kworker/0:1]

Running a 4.3 kernel works fine with a single vcpu; bisecting would
probably be quite painful since there were some breakages this merge
window with respect to Xen pv-guests.

There are some differences in the diffs from booting a 4.3, 4.4-single,
4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the latest set of
patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second fixes
a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a
pci-passthrough device in a HVM that wasn't working (depending on which
dom0-kernel i was using (4.3 or 4.4)), but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the
last i'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.

-boris




Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Boris Ostrovsky

On 12/01/2015 05:51 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:

On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the tip tree
pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus goes well
(on idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it seems a
kworker thread is stuck:
root       569 98.0  0.0      0     0 ?        R    16:02  12:47 [kworker/0:1]

Running a 4.3 kernel works fine with a single vcpu; bisecting would
probably be quite painful since there were some breakages this merge
window with respect to Xen pv-guests.

There are some differences in the diffs from booting a 4.3, 4.4-single,
4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the latest set of
patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second fixes
a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.




Thanks :)

-- Sander


Between 4.3 and 4.4-single:

-NR_IRQS:4352 nr_irqs:32 16
+Using NULL legacy PIC
+NR_IRQS:4352 nr_irqs:32 0


This is fine, as long as you have b4ff8389ed14b849354b59ce9b360bdefcdbf99c.




-cpu 0 spinlock event irq 17
+cpu 0 spinlock event irq 1


This is strange. I wouldn't expect spinlocks to use legacy irqs.



Could it be .. that with your fixup:
xen/events: Always allocate legacy interrupts on PV guests
(b4ff8389ed14b849354b59ce9b360bdefcdbf99c)
for commit:
x86/irq: Probe for PIC presence before allocating descs for legacy IRQs
(8c058b0b9c34d8c8d7912880956543769323e2d8)
that we now have the situation described in the commit message of
8c058b0b9c, but now for Xen PV instead of Hyper-V ?
(seems both Xen and Hyper-V want to achieve the same but have
different competing implementations ?)


(BTW 8c058b0b9c has a CC for stable ... so could be destined to cause
more trouble).



You mean my statement that irq 1 looks bad? That was a red herring, it
should be fine.


-boris




--
Sander




and later on:

-hctosys: unable to open rtc device (rtc0)
+rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock

+genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)
+hvc_open: request_irq failed with rc -16.
+Warning: unable to open an initial console.


between 4.4-single and 4.4-multi:

 Using NULL legacy PIC
-NR_IRQS:4352 nr_irqs:32 0
+NR_IRQS:4352 nr_irqs:48 0


This is probably OK too since nr_irqs depends on the number of CPUs.

I think something is messed up with IRQ. I saw last week something
from setup_irq() generating a stack dump (warning) for rtc_cmos but
it appeared harmless at that time and now I don't see it anymore.

-boris




and later on:

-rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock
+hctosys: unable to open rtc device (rtc0)

-genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)
-hvc_open: request_irq failed with rc -16.
-Warning: unable to open an initial console.

attached:
- dmesg with 4.3 kernel with 1 vcpu
- dmesg with 4.4 kernel with 1 vcpu
- dmesg with 4.4 kernel with 2 vcpus
- .config of the 4.4 kernel is attached.
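
The boot-log comparisons quoted in this thread can be reproduced from saved dmesg files with a few lines of script. This is a sketch under assumptions: the function names are invented, and it assumes each log line may start with a `[   12.345678]` style timestamp, which has to be stripped so that otherwise identical lines compare equal.

```python
import difflib
import re

# dmesg timestamp prefix such as "[    0.000000] "
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def normalize(lines):
    """Drop trailing newlines and leading timestamps."""
    return [TIMESTAMP.sub("", line.rstrip("\n")) for line in lines]


def diff_boot_logs(old_lines, new_lines, old_name="old", new_name="new"):
    """Return unified-diff lines (no context) between two boot logs."""
    return list(difflib.unified_diff(
        normalize(old_lines), normalize(new_lines),
        fromfile=old_name, tofile=new_name, lineterm="", n=0))


if __name__ == "__main__":
    old = ["[    0.000000] NR_IRQS:4352 nr_irqs:32 16",
           "[    0.100000] cpu 0 spinlock event irq 17"]
    new = ["[    0.000000] Using NULL legacy PIC",
           "[    0.000000] NR_IRQS:4352 nr_irqs:32 0",
           "[    0.120000] cpu 0 spinlock event irq 1"]
    for line in diff_boot_logs(old, new, "4.3", "4.4-single"):
        print(line)
```

With real files the two lists would come from reading each saved dmesg; lines that only differ in timing still show up if they embed other varying numbers, so the output needs a manual pass.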

-- Sander






Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Sander Eikelenboom

On 2015-12-02 00:41, Boris Ostrovsky wrote:

On 12/01/2015 06:30 PM, Sander Eikelenboom wrote:

On 2015-12-02 00:19, Boris Ostrovsky wrote:

On 12/01/2015 06:00 PM, Sander Eikelenboom wrote:

On 2015-12-01 23:47, Boris Ostrovsky wrote:

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:
On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the tip tree
pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus goes well
(on idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it seems a
kworker thread is stuck:
root       569 98.0  0.0      0     0 ?        R    16:02  12:47 [kworker/0:1]

Running a 4.3 kernel works fine with a single vcpu; bisecting would
probably be quite painful since there were some breakages this merge
window with respect to Xen pv-guests.

There are some differences in the diffs from booting a 4.3, 4.4-single,
4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the latest set of
patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second fixes
a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a
pci-passthrough device in a HVM that wasn't working (depending on which
dom0-kernel i was using (4.3 or 4.4)), but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the
last i'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.


Hmm that's unpleasant :-\

Another strange thing is that it doesn't seem to affect dom0 (which is
also a PV guest), but only unprivileged ones.
All unprivileged pv-guests seem to have the irq issue, but only with a
single vcpu do i seem to get the stuck kworker thread that got my
attention. With 2 vcpus that doesn't seem to happen, but you still
get the dmesg output and warnings about hvc.


Could it be that:

arch/x86/include/asm/i8259.h
static inline int nr_legacy_irqs(void)
{
return legacy_pic->nr_legacy_irqs;
}

returns something different in some circumstances ?


It should return 16 pre-8c058b0b9c34d8c8d7912880956543769323e2d8 and 0
after that commit.

This is the last number that you see in
NR_IRQS:4352 nr_irqs:48 0
line.

I think you should be able to safely revert both
b4ff8389ed14b849354b59ce9b360bdefcdbf99c and
8c058b0b9c34d8c8d7912880956543769323e2d8 and see if it makes any
difference.


-boris



That was already underway compiling :)

And it does reveal that reverting both fixes the issue, no stuck
kworker thread .. and no:
   genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)
   hvc_open: request_irq failed with rc -16.



Let me try it again tomorrow. Can you post your guest config file, Xen
version and host HW (Intel or AMD)? 'xl info' maybe?

-boris


Guest kernel config file == dom0 kernel config file == the one i sent you earlier.
Host is an AMD Phenom X6.

# xl info
host                   : serveerstertje
release                : 4.4.0-rc3-20151201-linus-doflr-boris+
version                : #1 SMP Tue Dec 1 19:02:58 CET 2015
machine                : x86_64
nr_cpus                : 6
max_cpu_id             : 5
nr_nodes               : 1
cores_per_socket       : 6
threads_per_core       : 1
cpu_mhz                : 3200
hw_caps                : 178bf3ff:efd3fbff::00011300:00802001::37ff:
virt_caps              : hvm hvm_directio
total_memory           : 20479
free_memory            : 7745
sharing_freed_memory   : 0
sharing_used_memory    : 0
outstanding_claims     : 0
free_cpus              : 0
xen_major              : 4
xen_minor              : 7
xen_extra              : -unstable
xen_version            : 4.7-unstable
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0x8000
xen_changeset          : Thu Nov 26 20:58:13 2015 +0100 git:5252636-dirty
xen_commandline        : dom0_mem=1536M,max:1536M loglvl=all loglvl_guest=all console_timestamps=datems vga=gfx-1280x1024x32 cpuidle cpufreq=xen com1=38400,8n1 console=vga,com1 ivrs_ioapic[6]=00:14.0 iommu=on,verbose,debug,amd-iommu-debug conring_size=128k ucode=-1
cc_compiler            : gcc-4.9.real (Debian 4.9.2-10) 4.9.2
cc_compile_by          : root
cc_compile_domain      : dyndns.org
cc_compile_date        : Thu Nov 26 21:18:41 CET 2015
xend_config_format     : 4
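
When comparing hosts it can help to turn pasted `xl info` output into key/value pairs. The sketch below is an assumption-laden illustration, not part of any Xen tooling: the function name is invented, and it treats any line that does not look like `key : value` as a continuation of the previous value, since mail clients tend to wrap the long lines.

```python
import re

# "key   : value" or "key   :" with the value wrapped onto the next line.
# A colon directly followed by text (as in "git:5252636-dirty") is not
# treated as a key separator.
KEY_VALUE = re.compile(r"^(\w+)\s*:(?:\s+(.*))?$")


def parse_xl_info(text):
    """Parse pasted `xl info` output into a dict of strings."""
    info = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = KEY_VALUE.match(line)
        if match:
            last_key = match.group(1)
            info[last_key] = match.group(2) or ""
        elif last_key is not None:
            # Wrapped line: append to the previous value.
            joined = (info[last_key] + " " + line).strip()
            info[last_key] = joined
    return info


if __name__ == "__main__":
    sample = """host                   : serveerstertje
nr_cpus                : 6
xen_version            : 4.7-unstable
xen_changeset          : Thu Nov 26 20:58:13 2015 +0100
git:5252636-dirty
"""
    parsed = parse_xl_info(sample)
    print(parsed["xen_version"], parsed["xen_changeset"])
```

All values stay strings; converting fields such as nr_cpus to integers is left to the caller.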

If you need and can get more info by 

Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Boris Ostrovsky

On 12/01/2015 06:00 PM, Sander Eikelenboom wrote:

On 2015-12-01 23:47, Boris Ostrovsky wrote:

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:

On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the tip tree
pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus goes well
(on idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it seems a
kworker thread is stuck:
root       569 98.0  0.0      0     0 ?        R    16:02  12:47 [kworker/0:1]

Running a 4.3 kernel works fine with a single vcpu; bisecting would
probably be quite painful since there were some breakages this merge
window with respect to Xen pv-guests.

There are some differences in the diffs from booting a 4.3, 4.4-single,
4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the latest set of
patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second fixes
a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a
pci-passthrough device in a HVM that wasn't working (depending on which
dom0-kernel i was using (4.3 or 4.4)), but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the
last i'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.


Hmm that's unpleasant :-\

Another strange thing is that it doesn't seem to affect dom0 (which is
also a PV guest), but only unprivileged ones.
All unprivileged pv-guests seem to have the irq issue, but only with a
single vcpu do i seem to get the stuck kworker thread that got my
attention. With 2 vcpus that doesn't seem to happen, but you still
get the dmesg output and warnings about hvc.


Could it be that:

arch/x86/include/asm/i8259.h
static inline int nr_legacy_irqs(void)
{
return legacy_pic->nr_legacy_irqs;
}

returns something different in some circumstances ?


It should return 16 pre-8c058b0b9c34d8c8d7912880956543769323e2d8 and 0 
after that commit.


This is the last number that you see in
NR_IRQS:4352 nr_irqs:48 0
line.

I think you should be able to safely revert both 
b4ff8389ed14b849354b59ce9b360bdefcdbf99c and 
8c058b0b9c34d8c8d7912880956543769323e2d8 and see if it makes any 
difference.



-boris



--
Sander



-boris




Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Boris Ostrovsky

On 12/01/2015 06:30 PM, Sander Eikelenboom wrote:

On 2015-12-02 00:19, Boris Ostrovsky wrote:

On 12/01/2015 06:00 PM, Sander Eikelenboom wrote:

On 2015-12-01 23:47, Boris Ostrovsky wrote:

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:
On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the tip tree
pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus goes well
(on idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it seems a
kworker thread is stuck:
root       569 98.0  0.0      0     0 ?        R    16:02  12:47 [kworker/0:1]

Running a 4.3 kernel works fine with a single vcpu; bisecting would
probably be quite painful since there were some breakages this merge
window with respect to Xen pv-guests.

There are some differences in the diffs from booting a 4.3, 4.4-single,
4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the latest set of
patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second fixes
a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a
pci-passthrough device in a HVM that wasn't working (depending on which
dom0-kernel i was using (4.3 or 4.4)), but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the
last i'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.


Hmm that's unpleasant :-\

Another strange thing is that it doesn't seem to affect dom0 (which is
also a PV guest), but only unprivileged ones.
All unprivileged pv-guests seem to have the irq issue, but only with a
single vcpu do i seem to get the stuck kworker thread that got my
attention. With 2 vcpus that doesn't seem to happen, but you still
get the dmesg output and warnings about hvc.


Could it be that:

arch/x86/include/asm/i8259.h
static inline int nr_legacy_irqs(void)
{
return legacy_pic->nr_legacy_irqs;
}

returns something different in some circumstances ?


It should return 16 pre-8c058b0b9c34d8c8d7912880956543769323e2d8 and 0
after that commit.

This is the last number that you see in
NR_IRQS:4352 nr_irqs:48 0
line.

I think you should be able to safely revert both
b4ff8389ed14b849354b59ce9b360bdefcdbf99c and
8c058b0b9c34d8c8d7912880956543769323e2d8 and see if it makes any
difference.


-boris



That was already underway compiling :)

And it does reveal that reverting both fixes the issue, no stuck
kworker thread .. and no:
   genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)
   hvc_open: request_irq failed with rc -16.



Let me try it again tomorrow. Can you post your guest config file, Xen 
version and host HW (Intel or AMD)? 'xl info' maybe?


-boris




What i did get was a conflict in arch/arm64/include/asm/irq.h when
reverting b4ff8389ed14b849354b59ce9b360bdefcdbf99c, although that
shouldn't matter because we are on x86 and not on arm.


--
Sander




-- Sander



-boris




Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-11-30 Thread Sander Eikelenboom

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:

On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the tip tree
pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus goes well
(on idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it seems a
kworker thread is stuck:
root       569 98.0  0.0      0     0 ?        R    16:02  12:47 [kworker/0:1]

Running a 4.3 kernel works fine with a single vcpu; bisecting would
probably be quite painful since there were some breakages this merge
window with respect to Xen pv-guests.

There are some differences in the diffs from booting a 4.3, 4.4-single,
4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the latest set of
patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second fixes
a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a
pci-passthrough device in a HVM that wasn't working (depending on which
dom0-kernel i was using (4.3 or 4.4)), but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the
last i'm troubled by for now ;)


--
Sander





Thanks :)

-- Sander


Between 4.3 and 4.4-single:

-NR_IRQS:4352 nr_irqs:32 16
+Using NULL legacy PIC
+NR_IRQS:4352 nr_irqs:32 0


This is fine, as long as you have b4ff8389ed14b849354b59ce9b360bdefcdbf99c.




-cpu 0 spinlock event irq 17
+cpu 0 spinlock event irq 1


This is strange. I wouldn't expect spinlocks to use legacy irqs.



and later on:

-hctosys: unable to open rtc device (rtc0)
+rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock

+genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)
+hvc_open: request_irq failed with rc -16.
+Warning: unable to open an initial console.


between 4.4-single and 4.4-multi:

 Using NULL legacy PIC
-NR_IRQS:4352 nr_irqs:32 0
+NR_IRQS:4352 nr_irqs:48 0


This is probably OK too since nr_irqs depends on the number of CPUs.

I think something is messed up with IRQ. I saw last week something
from setup_irq() generating a stack dump (warning) for rtc_cmos but
it appeared harmless at that time and now I don't see it anymore.

-boris




and later on:

-rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock
+hctosys: unable to open rtc device (rtc0)

-genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)
-hvc_open: request_irq failed with rc -16.
-Warning: unable to open an initial console.

attached:
- dmesg with 4.3 kernel with 1 vcpu
- dmesg with 4.4 kernel with 1 vcpu
- dmesg with 4.4 kernel with 2 vcpus
- .config of the 4.4 kernel is attached.

-- Sander





Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-11-30 Thread Boris Ostrovsky

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:

On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the tip tree
pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus goes well
(on idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it seems a
kworker thread is stuck:
root       569 98.0  0.0      0     0 ?        R    16:02  12:47 [kworker/0:1]

Running a 4.3 kernel works fine with a single vcpu; bisecting would
probably be quite painful since there were some breakages this merge
window with respect to Xen pv-guests.

There are some differences in the diffs from booting a 4.3, 4.4-single,
4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the latest set of
patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second fixes
a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.





Thanks :)

--
Sander


Between 4.3 and 4.4-single:

-NR_IRQS:4352 nr_irqs:32 16
+Using NULL legacy PIC
+NR_IRQS:4352 nr_irqs:32 0


This is fine, as long as you have b4ff8389ed14b849354b59ce9b360bdefcdbf99c.



-cpu 0 spinlock event irq 17
+cpu 0 spinlock event irq 1


This is strange. I wouldn't expect spinlocks to use legacy irqs.



and later on:

-hctosys: unable to open rtc device (rtc0)
+rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock

+genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)
+hvc_open: request_irq failed with rc -16.
+Warning: unable to open an initial console.


between 4.4-single and 4.4-multi:

 Using NULL legacy PIC
-NR_IRQS:4352 nr_irqs:32 0
+NR_IRQS:4352 nr_irqs:48 0


This is probably OK too since nr_irqs depends on the number of CPUs.

I think something is messed up with IRQ. I saw last week something
from setup_irq() generating a stack dump (warning) for rtc_cmos but
it appeared harmless at that time and now I don't see it anymore.


-boris




and later on:

-rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock
+hctosys: unable to open rtc device (rtc0)

-genirq: Flags mismatch irq 8.  (hvc_console) vs.  (rtc0)
-hvc_open: request_irq failed with rc -16.
-Warning: unable to open an initial console.

attached:
- dmesg with 4.3 kernel with 1 vcpu
- dmesg with 4.4 kernel with 1 vcpu
- dmesg with 4.4 kernel with 2 vcpus
- .config of the 4.4 kernel is attached.

--
Sander





