Re: system hung up when offlining CPUs

2017-11-01 Thread Hannes Reinecke
On 11/01/2017 01:47 AM, Thomas Gleixner wrote:
> On Mon, 30 Oct 2017, Shivasharan Srikanteshwara wrote:
>
>> In managed-interrupts case, interrupts which were affine to the offlined
>> CPU are not getting migrated to another available CPU. But the
>> documentation at the link below says that "all

RE: system hung up when offlining CPUs

2017-10-31 Thread Thomas Gleixner
On Mon, 30 Oct 2017, Shivasharan Srikanteshwara wrote:
> In managed-interrupts case, interrupts which were affine to the offlined
> CPU are not getting migrated to another available CPU. But the
> documentation at the link below says that "all interrupts" are migrated to a
> new CPU. So not all

RE: system hung up when offlining CPUs

2017-10-30 Thread Shivasharan Srikanteshwara
.@intel.com;
> pet...@infradead.org; LKML; linux-s...@vger.kernel.org; Sumit Saxena;
> Shivasharan Srikanteshwara
> Subject: Re: system hung up when offlining CPUs
>
> Yasuaki,
>
> On Mon, 16 Oct 2017, YASUAKI ISHIMATSU wrote:
>
> > Hi Thomas,
> >
> > > Can y

Re: system hung up when offlining CPUs

2017-10-16 Thread Thomas Gleixner
Yasuaki,

On Mon, 16 Oct 2017, YASUAKI ISHIMATSU wrote:
> Hi Thomas,
>
> > Can you please apply the patch below on top of Linus tree and retest?
> >
> > Please send me the outputs I asked you to provide last time in any case
> > (success or fail).
>
> The issue still occurs even if I applied

Re: system hung up when offlining CPUs

2017-10-16 Thread YASUAKI ISHIMATSU
Hi Thomas,

> Can you please apply the patch below on top of Linus tree and retest?
>
> Please send me the outputs I asked you to provide last time in any case
> (success or fail).

The issue still occurs even if I applied your patch to linux 4.14.0-rc4.

---
[ ...] INFO: task setroubleshootd:4972

Re: system hung up when offlining CPUs

2017-10-10 Thread YASUAKI ISHIMATSU
Hi Thomas,

Sorry for the late reply. I'll apply the patches and retest this week.
Please wait a while.

Thanks,
Yasuaki Ishimatsu

On 10/04/2017 05:04 PM, Thomas Gleixner wrote:
> On Tue, 3 Oct 2017, Thomas Gleixner wrote:
>> Can you please apply the debug patch below.
>
> I found an issue

Re: system hung up when offlining CPUs

2017-10-04 Thread Thomas Gleixner
On Mon, 2 Oct 2017, YASUAKI ISHIMATSU wrote:
>
> We are talking about megasas driver.
> So I added linux-scsi and maintainers of megasas into the thread.

Another question: Is this the in-tree megasas driver, and are you observing
this on Linus' latest tree, i.e. 4.14-rc3+?

Thanks,

tglx

Re: system hung up when offlining CPUs

2017-10-04 Thread Thomas Gleixner
On Tue, 3 Oct 2017, Thomas Gleixner wrote:
> Can you please apply the debug patch below.

I found an issue with managed interrupts when the affinity mask of a managed
interrupt spans multiple CPUs. Explanation in the changelog below. I'm not
sure that this cures the problems you have, but at

Re: system hung up when offlining CPUs

2017-10-03 Thread Thomas Gleixner
On Mon, 2 Oct 2017, YASUAKI ISHIMATSU wrote:
> On 09/16/2017 11:02 AM, Thomas Gleixner wrote:
> > Which driver are we talking about?
>
> We are talking about megasas driver.

Can you please apply the debug patch below.

After booting enable stack traces for the tracer:

# echo 1

Re: system hung up when offlining CPUs

2017-10-02 Thread YASUAKI ISHIMATSU
On 09/16/2017 11:02 AM, Thomas Gleixner wrote:
> On Sat, 16 Sep 2017, Thomas Gleixner wrote:
>> On Thu, 14 Sep 2017, YASUAKI ISHIMATSU wrote:
>>> Here is one irq's info of megasas:
>>>
>>> - Before offline CPU
>>> /proc/irq/70/smp_affinity_list
>>> 24-29
>>>
>>> /proc/irq/70/effective_affinity
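
For readers following along: the smp_affinity_list value quoted in this message ("24-29") is in the kernel's cpulist format. Below is a minimal sketch of how such a value could be expanded into individual CPU numbers. The function names are illustrative and not taken from the thread, and the stride form of cpulists (e.g. "0-10:2") is deliberately not handled.

```python
def parse_cpulist(text):
    """Expand a cpulist string such as "0,2-4" into a set of CPU numbers.

    Assumption: only comma-separated single CPUs and "first-last" ranges
    appear; the "first-last:stride" form is not handled in this sketch.
    """
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty string means an empty mask
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_irq_affinity(irq, proc_root="/proc"):
    """Read /proc/irq/<irq>/smp_affinity_list (only meaningful on Linux)."""
    with open(f"{proc_root}/irq/{irq}/smp_affinity_list") as f:
        return parse_cpulist(f.read())


# The value reported for IRQ 70 before the CPUs were offlined.
print(sorted(parse_cpulist("24-29")))  # [24, 25, 26, 27, 28, 29]
```

On a live system, read_irq_affinity(70) would return the same set for the IRQ shown above, assuming the IRQ number exists on that machine.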

Re: system hung up when offlining CPUs

2017-09-16 Thread Thomas Gleixner
On Sat, 16 Sep 2017, Thomas Gleixner wrote:
> On Thu, 14 Sep 2017, YASUAKI ISHIMATSU wrote:
> > Here is one irq's info of megasas:
> >
> > - Before offline CPU
> > /proc/irq/70/smp_affinity_list
> > 24-29
> >
> > /proc/irq/70/effective_affinity
> >

Re: system hung up when offlining CPUs

2017-09-16 Thread Thomas Gleixner
On Thu, 14 Sep 2017, YASUAKI ISHIMATSU wrote:
> On 09/13/2017 09:33 AM, Thomas Gleixner wrote:
> >> Question - "what happens once __cpu_disable is called and some of the
> >> queued
> >> interrupt has affinity to that particular CPU ?"
> >> I assume ideally those pending/queued Interrupt should

Re: system hung up when offlining CPUs

2017-09-14 Thread YASUAKI ISHIMATSU
On 09/13/2017 09:33 AM, Thomas Gleixner wrote: > On Wed, 13 Sep 2017, Kashyap Desai wrote: >>> On 09/12/2017 08:15 PM, YASUAKI ISHIMATSU wrote: + linux-scsi and maintainers of megasas > > In my server, IRQ#66-89 are sent to CPU#24-29. And if I offline > CPU#24-29, I/O does not

RE: system hung up when offlining CPUs

2017-09-13 Thread Thomas Gleixner
On Wed, 13 Sep 2017, Kashyap Desai wrote: > > On 09/12/2017 08:15 PM, YASUAKI ISHIMATSU wrote: > > > + linux-scsi and maintainers of megasas > > >> In my server, IRQ#66-89 are sent to CPU#24-29. And if I offline > > >> CPU#24-29, I/O does not work, showing the following messages. > > This
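
The situation quoted here (IRQ#66-89 targeting CPU#24-29, all of which are then taken offline) can be modelled in a few lines. This is only a sketch: it assumes the interrupts are managed and therefore, as reported elsewhere in this thread, are not migrated to another CPU once every CPU in their mask is offline. The helper name and the extra IRQ 90 are hypothetical.

```python
def stranded_irqs(affinity, offline):
    """Return the IRQs whose whole affinity mask lies inside `offline`.

    affinity: dict mapping IRQ number -> set of CPUs it may target
    offline:  set of CPUs that have been (or will be) taken offline
    """
    return sorted(irq for irq, cpus in affinity.items()
                  if cpus and cpus <= offline)


# Numbers from the report: IRQ 66-89 are sent to CPU 24-29.
affinity = {irq: set(range(24, 30)) for irq in range(66, 90)}
affinity[90] = {0, 1}  # hypothetical unrelated IRQ, for contrast

# Offlining only part of the mask leaves every IRQ with a usable CPU.
partial = stranded_irqs(affinity, {24, 25, 26})

# Offlining CPU 24-29 leaves IRQ 66-89 without any online target.
full = stranded_irqs(affinity, set(range(24, 30)))

print(len(partial), len(full))  # 0 24
```

If a driver still submits I/O to queues served by the stranded IRQs, completions would never arrive, which is consistent with the I/O stall described in the report.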

RE: system hung up when offlining CPUs

2017-09-13 Thread Kashyap Desai
>
> On 09/12/2017 08:15 PM, YASUAKI ISHIMATSU wrote:
> > + linux-scsi and maintainers of megasas
> >
> > When offlining CPU, I/O stops. Do you have any ideas?
> >
> > On 09/07/2017 04:23 PM, YASUAKI ISHIMATSU wrote:
> >> Hi Marc and Christoph,
> >>
> >> Sorry for the late reply. I appreciated that

Re: system hung up when offlining CPUs

2017-09-13 Thread Hannes Reinecke
On 09/12/2017 08:15 PM, YASUAKI ISHIMATSU wrote:
> + linux-scsi and maintainers of megasas
>
> When offlining CPU, I/O stops. Do you have any ideas?
>
> On 09/07/2017 04:23 PM, YASUAKI ISHIMATSU wrote:
>> Hi Marc and Christoph,
>>
>> Sorry for the late reply. I appreciated that you fixed the

Re: system hung up when offlining CPUs

2017-09-12 Thread YASUAKI ISHIMATSU
+ linux-scsi and maintainers of megasas

When offlining CPU, I/O stops. Do you have any ideas?

On 09/07/2017 04:23 PM, YASUAKI ISHIMATSU wrote:
> Hi Marc and Christoph,
>
> Sorry for the late reply. I appreciated that you fixed the issue on kvm
> environment.
> But the issue still occurs on

Re: system hung up when offlining CPUs

2017-09-07 Thread YASUAKI ISHIMATSU
Hi Marc and Christoph,

Sorry for the late reply. I appreciate that you fixed the issue in the KVM
environment. But the issue still occurs on a physical server.

Here is the irq information for the megasas irqs, summarized from
/proc/interrupts and /proc/irq/*/smp_affinity_list on my server:

---
IRQ

Re: system hung up when offlining CPUs

2017-08-21 Thread Marc Zyngier
On 21/08/17 14:18, Christoph Hellwig wrote:
> Can you try the patch below please?
>
> ---
> From d5f59cb7a629de8439b318e1384660e6b56e7dd8 Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig
> Date: Mon, 21 Aug 2017 14:24:11 +0200
> Subject: virtio_pci: fix cpu affinity support
>
> Commit

Re: system hung up when offlining CPUs

2017-08-21 Thread Christoph Hellwig
Can you try the patch below please?

---
From d5f59cb7a629de8439b318e1384660e6b56e7dd8 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig
Date: Mon, 21 Aug 2017 14:24:11 +0200
Subject: virtio_pci: fix cpu affinity support

Commit 0b0f9dc5 ("Revert "virtio_pci: use shared interrupts for

Re: system hung up when offlining CPUs

2017-08-21 Thread Christoph Hellwig
Hi Marc,

in general the driver should know not to use the queue / irq,
as blk-mq will never schedule I/O to queues that have no online cpus.

The real bug seems to be that we're using affinity for a device
that only has one real queue (as the config queue should not
have affinity). Let me dig

Re: system hung up when offlining CPUs

2017-08-10 Thread Marc Zyngier
+ Christoph, since he's the one who came up with the idea

On 09/08/17 20:09, YASUAKI ISHIMATSU wrote:
> Hi Marc,
>
> On 08/09/2017 07:42 AM, Marc Zyngier wrote:
>> On Tue, 8 Aug 2017 15:25:35 -0400
>> YASUAKI ISHIMATSU wrote:
>>
>>> Hi Thomas,
>>>
>>> When offlining all CPUs except cpu0, system

Re: system hung up when offlining CPUs

2017-08-09 Thread YASUAKI ISHIMATSU
Hi Marc,

On 08/09/2017 07:42 AM, Marc Zyngier wrote:
> On Tue, 8 Aug 2017 15:25:35 -0400
> YASUAKI ISHIMATSU wrote:
>
>> Hi Thomas,
>>
>> When offlining all CPUs except cpu0, system hung up with the following
>> message.
>>
>> [...] INFO: task kworker/u384:1:1234 blocked for more than 120

Re: system hung up when offlining CPUs

2017-08-09 Thread Marc Zyngier
On Tue, 8 Aug 2017 15:25:35 -0400
YASUAKI ISHIMATSU wrote:

> Hi Thomas,
>
> When offlining all CPUs except cpu0, system hung up with the following
> message.
>
> [...] INFO: task kworker/u384:1:1234 blocked for more than 120 seconds.
> [...] Not tainted 4.12.0-rc6+ #19
> [...] "echo 0 >

system hung up when offlining CPUs

2017-08-08 Thread YASUAKI ISHIMATSU
Hi Thomas,

When offlining all CPUs except cpu0, system hung up with the following
message.

[...] INFO: task kworker/u384:1:1234 blocked for more than 120 seconds.
[...] Not tainted 4.12.0-rc6+ #19
[...] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[...]
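
The report does not show the exact commands used to offline the CPUs. One plausible way to reproduce "offline everything except cpu0" is via the sysfs file /sys/devices/system/cpu/cpuN/online; the sketch below only builds the list of writes and prints them as shell commands. The function names and the four-CPU example are assumptions made for illustration, and apply_plan() is intentionally never called here.

```python
def offline_plan(present_cpus, keep=(0,), sysfs_root="/sys/devices/system/cpu"):
    """Return (path, value) pairs that would offline every CPU not in `keep`."""
    keep = set(keep)
    return [(f"{sysfs_root}/cpu{cpu}/online", "0")
            for cpu in sorted(present_cpus) if cpu not in keep]


def apply_plan(plan):
    """Perform the writes. Needs root and a hotplug-capable Linux kernel."""
    for path, value in plan:
        with open(path, "w") as f:
            f.write(value)


# Dry run for a hypothetical 4-CPU machine: print the equivalent shell commands.
for path, value in offline_plan(range(4)):
    print(f"echo {value} > {path}")
```

Keeping the planning step separate from the writes makes it possible to review what would be offlined before touching a machine that may hang, as described in this report.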
