Re: [PATCH][RT] xfs: Disable preemption when grabbing all icsb counter locks

2015-04-30 Thread Austin Schuh
On Thu, Apr 30, 2015 at 11:32 AM, Steven Rostedt wrote:
> On Thu, 30 Apr 2015 20:07:21 +0200
> Peter Zijlstra wrote:
>> The irony, this is distinctly non deterministic code you're putting
>> under a RT specific preempt_disable ;-)
>
> I know :-(
>
> Unfortunately, a RT behaving fix would be much

Re: [PATCH][RT] xfs: Disable preemption when grabbing all icsb counter locks

2015-04-30 Thread Austin Schuh
On Thu, Apr 30, 2015 at 11:32 AM, Steven Rostedt rost...@goodmis.org wrote: On Thu, 30 Apr 2015 20:07:21 +0200 Peter Zijlstra pet...@infradead.org wrote: The irony, this is distinctly non deterministic code you're putting under a RT specific preempt_disable ;-) I know :-( Unfortunately, a

Re: [PATCH] sched: fix RLIMIT_RTTIME when PI-boosting to RT

2015-03-05 Thread Austin Schuh
ping?
On Wed, Feb 18, 2015 at 4:23 PM, wrote:
> From: Brian Silverman
>
> When non-realtime tasks get priority-inheritance boosted to a realtime
> scheduling class, RLIMIT_RTTIME starts to apply to them. However, the
> counter used for checking this (the same one used for SCHED_RR
>

Re: [PATCH] sched: fix RLIMIT_RTTIME when PI-boosting to RT

2015-03-05 Thread Austin Schuh
ping? On Wed, Feb 18, 2015 at 4:23 PM, br...@peloton-tech.com wrote: From: Brian Silverman br...@peloton-tech.com When non-realtime tasks get priority-inheritance boosted to a realtime scheduling class, RLIMIT_RTTIME starts to apply to them. However, the counter used for checking this (the

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-07-05 Thread Austin Schuh
On Sat, Jul 5, 2014 at 1:26 PM, Thomas Gleixner wrote:
> On Mon, 30 Jun 2014, Austin Schuh wrote:
>> I think I might have an answer for my own question, but I would
>> appreciate someone else to double check. If list_empty erroneously
>> returns that there is work to do

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-07-05 Thread Austin Schuh
On Sat, Jul 5, 2014 at 1:26 PM, Thomas Gleixner t...@linutronix.de wrote: On Mon, 30 Jun 2014, Austin Schuh wrote: I think I might have an answer for my own question, but I would appreciate someone else to double check. If list_empty erroneously returns that there is work to do when

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-07-03 Thread Austin Schuh
On Tue, Jul 1, 2014 at 12:32 PM, Austin Schuh wrote:
> On Mon, Jun 30, 2014 at 8:01 PM, Austin Schuh wrote:
>> On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner wrote:
>>> Completely untested patch below.
I've tested it and looked it over now, and feel pretty confident in

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-07-03 Thread Austin Schuh
On Tue, Jul 1, 2014 at 12:32 PM, Austin Schuh aus...@peloton-tech.com wrote: On Mon, Jun 30, 2014 at 8:01 PM, Austin Schuh aus...@peloton-tech.com wrote: On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner t...@linutronix.de wrote: Completely untested patch below. I've tested it and looked

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-07-01 Thread Austin Schuh
On Mon, Jun 30, 2014 at 8:01 PM, Austin Schuh wrote:
> On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner wrote:
>> Completely untested patch below.
>
> By chance, I found this in my boot logs. I'll do some more startup
> testing tomorrow.
>
> Jun 30 19:54:40 vp

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-07-01 Thread Austin Schuh
On Mon, Jun 30, 2014 at 8:01 PM, Austin Schuh aus...@peloton-tech.com wrote: On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner t...@linutronix.de wrote: Completely untested patch below. By chance, I found this in my boot logs. I'll do some more startup testing tomorrow. Jun 30 19:54:40 vpc5

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-30 Thread Austin Schuh
On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner wrote:
> Completely untested patch below.
By chance, I found this in my boot logs. I'll do some more startup
testing tomorrow.
Jun 30 19:54:40 vpc5 kernel: [    0.670955] ------------[ cut here ]------------
Jun 30 19:54:40 vpc5 kernel: [

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-30 Thread Austin Schuh
On Mon, Jun 30, 2014 at 5:12 PM, Austin Schuh wrote:
> On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner wrote:
>> On Thu, 26 Jun 2014, Austin Schuh wrote:
>>> If I'm reading the rt patch correctly, wq_worker_sleeping was moved
>>> out of __schedule to sched_

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-30 Thread Austin Schuh
On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner wrote:
> On Thu, 26 Jun 2014, Austin Schuh wrote:
>> If I'm reading the rt patch correctly, wq_worker_sleeping was moved
>> out of __schedule to sched_submit_work. It looks like that changes
>> the conditions under whi

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-30 Thread Austin Schuh
On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner t...@linutronix.de wrote: On Thu, 26 Jun 2014, Austin Schuh wrote: If I'm reading the rt patch correctly, wq_worker_sleeping was moved out of __schedule to sched_submit_work. It looks like that changes the conditions under which

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-30 Thread Austin Schuh
On Mon, Jun 30, 2014 at 5:12 PM, Austin Schuh aus...@peloton-tech.com wrote: On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner t...@linutronix.de wrote: On Thu, 26 Jun 2014, Austin Schuh wrote: If I'm reading the rt patch correctly, wq_worker_sleeping was moved out of __schedule

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-30 Thread Austin Schuh
On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner t...@linutronix.de wrote: Completely untested patch below. By chance, I found this in my boot logs. I'll do some more startup testing tomorrow. Jun 30 19:54:40 vpc5 kernel: [    0.670955] ------------[ cut here ]------------ Jun 30 19:54:40 vpc5

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-28 Thread Austin Schuh
On Fri, Jun 27, 2014 at 8:32 PM, Mike Galbraith wrote:
> On Fri, 2014-06-27 at 18:18 -0700, Austin Schuh wrote:
>
>> It would be more context switches, but I wonder if we could kick the
>> workqueue logic completely out of the scheduler into a thread. Have
>> the sched

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-28 Thread Austin Schuh
On Fri, Jun 27, 2014 at 8:32 PM, Mike Galbraith umgwanakikb...@gmail.com wrote: On Fri, 2014-06-27 at 18:18 -0700, Austin Schuh wrote: It would be more context switches, but I wonder if we could kick the workqueue logic completely out of the scheduler into a thread. Have the scheduler

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-27 Thread Austin Schuh
On Fri, Jun 27, 2014 at 11:19 AM, Steven Rostedt wrote:
> On Fri, 27 Jun 2014 20:07:54 +0200
> Mike Galbraith wrote:
>
>> > Why do we need the wakeup? the owner of the lock should wake it up
>> > shouldn't it?
>>
>> True, but that can take ages.
>
> Can it? If the workqueue is of some higher

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-27 Thread Austin Schuh
On Fri, Jun 27, 2014 at 11:19 AM, Steven Rostedt rost...@goodmis.org wrote: On Fri, 27 Jun 2014 20:07:54 +0200 Mike Galbraith umgwanakikb...@gmail.com wrote: Why do we need the wakeup? the owner of the lock should wake it up shouldn't it? True, but that can take ages. Can it? If the

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-26 Thread Austin Schuh
On Thu, Jun 26, 2014 at 3:35 PM, Thomas Gleixner wrote:
> On Thu, 26 Jun 2014, Austin Schuh wrote:
>> On Wed, May 21, 2014 at 12:33 AM, Richard Weinberger
>> wrote:
>> > CC'ing RT folks
>> >
>> > On Wed, May 21, 2014 at 8:23 AM, Austin Schuh
>>

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-26 Thread Austin Schuh
On Wed, May 21, 2014 at 12:33 AM, Richard Weinberger wrote:
> CC'ing RT folks
>
> On Wed, May 21, 2014 at 8:23 AM, Austin Schuh wrote:
>> On Tue, May 13, 2014 at 7:29 PM, Austin Schuh
>> wrote:
>>> Hi,
>>>
>>> I am observing a filesystem lo

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-26 Thread Austin Schuh
On Wed, May 21, 2014 at 12:33 AM, Richard Weinberger richard.weinber...@gmail.com wrote: CC'ing RT folks On Wed, May 21, 2014 at 8:23 AM, Austin Schuh aus...@peloton-tech.com wrote: On Tue, May 13, 2014 at 7:29 PM, Austin Schuh aus...@peloton-tech.com wrote: Hi, I am observing

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-26 Thread Austin Schuh
On Thu, Jun 26, 2014 at 3:35 PM, Thomas Gleixner t...@linutronix.de wrote: On Thu, 26 Jun 2014, Austin Schuh wrote: On Wed, May 21, 2014 at 12:33 AM, Richard Weinberger richard.weinber...@gmail.com wrote: CC'ing RT folks On Wed, May 21, 2014 at 8:23 AM, Austin Schuh aus...@peloton

Re: On-stack work item completion race? (was Re: XFS crash?)

2014-06-25 Thread Austin Schuh
On Wed, Jun 25, 2014 at 7:00 AM, Tejun Heo wrote:
>
> Hello,
>
> On Tue, Jun 24, 2014 at 08:05:07PM -0700, Austin Schuh wrote:
> > > I can see no reason why manual completion would behave differently
> > > from flush_work() in this case.
> >
> > I went lo

Re: On-stack work item completion race? (was Re: XFS crash?)

2014-06-25 Thread Austin Schuh
On Wed, Jun 25, 2014 at 7:00 AM, Tejun Heo t...@kernel.org wrote: Hello, On Tue, Jun 24, 2014 at 08:05:07PM -0700, Austin Schuh wrote: I can see no reason why manual completion would behave differently from flush_work() in this case. I went looking for a short trace in my original

Re: On-stack work item completion race? (was Re: XFS crash?)

2014-06-24 Thread Austin Schuh
[Adding tglx to the cc. Sorry for any double sends]
On Mon, Jun 23, 2014 at 8:25 PM, Tejun Heo wrote:
> Hello,
>
> On Tue, Jun 24, 2014 at 01:02:40PM +1000, Dave Chinner wrote:
>> start_flush_work() is effectively a special queue_work()
>> implementation, so if it's not safe to call

Re: On-stack work item completion race? (was Re: XFS crash?)

2014-06-24 Thread Austin Schuh
[Adding tglx to the cc. Sorry for any double sends] On Mon, Jun 23, 2014 at 8:25 PM, Tejun Heo t...@kernel.org wrote: Hello, On Tue, Jun 24, 2014 at 01:02:40PM +1000, Dave Chinner wrote: start_flush_work() is effectively a special queue_work() implementation, so if it's not safe to call

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-05-21 Thread Austin Schuh
On Wed, May 21, 2014 at 12:30 PM, John Blackwood wrote:
>> Date: Wed, 21 May 2014 03:33:49 -0400
>> From: Richard Weinberger
>> To: Austin Schuh
>> CC: LKML, xfs, rt-users
>>
>> Subject: Re: Filesystem lockup with CONFIG_PREEMPT_RT
>
>>

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-05-21 Thread Austin Schuh
On Tue, May 13, 2014 at 7:29 PM, Austin Schuh wrote:
> Hi,
>
> I am observing a filesystem lockup with XFS on a CONFIG_PREEMPT_RT
> patched kernel. I have currently only triggered it using dpkg. Dave
> Chinner on the XFS mailing list suggested that it was a rt-kernel
>

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-05-21 Thread Austin Schuh
On Tue, May 13, 2014 at 7:29 PM, Austin Schuh aus...@peloton-tech.com wrote: Hi, I am observing a filesystem lockup with XFS on a CONFIG_PREEMPT_RT patched kernel. I have currently only triggered it using dpkg. Dave Chinner on the XFS mailing list suggested that it was a rt-kernel

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-05-21 Thread Austin Schuh
On Wed, May 21, 2014 at 12:30 PM, John Blackwood john.blackw...@ccur.com wrote: Date: Wed, 21 May 2014 03:33:49 -0400 From: Richard Weinberger richard.weinber...@gmail.com To: Austin Schuh aus...@peloton-tech.com CC: LKML linux-kernel@vger.kernel.org, xfs x...@oss.sgi.com, rt-users

Filesystem lockup with CONFIG_PREEMPT_RT

2014-05-13 Thread Austin Schuh
Hi, I am observing a filesystem lockup with XFS on a CONFIG_PREEMPT_RT patched kernel. I have currently only triggered it using dpkg. Dave Chinner on the XFS mailing list suggested that it was a rt-kernel workqueue issue as opposed to a XFS problem after looking at the kernel messages. $ uname

Re: [PATCH] genirq: Sanitize spurious interrupt detection of threaded irqs

2014-04-28 Thread Austin Schuh
On Mon, Apr 7, 2014 at 1:08 PM, Austin Schuh wrote:
> On Mon, Apr 7, 2014 at 1:07 PM, Thomas Gleixner wrote:
>> On Mon, 7 Apr 2014, Austin Schuh wrote:
>>> You originally sent the patch out. I could send your patch out back
>>> to you, but that feels a bit weird ;)
>

Re: [PATCH] genirq: Sanitize spurious interrupt detection of threaded irqs

2014-04-28 Thread Austin Schuh
On Mon, Apr 7, 2014 at 1:08 PM, Austin Schuh aus...@peloton-tech.com wrote: On Mon, Apr 7, 2014 at 1:07 PM, Thomas Gleixner t...@linutronix.de wrote: On Mon, 7 Apr 2014, Austin Schuh wrote: You originally sent the patch out. I could send your patch out back to you, but that feels a bit weird

Re: [PATCH] genirq: Sanitize spurious interrupt detection of threaded irqs

2014-04-07 Thread Austin Schuh
On Mon, Apr 7, 2014 at 1:07 PM, Thomas Gleixner wrote:
> On Mon, 7 Apr 2014, Austin Schuh wrote:
>> You originally sent the patch out. I could send your patch out back
>> to you, but that feels a bit weird ;)
>
> Wheee. Let me dig in my archives https://lkml.org/lkml

Re: [PATCH] genirq: Sanitize spurious interrupt detection of threaded irqs

2014-04-07 Thread Austin Schuh
On Mon, Apr 7, 2014 at 11:41 AM, Thomas Gleixner wrote:
> On Mon, 7 Apr 2014, Austin Schuh wrote:
>
>> Hi Thomas,
>>
>> Did anything come of this patch? Both Oliver and I have found that it
>> fixes real problems. I have multiple machines which have been running
>

Re: [PATCH] genirq: Sanitize spurious interrupt detection of threaded irqs

2014-04-07 Thread Austin Schuh
for two
> weeks now without problems.
>
> If you want me to test an improved version (as Austin suggested below) please
> send a patch.
>
> Best regards,
> Oliver
>
> On 23.12.2013 20:25, Austin Schuh wrote:
>> Hi Thomas,
>>
>> Did anything happen

Re: [PATCH] genirq: Sanitize spurious interrupt detection of threaded irqs

2014-04-07 Thread Austin Schuh
(as Austin suggested below) please send a patch. Best regards, Oliver On 23.12.2013 20:25, Austin Schuh wrote: Hi Thomas, Did anything happen with your patch to note_interrupt, originally posted on May 8th of 2013? (https://lkml.org/lkml/2013/3/7/222) I am seeing an issue on a machine

Re: [PATCH] genirq: Sanitize spurious interrupt detection of threaded irqs

2014-04-07 Thread Austin Schuh
On Mon, Apr 7, 2014 at 11:41 AM, Thomas Gleixner t...@linutronix.de wrote: On Mon, 7 Apr 2014, Austin Schuh wrote: Hi Thomas, Did anything come of this patch? Both Oliver and I have found that it fixes real problems. I have multiple machines which have been running with the patch since

Re: [PATCH] genirq: Sanitize spurious interrupt detection of threaded irqs

2014-04-07 Thread Austin Schuh
On Mon, Apr 7, 2014 at 1:07 PM, Thomas Gleixner t...@linutronix.de wrote: On Mon, 7 Apr 2014, Austin Schuh wrote: You originally sent the patch out. I could send your patch out back to you, but that feels a bit weird ;) Wheee. Let me dig in my archives https://lkml.org/lkml/2013/3/7