Re: [ANNOUNCE] 3.6.1-rt1

2012-10-10 Thread Thomas Gleixner
On Tue, 9 Oct 2012, Steven Rostedt wrote:

> On Tue, 2012-10-09 at 20:21 -0400, Steven Rostedt wrote:
> 
> > > 0007-stomp-machine-deal-clever-with-stopper-lock.patch
> > 
> > With this one, things have changed quite a bit. I'll take a deeper look
> > at what you did and figure out how this applies to v3.0-rt.
> 
> It doesn't look like this patch is needed for v3.0-rt as the patch
> addresses the new stop_machine_from_inactive_cpu() API used by the mtrr
> code added by this commit:
> 
> commit 192d8857427dd23707d5f0b86ca990c3af6f2d74
> Author: Suresh Siddha 
> Date:   Thu Jun 23 11:19:29 2011 -0700
> 
> x86, mtrr: use stop_machine APIs for doing MTRR rendezvous
> 
> 
> This was added in 3.1 so the patch is still required for v3.2-rt.

Correct. 3.0 is not affected.

 tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Steven Rostedt
On Tue, 2012-10-09 at 20:21 -0400, Steven Rostedt wrote:

> > 0007-stomp-machine-deal-clever-with-stopper-lock.patch
> 
> With this one, things have changed quite a bit. I'll take a deeper look
> at what you did and figure out how this applies to v3.0-rt.

It doesn't look like this patch is needed for v3.0-rt as the patch
addresses the new stop_machine_from_inactive_cpu() API used by the mtrr
code added by this commit:

commit 192d8857427dd23707d5f0b86ca990c3af6f2d74
Author: Suresh Siddha 
Date:   Thu Jun 23 11:19:29 2011 -0700

x86, mtrr: use stop_machine APIs for doing MTRR rendezvous


This was added in 3.1 so the patch is still required for v3.2-rt.

-- Steve




Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Steven Rostedt


On Tue, 2012-10-09 at 14:19 -0400, Steven Rostedt wrote:

I applied and tested the backported patches to 3.4-rt. Things look good
and will be posting the -rc1 soon.

Status for 3.0-rt:


> -scsi-qla2xxx-fix-bug-sleeping-function-called-from-invalid-context.patch
> 0001-upstream-net-rt-remove-preemption-disabling-in-netif_rx.patch

The above two have been applied to both 3.0-rt and 3.4-rt previously.

> 0002-random-make-it-work-on-rt.patch

The above has been applied to 3.4-rt but is not applicable to 3.0-rt, as
the add_interrupt_randomness() does not exist.

> 0003-softirq-init-softirq-local-lock-after-per-cpu-section-is-set-up.patch

For 3.0-rt the softirq_early_init() comes after the printk banner, which
is after per cpu data has been set up. But for consistency, I had this
patch move it to before the banner, the same location as in 3.4-rt+.


> 0004-mm-slab-fix-potential-deadlock.patch
> 0005-mm-page-alloc-use-local-lock-on-target-cpu.patch
> 0006-rt-rw-lockdep-annotations.patch

The above applied to 3.0-rt with no issues.

* sched-better-debug-output-for-might-sleep.patch 

Had slight conflicts, but trivial fix.

> 0007-stomp-machine-deal-clever-with-stopper-lock.patch

With this one, things have changed quite a bit. I'll take a deeper look
at what you did and figure out how this applies to v3.0-rt.

-- Steve





Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Tim Sander
Hi Thomas
> I'm pleased to announce the 3.6.1-rt1 release.
I also have to second the big thanks of Steven!

>* Fix for a potential deadlock in mm/slab.c. This had been reported
>  as lockdep splats several times and stupidly ignored as a false
>  positive, but in fact it's a real (though almost impossible to
>  trigger) deadlock lurking.
Hm, just an unverified observation from my side (I haven't had the time to
dig into it): I think I have been seeing a total lockup in an OOM
condition. Is it possible that this deadlock is triggered more
"reliably" in such a condition?

Best regards
Tim


Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Thomas Gleixner
On Tue, 9 Oct 2012, Steven Rostedt wrote:

> On Tue, 2012-10-09 at 15:46 +0200, Thomas Gleixner wrote:
> 
> > The RT patch against 3.6.1 can be found here:
> > 
> >   http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patch-3.6.1-rt1.patch.xz
> > 
> > The split quilt queue is available at:
> > 
> >   http://www.kernel.org/pub/linux/kernel/projects/rt/3.4/patches-3.6.1-rt1.tar.xz
> > 
> 
> I think you meant:
> 
> 
> http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patches-3.6.1-rt1.tar.xz

Bah. Copy and paste should be something which can be disabled :)


Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Steven Rostedt
On Tue, 2012-10-09 at 14:19 -0400, Steven Rostedt wrote:
> On Tue, 2012-10-09 at 15:46 +0200, Thomas Gleixner wrote:
> > Dear RT Folks,
> > 
> > I'm pleased to announce the 3.6.1-rt1 release.
> > 
> > This is a pretty straightforward move from the 3.4-rt series which
> > includes a few significant updates which need to be backported to the
> > 3.x-rt stable series:
> 
> My scripts detected these patches to be pulled into stable. It detects
> any patch that has a Cc: to stable...@vger.kernel.org that does not
> already exist in stable. It also adds the '000x-' prefix to keep the
> order.
> 
> -scsi-qla2xxx-fix-bug-sleeping-function-called-from-invalid-context.patch
> 0001-upstream-net-rt-remove-preemption-disabling-in-netif_rx.patch

The above seem to be already applied (at least to 3.4-rt). Not sure why
my scripts missed it. Perhaps these were the ones added directly, and
the names of the patch that I used did not match your names. My script
saves off what it already applied, to know what it can skip for later
pulls.

-- Steve

> 0002-random-make-it-work-on-rt.patch
> 0003-softirq-init-softirq-local-lock-after-per-cpu-section-is-set-up.patch
> 0004-mm-slab-fix-potential-deadlock.patch
> 0005-mm-page-alloc-use-local-lock-on-target-cpu.patch
> 0006-rt-rw-lockdep-annotations.patch
> 0007-stomp-machine-deal-clever-with-stopper-lock.patch
> 




Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Steven Rostedt
On Tue, 2012-10-09 at 15:46 +0200, Thomas Gleixner wrote:
> Dear RT Folks,
> 
> I'm pleased to announce the 3.6.1-rt1 release.
> 
> This is a pretty straightforward move from the 3.4-rt series which
> includes a few significant updates which need to be backported to the
> 3.x-rt stable series:

My scripts detected these patches to be pulled into stable. It detects
any patch that has a Cc: to stable...@vger.kernel.org that does not
already exist in stable. It also adds the '000x-' prefix to keep the
order.

-scsi-qla2xxx-fix-bug-sleeping-function-called-from-invalid-context.patch
0001-upstream-net-rt-remove-preemption-disabling-in-netif_rx.patch
0002-random-make-it-work-on-rt.patch
0003-softirq-init-softirq-local-lock-after-per-cpu-section-is-set-up.patch
0004-mm-slab-fix-potential-deadlock.patch
0005-mm-page-alloc-use-local-lock-on-target-cpu.patch
0006-rt-rw-lockdep-annotations.patch
0007-stomp-machine-deal-clever-with-stopper-lock.patch
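
A minimal sketch of the selection step described above, not the actual
script. It assumes each patch is available as a (name, text) pair, that a
stable tag is any "Cc:" line mentioning "stable", and that numbering
starts at 0000. The names select_for_stable and already_applied are made
up for illustration.

```python
import re

# Assumption: any "Cc:" header line that mentions "stable" marks the patch.
STABLE_CC = re.compile(r"^Cc:.*\bstable", re.IGNORECASE | re.MULTILINE)


def select_for_stable(series, already_applied):
    """Return order-prefixed names of patches that still need to be pulled.

    series          -- iterable of (patch_name, patch_text) in series order
    already_applied -- set of patch names recorded by earlier pulls
    """
    picked = [
        name
        for name, text in series
        if STABLE_CC.search(text) and name not in already_applied
    ]
    # Prefix with a zero-padded counter so the original order is kept.
    return [f"{index:04d}-{name}" for index, name in enumerate(picked)]
```

Passing the names saved by an earlier pull as already_applied gives the
skip behaviour; a patch whose name differs from the saved one would be
picked up again.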

> 
>* Make interrupt randomness work again on RT. Based on the 3.x.y
>  stable updates in that area. Should be applicable to all 3.x-rt
>  series with almost no modifications.

Looks to be: random-make-it-work-on-rt.patch

> 
>* RT softirq initialization sequence fix (Steven Rostedt)

As I remembered that I forgot to Cc stable-rt, I manually added it to my
patch before running the script.

> 
>* Fix for a potential deadlock in mm/slab.c. This had been reported
>  as lockdep splats several times and stupidly ignored as a false
>  positive, but in fact it's a real (though almost impossible to
>  trigger) deadlock lurking.

Looks to be: mm-slab-fix-potential-deadlock.patch

> 
>* Use the proper local_lock primitives in mm/page_alloc.c. That's
>  not a real bug, but this fixes an inconsistency which helps
>  debuggability and is therefore worth backporting.

Looks to be: mm-page-alloc-use-local-lock-on-target-cpu.patch

> 
>* RT-rwlock/rwsem annotations:

Looks to be: rt-rw-lockdep-annotations.patch

> 
>  RT does not allow multiple readers on rwlocks and rwsems. The
>  lockdep annotations did not yet consider that fact. One might
>  think that this is a complete RT specific issue, but it's
>  not. The FIFO fair rwsem/lock modifications in mainline made
>  reader/writer primitives prone to very subtle deadlock problems
>  which cannot be detected by the current lockdep annotations in
>  mainline. The reason is that if a writer interleaves with two
>  readers it will block the second reader from proceeding in order
>  not to allow writer starvation. The restricted RWlocks semantics
>  of RT allow an easy detection of that problem. We already
>  triggered a real deadlock in RT (see:
>  peterz-srcu-crypto-chain.patch) which could result in a hard to
>  trigger, but mainline relevant deadlock. Wait for more
>  interesting problems in that area.
> 
>* The output of might_sleep debugging is silent about the possible
>  causes vs. the preempt count. Contrary to interrupt disabling
>  there is zero information about what disabled preemption
>  last. Again, not strictly a bugfix, but debuggability is key.

Is this: sched-better-debug-output-for-might-sleep.patch ? It's not
marked to Cc stable-rt.

> 
>* Fix a potentially deadly sto(m)p_machine deadlock. A CPU which
>  calls that code from its inactive state (don't ask me for the
>  ghastly details why this is necessary) can run into a contended
>  state of the stomp machine mutex which would cause a rather
>  awkward issue of idle scheduling itself away to idle as the only
>  possible task on that upcoming cpu. Not pretty.

Looks to be: stomp-machine-deal-clever-with-stopper-lock.patch


If I'm wrong with the above, let me know. Thanks,

-- Steve




Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Steven Rostedt
On Tue, 2012-10-09 at 15:46 +0200, Thomas Gleixner wrote:

> The RT patch against 3.6.1 can be found here:
> 
>   http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patch-3.6.1-rt1.patch.xz
> 
> The split quilt queue is available at:
> 
>   http://www.kernel.org/pub/linux/kernel/projects/rt/3.4/patches-3.6.1-rt1.tar.xz
> 

I think you meant:


http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patches-3.6.1-rt1.tar.xz

-- Steve




Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Thomas Gleixner
On Tue, 9 Oct 2012, Steven Rostedt wrote:
> On Tue, 2012-10-09 at 18:19 +0200, Thomas Gleixner wrote:
> >  
> > > I've started looking at playing with the NAPI code again, and trying to
> > > see if I can add an ENAPI interface (Even Newer API), where the driver
> > > uses its own interrupt thread, and instead of having the polling in the
> > > network softirq, it can do the polling in its own thread.
> > 
> > It's pretty close to the behaviour I enforced with this change. Let's
> > play with that and figure out what influence it has on the network
> > throughput performance on RT. That needs probably a different
> > scheduling scheme than what Carsten needs for his deterministic
> > behaviour.
> > 
> 
> I was actually looking at the change for mainline, not for -rt ;-)

I know, but you can utilize RT for figuring out what kind of
performance impact (in whatever direction) this modus operandi
has. That gives us a better understanding and hopefully improvements
for RT, but at the same time a lot of insight into how we should handle
this scenario on a non-RT kernel. You might try to make the softirq
split lock scheme work in CONFIG_RT_BASE as this gives us a way better
comparison to mainline behaviour.

Thanks,

tglx


Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Steven Rostedt
On Tue, 2012-10-09 at 18:19 +0200, Thomas Gleixner wrote:
>  
> > I've started looking at playing with the NAPI code again, and trying to
> > see if I can add an ENAPI interface (Even Newer API), where the driver
> > uses its own interrupt thread, and instead of having the polling in the
> > network softirq, it can do the polling in its own thread.
> 
> It's pretty close to the behaviour I enforced with this change. Let's
> play with that and figure out what influence it has on the network
> throughput performance on RT. That needs probably a different
> scheduling scheme than what Carsten needs for his deterministic
> behaviour.
> 

I was actually looking at the change for mainline, not for -rt ;-)

-- Steve




Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Thomas Gleixner
On Tue, 9 Oct 2012, Steven Rostedt wrote:
> On Tue, 2012-10-09 at 15:46 +0200, Thomas Gleixner wrote:
> >  So instead of splitting the softirq threads I split the softirq
> > >  locks so different softirqs can be handled separately. If a
> > >  softirq is raised in the context of a thread, then it's noted in
> >  the task struct and when the thread leaves the bh disabled
> >  section it handles this particular soft interrupt in its own
> >  context. This removes the burden of running completely unrelated
> >  softirqs like timers, tasklets etc. from a context which raised a
> >  network soft interrupt. That way the softirq processing is
> >  coupled to the originating thread and its scheduling properties,
> >  so the need for finding optimal parameters should be gone.
> 
> Very interesting. I haven't looked at the patches yet (will do that
> after I finish with the stable merge releases), but I started looking
> into the softirq changes as well, and came up with something almost
> identical. I talked a little with Carsten about this, and he told me to
> wait for your release, which I then did, and I'm glad I did :-)
> 
> I was looking specifically at the network softirqs as well, and started
> some patches to separate out the softirqs with the task (sounds similar
> to what you did). But before that, I also played with the
> local_softirq_lock. For the end of interrupt processing only (where it
> should always be safe to lock), if the trylock fails, I grabbed it and
> then released it. Because if a lower priority task is currently running
> the softirq that the higher priority interrupt wants to run, it would at
> least priority boost the lower thread, and the higher priority interrupt
> could run its softirq at its priority. Maybe this can still be added?

I take the lock unconditionally now on local_bh_enable() to enforce
exactly that behaviour. That's what Carsten needs for his
deterministic networking stuff and my goal was zero configuration. Well,
it's not zero as you still have to get the priorities of the app and the
network irq thread straight, but the extra softirq fiddling is gone.
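
The bookkeeping described in the quoted announcement text can be modelled
roughly as follows. This is a toy model in Python, not kernel code: the
class and attribute names are made up, locking and priorities are left
out, and it only shows which context ends up running which softirq.

```python
class Task:
    """A thread with its own list of softirqs it raised (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.bh_depth = 0   # nesting depth of bh-disabled sections
        self.pending = []   # softirqs noted in this task


class SoftirqModel:
    def __init__(self):
        self.ksoftirqd = []  # softirqs raised from hard interrupt context
        self.handled = []    # (task name, softirq) in the order they ran

    def raise_softirq(self, nr, task=None):
        # task=None stands for hard interrupt context, which has no
        # thread to attach the softirq to, so it goes to ksoftirqd.
        if task is None:
            self.ksoftirqd.append(nr)
        else:
            task.pending.append(nr)

    def local_bh_disable(self, task):
        task.bh_depth += 1

    def local_bh_enable(self, task):
        task.bh_depth -= 1
        if task.bh_depth == 0:
            # Leaving the outermost bh-disabled section: the task runs
            # only the softirqs it raised itself, in its own context.
            while task.pending:
                self.handled.append((task.name, task.pending.pop(0)))
```

In this model a network thread that raises NET_RX inside a bh-disabled
section handles only NET_RX when it leaves the section, while a timer
softirq raised from hard interrupt context stays queued for ksoftirqd.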

> >  Now this only works for soft interrupts which are raised in the
> >  context of a thread. Unfortunately there is no way to do the same
> >  for soft interrupts which are raised in hard interrupt context
> >  (e.g. RCU, timers). They have no thread associated and are
> > >  therefore delegated to ksoftirqd. This is ok, except that it does
> >  not help people who want to use signal based timers, but that
> >  problem needs to be solved by moving the complex handling into
> >  the context of the thread which is going to receive the signal
> >  and should vanish from the softirq processing completely.
> >  
> >  In principle we should have even in mainline a clear separation
> >  of which soft interrupts are disabled by a particular code region
> >  instead of disabling them wholesale. Though the nicest solution
> >  would be to get rid of them completely :)
> 
> I've started looking at playing with the NAPI code again, and trying to
> see if I can add an ENAPI interface (Even Newer API), where the driver
> uses its own interrupt thread, and instead of having the polling in the
> network softirq, it can do the polling in its own thread.

It's pretty close to the behaviour I enforced with this change. Let's
play with that and figure out what influence it has on the network
throughput performance on RT. That needs probably a different
scheduling scheme than what Carsten needs for his deterministic
behaviour.

Thanks,

tglx


Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Steven Rostedt
On Tue, 2012-10-09 at 15:46 +0200, Thomas Gleixner wrote:
> Dear RT Folks,
> 
> I'm pleased to announce the 3.6.1-rt1 release.

Thomas,

First I want to say, and I'm sure I speak for a lot of people on this:
"Thank you!". I know how hard it is to deal with the issues of
mainline in an RT-specific way, and to balance the determinism
required by RT with non-intrusiveness to the work flow of mainline.
When this is done right, both mainline and RT benefit. Interestingly
enough, Linus knew this a long time ago, and by denying RT-only
enhancements to the kernel, he forced us to improve mainline in
general ;-)

> 
> This is a pretty straightforward move from the 3.4-rt series which
> includes a few significant updates which need to be backported to the
> 3.x-rt stable series:
> 
>* Make interrupt randomness work again on RT. Based on the 3.x.y
>  stable updates in that area. Should be applicable to all 3.x-rt
>  series with almost no modifications.
> 
>* RT softirq initialization sequence fix (Steven Rostedt)
> 
>* Fix for a potential deadlock in mm/slab.c. This had been reported
>  as lockdep splats several times and stupidly ignored as a false
>  positive, but in fact it's a real (though almost impossible to
>  trigger) deadlock lurking.
> 
>* Use the proper local_lock primitives in mm/page_alloc.c. That's
>  not a real bug, but this fixes an inconsistency which helps
>  debuggability and is therefore worth backporting.
> 
>* RT-rwlock/rwsem annotations:
> 
>  RT does not allow multiple readers on rwlocks and rwsems. The
>  lockdep annotations did not yet consider that fact. One might
>  think that this is a complete RT specific issue, but it's
>  not. The FIFO fair rwsem/lock modifications in mainline made
>  reader/writer primitives prone to very subtle deadlock problems
>  which cannot be detected by the current lockdep annotations in
>  mainline. The reason is that if a writer interleaves with two
>  readers it will block the second reader from proceeding in order
>  not to allow writer starvation. The restricted RWlocks semantics
>  of RT allow an easy detection of that problem. We already
>  triggered a real deadlock in RT (see:
>  peterz-srcu-crypto-chain.patch) which could result in a hard to
>  trigger, but mainline relevant deadlock. Wait for more
>  interesting problems in that area.
> 
>* The output of might_sleep debugging is silent about the possible
>  causes vs. the preempt count. Contrary to interrupt disabling
>  there is zero information about what disabled preemption
>  last. Again, not strictly a bugfix, but debuggability is key.
> 
>* Fix a potentially deadly sto(m)p_machine deadlock. A CPU which
>  calls that code from its inactive state (don't ask me for the
>  ghastly details why this is necessary) can run into a contended
>  state of the stomp machine mutex which would cause a rather
>  awkward issue of idle scheduling itself away to idle as the only
>  possible task on that upcoming cpu. Not pretty.
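
To illustrate the rwlock/rwsem item in the quoted list: a toy model of a
FIFO-fair reader/writer lock, assuming only the behaviour stated there (a
queued writer stops later readers from proceeding). The class and method
names are made up; this is not how the kernel implements it.

```python
from collections import deque


class FairRWLock:
    """Toy FIFO-fair reader/writer lock; methods report would-block."""

    def __init__(self):
        self.readers = 0
        self.writer = False
        self.waiters = deque()  # FIFO of ("r" or "w", name)

    def read_lock(self, name):
        # Fairness: a new reader may not pass anything already queued,
        # otherwise a stream of readers could starve the writer.
        if self.writer or self.waiters:
            self.waiters.append(("r", name))
            return False
        self.readers += 1
        return True

    def write_lock(self, name):
        if self.writer or self.readers or self.waiters:
            self.waiters.append(("w", name))
            return False
        self.writer = True
        return True
```

With reader A holding the lock, writer W queues up and reader B then
queues behind W. If A only drops the lock after B has made progress,
nothing ever runs again: W waits for A, B waits behind W, and A waits
for B.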

Here's my road map as everyone just loves them:

  I'm finishing up on releasing the next merge of:

   3.0.45-rt67 and 3.4.13-rt21

  These are only merging the stable 3.0.45 and 3.4.13 without any -rt
  specific changes.

  I'll then backport these fixes to the stable release and release an
  -rc for 3.0.45-rt68 and 3.4.13-rt22

  For 3.2-rt, I'm waiting for the final release of 3.2.31 to be done
  and will be going through the same ordeal with that. That is, I'll
  release a 3.2.31 merged rt only (3.2.31-rt46) and then backport
  and release a -rc for 3.2.31-rt47. This will come later.

> 
> 
> There is also a fundamental change worth mentioning in this release:
> 
>* Split softirq locks

Although this work is not for stable (and shouldn't be), I'm thinking
about backporting these to the 3.2 and 3.4 trees and creating a separate
branch for them. This way, those that want this feature based on the
3.2/3.4 stable trees, can have the same repository to work from.

> 
>  In the pre-3.x-RT versions we spawned a separate thread for each
>  softirq on each CPU. This served the PER_CPUness requirements,
>  but did not provide any means against priority inversions
>  vs. softirqs.
> 
>  With the start of the 3.0-rt series I decided to drop the per
>  softirq threads for simplicity reasons as I had to deal with all
>  the fallout of the migration disabling design I had taken recourse
>  to.
> 
>  I got several complaints about the missing softirq thread split
>  since then and a few patches to reestablish them. I refused to
>  take those patches for a simple reason: configuration. It's
>  extremely hard to get the parameters right for a RT system in
>  general. Adding something as obscure as soft interrupts to
>  the system designer's todo list is a bad idea.
> 
>  Now I spent quite some time on analysing the most urgent issues
>  on RT:
>  

Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Thomas Gleixner
On Tue, 9 Oct 2012, Steven Rostedt wrote:
 On Tue, 2012-10-09 at 15:46 +0200, Thomas Gleixner wrote:
   So instead of splitting the softirq threads I split the softirq
   locks so different softirqs can be handled seperately. If a
   softirq is raised in the context of a thread, then its noted in
   the task struct and when the thread leaves the bh disabled
   section it handles this particular soft interrupt in its own
   context. This removes the burden of running completely unrelated
   softirqs like timers, tasklets etc. from a context which raised a
   network soft interrupt. That way the softirq processing is
   coupled to the originating thread and its scheduling properties,
   so the need for finding optimal parameters should be gone.
 
 Very interesting. I haven't looked at the patches yet (will do that
 after I finish with the stable merge releases), but I started looking
 into the softirq changes as well, and came up with something almost
 identical. I talked a little with Carsten about this, and he told me to
 wait for your release, which I then did, and I'm glad I did :-)
 
 I was looking specifically at the network softirqs as well, and started
 some patches to separate out the softirqs with the task (sounds similar
 to what you did). But before that, I also played with the
 local_softirq_lock. For the end of interrupt processing only (where it
 should always be safe to lock), if the trylock fails, I grabbed it and
 then released it. Because if a lower priority task is currently running
 the softirq that the higher priority interrupt wants to run, it would at
 least priority boost the lower thread, and the higher priority interrupt
 could run its softirq at its priority. Maybe this can still be added?

I take the lock unconditionally now on local_bh_enable() to enforce
exactly that behaviour. That's what Carsten needs for his
deterministic networking stuff and my goal was zero configuration. Well,
it's not zero as you still have to get the priorities of the app and the
network irq thread straight, but the extra softirq fiddling is gone.

   Now this only works for soft interrupts which are raised in the
   context of a thread. Unfortunately there is no way to do the same
   for soft interrupts which are raised in hard interrupt context
   (e.g. RCU, timers). They have no thread associated and are
   therefor delegated to ksoftirqd. This is ok, except that it does
   not help people who want to use signal based timers, but that
   problem needs to be solved by moving the complex handling into
   the context of the thread which is going to receive the signal
   and should vanish from the softirq processing completely.
   
   In principle we should have even in mainline a clear separation
   of which soft interrupts are disabled by a particular code region
   instead of disabling them wholesale. Though the nicest solution
   would be to get rid of them completely :)
 
 I've started looking at playing with the NAPI code again, and trying to
 see if I can add an ENAPI interface (Even Newer API), where the driver
 uses its own interrupt thread, and instead of having the polling in the
 network softirq, it can do the polling in its own thread.

It's pretty close to the behaviour I enforced with this change. Let's
play with that and figure out what influence it has on the network
throughput performance on RT. That needs probably a different
scheduling scheme than what Carsten needs for his deterministic
behaviour.

Thanks,

tglx


Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Steven Rostedt
On Tue, 2012-10-09 at 18:19 +0200, Thomas Gleixner wrote:
  
  I've started looking at playing with the NAPI code again, and trying to
  see if I can add an ENAPI interface (Even Newer API), where the driver
  uses its own interrupt thread, and instead of having the polling in the
  network softirq, it can do the polling in its own thread.
 
 It's pretty close to the behaviour I enforced with this change. Let's
 play with that and figure out what influence it has on the network
 throughput performance on RT. That needs probably a different
 scheduling scheme than what Carsten needs for his deterministic
 behaviour.
 

I was actually looking at the change for mainline, not for -rt ;-)

-- Steve




Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Thomas Gleixner
On Tue, 9 Oct 2012, Steven Rostedt wrote:
 On Tue, 2012-10-09 at 18:19 +0200, Thomas Gleixner wrote:
   
   I've started looking at playing with the NAPI code again, and trying to
   see if I can add an ENAPI interface (Even Newer API), where the driver
   uses its own interrupt thread, and instead of having the polling in the
   network softirq, it can do the polling in its own thread.
  
  It's pretty close to the behaviour I enforced with this change. Let's
  play with that and figure out what influence it has on the network
  throughput performance on RT. That needs probably a different
  scheduling scheme than what Carsten needs for his deterministic
  behaviour.
  
 
 I was actually looking at the change for mainline, not for -rt ;-)

I know, but you can utilize RT for figuring out what kind of
performance impact (in whatever direction) this modus operandi
has. That gives us a better understanding and hopefully improvements
for RT, but at the same time a lot of insight into how we should handle
this scenario on a non RT kernel. You might try to make the softirq
split lock scheme work in CONFIG_RT_BASE as this gives us a way better
comparison to mainline behaviour.

Thanks,

tglx


Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Steven Rostedt
On Tue, 2012-10-09 at 15:46 +0200, Thomas Gleixner wrote:

 The RT patch against 3.6.1 can be found here:
 
   
 http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patch-3.6.1-rt1.patch.xz
 
 The split quilt queue is available at:
 
   
 http://www.kernel.org/pub/linux/kernel/projects/rt/3.4/patches-3.6.1-rt1.tar.xz
 

I think you meant:


http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patches-3.6.1-rt1.tar.xz

-- Steve




Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Steven Rostedt
On Tue, 2012-10-09 at 15:46 +0200, Thomas Gleixner wrote:
 Dear RT Folks,
 
 I'm pleased to announce the 3.6.1-rt1 release.
 
 This is a pretty straightforward move from the 3.4-rt series which
 includes a few significant updates which need to be backported to the
 3.x-rt stable series:

My scripts detected these patches to be pulled into stable. They detect
any patch that has a Cc: to stable...@vger.kernel.org and does not
already exist in stable. They also add the '000x-' prefix to keep the
order.

-scsi-qla2xxx-fix-bug-sleeping-function-called-from-invalid-context.patch
0001-upstream-net-rt-remove-preemption-disabling-in-netif_rx.patch
0002-random-make-it-work-on-rt.patch
0003-softirq-init-softirq-local-lock-after-per-cpu-section-is-set-up.patch
0004-mm-slab-fix-potential-deadlock.patch
0005-mm-page-alloc-use-local-lock-on-target-cpu.patch
0006-rt-rw-lockdep-annotations.patch
0007-stomp-machine-deal-clever-with-stopper-lock.patch
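
The selection described above can be sketched roughly as follows. This
is not the actual script, only a minimal illustration: the function
name, the (name, text) input format and the sample patch contents are
hypothetical, and the match is simply a Cc: line that mentions "stable".

```python
import re
from typing import Iterable, List, Set, Tuple

# A patch qualifies if any "Cc:" line in it mentions "stable".
STABLE_CC = re.compile(r"^Cc:.*stable", re.IGNORECASE | re.MULTILINE)


def select_for_stable(series: Iterable[Tuple[str, str]],
                      already_applied: Set[str]) -> List[str]:
    """Return numbered names of patches to pull, in series order."""
    picked = []
    for name, text in series:
        if name in already_applied:
            continue  # remembered from an earlier pull, skip it
        if STABLE_CC.search(text):
            picked.append(name)
    # The zero-padded prefix preserves the original ordering.
    return [f"{i:04d}-{name}" for i, name in enumerate(picked, start=1)]


# Hypothetical sample data, not the real patch contents.
series = [
    ("random-make-it-work-on-rt.patch", "Subject: a\nCc: stable-rt\n"),
    ("some-unrelated-change.patch", "Subject: b\n"),
    ("mm-slab-fix-potential-deadlock.patch", "Subject: c\ncc: stable-rt\n"),
    ("previously-pulled.patch", "Subject: d\nCc: stable-rt\n"),
]
result = select_for_stable(series, already_applied={"previously-pulled.patch"})
```

Matching on patch names is also why a renamed patch can slip through:
if the name recorded as already applied differs from the name in the
series, the patch is selected again.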

 
* Make interrupt randomness work again on RT. Based on the 3.x.y
  stable updates in that area. Should be applicable to all 3.x-rt
  series with almost no modifications.

Looks to be: random-make-it-work-on-rt.patch

 
* RT softirq initialization sequence fix (Steven Rostedt)

As I remembered that I forgot to Cc stable-rt, I manually added it to my
patch before running the script.

 
* Fix for a potential deadlock in mm/slab.c. This had been reported
  as lockdep splats several times and stupidly ignored as a false
  positive, but in fact it's a real (though almost impossible to
  trigger) deadlock lurking.

Looks to be: mm-slab-fix-potential-deadlock.patch

 
* Use the proper local_lock primitives in mm/page_alloc.c. That's
  not a real bug, but this fixes an inconsistency which helps
  debuggability and is therefore worth backporting.

Looks to be: mm-page-alloc-use-local-lock-on-target-cpu.patch

 
* RT-rwlock/rwsem annotations:

Looks to be: rt-rw-lockdep-annotations.patch

 
  RT does not allow multiple readers on rwlocks and rwsems. The
  lockdep annotations did not yet consider that fact. One might
  think that this is a complete RT specific issue, but it's
  not. The FIFO fair rwsem/lock modifications in mainline made
  reader/writer primitives prone to very subtle deadlock problems
  which cannot be detected by the current lockdep annotations in
  mainline. The reason is that if a writer interleaves with two
  readers it will block the second reader from proceeding in order
  not to allow writer starvation. The restricted RWlocks semantics
  of RT allow an easy detection of that problem. We already
  triggered a real deadlock in RT (see:
  peterz-srcu-crypto-chain.patch) which could result in a hard to
  trigger, but mainline relevant deadlock. Wait for more
  interesting problems in that area.
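
The interleaving described in this item can be modelled with a toy
wait-for graph. This is only a sketch under stated assumptions, not the
kernel's rwsem or lockdep code: FairRWLock, has_cycle and the task names
are invented for illustration, and the dependency of the first reader on
the second is an assumption added to close the cycle.

```python
from typing import Dict, Optional, Set


class FairRWLock:
    """Toy FIFO-fair reader/writer lock: a queued writer blocks new readers."""

    def __init__(self) -> None:
        self.readers: Set[str] = set()
        self.queued_writer: Optional[str] = None

    def read_lock(self, task: str) -> Optional[str]:
        """Return None if acquired, else the task this reader waits for."""
        if self.queued_writer is not None:
            return self.queued_writer  # fairness: do not overtake the writer
        self.readers.add(task)
        return None

    def write_lock(self, task: str) -> Set[str]:
        """Return the readers the writer must wait for (empty if acquired)."""
        if self.readers:
            self.queued_writer = task
        return set(self.readers)


def has_cycle(waits_for: Dict[str, Set[str]]) -> bool:
    """Depth-first search for a cycle in the wait-for graph."""
    visiting: Set[str] = set()
    done: Set[str] = set()

    def visit(node: str) -> bool:
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(visit(n) for n in waits_for.get(node, set()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(n) for n in list(waits_for))


lock = FairRWLock()
waits_for: Dict[str, Set[str]] = {}

first = lock.read_lock("reader1")                 # reader1 holds the lock
waits_for["writer"] = lock.write_lock("writer")   # writer queues behind it
blocker = lock.read_lock("reader2")               # reader2 is held back
waits_for["reader2"] = {blocker} if blocker else set()
deadlock_without_dependency = has_cycle(waits_for)

# Assumed dependency: reader1 cannot unlock until reader2 makes progress.
waits_for["reader1"] = {"reader2"}
deadlock_with_dependency = has_cycle(waits_for)
```

With a lock that let the second reader share with the first, the
reader2 -> writer edge would not exist and the graph would stay acyclic.
The queued writer is what turns two otherwise compatible readers into a
deadlock candidate.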
 
* The output of might_sleep debugging is silent about the possible
  causes vs. the preempt count. Contrary to interrupt disabling
  there is zero information about what disabled preemption
  last. Again, not strictly a bugfix, but debuggability is key.

Is this: sched-better-debug-output-for-might-sleep.patch ? It's not
marked to Cc stable-rt.

 
* Fix a potentially deadly sto(m)p_machine deadlock. A CPU which
  calls that code from its inactive state (don't ask me for the
  ghastly details why this is necessary) can run into a contended
  state of the stomp machine mutex which would cause a rather
  awkward issue of idle scheduling itself away to idle as the only
  possible task on that upcoming cpu. Not pretty 

Looks to be: stomp-machine-deal-clever-with-stopper-lock.patch


If I'm wrong with the above, let me know. Thanks,

-- Steve




Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Steven Rostedt
On Tue, 2012-10-09 at 14:19 -0400, Steven Rostedt wrote:
 On Tue, 2012-10-09 at 15:46 +0200, Thomas Gleixner wrote:
  Dear RT Folks,
  
  I'm pleased to announce the 3.6.1-rt1 release.
  
  This is a pretty straightforward move from the 3.4-rt series which
  includes a few significant updates which need to be backported to the
  3.x-rt stable series:
 
 My scripts detected these patches to be pulled into stable. They detect
 any patch that has a Cc: to stable...@vger.kernel.org and does not
 already exist in stable. They also add the '000x-' prefix to keep the
 order.
 
 -scsi-qla2xxx-fix-bug-sleeping-function-called-from-invalid-context.patch
 0001-upstream-net-rt-remove-preemption-disabling-in-netif_rx.patch

The above seem to be already applied (at least to 3.4-rt). Not sure why
my scripts missed it. Perhaps these were the ones added directly, and
the names of the patch that I used did not match your names. My script
saves off what it already applied, to know what it can skip for later
pulls.

-- Steve

 0002-random-make-it-work-on-rt.patch
 0003-softirq-init-softirq-local-lock-after-per-cpu-section-is-set-up.patch
 0004-mm-slab-fix-potential-deadlock.patch
 0005-mm-page-alloc-use-local-lock-on-target-cpu.patch
 0006-rt-rw-lockdep-annotations.patch
 0007-stomp-machine-deal-clever-with-stopper-lock.patch
 




Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Thomas Gleixner
On Tue, 9 Oct 2012, Steven Rostedt wrote:

 On Tue, 2012-10-09 at 15:46 +0200, Thomas Gleixner wrote:
 
  The RT patch against 3.6.1 can be found here:
  

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patch-3.6.1-rt1.patch.xz
  
  The split quilt queue is available at:
  

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.4/patches-3.6.1-rt1.tar.xz
  
 
 I think you meant:
 
 
 http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patches-3.6.1-rt1.tar.xz

Bah. Copy and paste should be something which can be disabled :)


Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Tim Sander
Hi Thomas
 I'm pleased to announce the 3.6.1-rt1 release.
I also have to second the big thanks of Steven!

* Fix for a potential deadlock in mm/slab.c. This had been reported
  as lockdep splats several times and stupidly ignored as a false
  positive, but in fact it's a real (though almost impossible to
  trigger) deadlock lurking.
Hm, just an unverified observation from my side (I haven't had the time
to dig into it): I think I have been seeing a total lockup on an OOM
condition. Is it possible that this deadlock is triggered more reliably
under such a condition?

Best regards
Tim


Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Steven Rostedt


On Tue, 2012-10-09 at 14:19 -0400, Steven Rostedt wrote:

I applied the backported patches to 3.4-rt and tested them. Things look
good and I will be posting the -rc1 soon.

Status for 3.0-rt:


 -scsi-qla2xxx-fix-bug-sleeping-function-called-from-invalid-context.patch
 0001-upstream-net-rt-remove-preemption-disabling-in-netif_rx.patch

The above two have been applied to both 3.0-rt and 3.4-rt previously.

 0002-random-make-it-work-on-rt.patch

The above has been applied to 3.4-rt but is not applicable to 3.0-rt, as
the add_interrupt_randomness() does not exist.

 0003-softirq-init-softirq-local-lock-after-per-cpu-section-is-set-up.patch

For 3.0-rt the softirq_early_init() comes after the printk banner, which
is after per cpu data has been set up. But for consistency, this patch
moves it to before the banner, the same location as in 3.4-rt+.


 0004-mm-slab-fix-potential-deadlock.patch
 0005-mm-page-alloc-use-local-lock-on-target-cpu.patch
 0006-rt-rw-lockdep-annotations.patch

The above applied to 3.0-rt with no issues.

* sched-better-debug-output-for-might-sleep.patch 

Had slight conflicts, but trivial fix.

 0007-stomp-machine-deal-clever-with-stopper-lock.patch

With this one, things have changed quite a bit. I'll take a deeper look
at what you did and figure out how this applies to v3.0-rt.

-- Steve





Re: [ANNOUNCE] 3.6.1-rt1

2012-10-09 Thread Steven Rostedt
On Tue, 2012-10-09 at 20:21 -0400, Steven Rostedt wrote:

  0007-stomp-machine-deal-clever-with-stopper-lock.patch
 
 With this one, things have changed quite a bit. I'll take a deeper look
 at what you did and figure out how this applies to v3.0-rt.

It doesn't look like this patch is needed for v3.0-rt as the patch
addresses the new stop_machine_from_inactive_cpu() API used by the mtrr
code added by this commit:

commit 192d8857427dd23707d5f0b86ca990c3af6f2d74
Author: Suresh Siddha suresh.b.sid...@intel.com
Date:   Thu Jun 23 11:19:29 2011 -0700

x86, mtrr: use stop_machine APIs for doing MTRR rendezvous


This was added in 3.1 so the patch is still required for v3.2-rt.

-- Steve

