Re: TREE_SRCU slows hotplug by factor ~16

2017-04-23 Thread Mike Galbraith
On Sun, 2017-04-23 at 20:32 -0700, Paul E. McKenney wrote: > On Mon, Apr 24, 2017 at 04:48:09AM +0200, Mike Galbraith wrote: > > Greetings, > > > > Running Steven's hotplug stress script in tip w. CLASSIC_SRCU takes 55s > > in my i4790 box, whereas TREE_SRCU t

TREE_SRCU slows hotplug by factor ~16

2017-04-23 Thread Mike Galbraith
Greetings, Running Steven's hotplug stress script in tip w. CLASSIC_SRCU takes 55s in my i4790 box, whereas TREE_SRCU takes over 16m. (Master with the same config does it in 39s.. but then lockdep isn't enabled in master) -Mike

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-10 Thread Mike Galbraith
On Tue, 2017-04-11 at 00:23 +0300, Michael S. Tsirkin wrote: > On Sat, Apr 08, 2017 at 07:01:34AM +0200, Mike Galbraith wrote: > > On Fri, 2017-04-07 at 21:56 +0300, Michael S. Tsirkin wrote: > > > > > OK. test3 and test4 are now pushed: test3 should fix your hang, > >

Re: [PATCH -v6 13/13] futex: futex_lock_pi() vs PREEMPT_RT_FULL

2017-04-07 Thread Mike Galbraith
On Fri, 2017-04-07 at 19:26 -0700, Darren Hart wrote: > I would like to see more testing because... well... futexes. But, we don't > have > a futex torture suite yet, but that is something I'm hoping to be looking into > in the near future. What testing we do have available has passed between my

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-07 Thread Mike Galbraith
On Fri, 2017-04-07 at 21:56 +0300, Michael S. Tsirkin wrote: > OK. test3 and test4 are now pushed: test3 should fix your hang, > test4 is trying to fix a crash reported independently. test3 does not fix the post hibernate hang business that I can easily reproduce, those are NFS, and at least as o

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-07 Thread Mike Galbraith
On Fri, 2017-04-07 at 16:35 +0300, Michael S. Tsirkin wrote: > Oh wait, I still put the ctx feature patches in there :( > Pls ignore, I'll update when I've fixed it up. Sorry about the noise. Both worked fine w/wo threadirqs. -Mike

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-07 Thread Mike Galbraith
On Fri, 2017-04-07 at 09:22 +0200, Mike Galbraith wrote: > On Fri, 2017-04-07 at 09:05 +0200, Mike Galbraith wrote: > > On Fri, 2017-04-07 at 08:44 +0200, Mike Galbraith wrote: > > > On Fri, 2017-04-07 at 09:24 +0300, Michael S. Tsirkin wrote: > > > > On Fri, Apr 07,

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-07 Thread Mike Galbraith
On Fri, 2017-04-07 at 09:05 +0200, Mike Galbraith wrote: > On Fri, 2017-04-07 at 08:44 +0200, Mike Galbraith wrote: > > On Fri, 2017-04-07 at 09:24 +0300, Michael S. Tsirkin wrote: > > > On Fri, Apr 07, 2017 at 08:03:19AM +0200, Mike Galbraith wrote: > > > > >

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-07 Thread Mike Galbraith
On Fri, 2017-04-07 at 08:44 +0200, Mike Galbraith wrote: > On Fri, 2017-04-07 at 09:24 +0300, Michael S. Tsirkin wrote: > > On Fri, Apr 07, 2017 at 08:03:19AM +0200, Mike Galbraith wrote: > > > > Test tag works fine here w/wo threadirqs, RT works as well. > > > >

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-06 Thread Mike Galbraith
On Fri, 2017-04-07 at 09:24 +0300, Michael S. Tsirkin wrote: > On Fri, Apr 07, 2017 at 08:03:19AM +0200, Mike Galbraith wrote: > > Test tag works fine here w/wo threadirqs, RT works as well. > > > > -Mike > > Thanks a lot. > OK I pushed out two new tags >

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-06 Thread Mike Galbraith
On Thu, 2017-04-06 at 00:38 +0300, Michael S. Tsirkin wrote: > What I did is revert the refactorings while keeping the affinity API - > we can safely postpone them until the next release without loss of > functionality. But that's on top of my testing tree so it has unrelated > stuff as well. I'

Re: [PATCH] sched: Fix numabalancing to work with isolated cpus

2017-04-06 Thread Mike Galbraith
On Tue, 2017-04-04 at 22:57 +0530, Srikar Dronamraju wrote: > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index f045a35..f853dc0 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -1666,6 +1666,10 @@ static void task_numa_find_cpu(struct task_numa_env > *env, > >

Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage

2017-04-05 Thread Mike Galbraith
On Wed, 2017-04-05 at 17:31 -0700, Stephen Hemminger wrote: > On Sun, 02 Apr 2017 06:28:41 +0200 > Mike Galbraith wrote: > > > Livelock can be triggered by setting kworkers to SCHED_FIFO, then > > suspend/resume.. you come back from sleepy-land with a spinning > > kw

Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage

2017-04-05 Thread Mike Galbraith
On Wed, 2017-04-05 at 16:55 -0700, Cong Wang wrote: > On Tue, Apr 4, 2017 at 11:12 PM, Mike Galbraith wrote: > > On Tue, 2017-04-04 at 22:25 -0700, Cong Wang wrote: > > > On Tue, Apr 4, 2017 at 8:20 PM, Mike Galbraith wrote: > > > > -

[tip:locking/core] rtmutex: Plug preempt count leak in rt_mutex_futex_unlock()

2017-04-05 Thread tip-bot for Mike Galbraith
Commit-ID: def34eaae5ce04b324e48e1bfac873091d945213 Gitweb: http://git.kernel.org/tip/def34eaae5ce04b324e48e1bfac873091d945213 Author: Mike Galbraith AuthorDate: Wed, 5 Apr 2017 10:08:27 +0200 Committer: Thomas Gleixner CommitDate: Wed, 5 Apr 2017 16:59:37 +0200 rtmutex: Plug preempt

[tip:locking/core] rtmutex: Deboost before waking up the top waiter

2017-04-05 Thread tip-bot for Mike Galbraith
Commit-ID: 94247f76e7361afd85ba03a3f923bf3d07ba3017 Gitweb: http://git.kernel.org/tip/94247f76e7361afd85ba03a3f923bf3d07ba3017 Author: Mike Galbraith AuthorDate: Wed, 5 Apr 2017 10:08:27 +0200 Committer: Thomas Gleixner CommitDate: Wed, 5 Apr 2017 16:52:10 +0200

Re: [tip:locking/core] rtmutex: Deboost before waking up the top waiter

2017-04-05 Thread Mike Galbraith
locking/rtmutex: Fix preempt leak in __rt_mutex_futex_unlock() mark_wakeup_next_waiter() already disables preemption, doing so again leaves us with an unpaired preempt_disable(). Signed-off-by: Mike Galbraith --- kernel/locking/rtmutex.c | 10 +- 1 file changed, 5 insertions(+), 5
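
The leak described in the changelog above can be illustrated with a toy counter. This is only a Python model of the bookkeeping, not kernel code: `PreemptCounter`, `mark_wakeup_next_waiter_model()`, `unlock_buggy()` and `unlock_fixed()` are hypothetical names, and the call sequences are assumptions made for illustration rather than the contents of the actual patch.

```python
class PreemptCounter:
    """Toy stand-in for a preempt count (hypothetical, for illustration)."""

    def __init__(self):
        self.count = 0

    def disable(self):
        self.count += 1

    def enable(self):
        if self.count == 0:
            raise RuntimeError("enable without matching disable")
        self.count -= 1


def mark_wakeup_next_waiter_model(pc):
    # Per the changelog, the real helper already disables preemption.
    pc.disable()


def unlock_buggy(pc):
    pc.disable()                       # redundant second disable
    mark_wakeup_next_waiter_model(pc)  # helper disables again
    pc.enable()                        # only one enable: count stays at 1


def unlock_fixed(pc):
    mark_wakeup_next_waiter_model(pc)  # the helper's disable is the only one
    pc.enable()                        # paired enable: count returns to 0
```

In this model the buggy path returns with a count of 1, which is the "unpaired preempt_disable()" the changelog refers to, while the fixed path returns with 0.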

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-04 Thread Mike Galbraith
On Wed, 2017-04-05 at 08:29 +0200, Christoph Hellwig wrote: > Can you check where the issues appear? I'd like to do a pure revert > of the shared interrupts, but that tree has a lot more in it.. Not immediately, one of my several pots is emitting black smoke. -Mike

Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage

2017-04-04 Thread Mike Galbraith
On Tue, 2017-04-04 at 22:25 -0700, Cong Wang wrote: > On Tue, Apr 4, 2017 at 8:20 PM, Mike Galbraith wrote: > > - while (some_qdisc_is_busy(dev)) > > - yield(); > > + swait_event_timeout(swait, > > !some_qdisc_is_busy(de
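
The hunk quoted above swaps a yield() polling loop for a timed wait on a condition. A rough userspace analogy in Python, under stated assumptions: `BusyFlag` and its methods are hypothetical names, Python threads have no SCHED_FIFO priorities so this cannot reproduce the livelock itself, and it only shows the shape of "sleep until the condition clears or a timeout fires, then recheck" as opposed to spinning.

```python
import threading


class BusyFlag:
    """Hypothetical stand-in for 'some qdisc is busy'."""

    def __init__(self):
        self._cond = threading.Condition()
        self._busy = True

    def clear(self):
        # Whoever finishes the work clears the flag and wakes the waiters.
        with self._cond:
            self._busy = False
            self._cond.notify_all()

    def wait_until_idle(self, poll_timeout=0.01, max_rounds=500):
        # Sleep in bounded slices instead of spinning. A sleeping waiter
        # gives up the CPU, so the thread that must clear the flag can run
        # whatever its priority is.
        with self._cond:
            for _ in range(max_rounds):
                if self._cond.wait_for(lambda: not self._busy,
                                       timeout=poll_timeout):
                    return True
            return False
```

A waiter would call `flag.wait_until_idle()` while another thread eventually calls `flag.clear()`. The timeout keeps the waiter rechecking even if a wakeup is missed.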

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-04 Thread Mike Galbraith
On Wed, 2017-04-05 at 06:51 +0300, Michael S. Tsirkin wrote: > Any issues at all left with this tree? > In particular any regressions? Nothing blatantly obvious in a testdrive that lasted a couple minutes. I'd have to beat on it a bit to look for things beyond the reported, but can't afford to

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-04 Thread Mike Galbraith
On Wed, 2017-04-05 at 05:24 +0200, Mike Galbraith wrote: > On Wed, 2017-04-05 at 06:13 +0300, Michael S. Tsirkin wrote: > > On Wed, Apr 05, 2017 at 05:09:09AM +0200, Mike Galbraith wrote: > > > On Tue, 2017-04-04 at 22:03 +0300, Michael S. Tsirkin wrote: > > > > >

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-04 Thread Mike Galbraith
On Wed, 2017-04-05 at 06:13 +0300, Michael S. Tsirkin wrote: > On Wed, Apr 05, 2017 at 05:09:09AM +0200, Mike Galbraith wrote: > > On Tue, 2017-04-04 at 22:03 +0300, Michael S. Tsirkin wrote: > > > > > since I couldn't reproduce, I decided it's worth trying to s

Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage

2017-04-04 Thread Mike Galbraith
On Tue, 2017-04-04 at 15:39 -0700, Cong Wang wrote: > Thanks for the report! Looks like a quick solution here is to replace > this yield() with cond_resched(), it is harder to really wait for > all qdisc's to transmit all packets. No, cond_resched() won't help. What I did is below, but I suspect

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-04 Thread Mike Galbraith
On Tue, 2017-04-04 at 22:03 +0300, Michael S. Tsirkin wrote: > since I couldn't reproduce, I decided it's worth trying to see > what happens if we revert back to before 5c34d002dcc7. > > > Could you please test a tag "test" in my tree above? > It should point at 6d88af1bf359417eb821370294ba489bd

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-04 Thread Mike Galbraith
On Wed, 2017-04-05 at 00:31 +0300, Michael S. Tsirkin wrote: > On Tue, Apr 04, 2017 at 08:38:35PM +0200, Mike Galbraith wrote: > > On Tue, 2017-04-04 at 21:00 +0300, Michael S. Tsirkin wrote: > > > > > And just making double sure, the 1st version that has the issue > >

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-04 Thread Mike Galbraith
On Tue, 2017-04-04 at 21:00 +0300, Michael S. Tsirkin wrote: > And just making double sure, the 1st version that has the issue > is 5c34d002dcc7, isn't it? I'm asking because subject says so > but then goes on to list subject from another commit. > This one is: > > virtio_pci: remove struct

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-04 Thread Mike Galbraith
On Tue, 2017-04-04 at 19:40 +0200, Mike Galbraith wrote: > On Tue, 2017-04-04 at 18:30 +0300, Michael S. Tsirkin wrote: > > > I couldn't reproduce it - let's make sure we are using the > > same tree. Could you pls try > > > > git://git.kernel.org/pub/

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-04 Thread Mike Galbraith
On Tue, 2017-04-04 at 18:30 +0300, Michael S. Tsirkin wrote: > I couldn't reproduce it - let's make sure we are using the > same tree. Could you pls try > > git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next > > It's currently at cc79d42a7d7e57ff64f406a1fd3740afebac0b44 Thi

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-04 Thread Mike Galbraith
On Tue, 2017-04-04 at 16:38 +0300, Michael S. Tsirkin wrote: > On Tue, Apr 04, 2017 at 06:02:52AM +0200, Mike Galbraith wrote: > > On Mon, 2017-04-03 at 21:11 +0300, Michael S. Tsirkin wrote: > > > On Mon, Apr 03, 2017 at 07:56:32PM +0200, Mike Galbraith wrote: > > > >

Re: [BUG nohz]: wrong user and system time accounting

2017-04-04 Thread Mike Galbraith
On Mon, 2017-04-03 at 16:40 +0200, Frederic Weisbecker wrote: > On Thu, Mar 30, 2017 at 03:35:22PM +0200, Mike Galbraith wrote: > Nohz_full is already bad for powersavings anyway. CPU 0 always ticks :-) OTOH, if a nohz_full set is doing what it was born to do, CPU0 tick spikes wo

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-03 Thread Mike Galbraith
On Mon, 2017-04-03 at 21:11 +0300, Michael S. Tsirkin wrote: > On Mon, Apr 03, 2017 at 07:56:32PM +0200, Mike Galbraith wrote: > > On Mon, 2017-04-03 at 16:18 +0200, Christoph Hellwig wrote: > > > Mike, > > > > > > can you try the patch below? > > >

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-03 Thread Mike Galbraith
On Mon, 2017-04-03 at 16:18 +0200, Christoph Hellwig wrote: > Mike, > > can you try the patch below? No more spinning kworker woes, but I still have a warning on hibernate, threadirqs invariant. I'm also seeing intermittent post hibernate hang funnies in virgin source +- this patch, and without

net/sched: latent livelock in dev_deactivate_many() due to yield() usage

2017-04-01 Thread Mike Galbraith
Greetings network wizards, Quoting kernel/sched/core.c: /** * yield - yield the current processor to other threads. * * Do not ever use this function, there's a 99% chance you're doing it wrong. * * The scheduler is at all times free to pick the calling task as the most * eligible task to ru

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Mike Galbraith
On Thu, 2017-03-30 at 09:02 -0400, Rik van Riel wrote: > On Thu, 2017-03-30 at 14:51 +0200, Frederic Weisbecker wrote: > > Also, why does it raise power consumption issues? > > On a system without either nohz_full or nohz idle > mode, skewed ticks result in CPU cores waking up > at different time

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Mike Galbraith
On Thu, 2017-03-30 at 14:40 +0200, Frederic Weisbecker wrote: > On Thu, Mar 30, 2017 at 09:58:44AM +0800, Wanpeng Li wrote: > > There is such a feature skew_tick currently, refer to commit > > 5307c9556bc (tick: add tick skew boot option), w/ skew_tick=1 boot > > parameter, the bug disappear, howe

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Mike Galbraith
On Thu, 2017-03-30 at 19:52 +0800, Wanpeng Li wrote: > If we should just add random offset to the cpu in the nohz_full mode? Up to you, whatever works best. I left the regular skew alone, just added some noise to scheduler_tick_max_deferment(). -Mike

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-03-30 Thread Mike Galbraith
On Thu, 2017-03-30 at 05:10 +0200, Mike Galbraith wrote: > WRT spin, you should need do nothing more than boot with threadirqs, > that's 100% repeatable here in absolutely virgin source. No idea why virtqueue_get_buf() in __send_control_msg() fails forever with threadirqs, but marking

Re: [BUG nohz]: wrong user and system time accounting

2017-03-29 Thread Mike Galbraith
On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote: > In other words, the tick on cpu0 is aligned > with the tick on the nohz_full cpus, and > jiffies is advanced while the nohz_full cpus > with an active tick happen to be in kernel > mode? You really want skew_tick=1, especially on big boxen.

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-03-29 Thread Mike Galbraith
On Wed, 2017-03-29 at 23:19 +0300, Michael S. Tsirkin wrote: > > > > > > > > > > > &portdev->max_nr_ports) == 0) { > > @@ -2179,7 +2179,9 @@ static struct virtio_device_id id_table[ > > > > static unsigned int features[] = { > > > >> > VIRTIO_CONSOLE_F_SIZE, > > +#ifndef

Re: [PATCH] virtio_console: fix uninitialized variable use

2017-03-29 Thread Mike Galbraith
On Wed, 2017-03-29 at 23:27 +0300, Michael S. Tsirkin wrote: > Hi Mike > if you like, pls send me your Signed-off-by and I'll > change the patch to make you an author. Nah, it's perfect as it is. While I was pretty darn sure it was generic, I intentionally posted it as diagnostic inf

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-03-29 Thread Mike Galbraith
On Wed, 2017-03-29 at 23:10 +0300, Michael S. Tsirkin wrote: > Poking at this some more, I was able to reproduce at > least some warnings. I still do not see a spin > but is there a chance this helps your case too? Well, it's down to one warning, clean on the way back up. WRT spin, you should ne

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-03-28 Thread Mike Galbraith
On Mon, 2017-03-27 at 20:18 +0200, Mike Galbraith wrote: > BTW, WRT RT woes with $subject, I tried booting a generic kernel with > threadirqs, and bingo, same deal, just a bit more painful than for RT, > where there's no watchdog moaning accompanying the (preemptible) spin. BTW++:

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-03-28 Thread Mike Galbraith
On Tue, 2017-03-28 at 20:27 +0300, Michael S. Tsirkin wrote: > On Tue, Mar 28, 2017 at 06:33:53PM +0200, Mike Galbraith wrote: > > On Tue, 2017-03-28 at 18:37 +0300, Michael S. Tsirkin wrote: > > > > > Anything specific that you do to trigger this? > > > > N

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-03-28 Thread Mike Galbraith
On Tue, 2017-03-28 at 18:37 +0300, Michael S. Tsirkin wrote: > Anything specific that you do to trigger this? Nope, all I have to do is to poke kde Power/Session Hibernate button. Not that it should matter, but the vm is a full clone of my 42.1 box, including git server/repos etc, so has all w

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-03-27 Thread Mike Galbraith
On Tue, 2017-03-28 at 05:35 +0300, Michael S. Tsirkin wrote: > On Tue, Mar 28, 2017 at 03:08:20AM +0200, Mike Galbraith wrote: > > On Mon, 2017-03-27 at 21:16 +0300, Michael S. Tsirkin wrote: > > > > > Mike, could you pls send lspci -vv that shows up after > > >

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-03-27 Thread Mike Galbraith
On Mon, 2017-03-27 at 21:16 +0300, Michael S. Tsirkin wrote: > Mike, could you pls send lspci -vv that shows up after > boot? Presuming you mean the virtual box.. 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02) Subsystem: Red Hat, Inc Qemu virtual machine

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-03-27 Thread Mike Galbraith
On Mon, 2017-03-27 at 19:05 +0200, Christoph Hellwig wrote: > Hi Mike, > > does the patch below fix that issue for you? Nope, warnings are alive and well. > diff --git a/drivers/virtio/virtio_pci_common.c > b/drivers/virtio/virtio_pci_common.c > index df548a6fb844..fd1b06368b1f 100644 > --- a/dr

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-03-27 Thread Mike Galbraith
On Mon, 2017-03-27 at 19:05 +0200, Christoph Hellwig wrote: > Hi Mike, > > does the patch below fix that issue for you? Thanks, I'll give it a go in the A.M. BTW, WRT RT woes with $subject, I tried booting a generic kernel with threadirqs, and bingo, same deal, just a bit more painful than for R

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-03-27 Thread Mike Galbraith
On Thu, 2017-03-23 at 15:56 +0100, Christoph Hellwig wrote: > Does the patch from Jason in the > > "[REGRESSION] 07ec51480b5e ("virtio_pci: use shared interrupts for > virtqueues") causes crashes in guest" > > thread fix the issue for you? That seems to eliminate explosions, but not the below.

Re: Splat during resume

2017-03-26 Thread Mike Galbraith
On Sun, 2017-03-26 at 10:41 +0200, Borislav Petkov wrote: > Btw, try the 6 patches here: > https://marc.info/?l=linux-mm&m=148977696117208&w=2 > ontop of tip. Should fix your vaporite too. Yeah, silicon is still happy, vaporite boots gripe free. Trying to hibernate vaporite was a bad idea, but

Re: Splat during resume

2017-03-26 Thread Mike Galbraith
On Sat, 2017-03-25 at 22:46 +0100, Borislav Petkov wrote: > On Sat, Mar 25, 2017 at 07:58:55PM +0100, Borislav Petkov wrote: > > Hey Rafael, > > > > have you seen this already (partial splat photo attached)? Happens > > during resume from s2d. Judging by the timestamps, this looks like the > > res

Re: Still OOM problems with 4.9er/4.10er kernels

2017-03-23 Thread Mike Galbraith
On Thu, 2017-03-23 at 08:16 +0100, Gerhard Wiesinger wrote: > On 21.03.2017 08:13, Mike Galbraith wrote: > > On Tue, 2017-03-21 at 06:59 +0100, Gerhard Wiesinger wrote: > > > > > Is this the correct information? > > Incomplete, but enough to reiterate cgroup

Re: Still OOM problems with 4.9er/4.10er kernels

2017-03-21 Thread Mike Galbraith
On Tue, 2017-03-21 at 06:59 +0100, Gerhard Wiesinger wrote: > Is this the correct information? Incomplete, but enough to reiterate cgroup_disable=memory suggestion. -Mike

Re: Still OOM problems with 4.9er/4.10er kernels

2017-03-19 Thread Mike Galbraith
On Sun, 2017-03-19 at 17:02 +0100, Gerhard Wiesinger wrote: > mount | grep cgroup Just because controllers are mounted doesn't mean they're populated. To check that, you want to look for directories under the mount points with a non-empty 'tasks'. You will find some, but memory cgroup assignment
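
The check described above (look under each mount point for directories whose 'tasks' file is non-empty) could be scripted roughly as follows. This is a minimal sketch, assuming a cgroup v1 hierarchy. `populated_cgroups()` is a hypothetical helper name, and it skips the mount point's own 'tasks' file so that only child groups are reported.

```python
import os


def populated_cgroups(mount_point):
    """Return child directories under mount_point whose 'tasks' is non-empty."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(mount_point):
        # Skip the root group itself and directories without a tasks file.
        if dirpath == mount_point or "tasks" not in filenames:
            continue
        try:
            with open(os.path.join(dirpath, "tasks")) as f:
                # The file has to be read: an empty read means no tasks.
                if f.read().strip():
                    found.append(dirpath)
        except OSError:
            # Groups can vanish or be unreadable while walking; ignore them.
            continue
    return sorted(found)
```

For example, `populated_cgroups("/sys/fs/cgroup/memory")` would list memory cgroups that actually hold tasks. That path is an assumption about where the controller is mounted, so take the real mount points from the output of `mount | grep cgroup`.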

Re: change "tcp: randomize tcp timestamp offsets for each connection" broke networking

2017-03-15 Thread Mike Galbraith
On Wed, 2017-03-15 at 20:48 +0100, Lutz Vieweg wrote: > Dear Linux Developers, > > change set "tcp: randomize tcp timestamp offsets for each connection" > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/co > mmit/?id=95a22caee396cef0bb2ca8fafdd82966a49367bb > broke networking fo

Re: [PATCHSET for-4.11] cgroup: implement cgroup v2 thread mode

2017-03-14 Thread Mike Galbraith
On Mon, 2017-03-13 at 15:26 -0400, Tejun Heo wrote: > Hello, Mike. > > Sorry about the long delay. > > On Mon, Feb 13, 2017 at 06:45:07AM +0100, Mike Galbraith wrote: > > > > So, as long as the depth stays reasonable (single digit or lower), > > > > what

Re: oops with 4.9.13-rt12 under mild load (and no rt-tasks active)

2017-03-10 Thread Mike Galbraith
On Fri, 2017-03-10 at 19:47 +, Nicholas Mc Guire wrote: > Has anyone seen 4.9.13-rt12 oopses related to ext4 or vfs in general ? FWIW, here it's seen quite a bit of hefty use on boxen large and small with no trouble. That said, @stable has a large pile queued for 4.9, 8 for ext4, some of wh

Re: kexec, x86/purgatory: Cleanup the unholy mess

2017-03-10 Thread Mike Galbraith
On Fri, 2017-03-10 at 16:31 +0100, Thomas Gleixner wrote: > On Fri, 10 Mar 2017, Mike Galbraith wrote: > > > On Fri, 2017-03-10 at 15:56 +0100, Thomas Gleixner wrote: > > > On Fri, 10 Mar 2017, Mike Galbraith wrote: > > > > Stuffing the lot into .kexec-purgatory

Re: kexec, x86/purgatory: Cleanup the unholy mess

2017-03-10 Thread Mike Galbraith
On Fri, 2017-03-10 at 15:56 +0100, Thomas Gleixner wrote: > On Fri, 10 Mar 2017, Mike Galbraith wrote: > > Stuffing the lot into .kexec-purgatory worked. > > You beat me to it :) That's odd, I'm usually a day late and a dollar short :) -Mike

Re: kexec, x86/purgatory: Cleanup the unholy mess

2017-03-10 Thread Mike Galbraith
On Fri, 2017-03-10 at 14:57 +0100, Thomas Gleixner wrote: > On Fri, 10 Mar 2017, Mike Galbraith wrote: > > On Fri, 2017-03-10 at 13:17 +0100, Thomas Gleixner wrote: > > > The purgatory code defines global variables which are referenced via a > > > symbol lookup in th

Re: kexec, x86/purgatory: Cleanup the unholy mess

2017-03-10 Thread Mike Galbraith
On Fri, 2017-03-10 at 13:17 +0100, Thomas Gleixner wrote: > The purgatory code defines global variables which are referenced via a > symbol lookup in the kexec code (core and arch). > > A recent commit addressing sparse warning made these static and thereby > broke kexec file. > > Why did this ha

Re: [block] BUG: KASAN: use-after-free in rb_erase+0x1431/0x1970

2017-03-09 Thread Mike Galbraith
On Thu, 2017-03-09 at 08:38 -0700, Jens Axboe wrote: > On 03/09/2017 08:16 AM, Mike Galbraith wrote: > > Greetings, > > > > Building master.today with kasan enabled (because I saw the same when > > trying out kasan on rt), the below fell out. > > > > Confi

Re: [regression] 72042a8c7b01 x86/purgatory: Make functions and variables static

2017-03-09 Thread Mike Galbraith
On Thu, 2017-03-09 at 18:50 +0100, Thomas Gleixner wrote: > On Thu, 9 Mar 2017, Mike Galbraith wrote: > > > Greetings, > > > > I bisected kdump breakage to $subject, and verified the identified > > culprit via revert. Seems kexec needs those variables as they were.

[block] BUG: KASAN: use-after-free in rb_erase+0x1431/0x1970

2017-03-09 Thread Mike Galbraith
Greetings, Building master.today with kasan enabled (because I saw the same when trying out kasan on rt), the below fell out. Config is enterprise based (tune for maximum build time), plus PREEMPT. [5.335444] == [5.337030]

[regression] 72042a8c7b01 x86/purgatory: Make functions and variables static

2017-03-09 Thread Mike Galbraith
Greetings, I bisected kdump breakage to $subject, and verified the identified culprit via revert. Seems kexec needs those variables as they were. -Mike

Re: [PATCH v3] lockdep: Teach lockdep about memalloc_noio_save

2017-03-02 Thread Mike Galbraith
On Wed, 2017-03-01 at 16:46 +0100, Peter Zijlstra wrote: > On Wed, Mar 01, 2017 at 01:29:57PM +0200, Nikolay Borisov wrote: > > Commit 21caf2fc1931 ("mm: teach mm by current context info to not do I/O > > during memory allocation") added the memalloc_noio_(save|restore) functions > > to enable peop

Re: [cgroups] suspicious rcu_dereference_check() usage!

2017-03-01 Thread Mike Galbraith
On Wed, 2017-03-01 at 12:44 -0500, Tejun Heo wrote: > If you still have the .config around, can you please attach it? I'll > verify the fix and send out the fix. Resurrected (master) and attached. -Mike config.xz Description: application/xz

Re: [GIT pull] x86/timers for 4.10

2017-02-23 Thread Mike Galbraith
On Thu, 2017-02-23 at 11:26 +0100, Borislav Petkov wrote: > On Thu, Feb 23, 2017 at 09:20:06AM +0100, Mike Galbraith wrote: > > --- a/arch/x86/kernel/tsc_sync.c > > +++ b/arch/x86/kernel/tsc_sync.c > > @@ -294,7 +294,7 @@ void check_tsc_sync_source(int cpu) > > >

Re: 9908859acaa9 cpuidle/menu: add per CPU PM QoS resume latency consideration

2017-02-23 Thread Mike Galbraith
On Thu, 2017-02-23 at 13:15 +0100, Rafael J. Wysocki wrote: > On Wednesday, February 22, 2017 10:55:04 PM Alex Shi wrote: > > > > > > Its not hard; spinlock_t ends up being a mutex, and this is ran > > > from the > > > idle thread. What thread do you think we ought to run when we > > > block > > >

Re: [GIT pull] x86/timers for 4.10

2017-02-23 Thread Mike Galbraith
On Thu, 2017-02-09 at 16:07 +0100, Thomas Gleixner wrote: > On Wed, 8 Feb 2017, Mike Galbraith wrote: > > On Wed, 2017-02-08 at 12:44 +0100, Thomas Gleixner wrote: > > > On Mon, 6 Feb 2017, Olof Johansson wrote: > > > > [0.177102] [Firmware Bug]: TSC ADJUST d

Re: 9908859acaa9 cpuidle/menu: add per CPU PM QoS resume latency consideration

2017-02-22 Thread Mike Galbraith
On Wed, 2017-02-22 at 23:36 +0800, Alex Shi wrote: > Sorry. Mike. > What you mean of 'took the zero added cycles option'? :) #ifndef CONFIG_PREEMPT_RT_FULL ... #endif I waved my magic ifdef wand, and poof, they disappeared :) -Mike

Re: 9908859acaa9 cpuidle/menu: add per CPU PM QoS resume latency consideration

2017-02-22 Thread Mike Galbraith
On Wed, 2017-02-22 at 22:53 +0800, Alex Shi wrote: > cc Rafael. > > > On 02/22/2017 09:12 PM, Peter Zijlstra wrote: > > On Wed, Feb 22, 2017 at 01:56:37PM +0100, Mike Galbraith wrote: > > > Hi, > > > > > > Do we really need a spinlock for that in t

Re: 9908859acaa9 cpuidle/menu: add per CPU PM QoS resume latency consideration

2017-02-22 Thread Mike Galbraith
On Wed, 2017-02-22 at 22:31 +0800, Alex Shi wrote: > > On 02/22/2017 09:19 PM, Mike Galbraith wrote: > > On Wed, 2017-02-22 at 14:12 +0100, Peter Zijlstra wrote: > > > On Wed, Feb 22, 2017 at 01:56:37PM +0100, Mike Galbraith wrote: > > > > Hi, > > > >

Re: 9908859acaa9 cpuidle/menu: add per CPU PM QoS resume latency consideration

2017-02-22 Thread Mike Galbraith
On Wed, 2017-02-22 at 14:12 +0100, Peter Zijlstra wrote: > On Wed, Feb 22, 2017 at 01:56:37PM +0100, Mike Galbraith wrote: > > Hi, > > > > Do we really need a spinlock for that in the idle loop? > > Urgh, that's broken on RT, you cannot schedule the idle loop.

9908859acaa9 cpuidle/menu: add per CPU PM QoS resume latency consideration

2017-02-22 Thread Mike Galbraith
Hi, Do we really need a spinlock for that in the idle loop? -Mike

Re: [bisection] b0119e87083 iommu: Introduce new 'struct iommu_device' ==> boom

2017-02-21 Thread Mike Galbraith
On Tue, 2017-02-21 at 16:19 +0100, Joerg Roedel wrote: > Hi Mike, > > thanks for the report, this didn't trigger in my local testing here. > Looks like I need to test without intel_iommu=on too :/ > > Anyway, can you check whether the attached patch helps? Yup, boots. > diff --git a/drivers/iom

[bisection] b0119e87083 iommu: Introduce new 'struct iommu_device' ==> boom

2017-02-21 Thread Mike Galbraith
4x18 box (berio) explodes as below after morning master pull. BIOS has a couple issues, maybe one of them.. helps. [ 30.796530] ima: No TPM chip found, activating TPM-bypass! (rc=-19) [ 30.810709] evm: HMAC attrs: 0x1 [ 30.821200] BUG: unable to handle kernel NULL pointer dereference at 00

[cgroups] suspicious rcu_dereference_check() usage!

2017-02-20 Thread Mike Galbraith
Running LTP on master.today (v4.10) with a seriously bloated PREEMPT config inspired box to emit the below. [ 7160.458996] === [ 7160.463195] [ INFO: suspicious RCU usage. ] [ 7160.467387] 4.10.0-default #100 Tainted: GE [ 7160.472808]

[btrfs] lockdep splat

2017-02-17 Thread Mike Galbraith
Greetings, Running ltp on master.today, I received the splat (from hell) below. [ 5015.128458] = [ 5015.128458] [ INFO: possible irq lock inversion dependency detected ] [ 5015.128458] 4.10.0-default #119 Tainted: GE [ 5015.128

Re: [RT] lockdep munching nr_list_entries like popcorn

2017-02-17 Thread Mike Galbraith
On Thu, 2017-02-16 at 19:06 +0100, Mike Galbraith wrote: > On Thu, 2017-02-16 at 15:53 +0100, Sebastian Andrzej Siewior wrote: > > On 2017-02-16 15:42:59 [+0100], Mike Galbraith wrote: > > > > > > Weeell, I'm trying to cobble something kinda like that together u

Re: [RT] lockdep munching nr_list_entries like popcorn

2017-02-16 Thread Mike Galbraith
BTW, this ain't gone. I'll take a peek. It doesn't happen in my tree, seems likely to be because whether running sirqs fully threaded or not, I don't let any one thread handle what another exists to handle. [ 638.107293] NOHZ: local_softirq_pending 80 [ 939.729684] NOHZ: local_softirq_pending

Re: [RT] lockdep munching nr_list_entries like popcorn

2017-02-16 Thread Mike Galbraith
On Thu, 2017-02-16 at 15:53 +0100, Sebastian Andrzej Siewior wrote: > On 2017-02-16 15:42:59 [+0100], Mike Galbraith wrote: > > > > Weeell, I'm trying to cobble something kinda like that together using > > __RT_SPIN_INITIALIZER() instead, but seems mean ole

Re: [RT] lockdep munching nr_list_entries like popcorn

2017-02-16 Thread Mike Galbraith
On Thu, 2017-02-16 at 12:06 +0100, Peter Zijlstra wrote: > On Thu, Feb 16, 2017 at 10:01:18AM +0100, Thomas Gleixner wrote: > > On Thu, 16 Feb 2017, Mike Galbraith wrote: > > > > > On Thu, 2017-02-16 at 09:37 +0100, Thomas Gleixner wrote: > > > > On Th

Re: [RT] lockdep munching nr_list_entries like popcorn

2017-02-16 Thread Mike Galbraith
On Thu, 2017-02-16 at 10:01 +0100, Thomas Gleixner wrote: > On Thu, 16 Feb 2017, Mike Galbraith wrote: > > > On Thu, 2017-02-16 at 09:37 +0100, Thomas Gleixner wrote: > > > On Thu, 16 Feb 2017, Mike Galbraith wrote: > > > > > ... > > > > swapve

Re: [RT] lockdep munching nr_list_entries like popcorn

2017-02-16 Thread Mike Galbraith
On Thu, 2017-02-16 at 09:37 +0100, Thomas Gleixner wrote: > On Thu, 16 Feb 2017, Mike Galbraith wrote: > ... > > swapvec_lock? Oodles of 'em? Nope. > > Well, it's a per cpu lock and the lru_cache_add() variants might be called > from a gazillion of different ca

[RT] lockdep munching nr_list_entries like popcorn

2017-02-15 Thread Mike Galbraith
4.9.10-rt6-virgin on 72 core +SMT box. Below is 1 line per minute, box idling along daintily nibbling, I fire up a parallel kbuild loop at 40465, and box gobbles greedily. I have entries bumped to 128k, and chain bits to 18 so box will get booted and run for a while before lockdep says "I quit".

[tip:timers/urgent] tick/broadcast: Prevent deadlock on tick_broadcast_lock

2017-02-13 Thread tip-bot for Mike Galbraith
Commit-ID: 202461e2f3c15dbfb05825d29ace0d20cdf55fa4 Gitweb: http://git.kernel.org/tip/202461e2f3c15dbfb05825d29ace0d20cdf55fa4 Author: Mike Galbraith AuthorDate: Mon, 13 Feb 2017 03:31:55 +0100 Committer: Thomas Gleixner CommitDate: Mon, 13 Feb 2017 09:49:31 +0100 tick/broadcast

Re: [PATCHSET for-4.11] cgroup: implement cgroup v2 thread mode

2017-02-12 Thread Mike Galbraith
On Sun, 2017-02-12 at 07:59 +0100, Mike Galbraith wrote: > On Sun, 2017-02-12 at 14:05 +0900, Tejun Heo wrote: > > > > I think cgroup tree depth is a more significant issue; because of > > > hierarchy we often do tree walks (up-to-root or down-to-task). > > > >

Re: [PATCHSET for-4.11] cgroup: implement cgroup v2 thread mode

2017-02-12 Thread Mike Galbraith
On Sun, 2017-02-12 at 13:16 -0800, Paul Turner wrote: > > > On Thursday, February 9, 2017, Peter Zijlstra wrote: > > On Thu, Feb 09, 2017 at 05:07:16AM -0800, Paul Turner wrote: > > > The only case that this does not support vs ".threads" would be some > > > hybrid where we co-mingle threads fro

Re: Linux 4.9.6 ( Restore IO-APIC irq_chip retrigger callback , breaks my box )

2017-02-12 Thread Mike Galbraith
[ 12.703757] kthread+0x10c/0x140 [ 12.703759] ? smpboot_update_cpumask_percpu_thread+0x130/0x130 [ 12.703760] ? kthread_park+0x90/0x90 [ 12.703762] ret_from_fork+0x2a/0x40 [ 12.709790] intel_idle: lapic_timer_reliable_states 0x2 Signed-off-by: Mike Galbraith --- kernel/time/tick-broadca

Re: [PATCHSET for-4.11] cgroup: implement cgroup v2 thread mode

2017-02-11 Thread Mike Galbraith
On Sun, 2017-02-12 at 14:05 +0900, Tejun Heo wrote: > > I think cgroup tree depth is a more significant issue; because of > > hierarchy we often do tree walks (up-to-root or down-to-task). > > > > So creating elaborate trees is something I try not to do. > > So, as long as the depth stays reason

Re: [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline

2017-02-11 Thread Mike Galbraith
On Sat, 2017-02-11 at 08:15 +0100, luca abeni wrote: > Hi Daniel, > > On Fri, 10 Feb 2017 20:48:11 +0100 > Daniel Bristot de Oliveira wrote: > > > During the activation, CBS checks if it can reuse the current > > task's > > runtime and period. If the deadline of the task is in the past, CBS > >

Re: [GIT pull] x86/timers for 4.10

2017-02-09 Thread Mike Galbraith
On Thu, 2017-02-09 at 16:21 +0100, Thomas Gleixner wrote: > On Thu, 9 Feb 2017, Mike Galbraith wrote: > > > On Thu, 2017-02-09 at 16:07 +0100, Thomas Gleixner wrote: > > > On Wed, 8 Feb 2017, Mike Galbraith wrote: > > > > On Wed, 2017-02-08 at 12:44 +0100, Thomas

Re: [GIT pull] x86/timers for 4.10

2017-02-09 Thread Mike Galbraith
On Thu, 2017-02-09 at 16:07 +0100, Thomas Gleixner wrote: > On Wed, 8 Feb 2017, Mike Galbraith wrote: > > On Wed, 2017-02-08 at 12:44 +0100, Thomas Gleixner wrote: > > > On Mon, 6 Feb 2017, Olof Johansson wrote: > > > > [0.177102] [Firmware Bug]: TSC

Re: [PATCHSET for-4.11] cgroup: implement cgroup v2 thread mode

2017-02-09 Thread Mike Galbraith
On Thu, 2017-02-09 at 15:47 +0100, Peter Zijlstra wrote: > On Thu, Feb 09, 2017 at 05:07:16AM -0800, Paul Turner wrote: > > The only case that this does not support vs ".threads" would be some > > hybrid where we co-mingle threads from different processes (with the > > processes belonging to the sa

Re: [GIT pull] x86/timers for 4.10

2017-02-08 Thread Mike Galbraith
On Wed, 2017-02-08 at 12:44 +0100, Thomas Gleixner wrote: > On Mon, 6 Feb 2017, Olof Johansson wrote: > > [0.177102] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: > > -6495898515190607 CPU1: -6495898517158354 > > Yay, another "clever" BIOS Oh yeah, that reminds me... I met one suc

Re: [RFC,v2 3/3] sched: ignore task_h_load for CPU_NEWLY_IDLE

2017-02-08 Thread Mike Galbraith
On Wed, 2017-02-08 at 09:43 +0100, Uladzislau Rezki wrote: > From: Uladzislau 2 Rezki > > A load balancer calculates imbalance factor for particular shed ^sched > domain and tries to steal up the prescribed amount of weighted load. > Ho

Re: v4.9, 4.4-final: 28 bioset threads on small notebook, 36 threads on cellphone

2017-02-07 Thread Mike Galbraith
On Tue, 2017-02-07 at 19:58 -0900, Kent Overstreet wrote: > On Tue, Feb 07, 2017 at 09:39:11PM +0100, Pavel Machek wrote: > > On Mon 2017-02-06 17:49:06, Kent Overstreet wrote: > > > On Mon, Feb 06, 2017 at 04:47:24PM -0900, Kent Overstreet wrote: > > > > On Mon, Feb 06, 2017 at 01:53:09PM +0100, P

Re: v4.9, 4.4-final: 28 bioset threads on small notebook, 36 threads on cellphone

2017-02-07 Thread Mike Galbraith
On Tue, 2017-02-07 at 21:39 +0100, Pavel Machek wrote: > On Mon 2017-02-06 17:49:06, Kent Overstreet wrote: > > On Mon, Feb 06, 2017 at 04:47:24PM -0900, Kent Overstreet wrote: > > > On Mon, Feb 06, 2017 at 01:53:09PM +0100, Pavel Machek wrote: > > > > Still there on v4.9, 36 threads on nokia n900

Re: tip: demise of tsk_cpus_allowed() and tsk_nr_cpus_allowed()

2017-02-06 Thread Mike Galbraith
On Mon, 2017-02-06 at 13:29 +0100, Ingo Molnar wrote: > * Mike Galbraith wrote: > > > On Mon, 2017-02-06 at 11:31 +0100, Ingo Molnar wrote: > > > * Mike Galbraith wrote: > > > > > > > Hi Ingo, > > > > > > > > Doing my ~daily ti
