regression: "95cde3c59966 debugfs: inode: debugfs_create_dir uses mode permission from parent" terminally annoys libvirt

2018-06-08 Thread Mike Galbraith
Greetings, $subject bisected and verified via revert. Box is garden variety i4790, distro is openSUSE Leap 15.0. Error starting domain: internal error: process exited while connecting to monitor: ioctl(KVM_CREATE_VM) failed: 12 Cannot allocate memory 2018-06-08T03:18:00.453006Z qemu-system-x86_

Re: [PATCH] x86,switch_mm: skip atomic operations for init_mm

2018-06-01 Thread Mike Galbraith
On Fri, 2018-06-01 at 13:03 -0700, Andy Lutomirski wrote: > > Mike, you never did say: do you have PCID on your CPU? Yes. > Also, what is > your workload doing to cause so many switches back and forth between > init_mm and a task. pipe-test measures pipe round trip, does nearly nothing but sc

Re: [PATCH] x86,switch_mm: skip atomic operations for init_mm

2018-06-01 Thread Mike Galbraith
On Fri, 2018-06-01 at 14:22 -0400, Rik van Riel wrote: > On Fri, 2018-06-01 at 08:11 -0700, Andy Lutomirski wrote: > > On Fri, Jun 1, 2018 at 5:28 AM Rik van Riel wrote: > > > > > > Song noticed switch_mm_irqs_off taking a lot of CPU time in recent > > > kernels,using 2.4% of a 48 CPU system duri

4.13..4.14 scheduling overhead regression (bisected - b956575bed91)

2018-06-01 Thread Mike Galbraith
Greetings, While dusting off regression testing trees, I noticed a substantial pipe-test dent at 4.14, and bisected it to b956575bed91. Log below. skew_tick=1 audit=0 nodelayacct cgroup_disable=memory nopti nospectre_v2 nospec_store_bypass_disable gov performance taskset 0xc pipe-test 1 4.4.13

Re: [PATCH] x86: UV: raw_spinlock conversion

2018-05-22 Thread Mike Galbraith
On Tue, 2018-05-22 at 11:46 +0200, Mike Galbraith wrote: > On Tue, 2018-05-22 at 11:14 +0200, Sebastian Andrzej Siewior wrote: > > > If you suggest that I > > should stop caring about UV then I do so. Please post a patch that adds > > a dependency to UV on PRE

Re: [PATCH] x86: UV: raw_spinlock conversion

2018-05-22 Thread Mike Galbraith
On Tue, 2018-05-22 at 11:14 +0200, Sebastian Andrzej Siewior wrote: > On 2018-05-22 10:24:22 [+0200], Mike Galbraith wrote: > > > If I were in your shoes, I think I'd just stop caring about UV until a > > real user appears. AFAIK, I'm the only guy who ever ran RT

Re: [PATCH] x86: UV: raw_spinlock conversion

2018-05-22 Thread Mike Galbraith
On Tue, 2018-05-22 at 08:50 +0200, Sebastian Andrzej Siewior wrote: > > Regarding the preempt_disable() in the original patch in uv_read_rtc(): > This looks essential for PREEMPT configs. Is it possible to get this > tested by someone or else get rid of the UV code? It looks broken for > "uv_get_m

Re: [PATCH] x86: UV: raw_spinlock conversion

2018-05-19 Thread Mike Galbraith
On Mon, 2018-05-07 at 09:39 +0200, Sebastian Andrzej Siewior wrote: > On 2018-05-06 12:59:19 [+0200], Mike Galbraith wrote: > > On Sun, 2018-05-06 at 12:26 +0200, Thomas Gleixner wrote: > > > On Fri, 4 May 2018, Sebastian Andrzej Siewior wrote: > > > &

Re: cpu stopper threads and load balancing leads to deadlock

2018-05-17 Thread Mike Galbraith
On Thu, 2018-05-17 at 07:03 -0700, Paul E. McKenney wrote: > On Tue, May 15, 2018 at 06:30:26AM +0200, Mike Galbraith wrote: > > > > Something like so perhaps? Mike, can you play around with that? Could > > > burn your granny and eat your cookies. > > > >

Re: cpu stopper threads and load balancing leads to deadlock

2018-05-14 Thread Mike Galbraith
On Thu, 2018-05-03 at 18:45 +0200, Peter Zijlstra wrote: > On Thu, May 03, 2018 at 09:12:31AM -0700, Paul E. McKenney wrote: > > On Thu, May 03, 2018 at 04:44:50PM +0200, Peter Zijlstra wrote: > > > On Thu, May 03, 2018 at 04:16:55PM +0200, Mike Galbraith wrote: > > > &g

Re: [patch] swiotlb: fix ignored DMA_ATTR_NO_WARN request

2018-05-12 Thread Mike Galbraith
To conclude this snail-like thread (/me=walking wounded), with the v4.16.8 hunk below, traces showing that swiotlb_alloc_coherent() was being asked to not bother warning started showing up after the box had been flogged for a while. Whatever finally happens with swiotlb (seems to be in flux), o

[patch] swiotlb: fix ignored DMA_ATTR_NO_WARN request

2018-05-11 Thread Mike Galbraith
org-3170 [006] 963.866917: swiotlb_tbl_map_single+0x29b/0x2d0: swiotlb buffer is full (sz: 2097152 bytes) Signed-off-by: Mike Galbraith --- lib/swiotlb.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -714,7 +714,7

Re: kernel spew from nouveau/ swiotlb

2018-05-11 Thread Mike Galbraith
On Thu, 2018-05-10 at 12:28 +0200, Mike Galbraith wrote: > On Thu, 2018-05-10 at 11:10 +0200, Mike Galbraith wrote: > > Greetings, > > > > When box is earning its keep, nouveau/swiotlb grumble.. a LOT. The > > below is from master.today. > > > > [1259

Re: [Nouveau] kernel spew from nouveau/ swiotlb

2018-05-10 Thread Mike Galbraith
On Thu, 2018-05-10 at 17:31 +0200, Mike Galbraith wrote: > On Thu, 2018-05-10 at 10:31 -0400, Jerome Glisse wrote: > > > > Could you bisect ? I would love to point finger upstream to the DMA > > folk who made changes to that API without testing with GPU. > > R

Re: [Nouveau] kernel spew from nouveau/ swiotlb

2018-05-10 Thread Mike Galbraith
On Thu, 2018-05-10 at 10:31 -0400, Jerome Glisse wrote: > > Could you bisect ? I would love to point finger upstream to the DMA > folk who made changes to that API without testing with GPU. Rummaging a bit, it might be... nouveau_bo_new() ... ttm_dma_pool_alloc_new_pages() dma_alloc_attrs()

Re: kernel spew from nouveau/ swiotlb

2018-05-10 Thread Mike Galbraith
On Thu, 2018-05-10 at 11:10 +0200, Mike Galbraith wrote: > Greetings, > > When box is earning its keep, nouveau/swiotlb grumble.. a LOT. The > below is from master.today. > > [12594.640959] nouveau 0000:01:00.0: swiotlb buffer is full (sz: 2097152 > bytes) > [12594.6930

kernel spew from nouveau/ swiotlb

2018-05-10 Thread Mike Galbraith
Greetings, When box is earning its keep, nouveau/swiotlb grumble.. a LOT. The below is from master.today. [12594.640959] nouveau 0000:01:00.0: swiotlb buffer is full (sz: 2097152 bytes) [12594.693000] nouveau 0000:01:00.0: swiotlb buffer is full (sz: 2097152 bytes) [12594.713787] nouveau 0000:01

Re: bug in tag handling in blk-mq?

2018-05-09 Thread Mike Galbraith
On Wed, 2018-05-09 at 13:50 -0600, Jens Axboe wrote: > On 5/9/18 12:31 PM, Mike Galbraith wrote: > > On Wed, 2018-05-09 at 11:01 -0600, Jens Axboe wrote: > >> On 5/9/18 10:57 AM, Mike Galbraith wrote: > >> > >>>>> Confirmed. Impressive high speed bug s

Re: bug in tag handling in blk-mq?

2018-05-09 Thread Mike Galbraith
On Wed, 2018-05-09 at 11:01 -0600, Jens Axboe wrote: > On 5/9/18 10:57 AM, Mike Galbraith wrote: > > >>> Confirmed. Impressive high speed bug stomping. > >> > >> Well, that's good news. Can I get you to try this patch? > > > > Sure thin

Re: bug in tag handling in blk-mq?

2018-05-09 Thread Mike Galbraith
On Wed, 2018-05-09 at 09:18 -0600, Jens Axboe wrote: > On 5/8/18 10:11 PM, Mike Galbraith wrote: > > On Tue, 2018-05-08 at 19:09 -0600, Jens Axboe wrote: > >> > >> Alright, I managed to reproduce it. What I think is happening is that > >> BFQ is limiting the

Re: bug in tag handling in blk-mq?

2018-05-08 Thread Mike Galbraith
On Tue, 2018-05-08 at 14:37 -0600, Jens Axboe wrote: > > - sdd has nothing pending, yet has 6 active waitqueues. sdd is where ccache storage lives, which should have been the only activity on that drive, as I built source in sdb, and was doing nothing else that utilizes sdd. -Mike

Re: bug in tag handling in blk-mq?

2018-05-08 Thread Mike Galbraith
On Tue, 2018-05-08 at 19:09 -0600, Jens Axboe wrote: > > Alright, I managed to reproduce it. What I think is happening is that > BFQ is limiting the inflight case to something less than the wake > batch for sbitmap, which can lead to stalls. I don't have time to test > this tonight, but perhaps yo

Re: bug in tag handling in blk-mq?

2018-05-08 Thread Mike Galbraith
On Tue, 2018-05-08 at 08:55 -0600, Jens Axboe wrote: > > All the block debug files are empty... Sigh. Take 2, this time cat debug files, having turned block tracing off before doing anything else (so trace bits in dmesg.txt should end AT the stall). -Mike dmesg.xz Description: applicat

Re: bug in tag handling in blk-mq?

2018-05-08 Thread Mike Galbraith
On Tue, 2018-05-08 at 06:51 +0200, Mike Galbraith wrote: > > I'm deadlined ATM, but will get to it. (Bah, even a zombie can type ccache -C; make -j8 and stare...) kbuild again hung on the first go (yay), and post hang data written to sdd1 survived (kernel source lives in sdb3).

Re: bug in tag handling in blk-mq?

2018-05-07 Thread Mike Galbraith
On Mon, 2018-05-07 at 20:02 +0200, Paolo Valente wrote: > > > > Is there a reproducer? Just building fat config kernels works for me. It was highly non-deterministic, but reproduced quickly twice in a row with Paolo's hack. > Ok Mike, I guess it's your turn now, for at least a stack trace.

Re: [PATCH BUGFIX] block, bfq: postpone rq preparation to insert or merge

2018-05-07 Thread Mike Galbraith
On Mon, 2018-05-07 at 11:27 +0200, Paolo Valente wrote: > > > Where is the bug? Hm, seems potent pain-killers and C don't mix all that well.

Re: [PATCH] x86: UV: raw_spinlock conversion

2018-05-07 Thread Mike Galbraith
On Mon, 2018-05-07 at 09:39 +0200, Sebastian Andrzej Siewior wrote: > On 2018-05-06 12:59:19 [+0200], Mike Galbraith wrote: > > On Sun, 2018-05-06 at 12:26 +0200, Thomas Gleixner wrote: > > > On Fri, 4 May 2018, Sebastian Andrzej Siewior wrote: > > > &

Re: [PATCH BUGFIX] block, bfq: postpone rq preparation to insert or merge

2018-05-06 Thread Mike Galbraith
On Sun, 2018-05-06 at 09:42 +0200, Paolo Valente wrote: > > diff --git a/block/bfq-mq-iosched.c b/block/bfq-mq-iosched.c > index 118f319af7c0..6662efe29b69 100644 > --- a/block/bfq-mq-iosched.c > +++ b/block/bfq-mq-iosched.c > @@ -525,8 +525,13 @@ static void bfq_limit_depth(unsigned int op, struc

Re: [PATCH BUGFIX] block, bfq: postpone rq preparation to insert or merge

2018-05-06 Thread Mike Galbraith
On Mon, 2018-05-07 at 04:43 +0200, Mike Galbraith wrote: > On Sun, 2018-05-06 at 09:42 +0200, Paolo Valente wrote: > > > > I've attached a compressed patch (to avoid possible corruption from my > > mailer). I'm little confident, but no pain, no gain, right? > &g

Re: [PATCH BUGFIX] block, bfq: postpone rq preparation to insert or merge

2018-05-06 Thread Mike Galbraith
On Sun, 2018-05-06 at 09:42 +0200, Paolo Valente wrote: > > I've attached a compressed patch (to avoid possible corruption from my > mailer). I'm little confident, but no pain, no gain, right? > > If possible, apply this patch on top of the fix I proposed in this > thread, just to eliminate poss

Re: [PATCH] x86: UV: raw_spinlock conversion

2018-05-06 Thread Mike Galbraith
On Sun, 2018-05-06 at 12:26 +0200, Thomas Gleixner wrote: > On Fri, 4 May 2018, Sebastian Andrzej Siewior wrote: > > > From: Mike Galbraith > > > > Shrug. Lots of hobbyists have a beast in their basement, right? > > This hardly qualifies as a proper changelog ..

Re: [PATCH BUGFIX] block, bfq: postpone rq preparation to insert or merge

2018-05-05 Thread Mike Galbraith
On Sat, 2018-05-05 at 12:39 +0200, Paolo Valente wrote: > > BTW, if you didn't run out of patience with this permanent issue yet, > I was thinking of two o three changes to retry to trigger your failure > reliably. Sure, fire away, I'll happily give the annoying little bugger opportunities to sho

Re: [PATCH BUGFIX] block, bfq: postpone rq preparation to insert or merge

2018-05-05 Thread Mike Galbraith
On Fri, 2018-05-04 at 21:46 +0200, Mike Galbraith wrote: > Tentatively, I suspect you've just fixed the nasty stalls I reported a > while back. Oh well, so much for optimism. It took a lot, but just hung.

Re: [PATCH BUGFIX] block, bfq: postpone rq preparation to insert or merge

2018-05-04 Thread Mike Galbraith
Tentatively, I suspect you've just fixed the nasty stalls I reported a while back. Not a hint of stall as yet (should have shown itself by now), spinning rust buckets are being all they can be, box feels good. Later mq-deadline (I hope to eventually forget the module dependency eternities we've s

[patch-rt] sched,fair: Fix CFS bandwidth control lockdep DEADLOCK report

2018-05-03 Thread Mike Galbraith
iod_timer+0x28/0x140 sched_cfs_period_timer+0x28/0x140 ? sched_cfs_slack_timer+0xc0/0xc0 __hrtimer_run_queues+0x10e/0x5f0 hrtimer_run_softirq+0x83/0xc0 do_current_softirqs+0x292/0x660 run_ksoftirqd+0x27/0x70 smpboot_thread_fn+0x27f/0x330 kthread+0x103/0x140 ? smpboot_register_percpu_thread_cpumask+0x100/0x10

Re: cpu stopper threads and load balancing leads to deadlock

2018-05-03 Thread Mike Galbraith
On Thu, 2018-05-03 at 18:45 +0200, Peter Zijlstra wrote: > > Something like so perhaps? Mike, can you play around with that? Could > burn your granny and eat your cookies. That worked, and nothing entertaining has happened.. yet. Hm, I could use this kernel to update my backup drive, if there's

Re: cpu stopper threads and load balancing leads to deadlock

2018-05-03 Thread Mike Galbraith
On Thu, 2018-05-03 at 15:56 +0200, Peter Zijlstra wrote: > On Thu, May 03, 2018 at 03:32:39PM +0200, Mike Galbraith wrote: > > > Dang. With $subject fix applied as well.. > > That's a NO then... :-( Could say who cares about oddball offline wakeup stat.

Re: cpu stopper threads and load balancing leads to deadlock

2018-05-03 Thread Mike Galbraith
On Thu, 2018-05-03 at 14:49 +0200, Peter Zijlstra wrote: > On Thu, May 03, 2018 at 02:40:21PM +0200, Mike Galbraith wrote: > > On Thu, 2018-05-03 at 14:28 +0200, Peter Zijlstra wrote: > > > > > > Hurm.. I don't see how this is 'new'. We moved the wakeup o

Re: cpu stopper threads and load balancing leads to deadlock

2018-05-03 Thread Mike Galbraith
On Thu, 2018-05-03 at 14:28 +0200, Peter Zijlstra wrote: > > Hurm.. I don't see how this is 'new'. We moved the wakeup out from under > stopper lock, but that should not affect the RCU state. No, not new, just additional woes from the same spot. -Mike

Re: cpu stopper threads and load balancing leads to deadlock

2018-05-03 Thread Mike Galbraith
On Tue, 2018-04-24 at 14:33 +0100, Matt Fleming wrote: > On Fri, 20 Apr, at 11:50:05AM, Peter Zijlstra wrote: > > On Tue, Apr 17, 2018 at 03:21:19PM +0100, Matt Fleming wrote: > > > Hi guys, > > > > > > We've seen a bug in one of our SLE kernels where the cpu stopper > > > thread ("migration/15")

Re: [PATCH v7 2/5] cpuset: Add cpuset.sched_load_balance to v2

2018-05-02 Thread Mike Galbraith
On Wed, 2018-05-02 at 16:02 +0200, Peter Zijlstra wrote: > On Wed, May 02, 2018 at 09:47:00AM -0400, Waiman Long wrote: > > > > I've read half of the next patch that adds the isolation thing. And > > > while that kludges around the whole root cgroup is magic thing, it > > > doesn't help if you mov

Re: [RFC/RFT patch 0/7] timekeeping: Unify clock MONOTONIC and clock BOOTTIME

2018-04-26 Thread Mike Galbraith
On Wed, 2018-04-25 at 15:03 +0200, Thomas Gleixner wrote: > Right, it does not matter. The real interesting one is d6ed449afdb3. FWIW, three boxen here suspend/resume fine, but repeatably exhibit the below after a very few minute suspend, and a short bisect fingered your suspect. Distro is opensu

Re: DOS by unprivileged user

2018-04-25 Thread Mike Galbraith
On Wed, 2018-04-25 at 15:54 +0100, Alan Cox wrote: > > Classical Unix systems never had this problem because they respond to > thrashing by ensuring that all processes consumed CPU and made some > progress. Linux handles it by thrashing itself to death while BSD always > handled it by moving from

Re: DOS by unprivileged user

2018-04-25 Thread Mike Galbraith
On Wed, 2018-04-25 at 15:54 +0100, Alan Cox wrote: > > > I think memory allocation and io waits can't be decoupled from > > > scheduling as they are now. > > > > The scheduler is not decoupled from either, it is intimately involved > > in both. However, none of the decision making smarts for ei

Re: [PATCH] sched: fix typo in error message

2018-04-24 Thread Mike Galbraith
On Wed, 2018-04-25 at 13:41 +0800, Li Bin wrote: > Signed-off-by: Li Bin > --- > kernel/sched/topology.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c > index 64cc564..cf15c1c 100644 > --- a/kernel/sched/topology.c > ++

Re: DOS by unprivileged user

2018-04-23 Thread Mike Galbraith
On Sun, 2018-04-22 at 21:37 +0200, Ferry Toth wrote: > > Yes your memory hog scenario thoroughly wrecks the user experience, but > > the process scheduler is not the source of that wreckage, it's a memory > > management issue. With no constraints in place, anybody can just keep > > on allocating u

Re: [PATCH 2/2] rtmutex: Reduce top-waiter blocking on a lock

2018-04-20 Thread Mike Galbraith
On Fri, 2018-04-20 at 17:50 +0200, Peter Zijlstra wrote: > On Tue, Apr 10, 2018 at 09:27:50AM -0700, Davidlohr Bueso wrote: > > By applying well known spin-on-lock-owner techniques, we can avoid the > > blocking overhead during the process of when the task is trying to take > > the rtmutex. The ide

Re: DOS by unprivileged user

2018-04-20 Thread Mike Galbraith
On Fri, 2018-04-20 at 10:39 +0200, Ferry Toth wrote: > > Nevertheless I feel one process should not be allowed to harm other > processes by denying them resources. Even when btrfs makes it easy > to abuse I think the scheduler should have throttled gitk. Memory management is not in its job descr

Re: [PATCH v7 0/5] cpuset: Enable cpuset controller in default hierarchy

2018-04-20 Thread Mike Galbraith
On Thu, 2018-04-19 at 09:46 -0400, Waiman Long wrote: > v7: > - Add a root-only cpuset.cpus.isolated control file for CPU isolation. > - Enforce that load_balancing can only be turned off on cpusets with >CPUs from the isolated list. > - Update sched domain generation to allow cpusets with C

Re: DOS by unprivileged user

2018-04-19 Thread Mike Galbraith
On Thu, 2018-04-19 at 21:13 +0200, Ferry Toth wrote: > It appears any ordinary user can easily create a DOS on linux. > > One sure way to reproduce this is to open gitk on the linux kernel repo > (SIC) on a machine with 8GB RAM 16 GB swap on a HDD with btrfs and quad core > + hyperthreading. But

Re: cpu stopper threads and load balancing leads to deadlock

2018-04-18 Thread Mike Galbraith
On Wed, 2018-04-18 at 07:47 +0200, Mike Galbraith wrote: > On Tue, 2018-04-17 at 15:21 +0100, Matt Fleming wrote: > > Hi guys, > > > > We've seen a bug in one of our SLE kernels where the cpu stopper > > thread ("migration/15") is entering idle balance.

Re: cpu stopper threads and load balancing leads to deadlock

2018-04-17 Thread Mike Galbraith
On Tue, 2018-04-17 at 15:21 +0100, Matt Fleming wrote: > Hi guys, > > We've seen a bug in one of our SLE kernels where the cpu stopper > thread ("migration/15") is entering idle balance. This then triggers > active load balance. > > At the same time, a task on another CPU triggers a page fault an

Re: [PATCH AUTOSEL for 4.14 015/161] printk: Add console owner and waiter logic to load balance console writes

2018-04-17 Thread Mike Galbraith
On Tue, 2018-04-17 at 17:52 +0200, Jiri Kosina wrote: > On Tue, 17 Apr 2018, Sasha Levin wrote: > > > How do I get the XFS folks to send their stuff to -stable? (we have > > quite a few customers who use XFS) > > If XFS (or *any* other subsystem) doesn't have enough manpower of upstream > mainta

Re: 4.17.0-rc1 doesn't boot.

2018-04-17 Thread Mike Galbraith
On Tue, 2018-04-17 at 17:31 +0200, Borislav Petkov wrote: > On Tue, Apr 17, 2018 at 05:21:30PM +0200, Jörg Otte wrote: > > finished bisection. > > 39114b7a743e6759bab4d96b7d9651d44d17e3f9 is the first bad commit > > (x86/pti: Never implicitly clear _PAGE_GLOBAL for kernel image). > > Looks like yo

Re: [ANNOUNCE] v4.14.34-rt27

2018-04-16 Thread Mike Galbraith
On Fri, 2018-04-13 at 23:52 +0200, Sebastian Andrzej Siewior wrote: > > - Inter-event (latency) fixes by Tom Zanussi.  CC  kernel/trace/trace_events_hist.o kernel/trace/trace_events_hist.c: In function ‘__update_field_vars’: kernel/trace/trace_events_hist.c:3093:11: warning: ignoring retur

Re: x86-tip.today (4cdf573) early instaboot

2018-04-10 Thread Mike Galbraith
On Tue, 2018-04-10 at 09:06 -0500, Tom Lendacky wrote: > > Just out of curiosity, can you try the following patch and see if it > fixes your reboot issue: Yup, all better. > diff --git a/arch/x86/boot/compressed/kaslr.c > b/arch/x86/boot/compressed/kaslr.c > index c5196d2..a0a50b9 100644 > --- a

Re: x86-tip.today (4cdf573) early instaboot

2018-04-10 Thread Mike Galbraith
On Tue, 2018-04-10 at 10:59 +0200, Ingo Molnar wrote: > * Mike Galbraith wrote: > > > Hi Ingo, > > > > FYI, my i4790 box reboots immediately.. or close enough to it that you > > see nothing at all before again meeting the bios splash. Master with > > the

x86-tip.today (4cdf573) early instaboot

2018-04-10 Thread Mike Galbraith
Hi Ingo, FYI, my i4790 box reboots immediately.. or close enough to it that you see nothing at all before again meeting the bios splash. Master with the ~same config works fine. I haven't poked around yet (work). -Mike config-4.16.0.g4cdf573-tip-default.xz Description: application/xz

nouveau: swiotlb buffer is full (sz: 2097152 bytes)/swiotlb: coherent allocation failed, size=2097152 spam

2018-04-08 Thread Mike Galbraith
Greetings, Box is i4790 w. GTX 980 running virgin master (.today). All I have to do to trigger a slew of these warnings is to fire up firefox, point it at a youtube clip, and let it autoplay while I do routine kernel merge/build maintenance. nouveau doesn't seem to care deeply, but moans again a

Re: sched_rt_period_timer causing large latencies

2018-04-05 Thread Mike Galbraith
On Thu, 2018-04-05 at 10:27 +0200, Peter Zijlstra wrote: > On Thu, Apr 05, 2018 at 09:11:38AM +1000, Nicholas Piggin wrote: > > Hi, > > > > I'm seeing some pretty big latencies on a ~idle system when a CPU wakes > > out of a nohz idle. Looks like it's due to the taking a lot of remote > > locks an

Re: sched_rt_period_timer causing large latencies

2018-04-05 Thread Mike Galbraith
On Thu, 2018-04-05 at 17:44 +1000, Nicholas Piggin wrote: > > > My method of dealing with the throttle beast from hell for ~big box RT > > is to stomp it flat during boot, as otherwise jitter is awful. > > How do you stomp it flat? With a size 12 boot originally from SGI. Their extra hairy beas

Re: sched_rt_period_timer causing large latencies

2018-04-04 Thread Mike Galbraith
On Thu, 2018-04-05 at 09:11 +1000, Nicholas Piggin wrote: > Hi, > > I'm seeing some pretty big latencies on a ~idle system when a CPU wakes > out of a nohz idle. Looks like it's due to the taking a lot of remote > locks and cache lines. irqoff trace: > > latency: 407 us, #608/608, CPU#3 | (M:serv

Re: [GIT PULL] Kernel lockdown for secure boot

2018-04-04 Thread Mike Galbraith
On Wed, 2018-04-04 at 08:57 -0400, Theodore Y. Ts'o wrote: > On Wed, Apr 04, 2018 at 04:30:18AM +0000, Matthew Garrett wrote: > > What I'm afraid of is this turning into a "security" feature that ends up > > being circumvented in most scenarios where it's currently deployed - eg, > > module signatu

Re: [PATCH v6 2/2] cpuset: Add cpuset.sched_load_balance to v2

2018-03-27 Thread Mike Galbraith
On Tue, 2018-03-27 at 10:23 -0400, Waiman Long wrote: > On 03/27/2018 10:02 AM, Tejun Heo wrote: > > Hello, > > > > On Mon, Mar 26, 2018 at 04:28:49PM -0400, Waiman Long wrote: > >> Maybe we can have a different root level flag, say, > >> sched_partition_domain that is equivalent to !sched_load_bal

Re: [PATCH v6 2/2] cpuset: Add cpuset.sched_load_balance to v2

2018-03-26 Thread Mike Galbraith
On Mon, 2018-03-26 at 16:28 -0400, Waiman Long wrote: > > The sched_load_balance flag isn't something that is passed to the > scheduler. It only affects the CPU topology of the system. So I > suspect that a process in the root cgroup will be load balanced among > the CPUs in the one of the ch

Re: [PATCH v4] cpuset: Enable cpuset controller in default hierarchy

2018-03-19 Thread Mike Galbraith
On Mon, 2018-03-19 at 17:41 -0400, Waiman Long wrote: > On 03/19/2018 04:49 PM, Mike Galbraith wrote: > > On Mon, 2018-03-19 at 08:34 -0700, Tejun Heo wrote: > >> Hello, Mike. > >> > >> On Thu, Mar 15, 2018 at 03:49:01AM +0100, Mike Galbraith wrote: > >&g

Re: [PATCH v4] cpuset: Enable cpuset controller in default hierarchy

2018-03-19 Thread Mike Galbraith
On Mon, 2018-03-19 at 08:34 -0700, Tejun Heo wrote: > Hello, Mike. > > On Thu, Mar 15, 2018 at 03:49:01AM +0100, Mike Galbraith wrote: > > Under the hood v2 details are entirely up to you. My input ends at > > please don't leave dynamic partitioning standing at

Re: [PATCH v4] cpuset: Enable cpuset controller in default hierarchy

2018-03-14 Thread Mike Galbraith
On Wed, 2018-03-14 at 12:57 -0700, Tejun Heo wrote: > Hello, > > On Sat, Mar 10, 2018 at 04:47:28AM +0100, Mike Galbraith wrote: > > Some form of cpu_exclusive (preferably exactly that, but something else > > could replace it) is needed to define sets that must not overlap

Re: [PATCH v4] cpuset: Enable cpuset controller in default hierarchy

2018-03-12 Thread Mike Galbraith
On Mon, 2018-03-12 at 10:20 -0400, Waiman Long wrote: > On 03/10/2018 08:16 AM, Peter Zijlstra wrote: > > > The equivalent of isolcpus=xxx is a cgroup setup like: > > > > root > > / \ > > systemother > > > > Where other has the @xxx cpus and system the remainder and > > root.sch

Re: [RFC/RFT][PATCH v3 0/6] sched/cpuidle: Idle loop rework

2018-03-09 Thread Mike Galbraith
On Fri, 2018-03-09 at 10:34 +0100, Rafael J. Wysocki wrote: > Hi All, > > Thanks a lot for the discussion and testing so far! > > This is a total respin of the whole series, so please look at it afresh. > Patches 2 and 3 are the most similar to their previous versions, but > still they are differ

Re: [PATCH v4] cpuset: Enable cpuset controller in default hierarchy

2018-03-09 Thread Mike Galbraith
On Fri, 2018-03-09 at 18:06 -0500, Waiman Long wrote: > On 03/09/2018 05:17 PM, Peter Zijlstra wrote: > > On Fri, Mar 09, 2018 at 03:43:34PM -0500, Waiman Long wrote: > >> The isolcpus= parameter just reduces the cpus available to the rest of > >> the system. The cpuset controller does look at that

Re: [PATCH v4] cpuset: Enable cpuset controller in default hierarchy

2018-03-09 Thread Mike Galbraith
On Fri, 2018-03-09 at 13:20 -0500, Waiman Long wrote: > On 03/09/2018 01:17 PM, Mike Galbraith wrote: > > On Fri, 2018-03-09 at 12:45 -0500, Waiman Long wrote: > >> On 03/09/2018 11:34 AM, Mike Galbraith wrote: > >>> On Fri, 2018-03-09 at 10:35 -0500, Waiman Long wro

Re: [PATCH v4] cpuset: Enable cpuset controller in default hierarchy

2018-03-09 Thread Mike Galbraith
On Fri, 2018-03-09 at 12:45 -0500, Waiman Long wrote: > On 03/09/2018 11:34 AM, Mike Galbraith wrote: > > On Fri, 2018-03-09 at 10:35 -0500, Waiman Long wrote: > >> Given the fact that thread mode had been merged into 4.14, it is now > >> time to enable cpuset to be us

Re: [PATCH v4] cpuset: Enable cpuset controller in default hierarchy

2018-03-09 Thread Mike Galbraith
On Fri, 2018-03-09 at 17:34 +0100, Mike Galbraith wrote: > On Fri, 2018-03-09 at 10:35 -0500, Waiman Long wrote: > > Given the fact that thread mode had been merged into 4.14, it is now > > time to enable cpuset to be used in the default hierarchy (cgroup v2) > > as i

Re: [PATCH v4] cpuset: Enable cpuset controller in default hierarchy

2018-03-09 Thread Mike Galbraith
On Fri, 2018-03-09 at 10:35 -0500, Waiman Long wrote: > Given the fact that thread mode had been merged into 4.14, it is now > time to enable cpuset to be used in the default hierarchy (cgroup v2) > as it is clearly threaded. > > The cpuset controller had experienced feature creep since its > intr

Re: [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework

2018-03-08 Thread Mike Galbraith
On Thu, 2018-03-08 at 12:10 +0100, Rafael J. Wysocki wrote: > On Thu, Mar 8, 2018 at 11:31 AM, Mike Galbraith wrote: > 1 2 3 > > 4.16.0.g1b88acc-master 6.95 7.03 6.91 (virgin) > > 4.16.0.g1b88acc-master 7.20 7.25 7.26 (+v2) &

Re: [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework

2018-03-08 Thread Mike Galbraith
On Tue, 2018-03-06 at 09:57 +0100, Rafael J. Wysocki wrote: > Hi All, Greetings, > Thanks a lot for the discussion so far! > > Here's a new version of the series addressing some comments from the > discussion and (most importantly) replacing patches 4 and 5 with another > (simpler) patch. Oddit

Re: [RFC 1/2] sched: reduce migration cost between faster caches for idle_balance

2018-02-15 Thread Mike Galbraith
On Thu, 2018-02-15 at 10:07 -0800, Rohit Jain wrote: > > > Rohit is running more tests with a patch that deletes > > sysctl_sched_migration_cost from idle_balance, and for his patch but > > with the 5000 usec mistake corrected back to 500 usec. So far both > > give improvements over the baseline,

Re: [RFC 1/2] sched: reduce migration cost between faster caches for idle_balance

2018-02-15 Thread Mike Galbraith
On Thu, 2018-02-15 at 13:21 -0500, Steven Sistare wrote: > On 2/15/2018 1:07 PM, Mike Galbraith wrote: > > >> Can you provide more details on the sysbench oltp test that motivated you > >> to add sysctl_sched_migration_cost to idle_balance, so Rohit can re-test > >

Re: [RFC 1/2] sched: reduce migration cost between faster caches for idle_balance

2018-02-15 Thread Mike Galbraith
On Thu, 2018-02-15 at 11:35 -0500, Steven Sistare wrote: > On 2/10/2018 1:37 AM, Mike Galbraith wrote: > > On Fri, 2018-02-09 at 11:08 -0500, Steven Sistare wrote: > >>>> @@ -8804,7 +8803,8 @@ static int idle_balance(struct rq *this_rq, struct > >>>> rq_flag

Re: [PATCH 3/4] sched/fair: Do not migrate on wake_affine_weight if weights are equal

2018-02-12 Thread Mike Galbraith
On Mon, 2018-02-12 at 18:29 +0100, Peter Zijlstra wrote: > On Mon, Feb 12, 2018 at 02:58:56PM +0000, Mel Gorman wrote: > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > index c1091cb023c4..28c8d9c91955 100644 > > --- a/kernel/sched/fair.c > > +++ b/kernel/sched/fair.c > > @@ -5747,7 +5

Re: [RFC 1/2] sched: reduce migration cost between faster caches for idle_balance

2018-02-09 Thread Mike Galbraith
On Fri, 2018-02-09 at 11:08 -0500, Steven Sistare wrote: > >> @@ -8804,7 +8803,8 @@ static int idle_balance(struct rq *this_rq, struct > >> rq_flags *rf) > >>if (!(sd->flags & SD_LOAD_BALANCE)) > >>continue; > >> > >> - if (this_rq->avg_idle < curr_cost +

Re: [RFC 2/2] Introduce sysctl(s) for the migration costs

2018-02-09 Thread Mike Galbraith
On Fri, 2018-02-09 at 12:33 -0500, Steven Sistare wrote: > On 2/9/2018 12:08 PM, Mike Galbraith wrote: > > > Shrug. It's bogus no matter what we do. Once Upon A Time, a cost > > number was generated via measurement, but the end result was just as > > bogus as a n

Re: [PATCH BUGFIX V3] block, bfq: add requeue-request hook

2018-02-09 Thread Mike Galbraith
On Fri, 2018-02-09 at 14:21 +0100, Oleksandr Natalenko wrote: > > In addition to this I think it should be worth considering CC'ing Greg > to pull this fix into 4.15 stable tree. This isn't one he can cherry-pick, some munging required, in which case he usually wants a properly tested backport.

Re: [RFC 2/2] Introduce sysctl(s) for the migration costs

2018-02-09 Thread Mike Galbraith
On Fri, 2018-02-09 at 11:10 -0500, Steven Sistare wrote: > On 2/8/2018 10:54 PM, Mike Galbraith wrote: > > On Thu, 2018-02-08 at 14:19 -0800, Rohit Jain wrote: > >> This patch introduces the sysctl for sched_domain based migration costs. > >> These in turn can be use

Re: [RFC PATCH 2/4] softirq: Per vector deferment to workqueue

2018-02-08 Thread Mike Galbraith
On Thu, 2018-02-08 at 20:30 +0000, Dmitry Safonov wrote: > On Thu, 2018-02-08 at 15:22 -0500, David Miller wrote: > > From: Dmitry Safonov > > Date: Thu, 08 Feb 2018 20:14:55 +0000 > > > > > On Thu, 2018-02-08 at 13:45 -0500, David Miller wrote: > > >> From: Sebastian Andrzej Siewior > > >> Date

Re: [RFC 2/2] Introduce sysctl(s) for the migration costs

2018-02-08 Thread Mike Galbraith
On Thu, 2018-02-08 at 14:19 -0800, Rohit Jain wrote: > This patch introduces the sysctl for sched_domain based migration costs. > These in turn can be used for performance tuning of workloads. With this patch, we trade 1 completely bogus constant (cost is really highly variable) for 3, twiddling o

Re: [RFC 1/2] sched: reduce migration cost between faster caches for idle_balance

2018-02-08 Thread Mike Galbraith
On Thu, 2018-02-08 at 14:19 -0800, Rohit Jain wrote: > This patch makes idle_balance more dynamic as the sched_migration_cost > is now accounted on a sched_domain level. This in turn is done in > sd_init when we know what the topology relationships are. > > For introduction sakes cost of migration

Re: [PATCH BUGFIX 1/1] block, bfq: add requeue-request hook

2018-02-07 Thread Mike Galbraith
On Wed, 2018-02-07 at 12:12 +0100, Paolo Valente wrote: > Just to be certain, before submitting a new patch: you changed *only* > the BUG_ON at line 4742, on top of my instrumentation patch. Nah, I completely rewrote it with only a little help from an ouija board to compensate for missing (all) k

Re: [PATCH BUGFIX 1/1] block, bfq: add requeue-request hook

2018-02-07 Thread Mike Galbraith
On Wed, 2018-02-07 at 11:27 +0100, Paolo Valente wrote: > > 2. Could you please turn that BUG_ON into: > if (!(rq->rq_flags & RQF_ELVPRIV)) > return; > and see what happens? That seems to make it forget how to make boom. -Mike

Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y

2018-02-07 Thread Mike Galbraith
On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote: > On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote: > > Hi All, > > > > I met the makedumpfile failed in the upstream kernel which contained > > this patch. Did I miss something else? > > None I'm aware of. > > Is there a r

Re: [PATCH BUGFIX 1/1] block, bfq: add requeue-request hook

2018-02-07 Thread Mike Galbraith
On Wed, 2018-02-07 at 11:27 +0100, Paolo Valente wrote: > > 1. Could you paste a stack trace for this OOPS, just to understand how we > get there? [ 442.421058] kernel BUG at block/bfq-iosched.c:4742! [ 442.421762] invalid opcode: [#1] SMP PTI [ 442.422436] Dumping ftrace buffer: [ 442.4

Re: [PATCH BUGFIX 1/1] block, bfq: add requeue-request hook

2018-02-07 Thread Mike Galbraith
On Wed, 2018-02-07 at 10:45 +0100, Paolo Valente wrote: > > > Il giorno 07 feb 2018, alle ore 10:23, Mike Galbraith ha > > scritto: > > > > On Wed, 2018-02-07 at 10:08 +0100, Paolo Valente wrote: > >> > >> The first piece of information I need is w

Re: [PATCH BUGFIX 1/1] block, bfq: add requeue-request hook

2018-02-07 Thread Mike Galbraith
On Wed, 2018-02-07 at 10:08 +0100, Paolo Valente wrote: > > The first piece of information I need is whether this failure happens > even without "BFQ hierarchical scheduling support". I presume you mean BFQ_GROUP_IOSCHED, which I do not have enabled. -Mike 

Re: [PATCH BUGFIX 1/1] block, bfq: add requeue-request hook

2018-02-06 Thread Mike Galbraith
On Tue, 2018-02-06 at 13:43 +0100, Holger Hoffstätte wrote: > > A much more interesting question to me is why there is kyber in the middle. :) Yeah, given per sysfs I have zero devices using kyber. -Mike

Re: [PATCH BUGFIX 1/1] block, bfq: add requeue-request hook

2018-02-06 Thread Mike Galbraith
On Tue, 2018-02-06 at 13:26 +0100, Paolo Valente wrote: > > ok, right in the middle of bfq this time ... Was this the first OOPS in your > kernel log? Yeah.

Re: [PATCH BUGFIX 1/1] block, bfq: add requeue-request hook

2018-02-06 Thread Mike Galbraith
On Tue, 2018-02-06 at 13:16 +0100, Oleksandr Natalenko wrote: > Hi. > > 06.02.2018 12:57, Mike Galbraith wrote: > > Not me.  Box seems to be fairly sure that it is bfq. Twice again box > > went belly up on me in fairly short order with bfq, but seemed fine > > wi

Re: [PATCH BUGFIX 1/1] block, bfq: add requeue-request hook

2018-02-06 Thread Mike Galbraith
On Tue, 2018-02-06 at 10:38 +0100, Paolo Valente wrote: > > Hi Mike, > as you can imagine, I didn't get any failure in my pre-submission > tests on this patch. In addition, it is not that easy to link this > patch, which just adds some internal bfq housekeeping in case of a > requeue, with a corr

Re: [PATCH BUGFIX 1/1] block, bfq: add requeue-request hook

2018-02-06 Thread Mike Galbraith
On Tue, 2018-02-06 at 09:37 +0100, Oleksandr Natalenko wrote: > Hi. > > 06.02.2018 08:56, Mike Galbraith wrote: > > I was doing kbuilds, and it blew up on me twice. Switching back to cfq > > seemed to confirm it was indeed the patch causing trouble, but that's

Re: [PATCH BUGFIX 1/1] block, bfq: add requeue-request hook

2018-02-05 Thread Mike Galbraith
On Tue, 2018-02-06 at 08:44 +0100, Oleksandr Natalenko wrote: > Hi, Paolo. > > I can confirm that this patch fixes cfdisk hang for me. I've also tried > to trigger the issue Mike has encountered, but with no luck (maybe, I > wasn't insistent enough, just was doing dd on usb-storage device in the
