Re: Check all returns from audit_log_start

2012-09-06 Thread Dave Jones
On Thu, Sep 06, 2012 at 11:36:06AM -0400, Eric Paris wrote: > On Thu, Sep 6, 2012 at 11:08 AM, Dave Jones wrote: > > Following on from the previous patch that fixed an oops, these > > are all the other similar code patterns in the tree with the same > > checks adde

Re: Check all returns from audit_log_start

2012-09-06 Thread Dave Jones
On Thu, Sep 06, 2012 at 11:47:49AM -0400, Dave Jones wrote: > > Not certain because I haven't looked at what happens with the error > > code, but I think this might not be right. auditd can be explicitly > > told not to audit certain events, in which case it is normal an

Remove user-triggerable BUG from mpol_to_str

2012-09-06 Thread Dave Jones
to_str+0x156/0x360 Cc: sta...@vger.kernel.org Signed-off-by: Dave Jones diff --git a/mm/mempolicy.c b/mm/mempolicy.c index bd92431..4ada3be 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -2562,7 +2562,7 @@ int mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol, int no_c

Re: 3.6-rc4 audit_log_d_path oops.

2012-09-06 Thread Dave Jones
On Thu, Sep 06, 2012 at 09:32:49AM -0700, Kees Cook wrote: > > I just realised, the funny thing about this is that the machine running > > that test > > had selinux/audit disabled. And yet here we are, screwing around with > > audit buffers. > > The intent was to have this message show up

Re: [PATCH] make CONFIG_EXPERIMENTAL invisible and default

2012-10-07 Thread Dave Jones
On Sun, Oct 07, 2012 at 09:30:29AM -0700, Paul E. McKenney wrote: > > I think Kconfig is mostly what distro would like to use the thing is > > the Kconfig text needs to be there upfront when its merged, not two > > months later, since then it too late for a distro to notice. > > > > I'd bet

mpol_to_str revisited.

2012-10-08 Thread Dave Jones
return checking, and also clears the buffer beforehand. Reported-by: Ben Hutchings Cc: sta...@kernel.org Signed-off-by: Dave Jones --- unanswered question: why are the buffer sizes here different ? which is correct? diff -durpN '--exclude-from=/home/davej/.exclude' src/git-tr

Re: mpol_to_str revisited.

2012-10-08 Thread Dave Jones
On Mon, Oct 08, 2012 at 11:09:49AM -0400, Dave Jones wrote: > Last month I sent in 80de7c3138ee9fd86a98696fd2cf7ad89b995d0a to remove > a user triggerable BUG in mempolicy. > > Ben Hutchings pointed out to me that my change introduced a potential leak > of stack conten

Re: mpol_to_str revisited.

2012-10-08 Thread Dave Jones
On Mon, Oct 08, 2012 at 01:35:42PM -0700, David Rientjes wrote: > > unanswered question: why are the buffer sizes here different ? which is > > correct? > > > Given the current set of mempolicy modes and flags, it's 34, but this can > change if new modes or flags are added with longer name

ODEBUG: free active (active state 0) object type: work_struct hint: flush_to_ldisc+0x0/0x1a0

2012-10-09 Thread Dave Jones
Just hit this.. WARNING: at lib/debugobjects.c:261 debug_print_object+0x8c/0xb0() ODEBUG: free active (active state 0) object type: work_struct hint: flush_to_ldisc+0x0/0x1a0 Modules linked in: fuse ipt_ULOG nfnetlink tun binfmt_misc nfc caif_socket caif phonet can llc2 pppoe pppox ppp_generic s

Re: ODEBUG: free active (active state 0) object type: work_struct hint: flush_to_ldisc+0x0/0x1a0

2012-10-10 Thread Dave Jones
On Wed, Oct 10, 2012 at 10:11:44AM +0200, Jiri Slaby wrote: > On 10/10/2012 06:26 AM, Dave Jones wrote: > > Just hit this.. > > That'd be me perhaps. Do you have some serial device connected? Or is it > a pure terminals + ptys? There's a usb serial tty conn

Re: 3.6.0: WARNING: at arch/x86/kernel/apic/ipi.c:109 default_send_IPI_mask_logical+0x97/0xc7()

2012-10-17 Thread Dave Jones
On Tue, Oct 02, 2012 at 01:26:31PM +0200, Ralf Hildebrandt wrote: > WARNING: at arch/x86/kernel/apic/ipi.c:109 > default_send_IPI_mask_logical+0x97/0xc7() > Hardware name: ProLiant DL360 G4 > empty IPI mask > Modules linked in: nfnetlink_log nfnetlink ipv6 tg3 microcode rng_core > psmouse

Re: suspicious RCU usage in cgroup

2012-10-17 Thread Dave Jones
On Fri, Oct 05, 2012 at 06:06:12PM -0400, Aristeu Rozanski wrote: > Hi Dave, > On Fri, Oct 05, 2012 at 05:59:29PM -0400, Dave Jones wrote: > > On boot in Linus' current tree.. > > > > === > > [ INFO: suspicious RCU u

Re: [patch for-3.7] mm, mempolicy: fix printing stack contents in numa_maps

2012-10-17 Thread Dave Jones
On Tue, Oct 16, 2012 at 10:24:32PM -0700, David Rientjes wrote: > On Wed, 17 Oct 2012, Dave Jones wrote: > > > BUG: sleeping function called from invalid context at kernel/mutex.c:269 > > Hmm, looks like we need to change the refcount semantics entirely. We

Re: [patch for-3.7] mm, mempolicy: fix printing stack contents in numa_maps

2012-10-17 Thread Dave Jones
On Wed, Oct 17, 2012 at 12:21:10PM -0700, David Rientjes wrote: > On Wed, 17 Oct 2012, Dave Jones wrote: > > > On Tue, Oct 16, 2012 at 10:24:32PM -0700, David Rientjes wrote: > > > On Wed, 17 Oct 2012, Dave Jones wrote: > > > > > > > BUG: sleepi

Re: [patch for-3.7] mm, mempolicy: fix printing stack contents in numa_maps

2012-10-17 Thread Dave Jones
ces as a result of this change causing > the mempolicies to never be freed. ("numa_policy" turns out to be > policy_cache in the code, so thanks for checking both of them.) > > Could I add your tested-by? Sure. Here's a fresh one I just baked. Tested-by: Dave Jones

MAX_LOCKDEP_ENTRIES too low (called from ioc_release_fn)

2012-10-17 Thread Dave Jones
Triggered while fuzz testing.. BUG: MAX_LOCKDEP_ENTRIES too low! turning off the locking correctness validator. Pid: 22788, comm: kworker/2:1 Not tainted 3.7.0-rc1+ #34 Call Trace: [] add_lock_to_list.isra.29.constprop.45+0xdd/0xf0 [] __lock_acquire+0x1121/0x1ba0 [] lock_acquire+0xa2/0x220 []

[PATCH] print reason for failure in kcmp_test

2012-10-18 Thread Dave Jones
I was curious why sys_kcmp wasn't working, which led me to the testcase. It turned out I hadn't enabled CHECKPOINT_RESTORE in the kernel I was testing. Add a decoding of errno to the testcase to make that obvious. Signed-off-by: Dave Jones diff --git a/tools/testing/selftests/kcmp/k
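
The idea of decoding errno in a test case can be sketched roughly as follows. This is only an illustration, not the code from the patch: `describe_failure` is a hypothetical helper, and the `ENOSYS` value in the usage note is an assumption about what a kernel built without the syscall would report.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from the patch): format a message of
 * the form "<what> failed: <reason> (errno N)" into buf, so that a
 * failing test says why it failed instead of only that it failed.
 * Returns whatever snprintf() returns.
 */
static int describe_failure(char *buf, size_t len, const char *what, int err)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%s failed: %s (errno %d)",
			what, strerror(err), err);
}
```

A test could call `describe_failure(buf, sizeof(buf), "sys_kcmp", errno)` right after the failing call and print `buf`; with `ENOSYS` the decoded reason would point at a missing syscall rather than at a bug in the test.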

Re: MAX_LOCKDEP_ENTRIES too low (called from ioc_release_fn)

2012-10-18 Thread Dave Jones
On Thu, Oct 18, 2012 at 07:53:08AM +0200, Jens Axboe wrote: > On 2012-10-18 03:53, Dave Jones wrote: > > Triggered while fuzz testing.. > > > > > > BUG: MAX_LOCKDEP_ENTRIES too low! > > turning off the locking correctness validator. > > Pid: 22788,

weird use-after-free bug in module_put

2012-10-19 Thread Dave Jones
I've hit this twice in the last two days while fuzz testing. (Both times on i686 only, my x86-64 tests aren't hitting it for some reason). BUG: unable to handle kernel paging request at 6b6b6ce3 IP: [] module_put+0x1e/0x160 *pdpt = 25a4b001 *pde = Oops: [#1] PREEMPT

Re: weird use-after-free bug in module_put

2012-10-19 Thread Dave Jones
On Fri, Oct 19, 2012 at 10:43:51AM -0400, Dave Jones wrote: > I've hit this twice in the last two days while fuzz testing. > (Both times on i686 only, my x86-64 tests aren't hitting it > for some reason). > > BUG: unable to handle kernel paging request at 6b6b6ce3

Re: MAX_LOCKDEP_ENTRIES too low (called from ioc_release_fn)

2012-10-19 Thread Dave Jones
On Fri, Oct 19, 2012 at 02:49:32PM +0200, Peter Zijlstra wrote: > Of course, if you do run out of lock classes, the next thing to do is > to find the offending lock classes. First, the following command gives > you the number of lock classes currently in use along with the maximum: > >

rs600 warn on while booting 3.7-rc8

2012-12-08 Thread Dave Jones
[ 66.104790] WARNING: at drivers/gpu/drm/radeon/rs600.c:571 rs600_irq_set+0x1e2/0x200() [ 66.105748] Hardware name: GA-MA78GM-S2H [ 66.106220] Can't enable IRQ/MSI because no handler is installed [ 66.106935] Modules linked in: [ 66.107329] Pid: 43, comm: kworker/0:1 Not tainted 3.7.0-rc

Re: livelock in __writeback_inodes_wb ?

2012-12-11 Thread Dave Jones
On Tue, Dec 11, 2012 at 04:23:27PM +0800, Fengguang Wu wrote: > On Wed, Nov 28, 2012 at 09:55:15AM -0500, Dave Jones wrote: > > We had a user report the soft lockup detector kicked after 22 > > seconds of no progress, with this trace.. > > Where is the original report?

null dereference at r100_debugfs_cp_ring_info+0x115/0x140

2012-12-11 Thread Dave Jones
(Taint comes from previous r600 bug reported here https://lkml.org/lkml/2012/12/8/131) [35662.070628] BUG: unable to handle kernel NULL pointer dereference at (null) [35662.071719] IP: [] r100_debugfs_cp_ring_info+0x115/0x140 [35662.072652] PGD b4c17067 PUD b69d1067 PMD 0 [35662.07324

3.7 XFS lockdep trace

2012-12-11 Thread Dave Jones
This says rc8+, but it's just missing the Makefile change, so it's still there in 3.7 Curious that firefox was the process mentioned here, as ~/.mozilla isn't on xfs. My only xfs partition is /data holding a kernel source tree & .ccache Dave [30557.769727] ==

[PATCH] Print loaded modules when we encounter a bad page map.

2012-12-11 Thread Dave Jones
When we see reports like https://bugzilla.redhat.com/show_bug.cgi?id=883576 it might be useful to know what modules had been loaded, so they can be compared with similar reports to see if there is a common suspect. Signed-off-by: Dave Jones diff --git a/mm/memory.c b/mm/memory.c index 221fc9f

3.7 watchdog debugobjects warning

2012-12-11 Thread Dave Jones
Looks like we're doing a double-init on a timer. I had been experimenting with powertop, so that may have triggered something maybe suspend/resume related ? (says -rc8, but it's only missing the Makefile change) [14844.560489] WARNING: at lib/debugobjects.c:261 debug_print_object+0x8c/0xb0() [14

WARNING: at drivers/tty/tty_buffer.c:476 flush_to_ldisc+0x1de/0x1f0()

2012-12-11 Thread Dave Jones
Fuzz-testing fallout from post 3.7 tree as of commit 414a6750e59b0b687034764c464e9ddecac0f7a6 [ 2181.230579] [ cut here ] [ 2181.231277] WARNING: at drivers/tty/tty_buffer.c:476 flush_to_ldisc+0x1de/0x1f0() [ 2181.232358] Hardware name: GA-MA78GM-S2H [ 2181.232925] tty is

Re: [PATCH] tmpfs: fix shmem_getpage_gfp VM_BUG_ON

2012-11-13 Thread Dave Jones
On Tue, Nov 13, 2012 at 07:50:25PM -0800, Hugh Dickins wrote: > Originally I was waiting to hear further from Dave; but his test > machine was giving trouble, and it occurred to me that, never mind > whether he says he has hit it again, or he has not hit it again, > the answer is the same: do

Re: [PATCH] tmpfs: fix shmem_getpage_gfp VM_BUG_ON

2012-11-06 Thread Dave Jones
On Mon, Nov 05, 2012 at 05:32:41PM -0800, Hugh Dickins wrote: > -/* We already confirmed swap, and make no allocation */ > -VM_BUG_ON(error); > +/* > + * We already confirmed swap under page lock, and make > +

Re: tty, vt: lockdep warnings

2012-11-06 Thread Dave Jones
On Tue, Nov 06, 2012 at 04:11:00PM +, Alan Cox wrote: > > But a deadlock we have lived with for years. Without reverting, > > we're prevented from discovering all the new deadlocks we're adding. > > We lived with it locking boxes up on users but not knowing why. Circa 3.5 we got a lot m

sched_debug / traverse allocation failures.

2012-11-06 Thread Dave Jones
While fuzz-testing, I frequently run into this.. trinity-child4: page allocation failure: order:4, mode:0x40d0 Pid: 21842, comm: trinity-child4 Not tainted 3.7.0-rc4+ #54 Call Trace: [] warn_alloc_failed+0xe9/0x150 [] ? __alloc_pages_direct_compact+0x1f8/0x209 [] __alloc_pages_nodemask+0x936/0x

Re: [RFC 2/2] procfs: /proc/sched_debug fails on very very large machines.

2012-11-06 Thread Dave Jones
On Tue, Nov 06, 2012 at 03:02:21PM -0600, Nathan Zimmer wrote: > On systems with 4096 cores attempting to read /proc/sched_debug fails. > We are trying to push all the data into a single kmalloc buffer. > The issue is on these very large machines all the data will not fit in 4mb. > > A better

Re: [RFC 2/2] procfs: /proc/sched_debug fails on very very large machines.

2012-11-06 Thread Dave Jones
On Tue, Nov 06, 2012 at 05:24:15PM -0600, Nathan Zimmer wrote: > On Tue, Nov 06, 2012 at 04:31:28PM -0500, Dave Jones wrote: > > On Tue, Nov 06, 2012 at 03:02:21PM -0600, Nathan Zimmer wrote: > > > On systems with 4096 cores attempting to read /proc/sched_debug fails. >

Re: [PATCH] tcp: Replace infinite loop on recvmsg bug with proper crash

2012-11-06 Thread Dave Jones
On Tue, Nov 06, 2012 at 04:15:35PM -0800, Julius Werner wrote: > tcp_recvmsg contains a sanity check that WARNs when there is a gap > between the socket's copied_seq and the first buffer in the > sk_receive_queue. In theory, the TCP stack makes sure that This Should > Never Happen (TM)... howev

Re: [PATCH] tcp: Replace infinite loop on recvmsg bug with proper crash

2012-11-07 Thread Dave Jones
On Tue, Nov 06, 2012 at 05:51:19PM -0800, Julius Werner wrote: > > We've had reports of this WARN against the Fedora kernel for a while. > > Had this been immediately followed by a BUG(), we'd have never seen those > > traces at all, > > and just got "my machine just locked up" reports instead

Re: [PATCH] tcp: Replace infinite loop on recvmsg bug with proper crashusers

2012-11-07 Thread Dave Jones
On Wed, Nov 07, 2012 at 08:29:12AM -0800, Eric Dumazet wrote: > On Wed, 2012-11-07 at 10:54 -0500, Dave Jones wrote: > > > It sounds more appropriate to me, instead of silently wedging the box. > > At least with that approach we have a chance of finding out what happened.

Re: [PATCH] tcp: Replace infinite loop on recvmsg bug with proper crashusers

2012-11-07 Thread Dave Jones
On Wed, Nov 07, 2012 at 09:05:02AM -0800, Eric Dumazet wrote: > On Wed, 2012-11-07 at 11:43 -0500, Dave Jones wrote: > > > dude, look at the bug reports I just pointed you at. > > People _are_ aware there are bugs there. > > > If I remember well, I helped to fix

Re: [PATCH] tmpfs: fix shmem_getpage_gfp VM_BUG_ON

2012-11-07 Thread Dave Jones
On Tue, Nov 06, 2012 at 03:48:20PM -0800, Hugh Dickins wrote: > > [ cut here ] > > WARNING: at mm/shmem.c:1151 shmem_getpage_gfp+0xa5c/0xa70() > > Hardware name: 2012 Client Platform > > Pid: 21798, comm: trinity-child4 Not tainted 3.7.0-rc4+ #54 > > That's the very

schedule_timeout: wrong timeout value fffffffffffffff0

2013-01-02 Thread Dave Jones
This happened to a box I left running fuzz tests over the holidays. schedule_timeout: wrong timeout value fff0 Pid: 6606, comm: trinity-child1 Not tainted 3.8.0-rc1+ #43 Call Trace: [] schedule_timeout+0x305/0x340 [] ? preempt_schedule+0x42/0x60 [] ? _raw_spin_unlock_irqrestore+0x7

WARNING: at drivers/tty/tty_buffer.c:476 (tty is NULL)

2013-01-02 Thread Dave Jones
This happened a few times to my test boxes I left running over the holidays.. [ 8419.797533] [ cut here ] [ 8419.798341] WARNING: at drivers/tty/tty_buffer.c:476 flush_to_ldisc+0x1de/0x1f0() [ 8419.800313] Hardware name: GA-MA78GM-S2H [ 8419.800887] tty is NULL [ 8419.8018

memory corruption, possibly caused by i915

2013-01-02 Thread Dave Jones
We've had an increased number of reports in the last six months or so from Fedora users getting corrupted page tables. At first I wrote it off to bad hardware, but they started happening frequently enough that I began to wonder if it was a real problem. The only common thing I could think of was th

order 4 alloc failures in security_context_to_sid_core

2013-01-02 Thread Dave Jones
Along the same lines as 779302e67835fe9a6b74327e54969ba59cb3478a, xattrs can cause big allocations, which are likely to fail under memory pressure.. [20539.081122] trinity-child3: page allocation failure: order:4, mode:0x1040d0 [20539.090405] Pid: 27617, comm: trinity-child3 Not tainted 3.8.0-rc1+

Re: memory corruption, possibly caused by i915

2013-01-02 Thread Dave Jones
On Wed, Jan 02, 2013 at 11:01:15AM -0500, Chris Mason wrote: > > [52460.280346] BUG: Bad page map in process panel-6-systray > > pte:8800b665a0e8 pmd:b6659067 > > [52460.280848] addr:0038bf3fd000 vm_flags:0070 anon_vma: > > (null) mapping:88011052fd98 index:1fd > >

oops in copy_page_rep()

2013-01-05 Thread Dave Jones
I have no idea what happened here, but this is the first time I've seen this one. This was running a tree pulled yesterday afternoon. BUG: unable to handle kernel paging request at 880100201000 IP: [] copy_page_rep+0x5/0x10 PGD 1c0c063 PUD cfbff067 PMD cfc01067 PTE 800100201160 Oops:

mutex warning in intel_cacheinfo.c:cpu_list_show

2012-12-17 Thread Dave Jones
(At least I think that's where 'cpu_list_show' comes from... those preprocessor tricks confuse ctags) Just started seeing this today.. (fwiw, cpu is a Phenom(tm) 9750) Dave WARNING: at kernel/mutex.c:198 mutex_lock_nested+0x39c/0x3b0() Hardware name: GA-MA78GM-S2H Modules linked in: hid

CPU hotplug lockdep trace during offline.

2012-12-20 Thread Dave Jones
From Linus' tree as of a half hour ago. echo 0 > /sys/devices/system/cpu/cpu1/online [ 67.675171] == [ 67.676121] [ INFO: possible circular locking dependency detected ] [ 67.677084] 3.7.0+ #34 Not tainted [ 67.677641] ---

nfsd oops on Linus' current tree.

2012-12-21 Thread Dave Jones
Did a mount from a client (also running Linus current), and the server spat this out.. [ 6936.306135] [ cut here ] [ 6936.306154] WARNING: at net/sunrpc/clnt.c:617 rpc_shutdown_client+0x12a/0x1b0 [sunrpc]() [ 6936.306156] Hardware name: [ 6936.306157] Modules link

Re: pi futex oops in __lock_acquire

2012-11-20 Thread Dave Jones
On Wed, Oct 24, 2012 at 09:44:07PM -0700, Darren Hart wrote: > > I've been able to trigger this for the last week or so. > > Unclear whether this is a new bug, or my fuzzer got smarter, but I see the > > pi-futex code hasn't changed since the last time it found something.. > > > > > BUG: u

WARNING: at kernel/watchdog.c:245 watchdog_overflow_callback+0x98/0xc0()

2013-04-17 Thread Dave Jones
Slightly old kernel, but likely still relevant judging by recent commits. Hit an oom condition while fuzz testing, but what's interesting is what happened immediately afterwards.. (the lockup) Could this just be that the kernel was so busy recovering from swapping etc that watchdog perceived that

Re: rcu_preempt running flat out on idle desktop.

2013-06-12 Thread Dave Jones
On Thu, Jun 06, 2013 at 05:43:13PM +0200, Frederic Weisbecker wrote: > > Every process 200% or 0%. > > I see, would you mind testing this branch? > > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git > timers/urgent > > It might help, I specially think about > 45eacc69277

WARNING: at drivers/scsi/scsi_lib.c:1196 scsi_setup_fs_cmnd during RAID5 startup.

2013-06-12 Thread Dave Jones
3.10 seems to have a problem with dirty RAID5 sets. I've got a machine that panics on boot during RAID5 activation. After switching the BUG_ON to a WARN_ON, I was able to get this over serial console.. md/raid:md0: not clean -- starting background reconstruction md/raid:md0: device sdd1 operatio

Re: WARNING: at drivers/scsi/scsi_lib.c:1196 scsi_setup_fs_cmnd during RAID5 startup.

2013-06-12 Thread Dave Jones
On Wed, Jun 12, 2013 at 03:43:46PM -0400, Dave Jones wrote: > 3.10 seems to have a problem with dirty RAID5 sets. > > I've got a machine that panics on boot during RAID5 activation. > After switching the BUG_ON to a WARN_ON, I was able to get this over serial > console.

[3.10-rc6] WARNING: at fs/btrfs/inode.c:7961 btrfs_destroy_inode+0x265/0x2e0 [btrfs]()

2013-06-17 Thread Dave Jones
Hit this while running this script in a loop.. https://github.com/kernelslacker/io-tests/blob/master/setup.sh [34385.251507] [ cut here ] [34385.254068] WARNING: at fs/btrfs/inode.c:7961 btrfs_destroy_inode+0x265/0x2e0 [btrfs]() [34385.257275] Modules linked in: vmw_vsock_

Re: [PATCH 5/5] cpufreq:boost:Kconfig: Enable boost support at Kconfig

2013-06-06 Thread Dave Jones
On Thu, Jun 06, 2013 at 09:07:52AM +0200, Lukasz Majewski wrote: > +config CPU_FREQ_BOOST > +bool "CPU frequency boost support" > +help > + Switch to enable support for frequency boost > + > + If in doubt, say N. > + This help text is devoid of any useful information. On

Re: [PATCH 0/3] Increase the number of USB to serial devices we can support at once

2013-06-06 Thread Dave Jones
On Wed, Jun 05, 2013 at 10:54:26AM -0700, Greg KH wrote: > Here are 3 patches that I've tested out on my system with only a small > number of devices, but it seems to work, so why not let others try it > out... > > These patches make the USB to serial core have the ability to support up > to

Re: [PATCH 5/5] cpufreq:boost:Kconfig: Enable boost support at Kconfig

2013-06-06 Thread Dave Jones
On Thu, Jun 06, 2013 at 05:14:31PM +0200, Lukasz Majewski wrote: > Hi Dave, > > > On Thu, Jun 06, 2013 at 09:07:52AM +0200, Lukasz Majewski wrote: > > > > > +config CPU_FREQ_BOOST > > > + bool "CPU frequency boost support" > > > + help > > > + Switch to enable supp

Re: rcu_preempt running flat out on idle desktop.

2013-06-06 Thread Dave Jones
On Tue, May 14, 2013 at 03:21:07AM +0200, Frederic Weisbecker wrote: > On Thu, May 09, 2013 at 05:10:26PM -0400, Dave Jones wrote: > > On Thu, May 09, 2013 at 11:02:08PM +0200, Frederic Weisbecker wrote: > > > > > > RCU options for this build are.. > >

Re: [Qemu-devel] [PATCH] virtio-net: put virtio net header inline with data

2013-06-06 Thread Dave Jones
On Thu, Jun 06, 2013 at 02:59:44PM -0500, Jesse Larrew wrote: > >pr_debug("%s: xmit %p %pM\n", vi->dev->name, skb, dest); > > + if (vi->mergeable_rx_bufs) > > + hdr_len = sizeof hdr->mhdr; > > + else > > + hdr_len = sizeof hdr->hdr; > > All conditionals need braces

Re: rcu_preempt running flat out on idle desktop.

2013-06-06 Thread Dave Jones
On Thu, Jun 06, 2013 at 05:43:13PM +0200, Frederic Weisbecker wrote: > > Every process 200% or 0%. > > I see, would you mind testing this branch? > > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git > timers/urgent > > It might help, I specially think about > 45eacc6927

Re: [3.10-rc6] WARNING: at fs/btrfs/inode.c:7961 btrfs_destroy_inode+0x265/0x2e0 [btrfs]()

2013-06-17 Thread Dave Jones
On Mon, Jun 17, 2013 at 01:39:42PM -0400, Chris Mason wrote: > Quoting Dave Jones (2013-06-17 09:49:55) > > Hit this while running this script in a loop.. > > https://github.com/kernelslacker/io-tests/blob/master/setup.sh > > [34385.251507] -

Re: [3.10-rc6] WARNING: at fs/btrfs/inode.c:7961 btrfs_destroy_inode+0x265/0x2e0 [btrfs]()

2013-06-17 Thread Dave Jones
On Mon, Jun 17, 2013 at 02:42:27PM -0400, Chris Mason wrote: > Quoting Dave Jones (2013-06-17 14:20:06) > > On Mon, Jun 17, 2013 at 01:39:42PM -0400, Chris Mason wrote: > > > Quoting Dave Jones (2013-06-17 09:49:55) > > > > Hit this while running this scri

btrfs triggered 'MAX_LOCKDEP_CHAINS too low'

2013-06-17 Thread Dave Jones
Something else I've seen a few times from my io script (Always during btrfs runs)... BUG: MAX_LOCKDEP_CHAINS too low! turning off the locking correctness validator. Please attach the output of /proc/lock_stat to the bug report CPU: 1 PID: 492255 Comm: kworker/u8:0 Not tainted 3.10.0-rc6+ #6 Hardwa

EDAC BUG: key ee78135c not in .data!

2013-06-17 Thread Dave Jones
I see this during boot-up: i5k_amb: probe of i5k_amb.0 failed with error -16 EDAC MC: Ver: 3.0.0 BUG: key ee78135c not in .data! [ cut here ] WARNING: at kernel/lockdep.c:2987 lockdep_init_map+0x34f/0x37c() DEBUG_LOCKS_WARN_ON(1) Modules linked in: i5000_edac(+) edac_core

[3.10rc6] /proc/dri/0/vma broken on nouveau.

2013-06-17 Thread Dave Jones
Reading /proc/dri/0/vma causes bad things to happen on a box with nouveau loaded. (Note, no X running on that box) Trace below shows trinity, but I can reproduce it with just cat /proc/dri/0/vma [ cut here ] kernel BUG at arch/x86/mm/physaddr.c:79! invalid opcode: [#

Re: [3.10rc6] /proc/dri/0/vma broken on nouveau.

2013-06-17 Thread Dave Jones
On Mon, Jun 17, 2013 at 09:49:27PM -0400, David Airlie wrote: > > > Reading /proc/dri/0/vma causes bad things to happen on a box with nouveau > > loaded. > > (Note, no X running on that box) > > > > Trace below shows trinity, but I can reproduce it with just cat > > /proc/dri/0/vma > >

[PATCH 3/7] staging/rtl8192u: remove commented out __list_for_each usage

2013-06-17 Thread Dave Jones
Also remove another commented out open-coded list manipulation while we're there. Signed-off-by: Dave Jones --- drivers/staging/rtl8192u/ieee80211/ieee80211_rx.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211_rx.c b/dr

[PATCH 5/7] sctp: Convert __list_for_each use to list_for_each

2013-06-17 Thread Dave Jones
Signed-off-by: Dave Jones --- net/sctp/protocol.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) netdev: patches 1-4 & 6 are independent to this, and 7 won't be merged until this one gets to Linus' tree. diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c index ea

[PATCH 2/7] ipw2200: Convert __list_for_each usage to list_for_each

2013-06-17 Thread Dave Jones
Signed-off-by: Dave Jones --- drivers/net/wireless/ipw2x00/ipw2200.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/wireless/ipw2x00/ipw2200.c b/drivers/net/wireless/ipw2x00/ipw2200.c index d96257b..4ed5e45 100644 --- a/drivers/net/wireless/ipw2x00/ipw2200.c

[PATCH 1/7] radeon: Remove redundant __list_for_each definition from mkregtable.c

2013-06-17 Thread Dave Jones
Signed-off-by: Dave Jones --- drivers/gpu/drm/radeon/mkregtable.c | 13 - 1 file changed, 13 deletions(-) diff --git a/drivers/gpu/drm/radeon/mkregtable.c b/drivers/gpu/drm/radeon/mkregtable.c index 5a82b6b..af85299 100644 --- a/drivers/gpu/drm/radeon/mkregtable.c +++ b/drivers/gpu

[PATCH 4/7] staging/rtl8187se: Convert __list_for_each use to list_for_each

2013-06-17 Thread Dave Jones
Also remove commented out list manipulation Signed-off-by: Dave Jones --- drivers/staging/rtl8187se/ieee80211/ieee80211_rx.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/staging/rtl8187se/ieee80211/ieee80211_rx.c b/drivers/staging/rtl8187se/ieee80211

[PATCH 7/7] list: Remove __list_for_each

2013-06-17 Thread Dave Jones
__list_for_each used to be the non prefetch() aware list walking primitive. When we removed the prefetch macros from the list routines, it became redundant. Given it does exactly the same thing as list_for_each now, we might as well remove it and call list_for_each directly. Signed-off-by: Dave
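
For readers unfamiliar with the primitive, the walk that list_for_each performs can be sketched in userspace as below. This is a minimal stand-in under stated assumptions, not the kernel header itself: the macro follows the usual shape of a circular doubly linked list walk, and `list_init`, `list_add_tail`, and `list_count` are simplified helpers written only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's circular doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* Walk every node after head until the walk arrives back at head. */
#define list_for_each(pos, head) \
	for (pos = (head)->next; pos != (head); pos = pos->next)

/* An empty list is a head that points at itself in both directions. */
static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Link node in just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Count entries by walking the list with list_for_each. */
static size_t list_count(struct list_head *head)
{
	struct list_head *pos;
	size_t n = 0;

	assert(head != NULL);
	list_for_each(pos, head)
		n++;
	return n;
}
```

With no prefetch hint in the loop, a second macro with the same body adds nothing, which is the redundancy the patch description points at.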

[PATCH 6/7] sound/usb/misc/ua101.c: convert __list_for_each usage to list_for_each

2013-06-17 Thread Dave Jones
Signed-off-by: Dave Jones --- sound/usb/misc/ua101.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/usb/misc/ua101.c b/sound/usb/misc/ua101.c index 6ad617b..8b5d2c5 100644 --- a/sound/usb/misc/ua101.c +++ b/sound/usb/misc/ua101.c @@ -1349,7 +1349,7 @@ static void

[x86] only print out DR registers if they are not power-on defaults.

2013-06-17 Thread Dave Jones
The DR registers are rarely useful when decoding oopses. With screen real estate during oopses at a premium, we can save two lines by only printing out these registers when they are set to something other than their power-on state. Signed-off-by: Dave Jones diff -durpN '--exclude-from=

Re: [x86] only print out DR registers if they are not power-on defaults.

2013-06-18 Thread Dave Jones
On Tue, Jun 18, 2013 at 10:43:56AM +0200, Borislav Petkov wrote: > On Tue, Jun 18, 2013 at 12:11:32AM -0400, Dave Jones wrote: > > The DR registers are rarely useful when decoding oopses. > > With screen real estate during oopses at a premium, we can save two lines > >

Re: Another RCU trace. (3.10-rc5)

2013-06-18 Thread Dave Jones
On Tue, Jun 18, 2013 at 11:58:21AM +0200, Peter Zijlstra wrote: > > Peter, Are you going to take the preempt_schedule_context() patch? > > I have it queued, I just seem to have some problems locating Ingo to > stuff patches into -tip :/ > > Will continue prodding.. Ingo if you're reading!

[x86][v2] only print out DR registers if they are not power-on defaults.

2013-06-18 Thread Dave Jones
-off-by: Dave Jones diff -durpN '--exclude-from=/home/davej/.exclude' /home/davej/src/kernel/git-trees/linux/arch/x86/kernel/process_64.c linux-dj/arch/x86/kernel/process_64.c --- linux/arch/x86/kernel/process_64.c 2013-05-01 10:02:52.064151923 -0400 +++ linux-dj/arch/x86/kernel/pr

Re: [PATCH 05/13] cpufreq: e_powersave: call CPUFREQ_POSTCHANGE notfier in error cases

2013-06-19 Thread Dave Jones
On Wed, Jun 19, 2013 at 08:24:41PM +0530, Viresh Kumar wrote: > On 19 June 2013 17:52, Simon Horman wrote: > > I have no objections to this change but at the same time I don't > > feel that I know the code well enough to review it. > > Probably I made a mistake adding your name. Don't know h

frequent softlockups with 3.10rc6.

2013-06-19 Thread Dave Jones
I've been hitting this a lot the last few days. This is the same machine that I was also seeing lockups during sync() Dave BUG: soft lockup - CPU#1 stuck for 22s! [trinity-child9:6902] Modules linked in: bridge snd_seq_dummy dlci bnep fuse 8021q garp stp hidp tun rfcomm can_raw ipt_ULOG

Re: frequent softlockups with 3.10rc6.

2013-06-19 Thread Dave Jones
On Wed, Jun 19, 2013 at 12:45:40PM -0400, Dave Jones wrote: > I've been hitting this a lot the last few days. > This is the same machine that I was also seeing lockups during sync() On a whim, I reverted 971394f389992f8462c4e5ae0e3b49a10a9534a3 (As I started seeing these just aft

Re: [3.10-rc6] WARNING: at fs/btrfs/inode.c:7961 btrfs_destroy_inode+0x265/0x2e0 [btrfs]()

2013-06-19 Thread Dave Jones
On Wed, Jun 19, 2013 at 02:02:33PM -0400, Chris Mason wrote: > Quoting Dave Jones (2013-06-17 14:58:10) > > On Mon, Jun 17, 2013 at 02:42:27PM -0400, Chris Mason wrote: > > > Quoting Dave Jones (2013-06-17 14:20:06) > > > > On Mon, Jun 17, 2013 at 01:39

Re: frequent softlockups with 3.10rc6.

2013-06-19 Thread Dave Jones
On Wed, Jun 19, 2013 at 11:13:02AM -0700, Paul E. McKenney wrote: > > On a whim, I reverted 971394f389992f8462c4e5ae0e3b49a10a9534a3 > > (As I started seeing these just after that rcu merge). > > > > It's only been 30 minutes, but it seems stable again. Normally I would > > hit these within

Re: frequent softlockups with 3.10rc6.

2013-06-19 Thread Dave Jones
On Wed, Jun 19, 2013 at 11:13:02AM -0700, Paul E. McKenney wrote: > On Wed, Jun 19, 2013 at 01:53:56PM -0400, Dave Jones wrote: > > On Wed, Jun 19, 2013 at 12:45:40PM -0400, Dave Jones wrote: > > > I've been hitting this a lot the last few days. > > > This is

Re: frequent softlockups with 3.10rc6.

2013-06-20 Thread Dave Jones
On Thu, Jun 20, 2013 at 09:16:52AM -0700, Paul E. McKenney wrote: > On Wed, Jun 19, 2013 at 08:12:12PM -0400, Dave Jones wrote: > > On Wed, Jun 19, 2013 at 11:13:02AM -0700, Paul E. McKenney wrote: > > > On Wed, Jun 19, 2013 at 01:53:56PM -0400, Dave Jones wrote: > >

Re: rcu_preempt running flat out on idle desktop.

2013-06-20 Thread Dave Jones
On Wed, Jun 12, 2013 at 11:34:07AM -0400, Dave Jones wrote: > On Thu, Jun 06, 2013 at 05:43:13PM +0200, Frederic Weisbecker wrote: > > > > Every process 200% or 0%. > > > > I see, would you mind testing this branch? > > > > git://git.kernel.o

Re: [RFC] raise the maximum number of usb-serial devices to 512

2013-06-03 Thread Dave Jones
On Mon, Jun 03, 2013 at 07:49:59PM -0700, Greg Kroah-Hartman wrote: > On Mon, May 27, 2013 at 02:28:51PM +0200, Bjørn Mork wrote: > > But, IMHO, a nicer approach would be to make the allocation completely > > dynamic, using e.g. the idr subsystem. Static tables are always feel > > like straight

Re: frequent softlockups with 3.10rc6.

2013-06-21 Thread Dave Jones
On Thu, Jun 20, 2013 at 09:16:52AM -0700, Paul E. McKenney wrote: > > > > > I've been hitting this a lot the last few days. > > > > > This is the same machine that I was also seeing lockups during > > sync() > > > > > > > > On a whim, I reverted 971394f389992f8462c4e5ae0e3b49a10a9534a3

Re: frequent softlockups with 3.10rc6.

2013-06-21 Thread Dave Jones
On Fri, Jun 21, 2013 at 09:59:49PM +0200, Oleg Nesterov wrote: > I am puzzled. And I do not really understand > > hardirqs last enabled at (2380318): [] > restore_args+0x0/0x30 > hardirqs last disabled at (2380319): [] > apic_timer_interrupt+0x6a/0x80 > softirqs last ena

Re: frequent softlockups with 3.10rc6.

2013-06-22 Thread Dave Jones
On Sat, Jun 22, 2013 at 07:31:29PM +0200, Oleg Nesterov wrote: > > [ 7485.261299] WARNING: at include/linux/nsproxy.h:63 > > get_proc_task_net+0x1c8/0x1d0() > > [ 7485.262021] Modules linked in: 8021q garp stp tun fuse rfcomm bnep hidp > > snd_seq_dummy nfnetlink scsi_transport_iscsi can_bc

Re: frequent softlockups with 3.10rc6.

2013-06-23 Thread Dave Jones
On Sun, Jun 23, 2013 at 04:36:34PM +0200, Oleg Nesterov wrote: > > > Dave, I am sorry but all I can do is to ask you to do more testing. > > > Could you please reproduce the lockup again on the clean Linus's > > > current ? (and _without_ reverting 8aac6270, of course). > > > > I'll give

Re: frequent softlockups with 3.10rc6.

2013-06-23 Thread Dave Jones
On Sun, Jun 23, 2013 at 06:04:52PM +0200, Oleg Nesterov wrote: > > [11018.927809] [sched_delayed] sched: RT throttling activated > > [11054.897670] BUG: soft lockup - CPU#2 stuck for 22s! > > [trinity-child2:14482] > > [11054.898503] Modules linked in: bridge stp snd_seq_dummy tun fuse hidp

Re: frequent softlockups with 3.10rc6.

2013-06-23 Thread Dave Jones
On Sun, Jun 23, 2013 at 06:04:52PM +0200, Oleg Nesterov wrote: > Could you please do the following: > > 1. # cd /sys/kernel/debug/tracing > # echo 0 >> options/function-trace > # echo preemptirqsoff >> current_tracer dammit. WARNING: at include/linux/list.h:385 rb_hea

Re: frequent softlockups with 3.10rc6.

2013-06-24 Thread Dave Jones
On Sun, Jun 23, 2013 at 06:04:52PM +0200, Oleg Nesterov wrote: > > [11054.897670] BUG: soft lockup - CPU#2 stuck for 22s! > > [trinity-child2:14482] > > [11054.898503] Modules linked in: bridge stp snd_seq_dummy tun fuse hidp > > bnep rfcomm can_raw ipt_ULOG can_bcm nfnetlink af_rxrpc llc2 ro

Re: frequent softlockups with 3.10rc6.

2013-06-24 Thread Dave Jones
On Mon, Jun 24, 2013 at 10:52:29AM -0400, Steven Rostedt wrote: > > > check_list_nodes corruption. next->prev should be prev > > > (88023b8a1a08), but was 0088023b8a1a. (next=880243288001). > > > > Can't find "check_list_nodes" in lib/list_debug.c or elsewhere... > > > > >

Re: frequent softlockups with 3.10rc6.

2013-06-24 Thread Dave Jones
On Mon, Jun 24, 2013 at 06:37:08PM +0200, Oleg Nesterov wrote: > On 06/24, Dave Jones wrote: > > > > On Mon, Jun 24, 2013 at 10:52:29AM -0400, Steven Rostedt wrote: > > > > > > > check_list_nodes corruption. next->prev should be prev > > (8

Re: frequent softlockups with 3.10rc6.

2013-06-24 Thread Dave Jones
On Mon, Jun 24, 2013 at 12:24:39PM -0400, Steven Rostedt wrote: > > Ah, this is the first victim of my new 'check sanity of nodes during list > > walks' patch. > > It's doing the same prev->next next->prev checking as list_add and friends. > > I'm looking at getting it into shape for a 3.12 m

Re: frequent softlockups with 3.10rc6.

2013-06-24 Thread Dave Jones
On Mon, Jun 24, 2013 at 07:35:10PM +0200, Oleg Nesterov wrote: > > Not sure this is helpful, but.. > > This makes me think that something is seriously broken. > > Or I do not understand this stuff at all. Quite possible too. > Steven, could you please help? > > But this is already call

Re: frequent softlockups with 3.10rc6.

2013-06-24 Thread Dave Jones
On Mon, Jun 24, 2013 at 01:53:11PM -0400, Steven Rostedt wrote: > > Also. watchdog_timer_fn() calls printk() only if it detects the > > lockup, so I assume you hit another one? > > Probably. Yeah, unfortunately it happened while I was travelling home to the box, so I couldn't stop it after t

Re: [PATCH RESEND v2 2/2] scsi: 64-bit port of buslogic driver

2013-06-24 Thread Dave Jones
On Mon, Jun 24, 2013 at 02:26:00PM -0600, Khalid Aziz wrote: > @@ -821,7 +821,7 @@ struct blogic_ccb { > unsigned char cdblen; /* Byte 2 */ > unsigned char sense_datalen;/* Byte 3 */ > u32 datalen;

Re: frequent softlockups with 3.10rc6.

2013-06-25 Thread Dave Jones
Took a lot longer to trigger this time. (13 hours of runtime). This trace may still not be from the first lockup, as a flood of them happened at the same time. # tracer: preemptirqsoff # # preemptirqsoff latency trace v1.1.5 on 3.10.0-rc7+ # -
