Re: NULL pointer deref in k.o/for-4.5
> On Jan 6, 2016, at 1:16 PM, Chuck Lever wrote:
>
> Encountered the below just after booting my NFS/RDMA
> server with 4.4.0-rc6-00011-g6948cb2 (k.o/for-4.5 plus
> my NFS/RDMA for-4.5 patches). The system is up and
> ping-able via eth0, but high-level networking (like sshd
> and nfsd) does not work, and my ib0 i/f is missing.
>
> This is an x86_64 system with one CX-3 Pro HCA.

And appears to be 100% reproducible. Any debugging advice welcome!

> All seems well with a stock v4.4-rc4 kernel.
>
>
> Jan 6 12:44:13 klimt kernel: mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v2.2-1 (Feb 2014)
> Jan 6 12:44:13 klimt kernel: mlx4_ib_add: counter index 0 for port 1 allocated 0
> Jan 6 12:44:13 klimt kernel: BUG: unable to handle kernel NULL pointer dereference at (null)
> Jan 6 12:44:13 klimt kernel: IP: [] __mutex_lock_slowpath+0x75/0x120
> Jan 6 12:44:13 klimt kernel: PGD 853947067 PUD 8546cb067 PMD 0
> Jan 6 12:44:13 klimt kernel: Oops: 0002 [#1] SMP
> Jan 6 12:44:13 klimt kernel: Modules linked in: mlx4_ib(+) mlx4_en ib_sa ib_mad ib_core vxlan ip6_udp_tunnel udp_tunnel ib_addr sr_mod cdrom sd_mod ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm mlx4_core igb ahci libahci libata ptp pps_core dca i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
> Jan 6 12:44:13 klimt kernel: CPU: 3 PID: 431 Comm: modprobe Not tainted 4.4.0-rc6-00011-g6948cb2 #79
> Jan 6 12:44:13 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
> Jan 6 12:44:13 klimt kernel: task: 88085571aa80 ti: 88084f414000 task.ti: 88084f414000
> Jan 6 12:44:13 klimt kernel: RIP: 0010:[] [] __mutex_lock_slowpath+0x75/0x120
> Jan 6 12:44:13 klimt kernel: RSP: 0018:88084f417810 EFLAGS: 00010282
> Jan 6 12:44:13 klimt kernel: RAX: RBX: 88084f633950 RCX: 88085571aa80
> Jan 6 12:44:13 klimt kernel: RDX: 0001 RSI: 88085571aae0 RDI: 88084f633954
> Jan 6 12:44:13 klimt kernel: RBP: 88084f417858 R08: 0101 R09: 880854f02f00
> Jan 6 12:44:13 klimt kernel: R10: a0150a85 R11: ea002156d400 R12: 88084f633954
> Jan 6 12:44:13 klimt kernel: R13: 88085571aa80 R14: R15: 88084f633958
> Jan 6 12:44:13 klimt kernel: FS: 7f32227c0740() GS:88087fcc() knlGS:
> Jan 6 12:44:13 klimt kernel: CS: 0010 DS: ES: CR0: 80050033
> Jan 6 12:44:13 klimt kernel: CR2: CR3: 000853cb6000 CR4: 001406e0
> Jan 6 12:44:13 klimt kernel: Stack:
> Jan 6 12:44:13 klimt kernel: 88084f633958 81309502 3b473ac0
> Jan 6 12:44:13 klimt kernel: 88084f633950 88084f417888 88084f633940 88084f633950
> Jan 6 12:44:13 klimt kernel: 88084f63 88084f417870 8165271f 88084f63
> Jan 6 12:44:13 klimt kernel: Call Trace:
> Jan 6 12:44:13 klimt kernel: [] ? get_from_free_list+0x42/0x50
> Jan 6 12:44:13 klimt kernel: [] mutex_lock+0x1f/0x2f
> Jan 6 12:44:13 klimt kernel: [] iboe_process_mad.isra.13+0x77/0x190 [mlx4_ib]
> Jan 6 12:44:13 klimt kernel: [] mlx4_ib_process_mad+0x4d4/0x550 [mlx4_ib]
> Jan 6 12:44:13 klimt kernel: [] ? kernfs_next_descendant_post+0x1a/0x50
> Jan 6 12:44:13 klimt kernel: [] ? kernfs_add_one+0x112/0x150
> Jan 6 12:44:13 klimt kernel: [] ? kmem_cache_alloc_trace+0x3d/0x1d0
> Jan 6 12:44:13 klimt kernel: [] ? get_perf_mad+0x85/0x160 [ib_core]
> Jan 6 12:44:13 klimt kernel: [] get_perf_mad+0xee/0x160 [ib_core]
> Jan 6 12:44:13 klimt kernel: [] get_counter_table+0x38/0x70 [ib_core]
> Jan 6 12:44:13 klimt kernel: [] ? kmem_cache_alloc_trace+0xf8/0x1d0
> Jan 6 12:44:13 klimt kernel: [] ? add_port+0xc2/0x450 [ib_core]
> Jan 6 12:44:13 klimt kernel: [] add_port+0x10f/0x450 [ib_core]
> Jan 6 12:44:13 klimt kernel: [] ib_device_register_sysfs+0xe8/0x160 [ib_core]
> Jan 6 12:44:13 klimt kernel: [] ib_register_device+0x320/0x500 [ib_core]
> Jan 6 12:44:13 klimt kernel: [] ? vprintk_default+0x3b/0x40
> Jan 6 12:44:13 klimt kernel: [] ? printk+0x5d/0x74
> Jan 6 12:44:13 klimt kernel: [] mlx4_ib_add+0xbb9/0xfe0 [mlx4_ib]
> Jan 6 12:44:13 klimt kernel: [] ? 0xa023f000
> Jan 6 12:44:13 klimt kernel: [] mlx4_add_device+0x3f/0xb0 [mlx4_core]
> Jan 6 12:44:13 klimt kernel: [] ? 0xa023f000
> Jan 6 12:44:13 klimt kernel: [] mlx4_register_interface+0xd2/0x100 [mlx4_core]
> Jan 6 12:44:13 klimt kernel: [] mlx4_ib_init+0x4c/0x1000 [mlx4_ib]
> Jan 6 12:44:13 klimt kernel: [] do_one_initcall+0x113/0x1f0
> Jan 6 12:44:13 klimt kernel: [] ? __vunmap+0xd7/0x100
> Jan 6 12:44:13 klimt kernel: [] ? kmem_cache_alloc_trace+0x3d/0x1d0
> Jan 6 12:44:13 klimt kernel: [] ? do_init_module+0x27/0x1e8
> Jan 6 12:44:13 klimt kernel: []
Re: NULL pointer deref in k.o/for-4.5
On Wed, Jan 6, 2016 at 9:20 PM, Chuck Lever wrote:
> And appears to be 100% reproducible. Any debugging
> advice welcome!

was reported here 2-3 times, this fixes that
https://patchwork.kernel.org/patch/7929551
Re: NULL pointer deref in k.o/for-4.5
On Thu, Jan 7, 2016 at 1:31 AM, Chuck Lever wrote:
>> On Jan 6, 2016, at 5:25 PM, Or Gerlitz wrote:
>> On Wed, Jan 6, 2016 at 9:20 PM, Chuck Lever
>> was reported here 2-3 times, this fixes that
>> https://patchwork.kernel.org/patch/7929551

> Confirmed, that fixes it. Thanks, I never would have
> guessed that was the fix.

tried linux-rdma mailing list search on people failing with the for-4.5 bits?