I can install 4.13 on the particular server I have problems with. Long term, it would be nice to get the fix backported to 4.4.

-----Original Message-----
From: Jon Maloy [mailto:[email protected]]
Sent: Thursday, November 30, 2017 09:16
To: Rune Torgersen <[email protected]>
Cc: Hoang Huu Le <[email protected]>; [email protected]; Ying Xue <[email protected]>; Mohan Krishna Ghanta Krishnamurthy <[email protected]>
Subject: RE: [tipc-discussion] FW: Kernel crash

> -----Original Message-----
> From: Rune Torgersen [mailto:[email protected]]
> Sent: Thursday, November 30, 2017 09:43
> To: Rune Torgersen <[email protected]>; Jon Maloy <[email protected]>
> Cc: Hoang Huu Le <[email protected]>; tipc-[email protected]; Ying Xue <[email protected]>; Mohan Krishna Ghanta Krishnamurthy <[email protected]>
> Subject: RE: [tipc-discussion] FW: Kernel crash
>
> I cannot seem to find that specific commit (even in the latest official kernel git).
> I was trying to see if it was in any of the newer Ubuntu kernels (4.10 or 4.13),
> as it would be easier to install one of those than to try to recompile it myself.

If that is an option for you, things would be much easier. I would try 4.13 or maybe even 4.14. The latter contains all the latest bug fixes, but also some new functionality.

///jon

>
> -----Original Message-----
> From: Rune Torgersen [mailto:[email protected]]
> Sent: Thursday, November 30, 2017 08:14
> To: Jon Maloy <[email protected]>
> Cc: Hoang Huu Le <[email protected]>; tipc-[email protected]; Ying Xue <[email protected]>; Mohan Krishna Ghanta Krishnamurthy <[email protected]>
> Subject: Re: [tipc-discussion] FW: Kernel crash
>
> We tend to use stock kernel as much as possible, but I’ll try to see if I can compile the tipc module for the specific system we see this on the most and see if it helps.
>
> From: Jon Maloy [mailto:[email protected]]
> Sent: Wednesday, November 29, 2017 16:43
> To: Rune Torgersen <[email protected]>
> Cc: [email protected]; Mohan Krishna Ghanta Krishnamurthy <[email protected]>; Canh Duc Luu <[email protected]>; Hoang Huu Le <[email protected]>; Ying Xue <[email protected]>; Tung Quang Nguyen <[email protected]>
> Subject: RE: [tipc-discussion] FW: Kernel crash
>
> Hi Rune,
> I can at least ack the reception now. It seems like most of our tipc-discussion subscribers (including myself, the project owner) were automatically removed from the lists at some recent moment. Another screw-up by SF…
>
> As far as I can see, you are missing the following:
> commit d4091899c9bbf (“tipc: hold subscriber->lock for tipc_nametbl_subscribe()”)
>
> This was delivered to 4.5, but doesn’t seem to ever have made it into 4.4.
> I am surprised, because you seem to have reported exactly the same issue on March 22nd last year, and I believed it was fixed.
>
> Do you apply patches yourself, so you could try this one? If it solves the issue and applies cleanly I could post it to stable_4.4.
>
> BR
> ///jon
>
>
> From: Jon Maloy [mailto:[email protected]]
> Sent: Wednesday, November 29, 2017 15:42
> To: Jon Maloy <[email protected]>
> Subject: Fwd: [tipc-discussion] FW: Kernel crash
>
> ----- Forwarded message -----
> From: Rune Torgersen <[email protected]>
> To: "[email protected]" <tipc-[email protected]>
> Sent: Wednesday, November 29, 2017 9:06
> Subject: [tipc-discussion] FW: Kernel crash
>
> (Resending as I think it got lost somewhere).
>
> A bug that I thought had been fixed is rearing its ugly head again in the latest Ubuntu 16.04 LTS kernel (4.4.0-97). It is happening to me quite frequently (2-3 times a week).
>
> The application where this happens uses lots of short lived sockets, and also lots of short-lived connections to the topology server.
>
> [151611.149711] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
> [151611.149946] IP: [<ffffffffc046a0a2>] tipc_nametbl_unsubscribe+0x72/0x100 [tipc]
> [151611.150069] PGD 0
> [151611.150104] Oops: 0002 [#1] SMP
> [151611.150160] Modules linked in: tipc ip6_udp_tunnel udp_tunnel intel_powerclamp coretemp kvm_intel gpio_ich input_leds joydev kvm irqbypass i7core_edac edac_core serio_raw lpc_ich shpchp hpilo 8250_fintek ipmi_ssif acpi_power_meter mac_hid lp parport ipmi_watchdog ipmi_si ipmi_devintf ipmi_msghandler autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear amdkfd amd_iommu_v2 radeon hid_generic i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect psmouse sysimgblt fb_sys_fops usbhid drm hid pata_acpi bnx2 hpsa netxen_nic scsi_transport_sas fjes
> [151611.151291] CPU: 1 PID: 14873 Comm: kworker/u64:2 Tainted: G I 4.4.0-97-generic #120-Ubuntu
> [151611.151429] Hardware name: HP ProLiant DL360 G6, BIOS P64 05/15/2010
> [151611.151547] Workqueue: tipc_rcv tipc_recv_work [tipc]
> [151611.151631] task: ffff880213c98cc0 ti: ffff8802131b8000 task.ti: ffff8802131b8000
> [151611.151740] RIP: 0010:[<ffffffffc046a0a2>] [<ffffffffc046a0a2>] tipc_nametbl_unsubscribe+0x72/0x100 [tipc]
> [151611.151889] RSP: 0018:ffff88021f443e10 EFLAGS: 00010246
> [151611.151967] RAX: ffff880213d87f80 RBX: ffff880213d87f00 RCX: 0000000000000020
> [151611.152071] RDX: 000000000000000e RSI: 0000000000000067 RDI: ffff8802101a9638
> [151611.152176] RBP: ffff88021f443e30 R08: ffff88021f45a0c0 R09: ffff880217003b00
> [151611.152280] R10: ffff8800da043f40 R11: ffff880213c98d20 R12: ffff8802101a9600
> [151611.152385] R13: ffff8800d9fa9120 R14: ffff8802101a9638 R15: ffff880213d87f00
> [151611.152490] FS: 0000000000000000(0000) GS:ffff88021f440000(0000) knlGS:0000000000000000
> [151611.152631] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [151611.152730] CR2: 0000000000000028 CR3: 0000000001e0a000 CR4: 00000000000006e0
> [151611.152835] Stack:
> [151611.152865] ffff880213d87f00 ffff8800d9fa8000 ffff880213c499c8 ffffffffc0468ed0
> [151611.152989] ffff88021f443e50 ffffffffc04688bf ffff880213d87f00 ffff880213c499c0
> [151611.153113] ffff88021f443e78 ffffffffc0468f15 ffff88021f44ddc0 ffff880213d87f30
> [151611.153237] Call Trace:
> [151611.153275] <IRQ>
> [151611.153311] [<ffffffffc0468ed0>] ? tipc_subscrb_shutdown_cb+0xc0/0xc0 [tipc]
> [151611.153422] [<ffffffffc04688bf>] tipc_subscrp_delete+0x2f/0x80 [tipc]
> [151611.153523] [<ffffffffc0468f15>] tipc_subscrp_timeout+0x45/0x70 [tipc]
> [151611.153624] [<ffffffff810ecfc5>] call_timer_fn+0x35/0x120
> [151611.153735] [<ffffffffc0468ed0>] ? tipc_subscrb_shutdown_cb+0xc0/0xc0 [tipc]
> [151611.153846] [<ffffffff810ed97a>] run_timer_softirq+0x23a/0x2f0
> [151611.153936] [<ffffffff81085dc1>] __do_softirq+0x101/0x290
> [151611.154017] [<ffffffff810860c3>] irq_exit+0xa3/0xb0
> [151611.154091] [<ffffffff818462a2>] smp_apic_timer_interrupt+0x42/0x50
> [151611.154185] [<ffffffff81844562>] apic_timer_interrupt+0x82/0x90
> [151611.154272] <EOI>
> [151611.154305] [<ffffffff81843225>] ? _raw_spin_unlock_irqrestore+0x15/0x20
> [151611.154407] [<ffffffff810eefef>] mod_timer+0x10f/0x240
> [151611.154489] [<ffffffffc0468be0>] tipc_subscrb_rcv_cb+0x1c0/0x390 [tipc]
> [151611.154591] [<ffffffffc04755e2>] tipc_receive_from_sock+0xc2/0x120 [tipc]
> [151611.154695] [<ffffffffc047526b>] tipc_recv_work+0x2b/0x60 [tipc]
> [151611.154809] [<ffffffff8109a635>] process_one_work+0x165/0x480
> [151611.159008] [<ffffffff8109a99b>] worker_thread+0x4b/0x4c0
> [151611.163372] [<ffffffff8109a950>] ? process_one_work+0x480/0x480
> [151611.167622] [<ffffffff810a0c75>] kthread+0xe5/0x100
> [151611.171755] [<ffffffff810a0b90>] ? kthread_create_on_node+0x1e0/0x1e0
> [151611.175858] [<ffffffff81843b8f>] ret_from_fork+0x3f/0x70
> [151611.179999] [<ffffffff810a0b90>] ? kthread_create_on_node+0x1e0/0x1e0
> [151611.184004] Code: ff ff 48 85 c0 74 56 4c 8d 70 38 49 89 c4 4c 89 f7 e8 43 92 3d c1 48 8b 8b 80 00 00 00 48 8b 93 88 00 00 00 48 8d 83 80 00 00 00 <48> 89 51 08 48 89 0a 48 89 83 80 00 00 00 48 89 83 88 00 00 00
> [151611.192678] RIP [<ffffffffc046a0a2>] tipc_nametbl_unsubscribe+0x72/0x100 [tipc]
> [151611.196733] RSP <ffff88021f443e10>
> [151611.200739] CR2: 0000000000000028
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> tipc-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/tipc-discussion
