Linux 4.14-rc6 bisected regression tun devices not working anymore in openvpn
L.S.,

While testing a Linux 4.14-rc6 kernel I noticed that OpenVPN no longer works.
My OpenVPN config uses tun devices and is pretty standard.
The OpenVPN version is the one in current Debian stable: openvpn 2.4.0-6+deb9u2

From the OpenVPN log:

Sat Oct 28 16:03:34 2017 us=175829 TUN/TAP device opened
Sat Oct 28 16:03:34 2017 us=183027 Note: Cannot set tx queue length on : No such device (errno=19)
Sat Oct 28 16:03:34 2017 us=183055 do_ifconfig, tt->did_ifconfig_ipv6_setup=0
Sat Oct 28 16:03:34 2017 us=183071 /sbin/ip link set dev up mtu 1500
Cannot find device ""
Sat Oct 28 16:03:34 2017 us=200445 Linux ip link set failed: external program exited with error status: 1
Sat Oct 28 16:03:34 2017 us=200482 Exiting due to fatal error
Sat Oct 28 16:38:17 2017 us=923381 TCP/UDP: Closing socket
Sat Oct 28 16:38:17 2017 us=925986 Closing TUN/TAP interface

The offending commit is:
0ad646c81b2182f7fa67ec0c8c825e0ee165696d "tun: call dev_get_valid_name() before register_netdevice()"

Reverting this commit fixes the issue for me; it's unfortunate that the commit itself seems to fix another issue.

--
Sander
Re: 4.12-RC2 BUG: scheduling while atomic: irq/47-iwlwifi
On 22/05/17 23:02, Arend Van Spriel wrote:
>
>
> On 22-5-2017 14:09, Arend van Spriel wrote:
>> On 5/22/2017 12:57 PM, Johannes Berg wrote:
>>> On Mon, 2017-05-22 at 12:36 +0200, Sander Eikelenboom wrote:
>>>> Hi,
>>>>
>>>> I encountered this splat with 4.12-RC2.
>>>
>>> Ugh, yeah, I should've seen that in the review.
>>>
>>> Arend, please take a look at this. cfg80211_sched_scan_results() cannot
>>> sleep, so you can't rtnl_lock() in there. Looks like you can just rely
>>> on RCU though?
>>
>> I see. I think you are right on RCU. Don't have the code in front of me
>> now, but I think the lookup has an ASSERT_RTNL. Will look into it after
>> my monday meeting :-p
>
> I realized I have a laptop lying around with intel 3160 wifi chip and
> tried to reproduce the issue. Did not run into the splat running
> 4.12-rc1 from wireless-drivers-next repo. I did not get the email from
> Sander so I don't know any details.
>
> Here is what I changed based on the info Johannes provided. Can you
> please check if this gets rid of the splat and let me know.

Hi Arend,

I ran your patch today, so far no issues.

--
Sander

> Regards,
> Arend
> ---
> diff --git a/net/wireless/scan.c b/net/wireless/scan.c
> index 14d5f0c..04833bb 100644
> --- a/net/wireless/scan.c
> +++ b/net/wireless/scan.c
> @@ -322,9 +322,7 @@ static void cfg80211_del_sched_scan_req(struct
> cfg80211_regi
>  {
>  	struct cfg80211_sched_scan_request *pos;
>
> -	ASSERT_RTNL();
> -
> -	list_for_each_entry(pos, &rdev->sched_scan_req_list, list) {
> +	list_for_each_entry_rcu(pos, &rdev->sched_scan_req_list, list) {
>  		if (pos->reqid == reqid)
>  			return pos;
>  	}
> @@ -398,13 +396,13 @@ void cfg80211_sched_scan_results(struct wiphy
> *wiphy, u64
>  	trace_cfg80211_sched_scan_results(wiphy, reqid);
>  	/* ignore if we're not scanning */
>
> -	rtnl_lock();
> +	rcu_read_lock();
>  	request = cfg80211_find_sched_scan_req(rdev, reqid);
>  	if (request) {
>  		request->report_results = true;
>  		queue_work(cfg80211_wq, &rdev->sched_scan_res_wk);
>  	}
> -	rtnl_unlock();
> +	rcu_read_unlock();
>  }
>  EXPORT_SYMBOL(cfg80211_sched_scan_results);
>
4.12-RC2 BUG: scheduling while atomic: irq/47-iwlwifi
Hi,

I encountered this splat with 4.12-RC2.

--
Sander

[ 119.021594] BUG: scheduling while atomic: irq/47-iwlwifi/517/0x0200
[ 119.021604] Modules linked in: xt_tcpudp ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_raw ip6table_security ip6table_mangle iptable_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_security iptable_mangle ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables rfcomm bnep binfmt_misc arc4 iTCO_wdt iTCO_vendor_support uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev intel_rapl cdc_mbim iwlmvm x86_pkg_temp_thermal intel_powerclamp mac80211 media cdc_wdm btusb coretemp cdc_ncm kvm_intel usbnet mii cdc_acm iwlwifi kvm btintel joydev pcspkr serio_raw cfg80211 snd_hda_codec_hdmi
[ 119.021701] bluetooth lpc_ich snd_hda_codec_realtek snd_hda_codec_generic shpchp sg ecdh_generic snd_hda_intel thinkpad_acpi snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer nvram snd soundcore evdev tpm_tis tpm_tis_core tpm algif_skcipher af_alg crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel rtsx_pci_sdmmc mmc_core aesni_intel aes_x86_64 crypto_simd cryptd glue_helper psmouse i2c_i801 sd_mod ehci_pci ehci_hcd e1000e rtsx_pci mfd_core ptp xhci_pci pps_core xhci_hcd
[ 119.021759] CPU: 1 PID: 517 Comm: irq/47-iwlwifi Not tainted 4.12.0-rc2-t440s-20170522+ #1
[ 119.021763] Hardware name: LENOVO 20AQS03H00/20AQS03H00, BIOS GJET91WW (2.41 ) 09/21/2016
[ 119.021766] Call Trace:
[ 119.021778] ? dump_stack+0x5c/0x84
[ 119.021784] ? __schedule_bug+0x4c/0x70
[ 119.021792] ? __schedule+0x496/0x5c0
[ 119.021798] ? schedule+0x2d/0x80
[ 119.021804] ? schedule_preempt_disabled+0x5/0x10
[ 119.021810] ? __mutex_lock.isra.0+0x18e/0x4c0
[ 119.021817] ? __wake_up+0x2f/0x50
[ 119.021833] ? cfg80211_sched_scan_results+0x19/0x60 [cfg80211]
[ 119.021844] ? cfg80211_sched_scan_results+0x19/0x60 [cfg80211]
[ 119.021859] ? iwl_mvm_rx_lmac_scan_iter_complete_notif+0x17/0x30 [iwlmvm]
[ 119.021869] ? iwl_pcie_rx_handle+0x2a9/0x7e0 [iwlwifi]
[ 119.021878] ? iwl_pcie_irq_handler+0x17c/0x730 [iwlwifi]
[ 119.021884] ? irq_forced_thread_fn+0x60/0x60
[ 119.021887] ? irq_thread_fn+0x16/0x40
[ 119.021892] ? irq_thread+0x109/0x180
[ 119.021896] ? wake_threads_waitq+0x30/0x30
[ 119.021901] ? kthread+0xf2/0x130
[ 119.021905] ? irq_thread_dtor+0x90/0x90
[ 119.021910] ? kthread_create_on_node+0x40/0x40
[ 119.021915] ? ret_from_fork+0x26/0x40
Re: nf_unregister_net_hook: hook not found!
On 2015-12-30 03:39, ebied...@xmission.com wrote:
> Pablo Neira Ayuso <pa...@netfilter.org> writes:
>> On Mon, Dec 28, 2015 at 09:05:03PM +0100, Sander Eikelenboom wrote:
>>> Hi,
>>>
>>> Running a 4.4.0-rc6 kernel i encountered the warning below.
>>
>> Cc'ing Eric Biederman.
>>
>> @Sander, could you provide a way to reproduce this?
>
> I am on vacation until the new year, but if this is reproducible we
> should be able to print out reg, reg->pf, reg->hooknum, reg->hook
> to figure out which hook is having something very weird happen to it.
>
> This is happening in some network namespace exit.
>
> Eric

Unfortunately I have found no way to reproduce it. The 13-second timestamp
implies it happened at boot, but I have only seen this once.

--
Sander

>> Thanks.
>>
>>> [ 14.357320] [ cut here ]
>>> [ 14.357327] WARNING: CPU: 2 PID: 102 at net/netfilter/core.c:143 netfilter_net_exit+0x25/0x50()
>>> [ 14.357328] nf_unregister_net_hook: hook not found!
>>> [..]
nf_unregister_net_hook: hook not found!
Hi,

Running a 4.4.0-rc6 kernel I encountered the warning below.

--
Sander

[ 13.740472] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 13.936237] iwlwifi :03:00.0: L1 Enabled - LTR Disabled
[ 13.945391] iwlwifi :03:00.0: L1 Enabled - LTR Disabled
[ 13.947434] iwlwifi :03:00.0: Radio type=0x2-0x1-0x0
[ 14.223990] iwlwifi :03:00.0: L1 Enabled - LTR Disabled
[ 14.232065] iwlwifi :03:00.0: L1 Enabled - LTR Disabled
[ 14.233570] iwlwifi :03:00.0: Radio type=0x2-0x1-0x0
[ 14.328141] systemd-logind[2485]: Failed to start user service: Unknown unit: user@117.service
[ 14.356634] systemd-logind[2485]: New session c1 of user lightdm.
[ 14.357320] [ cut here ]
[ 14.357327] WARNING: CPU: 2 PID: 102 at net/netfilter/core.c:143 netfilter_net_exit+0x25/0x50()
[ 14.357328] nf_unregister_net_hook: hook not found!
[ 14.357371] Modules linked in: iptable_security(+) iptable_raw iptable_filter ip_tables x_tables input_polldev bnep binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc uvcvideo videobuf2_vmalloc iTCO_wdt arc4 videobuf2_memops iTCO_vendor_support intel_rapl iosf_mbi videobuf2_v4l2 x86_pkg_temp_thermal intel_powerclamp btusb coretemp snd_hda_codec_hdmi iwldvm videobuf2_core btrtl kvm_intel v4l2_common mac80211 videodev btbcm snd_hda_codec_conexant btintel media kvm snd_hda_codec_generic bluetooth psmouse thinkpad_acpi iwlwifi snd_hda_intel pcspkr serio_raw snd_hda_codec nvram cfg80211 snd_hwdep snd_hda_core rfkill i2c_i801 lpc_ich snd_pcm mfd_core snd_timer evdev snd soundcore shpchp tpm_tis tpm algif_skcipher af_alg crct10dif_pclmul crc32_pclmul crc32c_intel aesni_intel
[ 14.357380] ehci_pci sdhci_pci aes_x86_64 glue_helper ehci_hcd e1000e lrw ablk_helper sg sdhci cryptd sd_mod ptp mmc_core usbcore usb_common pps_core
[ 14.357383] CPU: 2 PID: 102 Comm: kworker/u16:3 Tainted: G U 4.4.0-rc6-x220-20151224+ #1
[ 14.357384] Hardware name: LENOVO 42912ZU/42912ZU, BIOS 8DET69WW (1.39 ) 07/18/2013
[ 14.357390] Workqueue: netns cleanup_net
[ 14.357393] 81a27dfd 81359c69 88030e7cbd40 81060297
[ 14.357395] 88030e820d80 88030e7cbd90 81c962d8 81c962e0
[ 14.357397] 88030e7cbdf8 81060317 81a2c010 88030018
[ 14.357398] Call Trace:
[ 14.357405] [] ? dump_stack+0x40/0x57
[ 14.357408] [] ? warn_slowpath_common+0x77/0xb0
[ 14.357410] [] ? warn_slowpath_fmt+0x47/0x50
[ 14.357416] [] ? mutex_lock+0x9/0x30
[ 14.357418] [] ? netfilter_net_exit+0x25/0x50
[ 14.357421] [] ? ops_exit_list.isra.6+0x2e/0x60
[ 14.357424] [] ? cleanup_net+0x1ab/0x280
[ 14.357427] [] ? process_one_work+0x133/0x330
[ 14.357429] [] ? worker_thread+0x60/0x470
[ 14.357430] [] ? process_one_work+0x330/0x330
[ 14.357434] [] ? kthread+0xca/0xe0
[ 14.357436] [] ? kthread_create_on_node+0x170/0x170
[ 14.357439] [] ? ret_from_fork+0x3f/0x70
[ 14.357441] [] ? kthread_create_on_node+0x170/0x170
[ 14.357443] ---[ end trace 9984cc4b0e89f818 ]---
[ 14.357443] [ cut here ]
[ 14.357446] WARNING: CPU: 2 PID: 102 at net/netfilter/core.c:143 netfilter_net_exit+0x25/0x50()
[ 14.357446] nf_unregister_net_hook: hook not found!
[ 14.357472] Modules linked in: iptable_security(+) iptable_raw iptable_filter ip_tables x_tables input_polldev bnep binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc uvcvideo videobuf2_vmalloc iTCO_wdt arc4 videobuf2_memops iTCO_vendor_support intel_rapl iosf_mbi videobuf2_v4l2 x86_pkg_temp_thermal intel_powerclamp btusb coretemp snd_hda_codec_hdmi iwldvm videobuf2_core btrtl kvm_intel v4l2_common mac80211 videodev btbcm snd_hda_codec_conexant btintel media kvm snd_hda_codec_generic bluetooth psmouse thinkpad_acpi iwlwifi snd_hda_intel pcspkr serio_raw snd_hda_codec nvram cfg80211 snd_hwdep snd_hda_core rfkill i2c_i801 lpc_ich snd_pcm mfd_core snd_timer evdev snd soundcore shpchp tpm_tis tpm algif_skcipher af_alg crct10dif_pclmul crc32_pclmul crc32c_intel aesni_intel
[ 14.357478] ehci_pci sdhci_pci aes_x86_64 glue_helper ehci_hcd e1000e lrw ablk_helper sg sdhci cryptd sd_mod ptp mmc_core usbcore usb_common pps_core
[ 14.357480] CPU: 2 PID: 102 Comm: kworker/u16:3 Tainted: G U W 4.4.0-rc6-x220-20151224+ #1
[ 14.357481] Hardware name: LENOVO 42912ZU/42912ZU, BIOS 8DET69WW (1.39 ) 07/18/2013
[ 14.357484] Workqueue: netns cleanup_net
[ 14.357486] 81a27dfd 81359c69 88030e7cbd40 81060297
[ 14.357488] 88030e820db8 88030e7cbd90 81c962d8 81c962e0
[ 14.357489] 88030e7cbdf8 81060317 81a2c010 88030018
[ 14.357490] Call Trace:
[ 14.357493] [] ? dump_stack+0x40/0x57
[ 14.357495] [] ? warn_slowpath_common+0x77/0xb0
[ 14.357497] [] ? warn_slowpath_fmt+0x47/0x50
[ 14.357499] [] ?
Re: [PATCH net] switchdev: bridge: Check return code is not EOPNOTSUPP
On 2015-11-13 12:06, Ido Schimmel wrote:
> When NET_SWITCHDEV=n, switchdev_port_attr_set simply returns EOPNOTSUPP.
> In this case we should not emit errors and warnings to the kernel log.

Hi Ido,

Thanks for your patch!

It fixes these:
[ 207.245442] vif vif-1-0 vif1.0: failed to set HW ageing time
[ 207.245443] xen_bridge: error setting offload STP state on port 1(vif1.0)

But I still have these:
[ 335.412194] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
[ 335.412204] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
[ 335.412311] vif19.0-emu: set_features() failed (-1); wanted 0x008248c9, left 0x0080001b48c9
[ 335.412319] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
[ 335.412326] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
[ 335.535955] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
[ 335.535965] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
[ 335.615392] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
[ 335.615401] xen_bridge: set_features() failed (-1); wanted 0x00801fdb78c9, left 0x00801fff78e9

--
Sander

> Reported-by: Sander Eikelenboom <li...@eikelenboom.it>
> Fixes: 0bc05d585d38 ("switchdev: allow caller to explicitly request attr_set as deferred")
> Fixes: 6ac311ae8bfb ("Adding switchdev ageing notification on port bridged")
> Signed-off-by: Ido Schimmel <ido...@mellanox.com>
> Signed-off-by: Jiri Pirko <j...@mellanox.com>
> ---
>  net/bridge/br_stp.c    | 2 +-
>  net/bridge/br_stp_if.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
> index f7e8dee..5f3f645 100644
> --- a/net/bridge/br_stp.c
> +++ b/net/bridge/br_stp.c
> @@ -48,7 +48,7 @@ void br_set_state(struct net_bridge_port *p, unsigned int state)
>
>  	p->state = state;
>  	err = switchdev_port_attr_set(p->dev, &attr);
> -	if (err)
> +	if (err && err != -EOPNOTSUPP)
>  		br_warn(p->br, "error setting offload STP state on port %u(%s)\n",
>  				(unsigned int) p->port_no, p->dev->name);
>  }
> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
> index fa53d7a..5396ff08 100644
> --- a/net/bridge/br_stp_if.c
> +++ b/net/bridge/br_stp_if.c
> @@ -50,7 +50,7 @@ void br_init_port(struct net_bridge_port *p)
>  	p->config_pending = 0;
>
>  	err = switchdev_port_attr_set(p->dev, &attr);
> -	if (err)
> +	if (err && err != -EOPNOTSUPP)
>  		netdev_err(p->dev, "failed to set HW ageing time\n");
>  }
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
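For readers following the patch in this message, here is a minimal user-space sketch of the check it introduces: a return of -EOPNOTSUPP is treated as "nothing to offload" and stays silent, while any other non-zero error is still logged. The names stub_port_attr_set, failing_port_attr_set, should_warn and set_state_and_report are hypothetical and exist only for this illustration; they are not kernel functions, and the stub merely assumes the NET_SWITCHDEV=n behaviour described in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for switchdev_port_attr_set() with NET_SWITCHDEV=n:
 * per the commit message, the call simply reports "not supported". */
static int stub_port_attr_set(void)
{
    return -EOPNOTSUPP;
}

/* Hypothetical stand-in for a driver that fails for a real reason. */
static int failing_port_attr_set(void)
{
    return -EIO;
}

/* Same condition as the patched code: warn on real errors only. */
static bool should_warn(int err)
{
    return err && err != -EOPNOTSUPP;
}

/* Performs one attribute update and returns the number of warnings logged. */
static int set_state_and_report(int (*attr_set)(void), const char *port)
{
    int err;

    assert(attr_set != NULL && port != NULL);

    err = attr_set();
    if (should_warn(err)) {
        fprintf(stderr, "error setting offload STP state on %s (%d)\n",
                port, err);
        return 1;
    }
    return 0;
}
```

Calling set_state_and_report(stub_port_attr_set, "port 1(vif1.0)") logs nothing, whereas passing failing_port_attr_set still produces the warning. Keeping the test in one helper is only a choice made for this sketch; the patch itself open-codes the condition at both call sites.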
Re: [PATCH net] switchdev: bridge: Check return code is not EOPNOTSUPP
On 2015-11-13 13:43, Ido Schimmel wrote:
> Fri, Nov 13, 2015 at 02:34:45PM IST, li...@eikelenboom.it wrote:
>> On 2015-11-13 12:06, Ido Schimmel wrote:
>>> When NET_SWITCHDEV=n, switchdev_port_attr_set simply returns EOPNOTSUPP.
>>> In this case we should not emit errors and warnings to the kernel log.
>>
>> Hi Ido,
>>
>> Thanks for your patch!
>>
>> It fixes these:
>> [ 207.245442] vif vif-1-0 vif1.0: failed to set HW ageing time
>> [ 207.245443] xen_bridge: error setting offload STP state on port 1(vif1.0)
>>
>> But I still have these:
>> [ 335.412194] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
>> [ 335.412204] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
>> [ 335.412311] vif19.0-emu: set_features() failed (-1); wanted 0x008248c9, left 0x0080001b48c9
>> [ 335.412319] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
>> [ 335.412326] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
>> [ 335.535955] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
>> [ 335.535965] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
>> [ 335.615392] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
>> [ 335.615401] xen_bridge: set_features() failed (-1); wanted 0x00801fdb78c9, left 0x00801fff78e9
>
> Yes, this is a different issue and I see that Nik is already working on
> it. Can you please try his patch?
> http://patchwork.ozlabs.org/patch/544242/

Yeah, that suppresses the warning, thx!

--
Sander
[linux-4.4-mw] BUG: unable to handle kernel paging request ip_vs_out.constprop
Hi All,

Just got a crash with a linux-4.4-mw kernel.

I'm using a routed bridge, and apart from the splat below I also got some
interesting messages that aren't there in 4.3 (and that may be relevant to
the crash as well):

[ 207.033768] vif vif-1-0 vif1.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
[ 207.033780] vif vif-1-0 vif1.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
[ 207.245435] xen_bridge: error setting offload STP state on port 1(vif1.0)
[ 207.245442] vif vif-1-0 vif1.0: failed to set HW ageing time
[ 207.245443] xen_bridge: error setting offload STP state on port 1(vif1.0)
[ 207.245491] vif vif-1-0 vif1.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813

The commit message for the commit that introduced the "set HW ageing time"
error message doesn't tell me much about its purpose.

If it's not related, I can report it as a separate issue.

--
Sander

The crash:
[ 354.328687] BUG: unable to handle kernel paging request at 880049aa8000
[ 354.350206] IP: [] ip_vs_out.constprop.25+0x47/0x60
[ 354.360882] PGD 2212067 PUD 25b4067 PMD 5ffb6067 PTE 0
[ 354.371587] Oops: [#1] SMP
[ 354.382143] Modules linked in:
[ 354.392537] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.3.0-mw-2015-linus-doflr+ #1
[ 354.403105] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
[ 354.413666] task: 82218580 ti: 8220 task.ti: 8220
[ 354.424255] RIP: e030:[] [] ip_vs_out.constprop.25+0x47/0x60
[ 354.434742] RSP: e02b:88005f6034b0 EFLAGS: 00010246
[ 354.445006] RAX: 0001 RBX: 88005f6034f8 RCX: 880049aa7ce0
[ 354.455262] RDX: 88003c0e5500 RSI: 0003 RDI: 880004e0e800
[ 354.465422] RBP: 88005f6034b8 R08: 0014 R09: 0003
[ 354.475508] R10: 0001 R11: 880040f394cc R12: 88005f603528
[ 354.485567] R13: 88003c0e5500 R14: 822da2e8 R15: 88003c0e5500
[ 354.495595] FS: 7f0243c2b700() GS:88005f60() knlGS:
[ 354.505474] CS: e033 DS: ES: CR0: 8005003b
[ 354.515135] CR2: 880049aa8000 CR3: 59271000 CR4: 0660
[ 354.524794] Stack:
[ 354.534319] 81a074fc 88005f6034e8 8199e138 88003c0e5500
[ 354.543981] 88005f603528 88003c0e5500 88005f603518
[ 354.553577] 8199e1af 880005300048 88003c0e5500 822da2e8
[ 354.563160] Call Trace:
[ 354.572418]
[ 354.572480] [] ? ip_vs_local_reply4+0x1c/0x20
[ 354.590458] [] nf_iterate+0x58/0x70
[ 354.599372] [] nf_hook_slow+0x5f/0xb0
[ 354.608245] [] __ip_local_out+0x9e/0xb0
[ 354.617036] [] ? ip_forward_options+0x1a0/0x1a0
[ 354.625874] [] ip_local_out+0x17/0x40
[ 354.634383] [] ip_build_and_send_pkt+0x148/0x1c0
[ 354.642715] [] tcp_v4_send_synack+0x56/0xa0
[ 354.650893] [] ? inet_csk_reqsk_queue_hash_add+0x68/0x90
[ 354.659083] [] tcp_conn_request+0x95d/0x970
[ 354.667196] [] ? __local_bh_enable_ip+0x26/0x90
[ 354.675246] [] tcp_v4_conn_request+0x47/0x50
[ 354.683254] [] tcp_rcv_state_process+0x183/0xca0
[ 354.691004] [] tcp_v4_do_rcv+0x5c/0x1f0
[ 354.698533] [] tcp_v4_rcv+0x987/0x9a0
[ 354.705968] [] ? ipv4_confirm+0x78/0xf0
[ 354.713370] [] ip_local_deliver_finish+0x84/0x120
[ 354.720739] [] ip_local_deliver+0x42/0xd0
[ 354.728029] [] ? inet_del_offload+0x40/0x40
[ 354.735270] [] ip_rcv_finish+0x106/0x320
[ 354.742413] [] ip_rcv+0x211/0x370
[ 354.749268] [] ? ip_local_deliver_finish+0x120/0x120
[ 354.755929] [] __netif_receive_skb_core+0x2cb/0x970
[ 354.762535] [] ? nf_nat_setup_info+0x7a/0x2f0
[ 354.769131] [] __netif_receive_skb+0x11/0x70
[ 354.775481] [] netif_receive_skb_internal+0x1e/0x80
[ 354.781638] [] ? nf_hook_slow+0x5f/0xb0
[ 354.787771] [] netif_receive_skb+0x9/0x10
[ 354.793916] [] br_handle_frame_finish+0x178/0x4b0
[ 354.800077] [] ? nf_nat_ipv4_fn+0x167/0x1e0
[ 354.806260] [] ? br_handle_local_finish+0x50/0x50
[ 354.812405] [] br_nf_pre_routing_finish+0x183/0x360
[ 354.818574] [] ? br_netif_receive_skb+0x10/0x10
[ 354.824775] [] br_nf_pre_routing+0x2a7/0x380
[ 354.830780] [] ? br_nf_forward_ip+0x3f0/0x3f0
[ 354.836567] [] nf_iterate+0x58/0x70
[ 354.842281] [] nf_hook_slow+0x5f/0xb0
[ 354.847886] [] br_handle_frame+0x1a2/0x290
[ 354.853520] [] ? br_netif_receive_skb+0x10/0x10
[ 354.859206] [] ? br_handle_frame_finish+0x4b0/0x4b0
[ 354.864824] [] __netif_receive_skb_core+0x12b/0x970
[ 354.870350] [] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[ 354.875880] [] __netif_receive_skb+0x11/0x70
[ 354.881293] [] netif_receive_skb_internal+0x1e/0x80
[ 354.886653] [] netif_receive_skb+0x9/0x10
[ 354.891918] [] xenvif_tx_action+0x693/0x820
[ 354.897170] [] xenvif_poll+0x29/0x70
[
Re: [linux-4.4-mw] BUG: unable to handle kernel paging request ip_vs_out.constprop
On 2015-11-12 15:09, Eric Dumazet wrote:
> On Thu, 2015-11-12 at 11:08 +0100, Sander Eikelenboom wrote:
>> Hi All,
>>
>> Just got a crash with a linux-4.4-mw kernel.
>>
>> I'm using a routed bridge, and apart from the splat below I also got some
>> interesting messages that aren't there in 4.3 (and that may be relevant to
>> the crash as well):
>>
>> [ 207.033768] vif vif-1-0 vif1.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
>> [ 207.033780] vif vif-1-0 vif1.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
>> [ 207.245435] xen_bridge: error setting offload STP state on port 1(vif1.0)
>> [ 207.245442] vif vif-1-0 vif1.0: failed to set HW ageing time
>> [ 207.245443] xen_bridge: error setting offload STP state on port 1(vif1.0)
>> [ 207.245491] vif vif-1-0 vif1.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
>>
>> The commit message for the commit that introduced the "set HW ageing time"
>> error message doesn't tell me much about its purpose.
>>
>> If it's not related, I can report it as a separate issue.
>>
>> --
>> Sander
>>
>> The crash:
>> [ 354.328687] BUG: unable to handle kernel paging request at 880049aa8000
>> [ 354.350206] IP: [] ip_vs_out.constprop.25+0x47/0x60
>> [..]
Re: [linux-4.4-mw] BUG: unable to handle kernel paging request ip_vs_out.constprop
On 2015-11-12 17:52, Eric Dumazet wrote:
> On Thu, 2015-11-12 at 16:16 +0100, Sander Eikelenboom wrote:
>>> Thanks for the report, please try following patch :
>>
>> Hi Eric,
>>
>> Thanks for the patch!
>> Got it up and running at the moment, but since I don't have a clear
>> trigger it will take 1 or 2 days before I can report something back.
>
> Don't worry, I have a pretty good picture of the bug and patch must fix it.
>
> I'll submit it formally asap.

Ok.

Do you know what these new warnings are for?
(Apparently all networking, including bridging, works fine, so is this just overly verbose logging?)

[ 207.033768] vif vif-1-0 vif1.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
[ 207.033780] vif vif-1-0 vif1.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
[ 207.245435] xen_bridge: error setting offload STP state on port 1(vif1.0)
[ 207.245442] vif vif-1-0 vif1.0: failed to set HW ageing time
[ 207.245443] xen_bridge: error setting offload STP state on port 1(vif1.0)
[ 207.245491] vif vif-1-0 vif1.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813

--
Sander
Re: Netfilter: BUG: unable to handle kernel paging request, RIP: physdev_mt+0xd6/0x160
On 2015-09-13 20:06, Florian Westphal wrote:
> Sander Eikelenboom <li...@eikelenboom.it> wrote:
>> Using a linux-4.3-rc1 kernel i encountered the splat below:
>
> Thanks for reporting this bug.
>
>> [ 290.200642] BUG: unable to handle kernel paging request at 0484195d
>> [ 290.211702] IP: [] physdev_mt+0xd6/0x160
> [..]
>> [ 290.444088] [] ipt_do_table+0x210/0x390
>> [ 290.461951] [] iptable_filter_hook+0x2e/0x70
>> [ 290.470756] [] nf_iterate+0x4c/0x80
>> [ 290.479587] [] nf_hook_slow+0x64/0xc0
>> [ 290.488341] [] ip_forward+0x369/0x3c0
>> [ 290.496927] [] ? ip_frag_mem+0x40/0x40
>> [ 290.505365] [] ip_rcv_finish+0x101/0x330
>> [ 290.513480] [] ip_rcv+0x291/0x390
>> [ 290.521562] [] ?
>
> Aye, ip forwarding of bridged packets with call-iptables=1 is broken.
>
> Please, could you try this patch? It fixes this bug for me.

Hi Florian,

Works for me too, thx for the fix!

--
Sander

> diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
> --- a/net/bridge/br_netfilter_hooks.c
> +++ b/net/bridge/br_netfilter_hooks.c
> @@ -355,6 +355,7 @@ static int br_nf_pre_routing_finish(struct sock *sk, struct sk_buff *skb)
>  	struct iphdr *iph = ip_hdr(skb);
>  	struct nf_bridge_info *nf_bridge = nf_bridge_info_get(skb);
>  	struct rtable *rt;
> +	bool daddr_changed;
>  	int err;
>
>  	nf_bridge->frag_max_size = IPCB(skb)->frag_max_size;
> @@ -363,8 +364,15 @@ static int br_nf_pre_routing_finish(struct sock *sk, struct sk_buff *skb)
>  		skb->pkt_type = PACKET_OTHERHOST;
>  		nf_bridge->pkt_otherhost = false;
>  	}
> +
> +	/* set physoutdev to NULL, its set by the bridge forward hook but
> +	 * frame might be routed instead of bridged.
> +	 */
> +	daddr_changed = br_nf_ipv4_daddr_was_changed(skb, nf_bridge);
> +	nf_bridge->physoutdev = NULL;
>  	nf_bridge->in_prerouting = 0;
> -	if (br_nf_ipv4_daddr_was_changed(skb, nf_bridge)) {
> +
> +	if (daddr_changed) {
>  		if ((err = ip_route_input(skb, iph->daddr, iph->saddr, iph->tos, dev))) {
>  			struct in_device *in_dev = __in_dev_get_rcu(dev);
>
> diff --git a/net/bridge/br_netfilter_ipv6.c b/net/bridge/br_netfilter_ipv6.c
> index 77383bf..77b 100644
> --- a/net/bridge/br_netfilter_ipv6.c
> +++ b/net/bridge/br_netfilter_ipv6.c
> @@ -167,6 +167,7 @@ static int br_nf_pre_routing_finish_ipv6(struct sock *sk, struct sk_buff *skb)
>  	struct rtable *rt;
>  	struct net_device *dev = skb->dev;
>  	const struct nf_ipv6_ops *v6ops = nf_get_ipv6_ops();
> +	bool daddr_changed;
>
>  	nf_bridge->frag_max_size = IP6CB(skb)->frag_max_size;
>
> @@ -174,8 +175,12 @@ static int br_nf_pre_routing_finish_ipv6(struct sock *sk, struct sk_buff *skb)
>  		skb->pkt_type = PACKET_OTHERHOST;
>  		nf_bridge->pkt_otherhost = false;
>  	}
> +
> +	daddr_changed = br_nf_ipv6_daddr_was_changed(skb, nf_bridge);
> +	nf_bridge->physoutdev = NULL;
>  	nf_bridge->in_prerouting = 0;
> -	if (br_nf_ipv6_daddr_was_changed(skb, nf_bridge)) {
> +
> +	if (daddr_changed) {
>  		skb_dst_drop(skb);
>  		v6ops->route_input(skb);
>
Netfilter: BUG: unable to handle kernel paging request, RIP: physdev_mt+0xd6/0x160
Using a linux-4.3-rc1 kernel i encountered the splat below:

addr2line gives: /usr/src/new/linux-linus/include/linux/netfilter/x_tables.h:350

which is:

/*
 * This helper is performance critical and must be inlined
 */
static inline unsigned long ifname_compare_aligned(const char *_a,
						   const char *_b,
						   const char *_mask)
{
	const unsigned long *a = (const unsigned long *)_a;
	const unsigned long *b = (const unsigned long *)_b;
	const unsigned long *mask = (const unsigned long *)_mask;
	unsigned long ret;

	ret = (a[0] ^ b[0]) & mask[0];
	if (IFNAMSIZ > sizeof(unsigned long))
HERE -->	ret |= (a[1] ^ b[1]) & mask[1];
	if (IFNAMSIZ > 2 * sizeof(unsigned long))
		ret |= (a[2] ^ b[2]) & mask[2];
	if (IFNAMSIZ > 3 * sizeof(unsigned long))
		ret |= (a[3] ^ b[3]) & mask[3];
	BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
	return ret;
}

--
Sander

[ 290.200642] BUG: unable to handle kernel paging request at 0484195d
[ 290.211702] IP: [] physdev_mt+0xd6/0x160
[ 290.222716] PGD 591ea067 PUD 5772a067 PMD 0
[ 290.233389] Oops: [#1] SMP
[ 290.244017] Modules linked in:
[ 290.254338] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.3.0-rc1-20150913-linus-doflr+ #1
[ 290.264862] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
[ 290.275319] task: 8221a580 ti: 8220 task.ti: 8220
[ 290.285909] RIP: e030:[] [] physdev_mt+0xd6/0x160
[ 290.296374] RSP: e02b:88005f6037b0 EFLAGS: 00010206
[ 290.306758] RAX: 00302e3531666976 RBX: 88414d00 RCX:
[ 290.310800] xen_bridge: port 13(vif13.0) entered forwarding state
[ 290.327013] RDX: c90003c0c4f0 RSI: 88003bfba000 RDI: 0002
[ 290.337148] RBP: 88005f6037b0 R08: 04841955 R09: 880057bf2501
[ 290.347361] R10: R11: R12: 880004b4a24e
[ 290.357395] R13: 8800044bc000 R14: c90003c0c460 R15: c90003c0c4d0
[ 290.367437] FS: 7ff6d0ed3700() GS:88005f60() knlGS:
[ 290.377264] CS: e033 DS: ES: CR0: 8005003b
[ 290.386939] CR2: 0484195d CR3: 48ba6000 CR4: 0660
[ 290.396564] Stack:
[ 290.406066] 88005f603848 81a4e6c0 c90003c09008 8110df32
[ 290.415683] 8800559ddc00 880004d5 c90003c09040 0001
[ 290.425294] 822f7c40 c90003c0c4f0 8800044bc000 880004d5
[ 290.434788] Call Trace:
[ 290.444034]
[ 290.444088] [] ipt_do_table+0x210/0x390
[ 290.461951] [] iptable_filter_hook+0x2e/0x70
[ 290.470756] [] nf_iterate+0x4c/0x80
[ 290.479587] [] nf_hook_slow+0x64/0xc0
[ 290.488341] [] ip_forward+0x369/0x3c0
[ 290.496927] [] ? ip_frag_mem+0x40/0x40
[ 290.505365] [] ip_rcv_finish+0x101/0x330
[ 290.513480] [] ip_rcv+0x291/0x390
[ 290.521562] [] ? ip_local_deliver_finish+0x120/0x120
[ 290.529509] [] __netif_receive_skb_core+0x2a0/0x960
[ 290.537381] [] ? tcp_error+0xa9/0x1e0
[ 290.545287] [] ? __local_bh_enable_ip+0x26/0x90
[ 290.553065] [] __netif_receive_skb+0x11/0x70
[ 290.560671] [] netif_receive_skb_internal+0x1e/0x80
[ 290.568025] [] ? nf_hook_slow+0x64/0xc0
[ 290.575341] [] netif_receive_skb_sk+0xc/0x10
[ 290.582655] [] br_handle_frame_finish+0x17a/0x4b0
[ 290.589910] [] ? nf_nat_ipv4_fn+0x19a/0x1e0
[ 290.597120] [] ? iptable_nat_ipv4_fn+0x20/0x20
[ 290.604316] [] ? netif_receive_skb_internal+0x80/0x80
[ 290.611375] [] br_nf_pre_routing_finish+0x166/0x340
[ 290.618246] [] ? br_handle_local_finish+0x50/0x50
[ 290.624925] [] br_nf_pre_routing+0x29b/0x370
[ 290.631446] [] ? br_nf_forward_ip+0x3d0/0x3d0
[ 290.637991] [] nf_iterate+0x4c/0x80
[ 290.644328] [] nf_hook_slow+0x64/0xc0
[ 290.650380] [] br_handle_frame+0x199/0x280
[ 290.656432] [] ? br_handle_local_finish+0x50/0x50
[ 290.662593] [] ? br_handle_frame_finish+0x4b0/0x4b0
[ 290.668626] [] __netif_receive_skb_core+0x12b/0x960
[ 290.674643] [] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[ 290.680797] [] ? __skb_flow_dissect+0x5f1/0x8f0
[ 290.686902] [] __netif_receive_skb+0x11/0x70
[ 290.693046] [] netif_receive_skb_internal+0x1e/0x80
[ 290.699031] [] netif_receive_skb_sk+0xc/0x10
[ 290.704880] [] xenvif_tx_action+0x69a/0x830
[ 290.710609] [] ? __netif_receive_skb+0x11/0x70
[ 290.716365] [] xenvif_poll+0x29/0x70
[ 290.722241] [] net_rx_action+0x1f7/0x300
[ 290.727940] [] __do_softirq+0x103/0x210
[ 290.733564] [] irq_exit+0x4b/0xa0
[ 290.739094] [] xen_evtchn_do_upcall+0x30/0x40
[ 290.744626] [] xen_do_hypervisor_callback+0x1e/0x40
[ 290.750062]
[ 290.750114] [] ? xen_hypercall_sched_op+0xa/0x20
[ 290.760785] [] ? xen_hypercall_sched_op+0xa/0x20
[
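The helper quoted in the report above compares two interface names one machine word at a time under a mask. What follows is only a user-space sketch of that idea, not the kernel code: every `demo_*` name is hypothetical, `DEMO_IFNAMSIZ` is assumed to be 16, and the word loads go through `memcpy()` instead of the pointer casts the kernel helper uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_IFNAMSIZ 16 /* assumed buffer size for this sketch */

/* The word loop below only works if the buffer is a whole number of words. */
static_assert(DEMO_IFNAMSIZ % sizeof(unsigned long) == 0,
              "name buffer must be a multiple of the word size");

/* Hypothetical fixed-size, zero-padded name buffer. */
struct demo_ifname {
    char bytes[DEMO_IFNAMSIZ];
};

/* Copy a string into the buffer, zero-padding the rest. */
static void demo_ifname_set(struct demo_ifname *n, const char *s)
{
    size_t len = strlen(s);

    if (len > DEMO_IFNAMSIZ - 1)
        len = DEMO_IFNAMSIZ - 1;
    memset(n->bytes, 0, sizeof(n->bytes));
    memcpy(n->bytes, s, len);
}

/* Build a mask that selects the first `len` bytes of a name. */
static void demo_mask_set(struct demo_ifname *m, size_t len)
{
    if (len > DEMO_IFNAMSIZ)
        len = DEMO_IFNAMSIZ;
    memset(m->bytes, 0, sizeof(m->bytes));
    memset(m->bytes, 0xff, len);
}

/* Returns 0 when a and b agree on every byte selected by mask. */
static unsigned long demo_ifname_compare(const struct demo_ifname *a,
                                         const struct demo_ifname *b,
                                         const struct demo_ifname *mask)
{
    unsigned long ret = 0;

    for (size_t i = 0; i < DEMO_IFNAMSIZ; i += sizeof(unsigned long)) {
        unsigned long wa, wb, wm;

        memcpy(&wa, a->bytes + i, sizeof(wa));
        memcpy(&wb, b->bytes + i, sizeof(wb));
        memcpy(&wm, mask->bytes + i, sizeof(wm));
        ret |= (wa ^ wb) & wm;
    }
    return ret;
}
```

Note that every word of all three buffers is read regardless of the mask, so each pointer has to reference a full, valid buffer.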
Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80
Saturday, August 15, 2015, 12:39:25 AM, you wrote: On Sat, 2015-08-15 at 00:09 +0200, Sander Eikelenboom wrote: On 2015-08-13 00:41, Eric Dumazet wrote: On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote: Thanks for the reminder, but luckily i was aware of that, seen enough of your replies asking for patches to be resubmitted against the other tree ;) Kernel with patch is currently running so fingers crossed. Thanks for testing. I am definitely interested knowing your results. Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is breaking things (have to test if a revert helps) i get this in some guests: Yes, this was fixed by : http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af Hi Eric, With that patch i had a crash again this night, see below. -- Sander [177459.188808] general protection fault: [#1] SMP [177459.199746] Modules linked in: [177459.210540] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc6-20150815-linus-doflr-net+ #1 [177459.221441] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010 [177459.232247] task: 8221a580 ti: 8220 task.ti: 8220 [177459.242931] RIP: e030:[8110eb58] [8110eb58] detach_if_pending+0x18/0x80 [177459.253503] RSP: e02b:88005f6039d8 EFLAGS: 00010086 [177459.264051] RAX: 8800584d6580 RBX: 880004901420 RCX: dead00200200 [177459.274599] RDX: RSI: 88005f60e5c0 RDI: 880004901420 [177459.285122] RBP: 88005f6039d8 R08: 0001 R09: [177459.295286] R10: 0003 R11: 880004901394 R12: 0003 [177459.305388] R13: 00010ae47040 R14: 07b98a00 R15: 88005f60e5c0 [177459.315345] FS: 7f51317ec700() GS:88005f60() knlGS: [177459.325340] CS: e033 DS: ES: CR0: 8005003b [177459.335217] CR2: 010f8000 CR3: 2a154000 CR4: 0660 [177459.345129] Stack: [177459.354783] 88005f603a28 8110ee7f 810fb261 0200 [177459.364505] 0003 880004901380 0003 8800567d0d00 [177459.374064] 07b98a00 88005f603a58 819b3eb3 [177459.383532] Call Trace: [177459.392878] IRQ [177459.392935] 
[8110ee7f] mod_timer_pending+0x3f/0xe0 [177459.411058] [810fb261] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20 [177459.419876] [819b3eb3] __nf_ct_refresh_acct+0xa3/0xb0 [177459.428642] [819baafb] tcp_packet+0xb3b/0x1290 [177459.437285] [81a2535e] ? ip_output+0x5e/0xc0 [177459.445845] [810ca8ca] ? __local_bh_enable_ip+0x2a/0x90 [177459.454331] [819b35a9] ? __nf_conntrack_find_get+0x129/0x2a0 [177459.462642] [819b549c] nf_conntrack_in+0x29c/0x7c0 [177459.470711] [81a65e9c] ipv4_conntrack_local+0x4c/0x50 [177459.478753] [819ad67c] nf_iterate+0x4c/0x80 [177459.486726] [81102437] ? generic_handle_irq+0x27/0x40 [177459.494634] [819ad714] nf_hook_slow+0x64/0xc0 [177459.502486] [81a22d40] __ip_local_out_sk+0x90/0xa0 [177459.510248] [81a22c40] ? ip_forward_options+0x1a0/0x1a0 [177459.517782] [81a22d66] ip_local_out_sk+0x16/0x40 [177459.525044] [81a2343d] ip_queue_xmit+0x14d/0x350 [177459.532247] [81a3ae7e] tcp_transmit_skb+0x48e/0x960 [177459.539413] [81a3cddb] tcp_xmit_probe_skb+0xdb/0xf0 [177459.546389] [81a3dffb] tcp_write_wakeup+0x5b/0x150 [177459.553061] [81a3e51b] tcp_keepalive_timer+0x1fb/0x230 [177459.559761] [81a3e320] ? tcp_init_xmit_timers+0x20/0x20 [177459.566447] [8110f3c7] call_timer_fn.isra.27+0x17/0x80 [177459.573121] [81a3e320] ? tcp_init_xmit_timers+0x20/0x20 [177459.579778] [8110f55d] run_timer_softirq+0x12d/0x200 [177459.586448] [810ca6c3] __do_softirq+0x103/0x210 [177459.593138] [810ca9cb] irq_exit+0x4b/0xa0 [177459.599783] [814f05d4] xen_evtchn_do_upcall+0x34/0x50 [177459.606300] [81af93ae] xen_do_hypervisor_callback+0x1e/0x40 [177459.612583] EOI [177459.612637] [810013aa] ? xen_hypercall_sched_op+0xa/0x20 [177459.625010] [810013aa] ? xen_hypercall_sched_op+0xa/0x20 [177459.631157] [81008d60] ? xen_safe_halt+0x10/0x20 [177459.637158] [810188d3] ? default_idle+0x13/0x20 [177459.643072] [81018e1a] ? arch_cpu_idle+0xa/0x10 [177459.648809] [810f8e7e] ? default_idle_call+0x2e/0x50 [177459.654650] [810f9112] ? 
cpu_startup_entry+0x272/0x2e0 [177459.660488] [81ae79f7] ? rest_init+0x77/0x80 [177459.666297] [82312f58] ? start_kernel+0x43b/0x448 [177459.672092] [823124ef] ? x86_64_start_reservations+0x2a/0x2c [177459.677800] [82316008] ? xen_start_kernel+0x550/0x55c [177459.683451
Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80
Monday, August 17, 2015, 3:37:13 PM, you wrote: On Mon, 2015-08-17 at 11:09 +0200, Sander Eikelenboom wrote: Saturday, August 15, 2015, 12:39:25 AM, you wrote: On Sat, 2015-08-15 at 00:09 +0200, Sander Eikelenboom wrote: On 2015-08-13 00:41, Eric Dumazet wrote: On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote: Thanks for the reminder, but luckily i was aware of that, seen enough of your replies asking for patches to be resubmitted against the other tree ;) Kernel with patch is currently running so fingers crossed. Thanks for testing. I am definitely interested knowing your results. Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is breaking things (have to test if a revert helps) i get this in some guests: Yes, this was fixed by : http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af Hi Eric, With that patch i had a crash again this night, see below. -- Sander [177459.188808] general protection fault: [#1] SMP [177459.199746] Modules linked in: [177459.210540] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc6-20150815-linus-doflr-net+ #1 [177459.221441] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010 [177459.232247] task: 8221a580 ti: 8220 task.ti: 8220 [177459.242931] RIP: e030:[8110eb58] [8110eb58] detach_if_pending+0x18/0x80 [177459.253503] RSP: e02b:88005f6039d8 EFLAGS: 00010086 [177459.264051] RAX: 8800584d6580 RBX: 880004901420 RCX: dead00200200 [177459.274599] RDX: RSI: 88005f60e5c0 RDI: 880004901420 [177459.285122] RBP: 88005f6039d8 R08: 0001 R09: [177459.295286] R10: 0003 R11: 880004901394 R12: 0003 [177459.305388] R13: 00010ae47040 R14: 07b98a00 R15: 88005f60e5c0 [177459.315345] FS: 7f51317ec700() GS:88005f60() knlGS: [177459.325340] CS: e033 DS: ES: CR0: 8005003b [177459.335217] CR2: 010f8000 CR3: 2a154000 CR4: 0660 [177459.345129] Stack: [177459.354783] 88005f603a28 8110ee7f 810fb261 0200 [177459.364505] 0003 880004901380 0003 8800567d0d00 
[177459.374064] 07b98a00 88005f603a58 819b3eb3 [177459.383532] Call Trace: [177459.392878] IRQ [177459.392935] [8110ee7f] mod_timer_pending+0x3f/0xe0 [177459.411058] [810fb261] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20 [177459.419876] [819b3eb3] __nf_ct_refresh_acct+0xa3/0xb0 [177459.428642] [819baafb] tcp_packet+0xb3b/0x1290 [177459.437285] [81a2535e] ? ip_output+0x5e/0xc0 [177459.445845] [810ca8ca] ? __local_bh_enable_ip+0x2a/0x90 [177459.454331] [819b35a9] ? __nf_conntrack_find_get+0x129/0x2a0 [177459.462642] [819b549c] nf_conntrack_in+0x29c/0x7c0 [177459.470711] [81a65e9c] ipv4_conntrack_local+0x4c/0x50 [177459.478753] [819ad67c] nf_iterate+0x4c/0x80 [177459.486726] [81102437] ? generic_handle_irq+0x27/0x40 [177459.494634] [819ad714] nf_hook_slow+0x64/0xc0 [177459.502486] [81a22d40] __ip_local_out_sk+0x90/0xa0 [177459.510248] [81a22c40] ? ip_forward_options+0x1a0/0x1a0 [177459.517782] [81a22d66] ip_local_out_sk+0x16/0x40 [177459.525044] [81a2343d] ip_queue_xmit+0x14d/0x350 [177459.532247] [81a3ae7e] tcp_transmit_skb+0x48e/0x960 [177459.539413] [81a3cddb] tcp_xmit_probe_skb+0xdb/0xf0 [177459.546389] [81a3dffb] tcp_write_wakeup+0x5b/0x150 [177459.553061] [81a3e51b] tcp_keepalive_timer+0x1fb/0x230 [177459.559761] [81a3e320] ? tcp_init_xmit_timers+0x20/0x20 [177459.566447] [8110f3c7] call_timer_fn.isra.27+0x17/0x80 [177459.573121] [81a3e320] ? tcp_init_xmit_timers+0x20/0x20 [177459.579778] [8110f55d] run_timer_softirq+0x12d/0x200 [177459.586448] [810ca6c3] __do_softirq+0x103/0x210 [177459.593138] [810ca9cb] irq_exit+0x4b/0xa0 [177459.599783] [814f05d4] xen_evtchn_do_upcall+0x34/0x50 [177459.606300] [81af93ae] xen_do_hypervisor_callback+0x1e/0x40 [177459.612583] EOI [177459.612637] [810013aa] ? xen_hypercall_sched_op+0xa/0x20 [177459.625010] [810013aa] ? xen_hypercall_sched_op+0xa/0x20 [177459.631157] [81008d60] ? xen_safe_halt+0x10/0x20 [177459.637158] [810188d3] ? default_idle+0x13/0x20 [177459.643072] [81018e1a] ? 
arch_cpu_idle+0xa/0x10 [177459.648809] [810f8e7e] ? default_idle_call+0x2e/0x50 [177459.654650] [810f9112] ? cpu_startup_entry+0x272/0x2e0 [177459.660488] [81ae79f7] ? rest_init+0x77/0x80
Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80
Monday, August 17, 2015, 4:21:47 PM, you wrote:
On Mon, 2015-08-17 at 09:02 -0500, Jon Christopherson wrote:

This is very similar to the behavior I am seeing in this bug: https://bugzilla.kernel.org/show_bug.cgi?id=102911

OK, but have you applied the fix ? http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af

It will be part of net iteration from David Miller to Linus Torvalds.

I did have that patch in for my last report. But i don't think he had (looking at the second part of his oops).

--
Sander
Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80
On 2015-08-17 19:18, Eric Dumazet wrote:
From: Eric Dumazet <eduma...@google.com>

On Mon, 2015-08-17 at 16:25 +0200, Sander Eikelenboom wrote:
Monday, August 17, 2015, 4:21:47 PM, you wrote:
On Mon, 2015-08-17 at 09:02 -0500, Jon Christopherson wrote:

This is very similar to the behavior I am seeing in this bug: https://bugzilla.kernel.org/show_bug.cgi?id=102911

OK, but have you applied the fix ? http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af

It will be part of net iteration from David Miller to Linus Torvalds.

I did have that patch in for my last report. But i don't think he had (looking at the second part of his oops).

Then can you try following fix as well ? Thanks !

Running now :)

[PATCH] timer: fix a race in __mod_timer()

lock_timer_base() can not catch following :

CPU1 ( in __mod_timer()
timer->flags |= TIMER_MIGRATING;
spin_unlock(&base->lock);
base = new_base;
spin_lock(&base->lock);
timer->flags &= ~TIMER_BASEMASK;
                                  CPU2 (in lock_timer_base())
                                  see timer base is cpu0 base
                                  spin_lock_irqsave(&base->lock, *flags);
                                  if (timer->flags == tf)
                                       return base; // oops, wrong base
timer->flags |= base->cpu // too late

We must write timer->flags in one go, otherwise we can fool other cpus.

Fixes: bc7a34b8b9eb ("timer: Reduce timer migration overhead if disabled")
Signed-off-by: Eric Dumazet <eduma...@google.com>
Cc: Thomas Gleixner <t...@linutronix.de>
---
 kernel/time/timer.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 5e097fa9faf7..84190f02b521 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -807,8 +807,8 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
 			spin_unlock(&base->lock);
 			base = new_base;
 			spin_lock(&base->lock);
-			timer->flags &= ~TIMER_BASEMASK;
-			timer->flags |= base->cpu;
+			WRITE_ONCE(timer->flags,
+				   (timer->flags & ~TIMER_BASEMASK) | base->cpu);
 		}
 	}
 
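The race described in the patch above comes from publishing the flags word in two steps. Below is a minimal user-space sketch of the difference, under stated assumptions: the mask values are illustrative, all `demo_*` names are hypothetical, and a relaxed C11 atomic store stands in for the kernel's `WRITE_ONCE()`, which it resembles but does not equal.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Illustrative layout, assumed for this sketch: the low bits hold the CPU
 * number, one bit marks "migrating", the remaining bits are unrelated. */
#define DEMO_CPUMASK   0x0007FFFFu
#define DEMO_MIGRATING 0x00080000u
#define DEMO_BASEMASK  (DEMO_CPUMASK | DEMO_MIGRATING)

struct demo_timer {
    _Atomic uint32_t flags;
};

/* Value a concurrent reader could observe between the two stores of the
 * old code (flags &= ~BASEMASK; flags |= cpu): the base bits are all zero,
 * which reads as "CPU 0, not migrating". */
static uint32_t demo_intermediate_flags(uint32_t flags)
{
    return flags & ~DEMO_BASEMASK;
}

/* Build the final value in a local and publish it with a single store, so
 * a reader sees either the old word or the new one, never the half-updated
 * one. Assumes the caller is the only writer, as the lock holder is in the
 * patch. */
static void demo_set_base(struct demo_timer *t, uint32_t new_cpu)
{
    assert(new_cpu <= DEMO_CPUMASK);

    uint32_t old = atomic_load_explicit(&t->flags, memory_order_relaxed);
    uint32_t next = (old & ~DEMO_BASEMASK) | new_cpu;

    atomic_store_explicit(&t->flags, next, memory_order_relaxed);
}
```

The sketch only shows the single-store shape of the fix; it does not model the per-CPU base locks that the real code takes around the update.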
Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80
On 2015-08-13 00:41, Eric Dumazet wrote:
On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote:

Thanks for the reminder, but luckily i was aware of that, seen enough of your replies asking for patches to be resubmitted against the other tree ;) Kernel with patch is currently running so fingers crossed.

Thanks for testing. I am definitely interested knowing your results.

Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is breaking things (have to test if a revert helps) i get this in some guests:

NMI watchdog: BUG: soft lockup - CPU#0 stuck for 506s! [swapper/0:0]
[ 6620.282805] Modules linked in:
[ 6620.282805] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc6-20150814-linus-doflr-apicrevert+ #1
[ 6620.282805] task: 8221a580 ti: 8220 task.ti: 8220
[ 6620.282805] RIP: e030:[8100122a] [8100122a] xen_hypercall_xen_version+0xa/0x20
[ 6620.282805] RSP: e02b:88000fc03d48 EFLAGS: 0246
[ 6620.282805] RAX: 00040006 RBX: 0200 RCX: 8100122a
[ 6620.282805] RDX: 0001 RSI: deadbeef RDI: deadbeef
[ 6620.282805] RBP: 88000fc03d60 R08: 88000fc03ee0 R09: 00ee
[ 6620.282805] R10: 8220a0c0 R11: 0246 R12:
[ 6620.282805] R13: 0001 R14: 880003b53054 R15: 0005
[ 6620.282805] FS: 7fec747ad800() GS:88000fc0() knlGS:
[ 6620.282805] CS: e033 DS: ES: CR0: 8005003b
[ 6620.282805] CR2: 7ffcb7a7a6d8 CR3: 03164000 CR4: 0660
[ 6620.282805] Stack:
[ 6620.282805] 0068 0007 81008dbd 88000fc03dd8
[ 6620.282805] 81009592 0068 8220a0c0 00ee
[ 6620.282805] 88000fc03ee0 0200 0200 0001
[ 6620.282805] Call Trace:
[ 6620.282805] IRQ
[ 6620.282805] [81008dbd] ? xen_force_evtchn_callback+0xd/0x10
[ 6620.282805] [81009592] check_events+0x12/0x20
[ 6620.282805] [8100957f] ? xen_restore_fl_direct_reloc+0x4/0x4
[ 6620.282805] [81af79a5] ? _raw_spin_unlock_irqrestore+0x25/0x30
[ 6620.282805] [8110ed43] try_to_del_timer_sync+0x43/0x60
[ 6620.282805] [8110eda7] del_timer_sync+0x47/0x60
[ 6620.282805] [81a2b698] inet_csk_reqsk_queue_drop+0x118/0x1f0
[ 6620.282805] [81a2b8c6] reqsk_timer_handler+0x156/0x260
[ 6620.282805] [81a2b770] ? inet_csk_reqsk_queue_drop+0x1f0/0x1f0
[ 6620.282805] [8110f3c7] call_timer_fn.isra.27+0x17/0x80
[ 6620.282805] [81a2b770] ? inet_csk_reqsk_queue_drop+0x1f0/0x1f0
[ 6620.282805] [8110f55d] run_timer_softirq+0x12d/0x200
[ 6620.282805] [810ca6c3] __do_softirq+0x103/0x210
[ 6620.282805] [810ca9cb] irq_exit+0x4b/0xa0
[ 6620.282805] [814f05d4] xen_evtchn_do_upcall+0x34/0x50
[ 6620.282805] [81af932e] xen_do_hypervisor_callback+0x1e/0x40
[ 6620.282805] EOI
[ 6620.282805] [810013aa] ? xen_hypercall_sched_op+0xa/0x20
[ 6620.282805] [810013aa] ? xen_hypercall_sched_op+0xa/0x20
[ 6620.282805] [81008d60] ? xen_safe_halt+0x10/0x20
[ 6620.282805] [810188d3] ? default_idle+0x13/0x20
[ 6620.282805] [81018e1a] ? arch_cpu_idle+0xa/0x10
[ 6620.282805] [810f8e7e] ? default_idle_call+0x2e/0x50
[ 6620.282805] [810f9112] ? cpu_startup_entry+0x272/0x2e0
[ 6620.282805] [81ae7967] ? rest_init+0x77/0x80
[ 6620.282805] [82312f58] ? start_kernel+0x43b/0x448
[ 6620.282805] [823124ef] ? x86_64_start_reservations+0x2a/0x2c
[ 6620.282805] [82316008] ? xen_start_kernel+0x550/0x55c
[ 6620.282805] Code: cc 51 41 53 b8 10 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 11 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80
On 2015-08-15 00:09, Sander Eikelenboom wrote:
On 2015-08-13 00:41, Eric Dumazet wrote:
On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote:

Thanks for the reminder, but luckily i was aware of that, seen enough of your replies asking for patches to be resubmitted against the other tree ;) Kernel with patch is currently running so fingers crossed.

Thanks for testing. I am definitely interested knowing your results.

Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is breaking things (have to test if a revert helps) i get this in some guests:

Should have done that before, because it wasn't in yet .. and likely to fix the issue, also pulled and compiling now.

--
Sander
Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80
On 2015-08-12 23:40, David Miller wrote: From: li...@eikelenboom.it Date: Wed, 12 Aug 2015 22:50:42 +0200 On 2015-08-12 22:41, Eric Dumazet wrote: On Wed, 2015-08-12 at 21:19 +0200, li...@eikelenboom.it wrote: Hi, On my box running Xen with a 4.2-rc6 kernel i still get this splat in dom0, which crashes the box. (i reported a similar splat before (at rc4) here, http://www.spinics.net/lists/netdev/msg337570.html) Never seen this one on 4.1, so it seems a regression. -- Sander [81133.193439] general protection fault: [#1] SMP [81133.204284] Modules linked in: [81133.214934] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.2.0-rc6-20150811-linus-doflr+ #1 [81133.225632] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010 [81133.236237] task: 880059b91580 ti: 880059bb4000 task.ti: 880059bb4000 [81133.246808] RIP: e030:[8110fb18] [8110fb18] detach_if_pending+0x18/0x80 [81133.257354] RSP: e02b:880059bb7848 EFLAGS: 00010086 [81133.267749] RAX: 88004eddc7f0 RBX: 88000e20ae08 RCX: dead00200200 [81133.278201] RDX: RSI: 88005f60e600 RDI: 88000e20ae08 [81133.288723] RBP: 880059bb7848 R08: 0001 R09: 0001 [81133.298930] R10: 0003 R11: 88000e20ad68 R12: [81133.308875] R13: 000101735569 R14: 00015f90 R15: 88005f60e600 [81133.318845] FS: 7f28c6f7c800() GS:88005f60() knlGS: [81133.328864] CS: e033 DS: ES: CR0: 8005003b [81133.338693] CR2: 807f6800 CR3: 3d55c000 CR4: 0660 [81133.348462] Stack: [81133.358005] 880059bb7898 8110fe3f 810fc261 0200 [81133.367682] 0003 88000e20ad68 88005854d400 [81133.377064] 00015f90 880059bb78c8 819b5243 [81133.386374] Call Trace: [81133.395596] [8110fe3f] mod_timer_pending+0x3f/0xe0 [81133.404999] [810fc261] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20 [81133.414255] [819b5243] __nf_ct_refresh_acct+0xa3/0xb0 [81133.423137] [819bbe8b] tcp_packet+0xb3b/0x1290 [81133.431894] [810cb8ca] ? __local_bh_enable_ip+0x2a/0x90 [81133.440622] [819b4939] ? 
__nf_conntrack_find_get+0x129/0x2a0 [81133.449339] [819b682c] nf_conntrack_in+0x29c/0x7c0 [81133.457940] [81a67181] ipv4_conntrack_in+0x21/0x30 [81133.466296] [819aea1c] nf_iterate+0x4c/0x80 [81133.474401] [819aeab4] nf_hook_slow+0x64/0xc0 [81133.482615] [81a211ec] ip_rcv+0x2ec/0x380 [81133.490781] [81a209f0] ? ip_local_deliver_finish+0x130/0x130 [81133.498790] [8197e140] __netif_receive_skb_core+0x2a0/0x970 [81133.506714] [81a56db8] ? inet_gro_receive+0x1c8/0x200 [81133.514609] [81980705] __netif_receive_skb+0x15/0x70 [81133.522333] [8198077e] netif_receive_skb_internal+0x1e/0x80 [81133.529840] [81980f3b] napi_gro_receive+0x6b/0x90 [81133.537173] [81740fb6] rtl8169_poll+0x2e6/0x600 [81133.54] [810fc261] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20 [81133.551566] [81981ad7] net_rx_action+0x1f7/0x300 [81133.558412] [810cb6c3] __do_softirq+0x103/0x210 [81133.565353] [810cb807] run_ksoftirqd+0x37/0x60 [81133.572359] [810e4de0] smpboot_thread_fn+0x130/0x190 [81133.579215] [810e4cb0] ? sort_range+0x20/0x20 [81133.586042] [810e1fae] kthread+0xee/0x110 [81133.592792] [810e1ec0] ? kthread_create_on_node+0x1b0/0x1b0 [81133.599694] [81af92df] ret_from_fork+0x3f/0x70 [81133.606662] [810e1ec0] ? kthread_create_on_node+0x1b0/0x1b0 [81133.613445] Code: 77 28 5d c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 48 8b 47 08 55 48 89 e5 48 85 c0 74 6a 48 8b 0f 48 85 c9 48 89 08 74 04 48 89 41 08 84 d2 74 08 48 c7 47 08 00 00 00 00 f6 47 2a 10 48 [81133.627196] RIP [8110fb18] detach_if_pending+0x18/0x80 [81133.634036] RSP 880059bb7848 [81133.640817] ---[ end trace eaf596e1fcf6a591 ]--- [81133.647521] Kernel panic - not syncing: Fatal exception in interrupt This looks like the bug fixed in David Miller net tree : http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=2235f2ac75fd2501c251b0b699a9632e80239a6d Will pull the net-tree in and re-test. You should not pull the 'net-next', but rather the 'net' one. 'net' is not necessarily included in 'net-next'. 
Thanks for the reminder, but luckily i was aware of that, seen enough of your replies asking for patches to be resubmitted against the other tree ;) Kernel with patch is currently running so fingers crossed. -- Sander