Linux 4.14-rc6 bisected regression tun devices not working anymore in openvpn

2017-10-28 Thread Sander Eikelenboom
L.S.,

While testing a Linux 4.14-rc6 kernel I noticed that OpenVPN no longer works.
My OpenVPN config uses tun devices and is pretty standard.
The OpenVPN version is the one in current Debian stable: openvpn 2.4.0-6+deb9u2

From the OpenVPN log (note the empty device name):
Sat Oct 28 16:03:34 2017 us=175829 TUN/TAP device  opened
Sat Oct 28 16:03:34 2017 us=183027 Note: Cannot set tx queue length on : No such device (errno=19)
Sat Oct 28 16:03:34 2017 us=183055 do_ifconfig, tt->did_ifconfig_ipv6_setup=0
Sat Oct 28 16:03:34 2017 us=183071 /sbin/ip link set dev  up mtu 1500
Cannot find device ""
Sat Oct 28 16:03:34 2017 us=200445 Linux ip link set failed: external program exited with error status: 1
Sat Oct 28 16:03:34 2017 us=200482 Exiting due to fatal error
Sat Oct 28 16:38:17 2017 us=923381 TCP/UDP: Closing socket
Sat Oct 28 16:38:17 2017 us=925986 Closing TUN/TAP interface


The offending commit is: 
0ad646c81b2182f7fa67ec0c8c825e0ee165696d
"tun: call dev_get_valid_name() before register_netdevice()" 

Reverting this commit fixes the issue for me. That is unfortunate, since the
commit itself seems to fix another issue.

--
Sander
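
The failing command in the log above is built with an empty interface name. As a minimal sketch of a defensive check on the user-space side, the helper below refuses to build the `ip link` command when the name is empty. The function name `format_link_up_cmd` and its signature are hypothetical and only serve as an illustration; this is not OpenVPN's actual code, and it does not address the kernel regression itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format the "bring the link up" command for ifname.
 * Returns the command length, or -1 if the interface name is empty or the
 * buffer is too small to hold the whole command.
 */
int format_link_up_cmd(char *buf, size_t len, const char *ifname, int mtu)
{
	int n;

	assert(buf != NULL && ifname != NULL);

	/* An empty name is what produced 'Cannot find device ""' above. */
	if (ifname[0] == '\0')
		return -1;

	n = snprintf(buf, len, "/sbin/ip link set dev %s up mtu %d", ifname, mtu);
	if (n < 0 || (size_t)n >= len)
		return -1;

	return n;
}
```

A caller would treat the -1 return as "the kernel did not hand back a device name" and report that directly, instead of running the external program with an empty argument.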


Re: 4.12-RC2 BUG: scheduling while atomic: irq/47-iwlwifi

2017-05-23 Thread Sander Eikelenboom
On 22/05/17 23:02, Arend Van Spriel wrote:
> 
> 
> On 22-5-2017 14:09, Arend van Spriel wrote:
>> On 5/22/2017 12:57 PM, Johannes Berg wrote:
>>> On Mon, 2017-05-22 at 12:36 +0200, Sander Eikelenboom wrote:
>>>> Hi,
>>>>
>>>> I encountered this splat with 4.12-RC2.
>>>
>>> Ugh, yeah, I should've seen that in the review.
>>>
>>> Arend, please take a look at this. cfg80211_sched_scan_results() cannot
>>> sleep, so you can't rtnl_lock() in there. Looks like you can just rely
>>> on RCU though?
>>
>> I see. I think you are right on RCU. Don't have the code in front of me
>> now, but I think the lookup has an ASSERT_RTNL. Will look into it after
>> my monday meeting :-p
> 
> I realized I have a laptop lying around with intel 3160 wifi chip and
> tried to reproduce the issue. Did not run into the splat running
> 4.12-rc1 from wireless-drivers-next repo. I did not get the email from
> Sander so I don't know any details.
> 
> Here is what I changed based on the info Johannes provided. Can you
> please check if this gets rid of the splat and let me know?

Hi Arend,

I ran your patch today, so far no issues.

--
Sander


> Regards,
> Arend
> ---
> diff --git a/net/wireless/scan.c b/net/wireless/scan.c
> index 14d5f0c..04833bb 100644
> --- a/net/wireless/scan.c
> +++ b/net/wireless/scan.c
> @@ -322,9 +322,7 @@ static void cfg80211_del_sched_scan_req(struct cfg80211_regi
>  {
>  	struct cfg80211_sched_scan_request *pos;
>
> -	ASSERT_RTNL();
> -
> -	list_for_each_entry(pos, &rdev->sched_scan_req_list, list) {
> +	list_for_each_entry_rcu(pos, &rdev->sched_scan_req_list, list) {
>  		if (pos->reqid == reqid)
>  			return pos;
>  	}
> @@ -398,13 +396,13 @@ void cfg80211_sched_scan_results(struct wiphy *wiphy, u64
>  	trace_cfg80211_sched_scan_results(wiphy, reqid);
>  	/* ignore if we're not scanning */
>
> -	rtnl_lock();
> +	rcu_read_lock();
>  	request = cfg80211_find_sched_scan_req(rdev, reqid);
>  	if (request) {
>  		request->report_results = true;
>  		queue_work(cfg80211_wq, &rdev->sched_scan_res_wk);
>  	}
> -	rtnl_unlock();
> +	rcu_read_unlock();
>  }
>  EXPORT_SYMBOL(cfg80211_sched_scan_results);
> 
> 
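
The point made in the quoted discussion is that a path which cannot sleep must not take a sleeping lock such as the RTNL mutex, whereas an RCU-style lookup only reads. The read side can be pictured with a small user-space analogy built on C11 atomics. This is only a sketch under stated assumptions: `struct scan_req`, `scan_req_publish` and `scan_req_find` are made-up names, a single writer is assumed, and removal and grace periods (the part that makes kernel RCU what it is) are left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scheduled-scan request; not the cfg80211 struct. */
struct scan_req {
	uint64_t reqid;
	int report_results;
	struct scan_req *_Atomic next;
};

/* List head; zero-initialised, so the list starts out empty. */
struct scan_req *_Atomic scan_req_head;

/* Writer side (single writer assumed): publish a fully initialised node. */
void scan_req_publish(struct scan_req *req)
{
	struct scan_req *old;

	assert(req != NULL);
	old = atomic_load_explicit(&scan_req_head, memory_order_relaxed);
	atomic_store_explicit(&req->next, old, memory_order_relaxed);
	/* Release store: a reader that sees the node also sees its fields. */
	atomic_store_explicit(&scan_req_head, req, memory_order_release);
}

/* Reader side: only atomic loads, so the lookup never blocks or sleeps. */
struct scan_req *scan_req_find(uint64_t reqid)
{
	struct scan_req *pos;

	pos = atomic_load_explicit(&scan_req_head, memory_order_acquire);
	while (pos != NULL) {
		if (pos->reqid == reqid)
			return pos;
		pos = atomic_load_explicit(&pos->next, memory_order_acquire);
	}
	return NULL;
}
```

Because the reader takes no lock at all, it is safe to call from a context that must not schedule, which is the property the interrupt thread in the splat needed.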



4.12-RC2 BUG: scheduling while atomic: irq/47-iwlwifi

2017-05-22 Thread Sander Eikelenboom
Hi,

I encountered this splat with 4.12-RC2.
--

Sander

[  119.021594] BUG: scheduling while atomic: irq/47-iwlwifi/517/0x0200
[  119.021604] Modules linked in: xt_tcpudp ip6t_rpfilter ipt_REJECT 
nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 
xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc 
ip6table_raw ip6table_security ip6table_mangle iptable_raw iptable_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack 
iptable_security iptable_mangle ebtable_filter ebtables ip6table_filter 
ip6_tables iptable_filter ip_tables x_tables rfcomm bnep binfmt_misc arc4 
iTCO_wdt iTCO_vendor_support uvcvideo videobuf2_vmalloc videobuf2_memops 
videobuf2_v4l2 videobuf2_core videodev intel_rapl cdc_mbim iwlmvm 
x86_pkg_temp_thermal intel_powerclamp mac80211 media cdc_wdm btusb coretemp 
cdc_ncm kvm_intel usbnet mii cdc_acm iwlwifi kvm btintel joydev pcspkr 
serio_raw cfg80211 snd_hda_codec_hdmi
[  119.021701]  bluetooth lpc_ich snd_hda_codec_realtek snd_hda_codec_generic 
shpchp sg ecdh_generic snd_hda_intel thinkpad_acpi snd_hda_codec snd_hwdep 
snd_hda_core snd_pcm snd_timer nvram snd soundcore evdev tpm_tis tpm_tis_core 
tpm algif_skcipher af_alg crct10dif_pclmul crc32_pclmul crc32c_intel 
ghash_clmulni_intel rtsx_pci_sdmmc mmc_core aesni_intel aes_x86_64 crypto_simd 
cryptd glue_helper psmouse i2c_i801 sd_mod ehci_pci ehci_hcd e1000e rtsx_pci 
mfd_core ptp xhci_pci pps_core xhci_hcd
[  119.021759] CPU: 1 PID: 517 Comm: irq/47-iwlwifi Not tainted 
4.12.0-rc2-t440s-20170522+ #1
[  119.021763] Hardware name: LENOVO 20AQS03H00/20AQS03H00, BIOS GJET91WW (2.41 
) 09/21/2016
[  119.021766] Call Trace:
[  119.021778]  ? dump_stack+0x5c/0x84
[  119.021784]  ? __schedule_bug+0x4c/0x70
[  119.021792]  ? __schedule+0x496/0x5c0
[  119.021798]  ? schedule+0x2d/0x80
[  119.021804]  ? schedule_preempt_disabled+0x5/0x10
[  119.021810]  ? __mutex_lock.isra.0+0x18e/0x4c0
[  119.021817]  ? __wake_up+0x2f/0x50
[  119.021833]  ? cfg80211_sched_scan_results+0x19/0x60 [cfg80211]
[  119.021844]  ? cfg80211_sched_scan_results+0x19/0x60 [cfg80211]
[  119.021859]  ? iwl_mvm_rx_lmac_scan_iter_complete_notif+0x17/0x30 [iwlmvm]
[  119.021869]  ? iwl_pcie_rx_handle+0x2a9/0x7e0 [iwlwifi]
[  119.021878]  ? iwl_pcie_irq_handler+0x17c/0x730 [iwlwifi]
[  119.021884]  ? irq_forced_thread_fn+0x60/0x60
[  119.021887]  ? irq_thread_fn+0x16/0x40
[  119.021892]  ? irq_thread+0x109/0x180
[  119.021896]  ? wake_threads_waitq+0x30/0x30
[  119.021901]  ? kthread+0xf2/0x130
[  119.021905]  ? irq_thread_dtor+0x90/0x90
[  119.021910]  ? kthread_create_on_node+0x40/0x40
[  119.021915]  ? ret_from_fork+0x26/0x40


Re: nf_unregister_net_hook: hook not found!

2015-12-30 Thread Sander Eikelenboom

On 2015-12-30 03:39, ebied...@xmission.com wrote:
> Pablo Neira Ayuso <pa...@netfilter.org> writes:
>
>> On Mon, Dec 28, 2015 at 09:05:03PM +0100, Sander Eikelenboom wrote:
>>> Hi,
>>>
>>> Running a 4.4.0-rc6 kernel I encountered the warning below.
>>
>> Cc'ing Eric Biederman.
>>
>> @Sander, could you provide a way to reproduce this?
>
> I am on vacation until the new year, but if this is reproducible we
> should be able to print out reg, reg->pf, reg->hooknum, reg->hook
> to figure out which hook is having something very weird happen to it.
>
> This is happening in some network namespace exit.
>
> Eric

Unfortunately I have found no way to reproduce it.
The 13-second timestamp implies it happened at boot, but I have only seen this once.

--
Sander

>> Thanks.



nf_unregister_net_hook: hook not found!

2015-12-28 Thread Sander Eikelenboom

Hi,

Running a 4.4.0-rc6 kernel I encountered the warning below.

--
Sander



[   13.740472] ip_tables: (C) 2000-2006 Netfilter Core Team
[   13.936237] iwlwifi :03:00.0: L1 Enabled - LTR Disabled
[   13.945391] iwlwifi :03:00.0: L1 Enabled - LTR Disabled
[   13.947434] iwlwifi :03:00.0: Radio type=0x2-0x1-0x0
[   14.223990] iwlwifi :03:00.0: L1 Enabled - LTR Disabled
[   14.232065] iwlwifi :03:00.0: L1 Enabled - LTR Disabled
[   14.233570] iwlwifi :03:00.0: Radio type=0x2-0x1-0x0
[   14.328141] systemd-logind[2485]: Failed to start user service: 
Unknown unit: user@117.service

[   14.356634] systemd-logind[2485]: New session c1 of user lightdm.
[   14.357320] [ cut here ]
[   14.357327] WARNING: CPU: 2 PID: 102 at net/netfilter/core.c:143 
netfilter_net_exit+0x25/0x50()

[   14.357328] nf_unregister_net_hook: hook not found!
[   14.357371] Modules linked in: iptable_security(+) iptable_raw 
iptable_filter ip_tables x_tables input_polldev bnep binfmt_misc nfsd 
auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc uvcvideo 
videobuf2_vmalloc iTCO_wdt arc4 videobuf2_memops iTCO_vendor_support 
intel_rapl iosf_mbi videobuf2_v4l2 x86_pkg_temp_thermal intel_powerclamp 
btusb coretemp snd_hda_codec_hdmi iwldvm videobuf2_core btrtl kvm_intel 
v4l2_common mac80211 videodev btbcm snd_hda_codec_conexant btintel media 
kvm snd_hda_codec_generic bluetooth psmouse thinkpad_acpi iwlwifi 
snd_hda_intel pcspkr serio_raw snd_hda_codec nvram cfg80211 snd_hwdep 
snd_hda_core rfkill i2c_i801 lpc_ich snd_pcm mfd_core snd_timer evdev 
snd soundcore shpchp tpm_tis tpm algif_skcipher af_alg crct10dif_pclmul 
crc32_pclmul crc32c_intel aesni_intel
[   14.357380]  ehci_pci sdhci_pci aes_x86_64 glue_helper ehci_hcd 
e1000e lrw ablk_helper sg sdhci cryptd sd_mod ptp mmc_core usbcore 
usb_common pps_core
[   14.357383] CPU: 2 PID: 102 Comm: kworker/u16:3 Tainted: G U  
4.4.0-rc6-x220-20151224+ #1
[   14.357384] Hardware name: LENOVO 42912ZU/42912ZU, BIOS 8DET69WW 
(1.39 ) 07/18/2013

[   14.357390] Workqueue: netns cleanup_net
[   14.357393]  81a27dfd 81359c69 88030e7cbd40 
81060297
[   14.357395]  88030e820d80 88030e7cbd90 81c962d8 
81c962e0
[   14.357397]  88030e7cbdf8 81060317 81a2c010 
88030018

[   14.357398] Call Trace:
[   14.357405]  [] ? dump_stack+0x40/0x57
[   14.357408]  [] ? warn_slowpath_common+0x77/0xb0
[   14.357410]  [] ? warn_slowpath_fmt+0x47/0x50
[   14.357416]  [] ? mutex_lock+0x9/0x30
[   14.357418]  [] ? netfilter_net_exit+0x25/0x50
[   14.357421]  [] ? ops_exit_list.isra.6+0x2e/0x60
[   14.357424]  [] ? cleanup_net+0x1ab/0x280
[   14.357427]  [] ? process_one_work+0x133/0x330
[   14.357429]  [] ? worker_thread+0x60/0x470
[   14.357430]  [] ? process_one_work+0x330/0x330
[   14.357434]  [] ? kthread+0xca/0xe0
[   14.357436]  [] ? 
kthread_create_on_node+0x170/0x170

[   14.357439]  [] ? ret_from_fork+0x3f/0x70
[   14.357441]  [] ? 
kthread_create_on_node+0x170/0x170

[   14.357443] ---[ end trace 9984cc4b0e89f818 ]---
[   14.357443] [ cut here ]
[   14.357446] WARNING: CPU: 2 PID: 102 at net/netfilter/core.c:143 
netfilter_net_exit+0x25/0x50()

[   14.357446] nf_unregister_net_hook: hook not found!
[   14.357472] Modules linked in: iptable_security(+) iptable_raw 
iptable_filter ip_tables x_tables input_polldev bnep binfmt_misc nfsd 
auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc uvcvideo 
videobuf2_vmalloc iTCO_wdt arc4 videobuf2_memops iTCO_vendor_support 
intel_rapl iosf_mbi videobuf2_v4l2 x86_pkg_temp_thermal intel_powerclamp 
btusb coretemp snd_hda_codec_hdmi iwldvm videobuf2_core btrtl kvm_intel 
v4l2_common mac80211 videodev btbcm snd_hda_codec_conexant btintel media 
kvm snd_hda_codec_generic bluetooth psmouse thinkpad_acpi iwlwifi 
snd_hda_intel pcspkr serio_raw snd_hda_codec nvram cfg80211 snd_hwdep 
snd_hda_core rfkill i2c_i801 lpc_ich snd_pcm mfd_core snd_timer evdev 
snd soundcore shpchp tpm_tis tpm algif_skcipher af_alg crct10dif_pclmul 
crc32_pclmul crc32c_intel aesni_intel
[   14.357478]  ehci_pci sdhci_pci aes_x86_64 glue_helper ehci_hcd 
e1000e lrw ablk_helper sg sdhci cryptd sd_mod ptp mmc_core usbcore 
usb_common pps_core
[   14.357480] CPU: 2 PID: 102 Comm: kworker/u16:3 Tainted: G U  W   
4.4.0-rc6-x220-20151224+ #1
[   14.357481] Hardware name: LENOVO 42912ZU/42912ZU, BIOS 8DET69WW 
(1.39 ) 07/18/2013

[   14.357484] Workqueue: netns cleanup_net
[   14.357486]  81a27dfd 81359c69 88030e7cbd40 
81060297
[   14.357488]  88030e820db8 88030e7cbd90 81c962d8 
81c962e0
[   14.357489]  88030e7cbdf8 81060317 81a2c010 
88030018

[   14.357490] Call Trace:
[   14.357493]  [] ? dump_stack+0x40/0x57
[   14.357495]  [] ? warn_slowpath_common+0x77/0xb0
[   14.357497]  [] ? warn_slowpath_fmt+0x47/0x50
[   14.357499]  [] ? 

Re: [PATCH net] switchdev: bridge: Check return code is not EOPNOTSUPP

2015-11-13 Thread Sander Eikelenboom

On 2015-11-13 12:06, Ido Schimmel wrote:
> When NET_SWITCHDEV=n, switchdev_port_attr_set simply returns EOPNOTSUPP.
> In this case we should not emit errors and warnings to the kernel log.


Hi Ido,

Thanks for your patch!

It fixes these:
[  207.245442] vif vif-1-0 vif1.0: failed to set HW ageing time
[  207.245443] xen_bridge: error setting offload STP state on port1(vif1.0)

But I still have these:
[  335.412194] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
[  335.412204] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
[  335.412311] vif19.0-emu: set_features() failed (-1); wanted 0x008248c9, left 0x0080001b48c9
[  335.412319] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
[  335.412326] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
[  335.535955] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
[  335.535965] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
[  335.615392] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
[  335.615401] xen_bridge: set_features() failed (-1); wanted 0x00801fdb78c9, left 0x00801fff78e9


--
Sander


> Reported-by: Sander Eikelenboom <li...@eikelenboom.it>
> Fixes: 0bc05d585d38 ("switchdev: allow caller to explicitly request attr_set as deferred")
> Fixes: 6ac311ae8bfb ("Adding switchdev ageing notification on port bridged")
> Signed-off-by: Ido Schimmel <ido...@mellanox.com>
> Signed-off-by: Jiri Pirko <j...@mellanox.com>
> ---
>  net/bridge/br_stp.c    | 2 +-
>  net/bridge/br_stp_if.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
> index f7e8dee..5f3f645 100644
> --- a/net/bridge/br_stp.c
> +++ b/net/bridge/br_stp.c
> @@ -48,7 +48,7 @@ void br_set_state(struct net_bridge_port *p, unsigned int state)
>
>  	p->state = state;
>  	err = switchdev_port_attr_set(p->dev, &attr);
> -	if (err)
> +	if (err && err != -EOPNOTSUPP)
>  		br_warn(p->br, "error setting offload STP state on port %u(%s)\n",
>  				(unsigned int) p->port_no, p->dev->name);
>  }
> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
> index fa53d7a..5396ff08 100644
> --- a/net/bridge/br_stp_if.c
> +++ b/net/bridge/br_stp_if.c
> @@ -50,7 +50,7 @@ void br_init_port(struct net_bridge_port *p)
>  	p->config_pending = 0;
>
>  	err = switchdev_port_attr_set(p->dev, &attr);
> -	if (err)
> +	if (err && err != -EOPNOTSUPP)
>  		netdev_err(p->dev, "failed to set HW ageing time\n");
>  }
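
The pattern in the patch above (treat "not supported" as a silent no-op, but still report real failures) can be tried out in user space. The sketch below is only an illustration: `offload_attr_set_stub` and `report_offload_error` are hypothetical names, and the stub merely imitates what the commit message says happens when NET_SWITCHDEV=n.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for an offload hook that is compiled out. */
int offload_attr_set_stub(void)
{
	return -EOPNOTSUPP;
}

/*
 * Log only genuine failures. Returns 1 if a message was printed and 0 if
 * the result was success or "not supported".
 */
int report_offload_error(int err, const char *what)
{
	assert(what != NULL);

	if (err && err != -EOPNOTSUPP) {
		fprintf(stderr, "failed to set %s (%d)\n", what, err);
		return 1;
	}
	return 0;
}
```

With this check, a kernel built without the offload support stays quiet, while an error such as -EINVAL from a real driver is still logged.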


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net] switchdev: bridge: Check return code is not EOPNOTSUPP

2015-11-13 Thread Sander Eikelenboom

On 2015-11-13 13:43, Ido Schimmel wrote:
> Fri, Nov 13, 2015 at 02:34:45PM IST, li...@eikelenboom.it wrote:
>> On 2015-11-13 12:06, Ido Schimmel wrote:
>>> When NET_SWITCHDEV=n, switchdev_port_attr_set simply returns
>>> EOPNOTSUPP.
>>> In this case we should not emit errors and warnings to the kernel log.
>>
>> Hi Ido,
>>
>> Thanks for your patch!
>>
>> It fixes these:
>> [  207.245442] vif vif-1-0 vif1.0: failed to set HW ageing time
>> [  207.245443] xen_bridge: error setting offload STP state on port1(vif1.0)
>>
>> But I still have these:
>> [  335.412194] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
>> [  335.412204] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
>> [  335.412311] vif19.0-emu: set_features() failed (-1); wanted 0x008248c9, left 0x0080001b48c9
>> [  335.412319] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
>> [  335.412326] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
>> [  335.535955] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
>> [  335.535965] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
>> [  335.615392] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
>> [  335.615401] xen_bridge: set_features() failed (-1); wanted 0x00801fdb78c9, left 0x00801fff78e9
>
> Yes, this is a different issue and I see that Nik is already working on
> it. Can you please try his patch?
>
> http://patchwork.ozlabs.org/patch/544242/

Yeah, that suppresses the warning, thanks!

--
Sander


[linux-4.4-mw] BUG: unable to handle kernel paging request ip_vs_out.constprop

2015-11-12 Thread Sander Eikelenboom

Hi All,

Just got a crash with a linux-4.4-mw kernel.
I'm using a routed bridge, and apart from the splat below I also get some
other interesting messages that aren't there in 4.3 (and that are perhaps
relevant to the crash as well):
[  207.033768] vif vif-1-0 vif1.0: set_features() failed (-1); wanted 
0x00044803, left 0x000400114813
[  207.033780] vif vif-1-0 vif1.0: set_features() failed (-1); wanted 
0x00044803, left 0x000400114813
[  207.245435] xen_bridge: error setting offload STP state on port 
1(vif1.0)

[  207.245442] vif vif-1-0 vif1.0: failed to set HW ageing time
[  207.245443] xen_bridge: error setting offload STP state on port 
1(vif1.0)
[  207.245491] vif vif-1-0 vif1.0: set_features() failed (-1); wanted 
0x00044803, left 0x000400114813


The commit message for the commit that introduced the "set HW ageing time"
error message doesn't tell me much about its purpose. If it's not related,
I can report it as a separate issue.


--
Sander

The crash:
[  354.328687] BUG: unable to handle kernel paging request at 
880049aa8000

[  354.350206] IP: [] ip_vs_out.constprop.25+0x47/0x60
[  354.360882] PGD 2212067 PUD 25b4067 PMD 5ffb6067 PTE 0
[  354.371587] Oops:  [#1] SMP
[  354.382143] Modules linked in:
[  354.392537] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
4.3.0-mw-2015-linus-doflr+ #1
[  354.403105] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS 
V1.8B1 09/13/2010
[  354.413666] task: 82218580 ti: 8220 task.ti: 
8220
[  354.424255] RIP: e030:[]  [] 
ip_vs_out.constprop.25+0x47/0x60

[  354.434742] RSP: e02b:88005f6034b0  EFLAGS: 00010246
[  354.445006] RAX: 0001 RBX: 88005f6034f8 RCX: 
880049aa7ce0
[  354.455262] RDX: 88003c0e5500 RSI: 0003 RDI: 
880004e0e800
[  354.465422] RBP: 88005f6034b8 R08: 0014 R09: 
0003
[  354.475508] R10: 0001 R11: 880040f394cc R12: 
88005f603528
[  354.485567] R13: 88003c0e5500 R14: 822da2e8 R15: 
88003c0e5500
[  354.495595] FS:  7f0243c2b700() GS:88005f60() 
knlGS:

[  354.505474] CS:  e033 DS:  ES:  CR0: 8005003b
[  354.515135] CR2: 880049aa8000 CR3: 59271000 CR4: 
0660

[  354.524794] Stack:
[  354.534319]  81a074fc 88005f6034e8 8199e138 
88003c0e5500
[  354.543981]  88005f603528 88003c0e5500  
88005f603518
[  354.553577]  8199e1af 880005300048 88003c0e5500 
822da2e8

[  354.563160] Call Trace:
[  354.572418]  
[  354.572480]  [] ? ip_vs_local_reply4+0x1c/0x20
[  354.590458]  [] nf_iterate+0x58/0x70
[  354.599372]  [] nf_hook_slow+0x5f/0xb0
[  354.608245]  [] __ip_local_out+0x9e/0xb0
[  354.617036]  [] ? ip_forward_options+0x1a0/0x1a0
[  354.625874]  [] ip_local_out+0x17/0x40
[  354.634383]  [] ip_build_and_send_pkt+0x148/0x1c0
[  354.642715]  [] tcp_v4_send_synack+0x56/0xa0
[  354.650893]  [] ? 
inet_csk_reqsk_queue_hash_add+0x68/0x90

[  354.659083]  [] tcp_conn_request+0x95d/0x970
[  354.667196]  [] ? __local_bh_enable_ip+0x26/0x90
[  354.675246]  [] tcp_v4_conn_request+0x47/0x50
[  354.683254]  [] tcp_rcv_state_process+0x183/0xca0
[  354.691004]  [] tcp_v4_do_rcv+0x5c/0x1f0
[  354.698533]  [] tcp_v4_rcv+0x987/0x9a0
[  354.705968]  [] ? ipv4_confirm+0x78/0xf0
[  354.713370]  [] ip_local_deliver_finish+0x84/0x120
[  354.720739]  [] ip_local_deliver+0x42/0xd0
[  354.728029]  [] ? inet_del_offload+0x40/0x40
[  354.735270]  [] ip_rcv_finish+0x106/0x320
[  354.742413]  [] ip_rcv+0x211/0x370
[  354.749268]  [] ? 
ip_local_deliver_finish+0x120/0x120
[  354.755929]  [] 
__netif_receive_skb_core+0x2cb/0x970

[  354.762535]  [] ? nf_nat_setup_info+0x7a/0x2f0
[  354.769131]  [] __netif_receive_skb+0x11/0x70
[  354.775481]  [] 
netif_receive_skb_internal+0x1e/0x80

[  354.781638]  [] ? nf_hook_slow+0x5f/0xb0
[  354.787771]  [] netif_receive_skb+0x9/0x10
[  354.793916]  [] br_handle_frame_finish+0x178/0x4b0
[  354.800077]  [] ? nf_nat_ipv4_fn+0x167/0x1e0
[  354.806260]  [] ? br_handle_local_finish+0x50/0x50
[  354.812405]  [] 
br_nf_pre_routing_finish+0x183/0x360

[  354.818574]  [] ? br_netif_receive_skb+0x10/0x10
[  354.824775]  [] br_nf_pre_routing+0x2a7/0x380
[  354.830780]  [] ? br_nf_forward_ip+0x3f0/0x3f0
[  354.836567]  [] nf_iterate+0x58/0x70
[  354.842281]  [] nf_hook_slow+0x5f/0xb0
[  354.847886]  [] br_handle_frame+0x1a2/0x290
[  354.853520]  [] ? br_netif_receive_skb+0x10/0x10
[  354.859206]  [] ? 
br_handle_frame_finish+0x4b0/0x4b0
[  354.864824]  [] 
__netif_receive_skb_core+0x12b/0x970
[  354.870350]  [] ? 
__raw_callee_save___pv_queued_spin_unlock+0x11/0x20

[  354.875880]  [] __netif_receive_skb+0x11/0x70
[  354.881293]  [] 
netif_receive_skb_internal+0x1e/0x80

[  354.886653]  [] netif_receive_skb+0x9/0x10
[  354.891918]  [] xenvif_tx_action+0x693/0x820
[  354.897170]  [] xenvif_poll+0x29/0x70
[  

Re: [linux-4.4-mw] BUG: unable to handle kernel paging request ip_vs_out.constprop

2015-11-12 Thread Sander Eikelenboom

On 2015-11-12 15:09, Eric Dumazet wrote:

On Thu, 2015-11-12 at 11:08 +0100, Sander Eikelenboom wrote:

Hi All,

Just got a crash with a linux-4.4-mw kernel.
I'm using a routed bridge, and apart from the splat below I also get some
other interesting messages that aren't there in 4.3 (and that are perhaps
relevant to the crash as well):
[  207.033768] vif vif-1-0 vif1.0: set_features() failed (-1); wanted
0x00044803, left 0x000400114813
[  207.033780] vif vif-1-0 vif1.0: set_features() failed (-1); wanted
0x00044803, left 0x000400114813
[  207.245435] xen_bridge: error setting offload STP state on port
1(vif1.0)
[  207.245442] vif vif-1-0 vif1.0: failed to set HW ageing time
[  207.245443] xen_bridge: error setting offload STP state on port
1(vif1.0)
[  207.245491] vif vif-1-0 vif1.0: set_features() failed (-1); wanted
0x00044803, left 0x000400114813

The commit message for the commit that introduced the "set HW ageing time"
error message doesn't tell me much about its purpose. If it's not related,
I can report it as a separate issue.

--
Sander

The crash:
[  354.328687] BUG: unable to handle kernel paging request at
880049aa8000
[  354.350206] IP: [] 
ip_vs_out.constprop.25+0x47/0x60

[  354.360882] PGD 2212067 PUD 25b4067 PMD 5ffb6067 PTE 0
[  354.371587] Oops:  [#1] SMP
[  354.382143] Modules linked in:
[  354.392537] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
4.3.0-mw-2015-linus-doflr+ #1
[  354.403105] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , 
BIOS

V1.8B1 09/13/2010
[  354.413666] task: 82218580 ti: 8220 task.ti:
8220
[  354.424255] RIP: e030:[]  []
ip_vs_out.constprop.25+0x47/0x60
[  354.434742] RSP: e02b:88005f6034b0  EFLAGS: 00010246
[  354.445006] RAX: 0001 RBX: 88005f6034f8 RCX:
880049aa7ce0
[  354.455262] RDX: 88003c0e5500 RSI: 0003 RDI:
880004e0e800
[  354.465422] RBP: 88005f6034b8 R08: 0014 R09:
0003
[  354.475508] R10: 0001 R11: 880040f394cc R12:
88005f603528
[  354.485567] R13: 88003c0e5500 R14: 822da2e8 R15:
88003c0e5500
[  354.495595] FS:  7f0243c2b700() GS:88005f60()
knlGS:
[  354.505474] CS:  e033 DS:  ES:  CR0: 8005003b
[  354.515135] CR2: 880049aa8000 CR3: 59271000 CR4:
0660
[  354.524794] Stack:
[  354.534319]  81a074fc 88005f6034e8 8199e138
88003c0e5500
[  354.543981]  88005f603528 88003c0e5500 
88005f603518
[  354.553577]  8199e1af 880005300048 88003c0e5500
822da2e8
[  354.563160] Call Trace:
[  354.572418]  
[  354.572480]  [] ? ip_vs_local_reply4+0x1c/0x20
[  354.590458]  [] nf_iterate+0x58/0x70
[  354.599372]  [] nf_hook_slow+0x5f/0xb0
[  354.608245]  [] __ip_local_out+0x9e/0xb0
[  354.617036]  [] ? ip_forward_options+0x1a0/0x1a0
[  354.625874]  [] ip_local_out+0x17/0x40
[  354.634383]  [] ip_build_and_send_pkt+0x148/0x1c0
[  354.642715]  [] tcp_v4_send_synack+0x56/0xa0
[  354.650893]  [] ?
inet_csk_reqsk_queue_hash_add+0x68/0x90
[  354.659083]  [] tcp_conn_request+0x95d/0x970
[  354.667196]  [] ? __local_bh_enable_ip+0x26/0x90
[  354.675246]  [] tcp_v4_conn_request+0x47/0x50
[  354.683254]  [] tcp_rcv_state_process+0x183/0xca0
[  354.691004]  [] tcp_v4_do_rcv+0x5c/0x1f0
[  354.698533]  [] tcp_v4_rcv+0x987/0x9a0
[  354.705968]  [] ? ipv4_confirm+0x78/0xf0
[  354.713370]  [] 
ip_local_deliver_finish+0x84/0x120

[  354.720739]  [] ip_local_deliver+0x42/0xd0
[  354.728029]  [] ? inet_del_offload+0x40/0x40
[  354.735270]  [] ip_rcv_finish+0x106/0x320
[  354.742413]  [] ip_rcv+0x211/0x370
[  354.749268]  [] ?
ip_local_deliver_finish+0x120/0x120
[  354.755929]  []
__netif_receive_skb_core+0x2cb/0x970
[  354.762535]  [] ? nf_nat_setup_info+0x7a/0x2f0
[  354.769131]  [] __netif_receive_skb+0x11/0x70
[  354.775481]  []
netif_receive_skb_internal+0x1e/0x80
[  354.781638]  [] ? nf_hook_slow+0x5f/0xb0
[  354.787771]  [] netif_receive_skb+0x9/0x10
[  354.793916]  [] 
br_handle_frame_finish+0x178/0x4b0

[  354.800077]  [] ? nf_nat_ipv4_fn+0x167/0x1e0
[  354.806260]  [] ? 
br_handle_local_finish+0x50/0x50

[  354.812405]  []
br_nf_pre_routing_finish+0x183/0x360
[  354.818574]  [] ? br_netif_receive_skb+0x10/0x10
[  354.824775]  [] br_nf_pre_routing+0x2a7/0x380
[  354.830780]  [] ? br_nf_forward_ip+0x3f0/0x3f0
[  354.836567]  [] nf_iterate+0x58/0x70
[  354.842281]  [] nf_hook_slow+0x5f/0xb0
[  354.847886]  [] br_handle_frame+0x1a2/0x290
[  354.853520]  [] ? br_netif_receive_skb+0x10/0x10
[  354.859206]  [] ?
br_handle_frame_finish+0x4b0/0x4b0
[  354.864824]  []
__netif_receive_skb_core+0x12b/0x970
[  354.870350]  [] ?
__raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[  354.875880]  [] __netif_receive_skb+0x11/0x70
[  354.881293]  []
netif_receive_skb_internal+0x1e/0x80
[  354.886653]  [] netif_receive_skb+0x9/0x10
[  

Re: [linux-4.4-mw] BUG: unable to handle kernel paging request ip_vs_out.constprop

2015-11-12 Thread Sander Eikelenboom

On 2015-11-12 17:52, Eric Dumazet wrote:
> On Thu, 2015-11-12 at 16:16 +0100, Sander Eikelenboom wrote:
>
>>> Thanks for the report, please try following patch :
>>
>> Hi Eric,
>>
>> Thanks for the patch!
>> Got it up and running at the moment, but since I don't have a clear
>> trigger it will take 1 or 2 days before I can report something back.
>
> Don't worry, I have a pretty good picture of the bug and patch must fix
> it.
>
> I'll submit it formally asap.

Ok.

Do you know what these new warnings are for?
(Apparently all networking, including bridging, works fine, so is this
just overly verbose logging?)


[  207.033768] vif vif-1-0 vif1.0: set_features() failed (-1); wanted
0x00044803, left 0x000400114813
[  207.033780] vif vif-1-0 vif1.0: set_features() failed (-1); wanted
0x00044803, left 0x000400114813
[  207.245435] xen_bridge: error setting offload STP state on port
1(vif1.0)
[  207.245442] vif vif-1-0 vif1.0: failed to set HW ageing time
[  207.245443] xen_bridge: error setting offload STP state on port
1(vif1.0)
[  207.245491] vif vif-1-0 vif1.0: set_features() failed (-1); wanted
0x00044803, left 0x000400114813

--
Sander
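
Each of the set_features() lines above reports two feature masks: the set that was requested ("wanted") and the set the device ended up with ("left"). A quick way to see which bits differ is plain bit arithmetic, sketched below. The function names are made up for this illustration, the hex values are used exactly as they appear in the archived log (which may have lost leading zeros), and the meaning of the individual bits depends on the kernel version and is not decoded here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bits that were requested but are not set in the resulting mask. */
uint64_t features_missing(uint64_t wanted, uint64_t left)
{
	return wanted & ~left;
}

/* Bits that remain set although they were not requested. */
uint64_t features_extra(uint64_t wanted, uint64_t left)
{
	return left & ~wanted;
}

/* Print both differences for one "wanted X, left Y" pair from the log. */
void print_feature_diff(uint64_t wanted, uint64_t left)
{
	uint64_t missing = features_missing(wanted, left);
	uint64_t extra = features_extra(wanted, left);

	/* Sanity check: together the two sets are exactly the differing bits. */
	assert((missing | extra) == (wanted ^ left));

	printf("missing: 0x%016llx\n", (unsigned long long)missing);
	printf("extra:   0x%016llx\n", (unsigned long long)extra);
}
```

For example, `print_feature_diff(0x00044803ULL, 0x000400114813ULL)` takes the pair from the vif1.0 lines and splits the mismatch into the requested-but-absent bits and the still-present bits.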


Re: Netfilter: BUG: unable to handle kernel paging request, RIP: physdev_mt+0xd6/0x160

2015-09-14 Thread Sander Eikelenboom

On 2015-09-13 20:06, Florian Westphal wrote:
> Sander Eikelenboom <li...@eikelenboom.it> wrote:
>> Using a linux-4.3-rc1 kernel I encountered the splat below:
>
> Thanks for reporting this bug.
>
>> [  290.200642] BUG: unable to handle kernel paging request at 0484195d
>> [  290.211702] IP: [] physdev_mt+0xd6/0x160
>
> [..]
>
>> [  290.444088]  [] ipt_do_table+0x210/0x390
>> [  290.461951]  [] iptable_filter_hook+0x2e/0x70
>> [  290.470756]  [] nf_iterate+0x4c/0x80
>> [  290.479587]  [] nf_hook_slow+0x64/0xc0
>> [  290.488341]  [] ip_forward+0x369/0x3c0
>> [  290.496927]  [] ? ip_frag_mem+0x40/0x40
>> [  290.505365]  [] ip_rcv_finish+0x101/0x330
>> [  290.513480]  [] ip_rcv+0x291/0x390
>> [  290.521562]  [] ?
>
> Aye, ip forwarding of bridged packets with call-iptables=1 is broken.
>
> Please, could you try this patch?  It fixes this bug for me.

Hi Florian,

Works for me too, thanks for the fix!

--
Sander


diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -355,6 +355,7 @@ static int br_nf_pre_routing_finish(struct sock *sk, struct sk_buff *skb)
 	struct iphdr *iph = ip_hdr(skb);
 	struct nf_bridge_info *nf_bridge = nf_bridge_info_get(skb);
 	struct rtable *rt;
+	bool daddr_changed;
 	int err;
 
 	nf_bridge->frag_max_size = IPCB(skb)->frag_max_size;
@@ -363,8 +364,15 @@ static int br_nf_pre_routing_finish(struct sock *sk, struct sk_buff *skb)
 		skb->pkt_type = PACKET_OTHERHOST;
 		nf_bridge->pkt_otherhost = false;
 	}
+
+	/* set physoutdev to NULL, its set by the bridge forward hook but
+	 * frame might be routed instead of bridged.
+	 */
+	daddr_changed = br_nf_ipv4_daddr_was_changed(skb, nf_bridge);
+	nf_bridge->physoutdev = NULL;
 	nf_bridge->in_prerouting = 0;
-	if (br_nf_ipv4_daddr_was_changed(skb, nf_bridge)) {
+
+	if (daddr_changed) {
 		if ((err = ip_route_input(skb, iph->daddr, iph->saddr, iph->tos, dev))) {
 			struct in_device *in_dev = __in_dev_get_rcu(dev);

diff --git a/net/bridge/br_netfilter_ipv6.c b/net/bridge/br_netfilter_ipv6.c
index 77383bf..77b 100644
--- a/net/bridge/br_netfilter_ipv6.c
+++ b/net/bridge/br_netfilter_ipv6.c
@@ -167,6 +167,7 @@ static int br_nf_pre_routing_finish_ipv6(struct sock *sk, struct sk_buff *skb)
 	struct rtable *rt;
 	struct net_device *dev = skb->dev;
 	const struct nf_ipv6_ops *v6ops = nf_get_ipv6_ops();
+	bool daddr_changed;
 
 	nf_bridge->frag_max_size = IP6CB(skb)->frag_max_size;
 
@@ -174,8 +175,12 @@ static int br_nf_pre_routing_finish_ipv6(struct sock *sk, struct sk_buff *skb)
 		skb->pkt_type = PACKET_OTHERHOST;
 		nf_bridge->pkt_otherhost = false;
 	}
+
+	daddr_changed = br_nf_ipv6_daddr_was_changed(skb, nf_bridge);
+	nf_bridge->physoutdev = NULL;
 	nf_bridge->in_prerouting = 0;
-	if (br_nf_ipv6_daddr_was_changed(skb, nf_bridge)) {
+
+	if (daddr_changed) {
 		skb_dst_drop(skb);
 		v6ops->route_input(skb);



Netfilter: BUG: unable to handle kernel paging request, RIP: physdev_mt+0xd6/0x160

2015-09-13 Thread Sander Eikelenboom

Using a linux-4.3-rc1 kernel i encountered the splat below:

addr2line gives:
/usr/src/new/linux-linus/include/linux/netfilter/x_tables.h:350

which is:

/*
 * This helper is performance critical and must be inlined
 */
static inline unsigned long ifname_compare_aligned(const char *_a,
						   const char *_b,
						   const char *_mask)
{
	const unsigned long *a = (const unsigned long *)_a;
	const unsigned long *b = (const unsigned long *)_b;
	const unsigned long *mask = (const unsigned long *)_mask;
	unsigned long ret;

	ret = (a[0] ^ b[0]) & mask[0];
	if (IFNAMSIZ > sizeof(unsigned long))
HERE -->	ret |= (a[1] ^ b[1]) & mask[1];
	if (IFNAMSIZ > 2 * sizeof(unsigned long))
		ret |= (a[2] ^ b[2]) & mask[2];
	if (IFNAMSIZ > 3 * sizeof(unsigned long))
		ret |= (a[3] ^ b[3]) & mask[3];
	BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
	return ret;
}
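
For readers who want to experiment with the helper quoted above outside
the kernel, here is a minimal user-space sketch of the same word-wise
masked comparison. It is not the kernel code: struct ifname, ifname_set(),
ifname_mask() and ifname_compare() are hypothetical names made up for this
illustration, the words are loaded with memcpy() instead of pointer casts
so the sketch stays well-defined in plain C, and the mask convention (0xff
in every byte that should be compared) is an assumption.

```c
#include <assert.h>
#include <string.h>

#define IFNAMSIZ 16	/* same value as the kernel constant */
#define NWORDS (IFNAMSIZ / sizeof(unsigned long))

/* Hypothetical helper type: a fixed-size, NUL-padded name buffer. */
struct ifname {
	char s[IFNAMSIZ];
};

/* Store an interface name, padding the rest of the buffer with NULs. */
static void ifname_set(struct ifname *n, const char *name)
{
	assert(strlen(name) < IFNAMSIZ);
	memset(n->s, 0, IFNAMSIZ);
	memcpy(n->s, name, strlen(name));
}

/* Build a mask covering the first len bytes (assumed convention: 0xff = compare). */
static void ifname_mask(struct ifname *m, size_t len)
{
	assert(len <= IFNAMSIZ);
	memset(m->s, 0, IFNAMSIZ);
	memset(m->s, 0xff, len);
}

/* Word-wise compare: returns 0 when all masked bytes of a and b agree. */
static unsigned long ifname_compare(const struct ifname *a,
				    const struct ifname *b,
				    const struct ifname *mask)
{
	unsigned long ret = 0;

	for (size_t i = 0; i < NWORDS; i++) {
		unsigned long wa, wb, wm;

		memcpy(&wa, a->s + i * sizeof(wa), sizeof(wa));
		memcpy(&wb, b->s + i * sizeof(wb), sizeof(wb));
		memcpy(&wm, mask->s + i * sizeof(wm), sizeof(wm));
		ret |= (wa ^ wb) & wm;
	}
	return ret;
}
```

The point relevant to the splat is that every word of both buffers is
read unconditionally, so each pointer passed in has to reference at least
IFNAMSIZ readable bytes; a stale or bogus device pointer faults on the
first word that is not mapped.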

--
Sander

[  290.200642] BUG: unable to handle kernel paging request at 
0484195d

[  290.211702] IP: [] physdev_mt+0xd6/0x160
[  290.222716] PGD 591ea067 PUD 5772a067 PMD 0
[  290.233389] Oops:  [#1] SMP
[  290.244017] Modules linked in:
[  290.254338] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
4.3.0-rc1-20150913-linus-doflr+ #1
[  290.264862] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS 
V1.8B1 09/13/2010
[  290.275319] task: 8221a580 ti: 8220 task.ti: 
8220
[  290.285909] RIP: e030:[]  [] 
physdev_mt+0xd6/0x160

[  290.296374] RSP: e02b:88005f6037b0  EFLAGS: 00010206
[  290.306758] RAX: 00302e3531666976 RBX: 88414d00 RCX: 


[  290.310800] xen_bridge: port 13(vif13.0) entered forwarding state
[  290.327013] RDX: c90003c0c4f0 RSI: 88003bfba000 RDI: 
0002
[  290.337148] RBP: 88005f6037b0 R08: 04841955 R09: 
880057bf2501
[  290.347361] R10:  R11:  R12: 
880004b4a24e
[  290.357395] R13: 8800044bc000 R14: c90003c0c460 R15: 
c90003c0c4d0
[  290.367437] FS:  7ff6d0ed3700() GS:88005f60() 
knlGS:

[  290.377264] CS:  e033 DS:  ES:  CR0: 8005003b
[  290.386939] CR2: 0484195d CR3: 48ba6000 CR4: 
0660

[  290.396564] Stack:
[  290.406066]  88005f603848 81a4e6c0 c90003c09008 
8110df32
[  290.415683]  8800559ddc00 880004d5 c90003c09040 
0001
[  290.425294]  822f7c40 c90003c0c4f0 8800044bc000 
880004d5

[  290.434788] Call Trace:
[  290.444034]  
[  290.444088]  [] ipt_do_table+0x210/0x390
[  290.461951]  [] iptable_filter_hook+0x2e/0x70
[  290.470756]  [] nf_iterate+0x4c/0x80
[  290.479587]  [] nf_hook_slow+0x64/0xc0
[  290.488341]  [] ip_forward+0x369/0x3c0
[  290.496927]  [] ? ip_frag_mem+0x40/0x40
[  290.505365]  [] ip_rcv_finish+0x101/0x330
[  290.513480]  [] ip_rcv+0x291/0x390
[  290.521562]  [] ? 
ip_local_deliver_finish+0x120/0x120
[  290.529509]  [] 
__netif_receive_skb_core+0x2a0/0x960

[  290.537381]  [] ? tcp_error+0xa9/0x1e0
[  290.545287]  [] ? __local_bh_enable_ip+0x26/0x90
[  290.553065]  [] __netif_receive_skb+0x11/0x70
[  290.560671]  [] 
netif_receive_skb_internal+0x1e/0x80

[  290.568025]  [] ? nf_hook_slow+0x64/0xc0
[  290.575341]  [] netif_receive_skb_sk+0xc/0x10
[  290.582655]  [] br_handle_frame_finish+0x17a/0x4b0
[  290.589910]  [] ? nf_nat_ipv4_fn+0x19a/0x1e0
[  290.597120]  [] ? iptable_nat_ipv4_fn+0x20/0x20
[  290.604316]  [] ? 
netif_receive_skb_internal+0x80/0x80
[  290.611375]  [] 
br_nf_pre_routing_finish+0x166/0x340

[  290.618246]  [] ? br_handle_local_finish+0x50/0x50
[  290.624925]  [] br_nf_pre_routing+0x29b/0x370
[  290.631446]  [] ? br_nf_forward_ip+0x3d0/0x3d0
[  290.637991]  [] nf_iterate+0x4c/0x80
[  290.644328]  [] nf_hook_slow+0x64/0xc0
[  290.650380]  [] br_handle_frame+0x199/0x280
[  290.656432]  [] ? br_handle_local_finish+0x50/0x50
[  290.662593]  [] ? 
br_handle_frame_finish+0x4b0/0x4b0
[  290.668626]  [] 
__netif_receive_skb_core+0x12b/0x960
[  290.674643]  [] ? 
__raw_callee_save___pv_queued_spin_unlock+0x11/0x20

[  290.680797]  [] ? __skb_flow_dissect+0x5f1/0x8f0
[  290.686902]  [] __netif_receive_skb+0x11/0x70
[  290.693046]  [] 
netif_receive_skb_internal+0x1e/0x80

[  290.699031]  [] netif_receive_skb_sk+0xc/0x10
[  290.704880]  [] xenvif_tx_action+0x69a/0x830
[  290.710609]  [] ? __netif_receive_skb+0x11/0x70
[  290.716365]  [] xenvif_poll+0x29/0x70
[  290.722241]  [] net_rx_action+0x1f7/0x300
[  290.727940]  [] __do_softirq+0x103/0x210
[  290.733564]  [] irq_exit+0x4b/0xa0
[  290.739094]  [] xen_evtchn_do_upcall+0x30/0x40
[  290.744626]  [] 
xen_do_hypervisor_callback+0x1e/0x40

[  290.750062]  
[  290.750114]  [] ? xen_hypercall_sched_op+0xa/0x20
[  290.760785]  [] ? xen_hypercall_sched_op+0xa/0x20
[  

Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80

2015-08-17 Thread Sander Eikelenboom

Saturday, August 15, 2015, 12:39:25 AM, you wrote:

 On Sat, 2015-08-15 at 00:09 +0200, Sander Eikelenboom wrote:
 On 2015-08-13 00:41, Eric Dumazet wrote:
  On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote:
  
  Thanks for the reminder, but luckily i was aware of that,
  seen enough of your replies asking for patches to be resubmitted
  against the other tree ;)
  Kernel with patch is currently running so fingers crossed.
  
  Thanks for testing. I am definitely interested knowing your results.
 
 Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is 
 breaking things
 (have to test if a revert helps) i get this in some guests:


 Yes, this was fixed by :
 http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af


Hi Eric,

With that patch i had a crash again this night, see below.

--
Sander

[177459.188808] general protection fault:  [#1] SMP 
[177459.199746] Modules linked in:
[177459.210540] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
4.2.0-rc6-20150815-linus-doflr-net+ #1
[177459.221441] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 
09/13/2010
[177459.232247] task: 8221a580 ti: 8220 task.ti: 
8220
[177459.242931] RIP: e030:[8110eb58]  [8110eb58] 
detach_if_pending+0x18/0x80
[177459.253503] RSP: e02b:88005f6039d8  EFLAGS: 00010086
[177459.264051] RAX: 8800584d6580 RBX: 880004901420 RCX: 
dead00200200
[177459.274599] RDX:  RSI: 88005f60e5c0 RDI: 
880004901420
[177459.285122] RBP: 88005f6039d8 R08: 0001 R09: 

[177459.295286] R10: 0003 R11: 880004901394 R12: 
0003
[177459.305388] R13: 00010ae47040 R14: 07b98a00 R15: 
88005f60e5c0
[177459.315345] FS:  7f51317ec700() GS:88005f60() 
knlGS:
[177459.325340] CS:  e033 DS:  ES:  CR0: 8005003b
[177459.335217] CR2: 010f8000 CR3: 2a154000 CR4: 
0660
[177459.345129] Stack:
[177459.354783]  88005f603a28 8110ee7f 810fb261 
0200
[177459.364505]  0003 880004901380 0003 
8800567d0d00
[177459.374064]  07b98a00  88005f603a58 
819b3eb3
[177459.383532] Call Trace:
[177459.392878]  IRQ 
[177459.392935]  [8110ee7f] mod_timer_pending+0x3f/0xe0
[177459.411058]  [810fb261] ? 
__raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[177459.419876]  [819b3eb3] __nf_ct_refresh_acct+0xa3/0xb0
[177459.428642]  [819baafb] tcp_packet+0xb3b/0x1290
[177459.437285]  [81a2535e] ? ip_output+0x5e/0xc0
[177459.445845]  [810ca8ca] ? __local_bh_enable_ip+0x2a/0x90
[177459.454331]  [819b35a9] ? __nf_conntrack_find_get+0x129/0x2a0
[177459.462642]  [819b549c] nf_conntrack_in+0x29c/0x7c0
[177459.470711]  [81a65e9c] ipv4_conntrack_local+0x4c/0x50
[177459.478753]  [819ad67c] nf_iterate+0x4c/0x80
[177459.486726]  [81102437] ? generic_handle_irq+0x27/0x40
[177459.494634]  [819ad714] nf_hook_slow+0x64/0xc0
[177459.502486]  [81a22d40] __ip_local_out_sk+0x90/0xa0
[177459.510248]  [81a22c40] ? ip_forward_options+0x1a0/0x1a0
[177459.517782]  [81a22d66] ip_local_out_sk+0x16/0x40
[177459.525044]  [81a2343d] ip_queue_xmit+0x14d/0x350
[177459.532247]  [81a3ae7e] tcp_transmit_skb+0x48e/0x960
[177459.539413]  [81a3cddb] tcp_xmit_probe_skb+0xdb/0xf0
[177459.546389]  [81a3dffb] tcp_write_wakeup+0x5b/0x150
[177459.553061]  [81a3e51b] tcp_keepalive_timer+0x1fb/0x230
[177459.559761]  [81a3e320] ? tcp_init_xmit_timers+0x20/0x20
[177459.566447]  [8110f3c7] call_timer_fn.isra.27+0x17/0x80
[177459.573121]  [81a3e320] ? tcp_init_xmit_timers+0x20/0x20
[177459.579778]  [8110f55d] run_timer_softirq+0x12d/0x200
[177459.586448]  [810ca6c3] __do_softirq+0x103/0x210
[177459.593138]  [810ca9cb] irq_exit+0x4b/0xa0
[177459.599783]  [814f05d4] xen_evtchn_do_upcall+0x34/0x50
[177459.606300]  [81af93ae] xen_do_hypervisor_callback+0x1e/0x40
[177459.612583]  EOI 
[177459.612637]  [810013aa] ? xen_hypercall_sched_op+0xa/0x20
[177459.625010]  [810013aa] ? xen_hypercall_sched_op+0xa/0x20
[177459.631157]  [81008d60] ? xen_safe_halt+0x10/0x20
[177459.637158]  [810188d3] ? default_idle+0x13/0x20
[177459.643072]  [81018e1a] ? arch_cpu_idle+0xa/0x10
[177459.648809]  [810f8e7e] ? default_idle_call+0x2e/0x50
[177459.654650]  [810f9112] ? cpu_startup_entry+0x272/0x2e0
[177459.660488]  [81ae79f7] ? rest_init+0x77/0x80
[177459.666297]  [82312f58] ? start_kernel+0x43b/0x448
[177459.672092]  [823124ef] ? x86_64_start_reservations+0x2a/0x2c
[177459.677800]  [82316008] ? xen_start_kernel+0x550/0x55c
[177459.683451

Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80

2015-08-17 Thread Sander Eikelenboom

Monday, August 17, 2015, 3:37:13 PM, you wrote:

 On Mon, 2015-08-17 at 11:09 +0200, Sander Eikelenboom wrote:
 Saturday, August 15, 2015, 12:39:25 AM, you wrote:
 
  On Sat, 2015-08-15 at 00:09 +0200, Sander Eikelenboom wrote:
  On 2015-08-13 00:41, Eric Dumazet wrote:
   On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote:
   
   Thanks for the reminder, but luckily i was aware of that,
   seen enough of your replies asking for patches to be resubmitted
   against the other tree ;)
   Kernel with patch is currently running so fingers crossed.
   
   Thanks for testing. I am definitely interested knowing your results.
  
  Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is 
  breaking things
  (have to test if a revert helps) i get this in some guests:
 
 
  Yes, this was fixed by :
  http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af
 
 
 Hi Eric,
 
 With that patch i had a crash again this night, see below.
 
 --
 Sander
 
 [177459.188808] general protection fault:  [#1] SMP 
 [177459.199746] Modules linked in:
 [177459.210540] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
 4.2.0-rc6-20150815-linus-doflr-net+ #1
 [177459.221441] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS 
 V1.8B1 09/13/2010
 [177459.232247] task: 8221a580 ti: 8220 task.ti: 
 8220
 [177459.242931] RIP: e030:[8110eb58]  [8110eb58] 
 detach_if_pending+0x18/0x80
 [177459.253503] RSP: e02b:88005f6039d8  EFLAGS: 00010086
 [177459.264051] RAX: 8800584d6580 RBX: 880004901420 RCX: 
 dead00200200
 [177459.274599] RDX:  RSI: 88005f60e5c0 RDI: 
 880004901420
 [177459.285122] RBP: 88005f6039d8 R08: 0001 R09: 
 
 [177459.295286] R10: 0003 R11: 880004901394 R12: 
 0003
 [177459.305388] R13: 00010ae47040 R14: 07b98a00 R15: 
 88005f60e5c0
 [177459.315345] FS:  7f51317ec700() GS:88005f60() 
 knlGS:
 [177459.325340] CS:  e033 DS:  ES:  CR0: 8005003b
 [177459.335217] CR2: 010f8000 CR3: 2a154000 CR4: 
 0660
 [177459.345129] Stack:
 [177459.354783]  88005f603a28 8110ee7f 810fb261 
 0200
 [177459.364505]  0003 880004901380 0003 
 8800567d0d00
 [177459.374064]  07b98a00  88005f603a58 
 819b3eb3
 [177459.383532] Call Trace:
 [177459.392878]  IRQ 
 [177459.392935]  [8110ee7f] mod_timer_pending+0x3f/0xe0
 [177459.411058]  [810fb261] ? 
 __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
 [177459.419876]  [819b3eb3] __nf_ct_refresh_acct+0xa3/0xb0
 [177459.428642]  [819baafb] tcp_packet+0xb3b/0x1290
 [177459.437285]  [81a2535e] ? ip_output+0x5e/0xc0
 [177459.445845]  [810ca8ca] ? __local_bh_enable_ip+0x2a/0x90
 [177459.454331]  [819b35a9] ? __nf_conntrack_find_get+0x129/0x2a0
 [177459.462642]  [819b549c] nf_conntrack_in+0x29c/0x7c0
 [177459.470711]  [81a65e9c] ipv4_conntrack_local+0x4c/0x50
 [177459.478753]  [819ad67c] nf_iterate+0x4c/0x80
 [177459.486726]  [81102437] ? generic_handle_irq+0x27/0x40
 [177459.494634]  [819ad714] nf_hook_slow+0x64/0xc0
 [177459.502486]  [81a22d40] __ip_local_out_sk+0x90/0xa0
 [177459.510248]  [81a22c40] ? ip_forward_options+0x1a0/0x1a0
 [177459.517782]  [81a22d66] ip_local_out_sk+0x16/0x40
 [177459.525044]  [81a2343d] ip_queue_xmit+0x14d/0x350
 [177459.532247]  [81a3ae7e] tcp_transmit_skb+0x48e/0x960
 [177459.539413]  [81a3cddb] tcp_xmit_probe_skb+0xdb/0xf0
 [177459.546389]  [81a3dffb] tcp_write_wakeup+0x5b/0x150
 [177459.553061]  [81a3e51b] tcp_keepalive_timer+0x1fb/0x230
 [177459.559761]  [81a3e320] ? tcp_init_xmit_timers+0x20/0x20
 [177459.566447]  [8110f3c7] call_timer_fn.isra.27+0x17/0x80
 [177459.573121]  [81a3e320] ? tcp_init_xmit_timers+0x20/0x20
 [177459.579778]  [8110f55d] run_timer_softirq+0x12d/0x200
 [177459.586448]  [810ca6c3] __do_softirq+0x103/0x210
 [177459.593138]  [810ca9cb] irq_exit+0x4b/0xa0
 [177459.599783]  [814f05d4] xen_evtchn_do_upcall+0x34/0x50
 [177459.606300]  [81af93ae] xen_do_hypervisor_callback+0x1e/0x40
 [177459.612583]  EOI 
 [177459.612637]  [810013aa] ? xen_hypercall_sched_op+0xa/0x20
 [177459.625010]  [810013aa] ? xen_hypercall_sched_op+0xa/0x20
 [177459.631157]  [81008d60] ? xen_safe_halt+0x10/0x20
 [177459.637158]  [810188d3] ? default_idle+0x13/0x20
 [177459.643072]  [81018e1a] ? arch_cpu_idle+0xa/0x10
 [177459.648809]  [810f8e7e] ? default_idle_call+0x2e/0x50
 [177459.654650]  [810f9112] ? cpu_startup_entry+0x272/0x2e0
 [177459.660488]  [81ae79f7] ? rest_init+0x77/0x80

Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80

2015-08-17 Thread Sander Eikelenboom

Monday, August 17, 2015, 4:21:47 PM, you wrote:

 On Mon, 2015-08-17 at 09:02 -0500, Jon Christopherson wrote:
 This is very similar to the behavior I am seeing in this bug:
 
 https://bugzilla.kernel.org/show_bug.cgi?id=102911

 OK, but have you applied the fix ?

 http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af

 It will be part of net iteration from David Miller to Linus Torvalds.


I did have that patch in for my last report.
But i don't think he had (looking at the second part of his oops).
 
--
Sander



Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80

2015-08-17 Thread Sander Eikelenboom

On 2015-08-17 19:18, Eric Dumazet wrote:

From: Eric Dumazet eduma...@google.com

On Mon, 2015-08-17 at 16:25 +0200, Sander Eikelenboom wrote:

Monday, August 17, 2015, 4:21:47 PM, you wrote:

 On Mon, 2015-08-17 at 09:02 -0500, Jon Christopherson wrote:
 This is very similar to the behavior I am seeing in this bug:

 https://bugzilla.kernel.org/show_bug.cgi?id=102911

 OK, but have you applied the fix ?

 
http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af

 It will be part of net iteration from David Miller to Linus Torvalds.


I did have that patch in for my last report.
But i don't think he had (looking at the second part of his oops).



Then can you try following fix as well ?

Thanks !


Running now :)




[PATCH] timer: fix a race in __mod_timer()

lock_timer_base() can not catch following :

CPU1 ( in __mod_timer()
timer->flags |= TIMER_MIGRATING;
spin_unlock(&base->lock);
base = new_base;
spin_lock(&base->lock);
timer->flags &= ~TIMER_BASEMASK;
                                  CPU2 (in lock_timer_base())
                                  see timer base is cpu0 base
                                  spin_lock_irqsave(&base->lock, *flags);
                                  if (timer->flags == tf)
                                       return base; // oops, wrong base
timer->flags |= base->cpu // too late

We must write timer->flags in one go, otherwise we can fool other cpus.

Fixes: bc7a34b8b9eb ("timer: Reduce timer migration overhead if disabled")

Signed-off-by: Eric Dumazet eduma...@google.com
Cc: Thomas Gleixner t...@linutronix.de
---
 kernel/time/timer.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 5e097fa9faf7..84190f02b521 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -807,8 +807,8 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
 
 			spin_unlock(&base->lock);
 			base = new_base;
 			spin_lock(&base->lock);
-			timer->flags &= ~TIMER_BASEMASK;
-			timer->flags |= base->cpu;
+			WRITE_ONCE(timer->flags,
+				   (timer->flags & ~TIMER_BASEMASK) | base->cpu);
 		}
 	}
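
The idea behind the patch (publish the new flags value with one store
instead of a clear followed by a set) can be tried out in user space. The
following is only a sketch under stated assumptions: struct toy_timer,
rebase_two_steps() and rebase_one_store() are hypothetical names, the mask
values are illustrative rather than copied from the kernel headers, and
C11 atomics stand in for the kernel's WRITE_ONCE().

```c
#include <assert.h>
#include <stdatomic.h>

/* Illustrative bit layout (assumed): low bits = CPU number, one bit = migrating. */
#define CPUMASK    0x0007ffffu
#define MIGRATING  0x00080000u
#define BASEMASK   (CPUMASK | MIGRATING)

struct toy_timer {
	_Atomic unsigned int flags;
};

/*
 * Two-step update, as in the code before the patch. Returns the
 * intermediate value that a concurrent reader could observe between
 * the two stores: CPU field 0 and the migrating bit already cleared.
 */
static unsigned int rebase_two_steps(struct toy_timer *t, unsigned int new_cpu)
{
	unsigned int mid;

	atomic_fetch_and(&t->flags, ~BASEMASK);	/* step 1: looks like CPU 0 */
	mid = atomic_load(&t->flags);		/* what another CPU may see */
	atomic_fetch_or(&t->flags, new_cpu);	/* step 2: arrives too late */
	return mid;
}

/*
 * One-store update, as in the patch: compute the final value first and
 * publish it in a single store. The load/store pair assumes writers are
 * already serialised (in the kernel, by the timer base lock).
 */
static void rebase_one_store(struct toy_timer *t, unsigned int new_cpu)
{
	unsigned int old = atomic_load(&t->flags);

	assert((new_cpu & ~CPUMASK) == 0);
	atomic_store(&t->flags, (old & ~BASEMASK) | new_cpu);
}
```

With the one-store variant a reader sees either the old value (still
marked as migrating) or the complete new value, never the half-updated
state that points at CPU 0.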



Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80

2015-08-14 Thread Sander Eikelenboom

On 2015-08-13 00:41, Eric Dumazet wrote:

On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote:


Thanks for the reminder, but luckily i was aware of that,
seen enough of your replies asking for patches to be resubmitted
against the other tree ;)
Kernel with patch is currently running so fingers crossed.


Thanks for testing. I am definitely interested knowing your results.


Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is 
breaking things

(have to test if a revert helps) i get this in some guests:

NMI watchdog: BUG: soft lockup - CPU#0 stuck for 506s! [swapper/0:0]
[ 6620.282805] Modules linked in:
[ 6620.282805] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
4.2.0-rc6-20150814-linus-doflr-apicrevert+ #1
[ 6620.282805] task: 8221a580 ti: 8220 task.ti: 
8220
[ 6620.282805] RIP: e030:[8100122a]  [8100122a] 
xen_hypercall_xen_version+0xa/0x20

[ 6620.282805] RSP: e02b:88000fc03d48  EFLAGS: 0246
[ 6620.282805] RAX: 00040006 RBX: 0200 RCX: 
8100122a
[ 6620.282805] RDX: 0001 RSI: deadbeef RDI: 
deadbeef
[ 6620.282805] RBP: 88000fc03d60 R08: 88000fc03ee0 R09: 
00ee
[ 6620.282805] R10: 8220a0c0 R11: 0246 R12: 

[ 6620.282805] R13: 0001 R14: 880003b53054 R15: 
0005
[ 6620.282805] FS:  7fec747ad800() GS:88000fc0() 
knlGS:

[ 6620.282805] CS:  e033 DS:  ES:  CR0: 8005003b
[ 6620.282805] CR2: 7ffcb7a7a6d8 CR3: 03164000 CR4: 
0660

[ 6620.282805] Stack:
[ 6620.282805]  0068 0007 81008dbd 
88000fc03dd8
[ 6620.282805]  81009592 0068 8220a0c0 
00ee
[ 6620.282805]  88000fc03ee0 0200 0200 
0001

[ 6620.282805] Call Trace:
[ 6620.282805]  IRQ
[ 6620.282805]  [81008dbd] ? 
xen_force_evtchn_callback+0xd/0x10

[ 6620.282805]  [81009592] check_events+0x12/0x20
[ 6620.282805]  [8100957f] ? 
xen_restore_fl_direct_reloc+0x4/0x4
[ 6620.282805]  [81af79a5] ? 
_raw_spin_unlock_irqrestore+0x25/0x30

[ 6620.282805]  [8110ed43] try_to_del_timer_sync+0x43/0x60
[ 6620.282805]  [8110eda7] del_timer_sync+0x47/0x60
[ 6620.282805]  [81a2b698] 
inet_csk_reqsk_queue_drop+0x118/0x1f0

[ 6620.282805]  [81a2b8c6] reqsk_timer_handler+0x156/0x260
[ 6620.282805]  [81a2b770] ? 
inet_csk_reqsk_queue_drop+0x1f0/0x1f0

[ 6620.282805]  [8110f3c7] call_timer_fn.isra.27+0x17/0x80
[ 6620.282805]  [81a2b770] ? 
inet_csk_reqsk_queue_drop+0x1f0/0x1f0

[ 6620.282805]  [8110f55d] run_timer_softirq+0x12d/0x200
[ 6620.282805]  [810ca6c3] __do_softirq+0x103/0x210
[ 6620.282805]  [810ca9cb] irq_exit+0x4b/0xa0
[ 6620.282805]  [814f05d4] xen_evtchn_do_upcall+0x34/0x50
[ 6620.282805]  [81af932e] 
xen_do_hypervisor_callback+0x1e/0x40

[ 6620.282805]  EOI
[ 6620.282805]  [810013aa] ? xen_hypercall_sched_op+0xa/0x20
[ 6620.282805]  [810013aa] ? xen_hypercall_sched_op+0xa/0x20
[ 6620.282805]  [81008d60] ? xen_safe_halt+0x10/0x20
[ 6620.282805]  [810188d3] ? default_idle+0x13/0x20
[ 6620.282805]  [81018e1a] ? arch_cpu_idle+0xa/0x10
[ 6620.282805]  [810f8e7e] ? default_idle_call+0x2e/0x50
[ 6620.282805]  [810f9112] ? cpu_startup_entry+0x272/0x2e0
[ 6620.282805]  [81ae7967] ? rest_init+0x77/0x80
[ 6620.282805]  [82312f58] ? start_kernel+0x43b/0x448
[ 6620.282805]  [823124ef] ? 
x86_64_start_reservations+0x2a/0x2c

[ 6620.282805]  [82316008] ? xen_start_kernel+0x550/0x55c
[ 6620.282805] Code: cc 51 41 53 b8 10 00 00 00 0f 05 41 5b 59 c3 cc cc 
cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 11 00 00 00 
0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc




Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80

2015-08-14 Thread Sander Eikelenboom

On 2015-08-15 00:09, Sander Eikelenboom wrote:

On 2015-08-13 00:41, Eric Dumazet wrote:

On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote:


Thanks for the reminder, but luckily i was aware of that,
seen enough of your replies asking for patches to be resubmitted
against the other tree ;)
Kernel with patch is currently running so fingers crossed.


Thanks for testing. I am definitely interested knowing your results.


Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is
breaking things
(have to test if a revert helps) i get this in some guests:


Should have done that before, because it wasn't in yet .. and likely to fix the issue,
also pulled and compiling now.

--
Sander






Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80

2015-08-12 Thread Sander Eikelenboom

On 2015-08-12 23:40, David Miller wrote:

From: li...@eikelenboom.it
Date: Wed, 12 Aug 2015 22:50:42 +0200


On 2015-08-12 22:41, Eric Dumazet wrote:

On Wed, 2015-08-12 at 21:19 +0200, li...@eikelenboom.it wrote:

Hi,
On my box running Xen with a 4.2-rc6 kernel i still get this splat in dom0,
which crashes the box.
(i reported a similar splat before (at rc4) here,
http://www.spinics.net/lists/netdev/msg337570.html)
Never seen this one on 4.1, so it seems a regression.
--
Sander
[81133.193439] general protection fault:  [#1] SMP
[81133.204284] Modules linked in:
[81133.214934] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted
4.2.0-rc6-20150811-linus-doflr+ #1
[81133.225632] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
[81133.236237] task: 880059b91580 ti: 880059bb4000 task.ti:
880059bb4000
[81133.246808] RIP: e030:[8110fb18]  [8110fb18]
detach_if_pending+0x18/0x80
[81133.257354] RSP: e02b:880059bb7848  EFLAGS: 00010086
[81133.267749] RAX: 88004eddc7f0 RBX: 88000e20ae08 RCX:
dead00200200
[81133.278201] RDX:  RSI: 88005f60e600 RDI:
88000e20ae08
[81133.288723] RBP: 880059bb7848 R08: 0001 R09:
0001
[81133.298930] R10: 0003 R11: 88000e20ad68 R12:

[81133.308875] R13: 000101735569 R14: 00015f90 R15:
88005f60e600
[81133.318845] FS:  7f28c6f7c800() GS:88005f60()
knlGS:
[81133.328864] CS:  e033 DS:  ES:  CR0: 8005003b
[81133.338693] CR2: 807f6800 CR3: 3d55c000 CR4:
0660
[81133.348462] Stack:
[81133.358005]  880059bb7898 8110fe3f 810fc261
0200
[81133.367682]  0003 88000e20ad68 
88005854d400
[81133.377064]  00015f90  880059bb78c8
819b5243
[81133.386374] Call Trace:
[81133.395596]  [8110fe3f] mod_timer_pending+0x3f/0xe0
[81133.404999]  [810fc261] ?
__raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[81133.414255]  [819b5243] __nf_ct_refresh_acct+0xa3/0xb0
[81133.423137]  [819bbe8b] tcp_packet+0xb3b/0x1290
[81133.431894]  [810cb8ca] ? 
__local_bh_enable_ip+0x2a/0x90

[81133.440622]  [819b4939] ?
__nf_conntrack_find_get+0x129/0x2a0
[81133.449339]  [819b682c] nf_conntrack_in+0x29c/0x7c0
[81133.457940]  [81a67181] ipv4_conntrack_in+0x21/0x30
[81133.466296]  [819aea1c] nf_iterate+0x4c/0x80
[81133.474401]  [819aeab4] nf_hook_slow+0x64/0xc0
[81133.482615]  [81a211ec] ip_rcv+0x2ec/0x380
[81133.490781]  [81a209f0] ?
ip_local_deliver_finish+0x130/0x130
[81133.498790]  [8197e140]
__netif_receive_skb_core+0x2a0/0x970
[81133.506714]  [81a56db8] ? inet_gro_receive+0x1c8/0x200
[81133.514609]  [81980705] __netif_receive_skb+0x15/0x70
[81133.522333]  [8198077e]
netif_receive_skb_internal+0x1e/0x80
[81133.529840]  [81980f3b] napi_gro_receive+0x6b/0x90
[81133.537173]  [81740fb6] rtl8169_poll+0x2e6/0x600
[81133.54]  [810fc261] ?
__raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[81133.551566]  [81981ad7] net_rx_action+0x1f7/0x300
[81133.558412]  [810cb6c3] __do_softirq+0x103/0x210
[81133.565353]  [810cb807] run_ksoftirqd+0x37/0x60
[81133.572359]  [810e4de0] smpboot_thread_fn+0x130/0x190
[81133.579215]  [810e4cb0] ? sort_range+0x20/0x20
[81133.586042]  [810e1fae] kthread+0xee/0x110
[81133.592792]  [810e1ec0] ?
kthread_create_on_node+0x1b0/0x1b0
[81133.599694]  [81af92df] ret_from_fork+0x3f/0x70
[81133.606662]  [810e1ec0] ?
kthread_create_on_node+0x1b0/0x1b0
[81133.613445] Code: 77 28 5d c3 66 66 66 66 66 66 2e 0f 1f 84 00 00
00
00 00 48 8b 47 08 55 48 89 e5 48 85 c0 74 6a 48 8b 0f 48 85 c9 48 89
08
74 04 48 89 41 08 84 d2 74 08 48 c7 47 08 00 00 00 00 f6 47 2a 10 
48

[81133.627196] RIP  [8110fb18] detach_if_pending+0x18/0x80
[81133.634036]  RSP 880059bb7848
[81133.640817] ---[ end trace eaf596e1fcf6a591 ]---
[81133.647521] Kernel panic - not syncing: Fatal exception in
interrupt

This looks like the bug fixed in David Miller net tree :
http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=2235f2ac75fd2501c251b0b699a9632e80239a6d


Will pull the net-tree in and re-test.


You should not pull the 'net-next', but rather the 'net' one.

'net' is not necessarily included in 'net-next'.


Thanks for the reminder, but luckily i was aware of that,
seen enough of your replies asking for patches to be resubmitted
against the other tree ;)
Kernel with patch is currently running so fingers crossed.

--
Sander
