Re: kernel warning in tcp_fragment
Hi Martin and Eric,

Do we have a final solution or patch for this issue? We are seeing many of these warnings in our production systems.

Thank you very much.

On Tue, Oct 13, 2015 at 4:55 PM, Jovi Zhangwei wrote:
> Hi all,
>
> Is there a final patch to fix this issue? Thanks.
>
> On Mon, Sep 14, 2015 at 7:15 PM, Neal Cardwell wrote:
>>
>> On Mon, Sep 14, 2015 at 6:27 AM, Jovi Zhangwei wrote:
>> >
>> > Hi Neal,
>> >
>> > After several days of testing your patch, our system crashed. Dmesg
>> > attached.
>>
>> Jovi -- Sorry about that... thank you for the testing and the data point.
>>
>> neal

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: kernel warning in tcp_fragment
Hi Grant,

Thanks for testing it. I will try to repost the patch.

Thanks,
Martin

On Tue, Sep 01, 2015 at 04:02:33PM -0700, Grant Zhang wrote:
> Hi Martin,
>
> I did try out your v2 patch on our production server and can confirm that
> the patch gets rid of the WARN_ON trace.
>
> I would really like to see the issue fixed upstream (and backported
> to the 3.14 longterm kernel tree), either by this patch or something else.
> Is there a plan for this?
>
> Thanks,
>
> Grant
>
> On 12/08/2015 20:45, Martin KaFai Lau wrote:
> >On Mon, Aug 10, 2015 at 02:35:37PM -0400, Neal Cardwell wrote:
> >>On Mon, Aug 10, 2015 at 2:10 PM, Jovi Zhangwei wrote:
> >>>
> >>>Ping?
> >>>
> >>>We saw a lot of these warnings in our production system. It would be
> >>>greatly appreciated if someone could give us a fix for these warnings. :)
> >>
> >>What is your net.ipv4.tcp_mtu_probing setting? If 1, have you tried
> >>setting it to 0?
> >
> >Hi Jovi, If setting net.ipv4.tcp_mtu_probing=0 helps, can you give the
> >patch we posted earlier a try:
> >https://patchwork.ozlabs.org/patch/481609/
> >It is the same patch that I pointed out earlier. You can click
> >on the download link.
> >
> >We are currently using a similar patch while keeping
> >net.ipv4.tcp_mtu_probing=1.
> >
> >Thanks,
> >--Martin
Re: kernel warning in tcp_fragment
On Mon, Sep 14, 2015 at 6:27 AM, Jovi Zhangwei wrote:
>
> Hi Neal,
>
> After several days of testing your patch, our system crashed. Dmesg attached.

Jovi -- Sorry about that... thank you for the testing and the data point.

neal
Re: kernel warning in tcp_fragment
Hi Martin,

I did try out your v2 patch on our production server and can confirm that the patch gets rid of the WARN_ON trace.

I would really like to see the issue fixed upstream (and backported to the 3.14 longterm kernel tree), either by this patch or something else. Is there a plan for this?

Thanks,

Grant

On 12/08/2015 20:45, Martin KaFai Lau wrote:
> On Mon, Aug 10, 2015 at 02:35:37PM -0400, Neal Cardwell wrote:
>> On Mon, Aug 10, 2015 at 2:10 PM, Jovi Zhangwei wrote:
>>>
>>> Ping?
>>>
>>> We saw a lot of these warnings in our production system. It would be
>>> greatly appreciated if someone could give us a fix for these warnings. :)
>>
>> What is your net.ipv4.tcp_mtu_probing setting? If 1, have you tried
>> setting it to 0?
>
> Hi Jovi, If setting net.ipv4.tcp_mtu_probing=0 helps, can you give the
> patch we posted earlier a try:
> https://patchwork.ozlabs.org/patch/481609/
> It is the same patch that I pointed out earlier. You can click
> on the download link.
>
> We are currently using a similar patch while keeping
> net.ipv4.tcp_mtu_probing=1.
>
> Thanks,
> --Martin
Re: kernel warning in tcp_fragment
Hi,

On Wed, Aug 12, 2015 at 8:45 PM, Martin KaFai Lau ka...@fb.com wrote:
> On Mon, Aug 10, 2015 at 02:35:37PM -0400, Neal Cardwell wrote:
>> On Mon, Aug 10, 2015 at 2:10 PM, Jovi Zhangwei j...@cloudflare.com wrote:
>>>
>>> Ping?
>>>
>>> We saw a lot of these warnings in our production system. It would be
>>> greatly appreciated if someone could give us a fix for these warnings. :)
>>
>> What is your net.ipv4.tcp_mtu_probing setting? If 1, have you tried
>> setting it to 0?
>
> Hi Jovi, If setting net.ipv4.tcp_mtu_probing=0 helps, can you give the
> patch we posted earlier a try:
> https://patchwork.ozlabs.org/patch/481609/
> It is the same patch that I pointed out earlier. You can click
> on the download link.
>
> We are currently using a similar patch while keeping
> net.ipv4.tcp_mtu_probing=1.

Our systems need net.ipv4.tcp_mtu_probing, so we cannot set it to 0.

We are testing the previous patch given by Neal; I will let you know the result.

Thanks.
Re: kernel warning in tcp_fragment
On Mon, Aug 10, 2015 at 02:35:37PM -0400, Neal Cardwell wrote:
> On Mon, Aug 10, 2015 at 2:10 PM, Jovi Zhangwei j...@cloudflare.com wrote:
>>
>> Ping?
>>
>> We saw a lot of these warnings in our production system. It would be
>> greatly appreciated if someone could give us a fix for these warnings. :)
>
> What is your net.ipv4.tcp_mtu_probing setting? If 1, have you tried
> setting it to 0?

Hi Jovi, If setting net.ipv4.tcp_mtu_probing=0 helps, can you give the patch we posted earlier a try:
https://patchwork.ozlabs.org/patch/481609/
It is the same patch that I pointed out earlier. You can click on the download link.

We are currently using a similar patch while keeping net.ipv4.tcp_mtu_probing=1.

Thanks,
--Martin
Re: kernel warning in tcp_fragment
Ping?

We saw a lot of these warnings in our production system. It would be greatly appreciated if someone could give us a fix for these warnings. :)

On Fri, Jul 31, 2015 at 11:04 AM, Jovi Zhangwei j...@cloudflare.com wrote:
> Hi Eric,
>
> Would you like to share your thoughts on this bug? Many thanks.
>
> On Mon, Jul 27, 2015 at 4:19 PM, Martin KaFai Lau ka...@fb.com wrote:
>> On Wed, Jul 22, 2015 at 11:55:35AM -0700, Jovi Zhangwei wrote:
>>> Sorry to disturb you. Our production systems (3.14 and 3.18 stable
>>> kernels) have many tcp_fragment warnings; the trace is the same as the
>>> one you discussed before:
>>>
>>> http://comments.gmane.org/gmane.linux.network/365658
>>>
>>> But I didn't find a final solution in that mail thread. Do you have
>>> any new ideas or patches for this warning?
>>
>> I think the following points to the last discussion. We are currently
>> using a similar patch:
>> http://comments.gmane.org/gmane.linux.network/366549
>>
>> Eric, any update on your findings? Or have you already pushed a fix?
>>
>> Thanks,
>> --Martin
Re: kernel warning in tcp_fragment
On Mon, Aug 10, 2015 at 2:10 PM, Jovi Zhangwei j...@cloudflare.com wrote:
>
> Ping?
>
> We saw a lot of these warnings in our production system. It would be
> greatly appreciated if someone could give us a fix for these warnings. :)

What is your net.ipv4.tcp_mtu_probing setting? If 1, have you tried setting it to 0?

Previous reports ( https://patchwork.ozlabs.org/patch/480882/ ) have shown that this gets rid of at least one source of the warning. So that would provide a useful data point.

Separately, you could also try the attached patch. This is against 3.14.39. It tries to attack a different possible source of this warning. Please let us know if that patch helps.

Thanks!
neal

Attachment: 0001-RFC-for-tests-on-v3.14.39-tcp-resegment-skbs-that-we.patch
Description: Binary data
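Neal's first question (what is net.ipv4.tcp_mtu_probing set to?) can be answered by reading the sysctl's file under /proc/sys. The following is a minimal sketch, not something posted in the thread: the helper names are hypothetical, and the value descriptions are taken from the kernel's ip-sysctl documentation as an assumption about the kernels in use here.

```python
from pathlib import Path

# Value meanings as described in the kernel's ip-sysctl documentation
# (assumed to apply to the 3.14/3.18 kernels discussed in this thread).
MTU_PROBING_MODES = {
    0: "disabled",
    1: "disabled by default, enabled when an ICMP black hole is detected",
    2: "always enabled",
}


def read_mtu_probing(path="/proc/sys/net/ipv4/tcp_mtu_probing"):
    """Return the current net.ipv4.tcp_mtu_probing value as an integer."""
    return int(Path(path).read_text().strip())


def describe_mtu_probing(value):
    """Map a sysctl value to a short human-readable description."""
    return MTU_PROBING_MODES.get(value, "unknown value")
```

On a Linux host, `describe_mtu_probing(read_mtu_probing())` reports the current mode. Changing the value (for example with `sysctl -w net.ipv4.tcp_mtu_probing=0`) requires root and is left out of the sketch.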
Re: kernel warning in tcp_fragment
Hi Neal,

Many thanks for your reply. We will arrange testing of that patch.

On Mon, Aug 10, 2015 at 11:35 AM, Neal Cardwell ncardw...@google.com wrote:
> On Mon, Aug 10, 2015 at 2:10 PM, Jovi Zhangwei j...@cloudflare.com wrote:
>>
>> Ping?
>>
>> We saw a lot of these warnings in our production system. It would be
>> greatly appreciated if someone could give us a fix for these warnings. :)
>
> What is your net.ipv4.tcp_mtu_probing setting? If 1, have you tried
> setting it to 0?
>
> Previous reports ( https://patchwork.ozlabs.org/patch/480882/ ) have
> shown that this gets rid of at least one source of the warning. So
> that would provide a useful data point.
>
> Separately, you could also try the attached patch. This is against
> 3.14.39. It tries to attack a different possible source of this
> warning. Please let us know if that patch helps.
>
> Thanks!
> neal
Re: kernel warning in tcp_fragment
Hi Eric,

Would you like to share your thoughts on this bug? Many thanks.

On Mon, Jul 27, 2015 at 4:19 PM, Martin KaFai Lau ka...@fb.com wrote:
> On Wed, Jul 22, 2015 at 11:55:35AM -0700, Jovi Zhangwei wrote:
>> Sorry to disturb you. Our production systems (3.14 and 3.18 stable
>> kernels) have many tcp_fragment warnings; the trace is the same as the
>> one you discussed before:
>>
>> http://comments.gmane.org/gmane.linux.network/365658
>>
>> But I didn't find a final solution in that mail thread. Do you have
>> any new ideas or patches for this warning?
>
> I think the following points to the last discussion. We are currently
> using a similar patch:
> http://comments.gmane.org/gmane.linux.network/366549
>
> Eric, any update on your findings? Or have you already pushed a fix?
>
> Thanks,
> --Martin
Re: kernel warning in tcp_fragment
ping...

On Wed, Jul 22, 2015 at 11:55 AM, Jovi Zhangwei j...@cloudflare.com wrote:
> Hi Neal and Martin,
>
> Sorry to disturb you. Our production systems (3.14 and 3.18 stable
> kernels) have many tcp_fragment warnings; the trace is the same as the
> one you discussed before:
>
> http://comments.gmane.org/gmane.linux.network/365658
>
> But I didn't find a final solution in that mail thread. Do you have
> any new ideas or patches for this warning?
>
> Many thanks.
>
> [5184217.672290] WARNING: CPU: 9 PID: 2801 at net/ipv4/tcp_output.c:1081 tcp_fragment+0x34/0x230()
> [5184217.680995] Modules linked in: sfc_char(O) sfc_resource(O) sfc_affinity(O) nf_conntrack_netlink xt_connlimit xt_length xt_bpf xt_hashlimit iptable_nat nf_nat_ipv4 nf_nat iptable_mangle xt_comment ip6table_security ip6table_mangle ip_set_hash_netport 8021q garp bridge stp llc ipmi_devintf nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6table_raw ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_NFLOG nfnetlink_log xt_conntrack iptable_filter xt_tcpudp xt_multiport xt_CT nf_conntrack xt_set iptable_raw ip_tables x_tables ip_set_hash_net ip_set_hash_ip ip_set nfnetlink rpcsec_gss_krb5 auth_rpcgss oid_registry nfsv4 fuse nfsv3 nfs_acl nfs fscache lockd sunrpc tcp_cubic sg sfc(O) mtd mdio igb dca i2c_algo_bit ptp pps_core sd_mod crct10dif_generic crc_t10dif crct10dif_common x86_pkg_temp_thermal acpi_cpufreq coretemp kvm_intel kvm crc32c_intel aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 ahci libahci ehci_pci libata ehci_hcd i2c_i801 i2c_core lpc_ich mfd_core usbcore scsi_mod usb_common wmi evdev ipmi_si ipmi_msghandler tpm_tis tpm acpi_pad processor thermal_sys button
> [5184217.684098] CPU: 9 PID: 2801 Comm: rrdns Tainted: GW O 3.14.28-cloudflare #1
> [5184217.684099] Hardware name: Quanta Computer Inc QuantaPlex T41S-2U/S2S-MB, BIOS S2S_3A14 09/18/2014
> [5184217.684100] 81466263 8103bb34
> [5184217.684101] 813e07f2 8818abebcc00 004a 0002
> [5184217.684102] 0060 813e07f2 00304120 8818abebcc00
> [5184217.684104] Call Trace:
> [5184217.684105] <IRQ> [81466263] ? dump_stack+0x41/0x51
> [5184217.684111] [8103bb34] ? warn_slowpath_common+0x74/0x89
> [5184217.684115] [813e07f2] ? tcp_fragment+0x34/0x230
> [5184217.684118] [813e07f2] ? tcp_fragment+0x34/0x230
> [5184217.684119] [813d98b7] ? tcp_mark_head_lost+0x1bd/0x1d5
> [5184217.684123] [813ddb71] ? tcp_fastretrans_alert+0x69f/0x71d
> [5184217.684125] [813de567] ? tcp_ack+0x90f/0xb16
> [5184217.684126] [813df618] ? tcp_rcv_state_process+0x5bd/0x9b8
> [5184217.684128] [8106d9c0] ? __wake_up_sync_key+0x3a/0x4d
> [5184217.684130] [813920ed] ? sk_wake_async+0x17/0x34
> [5184217.684133] [81440d13] ? ipv6_skip_exthdr+0x28/0xc7
> [5184217.684139] [81418db6] ? NF_HOOK_THRESH.constprop.11+0x4a/0x4a
> [5184217.684143] [81435abe] ? tcp_v6_do_rcv+0x3ac/0x4f1
> [5184217.684146] [81435eec] ? tcp_v6_rcv+0x2e9/0x554
> [5184217.684148] [813c70d3] ? nf_hook_slow+0x66/0xf1
> [5184217.684150] [81418db6] ? NF_HOOK_THRESH.constprop.11+0x4a/0x4a
> [5184217.684167] [81418f70] ? ip6_input_finish+0x1ba/0x2a7
> [5184217.684169] [813a1c12] ? __netif_receive_skb_core+0x422/0x494
> [5184217.684172] [813a283a] ? netif_receive_skb_internal+0x37/0x6d
> [5184217.684188] [a09a2e40] ? efx_ssr_try_merge+0x336/0x34e [sfc]
> [5184217.684215] [a09a4075] ? __efx_ssr_end_of_burst+0x3e/0xd2 [sfc]
> [5184217.684225] [a098e3bd] ? efx_process_channel+0x5d/0x71 [sfc]
> [5184217.684243] [a098f557] ? efx_poll+0x6d/0x16b [sfc]
> [5184217.684248] [813a2e27] ? net_rx_action+0xc6/0x191
> [5184217.684250] [8103f7ee] ? __do_softirq+0x100/0x27c
> [5184217.684254] [8103fae6] ? irq_exit+0x51/0xbc
> [5184217.684255] [81003e35] ? do_IRQ+0x9d/0xb4
> [5184217.684258] [8146992a] ? common_interrupt+0x6a/0x6a
> [5184217.684261] <EOI>
> [5184217.684263] ---[ end trace 4f42d23abf1c890e ]---
> [5184217.684460] ------------[ cut here ]------------
Re: kernel warning in tcp_fragment
On Wed, Jul 22, 2015 at 11:55:35AM -0700, Jovi Zhangwei wrote:
> Sorry to disturb you. Our production systems (3.14 and 3.18 stable
> kernels) have many tcp_fragment warnings; the trace is the same as the
> one you discussed before:
>
> http://comments.gmane.org/gmane.linux.network/365658
>
> But I didn't find a final solution in that mail thread. Do you have
> any new ideas or patches for this warning?

I think the following points to the last discussion. We are currently using a similar patch:
http://comments.gmane.org/gmane.linux.network/366549

Eric, any update on your findings? Or have you already pushed a fix?

Thanks,
--Martin
kernel warning in tcp_fragment
Hi Neal and Martin,

Sorry to disturb you. Our production systems (3.14 and 3.18 stable kernels) have many tcp_fragment warnings; the trace is the same as the one you discussed before:

http://comments.gmane.org/gmane.linux.network/365658

But I didn't find a final solution in that mail thread. Do you have any new ideas or patches for this warning?

Many thanks.

[5184217.672290] WARNING: CPU: 9 PID: 2801 at net/ipv4/tcp_output.c:1081 tcp_fragment+0x34/0x230()
[5184217.680995] Modules linked in: sfc_char(O) sfc_resource(O) sfc_affinity(O) nf_conntrack_netlink xt_connlimit xt_length xt_bpf xt_hashlimit iptable_nat nf_nat_ipv4 nf_nat iptable_mangle xt_comment ip6table_security ip6table_mangle ip_set_hash_netport 8021q garp bridge stp llc ipmi_devintf nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6table_raw ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_NFLOG nfnetlink_log xt_conntrack iptable_filter xt_tcpudp xt_multiport xt_CT nf_conntrack xt_set iptable_raw ip_tables x_tables ip_set_hash_net ip_set_hash_ip ip_set nfnetlink rpcsec_gss_krb5 auth_rpcgss oid_registry nfsv4 fuse nfsv3 nfs_acl nfs fscache lockd sunrpc tcp_cubic sg sfc(O) mtd mdio igb dca i2c_algo_bit ptp pps_core sd_mod crct10dif_generic crc_t10dif crct10dif_common x86_pkg_temp_thermal acpi_cpufreq coretemp kvm_intel kvm crc32c_intel aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 ahci libahci ehci_pci libata ehci_hcd i2c_i801 i2c_core lpc_ich mfd_core usbcore scsi_mod usb_common wmi evdev ipmi_si ipmi_msghandler tpm_tis tpm acpi_pad processor thermal_sys button
[5184217.684098] CPU: 9 PID: 2801 Comm: rrdns Tainted: GW O 3.14.28-cloudflare #1
[5184217.684099] Hardware name: Quanta Computer Inc QuantaPlex T41S-2U/S2S-MB, BIOS S2S_3A14 09/18/2014
[5184217.684100] 81466263 8103bb34
[5184217.684101] 813e07f2 8818abebcc00 004a 0002
[5184217.684102] 0060 813e07f2 00304120 8818abebcc00
[5184217.684104] Call Trace:
[5184217.684105] <IRQ> [81466263] ? dump_stack+0x41/0x51
[5184217.684111] [8103bb34] ? warn_slowpath_common+0x74/0x89
[5184217.684115] [813e07f2] ? tcp_fragment+0x34/0x230
[5184217.684118] [813e07f2] ? tcp_fragment+0x34/0x230
[5184217.684119] [813d98b7] ? tcp_mark_head_lost+0x1bd/0x1d5
[5184217.684123] [813ddb71] ? tcp_fastretrans_alert+0x69f/0x71d
[5184217.684125] [813de567] ? tcp_ack+0x90f/0xb16
[5184217.684126] [813df618] ? tcp_rcv_state_process+0x5bd/0x9b8
[5184217.684128] [8106d9c0] ? __wake_up_sync_key+0x3a/0x4d
[5184217.684130] [813920ed] ? sk_wake_async+0x17/0x34
[5184217.684133] [81440d13] ? ipv6_skip_exthdr+0x28/0xc7
[5184217.684139] [81418db6] ? NF_HOOK_THRESH.constprop.11+0x4a/0x4a
[5184217.684143] [81435abe] ? tcp_v6_do_rcv+0x3ac/0x4f1
[5184217.684146] [81435eec] ? tcp_v6_rcv+0x2e9/0x554
[5184217.684148] [813c70d3] ? nf_hook_slow+0x66/0xf1
[5184217.684150] [81418db6] ? NF_HOOK_THRESH.constprop.11+0x4a/0x4a
[5184217.684167] [81418f70] ? ip6_input_finish+0x1ba/0x2a7
[5184217.684169] [813a1c12] ? __netif_receive_skb_core+0x422/0x494
[5184217.684172] [813a283a] ? netif_receive_skb_internal+0x37/0x6d
[5184217.684188] [a09a2e40] ? efx_ssr_try_merge+0x336/0x34e [sfc]
[5184217.684215] [a09a4075] ? __efx_ssr_end_of_burst+0x3e/0xd2 [sfc]
[5184217.684225] [a098e3bd] ? efx_process_channel+0x5d/0x71 [sfc]
[5184217.684243] [a098f557] ? efx_poll+0x6d/0x16b [sfc]
[5184217.684248] [813a2e27] ? net_rx_action+0xc6/0x191
[5184217.684250] [8103f7ee] ? __do_softirq+0x100/0x27c
[5184217.684254] [8103fae6] ? irq_exit+0x51/0xbc
[5184217.684255] [81003e35] ? do_IRQ+0x9d/0xb4
[5184217.684258] [8146992a] ? common_interrupt+0x6a/0x6a
[5184217.684261] <EOI>
[5184217.684263] ---[ end trace 4f42d23abf1c890e ]---
[5184217.684460] ------------[ cut here ]------------
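Several posters report "many" of these warnings without a count. As a rough aid, the sketch below tallies WARNING header lines in dmesg-style text by source location and function. It is only an illustration: the regular expression is modeled on the single header line quoted in this thread, and the function name `count_warnings` is hypothetical.

```python
import re
from collections import Counter

# Modeled on the header line quoted in this thread, e.g.
# "WARNING: CPU: 9 PID: 2801 at net/ipv4/tcp_output.c:1081 tcp_fragment+0x34/0x230()"
# Other kernel versions may format this line differently.
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) (?P<func>\w+)\+"
)


def count_warnings(log_text):
    """Count WARNING headers per (file, line, function) in dmesg-style text."""
    counts = Counter()
    for match in WARNING_RE.finditer(log_text):
        key = (match["file"], int(match["line"]), match["func"])
        counts[key] += 1
    return counts
```

Feeding it saved `dmesg` output and looking up the key `("net/ipv4/tcp_output.c", 1081, "tcp_fragment")` gives the number of times this particular warning fired in that log.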